Talks and Videos

Coherence in Statistical Modeling of Networks

October 26, 2017 at Johns Hopkins, Applied Mathematics and Statistics Department


George Box famously said, "All models are wrong, but some are useful." Classic texts define a statistical model as "a set of distributions on the sample space" (Cox and Hinkley, 1976; Lehman, 1983; Barndorff-Nielson and Cox, Bernardo and Smith, 1994). Motivated by some longstanding questions in the analysis of network data, I will examine both of these statements, first from a general point of view, and then in the context of some recent developments in network analysis. The confusion caused by these statements is clarified by the realization that the definition of statistical model must be refined --- it must be more than just a set. With this, the ambiguity in Box's statement --- e.g., what determines whether a model is 'wrong' or 'useful'? --- can be clarified by a logical property that I call 'coherence'. After clarification, a model is deemed useful as long as it is coherent, i.e., inferences from it 'make sense'. I will then discuss some implications for the statistical modeling of network data.

Probabilities as Shapes

April 10, 2017 at Rutgers, Philosophy Department, Foundations of Probability Seminar


In mathematics, statistics, and perhaps even in our intuition, it is conventional to regard probabilities as numbers, but I prefer instead to think of them as shapes. I'll explain how and why I prefer to think of probabilities as shapes instead of numbers, and will discuss how these probability shapes can be formalized in terms of infinity groupoids (or homotopy types) from homotopy type theory (HoTT).

Markov process models for time-varying networks

December 13, 2016 at Isaac Newton Institute, Cambridge University


Many models for dynamic networks, such as the preferential attachment model, describe evolution by sequential addition of vertices and/or edges. Such models are not suited to networks whose connectivity varies over time, as in social relationships and other kinds of temporally varying interactions. For modeling in this latter setting, I develop the general theory of exchangeable Markov processes for time-varying networks and discuss relevant consequences.

Tamara Broderick (MIT) Research Misconduct - Edge Exchangeability

November 20, 2016


A NIPS 2016 paper by Tamara Broderick, Diana Cai and Trevor Campbell claims credit for the idea of edge exchangeability with a misleading and improper attribution of my prior work with Walter Dempsey. Edge exchangeability was introduced over a year ago in a paper by Harry Crane and Walter Dempsey. Much of the material in the Broderick-Cai-Campbell paper resembles that of Crane-Dempsey without acknowledgement.

Edge exchangeability: a new foundation for modeling network data

July 25, 2016 at Isaac Newton Institute, Cambridge University


Exchangeable models for countable vertex-labeled graphs can- not replicate the large sample behaviors of sparsity and power law degree distribution observed in many network datasets. Out of this mathematical impossibility emerges the question of how network data can be modeled in a way that reflects known empirical behaviors and respects basic statistical principles. We address this question by observing that edges, not vertices, act as the statistical units in networks constructed from interaction data, making a theory of edge-labeled networks more natural for many applications. In this context we introduce the concept of edge exchangeability, which unlike its vertex exchangeable counterpart admits models for net- works with sparse and/or power law structure. Our characterization of edge exchangeable networks gives rise to a class of nonparametric models, akin to graphon models in the vertex exchangeable setting. Within this class, we identify a tractable family of distributions with a clear interpretation and suitable theoretical properties, whose significance in estimation, prediction, and testing we demonstrate.

Pattern avoidance for random permutations

February 18, 2016 at Rutgers University, Experimental Mathematics Seminar


A classic question of enumerative combinatorics is: How many permutations of {1,...,n} avoid a given pattern? I recast this question in probabilistic terms: What is the probability that a randomly generated permutation of {1,...,n} avoids a given pattern?
I consider this question for the Mallows distribution on permutations, of which the uniform distribution is a special case. I discuss how the probabilistic technique of Poisson approximation can be applied to bound the total variation distance between the Poisson distribution and the distribution of the number of occurrences of a fixed pattern in a random permutation. In the special case of the uniform distribution, we obtain bounds on the number of pattern avoiding permutations of all finite sizes.

Random partitions and permutations

November 20, 2014 at Rutgers University, Experimental Mathematics Seminar


Historically, enumerative combinatorics and discrete probability theory are closely related through uniform probability distributions on finite sets. I will first explain why the uniform distribution is unnatural in some modern applications and then survey several aspects of non-uniform random partitions and permutations. The discussion touches on ideas from enumerative combinatorics, algebra, probability, and statistics. I assume no prior knowledge.

Random partitions and permutations (Part 1) from Experimental Mathematics on Vimeo.

Random partitions and permutations (Part 2) from Experimental Mathematics on Vimeo.