Harry Crane

Some experimental observations and open questions about the alpha-permanent

April 18, 2019 at Rutgers, Math Department, Experimental Mathematics Seminar

Abstract

The alpha-permanent is a matrix function that has a similar algebraic form to the determinant but exhibits very different computational behavior. The permanent (alpha=1) is known to be #P-complete, and is a fundamental object in computational complexity theory. The more general alpha-permanent appears in statistical models for point pattern data and combinatorial data (e.g., partition and permutation data), where its computational complexity limits its applied use in many cases. I'll discuss some algebraic properties of the alpha-permanent which suggest natural connections to familiar concepts in probability theory and statistics. I'll also describe some immediate research problems that arise out of these observations.
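For readers who want something concrete, here is a minimal sketch of the standard definition: the alpha-permanent of an n x n matrix A sums, over all permutations sigma, the weight alpha^(number of cycles of sigma) times the product of the entries A[i][sigma(i)]. The function names `cycle_count` and `alpha_permanent` are illustrative and not taken from the talk. The brute-force loop visits all n! permutations, so it is only usable for very small matrices.

```python
from itertools import permutations


def cycle_count(perm):
    """Count the cycles of a permutation given as a tuple mapping i -> perm[i]."""
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            j = start
            # Walk the cycle containing `start`, marking each element visited.
            while not seen[j]:
                seen[j] = True
                j = perm[j]
    return cycles


def alpha_permanent(matrix, alpha):
    """Brute-force alpha-permanent of a square matrix (list of lists)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        term = alpha ** cycle_count(perm)
        for i in range(n):
            term *= matrix[i][perm[i]]
        total += term
    return total


A = [[1, 2], [3, 4]]
permanent = alpha_permanent(A, 1)        # alpha = 1: the ordinary permanent
signed_det = alpha_permanent(A, -1)      # alpha = -1: (-1)^n times the determinant
```

Setting alpha = 1 recovers the permanent, and alpha = -1 recovers the determinant up to the sign (-1)^n, which is one way to see the "similar algebraic form" mentioned in the abstract.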

Some experimental observations and open questions about the alpha-permanent from Experimental Mathematics on Vimeo.

A Formal Model for Intuitive Probabilistic Reasoning

March 25, 2019 at Rutgers, Philosophy Department, Foundations of Probability Seminar

Abstract

I propose a formal framework for intuitive probabilistic reasoning (IPR). The proposed system aims to capture the informal, albeit logical, process by which individuals justify beliefs about uncertain claims in legal argument, mathematical conjecture, scientific theorizing, and common sense reasoning. The philosophical grounding and mathematical formalism of IPR take root in Brouwer's mathematical intuitionism, as formalized in intuitionistic Martin-Löf type theory (MLTT) and homotopy type theory (HoTT). Formally, IPR is distinct from more conventional treatments of subjective belief, such as Bayesianism and Dempster-Shafer theory. Conceptually, the approaches share some common motivations, and can be viewed as complementary.

Assuming no prior knowledge of intuitionistic logic, MLTT, or HoTT, I discuss the conceptual motivations for this new system, explain what it captures that the Bayesian approach does not, and outline some intuitive consequences that arise as theorems. Time permitting, I also discuss a formal connection between IPR and more traditional theories of decision under uncertainty, in particular Bayesian decision theory, Dempster-Shafer theory, and imprecise probability.

References:
H. Crane. (2018). The Logic of Probability and Conjecture. Researchers.One. https://www.researchers.one/article/2018-08-5.

H. Crane. (2019). Imprecise probabilities as a semantics for intuitive probabilistic reasoning. Researchers.One. https://www.researchers.one/article/2018-08-8.

H. Crane and I. Wilhelm. (2019). The Logic of Typicality. In Statistical Mechanics and Scientific Explanation: Determinism, Indeterminism and Laws of Nature (V. Allori, ed.). Available at Researchers.One.

Complex Data Analysis, the Replication Crisis, and Prediction Markets

December 6, 2018 at MITRE Corporation, Complexity Science Lecture Series

Abstract

This wide-ranging talk will discuss a number of technical, conceptual, and practical issues that limit our ability to understand complex systems.

On the technical side, I will first discuss how the predominant mathematical paradigm in which statistical theory and methods are currently implemented is not equipped to handle the complexity of modern data structures. I will then discuss how conceptual issues in probability theory and the role of centralization shed light on the emerging scientific replication crisis. Finally, I will discuss some practical solutions for how to leverage the decentralized mechanisms provided by prediction markets to design better systems.

Cognitum Episode 5 // Replication + Probability

December 28, 2018 on Cognitum with Iosif M Gershteyn

Abstract

Cognitum's Iosif M Gershteyn discusses the scientific replication crisis with Harry Crane, Professor at Rutgers University and Co-founder of Researchers.One. Crane provides a systems-level explanation of statistical errors and disincentives in the conventional paradigm, and the solution offered by the new peer review initiative at Researchers.One.

Discussion of Researchers.One

November 5, 2018, Department 12 IO Psych Podcast

Abstract

Harry Crane and Ryan Martin discuss Researchers.One on the Department 12 IO Psychology Podcast with Ben Butina.

Replication Crisis, Prediction Markets, and the Fundamental Principle of Probability

October 22, 2018 at Rutgers, Philosophy Department, Foundations of Probability Seminar

Abstract

I will discuss how ideas from the foundations of probability can be applied to resolve the current scientific replication crisis. I focus on two specific approaches:

1. Prediction Markets to incentivize more accurate assessments of the scientific community’s confidence that a result will replicate.
2. The Fundamental Principle of Probability to incentivize more accurate assessments of the author’s confidence that their own results will replicate.

I compare and contrast the merits and drawbacks of these two approaches.

The article associated with this talk can be found at https://researchers.one/article/2018-08-16.

Lecture Handout

Slides

The Shape of Probability

May 14, 2018 on The Jolly Swagmen podcast

Description

In this episode, Joe catches up with Harry in New Jersey to discuss the history of probability, how experts routinely misuse statistics, “The First Rule of Probability”, the replication crisis in science, and how Harry thinks about probability in shapes rather than numbers.

Link to Episode

Why "redefining statistical significance" will make the reproducibility crisis worse

November 27, 2017 at Rutgers, Philosophy Department, Foundations of Probability Seminar

Abstract

A recent proposal to "redefine statistical significance" (Benjamin et al., Nature Human Behaviour, 2017) claims that false positive rates "would immediately improve" by factors greater than two and fall as low as 5%, and that replication rates would double, simply by changing the conventional cutoff for 'statistical significance' from P<0.05 to P<0.005. I will survey several criticisms of this proposal and also analyze the veracity of these major claims, focusing especially on how Benjamin et al. neglect the effects of P-hacking in assessing the impact of their proposal. My analysis shows that once P-hacking is accounted for, the perceived benefits of the lower threshold all but disappear, prompting two main conclusions:

(i) The claimed improvements to false positive rate and replication rate in Benjamin et al. (2017) are exaggerated and misleading.

(ii) There are plausible scenarios under which the lower cutoff will make the replication crisis worse.

My full analysis can be downloaded here.
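The qualitative point can be illustrated with a toy calculation. The sketch below is not the analysis from the talk. It uses the textbook formula for the share of 'significant' findings that are false positives, together with a deliberately crude, assumed model of P-hacking in which a researcher runs a fixed number of independent tests of a true null hypothesis and reports any that cross the cutoff. The function names, the power value, and the prior probability of a true effect are all hypothetical.

```python
def effective_alpha(alpha, attempts):
    """Chance that at least one of `attempts` independent null tests has P < alpha.

    Assumed toy model of P-hacking; not the model used in the talk.
    """
    return 1.0 - (1.0 - alpha) ** attempts


def false_positive_rate(alpha, power, prior_true):
    """Share of significant results that are false positives.

    alpha: probability a true null is declared significant
    power: probability a real effect is declared significant
    prior_true: assumed prior probability that a tested effect is real
    """
    false_pos = alpha * (1.0 - prior_true)
    true_pos = power * prior_true
    return false_pos / (false_pos + true_pos)


# Hypothetical inputs chosen only for illustration.
power, prior_true = 0.8, 0.1

honest = false_positive_rate(0.005, power, prior_true)
hacked = false_positive_rate(effective_alpha(0.005, 10), power, prior_true)
print(f"P<0.005, no P-hacking:  {honest:.3f}")
print(f"P<0.005, 10 attempts:   {hacked:.3f}")
```

Under these assumptions the false positive rate at the stricter cutoff rises once repeated attempts are allowed, which is the kind of effect the abstract refers to. The size of the effect depends entirely on the assumed inputs.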

Coherence in Statistical Modeling of Networks

October 26, 2017 at Johns Hopkins, Applied Mathematics and Statistics Department

Abstract

George Box famously said, "All models are wrong, but some are useful." Classic texts define a statistical model as "a set of distributions on the sample space" (Cox and Hinkley, 1976; Lehmann, 1983; Barndorff-Nielsen and Cox, 1994; Bernardo and Smith, 1994). Motivated by some longstanding questions in the analysis of network data, I will examine both of these statements, first from a general point of view, and then in the context of some recent developments in network analysis. The confusion caused by these statements is clarified by the realization that the definition of statistical model must be refined: it must be more than just a set. With this, the ambiguity in Box's statement (e.g., what determines whether a model is 'wrong' or 'useful'?) can be clarified by a logical property that I call 'coherence'. After clarification, a model is deemed useful as long as it is coherent, i.e., inferences from it 'make sense'. I will then discuss some implications for the statistical modeling of network data.

Probabilities as Shapes

April 10, 2017 at Rutgers, Philosophy Department, Foundations of Probability Seminar

Abstract

In mathematics, statistics, and perhaps even in our intuition, it is conventional to regard probabilities as numbers, but I prefer instead to think of them as shapes. I'll explain how and why I prefer to think of probabilities as shapes instead of numbers, and will discuss how these probability shapes can be formalized in terms of infinity groupoids (or homotopy types) from homotopy type theory (HoTT).

Markov process models for time-varying networks

December 13, 2016 at Isaac Newton Institute, Cambridge University

Abstract

Many models for dynamic networks, such as the preferential attachment model, describe evolution by sequential addition of vertices and/or edges. Such models are not suited to networks whose connectivity varies over time, as in social relationships and other kinds of temporally varying interactions. For modeling in this latter setting, I develop the general theory of exchangeable Markov processes for time-varying networks and discuss relevant consequences.

Tamara Broderick (MIT) Research Misconduct - Edge Exchangeability

November 20, 2016

Abstract

A NIPS 2016 paper by Tamara Broderick, Diana Cai and Trevor Campbell claims credit for the idea of edge exchangeability with a misleading and improper attribution of my prior work with Walter Dempsey. Edge exchangeability was introduced over a year ago in a paper by Harry Crane and Walter Dempsey. Much of the material in the Broderick-Cai-Campbell paper resembles that of Crane-Dempsey without acknowledgement.

Edge exchangeability: a new foundation for modeling network data

July 25, 2016 at Isaac Newton Institute, Cambridge University

Abstract

Exchangeable models for countable vertex-labeled graphs cannot replicate the large sample behaviors of sparsity and power law degree distribution observed in many network datasets. Out of this mathematical impossibility emerges the question of how network data can be modeled in a way that reflects known empirical behaviors and respects basic statistical principles. We address this question by observing that edges, not vertices, act as the statistical units in networks constructed from interaction data, making a theory of edge-labeled networks more natural for many applications. In this context we introduce the concept of edge exchangeability, which unlike its vertex exchangeable counterpart admits models for networks with sparse and/or power law structure. Our characterization of edge exchangeable networks gives rise to a class of nonparametric models, akin to graphon models in the vertex exchangeable setting. Within this class, we identify a tractable family of distributions with a clear interpretation and suitable theoretical properties, whose significance in estimation, prediction, and testing we demonstrate.

Pattern avoidance for random permutations

February 18, 2016 at Rutgers University, Experimental Mathematics Seminar

Abstract

A classic question of enumerative combinatorics is: How many permutations of {1,...,n} avoid a given pattern? I recast this question in probabilistic terms: What is the probability that a randomly generated permutation of {1,...,n} avoids a given pattern?
I consider this question for the Mallows distribution on permutations, of which the uniform distribution is a special case. I discuss how the probabilistic technique of Poisson approximation can be applied to bound the total variation distance between the Poisson distribution and the distribution of the number of occurrences of a fixed pattern in a random permutation. In the special case of the uniform distribution, we obtain bounds on the number of pattern avoiding permutations of all finite sizes.
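The enumerative and probabilistic versions of the question can be compared directly for small n in the uniform case. The brute-force sketch below is an illustration only and does not implement the Poisson approximation or the Mallows distribution discussed in the talk. The function names `contains_pattern` and `avoidance_counts` are illustrative. A permutation contains a pattern of length k if some k of its entries, read left to right, appear in the same relative order as the pattern.

```python
from itertools import combinations, permutations


def contains_pattern(perm, pattern):
    """Return True if some subsequence of `perm` is order-isomorphic to `pattern`."""
    k = len(pattern)
    for positions in combinations(range(len(perm)), k):
        values = [perm[i] for i in positions]
        # The subsequence matches if every pair compares the same way as in the pattern.
        if all(
            (values[a] < values[b]) == (pattern[a] < pattern[b])
            for a in range(k)
            for b in range(k)
        ):
            return True
    return False


def avoidance_counts(n, pattern):
    """Return (number of pattern-avoiding permutations of 1..n, n!)."""
    avoiders = 0
    total = 0
    for perm in permutations(range(1, n + 1)):
        total += 1
        if not contains_pattern(perm, pattern):
            avoiders += 1
    return avoiders, total


avoiders, total = avoidance_counts(4, (1, 2, 3))
print(f"{avoiders} of {total} permutations of size 4 avoid 123")
print(f"uniform avoidance probability: {avoiders / total:.4f}")
```

Dividing the count of avoiders by n! gives the avoidance probability under the uniform distribution, so a bound on that probability translates into a bound on the count.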

Random partitions and permutations

November 20, 2014 at Rutgers University, Experimental Mathematics Seminar

Abstract

Historically, enumerative combinatorics and discrete probability theory are closely related through uniform probability distributions on finite sets. I will first explain why the uniform distribution is unnatural in some modern applications and then survey several aspects of non-uniform random partitions and permutations. The discussion touches on ideas from enumerative combinatorics, algebra, probability, and statistics. I assume no prior knowledge.

Random partitions and permutations (Part 1) from Experimental Mathematics on Vimeo.

Random partitions and permutations (Part 2) from Experimental Mathematics on Vimeo.

Rutgers University

Talks and Videos
