This article details how to calculate cumulative statistics for data whose scores (the independent variables in a regression) may not be unique: the cumulative statistics are calculated instead for a weighted data set whose scores are unique by construction.
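As a rough illustration of the idea (a minimal sketch, not the article's own code), observations that share a score can be merged into a single weighted observation whose response is the weighted average of the tied responses, so that every score in the resulting data set is unique. The function name `collapse_ties`, its arguments, and the use of NumPy are assumptions made for this example.

```python
import numpy as np


def collapse_ties(scores, responses, weights=None):
    """Merge observations with identical scores into weighted observations.

    Hypothetical helper for illustration only. Returns the sorted unique
    scores, the weighted mean response at each unique score, and the total
    weight at each unique score.
    """
    scores = np.asarray(scores, dtype=float)
    responses = np.asarray(responses, dtype=float)
    if weights is None:
        # Unweighted input: every observation counts once.
        weights = np.ones_like(scores)
    else:
        weights = np.asarray(weights, dtype=float)

    # `inverse` maps each observation to the index of its unique score.
    unique_scores, inverse = np.unique(scores, return_inverse=True)

    # Sum the weights and the weighted responses within each group of ties.
    total_weights = np.bincount(inverse, weights=weights)
    weighted_sums = np.bincount(inverse, weights=weights * responses)

    mean_responses = weighted_sums / total_weights
    return unique_scores, mean_responses, total_weights
```

Cumulative statistics can then be computed on the returned triple, with each unique score contributing in proportion to its total weight.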
This article shows the dramatic superiority of the cumulative metrics of Kuiper and of Kolmogorov and Smirnov over the canonical empirical calibration errors (also known as "estimated calibration errors").
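To make the comparison concrete, the following is a minimal sketch of how such metrics are commonly computed for calibration, assuming predicted probabilities as scores and observed outcomes as responses. The function names, the equal-width binning, and the 1/n normalization are assumptions of this example and may differ from the article's definitions.

```python
import numpy as np


def cumulative_metrics(scores, responses):
    """Kolmogorov-Smirnov and Kuiper metrics of the cumulative differences.

    Illustrative sketch: sorts by score, accumulates (response - score) / n,
    and measures the resulting path, which starts at zero.
    """
    scores = np.asarray(scores, dtype=float)
    responses = np.asarray(responses, dtype=float)
    order = np.argsort(scores, kind="stable")
    diffs = responses[order] - scores[order]
    # Prepend 0 so that the path includes its starting point.
    path = np.concatenate(([0.0], np.cumsum(diffs) / len(scores)))
    kolmogorov_smirnov = np.max(np.abs(path))  # largest absolute excursion
    kuiper = np.max(path) - np.min(path)       # total range of the path
    return kolmogorov_smirnov, kuiper


def binned_ece(scores, responses, nbins=10):
    """Empirical calibration error with equal-width bins on [0, 1]."""
    scores = np.asarray(scores, dtype=float)
    responses = np.asarray(responses, dtype=float)
    # Assign each score to a bin; a score of exactly 1 goes in the last bin.
    bins = np.minimum((scores * nbins).astype(int), nbins - 1)
    ece = 0.0
    for b in np.unique(bins):
        mask = bins == b
        # Weight each bin by the fraction of observations that it holds.
        gap = abs(responses[mask].mean() - scores[mask].mean())
        ece += mask.mean() * gap
    return ece
```

The cumulative metrics require no choice of bins, whereas the value returned by `binned_ece` changes with `nbins`.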
This article details calibration of attained significance levels (also known as "P-values") for formal tests of significance related to recently developed analogues of the Kolmogorov-Smirnov and Kuiper metrics. The latest version merges in another paper, "Ties in ranking scores can be treated as weighted samples."
This article proposes a fully non-parametric method for conditioning on multiple covariates when assessing differences between subpopulations (or between a subpopulation and the full population).
This article concerns the ethics and morality of algorithms and computational systems, and has been circulating internally at Facebook for the past couple of years. The paper reviews many Nobel laureates' work, as well as the work of other prominent scientists such as Richard Dawkins, Andrei Kolmogorov, Vilfredo Pareto, and John von Neumann. The article argues that the standard approach to modern machine learning and artificial intelligence is bound to be biased and unfair, and that longstanding traditions in the professions of law, justice, politics, and medicine should help.
This article advocates starting with a higher-order method and finishing with a lowest-order method in many settings for machine learning, when generalization is important.
This monograph collects all our work on significance testing.
This article resolves many issues with the standard Hosmer-Lemeshow tests.
This article has major antecedents in the work of D. R. Cox.
This article provides a guide to choosing between the discrete Kolmogorov-Smirnov statistic and the root-mean-square.
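For orientation, here is a minimal sketch of the two statistics for observed counts over ordered categories and a fully specified model distribution. The function names and the normalizations are assumptions of this example; the article may scale either statistic differently.

```python
import numpy as np


def discrete_ks_statistic(counts, probs):
    """Maximum absolute gap between empirical and model cumulative sums.

    Depends on the ordering of the categories.
    """
    counts = np.asarray(counts, dtype=float)
    probs = np.asarray(probs, dtype=float)
    empirical = counts / counts.sum()
    return np.max(np.abs(np.cumsum(empirical - probs)))


def root_mean_square_statistic(counts, probs):
    """Root-mean-square gap between empirical and model proportions.

    Does not depend on the ordering of the categories.
    """
    counts = np.asarray(counts, dtype=float)
    probs = np.asarray(probs, dtype=float)
    empirical = counts / counts.sum()
    return np.sqrt(np.mean((empirical - probs) ** 2))
```

One structural difference is visible directly in the code: permuting the categories can change the discrete Kolmogorov-Smirnov statistic but leaves the root-mean-square unchanged.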
This article provides an efficient numerical method for plotting the asymptotic power function of a root-mean-square test for goodness-of-fit in the limit of large numbers of observations (as a function of the significance level). This follows up on our earlier paper, "χ² and classical exact tests often wildly misreport significance; the remedy lies in computers," which is available below.
This article analyzes homogeneity in contingency-tables/cross-tabulations using the approach of our earlier paper, "χ² and classical exact tests often wildly misreport significance; the remedy lies in computers," which is available below.
This article is the leading and largest salvo in our crusade against the Pearson χ² test. This is the place to start.
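The title of that paper, quoted in the entries above, names computers as the remedy. One standard way to use them, sketched here under stated assumptions rather than as the paper's own procedure, is to estimate a P-value by simulating draws from the model instead of relying on an asymptotic approximation. The function names, the fully specified multinomial model, and the default number of draws are assumptions of this example.

```python
import numpy as np


def pearson_statistic(counts, probs):
    """Classical Pearson chi-square statistic for a fully specified model."""
    counts = np.asarray(counts, dtype=float)
    probs = np.asarray(probs, dtype=float)
    expected = counts.sum() * probs
    return np.sum((counts - expected) ** 2 / expected)


def monte_carlo_pvalue(statistic, counts, probs, ndraws=10000, seed=0):
    """Estimate a P-value for any statistic by simulating from the model.

    `statistic` is a function of (counts, probs) for which larger values
    indicate greater discrepancy from the model.
    """
    counts = np.asarray(counts)
    probs = np.asarray(probs, dtype=float)
    rng = np.random.default_rng(seed)
    n = int(counts.sum())
    observed = statistic(counts, probs)
    # Each row is one simulated table of counts drawn from the model.
    draws = rng.multinomial(n, probs, size=ndraws)
    simulated = np.array([statistic(draw, probs) for draw in draws])
    # Fraction of simulated tables at least as discrepant as the observed one.
    return np.mean(simulated >= observed)
```

Because `monte_carlo_pvalue` takes the statistic as an argument, the same simulation applies unchanged to alternatives to the Pearson statistic, such as a root-mean-square statistic.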
This article extends its predecessor (which is available here); the models in the new paper involve parameter estimation.
Many thanks to Professor F. W. J. Olver (R.I.P.) of the University of Maryland for pointing out formula 57.21.1 in E. R. Hansen's A Table of Series and Products, which provides a more general formulation of one of the analogues (thus obviating the need for publishing this technical report).