This article defines an easy-to-use metric of success or figure of merit for classification in which the classes come endowed with a hierarchical taxonomy.
This article describes an implementation for Spark of principal component analysis, singular value decomposition, and low-rank approximation.
This article advocates starting with a higher-order method and finishing with a lowest-order method in many settings for machine learning, when generalization is important.
This monograph collects all our work on significance testing.
This article resolves many issues with the standard Hosmer-Lemeshow tests.
This article has major antecedents in the work of D. R. Cox.
This article provides a guide to choosing between the discrete Kolmogorov-Smirnov statistic and the root-mean-square statistic.
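As a rough illustration of the two statistics being compared, the following minimal sketch computes both for binned counts against a fully specified model distribution. The function names, the sample data, and the choice to leave both statistics unscaled are assumptions made for illustration; the article itself may normalize them differently.

```python
from itertools import accumulate
from math import sqrt


def root_mean_square(observed, model):
    """Root-mean-square difference between observed and model proportions."""
    n = sum(observed)
    diffs = [(count / n - p) ** 2 for count, p in zip(observed, model)]
    return sqrt(sum(diffs) / len(diffs))


def discrete_kolmogorov_smirnov(observed, model):
    """Largest absolute gap between observed and model cumulative distributions."""
    n = sum(observed)
    observed_cdf = accumulate(count / n for count in observed)
    model_cdf = accumulate(model)
    return max(abs(a - b) for a, b in zip(observed_cdf, model_cdf))


# Hypothetical data: 100 draws over four ordered bins, compared with a uniform model.
counts = [30, 20, 25, 25]
uniform = [0.25, 0.25, 0.25, 0.25]

rms = root_mean_square(counts, uniform)
ks = discrete_kolmogorov_smirnov(counts, uniform)
```

The cumulative sums make the Kolmogorov-Smirnov statistic depend on the ordering of the bins, whereas the root-mean-square is unchanged when the bins are permuted.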
This article provides an efficient numerical method for plotting the asymptotic power function (as a function of the significance level) of a root-mean-square test for goodness-of-fit, in the limit of large numbers of observations. This follows up on our earlier paper, "χ² and classical exact tests often wildly misreport significance; the remedy lies in computers," which is available below.
This article analyzes homogeneity in contingency tables (cross-tabulations) using the approach of our earlier paper, "χ² and classical exact tests often wildly misreport significance; the remedy lies in computers," which is available below.
This article is the leading and largest salvo in our crusade against the Pearson χ² test. This is the place to start.
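For readers unfamiliar with the statistic in question, the sketch below computes the per-bin terms of the Pearson χ² statistic. It illustrates one well-known property of the statistic, namely that dividing by the expected count lets a single draw in a rare bin dominate the total. The data and the function name are hypothetical, and this is not a summary of the article's full argument.

```python
def pearson_chi_square_terms(observed, model):
    """Per-bin terms (observed - expected)^2 / expected of the Pearson statistic."""
    n = sum(observed)
    return [(count - n * p) ** 2 / (n * p) for count, p in zip(observed, model)]


# Hypothetical data: 100 draws; the third bin has model probability 0.001.
counts = [50, 49, 1]
model = [0.5, 0.499, 0.001]

terms = pearson_chi_square_terms(counts, model)
statistic = sum(terms)

# Fraction of the total statistic contributed by the rare third bin.
rare_bin_share = terms[2] / statistic
```

In this example nearly all of the statistic comes from the one draw that landed in the rare bin.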
This article extends its predecessor (which is available here); the models in the new paper involve parameter estimation.
Many thanks to Professor F. W. J. Olver (R.I.P.) of the University of Maryland for pointing out formula 57.21.1 in E. R. Hansen's A Table of Series and Products, which provides a more general formulation of one of the analogues (thus obviating the need for publishing this technical report).