Mark Tygert's homepage

Selected technical reports

  1. Mark Tygert, "Ties in ranking scores can be treated as weighted samples," Technical Report 2207.13632, arXiv, July 2022: pdf.

    This article details how to calculate cumulative statistics for data with scores (independent variables in a regression) that may not be unique by instead calculating the cumulative statistics for a weighted data set with scores that are unique by construction.

  2. Imanol Arrieta Ibarra, Paman Gujral, Jonathan Tannen, Mark Tygert, and Cherie Xu, "Metrics of calibration for probabilistic predictions," Technical Report 2205.09680, arXiv, May 2022: pdf.

    This article shows the dramatic superiority of the cumulative metrics of Kuiper and of Kolmogorov and Smirnov over the canonical empirical calibration errors (also known as "estimated calibration errors").

  3. Mark Tygert, "Calibration of P-values for calibration and for deviation of a subpopulation from the full population," Technical Report 2202.00100, arXiv, February 2022: pdf.

    This article details calibration of attained significance levels (also known as "P-values") for formal tests of significance related to recently developed analogues of the Kolmogorov-Smirnov and Kuiper metrics.

  4. Mark Tygert, "Controlling for multiple covariates," Technical Report 2112.00672, arXiv, December 2021: pdf.

    This article proposes a fully non-parametric method for conditioning on multiple covariates when assessing differences between subpopulations (or between a subpopulation and the full population).

  5. Isabel Kloumann and Mark Tygert, "An optimizable scalar objective value cannot be objective and should not be the sole objective," Technical Report 2006.02577, arXiv, June 2020: pdf.

    This article concerns the ethics and morality of algorithms and computational systems, and has been circulating internally at Facebook for the past couple of years. The paper reviews many Nobel laureates' work, as well as the work of other prominent scientists such as Richard Dawkins, Andrei Kolmogorov, Vilfredo Pareto, and John von Neumann. The article argues that the standard approach to modern machine learning and artificial intelligence is bound to be biased and unfair, and that longstanding traditions in the professions of law, justice, politics, and medicine should help.

  6. Aaron Defazio and Mark Tygert, "Methods of interpreting error estimates for grayscale image reconstructions," Technical Report 1902.00608, arXiv, February 2019: pdf.

    This article explores visualizations of error estimates for medical imaging.

  7. Mark Tygert, Rachel Ward, and Jure Zbontar, "Compressed sensing with a jackknife and a bootstrap," Technical Report 1809.06959, arXiv, September 2018: pdf.

    This article simulates on a computer what could have happened with measurements that we might have taken but actually did not, given what happened with the measurements that we did in fact make.

  8. Mark Tygert, "Poor starting points in machine learning," Technical Report 1602.02823, arXiv, January 2016: pdf.

    This article advocates starting with a higher-order method and finishing with a lowest-order method in many settings for machine learning, when generalization is important.

  9. William Perkins, Mark Tygert, and Rachel Ward, "Computer-enabled metrics of statistical significance for discrete data," May 2014: pdf.

    This monograph collects all our work on significance testing.

  10. Mark Tygert and Rachel Ward, "Testing goodness-of-fit for logistic regression," Technical Report 1306.0959, arXiv, June 2013: pdf.

    This article resolves many issues with the standard Hosmer-Lemeshow tests.

  11. William Perkins, Mark Tygert, and Rachel Ward, "Significance testing without truth," Technical Report 1301.1208, arXiv, January 2013: pdf.

    This article has major antecedents in the work of D. R. Cox.

  12. Jacob Carruth, Mark Tygert, and Rachel Ward, "A comparison of the discrete Kolmogorov-Smirnov statistic and the Euclidean distance," Technical Report 1206.6367, arXiv, June 2012: pdf.

    This article provides a guide to choosing between the discrete Kolmogorov-Smirnov statistic and the root-mean-square.

  13. William Perkins, Gary Simon, and Mark Tygert, "Computing the asymptotic power of a Euclidean-distance test for goodness-of-fit," Technical Report 1206.6378, arXiv, June 2012: pdf.

    This article provides an efficient numerical method for plotting the asymptotic power function of a root-mean-square test for goodness-of-fit in the limit of large numbers of observations (as a function of the significance level). This follows up on our earlier paper, "χ² and classical exact tests often wildly misreport significance; the remedy lies in computers," which is available below.

  14. Mark Tygert, "Testing the significance of assuming homogeneity in contingency-tables/cross-tabulations," Technical Report 1201.1421, arXiv, January 2012: pdf.

    This article analyzes homogeneity in contingency-tables/cross-tabulations using the approach of our earlier paper, "χ² and classical exact tests often wildly misreport significance; the remedy lies in computers," which is available below.

  15. William Perkins, Mark Tygert, and Rachel Ward, "χ² and classical exact tests often wildly misreport significance; the remedy lies in computers," Technical Report 1108.4126, arXiv, August 2011; updated, abbreviated version: pdf; extended version: pdf.

    This article is the leading and largest salvo in our crusade against the Pearson χ² test. This is the place to start.

  16. William Perkins, Mark Tygert, and Rachel Ward, "Computing the confidence levels for a root-mean-square test of goodness-of-fit, II," Technical Report 1009.2260, arXiv, September 2010: pdf, ps.

    This article extends its predecessor (which is available here); the models in the new paper involve parameter estimation.

  17. Mark Tygert, "Analogues for Bessel functions of the Christoffel-Darboux identity," Technical Report 1351, Yale University, Department of Computer Science, March 2006: pdf, ps.

    Many thanks to Professor F. W. J. Olver (R.I.P.) of the University of Maryland for pointing out formula 57.21.1 in E. R. Hansen's A Table of Series and Products, which provides a more general formulation of one of the analogues (thus obviating the need for publishing this technical report).
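
As a rough illustration of the idea summarized under report 1 above (replacing data whose scores may be tied by a weighted data set whose scores are unique by construction), the following minimal sketch merges all observations sharing a score into a single weighted observation and then accumulates a simple cumulative statistic over the merged data. The function names, the choice of a weighted-average response for each merged observation, and the particular cumulative sum are all assumptions made for illustration; they are not taken from the report and may differ from its actual procedure.

```python
from collections import defaultdict


def collapse_ties(scores, responses, weights=None):
    """Merge observations that share a score into one weighted observation.

    Hypothetical helper: each merged observation carries the total weight
    of its tied members and the weighted average of their responses.
    """
    if weights is None:
        weights = [1.0] * len(scores)
    total_weight = defaultdict(float)
    weighted_sum = defaultdict(float)
    for s, r, w in zip(scores, responses, weights):
        total_weight[s] += w
        weighted_sum[s] += w * r
    # Sorting the distinct scores yields strictly increasing, unique scores.
    unique_scores = sorted(total_weight)
    merged_weights = [total_weight[s] for s in unique_scores]
    merged_responses = [weighted_sum[s] / total_weight[s] for s in unique_scores]
    return unique_scores, merged_responses, merged_weights


def cumulative_weighted_responses(responses, weights):
    """Running sum of weight * response, normalized by the total weight.

    A stand-in for a cumulative statistic, chosen only for illustration.
    """
    total = sum(weights)
    running = 0.0
    out = []
    for r, w in zip(responses, weights):
        running += w * r
        out.append(running / total)
    return out


# Toy data: three observations are tied at the score 0.5.
scores = [0.2, 0.5, 0.5, 0.5, 0.9]
responses = [0.0, 1.0, 0.0, 1.0, 1.0]

u, r, w = collapse_ties(scores, responses)
c = cumulative_weighted_responses(r, w)
print(u, w, c)
```

In this toy data, the three observations tied at the score 0.5 become one observation of weight 3, so the order in which the tied observations happened to be listed no longer affects the cumulative sum.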