[1] Barnett, N. S., Dragomir, S. S.:
A survey of recent inequalities for $\phi$-divergences of discrete probability distributions. In: Advances in Inequalities from Probability Theory and Statistics (N. S. Barnett and S. S. Dragomir, eds.), Nova Science Publishers, New York 2008, pp. 1-85.
MR 2459969
[2] Basseville, M.:
Divergence measures for statistical data processing -- An annotated bibliography. Signal Processing 93 (2013), 621-633.
[5] Böhm, U., Dahm, P. F., McAllister, B. F., Greenbaum, I. F.:
Identifying chromosomal fragile sites from individuals: a multinomial statistical model. Human Genetics 95 (1995), 249-256.
DOI 10.1007/BF00225189
[6] Chan, H., Darwiche, A.:
A distance measure for bounding probabilistic belief change. Int. J. Approx. Reasoning 38 (2005), 149-174.
MR 2116782
[8] Charalambous, C. D., Tzortzis, I., Loyka, S., Charalambous, T.:
Extremum problems with total variation distance and their applications. IEEE Trans. Automat. Control 59 (2014), 2353-2368.
MR 3254531
[9] Corander, J., Fraser, C., Gutmann, M. U., Arnold, B., Hanage, W. P., Bentley, S. D., Lipsitch, M., Croucher, N. J.:
Frequency-dependent selection in vaccine-associated pneumococcal population dynamics. Nature Ecology & Evolution 1 (2017), 1950-1960.
[10] Cover, T. M., Thomas, J. A.:
Elements of Information Theory. Second edition. John Wiley and Sons, New York 2006.
MR 2239987
[11] Cranmer, K., Brehmer, J., Louppe, G.:
The frontier of simulation-based inference. Proc. Natl. Acad. Sci. USA 117 (2020), 30055-30062.
MR 4263287
[12] Csiszár, I., Talata, Z.:
Context tree estimation for not necessarily finite memory processes, via BIC and MDL. IEEE Trans. Inform. Theory 52 (2006), 1007-1016.
MR 2238067
[13] Csiszár, I., Shields, P. C.: Information Theory and Statistics: A Tutorial. Now Publishers Inc., Delft 2004.
[14] Devroye, L.:
The equivalence of weak, strong and complete convergence in $L_1$ for kernel density estimates. Ann. Statist. 11 (1983), 896-904.
DOI 10.1214/aos/1176346255 | MR 0707939
[15] Diggle, P. J., Gratton, R. J.:
Monte Carlo methods of inference for implicit statistical models. J. R. Stat. Soc. Ser. B. Stat. Methodol. 46 (1984), 193-212.
MR 0781880
[16] Endres, D. M., Schindelin, J. E.:
A new metric for probability distributions. IEEE Trans. Inform. Theory 49 (2003), 1858-1860.
MR 1985590
[17] Fedotov, A. A., Harremoës, P., Topsøe, F.:
Refinements of Pinsker's inequality. IEEE Trans. Inform. Theory 49 (2003), 1491-1498.
MR 1984937
[18] Gibbs, A. L., Su, F. E.:
On choosing and bounding probability metrics. Int. Stat. Rev. 70 (2002), 419-435.
[19] Guntuboyina, A.:
Lower bounds for the minimax risk using $f$-divergences, and applications. IEEE Trans. Inform. Theory 57 (2011), 2386-2399.
MR 2809097
[20] Gutmann, M. U., Corander, J.:
Bayesian optimization for likelihood-free inference of simulator-based statistical models. J. Mach. Learn. Res. 17 (2016), 4256-4302.
MR 3555016
[21] Gyllenberg, M., Koski, T., Reilink, E., Verlaan, M.:
Non-uniqueness in probabilistic numerical identification of bacteria. J. Appl. Probab. 31 (1994), 542-548.
MR 1274807
[22] Gyllenberg, M., Koski, T.:
Numerical taxonomy and the principle of maximum entropy. J. Classification 13 (1996), 213-229.
MR 1421666
[23] Holopainen, I.: Evaluating Uncertainty with Jensen-Shannon Divergence. Master's Thesis, Faculty of Science, University of Helsinki 2021.
[24] Hou, C.-D., Chiang, J., Tai, J. J.:
Identifying chromosomal fragile sites from a hierarchical-clustering point of view. Biometrics 57 (2001), 435-440.
MR 1855677
[25] Janžura, M., Boček, P.:
A method for knowledge integration. Kybernetika 34 (1998), 41-55.
MR 1619054
[26] Jardine, N., Sibson, R.:
Mathematical Taxonomy. J. Wiley and Sons, London 1971.
MR 0441395
[27] Khosravifard, M., Fooladivanda, D., Gulliver, T. A.: Exceptionality of the variational distance. In: Proc. 2006 IEEE Information Theory Workshop (ITW '06), Chengdu 2006, pp. 274-276.
[28] Koski, T.: Probability Calculus for Data Science. Studentlitteratur, Lund 2020.
[29] Kůs, V.:
Blended $\phi$-divergences with examples. Kybernetika 39 (2003), 43-54.
MR 1980123
[30] Kůs, V., Morales, D., Vajda, I.:
Extensions of the parametric families of divergences used in statistical inference. Kybernetika 44 (2008), 95-112.
MR 2405058
[31] Le Cam, L.:
On the assumptions used to prove asymptotic normality of maximum likelihood estimates. Ann. Math. Statist. 41 (1970), 802-828.
MR 0267676
[32] Liese, F., Vajda, I.:
On divergences and informations in statistics and information theory. IEEE Trans. Inform. Theory 52 (2006), 4394-4412.
MR 2300826
[33] Li, K., Malik, J.: Implicit maximum likelihood estimation. arXiv preprint arXiv:1809.09087, 2018.
[34] Lin, J.:
Divergence measures based on the Shannon entropy. IEEE Trans. Inform. Theory 37 (1991), 145-151.
MR 1087893
[35] Lintusaari, J., Gutmann, M. U., Dutta, R., Kaski, S., Corander, J.: Fundamentals and recent developments in approximate Bayesian computation. Systematic Biology 66 (2017), e66-e82.
[36] Lintusaari, J., Vuollekoski, H., Kangasrääsiö, A., Skytén, K., Järvenpää, M., Marttinen, P., Gutmann, M. U., Vehtari, A., Corander, J., Kaski, S.:
ELFI: Engine for likelihood-free inference. J. Mach. Learn. Res. 19 (2018), 1-7.
MR 3862423
[37] Morales, D., Pardo, L., Vajda, I.:
Asymptotic divergence of estimates of discrete distributions. J. Statist. Plann. Inference 48 (1995), 347-369.
MR 1368984
[38] Nowozin, S., Cseke, B., Tomioka, R.: f-GAN: Training generative neural samplers using variational divergence minimization. In: Advances in Neural Information Processing Systems (2016), pp. 271-279.
[39] Okamoto, M.:
Some inequalities relating to the partial sum of binomial probabilities. Ann. Inst. Statist. Math. 10 (1959), 29-35.
MR 0099733
[40] Sason, I.:
On $f$-divergences: Integral representations, local behavior, and inequalities. Entropy 20 (2018), 383-405.
MR 3862573
[41] Sason, I., Verdú, S.:
$f$-divergence inequalities. IEEE Trans. Inform. Theory 62 (2016), 5973-6006.
MR 3565096
[42] Shannon, M.: Properties of $f$-divergences and $f$-GAN training. arXiv preprint arXiv:2009.00757, 2020.
[43] Sibson, R.:
Information radius. Z. Wahrsch. Verw. Geb. 14 (1969), 149-160.
MR 0258198
[44] Sinn, M., Rawat, A.: Non-parametric estimation of Jensen-Shannon divergence in generative adversarial network training. In: Proc. International Conference on Artificial Intelligence and Statistics (AISTATS 2018), pp. 642-651.
[45] Taneja, I. J.:
On mean divergence measures. In: Advances in Inequalities from Probability Theory and Statistics (N. S. Barnett and S. S. Dragomir, eds.), Nova Science Publishers, New York 2008, pp. 169-186.
MR 2459974
[46] Topsøe, F.:
Information-theoretical optimization techniques. Kybernetika 15 (1979), 8-27.
MR 0529888
[47] Topsøe, F.:
Some inequalities for information divergence and related measures of discrimination. IEEE Trans. Inform. Theory 46 (2000), 1602-1609.
MR 1768575
[48] Vajda, I.:
Note on discrimination information and variation (Corresp.). IEEE Trans. Inform. Theory 16 (1970), 771-773.
MR 0275575
[49] Vajda, I.: Theory of Statistical Inference and Information. Kluwer Academic Publishers, Dordrecht 1989.
[50] Vajda, I.:
On metric divergences of probability measures. Kybernetika 45 (2009), 885-900.
MR 2650071
[51] Yellott, J. I., Jr.:
The relationship between Luce's choice axiom, Thurstone's theory of comparative judgment, and the double exponential distribution. J. Math. Psych. 15 (1977), 109-144.
MR 0449795
[52] Österreicher, F., Vajda, I.:
Statistical information and discrimination. IEEE Trans. Inform. Theory 39 (1993), 1036-1039.
MR 1237725