
Article

Title: Exploring the impact of post-training rounding in regression models (English)
Author: Kalina, Jan
Language: English
Journal: Applications of Mathematics
ISSN: 0862-7940 (print)
ISSN: 1572-9109 (online)
Volume: 69
Issue: 2
Year: 2024
Pages: 257-271
Summary lang: English
Category: math
Summary: Post-training rounding (quantization) of estimated parameters is a widely used technique for reducing the energy consumption and latency of machine learning models. This theoretical study examines the impact of rounding the estimated parameters in key regression methods from statistics and machine learning. The proposed approach models rounding as a perturbation of the parameters by an additive error with values in a specified interval. The approach is illustrated on linear regression and then extended, in a consistent way, to radial basis function networks, multilayer perceptrons, regularization networks, and logistic regression. (English)
Keyword: supervised learning
Keyword: trained model
Keyword: perturbations
Keyword: effect of rounding
Keyword: low-precision arithmetic
MSC: 62H12
MSC: 62M45
MSC: 68Q87
idZBL: Zbl 07893334
idMR: MR4728194
DOI: 10.21136/AM.2024.0090-23
Date available: 2024-04-04T12:11:27Z
Last updated: 2024-12-13
Stable URL: http://hdl.handle.net/10338.dmlcz/152315
Reference: [1] Agresti, A.: Foundations of Linear and Generalized Linear Models. Wiley Series in Probability and Statistics. John Wiley & Sons, Hoboken (2015). Zbl 1309.62001, MR 3308143
Reference: [2] Blokdyk, G.: Artificial Neural Network: A Complete Guide. 5STARCooks, Toronto (2021).
Reference: [3] Carroll, R. J., Ruppert, D., Stefanski, L. A., Crainiceanu, C. M.: Measurement Error in Nonlinear Models: A Modern Perspective. Monographs on Statistics and Applied Probability 105. Chapman & Hall/CRC, Boca Raton (2006). Zbl 1119.62063, MR 2243417, 10.1201/9781420010138
Reference: [4] Croci, M., Fasi, M., Higham, N. J., Mary, T., Mikaitis, M.: Stochastic rounding: Implementation, error analysis and applications. R. Soc. Open Sci. 9 (2022), Article ID 211631, 25 pages. 10.1098/rsos.211631
Reference: [5] Egrioglu, E., Bas, E., Karahasan, O.: Winsorized dendritic neuron model artificial neural network and a robust training algorithm with Tukey's biweight loss function based on particle swarm optimization. Granul. Comput. 8 (2023), 491-501. 10.1007/s41066-022-00345-y
Reference: [6] Fasi, M., Higham, N. J., Mikaitis, M., Pranesh, S.: Numerical behavior of NVIDIA tensor cores. PeerJ Computer Sci. 7 (2021), Article ID e330, 19 pages. 10.7717/peerj-cs.330
Reference: [7] Gao, F., Li, B., Chen, L., Shang, Z., Wei, X., He, C.: A softmax classifier for high-precision classification of ultrasonic similar signals. Ultrasonics 112 (2021), Article ID 106344, 8 pages. 10.1016/j.ultras.2020.106344
Reference: [8] Greene, W. H.: Econometric Analysis. Pearson Education, Harlow (2018).
Reference: [9] Hastie, T., Tibshirani, R., Wainwright, M.: Statistical Learning with Sparsity: The Lasso and Generalizations. Monographs on Statistics and Applied Probability 143. CRC Press, Boca Raton (2015). Zbl 1319.68003, MR 3616141, 10.1201/b18401
Reference: [10] Hoefler, T., Alistarh, D., Ben-Nun, T., Dryden, N., Peste, A.: Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks. J. Mach. Learn. Res. 22 (2021), Article ID 241, 124 pages. Zbl 07626756, MR 4329820
Reference: [11] Kalina, J., Tichavský, J.: On robust estimation of error variance in (highly) robust regression. Measurement Sci. Rev. 20 (2020), 6-14. 10.2478/msr-2020-0002
Reference: [12] Kalina, J., Vidnerová, P., Soukup, L.: Modern approaches to statistical estimation of measurements in the location model and regression. Handbook of Metrology and Applications. Springer, Singapore (2023), 2355-2376. 10.1007/978-981-99-2074-7_125
Reference: [13] Louizos, C., Reisser, M., Blankevoort, T., Gavves, E., Welling, M.: Relaxed quantization for discretized neural networks. Available at https://arxiv.org/abs/1810.01875 (2018), 14 pages. 10.48550/arXiv.1810.01875
Reference: [14] Maddox, W. J., Potapczynski, A., Wilson, A. G.: Low-precision arithmetic for fast Gaussian processes. Proc. Mach. Learn. Res. 180 (2022), 1306-1316.
Reference: [15] Nagel, M., Fournarakis, M., Amjad, R. A., Bondarenko, Y., Baalen, M. van, Blankevoort, T.: A white paper on neural network quantization. Available at https://arxiv.org/abs/2106.08295 (2021), 27 pages. 10.48550/arXiv.2106.08295
Reference: [16] Park, J.-H., Kim, K.-M., Lee, S.: Quantized sparse training: A unified trainable framework for joint pruning and quantization in DNNs. ACM Trans. Embedded Comput. Syst. 21 (2022), Article ID 60, 22 pages. 10.1145/3524066
Reference: [17] Pillonetto, G.: System identification using kernel-based regularization: New insights on stability and consistency issues. Automatica 93 (2018), 321-332. Zbl 1400.93316, MR 3810919, 10.1016/j.automatica.2018.03.065
Reference: [18] Riazoshams, H., Midi, H., Ghilagaber, G.: Robust Nonlinear Regression with Applications Using R. John Wiley & Sons, Hoboken (2019). Zbl 1407.62022, MR 3839600, 10.1002/9781119010463
Reference: [19] Saleh, A. K. M. E., Picek, J., Kalina, J.: R-estimation of the parameters of a multiple regression model with measurement errors. Metrika 75 (2012), 311-328. Zbl 1239.62081, MR 2909549, 10.1007/s00184-010-0328-2
Reference: [20] Seghouane, A.-K., Shokouhi, N.: Adaptive learning for robust radial basis function networks. IEEE Trans. Cybernetics 51 (2021), 2847-2856. 10.1109/TCYB.2019.2951811
Reference: [21] Shultz, K. S., Whitney, D., Zickar, M. J.: Measurement Theory in Action: Case Studies and Exercises. Routledge, New York (2020). 10.4324/9781315869834
Reference: [22] Šíma, J., Vidnerová, P., Mrázek, V.: Energy complexity model for convolutional neural networks. Artificial Neural Networks and Machine Learning -- ICANN 2023. Lecture Notes in Computer Science 14263. Springer, Cham (2023), 186-198. MR 4776700, 10.1007/978-3-031-44204-9_16
Reference: [23] Smucler, E., Yohai, V. J.: Robust and sparse estimators for linear regression models. Comput. Stat. Data Anal. 111 (2017), 116-130. Zbl 1464.62164, MR 3630222, 10.1016/j.csda.2017.02.002
Reference: [24] Sze, V., Chen, Y.-H., Yang, T.-J., Emer, J. S.: Efficient processing of deep neural networks: A tutorial and survey. Proc. IEEE 105 (2017), 2295-2329. MR 3784727, 10.1109/JPROC.2017.2761740
Reference: [25] Víšek, J.Á.: Consistency of the least weighted squares under heteroscedasticity. Kybernetika 47 (2011), 179-206. Zbl 1220.62064, MR 2828572
Reference: [26] Wang, N., Choi, J., Brand, D., Chen, C.-Y., Gopalakrishnan, K.: Training deep neural networks with 8-bit floating point numbers. NIPS'18: Proceedings of the 32nd International Conference on Neural Information Processing Systems. Curran Associates, New York (2018), 7686-7695. 10.5555/3327757.3327866
Reference: [27] Yan, W. Q.: Computational Methods for Deep Learning: Theory, Algorithms, and Implementations. Texts in Computer Science. Springer, Singapore (2023). Zbl 7783714, MR 4660076, 10.1007/978-981-99-4823-9
Reference: [28] Yu, J., Anitescu, M.: Multidimensional sum-up rounding for integer programming in optimal experimental design. Math. Program. 185 (2021), 37-76. Zbl 1458.62158, MR 4201708, 10.1007/s10107-019-01421-z
Reference: [29] Zhang, R., Wilson, A. G., Sa, C. De: Low-precision stochastic gradient Langevin dynamics. Proc. Mach. Learn. Res. 162 (2022), 26624-26644.

Fulltext not available (moving wall 24 months)
