# The optimization and statistical models and methods in recognizing properties of data sets measured with errors

OSMoMeSIP-IP-2016-06-6545

This project is financed by Croatian Science Foundation.

Project duration: 1. 3. 2017. - 28. 2. 2021.

### PROJECT DESCRIPTION

As part of the active research area known as big data analysis, the project will analyze optimization and statistical aspects of recognizing properties of data sets. The research will focus on clustering problems, deconvolution models, and their applications. The observed data are assumed to represent measured values of the variables of interest and to contain measurement error. For large data sets it is often appropriate to cluster the data on the basis of certain characteristics and then to apply, within each group, a specific model that describes properties of the variables, such as the relationships among them, the possibility of separation, edges, the specific form of the set of values, its dimensions (length, area, or volume), or a general parameter vector that determines them. In many practical situations the problem can be formulated as an optimization problem whose objective function is generally neither differentiable nor convex. To solve such problems effectively, fast and accurate numerical procedures will be developed. Because the data contain errors, statistical models will also be used to understand and correctly interpret the results, and important statistical properties will be characterized.

### PROJECT RESEARCHERS

**Principal investigator:** Prof. Rudolf Scitovski, Department of Mathematics, University of Osijek, Croatia

**Project members:** Prof. Andrew Barron (Yale University, USA), Prof. Mirta Benšić (Department of Mathematics, University of Osijek, Croatia), Prof. Dragan Jukić (Department of Mathematics, University of Osijek, Croatia), Prof. Kristian Sabo (Department of Mathematics, University of Osijek, Croatia), Assistant Prof. Karlo Emanuel Nyarko (Faculty of Electrical Engineering, Computer Science and Information Technology Osijek, University of Osijek, Croatia), Safet Hamedović (Faculty of Metallurgy and Materials, University of Zenica, Bosnia and Herzegovina), Dr. Petar Taler (Department of Mathematics, University of Osijek, Croatia) and Una Radojičić (PhD student, Department of Mathematics, University of Osijek, Croatia).

### ACTIVITIES

**Journal Publications (published or accepted)**

1. R. Scitovski, A new global optimization method for a symmetric Lipschitz continuous function and the application to searching for a globally optimal partition of a one-dimensional set, Journal of Global Optimization **68** (2017), 713-727, DOI: 10.1007/s10898-017-0510-4 (SCIE, Mathematics, Applied)

2. D. Jukić, An elementary proof of the quadratic envelope characterization of zero-derivative points, Optimization Letters **12** (2018), 1155-1156 (SCIE, Mathematics, Applied)

3. M. Benšić, P. Taler, S. Hamedović, E.K. Nyarko, K. Sabo, LeArEst: Length and Area Estimation from Data Measured with Additive Error, The R Journal, **9**/2 (2017), 461-473 (SCIE, Probability & Statistics)

4. S. Hamedović, M. Benšić, K. Sabo, P. Taler, Estimating the size of an object captured with error, Cent Eur J Oper Res **26**/3 (2018), 771-781, (SCIE, Operations Research & Management Science)

5. R. Scitovski, M. Vinković, K. Sabo, A. Kozić, A research project ranking method based on independent reviews by using the principle of the distance to the perfectly assessed project, Croatian Operational Research Review **8** (2017), 429-442 (Web of Science Emerging Sources Citation Index (ESCI))

6. L. Jakobek, P. Matić, V. Krešić, A. R. Barron, Adsorption of Apple Polyphenols onto β-Glucan, Czech J. Food Sci. **6** (2017), 476–482 (SCIE)

7. A. Barron, M. Benšić, K. Sabo, A Note on Weighted Least Square Distribution Fitting and Full Standardization of the Empirical Distribution Function, TEST **27**/4 (2018), 946-967 (SCIE, Statistics & Probability)

8. R. Scitovski, K. Sabo, Application of the DIRECT algorithm to searching for an optimal k-partition of the set $\mathcal{A}\subset\mathbb{R}^n$ and its application to the multiple circle detection problem, Journal of Global Optimization **74**/1 (2019), 63-77 (SCIE, Mathematics, Applied)

9. R. Scitovski, U. Radojičić, K. Sabo, A fast and efficient method for solving the multiple line detection problem, Rad HAZU, Matematičke znanosti (Web of Science Emerging Sources Citation Index (ESCI), MRcc), **23** (2019), 123-140

10. R. Scitovski, K. Sabo, DBSCAN-like clustering method for various data densities, Pattern Analysis and Applications (SCIE), 2019, accepted

11. M. Zekić Sušac, M. Knežević, R. Scitovski, Modeling the cost of energy in public sector buildings by linear regression and deep learning (SCIE, Operations Research & Management Science), 2019, accepted

12. R. Scitovski, K. Sabo, The adaptation of the k-means algorithm to solving the multiple ellipses detection problem by using an initial approximation obtained by the DIRECT global optimization algorithm, Applications of Mathematics **64**/6 (2019), 663-678 (SCIE, Mathematics, Applied)

13. S. Hamedović, M. Benšić, K. Sabo, Estimating the width of a uniform distribution under symmetric measurement errors, Journal of the Korean Statistical Society (SCIE, Statistics & Probability), 2019, accepted

14. R. Scitovski, K. Sabo, A combination of k-means and DBSCAN algorithm for solving the multiple generalized circle detection problem, Advances in Data Analysis and Classification (SCIE, Statistics & Probability), 2020, accepted

15. D. Jukić, A necessary and sufficient criterion for the existence of the global minima of a continuous lower bounded function on a noncompact set, Journal of Computational and Applied Mathematics, (SCIE, Mathematics, Applied), 2020, accepted

**Software**

M. Benšić, S. Hamedović, K. Sabo, P. Taler, LeArEst R software package (published on CRAN).

**Conference proceedings**

1. P. Taler, S. Hamedović, M. Benšić, E.K. Nyarko, LeArEst - The Software for Border and Area Estimation of Data Measured with Additive Error, 59th International Symposium ELMAR-2017, Zadar, 2017, 259-263

2. D. Jukić, K. Sabo, An existence criterion for the sum of squares. In: Zadnik Stirn L, Kljajić Borštnar M, Žerovnik J, Drobne S, Povh J (eds) Proceedings of the 15th International Symposium on Operational Research SOR'19 (Bled, September 25-27, 2019), 500-505

3. U. Radojičić, R. Scitovski, K. Sabo, A Fast and Efficient Method for Solving the Multiple Closed Curve Detection Problem, In Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, 269-276, 2019, Prague, Czech Republic

**Technical Reports (published or accepted papers)**

1. R. Scitovski, A new global optimization method for a symmetric Lipschitz continuous function and the application to searching for a globally optimal partition of a one-dimensional set (pdf)

2. D. Jukić, An elementary proof of the quadratic envelope characterization of zero-derivative points (pdf)

3. P. Taler, S. Hamedović, M. Benšić, K. E. Nyarko, LeArEst - The Software for Border and Area Estimation of Data Measured with Additive Error (pdf)

4. S. Hamedović, M. Benšić, K. Sabo, P. Taler, Estimating the size of an object captured with error (pdf)

5. R. Scitovski, M. Vinković, K. Sabo, A. Kozić, A research project ranking method based on independent reviews by using the principle of the distance to the perfectly assessed project (pdf)

6. L. Jakobek, P. Matić, V. Krešić, A. R. Barron, Adsorption of Apple Polyphenols onto β-Glucan (pdf)

7. A. Barron, M. Benšić, K. Sabo, A Note on Weighted Least Square Distribution Fitting and Full Standardization of the Empirical Distribution Function (pdf)

8. R. Scitovski, K. Sabo, Application of the DIRECT algorithm to searching for an optimal k-partition of the set $\mathcal{A}\subset\mathbb{R}^n$ and its application to the multiple circle detection problem (pdf)

9. R. Scitovski, U. Radojičić, K. Sabo, A fast and efficient method for solving the multiple line detection problem (pdf)

10. R. Scitovski, K. Sabo, DBSCAN-like clustering method for various data densities (pdf)

11. R. Scitovski, K. Sabo, Application of the DIRECT algorithm to solving the multiple ellipse detection problem (pdf)

12. S. Hamedović, M. Benšić, K. Sabo, Estimating the width of a uniform distribution under symmetric measurement errors (pdf)

**Conferences and seminars**

1. ELMAR 2017, Zadar, September 2017: P. Taler, S. Hamedović, M. Benšić, K. E. Nyarko, LeArEst - The Software for Border and Area Estimation of Data Measured with Additive Error (slides)

2. Seminar for optimization and applications, Department of Mathematics, University of Osijek, December, 2017: R. Scitovski, A method for solving the multiple ellipses detection problem

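3. Seminar for optimization and applications, Department of Mathematics, University of Osijek, January, 2018: M. Benšić, Procjena distribucijskih parametara generaliziranom metodom najmanjih kvadrata i standardizacija empirijske distribucije [Estimation of distribution parameters by the generalized least squares method and standardization of the empirical distribution]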

4. Statistical seminar, Department of Mathematics, University of Osijek, March, 2018: A. Barron, Proper Statistical Fitting of Adsorption Isotherms

5. ISSCRO'18 (2nd International Statistical Conference in CROatia), Opatija, May 2018: U. Radojičić, Application of Adaptive Annealing method to generalized incremental algorithm (slides)

6. ISSCRO'18 (2nd International Statistical Conference in CROatia), Opatija, May 2018: M. Benšić, K. Sabo, S. Hamedović, The width of a uniform distribution: estimation in additive error models (slides)

7. Euro-Global Conference on Food Science, Agronomy and Technology Food Science, 20th and 22nd September 2018, Rome, Italy, L. Jakobek, P. Matić, A. R. Barron, The Application Of Adsorption Isotherms With Proper Fitting To Interpret Polyphenol Bioaccessibility In Vitro (poster)

8. KOI2018 (17th International Conference on Operational Research), Zadar, September 26–28, 2018, P. Nikić, R. Scitovski, K. Sabo, S. Majstorović, A fast algorithm for solving the multiple ellipse detection problem (slides)

9. ICAMCS2018 (International Conference on Applied Mathematics & Computational Science), Budapest, October 6–8, 2018, R. Scitovski, K. Sabo, A fast and efficient method for solving the multiple generalized circle detection problem (slides)

10. Statistical seminar, Department of Mathematics, University of Osijek, December, 2018: S. Jelić, K. Sabo, Modeli kratkoročne prognoze koncentracije peludi bazirani na strojnom učenju [Short-term pollen concentration forecasting models based on machine learning] (slides)

11. Seminar for optimization and applications, Department of Mathematics, University of Osijek, December, 2018: R. Scitovski, Prepoznavanje nekih geometrijskih objekata u ravnini [Recognition of some geometric objects in the plane] (slides)

12. Statistical seminar, Department of Mathematics, University of Osijek, February, 2019: U. Radojičić, *Algoritmi za inicijalizaciju Gaussovih miješanih modela I, II* [Algorithms for initializing Gaussian mixture models I, II]

13. Women in data science conference Croatia Osijek, March, 2019: M. Benšić, K. Sabo, P. Taler, S. Hamedović, *Određivanje preciznih mjera objekta iz zašumljenih podataka* [Determining precise object measurements from noisy data] (slides)

14. BIOSTAT 2019, June, 2019, Andrew R. Barron, Lidija Jakobek Barron, Mirta Benšić, Petra Matić, *Statistical fitting of adsorption isotherms* (slides)

15. The 18th Conference of the Applied Stochastic Models and Data Analysis International Society (ASMDA2019), Mirta Benšić, Kristian Sabo, Safet Hamedović, *Estimating the width of uniform distribution under measurement errors* (slides)

16. 21st European Young Statisticians Meeting, Belgrade, 29 July - 02 August 2019, Una Radojičić, *Algorithms for initialization of Gaussian Mixture Models* (slides)

17. The 15th International Symposium on Operations Research in Slovenia, 25th – 27th September 2019, Bled, Slovenia, Dragan Jukić, Kristian Sabo, *An existence criterion for the sum of squares* (slides)

**Project promotion**

Sveučilišni glasnik, No 27, July 14, 2017 (page 11): Optimizacijski i statistički modeli prepoznavanje svojstava skupova podataka izmjerenih s pogreškama [Optimization and statistical models for recognizing properties of data sets measured with errors]