Data segmentation methods and algorithms
Abstract
With the rapid growth of information volume, almost all databases contain redundant or incomplete data. In data matching, quality issues are therefore particularly important. The quality indicators of a database are determined by a combination of data properties such as accuracy, completeness, relevance, viability, availability, and reliability. The segmentation methods and algorithms studied here can be used to match or remove duplicate attribute values. One of the main parts of segmentation is the processing of names and addresses using rule-based methods. Implementing segmentation based on this technology significantly increases productivity.
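As an illustration of rule-based segmentation of names and addresses, the following is a minimal sketch and not the method of the article itself. It assumes a record of the form "name, house number, street, locality". The lookup tables `TITLES` and `STREET_TYPES` and the functions `segment`, `match_key`, and `deduplicate` are hypothetical names introduced only for this example.

```python
import re

# Hypothetical lookup tables; a real system would need far larger ones.
TITLES = {"mr", "mrs", "ms", "dr", "prof"}
STREET_TYPES = {
    "st": "street", "street": "street",
    "rd": "road", "road": "road",
    "ave": "avenue", "avenue": "avenue",
}


def segment(record):
    """Split a 'name, address' string into labelled segments using simple rules."""
    tokens = re.findall(r"[a-z0-9]+", record.lower())
    out = {"title": "", "name": [], "number": "",
           "street": [], "street_type": "", "rest": []}
    state = "name"
    for tok in tokens:
        if state == "name" and tok in TITLES and not out["name"]:
            out["title"] = tok                      # rule: leading title word
        elif state == "name" and tok.isdigit():
            out["number"] = tok                     # rule: first number starts the address
            state = "street"
        elif state == "name":
            out["name"].append(tok)
        elif state == "street" and tok in STREET_TYPES:
            out["street_type"] = STREET_TYPES[tok]  # rule: normalise street type
            state = "rest"
        elif state == "street":
            out["street"].append(tok)
        else:
            out["rest"].append(tok)
    return out


def match_key(record):
    """Build a comparison key from the segments, ignoring the optional title."""
    seg = segment(record)
    return (tuple(seg["name"]), seg["number"], tuple(seg["street"]),
            seg["street_type"], tuple(seg["rest"]))


def deduplicate(records):
    """Keep the first record for each distinct match key."""
    seen = set()
    unique = []
    for rec in records:
        key = match_key(rec)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


records = [
    "Dr. John Smith, 12 Main St., Springfield",
    "john smith 12 main street springfield",
    "Jane Doe, 5 Oak Rd",
]
unique_records = deduplicate(records)
```

In this sketch the first two records reduce to the same key once the title is ignored and "St." is normalised to "street", so only one of them is kept. Exact key comparison is a deliberate simplification; it would not catch misspellings or reordered fields.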
This work is licensed under a Creative Commons Attribution 4.0 International License.