Segyeong Bang#, Soeun Kim#, Gaeun Ahn, Hyemin Hong, Junhyoung Oh*
CMC-Computers, Materials & Continua, Vol.85, No.3, pp. 4629-4643, 2025, DOI:10.32604/cmc.2025.068221
- 23 October 2025
Abstract This study analyzes the risks of re-identification in Korean text data and proposes a secure, ethical approach to data anonymization. Following the ‘Lee Luda’ AI chatbot incident, concerns over data privacy have increased. The Personal Information Protection Commission of Korea inspected AI services and uncovered 850 cases of personal information in user input datasets, which highlights the need for pseudonymization standards. Current anonymization techniques remove personal data such as names, phone numbers, and addresses, but linguistic features such as writing habits and language-specific traits can still identify individuals when combined with other data. To address this,…