Open Access
ARTICLE
Using Outlier Detection to Identify Grey-Sheep Users in Recommender Systems: A Comparative Study
Center for Decision Making and Optimization, Department of Information Technology and Management, Illinois Institute of Technology, Chicago, IL 60616, USA
* Corresponding Author: Yong Zheng.
Computers, Materials & Continua 2025, 83(3), 4315-4328. https://doi.org/10.32604/cmc.2025.063498
Received 16 January 2025; Accepted 28 March 2025; Issue published 19 May 2025
Abstract
A recommender system is a tool designed to suggest relevant items to users based on their preferences and behaviors. Collaborative filtering, a popular technique within recommender systems, predicts user interests by analyzing interaction patterns and similarities between users, leveraging past behavior data to make personalized recommendations. Despite its popularity, collaborative filtering faces notable challenges, one of which is the issue of grey-sheep users, who have unusual tastes in the system. Surprisingly, existing research has not extensively explored outlier detection techniques to address the grey-sheep problem. To fill this research gap, this study conducts a comprehensive comparison of 12 outlier detection methods (such as LOF, ABOD, and HBOS) and introduces innovative user representations aimed at improving the identification of outliers within recommender systems. More specifically, we propose and examine three types of user representations: 1) the distribution statistics of user-user similarities, where similarities are calculated from users' rating vectors; 2) the distribution statistics of user-user similarities, where similarities are derived from users represented by latent factors; and 3) latent-factor vector representations. Our experiments on the MovieLens and Yahoo!Movie datasets demonstrate that user representations based on latent-factor vectors consistently facilitate the identification of more grey-sheep users when applying outlier detection methods.
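To make the first user representation concrete, the following is a minimal sketch, not the authors' implementation. It assumes cosine similarity over raw rating vectors (with 0 meaning unrated), a hypothetical choice of five summary statistics (mean, standard deviation, and quartiles), and a simplified histogram-based outlier score in the spirit of HBOS. All function names and the toy rating matrix are illustrative assumptions.

```python
import numpy as np


def cosine_similarity_matrix(ratings):
    """Pairwise cosine similarity between user rating vectors (rows)."""
    norms = np.linalg.norm(ratings, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # guard against users with no ratings
    unit = ratings / norms
    return unit @ unit.T


def similarity_statistics(sim):
    """Describe each user by the distribution of similarities to all other users."""
    n = sim.shape[0]
    # Drop the diagonal (self-similarity), leaving n - 1 values per user.
    others = sim[~np.eye(n, dtype=bool)].reshape(n, n - 1)
    q1, q2, q3 = np.percentile(others, [25, 50, 75], axis=1)
    return np.column_stack([others.mean(axis=1), others.std(axis=1), q1, q2, q3])


def histogram_outlier_scores(features, bins=5):
    """Simplified HBOS-style score: sum over features of -log(bin frequency)."""
    n, d = features.shape
    scores = np.zeros(n)
    for j in range(d):
        col = features[:, j]
        edges = np.linspace(col.min(), col.max(), bins + 1)
        # Assign each value to an equal-width bin; the maximum lands in the last bin.
        idx = np.clip(np.digitize(col, edges[1:-1]), 0, bins - 1)
        counts = np.bincount(idx, minlength=bins)
        scores += -np.log(counts[idx] / n)  # rare bins contribute large scores
    return scores


# Toy data (assumption): five users share one taste profile, one user reverses it.
ratings = np.array([[5, 4, 1, 1]] * 5 + [[1, 1, 5, 5]], dtype=float)

sim = cosine_similarity_matrix(ratings)
features = similarity_statistics(sim)
scores = histogram_outlier_scores(features)
candidate = int(np.argmax(scores))  # index of the most outlying user
```

In this toy setting the user with the reversed profile receives the highest score and would be flagged as a grey-sheep candidate. The second and third representations in the abstract would swap the input: similarities computed from latent-factor vectors, or the latent-factor vectors themselves passed directly to the outlier detector.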

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.