Open Access

ARTICLE

Defending against Topological Information Probing for Online Decentralized Web Services

Xinli Hao1, Qingyuan Gong2, Yang Chen1,*
1 Shanghai Key Lab of Intelligent Information Processing, College of Computer Science and Artificial Intelligence, Fudan University, Shanghai, 200433, China
2 Research Institute of Intelligent Complex Systems, Fudan University, Shanghai, 200433, China
* Corresponding Author: Yang Chen.
(This article belongs to the Special Issue: Cyberspace Mapping and Anti-Mapping Techniques)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.073155

Received 11 September 2025; Accepted 31 October 2025; Published online 02 December 2025

Abstract

Topological information is essential for understanding many types of online web services, in particular online social networks (OSNs). It supports a range of applications, such as social relationship modeling, community detection, user profiling, and user behavior prediction. However, because this information is so useful for characterizing users, its leakage also poses severe challenges for preserving user privacy. Large-scale information probing based on web crawling is a representative way of obtaining the topological information of online web services. In this paper, we explore how to defend against topological information probing for online web services, with a particular focus on online decentralized web services such as Mastodon. Unlike traditional centralized web services, decentralized web services are federated, which makes distributed crawlers even harder to identify. We analyze the behavioral differences between legitimate users and crawlers in decentralized web services and highlight two key behavioral attributes that distinguish the two groups: instance interaction preferences and hop count in profile viewing patterns. Based on these insights, we propose a supervised machine learning-based framework for crawler detection that learns federation-aware feature representations for users. To validate the framework's effectiveness, we construct a labeled dataset that combines real users with simulated crawlers driven by real traces in Mastodon, and we use this dataset to train various supervised classifiers for crawler detection. Experimental results demonstrate that our framework achieves strong classification performance. Moreover, we observe that the federation-aware features are effective in improving detection performance.
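As a rough illustration of the two behavioral attributes named in the abstract, the following Python sketch derives a few per-user features from a list of viewed profiles. It is not the authors' implementation: the feature names (`remote_ratio`, `instance_entropy`, `mean_hops`), the `user@instance` account format, the follow-graph dictionary, and the `max_hops` cutoff are all assumptions made for this example.

```python
from collections import Counter, deque
from math import log2


def instance_of(account):
    """Return the instance part of a hypothetical 'user@instance' handle."""
    return account.rsplit("@", 1)[1]


def hop_distance(graph, src, dst, max_hops=6):
    """Shortest follow-graph distance from src to dst via breadth-first search.

    Returns max_hops + 1 when dst is not reached within max_hops.
    """
    if src == dst:
        return 0
    seen = {src}
    frontier = deque([(src, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_hops:
            continue  # do not expand beyond the cutoff
        for nxt in graph.get(node, ()):
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return max_hops + 1


def federation_features(viewer, viewed, graph):
    """Illustrative (assumed) federation-aware features for one viewer."""
    if not viewed:
        raise ValueError("viewed must contain at least one profile")
    home = instance_of(viewer)
    instances = [instance_of(a) for a in viewed]
    total = len(viewed)
    counts = Counter(instances)
    # Share of profile views that target an instance other than the viewer's own.
    remote_ratio = sum(1 for i in instances if i != home) / total
    # Shannon entropy (in bits) of the distribution of viewed instances.
    instance_entropy = -sum((c / total) * log2(c / total) for c in counts.values())
    # Average follow-graph distance from the viewer to each viewed profile.
    mean_hops = sum(hop_distance(graph, viewer, a) for a in viewed) / total
    return {
        "remote_ratio": remote_ratio,
        "instance_entropy": instance_entropy,
        "mean_hops": mean_hops,
    }
```

Feature vectors of this kind, paired with user or crawler labels, could then be passed to any supervised classifier. Which features and classifiers the paper actually uses is not stated in the abstract.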

Keywords

Anti-mapping; crawler detection; machine learning; decentralized online social networks