Open Access

ARTICLE

Webpage Matching Based on Visual Similarity

Mengmeng Ge1, Xiangzhan Yu1,*, Lin Ye1,2, Jiantao Shi1

1 School of Cyberspace Science, Harbin Institute of Technology, Harbin, 150001, China
2 Department of Computer and Information Science, Temple University, Philadelphia, 42101, USA

* Corresponding Author: Xiangzhan Yu.

Computers, Materials & Continua 2022, 71(2), 3393-3405. https://doi.org/10.32604/cmc.2022.017220

Abstract

With the rapid development of the Internet, webpages have become far more varied than in previous decades. At the same time, people face growing network security risks and heavy losses from phishing webpages, which imitate the interface of legitimate webpages to deceive victims. To better identify phishing webpages, a visual feature extraction method and a visual similarity algorithm are proposed. First, the visual feature extraction method improves the Vision-based Page Segmentation (VIPS) algorithm to extract visual blocks and computes a signature for each block using perceptual hashing. Second, the visual similarity algorithm establishes a one-to-one correspondence between visual blocks based on their coordinates and thresholds. Weights are then assigned according to the tree structure, and the similarity of each pair of visual blocks is calculated from the Hamming distance between their visual features. The visual similarity of two webpages is obtained by integrating the similarities and weights of their visual blocks. Finally, multiple pairs of phishing and legitimate webpages are evaluated to verify the feasibility of the algorithm. The experimental results show that our method achieves 94% accuracy.
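The similarity computation outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 64-bit signature length, the formula `1 - distance / bits` for block similarity, the normalized weighted average for page similarity, and all function names are assumptions made for illustration only. The block matching step (by coordinates and thresholds) and the tree-based weight assignment are taken as given inputs.

```python
from typing import Sequence, Tuple


def hamming_distance(sig_a: int, sig_b: int) -> int:
    """Count the bit positions in which two integer signatures differ."""
    return bin(sig_a ^ sig_b).count("1")


def block_similarity(sig_a: int, sig_b: int, bits: int = 64) -> float:
    """Map Hamming distance to [0, 1]; 1.0 means identical signatures.

    Assumption: signatures are perceptual hashes of `bits` bits.
    """
    return 1.0 - hamming_distance(sig_a, sig_b) / bits


def page_similarity(
    matched_pairs: Sequence[Tuple[int, int]],
    weights: Sequence[float],
    bits: int = 64,
) -> float:
    """Weighted average of block similarities over already-matched block pairs.

    Assumption: `weights[i]` is the weight of the i-th matched pair.
    """
    total = sum(weights)
    if total == 0:
        return 0.0
    weighted = sum(
        w * block_similarity(a, b, bits)
        for (a, b), w in zip(matched_pairs, weights)
    )
    return weighted / total
```

In this sketch, a pair of pages whose heavily weighted blocks have near-identical signatures scores close to 1, which is the situation a phishing page imitating a legitimate one would be expected to produce.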

Cite This Article

M. Ge, X. Yu, L. Ye and J. Shi, "Webpage matching based on visual similarity," Computers, Materials & Continua, vol. 71, no. 2, pp. 3393–3405, 2022. https://doi.org/10.32604/cmc.2022.017220



This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.