© 1994

Department of Computer Science


Web Spam Characterization and Automating Detection

Pavel Vanag (Petrozavodsk State University)

Web spamming refers to actions intended to mislead search engines into ranking some pages higher than they deserve. Recently, the amount of web spam has increased dramatically. Ranking in search engines has become more and more commercial: some people try to mislead search engines so that their pages rank high in search results and thus capture user attention. Just as with email, we can talk about the phenomenon of spamming the Web. The primary consequence of web spamming is that the quality of search results decreases.

In this work we evaluate several known spam detection metrics, such as degree correlations, the number of neighbors, and others. These metrics are based on statistical analysis of a large collection of Web pages, with a focus on spam detection. We also propose a mathematical model for automating web spam detection.
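As a rough illustration of the kind of link-based metrics mentioned above, the following Python sketch computes in-degree and out-degree, the number of neighbors within a given link distance, and a Pearson correlation between two degree sequences. It is only a sketch under stated assumptions: the toy graph `LINKS`, the function names, and the reading of "degree correlation" as a correlation between in-degree and out-degree are all hypothetical and not taken from the work itself.

```python
from collections import deque
from math import sqrt

# Hypothetical toy link graph (an assumption for illustration):
# each page maps to the list of pages it links to.
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}


def degrees(links):
    """Return (in_degree, out_degree) dicts covering every page in the graph."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    out_deg = {p: len(set(links.get(p, []))) for p in pages}
    in_deg = {p: 0 for p in pages}
    for targets in links.values():
        for t in set(targets):  # count each distinct link once
            in_deg[t] += 1
    return in_deg, out_deg


def neighbors_within(links, start, max_dist):
    """Count pages reachable from `start` by following at most `max_dist` links."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, dist = queue.popleft()
        if dist == max_dist:
            continue  # do not expand beyond the distance limit
        for nxt in links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return len(seen) - 1  # exclude the start page itself


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


in_deg, out_deg = degrees(LINKS)
pages = sorted(in_deg)
# One possible "degree correlation": in-degree vs. out-degree across pages.
degree_corr = pearson([in_deg[p] for p in pages], [out_deg[p] for p in pages])
```

On a real crawl, such per-page values would be computed over the whole collection and their distributions compared between known spam and non-spam pages; how the work combines them into a detection model is not specified here.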

Recent search systems implement some schemes for detecting spam pages, but as far as we know these schemes are far from ideal, and spam still sometimes disturbs users surfing the Internet. We therefore investigate and analyze these schemes with the aim of improving search engines.