Detecting spam blogs from blog search results

Linhong Zhu, Aixin Sun*, Byron Choi

*Corresponding author for this work

Research output: Contribution to journal > Journal article > peer-review

15 Citations (Scopus)
15 Downloads (Pure)

Abstract

Blogging has emerged as a medium for people to express themselves. However, the presence of spam blogs (also known as splogs) may reduce the value of blogs and blog search engines. Hence, splog detection has recently attracted much attention from researchers. Most existing work on splog detection identifies splogs using their content or link features and targets spam filters that protect a blog search engine's index from spam. In this paper, we propose a splog detection framework that monitors online search results. The novelty of our approach is that it capitalizes on the results returned by search engines. The proposed method is therefore particularly useful for detecting splogs that have slipped through the spam filters and are actively generating spam posts. More specifically, our method monitors the top-ranked results of a sequence of temporally ordered queries and detects splogs based on blogs' temporal behavior. The temporal behavior of a blog is maintained in a blog profile. Given blog profiles, splog detecting functions are proposed and evaluated using real data collected from a popular blog search engine. Our experiments demonstrate that splogs can be detected with high accuracy. The proposed method can be implemented on top of any existing blog search engine without intruding on the latter.
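
The idea of a blog profile built from temporally ordered query results can be illustrated with a minimal Python sketch. The abstract does not specify the profile layout or the detecting functions, so everything here is an assumption made for illustration: the `BlogProfile` class, the `update_profiles` helper, the `top_k` cutoff, and in particular the `splog_score` heuristic (the fraction of query rounds in which a blog surfaces with a previously unseen post) are hypothetical and are not the functions evaluated in the paper.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class BlogProfile:
    """Hypothetical temporal record of one blog's appearances in search results."""
    blog_url: str
    # Maps query round index -> set of post ids seen in the top-k for that round.
    hits: dict = field(default_factory=lambda: defaultdict(set))

    def record(self, round_idx, post_id):
        self.hits[round_idx].add(post_id)


def update_profiles(profiles, round_idx, results, top_k=10):
    """Fold the top-k results of one query round into the blog profiles.

    `results` is a ranked list of (blog_url, post_id) pairs.
    """
    for blog_url, post_id in results[:top_k]:
        profile = profiles.setdefault(blog_url, BlogProfile(blog_url))
        profile.record(round_idx, post_id)


def splog_score(profile, total_rounds):
    """Assumed heuristic: fraction of rounds in which the blog appeared
    in the top-k with at least one post not seen in an earlier round."""
    seen = set()
    fresh_rounds = 0
    for round_idx in sorted(profile.hits):
        if profile.hits[round_idx] - seen:
            fresh_rounds += 1
        seen |= profile.hits[round_idx]
    return fresh_rounds / total_rounds if total_rounds else 0.0


# Toy data: three temporally ordered query rounds (made-up URLs and post ids).
rounds = [
    [("spam.example", "p1"), ("ok.example", "a1")],
    [("spam.example", "p2"), ("other.example", "b1")],
    [("spam.example", "p3"), ("ok.example", "a1")],
]
profiles = {}
for i, results in enumerate(rounds):
    update_profiles(profiles, i, results)

scores = {url: splog_score(p, len(rounds)) for url, p in profiles.items()}
```

In this toy data, a blog that reaches the top results in every round with a fresh post gets the highest score, while a blog whose single post reappears scores lower. Because the sketch only consumes ranked result lists, it runs on top of a search engine without touching its index, which is the non-intrusive property the abstract describes.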

Original language: English
Pages (from-to): 246-262
Number of pages: 17
Journal: Information Processing and Management
Volume: 47
Issue number: 2
Publication status: Published - Mar 2011

Scopus Subject Areas

  • Information Systems
  • Media Technology
  • Computer Science Applications
  • Management Science and Operations Research
  • Library and Information Sciences

User-Defined Keywords

  • Blog profile
  • Blog search
  • Blog temporal behavior
  • Spam blog
  • Spam blog detection
  • Splog
