Temporal Query Log Profiling to Demote Search Spam and Improve Ranking
October 26th, 2010
Our paper on using temporal signals for search spam is now available for download. The work was carried out by Alex Kotov (UIUC) while interning in our group, and is being presented at CIKM 2010. From the abstract:
Temporal information can be leveraged and incorporated to improve web search ranking. In this work, we propose a method to improve the ranking of search results by identifying the fundamental properties of temporal behavior of low-quality hosts and spam-prone queries in search logs and modeling those properties as quantifiable features. In particular, we introduce the concepts of host churn, a measure of changes in host visibility for user queries, and query volatility, a measure of semantic instability of query results, and propose the methods for construction of temporal profiles from search query logs that can be used for estimation of a set of features based on the introduced concepts. The utility of the proposed concepts has been experimentally demonstrated for two language-independent search tasks: the regression-based ranking of search results and a novel classification problem of detecting spam-prone queries introduced in this work.
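To give a rough feel for what a churn-style feature could look like, here is a minimal Python sketch. It is not the definition used in the paper: it assumes, purely for illustration, that a host's churn is the average Jaccard distance between the sets of queries for which the host appears in consecutive time periods. The function name `host_churn` and the `(period, query, host)` log format are hypothetical.

```python
from collections import defaultdict


def host_churn(log, host):
    """Illustrative churn score for one host (assumed definition, not the paper's).

    log  -- iterable of (period, query, host) tuples, where period is sortable
    host -- the host to score

    Returns the mean Jaccard distance between the query sets of consecutive
    periods in which the host appears: 0.0 means the host shows up for the
    same queries every time, 1.0 means the query set changes completely.
    Periods in which the host never appears are skipped in this sketch.
    """
    # Collect the set of queries the host was shown for in each period.
    queries_by_period = defaultdict(set)
    for period, query, h in log:
        if h == host:
            queries_by_period[period].add(query)

    periods = sorted(queries_by_period)
    if len(periods) < 2:
        return 0.0  # not enough history to measure change

    distances = []
    for prev, curr in zip(periods, periods[1:]):
        a, b = queries_by_period[prev], queries_by_period[curr]
        # Both sets are non-empty, so the union is never empty.
        distances.append(1.0 - len(a & b) / len(a | b))
    return sum(distances) / len(distances)


# Toy log: the host "h" is seen for a shifting set of queries over three periods.
toy_log = [
    (1, "a", "h"), (1, "b", "h"),
    (2, "b", "h"), (2, "c", "h"),
    (3, "b", "h"), (3, "c", "h"),
    (1, "a", "x"),
]
print(host_churn(toy_log, "h"))
```

A score like this could be computed per host and fed to a ranker as one feature among many; the profiles and features actually used are described in the paper itself.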