The Web is the biggest repository of documents humans have ever built. Even more, it is increasingly growing in size every day. Users rely on Web search engines (WSEs) for finding information on the Web. By submitting a textual query expressing their information need, WSE users obtain a list of documents that are highly relevant to the query. Moreover, WSEs tend to store such huge amount of users activities in "query logs". Query log mining is the set of techniques aiming at extracting valuable knowledge from query logs. This knowledge represents one of the most used ways of enhancing the users’ search experience. According to this vision, in this thesis we firstly prove that the knowledge extracted from query logs suffer aging effects and we thus propose a solution to this phenomenon. Secondly, we propose new algorithms for query recommendation that overcome the aging problem. Moreover, we study new query recommendation techniques for efficiently producing recommendations for rare queries. Finally, we study the problem of diversifying Web search engine results. We define a methodology based on the knowledge derived from query logs for detecting when and how query results need to be diversified and we develop an efficient algorithm for diversifying search results.

Query Log Mining to Enhance User Experience in Search Engines

2011

Abstract

The Web is the biggest repository of documents humans have ever built. Even more, it is increasingly growing in size every day. Users rely on Web search engines (WSEs) for finding information on the Web. By submitting a textual query expressing their information need, WSE users obtain a list of documents that are highly relevant to the query. Moreover, WSEs tend to store such huge amount of users activities in "query logs". Query log mining is the set of techniques aiming at extracting valuable knowledge from query logs. This knowledge represents one of the most used ways of enhancing the users’ search experience. According to this vision, in this thesis we firstly prove that the knowledge extracted from query logs suffer aging effects and we thus propose a solution to this phenomenon. Secondly, we propose new algorithms for query recommendation that overcome the aging problem. Moreover, we study new query recommendation techniques for efficiently producing recommendations for rare queries. Finally, we study the problem of diversifying Web search engine results. We define a methodology based on the knowledge derived from query logs for detecting when and how query results need to be diversified and we develop an efficient algorithm for diversifying search results.
5-mag-2011
Italiano
Simoncini, Luca
Silvestri, Fabrizio
Università degli Studi di Pisa
File in questo prodotto:
File Dimensione Formato  
PhDThesis_FrancoMariaNardini.pdf

accesso aperto

Tipologia: Altro materiale allegato
Dimensione 6.48 MB
Formato Adobe PDF
6.48 MB Adobe PDF Visualizza/Apri

I documenti in UNITESI sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14242/130744
Il codice NBN di questa tesi è URN:NBN:IT:UNIPI-130744