Heterogeneous User Actions in Recommender Systems
2019
Abstract
Recommender Systems (RSs) are software tools and techniques that provide users with suggestions for items, such as what movies to watch, what music to listen to, or what products to buy. These suggestions are usually personalized, i.e., adapted to the user’s known preferences, which are either expressed explicitly, for example as ratings for items, or inferred by interpreting online user activity. In many real-world scenarios, obtaining preference information explicitly is difficult or even impossible. Hence, a growing number of RSs are built on abundant implicit feedback data, such as logs of user actions (e.g., clicks, views, or purchases), which signal users’ preferences or opinions only indirectly. Recently, various RS techniques that use implicit feedback datasets have emerged. They assume that if a user performs a domain-dependent target action on an item, then she likes it. Implicit feedback models are therefore designed to predict the items on which the user will perform a target action, and these items are then used to create recommendations. To make these predictions, the majority of existing models consider only actions of a single type, such as video views. However, the interactions between a user and an item are rarely limited to actions of a single type. Thus single-action models, although they deliver efficient results, ignore information that could be extracted from other types of actions, for instance clicks or bookmarks. Information about actions of multiple types can be used jointly to build more comprehensive user models that describe users’ preferences better and, as a consequence, allow the creation of better RSs. Using this information in RSs requires addressing two challenges: identifying correlations between action types, and eliciting the actions that are predictive of a target action type.
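As a rough illustration of what eliciting predictive actions for a target action type can mean, the sketch below measures, for each non-target action type, the share of user-item pairs with that action that also received the target action. It is not the method developed in the thesis: the function name `target_rate_by_action`, the action labels, and the toy log are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


def target_rate_by_action(log, target="purchase"):
    """For each non-target action type, return the fraction of (user, item)
    pairs with that action that also received the target action.

    log: iterable of (user, item, action_type) tuples (hypothetical format).
    """
    pairs_by_action = defaultdict(set)
    for user, item, action in log:
        pairs_by_action[action].add((user, item))

    target_pairs = pairs_by_action.get(target, set())
    return {
        action: len(pairs & target_pairs) / len(pairs)
        for action, pairs in pairs_by_action.items()
        if action != target
    }


# Toy interaction log; the data is invented for illustration only.
log = [
    ("u1", "i1", "view"), ("u1", "i1", "purchase"),
    ("u1", "i2", "view"),
    ("u2", "i1", "bookmark"), ("u2", "i1", "purchase"),
    ("u2", "i3", "view"), ("u2", "i3", "view"),
]
rates = target_rate_by_action(log)
```

In this toy log, bookmarks co-occur with purchases more often than views do, which is the kind of signal a multi-action model could exploit. A real analysis would also need to account for action order and timing, which this sketch ignores.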
Although some preliminary work has been done in this direction, a general model that exploits a combination of actions of various types has not yet been created. This thesis explores challenges and questions that have not been addressed in previous research. It starts with an overview of the historical categorization of implicit feedback types and of the current state of research on single-action and multi-action prediction models for RSs. After that, different multi-action prediction models, designed by the author of the thesis, are presented in order of increasing complexity. The first models, which are based on implicit matrix factorization, exploit correlations between different action types through heuristics determined by exploratory analysis of the data. Such heuristics are usually domain dependent and may overlook subtle relations between actions of different types, which makes the models less flexible and difficult to reuse. For this reason, we then analyzed models that use machine learning and sequence mining techniques to find correlations between different action types automatically. This automation makes finding correlations between action types easier and more precise, which in turn increases the prediction accuracy of the models. It also makes it possible to consider the ordering of the actions and the time delays between them. Finally, more sophisticated deep-learning-based models were examined. These models take into account not only the sequence of actions performed by users, but also additional information, such as the context of an action and user and item features. A comprehensive empirical evaluation, conducted on large real-world datasets, shows that using multiple action types is beneficial and that the resulting models can outperform state-of-the-art single-type implicit feedback models.
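The heuristic approach described for the first models can be pictured with a minimal sketch: each action type gets a hand-picked weight, the weights are summed per user-item pair, and the sum is turned into a confidence value of the form 1 + alpha * strength, a formulation commonly used in implicit-feedback matrix factorization. The weights, the value of `ALPHA`, and the function `build_confidence` are assumptions made for this example, not the heuristics used in the thesis.

```python
from collections import defaultdict

# Hypothetical per-action weights; in practice they would be derived from
# exploratory analysis of a specific dataset and domain.
ACTION_WEIGHTS = {"view": 1.0, "click": 2.0, "bookmark": 3.0, "purchase": 5.0}
ALPHA = 40.0  # assumed confidence scaling factor


def build_confidence(log, weights=ACTION_WEIGHTS, alpha=ALPHA):
    """Aggregate a heterogeneous action log into per-(user, item) confidences.

    log: iterable of (user, item, action_type) tuples.
    Returns a dict mapping (user, item) -> 1 + alpha * summed action weight.
    Action types without a weight are ignored.
    """
    strength = defaultdict(float)
    for user, item, action in log:
        strength[(user, item)] += weights.get(action, 0.0)
    return {pair: 1.0 + alpha * s for pair, s in strength.items() if s > 0}


# Toy interaction log; the data is invented for illustration only.
log = [
    ("u1", "i1", "view"),
    ("u1", "i1", "purchase"),
    ("u1", "i2", "click"),
    ("u2", "i1", "bookmark"),
    ("u2", "i3", "scroll"),  # no weight defined, so this action is ignored
]
confidence = build_confidence(log)
```

The resulting dictionary could be loaded into a sparse user-item matrix and passed to an implicit matrix factorization solver. The sketch also shows the limitation the abstract points out: the weights are fixed by hand, so they have to be revisited for every new domain.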
The evaluation also shows that models that can be used out of the box in different domains and that utilize multiple action types and contextual information are superior to the other models that we have studied. These findings are also confirmed by online experiments. Finally, an analysis of the models’ predictions helps to explain the peculiarities of the models and provides hints about which models are better to use and when.
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/128680
URN:NBN:IT:UNIBZ-128680