Text based pricing modelling: an application to the fashion industry

2020

Abstract

In this study we propose a class of hedonic regression models that predict the prices of fashion products from attributes obtained by featurizing text. Using the internet as a source of data, we developed web scrapers to collect prices and product descriptions of items sold on the websites of five famous fashion retailers and producers. For a set of scraped items, given the pair (price, description), our goal is to estimate hedonic regression models by leveraging the information about the product contained in the description. After each description is mapped to a point in a high-dimensional vector space, our estimation strategy uses sparse modelling, together with the text-mining techniques of dimensionality reduction and topic modelling, to find the model with the best out-of-sample predictive performance. We refer to this approach as Hedonic Text-Regression modelling. With this approach, we estimate the implicit price of the words used in descriptions. To the best of our knowledge, no previous work of this kind has been conducted for the fashion industry. Empirically, the proposed models outperform traditional hedonic pricing models in terms of predictive accuracy while also performing consistent variable selection.
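The idea of pricing words can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy items, the binary bag-of-words featurization, the log-price target, the penalty value and all function names are assumptions made for illustration, and the Lasso is used here only as one common form of sparse modelling. Each description becomes a vector of word indicators, and an L1-penalized regression of log price on those indicators leaves a nonzero coefficient only for the words it selects; those coefficients play the role of implicit word prices.

```python
import math

# Toy (price, description) pairs, invented for illustration (not thesis data).
ITEMS = [
    (120.0, "cotton shirt"),
    (450.0, "silk shirt"),
    (900.0, "leather jacket"),
    (300.0, "cotton jacket"),
    (1500.0, "silk leather jacket"),
    (100.0, "cotton tee"),
]


def featurize(descriptions):
    """Map each description to a binary bag-of-words vector."""
    tokens = [d.lower().split() for d in descriptions]
    vocab = sorted({w for doc in tokens for w in doc})
    rows = [[1.0 if w in doc else 0.0 for w in vocab] for doc in tokens]
    return vocab, rows


def soft_threshold(z, t):
    """Shrink z towards zero by t; values with |z| <= t become zero."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def lasso(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    b0 = sum(y) / n
    for _ in range(n_iter):
        # Unpenalized intercept: mean of the current residuals.
        b0 = sum(y[i] - sum(X[i][j] * beta[j] for j in range(p))
                 for i in range(n)) / n
        for j in range(p):
            # Correlation of column j with the residual that excludes word j.
            rho = sum(
                X[i][j] * (y[i] - b0 - sum(X[i][k] * beta[k]
                                           for k in range(p) if k != j))
                for i in range(n)
            ) / n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam) / scale if scale > 0 else 0.0
    return b0, beta


vocab, X = featurize([desc for _, desc in ITEMS])
y = [math.log(price) for price, _ in ITEMS]  # log price as the target (assumption)
b0, beta = lasso(X, y, lam=0.05)             # penalty value chosen arbitrarily

# Words kept by the penalty, with their estimated effect on log price.
implicit_prices = {w: b for w, b in zip(vocab, beta) if b != 0.0}
for word, coef in sorted(implicit_prices.items(), key=lambda kv: -kv[1]):
    print(f"{word:10s} {coef:+.3f}")
```

In practice the penalty would be chosen by out-of-sample performance, for example by cross-validation, rather than fixed by hand as it is here.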
5 November 2020
English
Freo, Marzia
Università degli Studi di Bologna
Files in this item:
File: crescenzi_federico_tesi.pdf
Access: open access
Type: Other attached material
Size: 9.75 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/151774
The NBN code of this thesis is URN:NBN:IT:UNIBO-151774