
The Language of Innovation

2018

Abstract

Predicting innovation is a peculiar problem in data science. By definition, an innovation is a never-before-seen event, so the usual approach of learning patterns from the past does not apply directly. Here we propose a strategy to address the problem in the context of innovative patents, defining innovation as a never-before-seen association of technologies. We treat the technological codes present in patents as a vocabulary and the whole technological corpus as written in a specific, evolving language. We exploit this structure with techniques borrowed from Natural Language Processing, embedding technologies in a high-dimensional Euclidean space where relative positions reflect learned semantics. Dynamics in this space predict specific innovation events, which are tested against null models. These methods offer a new way of understanding and forecasting innovation and open the way to a number of applications and further analytical approaches.
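
The idea of treating technology codes as words and patents as sentences can be illustrated with a minimal sketch. It is not the method used in the thesis: the toy corpus, the code labels, and the embedding step (a truncated SVD of log-scaled co-occurrence counts, used here as a simple stand-in for an NLP embedding model) are all assumptions made for illustration. The sketch embeds each code as a vector and ranks pairs of codes that have never appeared together by cosine similarity, as candidate never-before-seen associations.

```python
import itertools
import numpy as np

# Hypothetical toy corpus: each patent is a "sentence" of technology codes.
patents = [
    ["A01", "B02", "C03"],
    ["A01", "B02"],
    ["B02", "C03", "D04"],
    ["C03", "D04"],
    ["D04", "E05"],
]

codes = sorted({c for p in patents for c in p})
index = {c: i for i, c in enumerate(codes)}

# Symmetric co-occurrence counts: how often two codes share a patent.
cooc = np.zeros((len(codes), len(codes)))
for p in patents:
    for a, b in itertools.combinations(sorted(set(p)), 2):
        cooc[index[a], index[b]] += 1
        cooc[index[b], index[a]] += 1

# Assumed embedding step: truncated SVD of log(1 + counts).
# This is a stand-in for a learned embedding, not the thesis's model.
u, s, _ = np.linalg.svd(np.log1p(cooc))
dim = 2
emb = u[:, :dim] * np.sqrt(s[:dim])


def cosine(a, b):
    """Cosine similarity between the embeddings of two codes."""
    va, vb = emb[index[a]], emb[index[b]]
    denom = np.linalg.norm(va) * np.linalg.norm(vb)
    return float(va @ vb / denom) if denom > 0 else 0.0


# Candidate innovations: pairs of codes never observed in the same patent,
# ranked from most to least similar in the embedding space.
unseen = [
    (a, b)
    for a, b in itertools.combinations(codes, 2)
    if cooc[index[a], index[b]] == 0
]
ranked = sorted(unseen, key=lambda ab: cosine(*ab), reverse=True)

for a, b in ranked:
    print(a, b, round(cosine(a, b), 3))
```

In a realistic setting the ranking would be computed on data up to a given year and compared with the associations that actually appear afterwards, with a null model as the baseline, as the abstract describes.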
English

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/321833
The NBN code of this thesis is URN:NBN:IT:BNCF-321833