A growing number of studies in recent years have sought to identify individual inventors from patent data. Different heuristics have been proposed that use inventors' names and other information disclosed in patent documents to find out “who is who” in patents. This paper contributes to this literature by setting out a methodology to identify inventors using patent applications filed at the European Patent Office (EPO hereafter). As in much of this literature, we follow a three-step procedure: (1) the parsing stage, aimed at reducing the noise in the inventor's name and other fields of the patent; (2) the matching stage, where name-matching algorithms are used to group names that may refer to the same person; (3) the filtering stage, where additional information and different scoring schemes are used to decide which of these candidate matches are in fact the same inventor. The paper includes some figures resulting from applying the algorithms to the set of European inventors applying to the EPO over a long period of time.
Miguelez, E. (AQR-IREA), Gómez-Miguélez, I.
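The three-step procedure summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record fields (`name`, `city`, `ipc`), the toy data, the use of Python's `difflib.SequenceMatcher` as the name-similarity measure, and both thresholds are assumptions chosen only to make the parsing, matching, and filtering stages concrete.

```python
import re
import unicodedata
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy records; field names are illustrative, not the EPO schema.
RECORDS = [
    {"id": 1, "name": "García, José", "city": "Madrid", "ipc": "G06F"},
    {"id": 2, "name": "GARCIA JOSE", "city": "Madrid", "ipc": "G06F"},
    {"id": 3, "name": "Garcia, Josep", "city": "Valencia", "ipc": "A61K"},
    {"id": 4, "name": "Smith, John", "city": "London", "ipc": "H04L"},
]


def parse_name(raw):
    """Parsing stage: strip accents, upper-case, drop punctuation."""
    decomposed = unicodedata.normalize("NFKD", raw)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    letters_only = re.sub(r"[^A-Z ]+", " ", no_accents.upper())
    return " ".join(letters_only.split())


def candidate_pairs(records, min_ratio=0.9):
    """Matching stage: pair records whose cleaned names are similar.

    min_ratio is an assumed threshold, not a value from the paper.
    """
    pairs = []
    for a, b in combinations(records, 2):
        ratio = SequenceMatcher(
            None, parse_name(a["name"]), parse_name(b["name"])
        ).ratio()
        if ratio >= min_ratio:
            pairs.append((a["id"], b["id"]))
    return pairs


def filter_pairs(records, pairs, min_score=1):
    """Filtering stage: keep pairs that also share other patent information.

    The score here simply counts shared fields (city, IPC class); the
    scoring scheme and min_score are assumptions for illustration.
    """
    by_id = {r["id"]: r for r in records}
    kept = []
    for id_a, id_b in pairs:
        a, b = by_id[id_a], by_id[id_b]
        score = sum(a[field] == b[field] for field in ("city", "ipc"))
        if score >= min_score:
            kept.append((id_a, id_b))
    return kept


candidates = candidate_pairs(RECORDS)
same_inventor = filter_pairs(RECORDS, candidates)
print(candidates)     # [(1, 2), (1, 3), (2, 3)]
print(same_inventor)  # [(1, 2)]
```

In this toy run, records 1, 2, and 3 are grouped at the matching stage because their cleaned names are nearly identical, but only records 1 and 2 survive filtering, since record 3 shares neither city nor technology class with the others. Comparing every pair of records is only feasible for small samples; a real application would need some form of blocking before the pairwise comparison.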