Combining Large Datasets of Patents and Trademarks
Grid Thoma
Computer Science Division, School of Science & Technology
University of Camerino
14th Italian STATA User Annual Meeting, Florence, 16 Nov 2017
I-SUG, Florence, Grid Thoma Nov 16, 2017
location, industry, cohort, size, listing, VC, …
Patents, trademarks, and designs
EPO, WIPO, USPTO, …, families of priority links
Citations / self-citations
Typically used to account for spelling variations
Similarity of two strings x and y of length nx
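The slide text does not name the string measure it refers to. As an illustration only, a normalized edit-distance similarity is one common way to tolerate spelling variations; the function names `levenshtein` and `edit_similarity`, and the normalization by the longer string length, are assumptions and not taken from the talk.

```python
def levenshtein(x: str, y: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn x into y."""
    if len(x) < len(y):
        x, y = y, x  # keep the shorter string in the inner loop
    previous = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        current = [i]
        for j, cy in enumerate(y, start=1):
            cost = 0 if cx == cy else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def edit_similarity(x: str, y: str) -> float:
    """Similarity in [0, 1]; assumed normalization: 1 - distance / max(nx, ny)."""
    nx, ny = len(x), len(y)
    if max(nx, ny) == 0:
        return 1.0
    return 1.0 - levenshtein(x, y) / max(nx, ny)
```

For instance, `edit_similarity("MICROSOFT", "MICROSFT")` is one edit away over nine characters, so the two spellings score close to 1.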
\frac{|U_1 \cap U_2|}{|U_1 \cup U_2|}
\frac{|U_1 \cap U_2|}{|U_1| + |U_2|}
\frac{|U_1 \cap U_2|}{|U_1 \cup U_2|}
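The intersection-over-union ratio can be sketched on sets of name tokens. This is a minimal illustration, assuming U1 and U2 are the sets of whitespace-separated, upper-cased tokens of two entity names; the helper names `tokens` and `jaccard` are hypothetical.

```python
def tokens(name: str) -> set:
    """Split a name into a set of upper-cased tokens (assumed tokenization)."""
    return set(name.upper().split())


def jaccard(u1: set, u2: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = u1 | u2
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(u1 & u2) / len(union)


a = tokens("General Electric Company")
b = tokens("General Electric Co")
score = jaccard(a, b)  # 2 shared tokens out of 4 distinct tokens
```

Because the measure works on sets, token order does not matter: "Electric General Co" would score the same as "General Electric Co".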
Inversely weighted by the frequency ni of a
\sum_{l \mid y_l \in Y \cap Z}
\sum_{j \mid y_j \in Y}
\sum_{k \mid z_k \in Z}
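A frequency-weighted variant of the token overlap can be sketched as follows. The exact weights and the way the three sums combine are not recoverable from the slide text, so this is an assumption: each token gets weight 1/n_i, where n_i is the number of names containing it, and the denominator is the weighted union (sum over Y plus sum over Z minus the shared part). The names `token_frequencies` and `weighted_jaccard` are hypothetical.

```python
from collections import Counter


def token_frequencies(names):
    """Count in how many names each upper-cased token occurs (n_i)."""
    freq = Counter()
    for name in names:
        freq.update(set(name.upper().split()))
    return freq


def weighted_jaccard(y: set, z: set, freq) -> float:
    """Weighted overlap of token sets y and z.
    Assumed weight: 1 / n_i, so frequent tokens such as CORP count less."""
    def weight(token):
        return 1.0 / freq.get(token, 1)  # unseen tokens are treated as n_i = 1

    shared = sum(weight(t) for t in y & z)
    total = sum(weight(t) for t in y) + sum(weight(t) for t in z) - shared
    return shared / total if total else 0.0


freq = token_frequencies(["ACME CORP", "BETA CORP", "ACME INC"])
score = weighted_jaccard({"ACME", "CORP"}, {"ACME", "INC"}, freq)
```

With these three names the shared token ACME appears twice while INC appears once, so the weighted score (0.25) falls below the unweighted ratio of 1/3: the rare, non-shared token weighs more heavily against the match.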
Reference dictionary (NBER Patent Data Project)
A unique ID code for a patentee (file: patassg.dta)
www.uspto.gov/economics (file: owner.dta)
Patents: 1976-2006; Trademarks: 1977-2015
117,443 unique ID codes from the reference dictionary
3,462,601 (unharmonized) trademarking entity names
5-digit ZIP codes in 98.5% of the US addresses
Removing numbers and non-standard characters
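The two cleaning steps above can be sketched with regular expressions. This is a minimal illustration under stated assumptions: "non-standard characters" is read as anything other than the letters A-Z and spaces, and the ZIP code is taken as the last standalone 5-digit group in the address (optionally followed by a 4-digit extension). The function names `clean_name` and `zip5` are hypothetical.

```python
import re


def clean_name(raw: str) -> str:
    """Upper-case a name, drop digits and non A-Z characters,
    and collapse repeated spaces (assumed cleaning rule)."""
    letters_only = re.sub(r"[^A-Z ]+", " ", raw.upper())
    return " ".join(letters_only.split())


def zip5(address: str):
    """Return the last standalone 5-digit group in the address, or None.
    A trailing ZIP+4 extension such as -1234 is ignored."""
    matches = re.findall(r"\b(\d{5})(?:-\d{4})?\b", address)
    return matches[-1] if matches else None


name = clean_name("Acme-Widgets #2, L.L.C.")
code = zip5("12345 Main St, Boston, MA 02110")
```

Taking the last 5-digit group rather than the first avoids mistaking a 5-digit street number for the ZIP code. Note that this rule also strips accented letters, which may or may not be desirable for non-US names.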
Figure 1: Share of US business patentees matched with trademarks (Notes: States with 1000+ patentees; Source: USPTO)
X-axis: state code (2 digits): IL MA WI MO MN DE OH IN PA NC CT NY GA NJ CA TN KS VA WA OR MD UT CO TX FL MI AZ OK
Y-axis: 0% to 100%
Series: Share of patentees; Weighted by patents; Weighted by marks
Kruskal-Wallis rank test: null hypothesis not rejected (p = 0.998)
Thresholds defined through manual scrutiny
Figure 2: Distribution of the matching score of the matched names: US business patentees matched to the trademarking entity names
X-axis: matching score values, 1 to 9 (lower is better)
Y-axis: 0% to 70%
Bar labels as extracted: 16.7%, 56.1%, 0.0%, 4.7%, 14.7%, 0.0%, 7.5%, 0.0%, 0.0%, 0.1%
Label: With priority links and manually matched
Copatentees of a patent/trademark
Entity name changes (synonymies)
Subsidiaries
Distinct entity names
Entity address changes
Figure 3: Time lag of the first trademark since year of the first patent
(Notes: US business patentees active with patenting & trademarking during 1981–2003; Source: USPTO)
X-axis: lag in years (1, 2, 3, 4, 5 or more)
Y-axis: 0% to 25%
Series: small firms (fewer than 500 employees)