When trying to consolidate data on inventors in PATSTAT, we need some toponymic (place-related) data apart from the name.
An inventor named "J. Smith" is difficult to connect with either "John Smith" or "James Smith" unless we have some further data such as address or city.
So I ran some queries on the PATSTAT 10/2009 edition to find out in what percentage of cases the fields address, city, zip code and country are filled, by patent office.
I used table TLS201_APPLN to get the application authority, and TLS206_ASCII (the ASCII/parsed version of TLS206, included in PATSTAT) for person IDs and data.
To count distinct person IDs for inventors (A_I_FLAG = 'I') per application authority I used this SQL:
Select
T1.APPLN_AUTH, Count(Distinct t6.PERSON_ID) As inventors
From
tls201_appln T1 Inner Join tls206_ascii t6
On t6.APPLN_ID = T1.APPLN_ID
Where t6.A_I_FLAG = 'I'
Group By T1.APPLN_AUTH
Order By inventors Desc;
Below is the resulting table for the top 20 application authorities by inventor count; the full table can be downloaded at this link.
APPLN_AUTH | inventors | no state | no zip | no country | no address | no city |
US | 5960856 | 86% | 98% | 21% | 97% | 25% |
EP | 3705123 | 100% | 100% | 0% | 1% | 1% |
DE | 2750079 | 100% | 100% | 33% | 100% | 100% |
JP | 1798271 | 100% | 100% | 98% | 99% | 100% |
CN | 1537587 | 100% | 100% | 2% | 100% | 100% |
CA | 1120490 | 100% | 100% | 45% | 100% | 100% |
AU | 1087573 | 100% | 100% | 98% | 100% | 100% |
SU | 968915 | 100% | 100% | 41% | 100% | 100% |
AT | 653048 | 100% | 100% | 29% | 100% | 100% |
KR | 637296 | 100% | 100% | 14% | 100% | 100% |
FR | 565254 | 100% | 100% | 98% | 99% | 100% |
GB | 531087 | 100% | 100% | 70% | 65% | 100% |
RU | 394691 | 100% | 100% | 29% | 100% | 100% |
CH | 338739 | 100% | 100% | 11% | 100% | 100% |
BR | 292047 | 100% | 100% | 89% | 100% | 100% |
SE | 256248 | 100% | 100% | 85% | 98% | 100% |
FI | 212722 | 100% | 100% | 11% | 43% | 100% |
IT | 192460 | 100% | 100% | 74% | 100% | 100% |
ES | 133471 | 100% | 100% | 17% | 100% | 100% |
DD | 129845 | 100% | 100% | 7% | 97% | 100% |
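The "no ..." columns above are shares of distinct inventors whose field is empty. A minimal sketch of how such a share could be computed, using Python's sqlite3 on a toy in-memory database: the column names ADDRESS, CITY and CTRY_CODE are my assumptions for illustration (check the actual TLS206_ASCII layout), the helper `missing_share` is hypothetical, and the rows are invented.

```python
import sqlite3

# Toy stand-ins for the PATSTAT tables. The column names ADDRESS, CITY and
# CTRY_CODE are assumptions for illustration, and the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tls201_appln (APPLN_ID INTEGER, APPLN_AUTH TEXT);
CREATE TABLE tls206_ascii (PERSON_ID INTEGER, APPLN_ID INTEGER,
                           A_I_FLAG TEXT, ADDRESS TEXT, CITY TEXT,
                           CTRY_CODE TEXT);
INSERT INTO tls201_appln VALUES (1, 'EP'), (2, 'EP'), (3, 'US');
INSERT INTO tls206_ascii VALUES
  (10, 1, 'I', '1 Main St', 'Paris',  'FR'),
  (11, 1, 'I', '',          'Rome',   'IT'),
  (12, 2, 'I', NULL,        NULL,     'DE'),
  (13, 2, 'A', '2 High St', 'Bonn',   'DE'),   -- applicant, not counted
  (20, 3, 'I', NULL,        'Austin', ''),
  (21, 3, 'I', NULL,        NULL,     'US');
""")

def missing_share(column):
    """Percent of distinct inventors per authority whose `column` is empty."""
    # Whitelist the column name, since it is interpolated into the SQL text.
    if column not in {"ADDRESS", "CITY", "CTRY_CODE"}:
        raise ValueError(column)
    sql = f"""
        SELECT t1.APPLN_AUTH,
               COUNT(DISTINCT t6.PERSON_ID) AS inventors,
               COUNT(DISTINCT CASE WHEN t6.{column} IS NULL
                                     OR TRIM(t6.{column}) = ''
                                   THEN t6.PERSON_ID END) AS missing
        FROM tls201_appln t1
        JOIN tls206_ascii t6 ON t6.APPLN_ID = t1.APPLN_ID
        WHERE t6.A_I_FLAG = 'I'
        GROUP BY t1.APPLN_AUTH
        ORDER BY inventors DESC, t1.APPLN_AUTH
    """
    return {auth: round(100.0 * missing / total)
            for auth, total, missing in conn.execute(sql)}

for field in ("ADDRESS", "CITY", "CTRY_CODE"):
    print(field, missing_share(field))
```

In this sketch both NULL and blank strings count as empty, and a person whose ID appears on several rows is counted as missing if any of those rows has the field empty; a stricter definition may suit the real data better.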
It might be argued that such a count would make more sense by publication authority rather than by application authority, since the data are taken from the search report; it can be noticed that FI data for applications filed at the EPO and published at WIPO/PCT differ in quality from those filed and published at the EPO.
Maybe that could be the topic of a further post...
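As a possible starting point for that comparison, here is a small sketch of the same inventor count grouped by publication authority. It assumes the publication table TLS211_PAT_PUBLN with columns PUBLN_AUTH and APPLN_ID is available in this edition (I have not checked the 10/2009 layout), and it runs on invented rows in an in-memory SQLite database rather than on PATSTAT itself.

```python
import sqlite3

# Toy data only. TLS211_PAT_PUBLN and its columns are assumed to match
# the PATSTAT layout; verify against the edition you are using.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tls211_pat_publn (PAT_PUBLN_ID INTEGER, PUBLN_AUTH TEXT,
                               APPLN_ID INTEGER);
CREATE TABLE tls206_ascii (PERSON_ID INTEGER, APPLN_ID INTEGER,
                           A_I_FLAG TEXT);
INSERT INTO tls211_pat_publn VALUES
  (100, 'EP', 1), (101, 'WO', 1), (102, 'EP', 2);
INSERT INTO tls206_ascii VALUES
  (10, 1, 'I'), (11, 1, 'I'), (12, 2, 'I'), (13, 2, 'A');
""")

# Distinct inventors per publication authority. An application published
# by several authorities contributes its inventors to each of them.
by_publn_auth = dict(conn.execute("""
    SELECT p.PUBLN_AUTH, COUNT(DISTINCT t6.PERSON_ID)
    FROM tls211_pat_publn p
    JOIN tls206_ascii t6 ON t6.APPLN_ID = p.APPLN_ID
    WHERE t6.A_I_FLAG = 'I'
    GROUP BY p.PUBLN_AUTH
"""))
print(by_publn_auth)
```

Because one application can have several publications, the per-authority counts here overlap and do not sum to the total number of inventors.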