Although PATSTAT is keyed on application IDs, it can be useful (and many people do this) to build a database keyed on publication numbers.
A publication number is easier to use, can be bridged to other patent data, and so on.
In some previous posts I pointed out that linking appln_id and punr is not painless.
Here I suggest some tests to check whether our data implement the publication number/application ID linkage correctly and consistently.
Since this relation may be m x n, some examples are listed so that you can check whether your data behave correctly in the critical cases.
The cases examined are multiple publication numbers related to the same application ID and, conversely, multiple application IDs related to the same publication number.
At the end of each section there is a test to run on your DB.
Appln_ids and data are from the PATSTAT 2010/10 edition.
1) double punr
There are several cases where the publication number changes with the status of the application.
This is a count by publication authority, excluding applications of kind D2 and applications with a 9999 filing date.
auth | # applications | # with multiple punr | share |
'AP' | 5138 | 2032 | 39,55% |
'AT' | 1062556 | 87749 | 8,26% |
'AU' | 1533392 | 19363 | 1,26% |
'BE' | 366138 | 1 | 0,00% |
'BG' | 53556 | 1076 | 2,01% |
'BR' | 507674 | 2 | 0,00% |
'CA' | 1163197 | 308 | 0,03% |
'CH' | 1053680 | 5 | 0,00% |
'CN' | 3194875 | 593155 | 18,57% |
'CS' | 162230 | 32723 | 20,17% |
'CU' | 2997 | 100 | 3,34% |
'CZ' | 68576 | 21007 | 30,63% |
'DD' | 177131 | 78 | 0,04% |
'DE' | 6024317 | 11 | 0,00% |
'DK' | 378544 | 36042 | 9,52% |
'EA' | 18778 | 4074 | 21,70% |
'EE' | 6422 | 1913 | 29,79% |
'ES' | 919193 | 60551 | 6,59% |
'FI' | 230851 | 70950 | 30,73% |
'GB' | 3282728 | 343413 | 10,46% |
'GE' | 136 | 1 | 0,74% |
'GR' | 97851 | 2286 | 2,34% |
'HR' | 12015 | 1 | 0,01% |
'HU' | 136668 | 46219 | 33,82% |
'IE' | 91747 | 22197 | 24,19% |
'IL' | 190468 | 11 | 0,01% |
'IN' | 68199 | 2 | 0,00% |
'IS' | 7813 | 2438 | 31,20% |
'IT' | 674327 | 264070 | 39,16% |
'JP' | 16002805 | 4406789 | 27,54% |
'KR' | 1898155 | 189594 | 9,99% |
'LT' | 3773 | 2553 | 67,66% |
'LV' | 4968 | 2 | 0,04% |
'MC' | 2757 | 1 | 0,04% |
'MD' | 4685 | 691 | 14,75% |
'MX' | 185994 | 720 | 0,39% |
'NL' | 422670 | 53729 | 12,71% |
'NO' | 221330 | 69807 | 31,54% |
'PL' | 232387 | 91078 | 39,19% |
'PT' | 86822 | 1 | 0,00% |
'RO' | 65905 | 11 | 0,02% |
'RU' | 444798 | 69317 | 15,58% |
'SE' | 551022 | 397 | 0,07% |
'SI' | 19604 | 32 | 0,16% |
'SK' | 23432 | 8558 | 36,52% |
'SM' | 642 | 72 | 11,21% |
'SU' | 1232392 | 50 | 0,00% |
'TJ' | 374 | 84 | 22,46% |
'TW' | 373289 | 7 | 0,00% |
'US' | 10320935 | 1049026 | 10,16% |
'YU' | 33649 | 12316 | 36,60% |
'ZA' | 268184 | 1 | 0,00% |
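A count like the one above could be reproduced along these lines. This is only a sketch run on a tiny in-memory SQLite database: the table names `tls201_appln` and `tls211_pat_publn`, the way the D2 and 9999 exclusions are written, and the helper names are my assumptions, and the application kinds and filing dates in the demo are placeholders (only the publication rows come from this post).

```python
import sqlite3


def make_demo_db():
    """Tiny in-memory stand-in for PATSTAT (table and column names are assumptions)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tls201_appln (
            appln_id INTEGER PRIMARY KEY, appln_kind TEXT, appln_filing_date TEXT);
        CREATE TABLE tls211_pat_publn (
            pat_publn_id INTEGER PRIMARY KEY, publn_auth TEXT, publn_nr TEXT,
            publn_kind TEXT, appln_id INTEGER);
    """)
    # Application kinds and filing dates below are placeholders, not real values.
    conn.executemany("INSERT INTO tls201_appln VALUES (?, ?, ?)", [
        (48363687, "A", "2000-01-01"),
        (52789971, "A", "2000-01-01"),
        (27213431, "A", "2000-01-01"),
        (1, "D2", "2000-01-01"),   # hypothetical, excluded because of its kind
        (2, "A", "9999-12-31"),    # hypothetical, excluded because of its date
    ])
    conn.executemany("INSERT INTO tls211_pat_publn VALUES (?, ?, ?, ?, ?)", [
        (61180019, "US", " 7345336", "B2", 48363687),
        (61180020, "US", " 2005133849", "A1", 48363687),
        (66070794, "US", " 7470592", "B2", 52789971),
        (33698332, "JP", " 1427889", "C", 27213431),
        (33698333, "JP", " 62034618", "B2", 27213431),
        (33698334, "JP", " 62034618", "T3", 27213431),
        (33698335, "JP", " 58134863", "A", 27213431),
        (33698336, "JP", " 58134863", "T1", 27213431),
        (101, "US", " 111", "A1", 1), (102, "US", " 112", "B2", 1),  # hypothetical
        (103, "US", " 113", "A1", 2), (104, "US", " 114", "B2", 2),  # hypothetical
    ])
    return conn


def multiple_punr_by_authority(conn):
    """Return {publn_auth: (n_applications, n_applications_with_multiple_punr)}."""
    sql = """
        SELECT publn_auth,
               COUNT(*) AS n_appln,
               SUM(CASE WHEN n_punr > 1 THEN 1 ELSE 0 END) AS n_multi
        FROM (
            SELECT p.publn_auth, p.appln_id,
                   COUNT(DISTINCT TRIM(p.publn_nr)) AS n_punr
            FROM tls211_pat_publn p
            JOIN tls201_appln a ON a.appln_id = p.appln_id
            WHERE a.appln_kind <> 'D2'
              AND substr(a.appln_filing_date, 1, 4) <> '9999'
            GROUP BY p.publn_auth, p.appln_id
        ) AS per_appln
        GROUP BY publn_auth
        ORDER BY publn_auth
    """
    return {auth: (n, m) for auth, n, m in conn.execute(sql)}


if __name__ == "__main__":
    for auth, (n, m) in multiple_punr_by_authority(make_demo_db()).items():
        print(auth, n, m)
```

Publication numbers are trimmed before counting because they are stored with a leading blank; two rows with the same number but a different kind count as one punr.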
This issue affects both application counts and citation counts.
1a) a case @ USPTO
US patents get a publication number like YYYYXXXXXX (e.g. 2005166335) with the first publication, and a different publication number when granted.
For instance, appln_id 58139710 is published both as
'US', ' 7285137', 'B2'
'US', ' 2005166335', 'A1'
In our DB the two publications should somehow be treated as the same (maybe by keeping only granted patents).
This issue should also be checked on citations.
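One possible way to treat the publications of an application as the same is to keep a single representative publication per appln_id. The sketch below prefers the row flagged as first grant and otherwise falls back to the lowest pat_publn_id. The table name `tls211_pat_publn`, the helper names and the tie-breaking rule are my assumptions; the rows are the ones quoted in this post for appln_id 48363687 and 52789971.

```python
import sqlite3


def make_demo_db():
    """In-memory stand-in for the publication table (names are assumptions)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tls211_pat_publn ("
        "pat_publn_id INTEGER PRIMARY KEY, publn_auth TEXT, publn_nr TEXT, "
        "publn_kind TEXT, appln_id INTEGER, publn_first_grant INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tls211_pat_publn VALUES (?, ?, ?, ?, ?, ?)",
        [
            (61180019, "US", " 7345336", "B2", 48363687, 1),
            (61180020, "US", " 2005133849", "A1", 48363687, 0),
            (66070794, "US", " 7470592", "B2", 52789971, 1),
        ],
    )
    return conn


def representative_publications(conn):
    """Map each appln_id to one (pat_publn_id, publn_nr, publn_kind)."""
    sql = """
        SELECT p.appln_id, p.pat_publn_id, TRIM(p.publn_nr), p.publn_kind
        FROM tls211_pat_publn p
        WHERE p.pat_publn_id = (
            SELECT q.pat_publn_id
            FROM tls211_pat_publn q
            WHERE q.appln_id = p.appln_id
            -- granted publication first, then the lowest id as a tie-breaker
            ORDER BY q.publn_first_grant DESC, q.pat_publn_id ASC
            LIMIT 1
        )
    """
    return {row[0]: tuple(row[1:]) for row in conn.execute(sql)}


if __name__ == "__main__":
    for appln_id, rep in representative_publications(make_demo_db()).items():
        print(appln_id, rep)
```

Applications that were never granted keep their lowest-numbered publication under this rule; whether that is the right fallback depends on what the punr-based database is used for.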
Another symptomatic case is appln_id 48363687
61180019, 'US', ' 7345336', 'B2', 48363687, '2008-03-18', 'EN', 1
61180020, 'US', ' 2005133849', 'A1', 48363687, '2005-06-23', 'EN', 0
which also has many records in the citation table TLS212
PAT_PUBLN_ID | CITN_ID | CITED_PAT_PUBLN_ID | PAT_CITN_SEQ_NR | NPL_CITN_SEQ_NR | CITN_ORIGIN |
17472674 | 2 | 61180020 | 2 | 0 | '0 ' |
61180019 | 1 | 69488265 | 1 | 0 | '0 ' |
61180019 | 2 | 68438301 | 2 | 0 | '0 ' |
61180019 | 3 | 66754652 | 3 | 0 | '0 ' |
61180019 | 4 | 70514475 | 4 | 0 | '0 ' |
61180019 | 5 | 65067355 | 5 | 0 | '0 ' |
61180019 | 6 | 66578342 | 6 | 0 | '0 ' |
61180019 | 9 | 73815080 | 7 | 0 | '1 ' |
61180019 | 10 | 73815081 | 8 | 0 | '1 ' |
61180019 | 11 | 73815082 | 9 | 0 | '1 ' |
61180019 | 12 | 73815083 | 10 | 0 | '1 ' |
66070794 | 3 | 61180019 | 3 | 0 | '0 ' |
66070794 | 4 | 61180020 | 4 | 0 | '0 ' |
and we see something strange: publication 66070794 cites both punr!
66070794, 'US', ' 7470592', 'B2', 52789971, '2008-12-30', 'EN', 1
TEST: check whether PUNR 7470592 cites 7345336 and/or 2005133849 [it should cite only one]
Check whether both US 7345336 and 2005133849 exist in our punr [only one should exist]
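The citation part of this test could be automated roughly as follows: for a given citing publication, list every cited application that is reached through more than one cited publication. This is a sketch on an in-memory SQLite copy of the rows quoted above; the table names `tls211_pat_publn` and `tls212_citation` and the function names are my assumptions.

```python
import sqlite3


def make_demo_db():
    """In-memory stand-in for publications and citations (names are assumptions)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tls211_pat_publn (
            pat_publn_id INTEGER PRIMARY KEY, publn_auth TEXT, publn_nr TEXT,
            publn_kind TEXT, appln_id INTEGER);
        CREATE TABLE tls212_citation (
            pat_publn_id INTEGER, citn_id INTEGER, cited_pat_publn_id INTEGER);
    """)
    conn.executemany("INSERT INTO tls211_pat_publn VALUES (?, ?, ?, ?, ?)", [
        (61180019, "US", " 7345336", "B2", 48363687),
        (61180020, "US", " 2005133849", "A1", 48363687),
        (66070794, "US", " 7470592", "B2", 52789971),
    ])
    conn.executemany("INSERT INTO tls212_citation VALUES (?, ?, ?)", [
        (17472674, 2, 61180020),
        (66070794, 3, 61180019),
        (66070794, 4, 61180020),
    ])
    return conn


def doubly_cited_applications(conn, citing_pat_publn_id):
    """Return [(cited appln_id, number of its publications cited)] where that number > 1."""
    sql = """
        SELECT p.appln_id, COUNT(DISTINCT c.cited_pat_publn_id) AS n_cited
        FROM tls212_citation c
        JOIN tls211_pat_publn p ON p.pat_publn_id = c.cited_pat_publn_id
        WHERE c.pat_publn_id = ?
        GROUP BY p.appln_id
        HAVING COUNT(DISTINCT c.cited_pat_publn_id) > 1
        ORDER BY p.appln_id
    """
    return conn.execute(sql, (citing_pat_publn_id,)).fetchall()


if __name__ == "__main__":
    print(doubly_cited_applications(make_demo_db(), 66070794))
```

An empty list means the citing publication passes the test; on the demo rows, 66070794 fails it because it reaches appln_id 48363687 twice.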
1b) the same for Japan, but with 5 publications
appln_id 27213431
PAT_PUBLN_ID | PUBLN_AUTH | PUBLN_NR | PUBLN_KIND | APPLN_ID | PUBLN_DATE | PUBLN_FIRST_GRANT |
33698332 | 'JP' | ' 1427889' | 'C' | 27213431 | '1988-02-25' | 0 |
33698333 | 'JP' | ' 62034618' | 'B2' | 27213431 | '1987-07-28' | 1 |
33698334 | 'JP' | ' 62034618' | 'T3' | 27213431 | '1987-07-28' | 0 |
33698335 | 'JP' | ' 58134863' | 'A' | 27213431 | '1983-08-11' | 0 |
33698336 | 'JP' | ' 58134863' | 'T1' | 27213431 | '1983-08-11' | 0 |
Looking at citations we see
PAT_PUBLN_ID | CITN_ID | CITED_PAT_PUBLN_ID | PAT_CITN_SEQ_NR | CITN_ORIGIN |
18796442 | 5 | 33698335 | 5 | '4 ' |
18796442 | 6 | 33698336 | 6 | '4 ' |
18796442 | 7 | 33698333 | 7 | '4 ' |
18796442 | 8 | 33698334 | 8 | '4 ' |
33698333 | 1 | 38471795 | 1 | '0 ' |
33698333 | 2 | 42774760 | 2 | '0 ' |
33698333 | 3 | 35800209 | 3 | '0 ' |
48589428 | 3 | 33698335 | 3 | '0 ' |
52762211 | 3 | 33698335 | 3 | '0 ' |
The citing publication is EP1182223.
But if we look at its citations in espacenet, we see only JP58134863 (A)
http://worldwide.espacenet.com/allCitations?compact=true&page=0&KC=A1&NR=1182223A1&DB=EPODOC&locale=en_EP&CC=EP&FT=D
TEST: check whether PUNR EP1182223 cites JP 62034618 and/or 58134863 [it should cite only one]
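To see how much the duplicated publications inflate citation counts, one can compare the raw number of citation rows pointing at an application with the number of distinct citing publications. A minimal sketch on the rows quoted above for appln_id 27213431, again on in-memory SQLite with assumed table names `tls211_pat_publn` and `tls212_citation` and hypothetical function names:

```python
import sqlite3


def make_demo_db():
    """In-memory stand-in for publications and citations (names are assumptions)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tls211_pat_publn (
            pat_publn_id INTEGER PRIMARY KEY, publn_auth TEXT, publn_nr TEXT,
            publn_kind TEXT, appln_id INTEGER);
        CREATE TABLE tls212_citation (
            pat_publn_id INTEGER, citn_id INTEGER, cited_pat_publn_id INTEGER);
    """)
    conn.executemany("INSERT INTO tls211_pat_publn VALUES (?, ?, ?, ?, ?)", [
        (33698332, "JP", " 1427889", "C", 27213431),
        (33698333, "JP", " 62034618", "B2", 27213431),
        (33698334, "JP", " 62034618", "T3", 27213431),
        (33698335, "JP", " 58134863", "A", 27213431),
        (33698336, "JP", " 58134863", "T1", 27213431),
    ])
    conn.executemany("INSERT INTO tls212_citation VALUES (?, ?, ?)", [
        (18796442, 5, 33698335),
        (18796442, 6, 33698336),
        (18796442, 7, 33698333),
        (18796442, 8, 33698334),
        (48589428, 3, 33698335),
        (52762211, 3, 33698335),
    ])
    return conn


def citations_received(conn, appln_id):
    """Return (raw citation rows, distinct citing publications) for one application."""
    sql = """
        SELECT COUNT(*), COUNT(DISTINCT c.pat_publn_id)
        FROM tls212_citation c
        JOIN tls211_pat_publn p ON p.pat_publn_id = c.cited_pat_publn_id
        WHERE p.appln_id = ?
    """
    return tuple(conn.execute(sql, (appln_id,)).fetchone())


if __name__ == "__main__":
    raw, distinct_citing = citations_received(make_demo_db(), 27213431)
    print(raw, distinct_citing)
```

On these rows the raw count is 6 while only 3 publications actually cite the application. Note that the citing side may have the same problem (several citing publications of one application), so a full fix would probably need to collapse both sides to application level.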