Wednesday, April 5, 2017

CPC coverage by patent office in PATSTAT

In this post I make a brief analysis of CPC patent classification coverage by patent office.
This classification, jointly developed by the EPO and the USPTO, is stored in table TLS224_appln_cpc. Coverage is expected to be good for the two offices concerned, but what about the rest?

I ran this simple query, which drops every appln_kind/appln_auth combination with 10,000 applications or fewer (if you want all of them, just remove the last line of the SQL).




-- Per application kind and authority: total applications, applications with
-- at least one CPC symbol, and the share of the latter.
SELECT
  a.APPLN_KIND,
  a.APPLN_AUTH,
  COUNT(DISTINCT a.APPLN_ID) AS Count_APPLN_ID,
  COUNT(DISTINCT b.appln_id) AS count_app_with_cpc,
  COUNT(DISTINCT b.appln_id) / COUNT(DISTINCT a.APPLN_ID) AS ratio
FROM
  patstat.tls201_appln a
  LEFT JOIN patstat.tls224_appln_cpc b ON a.APPLN_ID = b.appln_id
WHERE
  a.APPLN_KIND IN ('A', 'W', 'U')
GROUP BY
  a.APPLN_KIND,
  a.APPLN_AUTH
HAVING Count_APPLN_ID > 10000
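
The logic of the query can be checked on a toy dataset. The sketch below is only an illustration: it uses Python's sqlite3 module with an in-memory database instead of the real PATSTAT server, the rows are made up, and the HAVING cut is left out because the toy tables are tiny. The LEFT JOIN keeps applications without any CPC symbol, and COUNT(DISTINCT ...) avoids counting an application once per symbol.

```python
import sqlite3

# Toy stand-ins for tls201_appln and tls224_appln_cpc (made-up rows, not PATSTAT data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tls201_appln (appln_id INTEGER, appln_auth TEXT, appln_kind TEXT);
    CREATE TABLE tls224_appln_cpc (appln_id INTEGER, cpc_class_symbol TEXT);
    INSERT INTO tls201_appln VALUES
        (1, 'EP', 'A'), (2, 'EP', 'A'), (3, 'JP', 'A'), (4, 'JP', 'A');
    -- Application 1 has two symbols, application 3 has one, 2 and 4 have none.
    INSERT INTO tls224_appln_cpc VALUES
        (1, 'H04L'), (1, 'G06F'), (3, 'A61K');
""")

rows = conn.execute("""
    SELECT a.appln_kind,
           a.appln_auth,
           COUNT(DISTINCT a.appln_id) AS count_appln_id,
           COUNT(DISTINCT b.appln_id) AS count_app_with_cpc,
           -- SQLite divides integers as integers, hence the 1.0 factor.
           1.0 * COUNT(DISTINCT b.appln_id) / COUNT(DISTINCT a.appln_id) AS ratio
    FROM tls201_appln a
    LEFT JOIN tls224_appln_cpc b ON a.appln_id = b.appln_id
    GROUP BY a.appln_kind, a.appln_auth
    ORDER BY a.appln_auth
""").fetchall()

for row in rows:
    print(row)
```

Each office ends up with two applications, one of which has a CPC symbol, so both ratios are 0.5 even though application 1 appears twice in the CPC table.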


APPLN_KIND  APPLN_AUTH  Count_APPLN_ID  count_app_with_cpc  ratio
A           AR          143854          103392              0,7187
A           AT          586654          174188              0,2969
A           AU          987424          723214              0,7324
A           BE          646087          551562              0,8537
A           BG          43815           10902               0,2488
A           BR          528463          357156              0,6758
A           CA          3192685         1253286             0,3925
A           CH          1048056         570376              0,5442
A           CL          11826           9228                0,7803
A           CN          5810469         1977959             0,3404
A           CO          23351           18157               0,7776
A           CS          163205          32824               0,2011
A           CZ          51254           35776               0,698
A           DD          235473          47646               0,2023
A           DE          4590544         3836312             0,8357
A           DK          318552          118879              0,3732
A           EA          40793           36499               0,8947
A           EG          14049           10349               0,7366
A           EP          3150486         3039800             0,9649
A           ES          421955          201679              0,478
A           FI          250436          111626              0,4457
A           FR          3091545         2380576             0,77
A           GB          3376290         2110469             0,6251
A           GR          69161           24458               0,3536
A           HK          127831          114540              0,896
A           HR          11539           6773                0,587
A           HU          128298          70045               0,546
A           ID          12502           10306               0,8243
A           IE          91640           42965               0,4688
A           IL          214488          120983              0,5641
A           IN          104686          45932               0,4388
A           IT          604939          327372              0,5412
A           JP          13783554        4347397             0,3154
A           KR          2729625         1113216             0,4078
A           LU          68593           59739               0,8709
A           MA          18584           13679               0,7361
A           MX          255693          229828              0,8988
A           MY          49897           39764               0,7969
A           NL          593500          526752              0,8875
A           NO          221586          170704              0,7704
A           NZ          139141          108558              0,7802
A           OA          13410           13136               0,9796
A           PE          19116           17210               0,9003
A           PH          26156           20769               0,794
A           PL          244106          79350               0,3251
A           PT          41819           33927               0,8113
A           RO          71993           15081               0,2095
A           RU          642467          192483              0,2996
A           SE          857962          329536              0,3841
A           SG          97293           85477               0,8786
A           SK          23906           19096               0,7988
A           SU          1363162         100320              0,0736
A           TR          29672           14789               0,4984
A           TW          697410          469009              0,6725
A           UA          53803           17525               0,3257
A           US          12495559        11412288            0,9133
A           UY          10210           8238                0,8069
A           YU          43063           24681               0,5731
A           ZA          293233          191248              0,6522
U           AT          15709           10549               0,6715
U           BR          102317          4534                0,0443
U           CN          5420771         230697              0,0426
U           CZ          29388           2229                0,0758
U           DE          1399753         611580              0,4369
U           ES          325927          31774               0,0975
U           FI          12437           1473                0,1184
U           IT          139494          12912               0,0926
U           JP          4286792         116688              0,0272
U           KR          503468          17394               0,0345
U           PL          23218           913                 0,0393
U           RU          162859          5533                0,034
U           TR          23205           436                 0,0188
U           TW          398483          31273               0,0785
U           UA          98814           1900                0,0192
W           AT          10289           10257               0,9969
W           AU          38912           38611               0,9923
W           CA          39496           39314               0,9954
W           CH          13751           13566               0,9865
W           CN          143897          140148              0,9739
W           DE          64921           64671               0,9961
W           DK          17546           17422               0,9929
W           EP          446196          444400              0,996
W           ES          16506           16379               0,9923
W           FI          24782           24678               0,9958
W           FR          80138           79859               0,9965
W           GB          112553          112219              0,997
W           IB          129773          128258              0,9883
W           IL          21456           21332               0,9942
W           IT          12395           12255               0,9887
W           JP          480898          475730              0,9893
W           KR          112245          111369              0,9922
W           NL          20444           20164               0,9863
W           RU          13810           13702               0,9922
W           SE          52713           52544               0,9968
W           US          974794          972636              0,9978


As with many other things in PATSTAT, the results are heavily biased toward EU/US and the PCT procedure; China and Japan, for instance, have low coverage.
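
As a quick sanity check, the ratio column can be recomputed by hand from the two count columns. The snippet below is a minimal sketch in Python using a few rows copied from the table above; the helper name `coverage` is just an illustrative choice, not something from PATSTAT.

```python
# (kind, office): (applications, applications with CPC), copied from the table above.
counts = {
    ("A", "EP"): (3150486, 3039800),
    ("A", "US"): (12495559, 11412288),
    ("A", "CN"): (5810469, 1977959),
    ("A", "JP"): (13783554, 4347397),
    ("W", "CN"): (143897, 140148),
    ("W", "JP"): (480898, 475730),
}


def coverage(total, with_cpc):
    """Share of applications that carry at least one CPC symbol."""
    return with_cpc / total


ratios = {key: round(coverage(*value), 4) for key, value in counts.items()}
for (kind, office), ratio in sorted(ratios.items()):
    print(kind, office, ratio)
```

For the same office, kind W (PCT) applications are almost fully covered while kind A applications for CN and JP stay around one third, which is the bias described above.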

If I limit the analysis to roughly the last 10 years (earliest filing year after 2005), we might expect an improvement: the CPC is a very recent classification, so some older data may simply not be available.

In reality the results do not improve, which suggests that the issue very likely lies in how patent classification data is transmitted to the EPO by the other authorities.



-- Same coverage query, restricted to applications whose earliest filing year
-- is after 2005.
SELECT
  a.APPLN_KIND,
  a.APPLN_AUTH,
  COUNT(DISTINCT a.APPLN_ID) AS Count_APPLN_ID,
  COUNT(DISTINCT b.appln_id) AS count_app_with_cpc,
  COUNT(DISTINCT b.appln_id) / COUNT(DISTINCT a.APPLN_ID) AS ratio
FROM
  patstat.tls201_appln a
  LEFT JOIN patstat.tls224_appln_cpc b ON a.APPLN_ID = b.appln_id
WHERE
  a.APPLN_KIND IN ('A', 'W', 'U')
  AND a.earliest_filing_year > 2005
GROUP BY
  a.APPLN_KIND,
  a.APPLN_AUTH
HAVING Count_APPLN_ID > 10000



APPLN_KIND  APPLN_AUTH  Count_APPLN_ID  count_app_with_cpc  ratio
A           AR          37718           30567               0,8104
A           AT          17481           12258               0,7012
A           AU          246785          190383              0,7715
A           BE          204142          192894              0,9449
A           BR          118530          88414               0,7459
A           CA          1841118         297277              0,1615
A           CH          15308           11809               0,7714
A           CL          10342           8322                0,8047
A           CN          4751372         1369961             0,2883
A           CO          11994           9692                0,8081
A           DD          59182           1291                0,0218
A           DE          784461          736800              0,9392
A           DK          60805           2562                0,0421
A           EA          25419           22211               0,8738
A           EP          1154319         1090219             0,9445
A           ES          24405           22123               0,9065
A           FI          56764           9281                0,1635
A           FR          179749          175792              0,978
A           GB          257096          108321              0,4213
A           HK          41888           34428               0,8219
A           IL          43564           35846               0,8228
A           IN          44875           21109               0,4704
A           IT          83093           57989               0,6979
A           JP          3051373         1031840             0,3382
A           KR          1488091         644638              0,4332
A           MX          98924           91765               0,9276
A           MY          12219           7964                0,6518
A           NL          128117          113945              0,8894
A           NO          13834           11155               0,8063
A           NZ          27915           21952               0,7864
A           PL          28766           4635                0,1611
A           RU          332559          107907              0,3245
A           SE          516187          78022               0,1512
A           SG          50077           45657               0,9117
A           TR          11142           1607                0,1442
A           TW          420134          270784              0,6445
A           UA          26091           8445                0,3237
A           US          3611958         3525507             0,9761
A           ZA          35141           24542               0,6984
U           BR          22582           2877                0,1274
U           CN          4536473         205099              0,0452
U           CZ          13110           1178                0,0899
U           DE          450225          142846              0,3173
U           ES          24676           3740                0,1516
U           IT          20400           2419                0,1186
U           JP          76949           6040                0,0785
U           KR          119275          9100                0,0763
U           RU          107186          4734                0,0442
U           TR          16253           286                 0,0176
U           TW          216656          15305               0,0706
U           UA          90212           1687                0,0187
W           AU          15372           15120               0,9836
W           CA          18968           18836               0,993
W           CN          132211          128494              0,9719
W           DE          15566           15495               0,9954
W           EP          269767          268788              0,9964
W           ES          10324           10248               0,9926
W           FI          10036           9949                0,9913
W           FR          31816           31649               0,9948
W           GB          39827           39618               0,9948
W           IB          82317           81189               0,9863
W           IL          11361           11270               0,992
W           JP          321399          316659              0,9853
W           KR          88622           87809               0,9908
W           SE          16835           16736               0,9941
W           US          470474          468811              0,9965
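
To compare the two result sets, one can line up the ratios for the same office and application kind. The sketch below does this in Python for a handful of kind A rows picked by hand from the two tables; the selection of offices is an arbitrary choice for illustration, not a full comparison.

```python
# Kind 'A' ratios copied from the two tables: (all years, filed after 2005).
ratios = {
    "US": (0.9133, 0.9761),
    "EP": (0.9649, 0.9445),
    "DE": (0.8357, 0.9392),
    "JP": (0.3154, 0.3382),
    "KR": (0.4078, 0.4332),
    "CN": (0.3404, 0.2883),
    "CA": (0.3925, 0.1615),
}

# Change in coverage when only recent applications are considered.
changes = {
    office: round(recent - overall, 4)
    for office, (overall, recent) in ratios.items()
}
improved = sorted(office for office, delta in changes.items() if delta > 0)
worsened = sorted(office for office, delta in changes.items() if delta < 0)

for office, delta in sorted(changes.items(), key=lambda item: item[1]):
    print(f"{office}: {delta:+.4f}")
print("improved:", improved)
print("worsened:", worsened)
```

In this sample JP and KR move up only slightly and CN and CA go down, so the large non-EP/US offices remain far from the coverage seen for EP and US.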