Monday, December 18, 2017

Maxmind dataset for geolocation

On the web page:

https://www.maxmind.com/en/open-source-data-and-api-for-ip-geolocation

MaxMind provides several datasets useful for geolocation:

GeoLite2 databases are free IP geolocation databases. The GeoLite2 Country and City databases are updated on the first Tuesday of each month; the GeoLite2 ASN database is updated every Tuesday.

IP Geolocation Usage

IP geolocation is inherently imprecise. Locations are often near the center of the population. Any location provided by a GeoIP database should not be used to identify a particular address or household.
Use the Accuracy Radius as an indication of geolocation accuracy for the latitude and longitude coordinates we return for an IP address. The actual location of the IP address is likely within the area defined by this radius and the latitude and longitude coordinates.
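As a rough illustration of how the accuracy radius can be read, the sketch below checks whether a candidate point falls inside the circle defined by the coordinates and radius returned for an IP address. It assumes the radius is expressed in kilometres; the function names are hypothetical and not part of any MaxMind API.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_accuracy_radius(db_lat, db_lon, radius_km, lat, lon):
    """True if (lat, lon) lies inside the circle centred on the database
    coordinates. Assumption: radius_km is the accuracy radius in kilometres."""
    return haversine_km(db_lat, db_lon, lat, lon) <= radius_km
```

A result of True only means the point is compatible with the database entry; it does not pin the IP address to that point.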

Includes city, region, country, latitude and longitude. This product doesn't contain any IP addresses.

Includes the following fields:

  • Country Code
  • ASCII City Name
  • City Name
  • Region
  • Population
  • Latitude (The approximate latitude of the postal code, city, subdivision or country associated with the IP address.*)
  • Longitude (The approximate longitude of the postal code, city, subdivision or country associated with the IP address.*)
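The fields listed above could be loaded from a comma-separated file along these lines. This is only a sketch: the header names in SAMPLE are assumptions about how the file labels its columns, and the two rows are made-up placeholder values, not real data.

```python
import csv
import io

# Assumed header names and made-up placeholder rows (not real data).
SAMPLE = """Country,City,AccentCity,Region,Population,Latitude,Longitude
xx,exampleville,Exampleville,00,12345,10.5,20.25
xx,sampleton,Sampleton,01,,11.0,21.5
"""


def parse_cities(fileobj):
    """Read city rows into dicts keyed by the field names listed above."""
    rows = []
    for rec in csv.DictReader(fileobj):
        rows.append({
            "country_code": rec["Country"],
            "ascii_city_name": rec["City"],
            "city_name": rec["AccentCity"],
            "region": rec["Region"],
            # Population may be blank, so keep it as None in that case.
            "population": int(rec["Population"]) if rec["Population"] else None,
            "latitude": float(rec["Latitude"]),
            "longitude": float(rec["Longitude"]),
        })
    return rows


cities = parse_cities(io.StringIO(SAMPLE))
```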

Another interesting dataset, even if no longer maintained, is World Cities with population:

City/state/country text as it appears in source files is algorithmically matched against a master geocode file from Google and MaxMind open source files.

Tuesday, December 12, 2017

Google Patents public dataset

Since October 31st, 2017, the patent data behind Google Patents has been available on Google Cloud and the BigQuery platform: worldwide bibliographic information on more than 90 million patent publications from 17 countries, plus US full text, provided by IFI CLAIMS Patent Services.

https://cloud.google.com/blog/big-data/2017/10/google-patents-public-datasets-connecting-public-paid-and-private-patent-data

The page below has more details and examples:

https://console.cloud.google.com/launcher/details/google_patents_public_datasets/google-patents-public-data
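As a starting point, a query counting publications per country could look like the sketch below. It assumes the public table is named `patents-public-data.patents.publications` and has a `country_code` column, and that the `google-cloud-bigquery` package and valid credentials are available; the helper names are hypothetical.

```python
PUBLICATIONS_TABLE = "patents-public-data.patents.publications"  # assumed table name


def build_country_count_query(limit=10):
    """Build a SQL string counting publications per country code."""
    limit = int(limit)
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT country_code, COUNT(*) AS n_publications\n"
        f"FROM `{PUBLICATIONS_TABLE}`\n"
        "GROUP BY country_code\n"
        "ORDER BY n_publications DESC\n"
        f"LIMIT {limit}"
    )


def run_query(sql):
    """Run the query on BigQuery and return the rows as a list of dicts."""
    # Imported here so the query builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses the default project and credentials
    return [dict(row.items()) for row in client.query(sql).result()]
```

Calling `run_query(build_country_count_query(20))` would then return the 20 country codes with the most publications, subject to the assumptions above.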

Tuesday, December 5, 2017

Addresses quality checks in PATSTAT2017b by application authority

After loading the new PATSTAT version, I checked how many addresses are covered and at what level of detail, and compared the results with the 2017a version.
The two tables below, one for inventors and one for applicants, cover authorities with more than 100K addresses.
For example, the streetratio column is the share (a fraction between 0 and 1) of person IDs in that application authority that have a street-level address.
srchg is the change in street-level coverage between 2017b and 2017a; the values look like the ratio of the 2017b figure to the 2017a figure, so 1 means no change.
You'll note a big difference in coverage for EP between inventors and applicants; that is due to PCT patents.
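A minimal sketch of how such coverage ratios could be computed is shown below. It is not the query actually used for the tables: the input format, the level labels and the function names are hypothetical. It assumes that a street-level address also counts as city-level and country-level, and that the change figure is the 2017b ratio divided by the 2017a ratio, reported as 0 when the 2017a ratio is 0.

```python
from collections import defaultdict

# Assumed ordering of address detail levels (labels are hypothetical).
LEVEL_RANK = {"none": 0, "country": 1, "city": 2, "street": 3}


def coverage_by_authority(records):
    """records: iterable of (appln_auth, person_id, level) tuples.

    Returns {appln_auth: {npersons, streetratio, cityratio, countryratio}},
    using the best address level seen for each person within an authority.
    """
    best = defaultdict(dict)  # appln_auth -> person_id -> best rank seen
    for auth, person_id, level in records:
        rank = LEVEL_RANK[level]
        best[auth][person_id] = max(rank, best[auth].get(person_id, 0))

    result = {}
    for auth, persons in best.items():
        n = len(persons)
        ranks = list(persons.values())
        result[auth] = {
            "npersons": n,
            "streetratio": sum(r >= 3 for r in ranks) / n,
            "cityratio": sum(r >= 2 for r in ranks) / n,
            "countryratio": sum(r >= 1 for r in ranks) / n,
        }
    return result


def change_ratio(new_ratio, old_ratio):
    """New coverage divided by old coverage; 0.0 when the old ratio is 0
    (an assumption about how that case is reported)."""
    return new_ratio / old_ratio if old_ratio else 0.0
```

Under these assumptions, a street ratio of 0.0015 in 2017b against 0.0014 in 2017a gives a change of about 1.07.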


For inventors:

appln_auth  npersons  streetratio  cityratio  countryratio  srchg  crchg (city)  crchg (country)
AT          950789    0            0.0022     0.7225        0      1             0.995
AU          1531395   0            0.0039     0.0524        0      1.05          0.919
BE          110226    0            0.0572     0.4439        0      1             0.998
BR          872265    0            0.1876     0.3284        0      0.96          1.112
CA          2141815   0            0.0025     0.6627        0      1.04          1.003
CH          466504    0            0.0032     0.8701        0      1             0.998
CN          4266447   0            0.0084     0.4163        0      1.12          0.916
CS          152592    0            0.0007     0.63          0      1             0.977
CZ          110426    0            0.0027     0.9955        0      1             0.999
DD          191187    0            0.026      0.7255        0      1             0.968
DE          3994768   0            0.0018     0.6704        0      1             0.998
DK          504473    0            0.119      0.6812        0      1.23          1
EP          6330666   0.92         0.9257     0.9965        1      1             1
ES          802033    0            0.0006     0.6699        0      1             0.967
FI          311973    0.0021       0.4677     0.8749        1      1             0.995
FR          1014297   0            0.0079     0.3407        0      0.99          0.959
GB          1085279   0            0.2004     0.4916        0      0.99          1.004
GR          105370    0            0.0011     0.4617        0      1             0.973
HK          252844    0            0          0.0929        0      0             0.815
HU          190601    0            0.0186     0.5856        0      0.95          1.022
IB          188565    0            0.0603     0.9237        0      1.06          1.002
IN          176692    0.001        0.3954     0.5362        1      1             0.961
IT          382664    0            0.0026     0.3065        0      1             0.921
JP          2859655   0            0.0203     0.1535        0      1.06          0.865
KR          1611359   0            0.0116     0.8534        0      1.14          0.982
MX          441343    0            0          0.4538        0      0             0.974
NO          316623    0            0.0041     0.6192        0      1             0.995
NZ          223382    0            0.0022     0.0642        0      1.05          0.719
PL          287538    0            0.097      0.7545        0      1.34          1.016
PT          168647    0            0.0005     0.9983        0      1             1
RO          106423    0            0.0293     0.6197        0      1.74          1.004
RU          809149    0.0007       0.0787     0.8349        1.75   1.85          1.012
SE          367199    0            0.0178     0.2765        0      1.01          0.984
SG          231036    0            0.3751     0.6641        0      1.11          0.978
SU          1350765   0            0.0026     0.5436        0      1             0.964
TW          852269    0            0          0.9993        0      0             1
UA          143286    0            0.0008     0.9985        0      1             1
US          10656438  0.0134       0.6699     0.8471        0.98   1             1.002
ZA          395683    0            0.0015     0.049         0      1.88          0.664


For applicants:

appln_auth  npersons  streetratio  cityratio  countryratio  srchg        crchg (city)  crchg (country)
AT          341922    0            0.0052     0.6823        0            1             0.9912829
AU          503096    0            0.0066     0.1304        0            1.03125       0.956713133
BE          101431    0            0.0664     0.3201        0            0.998496241   0.983712354
BR          296562    0.0002       0.1405     0.8121        0.666666667  0.98943662    0.980797101
CA          862535    0.0001       0.0485     0.6535        1            0.995893224   0.998014661
CH          284867    0            0.0033     0.8876        0            1             0.997527534
CN          2278384   0            0.0054     0.3803        0            1.038461538   0.924854086
CS          117511    0            0.0003     0.5866        0            1             0.95165477
DE          1513829   0            0.0014     0.5525        0            1             0.977184294
DK          192204    0.0001       0.0722     0.805         1            1.199335548   0.994686766
EP          1060347   0.6264       0.6311     0.9973        1.004651163  1.005256451   0.999498898
ES          511244    0            0.0005     0.502         0            1             0.90515687
FI          122141    0.0015       0.3993     0.9551        1.071428571  0.996257485   0.99614101
FR          648475    0            0.0055     0.7175        0            1             0.997220292
GB          1201337   0.0002       0.0751     0.3621        1            0.997343958   0.993143171
IB          149468    0.0001       0.0278     0.9911        1            1.07751938    0.999798245
IL          139467    0            0.0269     0.4949        0            1.735483871   0.994573955
IT          380740    0            0.2215     0.7166        0            1             0.993208593
JP          885468    0            0.0082     0.3509        0            1.037974684   0.955870335
KR          513961    0            0.0085     0.8975        0            1.118421053   0.998553627
NL          105990    0            0.0346     0.4917        0            1.459915612   1.010273269
RU          242143    0.0005       0.0567     0.7203        1.666666667  1.75          1.015078918
SE          158039    0.0001       0.0184     0.7743        1            1             0.986746527
SU          598130    0            0.0017     0.5591        0            1             0.885632821
TW          214743    0            0          0.9993        0            0             1
US          6478266   0.0101       0.5979     0.8035        0.990196078  1.001339809   0.938448961
ZA          104475    0            0.1294     0.2184        0            1.172101449   1.016286645