Showing posts with label mysql. Show all posts

Wednesday, November 6, 2019

PATSTAT 2019b MySQL upload scripts


My new MySQL scripts for uploading PATSTAT 2019b data are available at this link.


The only change in this data edition is the removal of table TLS906_PERSON, which is now fully integrated into TLS206.
My scripts have been updated to remove MyISAM as the default engine and to make them fully compatible with MySQL 8.0.

Thursday, June 6, 2019

Patent familiarity calc scripts

I have made available on GitHub a set of MySQL and Python scripts that create a familiarity indicator by IPC class on the NBER patents dataset.

https://github.com/gtarasconi/NBER-familiarity-indicators

The indicator captures an inventor's familiarity with the components of an invention, measured by the (a) recent and (b) frequent usage of the focal patent's classes across all US patents. A separate familiarity measure is therefore calculated for each class of a focal patent: the more recently and frequently a class has been used, the higher its individual measure.

Based on Fleming 2001
https://funginstitute.berkeley.edu/wp-content/uploads/2012/10/Recombinant-Uncertainty-in-Technological-Search.pdf
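
To make the idea concrete, here is a minimal Python sketch of a per-class familiarity score. It assumes an exponential decay over the time distance between patents, in the spirit of Fleming (2001); the function name, the input layout and the `decay_years` constant are hypothetical and are not taken from the scripts in the repository, which may compute the indicator differently.

```python
import math


def class_familiarity(focal_year, focal_classes, prior_patents, decay_years=5.0):
    """Return {class: familiarity} for one focal patent.

    prior_patents is an iterable of (year, classes) tuples.
    Only patents strictly earlier than focal_year contribute, and each
    contribution is weighted by exp(-(age in years) / decay_years),
    so recent and frequent usage both raise the score.
    decay_years is an assumed knowledge-loss constant, not a source value.
    """
    scores = {c: 0.0 for c in focal_classes}
    for year, classes in prior_patents:
        if year >= focal_year:
            continue  # ignore patents that are not earlier than the focal one
        weight = math.exp(-(focal_year - year) / decay_years)
        for c in set(classes):  # count each class once per prior patent
            if c in scores:
                scores[c] += weight
    return scores


# Toy data: three earlier patents and one focal patent with two classes.
prior = [(1990, ["A"]), (1995, ["A", "B"]), (1998, ["B"])]
print(class_familiarity(2000, ["A", "B"], prior))
```

A class used often and recently accumulates many weights close to 1, while a class used long ago contributes weights close to 0.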


Sunday, May 19, 2019

PATSTAT 2019a MySQL upload scripts


My new MySQL scripts for uploading PATSTAT 2019a data are available at this link.

New features in this edition:


Person names in the original language (PERSON_NAME_ORIG_LG)
This field has been added to tables TLS206, TLS226 and TLS906. It inflates the number of records and breaks backward compatibility with older person-level data.

“RELEVANT_CLAIM” attribute in the TLS215_CITN_CATEG table
A new attribute “RELEVANT_CLAIM” has been added to the TLS215_CITN_CATEG table. This attribute contains a single number identifying the claim to which the citation refers.

Friday, April 20, 2018

PATSTAT spring 2018 MySQL upload scripts

At this link you can download a batch of MySQL scripts that will allow you to upload the new PATSTAT edition, spring 2018.

This release includes some improvements, such as:

* Table TLS201_APPLN: a new attribute named “RECEIVING_OFFICE” has been added. PCT applications now have "WO" as application authority, and the previous value has been moved to the new field.

* Table TLS212_CITATION: Euro-PCT applications did not have the citations from the international search report linked to the respective application (and publication). These are the so-called A0 publications. To address this, the EPO simply duplicated the citations from the international search report and linked them to the respective EP publications.

* Table TLS231_INPADOC_LEGAL_STATUS: a new attribute EVENT_ID has been added to serve as the primary key of this table.



Monday, November 27, 2017

PATSTAT 2017b MySQL upload scripts

From this link:

https://www.dropbox.com/s/hkkn0xapxfymdl8/patstat2017b.zip?dl=0

you can download my MySQL scripts, which upload the majority of tables from the 2017b edition of PATSTAT.

Changes from the previous edition:
Table TLS214_NPL_PUBLN:
Some attributes are now populated for more NPL types.
Attribute ONLINE_CLASSIFICATION may hold more than one Derwent class.
Attribute ONLINE_AVAILABILITY can now hold up to 500 characters.
Attribute NPL_AUTHOR can now hold up to 1,000 characters.

Table TLS231_INPADOC_LEGAL_STATUS:
New attribute EVENT_FILING_DATE

Note also that the OECD harmonized names (cf. attributes HAN_ID, HAN_NAME, HAN_HARMONIZED) have not been updated since the 2016 Autumn edition. As a consequence, persons who have been added since then will have default values in these attributes.

Tuesday, June 27, 2017

Splitting huge dump files

When importing a very big SQL dump, some configuration differences (e.g. MyISAM vs InnoDB) may prevent a correct upload into your database, or you may need to import only some of the tables.
If the dump is too big to open and edit, you may need to split it.

I found a very handy tool for this: SQLdumpsplitter.


It has a very simple interface and lets you choose the size of the split files and whether to skip comment lines;
it also creates a separate file for the CREATE TABLE statements.
You can then reassemble the parts you need with a single DOS command:
copy *.sql new.sql
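
If you are not on Windows, the same reassembly step can be sketched in Python. This is only an illustration: the function name `concatenate_sql` and the file names are hypothetical, and it assumes the split files sort correctly by name (zero-padded numbers, for example).

```python
import shutil
from pathlib import Path


def concatenate_sql(parts_dir, output_file, pattern="*.sql"):
    """Append every file matching pattern in parts_dir to output_file.

    Files are taken in lexicographic name order. The output file itself
    is skipped, so it can safely live in the same directory as the parts.
    Returns the list of part files that were written, in order.
    """
    parts_dir = Path(parts_dir)
    output_file = Path(output_file)
    target = output_file.resolve()
    parts = sorted(p for p in parts_dir.glob(pattern) if p.resolve() != target)
    with output_file.open("wb") as out:
        for part in parts:
            with part.open("rb") as src:
                shutil.copyfileobj(src, out)  # stream, do not load whole file
    return parts


# Example call (directory and file names are placeholders):
# concatenate_sql("split_parts", "split_parts/new.sql")
```

Note that plain name ordering puts `part_10.sql` before `part_2.sql`, so check the order of the returned list before importing the result.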


Thursday, March 2, 2017

Getting Started with PATSTAT Register

Just published in the Australian Economic Review, this article by Gaétan de Rassenfosse (EPFL), Martin Kracker (EPO) and me provides a technical introduction to the PATSTAT Register database, which contains bibliographical, procedural and legal status data on patent applications handled by the European Patent Office. It presents eight MySQL queries that cover some of the most relevant aspects of the database for research purposes. It targets academic researchers and practitioners who are familiar with the PATSTAT database and the MySQL language.

link: http://onlinelibrary.wiley.com/doi/10.1111/1467-8462.12214/full

Wednesday, December 9, 2015

Mysql upload scripts for Patstat 2015b

Following the many changes that took place in 2015b, you can download the new scripts at this link:

https://dl.dropboxusercontent.com/u/3004945/rawpatentdata/2015bmysqlscripts.zip

These are my MySQL scripts for uploading the majority of tables from the 2015b edition of PATSTAT.

Please note that I loaded the new TLS906 into TLS206 in order to smooth the database steps that follow the import of PATSTAT data.

For a list of changes, please see the post at this link

Sunday, June 21, 2015

Patstat 2015a upload scripts for mysql

According to the EPO cover letter, the 2015a edition of PATSTAT brings a lot of changes (you can find a summary here).

At this link:

https://www.dropbox.com/sh/ut0xpimwacy8y0u/AADLrqnb5pj3UvlKUkwRe6nka?dl=0

you can download my MySQL scripts, which upload the majority of tables from the 2015a edition of PATSTAT.

Please note that I loaded the new TLS906 into TLS206 in order to smooth the database steps that follow the import of PATSTAT data.

Tuesday, November 26, 2013

Patstat october 2013 mysql upload scripts

The October 2013 PATSTAT edition contains a lot of changes and new tables (especially on the persons side).
Here you can download the MySQL scripts I made for uploading the data files from the DVDs.

Remember to customize them for your own upload directory / target library.
Also note that I did not upload TLS203, TLS222 and TLS223, but the scripts for them should work fine.

Below is a comparison of the imported records with the expected (declared) records and with the October 2012 version.
(Odd: TLS209 has decreased, probably due to the introduction of TLS224?)


table                  date of files  declared   patstat Y-1   imported   delta imp  delta y-1
TLS201_APPLN           201310         76594275   73177050      76594275   0          4.67%
TLS202_APPLN_TITLE     201310         57335762   54184267      57335737   25         5.82%
TLS203_APPLN_ABSTR     201310         33870394   31187723      0          33870394   8.60%
TLS204_APPLN_PRIOR     201310         33615459   32125776      33615459   0          4.64%
TLS205_TECH_REL        201310         2138085    2131770       2138085    0          0.30%
TLS206_PERSON          201310         44730405   41478412      44730372   33         7.84%
TLS207_PERS_APPLN      201310         163534542  152682496     163534542  0          7.11%
TLS208_DOC_STD_NMS     201310         19409591   18766111      19409591   0          3.43%
TLS209_APPLN_IPC       201310         176153334  177915634     176153334  0          -0.99%
TLS210_APPLN_N_CLS     201310         21884148   21844519      21884148   0          0.18%
TLS211_PAT_PUBLN       201310         85733781   82561787      85733781   0          3.84%
TLS212_CITATION        201310         142247157  125658281     142247157  0          13.20%
TLS214_NPL_PUBLN       201310         22095090   19128778      22095084   6          15.51%
TLS215_CITN_CATEG      201310         22390889   20850431      22390891   -2         7.39%
TLS216_APPLN_CONTN     201310         2272520    2104686       2272520    0          7.97%
TLS218_DOCDB_FAM       201310         67766434   64571193      67766434   0          4.95%
TLS219_INPADOC_FAM     201310         74865923   73177050      74865923   0          2.31%
TLS221_INPADOC_PRS     201310         135555494  111194984     -          -          21.91%
TLS222_APPLN_JP_CLASS  201310         294109156  285734975     -          294109156  2.93%
TLS223_APPLN_DOCUS     201310         36668997   33590420      -          36668997   9.17%
tls224_appln_cpc       201310         137478291  not existing  137478291  0
tls226_person_orig     201310         47699382   not existing  47699382   0
tls227_pers_publn      201310         193982180  not existing  193982180  0
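
The two delta columns of the comparison can be reproduced with a few lines of Python. The formulas below are inferred from the figures themselves (delta imp = declared minus imported, delta y-1 = relative growth of the declared count over the previous edition), so treat them as an assumption; the function name `compare_counts` is hypothetical.

```python
def compare_counts(declared, imported, previous=None):
    """Return (delta_imp, delta_prev) for one table.

    delta_imp  : declared rows minus imported rows (0 means a full import).
    delta_prev : relative growth of the declared count versus the previous
                 edition, or None when the table did not exist before.
    """
    delta_imp = declared - imported
    delta_prev = None if previous is None else declared / previous - 1
    return delta_imp, delta_prev


# TLS202_APPLN_TITLE: declared, imported, October 2012 count
d_imp, d_prev = compare_counts(57335762, 57335737, 54184267)
print(d_imp, f"{d_prev:.2%}")  # 25 5.82%
```

In practice the imported count would come from a `SELECT COUNT(*)` on each table after the upload.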