Tuesday, January 10, 2012

EEE-PPAT: Released version Oct 2011



The EEE-PPAT table is an extension of the PERSON TABLE, produced by ECOOM (Catholic University of Leuven) and Eurostat. The extension covers sector allocation and name harmonization of applicants.


It can be requested at no cost by contacting technoinfo@ecoom.be; some documentation is available at this link.


It contains the following columns, keyed by the original PATSTAT person_id:
        - PERSON_ID
        - HRM_LEVEL1: harmonized name, level 1
        - SECTOR: sector of the assignee name


The file is UTF-8 encoded and tab-delimited.
The table definition is as follows:
        - PERSON_ID       number(9)
        - HRM_LEVEL1      char(400)
        - SECTOR          char(50)


The tab-delimited file contains 12488647 records.
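
To make the layout above concrete, here is a minimal Python sketch that parses a tab-delimited UTF-8 stream into typed records. It is only an illustration, not the loader script mentioned later in this post: the names `PpatRow` and `read_ppat`, the `has_header` switch (the post does not say whether the file has a header row) and the sample rows are all assumptions.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class PpatRow(NamedTuple):
    person_id: int   # PERSON_ID, number(9)
    hrm_level1: str  # HRM_LEVEL1, char(400)
    sector: str      # SECTOR, char(50)


def read_ppat(stream: TextIO, has_header: bool = False) -> Iterator[PpatRow]:
    """Yield one PpatRow per non-empty line of a tab-delimited stream."""
    # QUOTE_NONE keeps any quote characters in names as literal text.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    if has_header:  # assumption: header presence is not documented in the post
        next(reader, None)
    for fields in reader:
        if not fields:  # skip blank lines
            continue
        person_id, hrm_level1, sector = fields
        yield PpatRow(int(person_id), hrm_level1, sector)


# Hypothetical sample rows, not taken from the real file.
sample = io.StringIO("1\tACME CORP\tCOMPANY\n2\tDOE, JOHN\tINDIVIDUAL\n")
rows = list(read_ppat(sample))
print(rows)
```

For the real file, the stream would come from something like `open(path, encoding="utf-8", newline="")`, where `path` is wherever you saved the download.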


The 'compression rate' is good: the 10.324.068 distinct applicant names contained in TLS206 are reduced to 8.227.328 in EEE-PPAT, and sector allocation is distributed as follows:


        SECTOR                           COUNT
        COMPANY                        2173055
        COMPANY GOV HOSPITAL                 1
        COMPANY GOV NON-PROFIT           35307
        COMPANY GOV UNIVERSITY             176
        COMPANY HOSPITAL                  1601
        COMPANY UNIVERSITY                1544
        GOV NON-PR0FIT                       4
        GOV NON-PROFIT                  105750
        GOV NON-PROFIT UNIVERSITY          669
        HOSPITAL                          5028
        INDIVIDUAL                     4774071
        UNIVERSITY                       45506
        UNKNOWN                        1252219
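
As a rough check on the figures above, the following Python sketch computes the share of distinct names left after harmonization and flags sector labels that contain anything other than capital letters, spaces and hyphens. The counts are copied from this post; the helper names and the "letters only" heuristic are assumptions made for illustration, not part of the EEE-PPAT documentation.

```python
import re
from typing import Dict, List

# Figures quoted in this post.
DISTINCT_NAMES_TLS206 = 10_324_068
DISTINCT_NAMES_EEE_PPAT = 8_227_328

# Sector counts exactly as listed in this post.
SECTOR_COUNTS: Dict[str, int] = {
    "COMPANY": 2173055,
    "COMPANY GOV HOSPITAL": 1,
    "COMPANY GOV NON-PROFIT": 35307,
    "COMPANY GOV UNIVERSITY": 176,
    "COMPANY HOSPITAL": 1601,
    "COMPANY UNIVERSITY": 1544,
    "GOV NON-PR0FIT": 4,
    "GOV NON-PROFIT": 105750,
    "GOV NON-PROFIT UNIVERSITY": 669,
    "HOSPITAL": 5028,
    "INDIVIDUAL": 4774071,
    "UNIVERSITY": 45506,
    "UNKNOWN": 1252219,
}

# Heuristic (an assumption): a clean label is made of capitalized words
# separated by single spaces or hyphens, with no digits.
CLEAN_LABEL = re.compile(r"[A-Z]+(?:[ -][A-Z]+)*")


def suspicious_labels(counts: Dict[str, int]) -> List[str]:
    """Return labels that do not match the clean-label pattern."""
    return sorted(label for label in counts if not CLEAN_LABEL.fullmatch(label))


ratio = DISTINCT_NAMES_EEE_PPAT / DISTINCT_NAMES_TLS206
print(f"harmonized names are {ratio:.1%} of the original distinct names")

flagged = suspicious_labels(SECTOR_COUNTS)
print("labels to double-check:", flagged)
```

The harmonized names come to a little under 80% of the original distinct names, that is, roughly a one-fifth reduction.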

[The lines below no longer apply: the data have been corrected, and the updated file was put on the FTP on 2012 Jan. 10, 13:51 CET.]


(some values look like small mistakes, such as GOV NON-PR0FIT, which is spelled with a zero instead of the letter O)



While loading the data you may get an error for person_id 4264883, 9883343 and 8758108, whose sector ends up null. In these three records the harmonized name ends with a slash, which prevents the tab field delimiter from being recognized, so the text COMPANY is 'taken' into the harmonized name instead of the sector.
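
A check of this kind can be sketched in Python before loading. This is not the patch from the script mentioned in this post, only a hedged illustration: the function name `find_risky_rows` and the sample rows are hypothetical, and because the post only says "slash", the sketch checks for both a forward slash and a backslash at the end of the name.

```python
import io
from typing import Iterator, TextIO, Tuple


def find_risky_rows(stream: TextIO) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, reason) for rows that may load incorrectly."""
    for line_number, line in enumerate(stream, start=1):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 3:
            yield line_number, f"expected 3 fields, found {len(fields)}"
        elif fields[1].endswith(("/", "\\")):
            # Assumption: either kind of trailing slash is worth reviewing.
            yield line_number, "harmonized name ends with a slash"
        elif not fields[2].strip():
            yield line_number, "empty sector"


# Hypothetical sample rows, not the three real records.
sample = io.StringIO(
    "1\tACME CORP\tCOMPANY\n"
    "2\tFOO BAR A/\tCOMPANY\n"
    "3\tBAZ LTD\\\tCOMPANY\n"
    "4\tNO SECTOR\t\n"
)
problems = list(find_risky_rows(sample))
for line_number, reason in problems:
    print(line_number, reason)
```

Rows reported this way can be inspected by hand, or fixed after loading, rather than silently ending up with a null sector.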


At this link you can download a script for loading the EEE-PPAT table into MySQL; it also includes a patch for the 3 wrong records.
[Do not use the patch with the data of 2012 Jan. 10, 13:51 CET.]


2 comments:

Eugenia Shevtsova said...

Hi Gianluca, sorry to bother you. I can't manage to access your MySQL script to upload the EEE-PPAT table. Is there any chance you could share it with me? Thanks a lot in advance! Eugenia.

GL said...

Eugenia, the updated link is:
https://www.dropbox.com/s/tmwfdjs5g2uduh3/ee_ppat_loader201110.zip
