For lovers of bibliometrics: PATSTAT, along with patent citations, also contains the non-patent literature (NPL) referenced by patents.
You can get those references from the citations table (TLS212_CITATION) by joining it to the table containing the full text of the NPL citations (TLS214_NPL_PUBLN) on NPL_PUBLN_ID, where NPL_CITN_SEQ_NR (the sequence number for NPL citations) is different from 0.
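The join described above could look roughly like this. It is only a sketch: the table and key names come from the text, while PAT_PUBLN_ID as the citing publication column is an assumption that may differ in your PATSTAT release.

```sql
-- Sketch: list each NPL reference together with the citing publication.
-- PAT_PUBLN_ID is assumed to identify the citing patent publication.
SELECT c.PAT_PUBLN_ID,
       c.NPL_CITN_SEQ_NR,
       n.NPL_BIBLIO
FROM TLS212_CITATION c
JOIN TLS214_NPL_PUBLN n
  ON n.NPL_PUBLN_ID = c.NPL_PUBLN_ID
WHERE c.NPL_CITN_SEQ_NR <> 0;  -- keep only NPL citations
```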
One issue with these tables is that TLS214 contains a lot of duplicates!
With a simple
select distinct trim(NPL_BIBLIO) from TLS214_NPL_PUBLN
I reduce the row count from 12,139,696 to 9,449,779 (about 22% fewer).
[I added a trim because many records start with a space.]
Maybe later on I'll post some SQL to deduplicate the data...
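If you want both figures in one pass, something like the following might do. It is a sketch, not necessarily how the numbers above were obtained, and the aliases total_rows and distinct_biblio are made up for illustration.

```sql
-- Sketch: raw row count next to the count of distinct trimmed strings.
-- COUNT(DISTINCT ...) skips NULLs, so it can differ by one from SELECT DISTINCT.
SELECT COUNT(*) AS total_rows,
       COUNT(DISTINCT TRIM(NPL_BIBLIO)) AS distinct_biblio
FROM TLS214_NPL_PUBLN;
```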