Sunday, December 20, 2009

Underscore vs dash in excel sheet names

This could be funny if it were not a big issue in processing a questionnaire.
As I previously mentioned, I'm working on building a DB for DG-Infso (EU) from some thousands of answers given by project leaders, collected via Excel.
The bad thing was that many people had apparently renamed some of the worksheets, changing dashes (-) to underscores (_), so that, for example, publications-papers became publications_papers.

Even if I understand that to some tastes an underscore may look nicer than a sign that resembles a minus (and so attracts negativity), I did not believe that dozens of people would take the trouble to make manual substitutions just to avoid bad karma.

So I suspected "maybe they are Mac users?" (you know, Mac users are more aesthetics-oriented...) until I found this post on OPENOFFICE.ORG explaining how OpenOffice Calc changes dashes to underscores in worksheet names.

So cool...
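
For what it's worth, here is a minimal sketch of how one could match the renamed worksheets back to the expected names, simply by treating dash and underscore as the same character. The function names are my own invention for illustration, and the sheet names are just examples; in practice the list of names found would come from whatever library you use to open the workbook.

```python
def canonical_sheet_name(name: str) -> str:
    """Comparison key that treats '-' and '_' as the same character."""
    return name.strip().lower().replace("_", "-")


def match_sheets(expected, found):
    """Map each expected sheet name to the name actually found (or None)."""
    by_key = {canonical_sheet_name(n): n for n in found}
    return {e: by_key.get(canonical_sheet_name(e)) for e in expected}


# Hypothetical example: one sheet was renamed by OpenOffice Calc, one is missing.
matches = match_sheets(
    ["publications-papers", "patents"],
    ["publications_papers"],
)
print(matches)
```

Sheets that come back as None are really missing, not just renamed.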

Thursday, December 17, 2009

IPC to OST reclassification

After the last post I decided to add something of my own by releasing a conversion table from IPC to the OST (7 & 30) reclassification.
The table can be downloaded here: it's a CSV file (semicolon-separated) with an xls file containing class descriptions for the 7 & 30 reclassifications.

You will find something like

M_class ; OST30 ; OST7
A01B% ; 20 ; 5
A01C% ; 20 ; 5

Where % means, in SQL, "anything after..."
So you can add the reclassification to your own data with an update where YOUR_IPC is like M_CLASS.
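
A minimal sketch of such an update, run against an in-memory SQLite database from Python. The table names (ipc_to_ost, my_patents) and the columns appln_id and ipc are hypothetical, chosen only for illustration; the two conversion rows are the ones shown above. Your own database and IPC formatting may differ, so take it as an outline rather than a recipe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Conversion table: m_class holds the LIKE pattern (hypothetical table name).
cur.execute("CREATE TABLE ipc_to_ost (m_class TEXT, ost30 INTEGER, ost7 INTEGER)")
cur.executemany(
    "INSERT INTO ipc_to_ost VALUES (?, ?, ?)",
    [("A01B%", 20, 5), ("A01C%", 20, 5)],
)

# Your own patent table (hypothetical names and sample codes).
cur.execute(
    "CREATE TABLE my_patents (appln_id INTEGER, ipc TEXT, ost30 INTEGER, ost7 INTEGER)"
)
cur.executemany(
    "INSERT INTO my_patents (appln_id, ipc) VALUES (?, ?)",
    [(1, "A01B1/02"), (2, "A01C7/00"), (3, "H04N101/00")],
)

# The update: YOUR_IPC LIKE M_CLASS. Codes with no matching pattern stay NULL.
cur.execute(
    """
    UPDATE my_patents
    SET ost30 = (SELECT t.ost30 FROM ipc_to_ost t WHERE my_patents.ipc LIKE t.m_class),
        ost7  = (SELECT t.ost7  FROM ipc_to_ost t WHERE my_patents.ipc LIKE t.m_class)
    """
)

rows = cur.execute(
    "SELECT appln_id, ipc, ost30, ost7 FROM my_patents ORDER BY appln_id"
).fetchall()
for row in rows:
    print(row)
```

Rows left with NULL in the OST columns are the ones whose IPC code has no pattern in the conversion table.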

At the end of the table you will find some IPC codes without a reclassification:

C99Z%
H04N101/00

But they account for very few patents in the overall population (2 and 328 patents at the EPO).

Wednesday, December 16, 2009

OST reclassification vs H04W

This post is the result of a joint effort by Emmanuelle Fortune (OST), Lorenzo Cassi (CES) and Francesco Lissoni (KITeS), who provided the solution. I just raised the question.

I asked them for help because the OST reclassification of IPC codes does not contain the H04W class, which comes with PATSTAT 2009/10.

Up to now, at the EPO alone you can find 9023 patents with H04W, and more than 50,000 worldwide.

The group H04W was introduced into IPC-8 only with the January 2009 version (2009.01). Since under the IPC-8 system old documents are also reclassified, if you use a PATSTAT edition from 2009 or later you will find that documents from 2008 and earlier also carry the classification H04W in the databases (see, for instance, EP1685687), but when you look at the printed document on Espacenet you will not find the symbol there. Also, where a patent was published as an application before 2009 and then granted after 2009, you will indeed see a difference in the published classification, but in the database both should carry the more up-to-date one (H04W).

What is now H04W used to be classified under H04L, H04Q, H04B... H04Q and H04B are in the field Telecommunications, and H04L is in Digital communication.

So H04W could fall under Telecommunications or Digital communication.

Ulli Schmoch defines "Digital Communications" as follows:
"Digital communication: in the ISI-OST-INPI classification, this field was part of telecommunications. At present, it is a self-contained technology at the border between telecommunications and computer technology. A core application of this technology is the Internet"

While Telecommunications is huge (G08C, H01P, H01Q, H04B, H04H, H04J, H04K, H04M, H04N-001, H04N-007, H04N-011, H04Q), Digital communication is made up only of H04L.

So, in the absence of a more accurate judgement, we decided to keep H04W in Telecommunications, both because Digital communication is almost a "residual" class and because any mistake would be less noticeable if we dilute H04W in a big class instead of a small one. Finally, looking at examples of patents with the H04W IPC code, their titles do not seem to place them squarely in the Internet realm.
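
As a rough sketch of what this choice means in practice, here is a small prefix lookup covering only the two fields discussed in this post, with H04W added to Telecommunications. The function name ost_field is made up for illustration, and I am assuming IPC codes written the way they appear above (e.g. H04N-001); other notations would need their own handling.

```python
# Prefixes listed in this post, plus H04W as decided above.
TELECOM_PREFIXES = (
    "G08C", "H01P", "H01Q", "H04B", "H04H", "H04J", "H04K", "H04M",
    "H04N-001", "H04N-007", "H04N-011", "H04Q",
    "H04W",  # our addition: kept in Telecommunications
)
DIGITAL_COMM_PREFIXES = ("H04L",)


def ost_field(ipc: str):
    """Return the field for an IPC code, or None if it is in neither field."""
    code = ipc.strip().upper()
    if code.startswith(TELECOM_PREFIXES):
        return "Telecommunications"
    if code.startswith(DIGITAL_COMM_PREFIXES):
        return "Digital communication"
    return None


for example in ("H04W", "H04L", "H04N-007", "H04N-005"):
    print(example, "->", ost_field(example))
```

Moving H04W to the other field would only mean moving one prefix from one tuple to the other.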

Sunday, December 6, 2009

Paris Reloaded

This is about the Name Game workshop held in Paris on 25 and 26 November.

The workshop was organized by my roommate Francesco Lissoni within the framework of APE-INV, whose goals are measuring the extent of academic patenting, and studying its determinants, in order to improve our understanding of university–industry relationships, and of their influence on academic researchers’ choice of scientific targets and norms of conduct.
So an important issue is building a database about inventors, and the first step consists in cleaning and standardizing the information on inventors we can derive from patent data.
The Name Game workshop aimed at convening as many researchers as possible with experience in this field, in order to allow them to share their methodologies and possibly reach a consensus on how to harmonize their future efforts.
It was hosted by OST, the Observatoire des sciences et des techniques.

There I eventually had the chance to give a semi-decent presentation of the work done on EPO inventors over the last few years. If you want, you can find it here.