Tuesday, February 23, 2010

Origin of citations in European Patent Office

To find out who introduced a citation to a patent in PATSTAT, look at table TLS212_citation, which contains the field CITN_ORIGIN.

This field can take any of the following values:

0 - SEA - citations introduced during search
1 - APP - citations introduced by the applicant
2 - EXA - citations introduced during examination
3 - OPP - citations introduced during opposition
4 - 115 - citations introduced according to Art 115 EPC
5, 6 - not public

Let's see the content of this field for EPO patents (counts made on PATSTAT 10/2009, grouping by publication number):

0                    22985072
1                     6175203
2                      178744
3                       15839
4                       13533
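
A count like this one could be produced with a query along the following lines. This is only a minimal sketch, run against a tiny in-memory SQLite stand-in rather than the real database. The column names (pat_publn_id, citn_origin, publn_auth) reflect my understanding of the PATSTAT schema and should be checked against your release, the sample rows are made up, and the sketch counts citation rows without trying to reproduce the exact grouping used for the figures in this post.

```python
import sqlite3

# Labels for the public CITN_ORIGIN codes listed in this post.
ORIGIN_LABELS = {0: "SEA", 1: "APP", 2: "EXA", 3: "OPP", 4: "115"}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Toy stand-ins for two PATSTAT tables (column names are assumptions).
    CREATE TABLE tls211_pat_publn (pat_publn_id INTEGER, publn_auth TEXT);
    CREATE TABLE tls212_citation (pat_publn_id INTEGER, citn_id INTEGER,
                                  citn_origin INTEGER);
    -- Made-up rows, for illustration only.
    INSERT INTO tls211_pat_publn VALUES (1, 'EP'), (2, 'EP'), (3, 'US');
    INSERT INTO tls212_citation VALUES
        (1, 1, 0), (1, 2, 0), (1, 3, 1), (2, 1, 2), (3, 1, 0), (3, 2, 1);
""")


def count_by_origin(conn, publn_auth):
    """Count citations per CITN_ORIGIN for publications of one authority."""
    rows = conn.execute(
        """
        SELECT c.citn_origin, COUNT(*)
        FROM tls212_citation AS c
        JOIN tls211_pat_publn AS p ON p.pat_publn_id = c.pat_publn_id
        WHERE p.publn_auth = ?
        GROUP BY c.citn_origin
        ORDER BY c.citn_origin
        """,
        (publn_auth,),
    ).fetchall()
    # Translate numeric codes into their short labels where known.
    return {ORIGIN_LABELS.get(code, str(code)): n for code, n in rows}


ep_counts = count_by_origin(conn, "EP")
us_counts = count_by_origin(conn, "US")
print(ep_counts)
print(us_counts)
```

Passing the authority as a parameter means the same function serves both the EPO and the USPTO count.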

For USPTO patents, on the other hand
(count made with the same criteria as above):

0              37425072 
1              15534910

The higher number of applicant-introduced citations at the USPTO is due to the "duty of candor" rule, which requires applicants to disclose all the prior art they are aware of that is relevant to their application.

Tuesday, February 9, 2010

Patstat and patent families

As Wikipedia says, a patent family is "a set of patents taken in various countries to protect a single invention". Another way to put it is to call the patents belonging to the same family 'equivalents'.

In fact, when using ESPACENET and searching, for instance, for patent EP100000, you will see on the right-hand side of your browser a list of documents "also published as" (1), which can be shown in more detail by going to "View INPADOC patent family" (2).

But we must be careful, because with patents, as in real life, it is very hard to give a unique definition of what a family is...

Since April 2009, PATSTAT has included two new tables to help users build a table of equivalents: tls218_DOCDB_FAM and tls219_INPADOC_FAM.

The first, tls218_DOCDB_FAM (simple family), gives the same family id to applications claiming exactly the same prior applications as priorities (these can be Paris Convention priorities or just technical relation priorities).
As the PATSTAT documentation says: "The EPO reserve the right to classify an application into a particular simple family irrespective of this general rule". This is done by creating artificial priorities for an application to force it to match the priorities of a family.
The simple family is also used at times to automatically attribute the same IPC classification symbols and other attributes to corresponding applications.

The second, tls219_INPADOC_FAM (extended priority family), was developed by the INPADOC organisation and later integrated by the EPO.
In this case the linkage among applications can come from connections in tables TLS204_appln_prior (Paris Convention priorities), TLS205_TECH_REL (patents which have been technically linked by patent examiners on the basis of similar content) and TLS216_appln_contn (continuations, divisions etc.).
The artificial PATSTAT applications created for priorities which have no entry in DOCDB are also included in this family.
The artificial PATSTAT applications created for unknown cited publications are included in this family table as well, but each of them appears as a family with 1 member only.

Maybe an example can make things clearer. Consider the group of applications D1 to D5:

Document D1    Priority P1
Document D2    Priority P1    Priority P2
Document D3    Priority P1    Priority P2
Document D4                   Priority P2    Priority P3
Document D5                                  Priority P3

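While for INPADOC all documents will belong to one family only, for DOCDB (simple families) we will have four families, since only D2 and D3 claim exactly the same priorities:

Family 1    D1
Family 2    D2, D3
Family 3    D4
Family 4    D5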
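
The two grouping rules can be sketched in a few lines of Python on the D1 to D5 example. This is my own illustration, not EPO code: the function names are hypothetical, and the extended family here follows priority links only, whereas the real INPADOC family also uses technical relations and continuations.

```python
from collections import defaultdict

# Priorities claimed by each document in the D1..D5 example.
priorities = {
    "D1": {"P1"},
    "D2": {"P1", "P2"},
    "D3": {"P1", "P2"},
    "D4": {"P2", "P3"},
    "D5": {"P3"},
}


def simple_families(priorities):
    """Group documents that claim exactly the same set of priorities."""
    groups = defaultdict(list)
    for doc, prios in priorities.items():
        groups[frozenset(prios)].append(doc)
    return sorted(sorted(members) for members in groups.values())


def extended_families(priorities):
    """Group documents linked directly or indirectly by a shared priority."""
    docs_by_priority = defaultdict(set)
    for doc, prios in priorities.items():
        for prio in prios:
            docs_by_priority[prio].add(doc)
    seen, families = set(), []
    for start in priorities:
        if start in seen:
            continue
        # Depth-first walk over documents reachable through shared priorities.
        family, stack = set(), [start]
        while stack:
            doc = stack.pop()
            if doc in family:
                continue
            family.add(doc)
            for prio in priorities[doc]:
                stack.extend(docs_by_priority[prio] - family)
        seen |= family
        families.append(sorted(family))
    return sorted(families)


simple = simple_families(priorities)
extended = extended_families(priorities)
print(simple)    # [['D1'], ['D2', 'D3'], ['D4'], ['D5']]
print(extended)  # [['D1', 'D2', 'D3', 'D4', 'D5']]
```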


Or, if you want a real case, I looked up which DOCDB and INPADOC families the application with PATSTAT appln_id = 1 belongs to; the difference is very evident.

DOCDB family
appln_id     family id           appln_auth   appln_nr
1            25590760            'AL'     '        9600001'
889876       25590760            'AT'     '       96931674'
1806521      25590760            'AU'     '        7081996'
18755226     25590760            'ES'     '       96931674'

INPADOC family
appln_id     family id           appln_auth   appln_nr
1            3960163             'AL'     '        9600001'
889876       3960163             'AT'     '       96931674'
1806521      3960163             'AU'     '        7081996'
14573559     3960163             'DE'     '       69602451'
14573560     3960163             'DE'     '       69602451'
17633931     3960163             'EP'     '       96931674'
18755226     3960163             'ES'     '       96931674'
24291685     3960163             'GR'     '       99401928'
47191406     3960163             'US'     '        2963998'
57000038     3960163             'AL'     '           4195'
59131347     3960163             'EP'     '        9503551'
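
A listing like this one can be obtained by joining the family tables with the application table. The sketch below runs on an in-memory SQLite copy holding only a few of the rows shown in this post (with the padding of appln_nr trimmed). The column names docdb_family_id and inpadoc_family_id, as well as the helper family_members, are my assumptions and should be checked against the PATSTAT documentation for your release.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tls201_appln (appln_id INTEGER, appln_auth TEXT, appln_nr TEXT);
    CREATE TABLE tls218_docdb_fam (appln_id INTEGER, docdb_family_id INTEGER);
    CREATE TABLE tls219_inpadoc_fam (appln_id INTEGER, inpadoc_family_id INTEGER);
    -- A few rows copied from the listing in this post.
    INSERT INTO tls201_appln VALUES
        (1, 'AL', '9600001'), (889876, 'AT', '96931674'),
        (17633931, 'EP', '96931674'), (47191406, 'US', '2963998');
    INSERT INTO tls218_docdb_fam VALUES (1, 25590760), (889876, 25590760);
    INSERT INTO tls219_inpadoc_fam VALUES
        (1, 3960163), (889876, 3960163), (17633931, 3960163), (47191406, 3960163);
""")

# Fixed whitelist of (table, family id column), so that no user input
# is ever interpolated into the SQL text.
FAMILY_TABLES = {
    "docdb": ("tls218_docdb_fam", "docdb_family_id"),
    "inpadoc": ("tls219_inpadoc_fam", "inpadoc_family_id"),
}


def family_members(conn, kind, appln_id):
    """Return all applications sharing a family of the given kind with appln_id."""
    table, column = FAMILY_TABLES[kind]
    query = f"""
        SELECT a.appln_id, f.{column}, a.appln_auth, a.appln_nr
        FROM {table} AS f
        JOIN tls201_appln AS a ON a.appln_id = f.appln_id
        WHERE f.{column} IN (SELECT {column} FROM {table} WHERE appln_id = ?)
        ORDER BY a.appln_id
    """
    return conn.execute(query, (appln_id,)).fetchall()


docdb = family_members(conn, "docdb", 1)
inpadoc = family_members(conn, "inpadoc", 1)
for row in docdb + inpadoc:
    print(row)
```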

Limits of DOCDB and INPADOC

We may say that DOCDB families are sometimes too restrictive, since they do not put divisionals, continuations etc. into the same family although they should. Also, the same appln_id may occur in more than one DOCDB family.

INPADOC families, on the other hand, group highly related patent documents, even if they do not necessarily cover the same aspect of the invention. The same application id occurs in one family only, but the family is typically very comprehensive.

Other definitions:

Dietmar Harhoff at LMU developed a definition of equivalents that first allocates patents to one group/family of equivalents if they have the same priorities, and then aggregates those groups that share members, i.e. groups across which a member occurs more than once.
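
The aggregation step can be sketched as follows. This is only my reading of the description, not Harhoff's actual implementation; the function name merge_overlapping and the sample groups are hypothetical.

```python
def merge_overlapping(groups):
    """Merge groups that share at least one member until none overlap."""
    merged = []  # invariant: the sets in this list are pairwise disjoint
    for group in groups:
        current = set(group)
        remaining = []
        for other in merged:
            if current & other:
                current |= other  # absorb every group that shares a member
            else:
                remaining.append(other)
        merged = remaining + [current]
    return sorted(sorted(g) for g in merged)


# Hypothetical groups: A2 occurs in two of them, so those two are merged.
groups = [{"A1", "A2"}, {"A2", "A3"}, {"A4"}]
result = merge_overlapping(groups)
print(result)  # [['A1', 'A2', 'A3'], ['A4']]
```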

Also Derwent developed a definition of patent family and here you can find some details.

(Thanks to Fabio Montobbio and Raffaele Conti for helping me collect the information for this post)