Pubmed: sorting the articles on the number of times they've been cited
In 2008 I used www.eigenfactor.org/ to sort a set of PubMed articles by the impact factor of their journal. In this post I will show how I've used NCBI ELink to sort articles by the number of times they have been cited by other articles in PubMed Central.
The NCBI ELink API checks for the existence of an external or Related Articles link from a list of one or more primary IDs. It can be used to retrieve the articles in PubMed Central citing a given PMID.
For example, the following URI: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?retmode=xml&dbfrom=pubmed&id=19755503&cmd=neighbor returns the 3 articles that cited the Gene Wiki paper:
(...)
<LinkSetDb>
<DbTo>pubmed</DbTo>
<LinkName>pubmed_pubmed_citedin</LinkName>
<Link>
<Id>21516242</Id>
</Link>
<Link>
<Id>21062808</Id>
</Link>
<Link>
<Id>20334642</Id>
</Link>
</LinkSetDb>
(...)
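To make the structure of that response concrete, here is a minimal Java sketch that pulls the citing PMIDs out of an ELink XML document. It is not the program described in this post: the class name CitedInParser and the method citingPmids are hypothetical, and the enclosing eLinkResult, LinkSet and IdList elements in the sample string are assumptions about the parts of the response elided above. It parses an inline sample rather than calling the service.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not the program from this post.
class CitedInParser {

    /** Returns the PMIDs listed in the LinkSetDb named pubmed_pubmed_citedin. */
    static List<String> citingPmids(String elinkXml) {
        try {
            Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(elinkXml)));
            List<String> pmids = new ArrayList<>();
            NodeList sets = dom.getElementsByTagName("LinkSetDb");
            for (int i = 0; i < sets.getLength(); i++) {
                Element set = (Element) sets.item(i);
                NodeList names = set.getElementsByTagName("LinkName");
                // Skip link sets that are not the "cited in" relation.
                if (names.getLength() == 0) continue;
                String name = names.item(0).getTextContent().trim();
                if (!"pubmed_pubmed_citedin".equals(name)) continue;
                // Every <Id> inside this LinkSetDb is a citing article.
                NodeList ids = set.getElementsByTagName("Id");
                for (int j = 0; j < ids.getLength(); j++) {
                    pmids.add(ids.item(j).getTextContent().trim());
                }
            }
            return pmids;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse ELink response", e);
        }
    }

    public static void main(String[] args) {
        // Sample built from the excerpt above; the wrapper elements are assumed.
        String sample =
              "<eLinkResult><LinkSet><DbFrom>pubmed</DbFrom>"
            + "<IdList><Id>19755503</Id></IdList>"
            + "<LinkSetDb><DbTo>pubmed</DbTo>"
            + "<LinkName>pubmed_pubmed_citedin</LinkName>"
            + "<Link><Id>21516242</Id></Link>"
            + "<Link><Id>21062808</Id></Link>"
            + "<Link><Id>20334642</Id></Link>"
            + "</LinkSetDb></LinkSet></eLinkResult>";
        List<String> pmids = citingPmids(sample);
        System.out.println(pmids.size() + " citing articles: " + pmids);
    }
}
```

The size of the returned list is the citation count used for sorting. Filtering on the LinkName matters because a neighbor query can return several LinkSetDb blocks for the same PMID.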
I wrote a Java program using this resource to sort the articles by the number of times they have been cited. The program is available on GitHub at: .
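The sorting step itself can be sketched as follows. This is only an illustration under stated assumptions, not the code from the repository: the class CitationSort and its method are hypothetical, the citation counts are assumed to have been collected already into a map from PMID to count, and ties are broken by PMID simply to make the order deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the ordering step; not the program from this post.
class CitationSort {

    /** Returns the PMIDs ordered from most cited to least cited. */
    static List<String> sortByCitations(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            // Descending by citation count.
            int byCount = Integer.compare(b.getValue(), a.getValue());
            // Ties are broken by PMID (an arbitrary choice for this sketch).
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> sorted = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            sorted.add(e.getKey());
        }
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("19755503", 3);   // the Gene Wiki paper, cited by 3 articles
        counts.put("15608167", 290); // count taken from the CitedBy example in this post
        counts.put("00000000", 0);   // placeholder PMID for a never-cited article
        System.out.println(sortByCitations(counts));
    }
}
```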
Example
Let's sort the articles published in the 2005 NAR Database Issue. The output is a sorted set of PubMed XML records.
The most cited article (290 citations) is The Universal Protein Resource (UniProt).
Some articles have never been cited, e.g. Metagrowth: a new resource for the building of metabolic hypotheses in microbiology.
The '-c' option on the command line makes the program insert a new XML node containing the PMIDs of the articles citing each article:
<ArticleId IdType="pubmed">15608167</ArticleId>
<ArticleId IdType="pmc">PMC540024</ArticleId>
</ArticleIdList>
</PubmedData>
<CitedBy count="290">
<PMID>15608199</PMID>
<PMID>15608238</PMID>
<PMID>15608243</PMID>
<PMID>15769290</PMID>
<PMID>15888679</PMID>
<PMID>15980452</PMID>
(...)
<PMID>21450054</PMID>
<PMID>21453542</PMID>
<PMID>21544166</PMID>
</CitedBy>
</PubmedArticle>
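One way to produce such a node with the standard Java DOM API is sketched below. The class CitedByAnnotator and its methods are hypothetical names chosen for this example, and the sketch assumes the CitedBy element is simply appended as the last child of PubmedArticle, as in the excerpt above; the actual program may build it differently.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical sketch; not the program from this post.
class CitedByAnnotator {

    /** Parses an XML string into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    /** Appends <CitedBy count="N"> with one <PMID> child per citing article. */
    static Element addCitedBy(Element article, List<String> citingPmids) {
        Document dom = article.getOwnerDocument();
        Element citedBy = dom.createElement("CitedBy");
        citedBy.setAttribute("count", String.valueOf(citingPmids.size()));
        for (String pmid : citingPmids) {
            Element node = dom.createElement("PMID");
            node.setTextContent(pmid);
            citedBy.appendChild(node);
        }
        article.appendChild(citedBy);
        return citedBy;
    }

    /** Serializes the document without the XML declaration. */
    static String serialize(Document dom) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(dom), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot serialize XML", e);
        }
    }

    public static void main(String[] args) {
        // A heavily trimmed stand-in for a real PubmedArticle record.
        Document dom = parse("<PubmedArticle><PubmedData/></PubmedArticle>");
        addCitedBy(dom.getDocumentElement(),
                   Arrays.asList("15608199", "15608238", "15608243"));
        System.out.println(serialize(dom));
    }
}
```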
That's it,
Pierre