Showing posts with label impact. Show all posts

14 October 2012

Calculating time from submission to publication / Degree of burden in submitting a paper

After "404 not found": a database of non-functional resources in the NAR database collection, I've uploaded my second dataset on figshare: Calculating time from submission to publication / Degree of burden in submitting a paper.

Calculating time from submission to publication / Degree of burden in submitting a paper. Pierre Lindenbaum,  Ryan Delahanty.
figshare.
Retrieved 10:13, Oct 14, 2012 (GMT)
http://dx.doi.org/10.6084/m9.figshare.96403

This dataset was inspired by this post on biostar, initially asked by Ryan Delahanty: I was wondering if it would be possible to calculate some kind of a metric for the speed-of-publication for each journal. I'm not sure submitted and accepted dates are available for all papers, but I noticed in XML data there are fields like the following:
<PubmedData>
        <History>
            <PubMedPubDate PubStatus="received">
                <Year>2011</Year>
                <Month>11</Month>
                <Day>29</Day>
                <Hour>6</Hour>
                <Minute>0</Minute>
            </PubMedPubDate>
            <PubMedPubDate PubStatus="accepted">
                <Year>2011</Year>
                <Month>12</Month>
                <Day>20</Day>
                <Hour>6</Hour>
                <Minute>0</Minute>
            </PubMedPubDate>
           (...)

In this dataset, the script 'pubmed.sh' downloads the journals from http://www.ncbi.nlm.nih.gov/books/NBK3827/table/pubmedhelp.pubmedhelptable45/ and the 'eigenfactors' from http://www.eigenfactor.org.

For each journal, it scans pubmed (starting from year=2000) and gets the difference between the date[@PubStatus='received'] and the date[@PubStatus='accepted'].
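The date arithmetic described above can be sketched in a few lines of Python. This is not the 'pubmed.sh' script itself, only a minimal illustration that assumes each record carries complete Year/Month/Day values; the function names `history_date` and `days_to_acceptance` are made up for this example.

```python
import xml.etree.ElementTree as ET
from datetime import date

# A trimmed copy of the <PubmedData> fragment quoted above.
SAMPLE = """<PubmedData>
  <History>
    <PubMedPubDate PubStatus="received">
      <Year>2011</Year><Month>11</Month><Day>29</Day>
    </PubMedPubDate>
    <PubMedPubDate PubStatus="accepted">
      <Year>2011</Year><Month>12</Month><Day>20</Day>
    </PubMedPubDate>
  </History>
</PubmedData>"""


def history_date(pubmed_data, status):
    """Return the History date with the given PubStatus, or None if absent."""
    node = pubmed_data.find("History/PubMedPubDate[@PubStatus='%s']" % status)
    if node is None:
        return None
    # Assumption: Year, Month and Day are all present and numeric.
    return date(int(node.findtext("Year")),
                int(node.findtext("Month")),
                int(node.findtext("Day")))


def days_to_acceptance(pubmed_data):
    """Number of days between 'received' and 'accepted', or None if either is missing."""
    received = history_date(pubmed_data, "received")
    accepted = history_date(pubmed_data, "accepted")
    if received is None or accepted is None:
        return None
    return (accepted - received).days


print(days_to_acceptance(ET.fromstring(SAMPLE)))  # prints 21
```

Averaging these values over all the articles of a journal would give a per-journal figure comparable to the 'days' column below.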

title	issn	eigenfactor	days
"Acta biochimica Polonica"	0001-527X	0.003996	119.770935960591
"Acta biomaterialia"	1742-7061	0.02152	129.682692307692
"Acta biotheoretica"	0001-5342	0.000844	161.897058823529
"Acta cirurgica brasileira / Sociedade Brasileira para Desenvolvimento Pesquisa em Cirurgia"	0102-8650	0.00128	122.038461538462
"Acta cytologica"	0001-5547	0.002305	65.3006134969325
"Acta diabetologica"	0940-5429	0.001851	299.6
"Acta haematologica"	0001-5792	0.002825	118.654676258993
"Acta histochemica"	0065-1281	0.002162	110.471204188482
"Acta histochemica et cytochemica"	0044-5991	0.000677	81.6455696202532
"Acta neurochirurgica"	0001-6268	0.009685	204.371830985916
"Acta neuropathologica"	0001-6322	0.023471	69.7277882797732
"Acta theriologica"	0001-7051	0.000901	147.0
"Acta tropica"	0001-706X	0.01011	196.577777777778
"Acta veterinaria Scandinavica"	0044-605X	0.00161282.0
"Addictive behaviors"	0306-4603	0.017915	163.049731182796
"Advances in space research "	0273-1177	0.021217	205.0
Ambio	0044-7447	0.007463	181.878048780488
"American journal of human genetics"	0002-9297	0.120156	67.1898928024502
"American journal of hypertension"	0895-7061	0.017359	104.074576271186
(....)

Here is the kind of figure I got:

As far as I remember, "Cell" is the point having the highest eigenfactor.


Note: pubmed contains some errors, e.g. a received date later than the accepted date (http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=20591334&retmode=xml) or dates in the future (http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=12921703&retmode=xml).
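One possible way to guard against such records is a simple sanity check on each pair of dates. The post does not say whether the dataset applies any filter of this kind, so the sketch below is only an assumption; the function `is_plausible` and the sample dates are hypothetical.

```python
from datetime import date


def is_plausible(received, accepted, today=None):
    """True when received <= accepted and neither date lies in the future."""
    today = today or date.today()
    return received <= accepted <= today


# Hypothetical (received, accepted) pairs illustrating both kinds of error.
records = [
    (date(2011, 11, 29), date(2011, 12, 20)),  # fine
    (date(2010, 5, 1), date(2010, 3, 1)),      # received after accepted
    (date(2009, 1, 1), date(2099, 1, 1)),      # accepted in the future
]

kept = [(r, a) for r, a in records if is_plausible(r, a, today=date(2012, 10, 14))]
print(len(kept))  # prints 1
```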


That's it,

Pierre

10 June 2008

Pubmed, impact factors, sorting and FriendFeed

I recently said on twitter that I wished I could sort the articles on pubmed using the impact factors of the journals. What followed was a demonstration of the power of FriendFeed, something also observed under other circumstances by Deepak Singh, Pedro Beltrao and some others. Within a day several people joined the conversation on FriendFeed; among them, Lars Juhl Jensen and Deepak suggested that I have a look at http://www.eigenfactor.org, where the Eigenfactor is a measure of the journal's total importance to the scientific community. I must also cite Euan, who was inspired by this discussion and created PubmedFaceoff, a photorealistic variant of the Chernoff Faces visualization technique based on pubmed.

Now let's go back to my sorting problem: I've joined the data from www.eigenfactor.org (with the kind permission of Carl Bergstrom) and from http://www.ncbi.nlm.nih.gov/entrez/citmatch_help.html#JournalLists and I've uploaded this new dataset on IBM-ManyEyes:



I wrote a Java program that reads a set of pubmed articles formatted in XML and uses the scoring dataset. The algorithm is trivial: the XML elements of the articles are removed from their parent node, sorted on their 'eigenfactors' retrieved from the journal <NlmId>, and then inserted back.
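The remove/sort/reinsert idea can be sketched as follows. This is a Python illustration, not the Java program mentioned above. It assumes the journal identifier is found in MedlineCitation/MedlineJournalInfo/NlmUniqueID; the identifiers, the scores and the name `sort_by_score` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Simplified records; the journal ids (J-LOW, ...) and the scores are made up.
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle><MedlineCitation><PMID>1</PMID>
    <MedlineJournalInfo><NlmUniqueID>J-LOW</NlmUniqueID></MedlineJournalInfo>
  </MedlineCitation></PubmedArticle>
  <PubmedArticle><MedlineCitation><PMID>2</PMID>
    <MedlineJournalInfo><NlmUniqueID>J-HIGH</NlmUniqueID></MedlineJournalInfo>
  </MedlineCitation></PubmedArticle>
  <PubmedArticle><MedlineCitation><PMID>3</PMID>
    <MedlineJournalInfo><NlmUniqueID>J-MID</NlmUniqueID></MedlineJournalInfo>
  </MedlineCitation></PubmedArticle>
</PubmedArticleSet>"""

SCORES = {"J-HIGH": 0.9, "J-MID": 0.1, "J-LOW": 0.001}


def sort_by_score(root, scores):
    """Detach every PubmedArticle, sort by journal score (highest first), reattach."""
    articles = root.findall("PubmedArticle")
    for article in articles:
        root.remove(article)

    def score(article):
        nlm_id = article.findtext("MedlineCitation/MedlineJournalInfo/NlmUniqueID")
        # Journals without a known score go last.
        return scores.get(nlm_id, 0.0)

    articles.sort(key=score, reverse=True)
    root.extend(articles)


root = ET.fromstring(SAMPLE)
sort_by_score(root, SCORES)
print([a.findtext("MedlineCitation/PMID") for a in root])  # prints ['2', '3', '1']
```

Because Python's sort is stable, articles from the same journal keep their original relative order.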

The source is available here

The executable jar (containing the scoring dataset) is available here:


Here is an example: I want to sort the articles about Charles Darwin. I've fetched all 372 articles in XML from this query
java -jar lindenb/build/sortpubmed.jar ~/pubmed_result.txt > result.xml

Here are the first articles:

* Schmidhuber, Jürgen (Apr. 2008). "Comparing the legacies of Gauss, Pasteur and Darwin". Nature 452 (7187): 530. doi:10.1038/452530b. PMID 18256649.
* Padian, Kevin (Feb. 2008). "Darwin's enduring legacy". Nature 451 (7179): 632-4. doi:10.1038/451632a. PMID 18305520.
* Odling-Smee, Lucy (Mar. 2007). "Darwin and the 20-year publication gap". Nature 446 (7135): 478-9. doi:10.1038/446478a. PMID 17392756.
* Oliveira, João Gama; Barabási Albert-László (Oct. 2005). "Human dynamics: Darwin and Einstein correspondence patterns". Nature 437 (7063): 1251. doi:10.1038/4371251a. PMID 16724015.
* Kohn, David; Murrell Gina, Parker John, Whitehorn Mark (Aug. 2005). "What Henslow taught Darwin". Nature 436 (7051): 643-5. doi:10.1038/436643a. PMID 16079834.
* Ridley, Matt (Sep. 2004). "Crick and Darwin's shared publication in Nature". Nature 431 (7006): 244. doi:10.1038/431244a. PMID 15372004.
* Gruber, J W (Oct. 2001). "Owen was right, as Darwin's work continues". Nature 413 (6857): 669. doi:10.1038/35099725. PMID 11449244.
* Padian, K (Jul. 2001). "Owen's Parthian shot". Nature 412 (6843): 123-4. doi:10.1038/35084289. PMID 11606991.
* Rhodes, F H (1983). "Gradualism, punctuated equilibrium and the Origin of Species". Nature 305 (5932): 269-72. PMID 6353241.
* Maynard-Smith, J (Apr. 1982). "The century since Darwin". Nature 296 (5858): 599-601. PMID 7040979.
* "Darwin's questions" (Jan. 1969). Nature 221 (5178): 313. PMID 4884839.
* Hector, Andy; Hooper Rowan (Jan. 2002). "Ecology. Darwin and the first ecological experiment". Science 295 (5555): 639-40. doi:10.1126/science.1064815. PMID 11809960.
* Corsi (May. 1987). "Further Letters of Darwin: The Correspondence of Charles Darwin". Science 236 (4804): 988-989. doi:10.1126/science.236.4804.988. PMID 17812771.
* Schweber (May. 1985). "Darwin's Earliest Letters: The Correspondence of Charles Darwin". Science 228 (4701): 838-841. doi:10.1126/science.228.4701.838. PMID 17815024.
* Lewin, R (Aug. 1982). "Darwin died at a most propitious time". Science 217 (4561): 717-8. PMID 7048528.
* Gould, S J (Apr. 1982). "Darwinism and the expansion of evolutionary theory". Science 216 (4544): 380-7. PMID 7041256.
* Zirkle (May. 1964). "Charles Darwin". Science 144 (3619): 724-725. doi:10.1126/science.144.3619.724-a. PMID 17807061.
* Cholodny (Nov. 1937). "CHARLES DARWIN AND THE MODERN THEORY OF TROPISMS". Science 86 (2238): 468. doi:10.1126/science.86.2238.468. PMID 17815459.
* Leidy (Sep. 1929). "CEREMONY ATTENDING THE OPENING OF DOWN HOUSE, THE HOME OF CHARLES DARWIN". Science 70 (1810): 228-231. doi:10.1126/science.70.1810.228. PMID 17775389.
* Osborn (Jun. 1929). "GIFT TO DOWN HOUSE OF THE ORIGINAL LETTERS OF CHARLES DARWIN TO FRITZ MULLER". Science 69 (1799): 645. doi:10.1126/science.69.1799.645. PMID 17791947.
* Osborn (Dec. 1926). "A CONTEMPORARY OF CHARLES DARWIN". Science 64 (1669): 623-624. doi:10.1126/science.64.1669.623-a. PMID 17834475.
* Sampson (Sep. 1909). "LETTERS FROM CHARLES DARWIN". Science 30 (766): 303-304. doi:10.1126/science.30.766.303. PMID 17837456.
* Ayala, Francisco J (May. 2007). "Darwin's greatest discovery: design without designer". Proc. Natl. Acad. Sci. U.S.A. 104 Suppl 1: 8567-73. doi:10.1073/pnas.0701072104. PMID 17494753.