Today, Andra Waagmeester asked on Biostar: "NAR nicely lists all their database issues on http://www.oxfordjournals.org/nar/database/c/. Is the list also available in a downloadable format?"
I suggested downloading from PubMed all the articles published in an annual issue of NAR, extracting the URLs from the abstracts and checking whether they are still active. I just wrote a Java program that does this job (it is available on GitHub at https://github.com/lindenb/jsandbox/blob/master/src/sandbox/NucleicAcidsResearch404.java).
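The extraction step could look roughly like the sketch below. It is an illustration and not the code from the GitHub file: the class name `AbstractUrlExtractor`, the regular expression and the rule for stripping trailing punctuation are all assumptions, and the example.org addresses are placeholders.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AbstractUrlExtractor {
    // Assumed pattern: "http://" or "https://" followed by any run of
    // characters that are not whitespace, quotes, brackets or parentheses.
    private static final Pattern URL =
            Pattern.compile("https?://[^\\s<>\"()\\[\\]]+");

    /** Returns the distinct URLs found in the text, in order of appearance. */
    public static List<String> extractUrls(String abstractText) {
        Set<String> urls = new LinkedHashSet<>();
        Matcher m = URL.matcher(abstractText);
        while (m.find()) {
            // A sentence often ends right after the URL, so drop the
            // trailing punctuation that the pattern swallowed.
            urls.add(m.group().replaceAll("[.,;:]+$", ""));
        }
        return new ArrayList<>(urls);
    }

    public static void main(String[] args) {
        String text = "The database is available at http://example.org/db/. "
                + "A mirror is at http://mirror.example.org/db.";
        for (String url : extractUrls(text)) {
            System.out.println(url);
        }
    }
}
```

A simple pattern like this one keeps whatever the abstract contains, so a URL that is already damaged in the abstract will be extracted in its damaged form.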
A few comments:
- The connection timeout was set to 10 seconds.
- Some URLs are poorly written, e.g. http://www.ncbi.nlm.nih.gov/pubmed/14681415
- An abstract can contain more than one URL
- There can be different URLs for the same database
- Getting an HTTP 404 error doesn't mean that the database has really been discontinued.
- Getting an HTTP 200 status doesn't mean that the database is still active and/or maintained.
- 1155 URLs have been extracted from this PubMed query `"Nucleic Acids Res"[JOUR] "Database issue"[ISS]` (as far as I can see, this query only goes back to 2004). Edit: OK, that was because NCBI eFetch is limited to 10K records.
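
To make the points about the timeout and the status codes concrete, here is a minimal sketch of a status check using the JDK's `HttpURLConnection`. It is not taken from the program on GitHub: the class name `UrlStatusChecker`, the method names, the use of -1 for "no HTTP response" and the choice to apply the 10 seconds to both the connect and the read timeout are assumptions made for this example.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class UrlStatusChecker {
    // 10 seconds, matching the timeout mentioned in the comments above.
    public static final int TIMEOUT_MILLIS = 10_000;

    /** Returns the HTTP status code, or -1 if no HTTP response was obtained. */
    public static int fetchStatus(String url) {
        HttpURLConnection con = null;
        try {
            con = (HttpURLConnection) URI.create(url).toURL().openConnection();
            con.setConnectTimeout(TIMEOUT_MILLIS);
            con.setReadTimeout(TIMEOUT_MILLIS);
            return con.getResponseCode();
        } catch (IOException | RuntimeException e) {
            // Malformed or non-HTTP URL, unknown host, timeout, refused connection...
            return -1;
        } finally {
            if (con != null) con.disconnect();
        }
    }

    /** Turns a status code into a short label; the label is only a hint. */
    public static String label(int status) {
        if (status == -1) return "NO RESPONSE";
        if (status == HttpURLConnection.HTTP_OK) return "OK";
        if (status == HttpURLConnection.HTTP_NOT_FOUND) return "NOT FOUND";
        return "HTTP " + status;
    }

    public static void main(String[] args) {
        // Usage: java UrlStatusChecker.java <url> [<url> ...]
        for (String url : args) {
            int status = fetchStatus(url);
            System.out.println(status + "\t" + label(status) + "\t" + url);
        }
    }
}
```

The label says nothing more than what the server answered at that moment: a "NOT FOUND" may be a moved page, and an "OK" may be a parked domain or an unmaintained site.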
... a snapshot of the output...
- (2007) PlantQTL-GE: a database system for identifying candidate genes in rice and Arabidopsis by gene expression and QTL information.
Credit for the Title: Neil Saunders ;-)
It seems that the URLs in the abstracts are broken where they were cut in the PDF!