People from the "Human Genetics Center" in Houston have compiled a new resource named dbNSFP, described in http://www.ncbi.nlm.nih.gov/pubmed/21520341:
Hum Mutat. 2011 Apr 21. doi:10.1002/humu.21517.
dbNSFP: a lightweight database of human non-synonymous SNPs and their functional predictions.
Liu X, Jian X, Boerwinkle E.
They have compiled the "prediction scores from four new and popular algorithms (SIFT, Polyphen2, LRT and MutationTaster), along with a conservation score (PhyloP) and other related information, for every potential NS in the human genome (a total of 75,931,005)."
So you no longer have to submit new jobs to SIFT or Polyphen: every score has already been computed and joined in one place.
The database is available from http://sites.google.com/site/jpopgen/dbNSFP.
Downloading
lindenb@yokofakun:~$ wget "http://dl.dropbox.com/u/17001647/dbNSFP/dbNSFP.chr1-22XY.zip"
--2011-04-27 13:50:26-- http://dl.dropbox.com/u/17001647/dbNSFP/dbNSFP.chr1-22XY.zip
Proxy request sent, awaiting response... 200 OK
Length: 1200703405 (1.1G) [application/zip]
Saving to: `dbNSFP.chr1-22XY.zip'
100%[=================================================================================================================>] 1,200,703,405 1.82M/s in 10m 11s
2011-04-27 14:00:38 (1.87 MB/s) - `dbNSFP.chr1-22XY.zip' saved [1200703405/1200703405]
Content
lindenb@yokofakun:~$ unzip -t dbNSFP.chr1-22XY.zip
Archive: dbNSFP.chr1-22XY.zip
testing: dbNSFP.chr1 OK
testing: dbNSFP.chr10 OK
testing: dbNSFP.chr11 OK
testing: dbNSFP.chr12 OK
testing: dbNSFP.chr13 OK
testing: dbNSFP.chr14 OK
testing: dbNSFP.chr15 OK
testing: dbNSFP.chr16 OK
testing: dbNSFP.chr17 OK
testing: dbNSFP.chr18 OK
testing: dbNSFP.chr19 OK
testing: dbNSFP.chr2 OK
testing: dbNSFP.chr20 OK
testing: dbNSFP.chr21 OK
testing: dbNSFP.chr22 OK
testing: dbNSFP.chr3 OK
testing: dbNSFP.chr4 OK
testing: dbNSFP.chr5 OK
testing: dbNSFP.chr6 OK
testing: dbNSFP.chr7 OK
testing: dbNSFP.chr8 OK
testing: dbNSFP.chr9 OK
testing: dbNSFP.chrX OK
testing: dbNSFP.chrY OK
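Since the archive holds one file per chromosome, looking up a precomputed prediction only requires scanning the file for the right chromosome. The following is a minimal sketch, not part of dbNSFP itself: the function name lookup_variant is hypothetical, the files are assumed to be tab-delimited, and the column numbers ($2 pos, $4 alt, $19-$22 SIFT and Polyphen2) are taken from the sample shown in this post, so they may differ in other releases.

```shell
# lookup_variant POS ALT [FILE...]
# Hypothetical helper: print the SIFT and Polyphen2 columns for every row
# whose position ($2) and alternate allele ($4) match the arguments.
# Assumes tab-delimited input with the column order shown in this post.
# Reads standard input when no FILE is given.
lookup_variant() {
  pos=$1
  alt=$2
  shift 2
  awk -F'\t' -v pos="$pos" -v alt="$alt" '
    $2 == pos && $4 == alt {
      printf "%s:%s %s>%s SIFT=%s(%s) Polyphen2=%s(%s)\n",
             $1, $2, $3, $4, $19, $20, $21, $22
    }' "$@"
}

# Example (assumes the downloaded archive is in the current directory):
#   unzip -p dbNSFP.chr1-22XY.zip dbNSFP.chr22 | lookup_variant 15453440 G
```

The example streams a single chromosome out of the zip with unzip -p, so nothing has to be unpacked to disk. The match is made on the $2 coordinate; column $7 holds the hg19 position if that is the one you have.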
Sample (verticalized)
>>2
$1 #chr : 22
$2 pos(1-based) : 15453440
$3 ref : T
$4 alt : G
$5 aaref : M
$6 aaalt : L
$7 hg19pos(1-based) : 17073440
$8 genename : CCT8L2
$9 geneid : 150160
$10 CCDSid : CCDS13738.1
$11 refcodon : ATG
$12 codonpos : 1
$13 fold-degenerate : 0
$14 aapos : 1
$15 cds_strand : -
$16 LRT_Omega : 1.116940
$17 PhyloP_score : 0.963611
$18 PlyloP_pred : C
$19 SIFT_score : 1.0
$20 SIFT_pred : D
$21 Polyphen2_score : 0.25
$22 Polyphen2_pred : P
$23 LRT_score : 0.419288
$24 LRT_pred : U
$25 MutationTaster_score : 1.0
$26 MutationTaster_pred : D
<<2
>>3
$1 #chr : 22
$2 pos(1-based) : 15453440
$3 ref : T
$4 alt : C
$5 aaref : M
$6 aaalt : V
$7 hg19pos(1-based) : 17073440
$8 genename : CCT8L2
$9 geneid : 150160
$10 CCDSid : CCDS13738.1
$11 refcodon : ATG
$12 codonpos : 1
$13 fold-degenerate : 0
$14 aapos : 1
$15 cds_strand : -
$16 LRT_Omega : 1.116940
$17 PhyloP_score : 0.963611
$18 PlyloP_pred : C
$19 SIFT_score : 1.0
$20 SIFT_pred : D
$21 Polyphen2_score : 0.25
$22 Polyphen2_pred : P
$23 LRT_score : 0.419288
$24 LRT_pred : U
$25 MutationTaster_score : 1.0
$26 MutationTaster_pred : D
<<3
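A vertical view like the one above can be produced with a few lines of awk. This is only a sketch of one way to do it, not necessarily the command that generated the sample: the function name verticalize is hypothetical, and it assumes the file is tab-delimited with the column names on its first line.

```shell
# verticalize [FILE...]
# Hypothetical helper: print every data row of a tab-delimited file as one
# "$N name : value" line per column, framed by ">>LINE" and "<<LINE" markers,
# where LINE is the line number of the row in the input.
# The first input line is taken as the header. Reads stdin when no FILE is given.
verticalize() {
  awk -F'\t' '
    NR == 1 {                       # remember the column names
      for (i = 1; i <= NF; i++) name[i] = $i
      next
    }
    {
      print ">>" NR
      for (i = 1; i <= NF; i++) printf "$%d %s : %s\n", i, name[i], $i
      print "<<" NR
    }' "$@"
}

# Example (assumes the downloaded archive is in the current directory):
#   unzip -p dbNSFP.chr1-22XY.zip dbNSFP.chr22 | head -n 3 | verticalize
```

Because the header is line 1, the first data row is labelled >>2, which matches the numbering of the records shown above.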
That's it,
Pierre