Now in picard: two javascript-based tools filtering BAM and VCF files.
SamJS and VCFFilterJS are two tools I wrote for jvarkit. Both tools use the embedded java javascript engine to filter BAM or VCF file.
To get a broader audience, I've copied those functionalities to Picard in 'FilterSamReads' and 'FilterVcf'.
FilterSamReads
FilterSamReads filters a SAM or BAM file with a javascript expression using the java javascript-engine.The script puts the following variables in the script context: 'record' a SamRecord and 'header' a SAMFileHeader. Last value of the script should be a boolean to tell wether we should accept or reject the record.
The script samFilter.js
/** accept record if second base of DNA is a A */ function accept(r) { return r.getReadString().length()>2 && r.getReadString().substring(1,2)=="A"; } accept(record);
Invoke and output
$ java -jar picard-tools-2.1.1/picard.jar \ FilterSamReads I=in.sam O=out.sam \ JS=samFilter.js FILTER=includeJavascript $ cat out.sam | cut -f 10 | cut -c 2 | sort | uniq A
FilterVcf
FilterVcf one or more hard filters to a VCF file to filter out genotypes and variants.Filters a VCF file with a javascript expression interpreted by the java javascript engine. The script puts the following variables in the script context: 'variant' a VariantContext and 'header' a VCFHeader. Last value of the script should be a boolean to tell wether we should accept or reject the record.
The script variantFilter.js
/** prints a VARIATION if two samples at least have a DP>100 */ function myfilterFunction(thevariant) { var samples=header.genotypeSamples; var countOkDp=0; for(var i=0; i< samples.size();++i) { var sampleName=samples.get(i); if(! variant.hasGenotype(sampleName)) continue; var genotype = thevariant.genotypes.get(sampleName); if( ! genotype.hasDP()) continue; var dp= genotype.getDP(); if(dp > 100 ) countOkDp++; } return (countOkDp>2) } myfilterFunction(variant)
Invoke and output
java -jar picard-tools-2.1.1/picard.jar FilterVcf \ I=in.vcf O=out.vcf \ JS=variantFilter.js $ grep -v '#' jeter.vcf | cut -f 7 | grep variantFilter | wc -l 23
That's it,
Pierre
1 comment:
What am I missing
variant.getLog10PError() > 100.0
or
variant.getLog10PError() < 100.0
These don't seem to filter as I would expect.
Post a Comment