Insert your VCFs in a sqlite database.
vcf2sqlite is a C++ tool that is part of my Variation Toolkit.
It inserts a "Variant Call Format document" (VCF) into a sqlite3 database.
Download
Download the sources from Google-Code using subversion:
svn checkout http://variationtoolkit.googlecode.com/svn/trunk/ variationtoolkit-read-only
... or update the sources of an existing installation:
cd variationtoolkit
svn update
... and edit the variationtoolkit/config.mk file.
Dependencies
http://www.sqlite.org/ : libraries and headers for sqlite3.
Compilation
Define "SQLITE_LIB" and "SQLITE_CFLAGS" in config.mk (see HowToInstall).
$ cd variationtoolkit/src/
$ make ../bin/vcf2sqlite
if ! [ -z "$(SQLITE_LIB)" ] ;then g++ -o ../bin/vcf2sqlite vcf2sqlite.cpp xsqlite.cpp application.o -O3 -Wall -lz ; else g++ -o ../bin/vcf2sqlite vcf2sqlite.cpp -DNOSQLITE -O3 -Wall ; fi
Usage
vcf2sqlite -f database.sqlite (file1.vcf file2... | stdin )
Options
- -f (file) sqlite3 database (REQUIRED).
Schema
Example:
$ vcf2sqlite -f db.sqlite file.vcf
$ sqlite3 -line db.sqlite "select * from VCFCALL LIMIT 4"
       id = 1
   nIndex = 0
vcfrow_id = 1
sample_id = 1
     prop = GT
    value = 1/1

       id = 2
   nIndex = 1
vcfrow_id = 1
sample_id = 1
     prop = PL
    value = 46,6,0

       id = 3
   nIndex = 2
vcfrow_id = 1
sample_id = 1
     prop = GQ
    value = 10

       id = 4
   nIndex = 0
vcfrow_id = 2
sample_id = 1
     prop = GT
    value = 1/1
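As the output above shows, each genotype field of a call (GT, PL, GQ...) is stored as its own row in VCFCALL, keyed by vcfrow_id and sample_id and ordered by nIndex. Here is a minimal Python sketch of how those rows might be regrouped into one record per call. It is only an illustration: the table layout is guessed from the columns printed above, not taken from the real schema; the toy database reuses the four rows shown; and the helper names build_toy_db and calls_by_row are hypothetical.

```python
import sqlite3
from collections import defaultdict


def build_toy_db():
    """Create an in-memory database mimicking VCFCALL.

    Assumption: the column list is inferred from the 'select *' output
    above; the real table created by vcf2sqlite may differ.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE VCFCALL ("
        "id INTEGER PRIMARY KEY, nIndex INTEGER, vcfrow_id INTEGER, "
        "sample_id INTEGER, prop TEXT, value TEXT)"
    )
    # The four rows printed by the sqlite3 -line example.
    rows = [
        (1, 0, 1, 1, "GT", "1/1"),
        (2, 1, 1, 1, "PL", "46,6,0"),
        (3, 2, 1, 1, "GQ", "10"),
        (4, 0, 2, 1, "GT", "1/1"),
    ]
    conn.executemany("INSERT INTO VCFCALL VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def calls_by_row(conn):
    """Group the prop/value rows into one dict per (vcfrow_id, sample_id)."""
    calls = defaultdict(dict)
    cursor = conn.execute(
        "SELECT vcfrow_id, sample_id, prop, value FROM VCFCALL "
        "ORDER BY vcfrow_id, sample_id, nIndex"
    )
    for vcfrow_id, sample_id, prop, value in cursor:
        calls[(vcfrow_id, sample_id)][prop] = value
    return dict(calls)


if __name__ == "__main__":
    conn = build_toy_db()
    for key, fields in calls_by_row(conn).items():
        print(key, fields)
```

To try it on a real database, sqlite3.connect("db.sqlite") could replace build_toy_db(), provided the columns match the ones assumed here.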
$ sqlite3 -column -header db.sqlite \
  "select SAMPLE.name,VCFCALL.value,count(*) from VCFCALL,SAMPLE where SAMPLE.id=VCFCALL.sample_id and prop='GT' group by SAMPLE.id,VCFCALL.value"
name         value       count(*)
-----------  ----------  ----------
rmdup_1.bam  0/1         545
rmdup_1.bam  1/1         429
rmdup_2.bam  0/1         625
rmdup_2.bam  1/1         349
rmdup_3.bam  0/1         595
rmdup_3.bam  1/1         379
rmdup_4.bam  0/1         548
rmdup_4.bam  1/1         426
rmdup_5.bam  0/1         564
rmdup_5.bam  1/1         410
rmdup_6.bam  0/1         724
rmdup_6.bam  1/1         250

That's it
Pierre