Insert your VCFs in a sqlite database.
vcf2sqlite is a C++ tool that is part of my Variation Toolkit.
It inserts a "Variant Call Format document" (VCF) into a sqlite3 database.
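To give an idea of what ends up in the database, here is a minimal Python sketch (not the tool's actual C++ code) of how one sample column of a VCF line can be split into one row per FORMAT key. The function name split_sample is hypothetical and the data line is made up; the nIndex/prop/value names are borrowed from the VCFCALL output shown in the Schema section below, and the mapping itself is only inferred from that output.

```python
def split_sample(format_field, sample_field):
    """Pair each FORMAT key with its value in one sample column.

    Returns a list of (nIndex, prop, value) tuples, one per key.
    This mirrors the layout seen in VCFCALL; it is an assumption,
    not a copy of what vcf2sqlite does internally.
    """
    keys = format_field.split(":")
    values = sample_field.split(":")
    # VCF allows trailing fields to be dropped from a sample column,
    # so zip() simply stops at the shorter of the two lists.
    return [(i, key, value) for i, (key, value) in enumerate(zip(keys, values))]


# A VCF data line is tab-separated: 8 fixed columns, FORMAT, then the samples.
# The first eight columns here are invented for the illustration.
line = "chr1\t100\t.\tA\tG\t50\t.\tDP=10\tGT:PL:GQ\t1/1:46,6,0:10"
columns = line.split("\t")
rows = split_sample(columns[8], columns[9])
for row in rows:
    print(row)
```

Each tuple corresponds to one row of VCFCALL for a given variant and sample.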
Download
Download the sources from Google-Code using subversion:
svn checkout http://variationtoolkit.googlecode.com/svn/trunk/ variationtoolkit-read-only
... or update the sources of an existing installation:
cd variationtoolkit
svn update
... and edit the variationtoolkit/config.mk file.
Dependencies
http://www.sqlite.org/ : libraries and headers for sqlite3.
Compilation
Define "SQLITE_LIB" and "SQLITE_CFLAGS" in config.mk (see HowToInstall).
$ cd variationtoolkit/src/
$ make ../bin/vcf2sqlite
if ! [ -z "$(SQLITE_LIB)" ] ;then g++ -o ../bin/vcf2sqlite vcf2sqlite.cpp xsqlite.cpp application.o -O3 -Wall -lz ; else g++ -o ../bin/vcf2sqlite vcf2sqlite.cpp -DNOSQLITE -O3 -Wall ; fi
Usage
vcf2sqlite -f database.sqlite (file1.vcf file2... | stdin )
Options
- -f (file) sqlite3 database (REQUIRED).
Schema
Example:
$ vcf2sqlite -f db.sqlite file.vcf
$ sqlite3 -line db.sqlite "select * from VCFCALL LIMIT 4"
id = 1
nIndex = 0
vcfrow_id = 1
sample_id = 1
prop = GT
value = 1/1
id = 2
nIndex = 1
vcfrow_id = 1
sample_id = 1
prop = PL
value = 46,6,0
id = 3
nIndex = 2
vcfrow_id = 1
sample_id = 1
prop = GQ
value = 10
id = 4
nIndex = 0
vcfrow_id = 2
sample_id = 1
prop = GT
value = 1/1
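Because each call is stored as one prop/value pair per row, it is often handy to regroup the pairs per variant and per sample. The following Python sketch does that with the standard sqlite3 module. It builds a throwaway in-memory table holding only the four rows shown above; the column types are assumptions, and the real schema created by vcf2sqlite may differ. Against a real database you would connect to db.sqlite instead of ":memory:" and skip the CREATE/INSERT part.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
# Assumed column types; only the column names come from the output above.
conn.execute(
    "CREATE TABLE VCFCALL(id INTEGER PRIMARY KEY, nIndex INTEGER, "
    "vcfrow_id INTEGER, sample_id INTEGER, prop TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO VCFCALL VALUES (?,?,?,?,?,?)",
    [
        (1, 0, 1, 1, "GT", "1/1"),
        (2, 1, 1, 1, "PL", "46,6,0"),
        (3, 2, 1, 1, "GQ", "10"),
        (4, 0, 2, 1, "GT", "1/1"),
    ],
)

# Regroup the key/value rows: one dict of properties per (variant, sample).
calls = defaultdict(dict)
query = (
    "SELECT vcfrow_id, sample_id, prop, value FROM VCFCALL "
    "ORDER BY vcfrow_id, sample_id, nIndex"
)
for vcfrow_id, sample_id, prop, value in conn.execute(query):
    calls[(vcfrow_id, sample_id)][prop] = value

for key, props in sorted(calls.items()):
    print(key, props)
conn.close()
```

The LIMIT 4 in the query above cuts the second variant after its GT field, which is why that entry only holds a genotype here.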
$ sqlite3 -column -header db.sqlite \
"select SAMPLE.name,VCFCALL.value,count(*) from VCFCALL,SAMPLE where SAMPLE.id=VCFCALL.sample_id and prop='GT' group by SAMPLE.id,VCFCALL.value"
name         value       count(*)
-----------  ----------  ----------
rmdup_1.bam  0/1         545
rmdup_1.bam  1/1         429
rmdup_2.bam  0/1         625
rmdup_2.bam  1/1         349
rmdup_3.bam  0/1         595
rmdup_3.bam  1/1         379
rmdup_4.bam  0/1         548
rmdup_4.bam  1/1         426
rmdup_5.bam  0/1         564
rmdup_5.bam  1/1         410
rmdup_6.bam  0/1         724
rmdup_6.bam  1/1         250
That's it
Pierre