(...)
public void convertReadNamesToUpperCase(
        final File inputSamOrBamFile,
        final File outputSamOrBamFile
        )
{
    final SAMFileReader inputSam = new SAMFileReader(inputSamOrBamFile);
    final SAMFileWriter outputSam = new SAMFileWriterFactory().makeSAMOrBAMWriter(inputSam.getFileHeader(),
            true, outputSamOrBamFile);
    for (final SAMRecord samRecord : inputSam) {
        // Convert read name to upper case.
        samRecord.setReadName(samRecord.getReadName().toUpperCase());
        outputSam.addAlignment(samRecord);
    }
    outputSam.close();
    inputSam.close();
}
(...)
Today, I played with this SAM API to build two simple Java tools:
SbamGrep
SbamGrep is available at http://code.google.com/p/code915/wiki/SbamGrep. It filters a SAM/BAM file using the SAM flags.
Options
-o (filename-out) output file; the default is SAM on stdout
-v invert the selection
-f [flag] add a BAM/SAM flag for filtering. Multiple flags are separated with commas. Each flag is one of:
READ_PAIRED or 1 or 0x1
PROPER_PAIR_ or 2 or 0x2
READ_UNMAPPED or 4 or 0x4
MATE_UNMAPPED or 8 or 0x8
READ_STRAND or 16 or 0x10
MATE_STRAND or 32 or 0x20
FIRST_OF_PAIR or 64 or 0x40
SECOND_OF_PAIR or 128 or 0x80
NOT_PRIMARY_ALIGNMENT or 256 or 0x100
READ_FAILS_VENDOR_QUALITY_CHECK or 512 or 0x200
DUPLICATE_READ or 1024 or 0x400
Example
The following command lists all the reads whose flag has READ_PAIRED, READ_UNMAPPED, MATE_UNMAPPED and FIRST_OF_PAIR all set:
java -cp sam-1.16.jar:dist/sbamgrep.jar fr.inserm.umr915.sbamtools.SbamGrep -f 0x4,0x1,0x8,0x40 file.bam
Result (masked):
@HD VN:1.0 SO:unsorted
@SQ SN:chrT LN:349250
IL_XXXX1 77 * 0 0 * * 0 0 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX (94**0-)*7=0688855555522@86;;;5;6:;63:4?-622647..-.5.%
IL_XXXX2 77 * 0 0 * * 0 0 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX (9*+*2396,@5+5:@@@;;5)50)6960684;58;86*5102)0*+8:*137;
IL_XXXX3 77 * 0 0 * * 0 0 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX (/999-00328:88984@@=8??@@:-66,;8;5;6+;255,1;788883676'
IL_XXXX4 77 * 0 0 * * 0 0 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX (916928.82@@50854;33222224;@25?5522;5=;;858888555*0666
IL_XXXX5 77 * 0 0 * * 0 0 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX (9*4*5-**32989+::@82;;853+39;80.53)-)79)..'55.8988*200
IL_XXXX6 77 * 0 0 * * 0 0 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX (*+**,14265;@@??@8?9@@@5@488?8666260.)-*9;;;88:8'05418
IL_XXXX7 77 * 0 0 * * 0 0 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX (9136242-2@@@;96.888@@@@80$585882623.':**+3*03137..--.
IL_XXXX8 77 * 0 0 * * 0 0 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX (/89255855@88?585557..())@@@;5552286526755@@5888..3;/$
(...)
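The flag test behind such a filter can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not SbamGrep's actual source: the class name SamFlagFilter and its methods are hypothetical, and the sketch assumes that a read is kept when every requested bit is set in its flag, with -v inverting that decision. It also shows why every read in the result above has the flag 77: 0x1 + 0x4 + 0x8 + 0x40 = 77.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of flag-based filtering; not SbamGrep's actual code. */
class SamFlagFilter {
    /** Flag names as listed in the options above, mapped to their bit values. */
    static final Map<String, Integer> FLAGS = new LinkedHashMap<>();
    static {
        FLAGS.put("READ_PAIRED", 0x1);
        FLAGS.put("PROPER_PAIR_", 0x2);
        FLAGS.put("READ_UNMAPPED", 0x4);
        FLAGS.put("MATE_UNMAPPED", 0x8);
        FLAGS.put("READ_STRAND", 0x10);
        FLAGS.put("MATE_STRAND", 0x20);
        FLAGS.put("FIRST_OF_PAIR", 0x40);
        FLAGS.put("SECOND_OF_PAIR", 0x80);
        FLAGS.put("NOT_PRIMARY_ALIGNMENT", 0x100);
        FLAGS.put("READ_FAILS_VENDOR_QUALITY_CHECK", 0x200);
        FLAGS.put("DUPLICATE_READ", 0x400);
    }

    /** Parses one token: a flag name, a decimal value, or a 0x-prefixed hex value. */
    static int parseFlag(String token) {
        String t = token.trim();
        Integer named = FLAGS.get(t);
        if (named != null) {
            return named;
        }
        return Integer.decode(t); // accepts both "64" and "0x40"
    }

    /** Combines a comma-separated list of flags into a single bit mask. */
    static int parseMask(String commaSeparated) {
        int mask = 0;
        for (String token : commaSeparated.split(",")) {
            mask |= parseFlag(token);
        }
        return mask;
    }

    /** Assumption: a read matches when all mask bits are set; 'inverse' mimics -v. */
    static boolean accept(int recordFlags, int mask, boolean inverse) {
        boolean matches = (recordFlags & mask) == mask;
        return matches != inverse;
    }

    public static void main(String[] args) {
        int mask = parseMask("0x4,0x1,0x8,0x40");
        System.out.println(mask);                    // prints 77
        System.out.println(accept(77, mask, false)); // prints true
        System.out.println(accept(99, mask, false)); // prints false
    }
}
```

With the SAM API, the value passed as recordFlags would come from each record's flag field (SAMRecord.getFlags() in Picard).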
SbamStats
The second tool, SbamStats, is available at http://code.google.com/p/code915/wiki/SbamStats. It provides a simple report about all the SAM flags used in one or more files.
Example
java -cp sam-1.16.jar:dist/sbamstats.jar fr.inserm.umr915.sbamtools.SBamStats file.bam
Output:
READ_PAIRED:0x1|READ_STRAND:0x10|MATE_STRAND:0x20|FIRST_OF_PAIR:0x40 10827
READ_PAIRED:0x1|READ_STRAND:0x10|MATE_STRAND:0x20|SECOND_OF_PAIR:0x80 10827
READ_PAIRED:0x1|FIRST_OF_PAIR:0x40 13951
READ_PAIRED:0x1|SECOND_OF_PAIR:0x80 13951
READ_PAIRED:0x1|MATE_STRAND:0x20|FIRST_OF_PAIR:0x40 23846
READ_PAIRED:0x1|READ_STRAND:0x10|SECOND_OF_PAIR:0x80 23846
READ_PAIRED:0x1|MATE_STRAND:0x20|SECOND_OF_PAIR:0x80 33606
READ_PAIRED:0x1|READ_STRAND:0x10|FIRST_OF_PAIR:0x40 33606
READ_PAIRED:0x1|MATE_UNMAPPED:0x8|READ_STRAND:0x10|SECOND_OF_PAIR:0x80 143090
READ_PAIRED:0x1|READ_UNMAPPED:0x4|READ_STRAND:0x10|MATE_STRAND:0x20|FIRST_OF_PAIR:0x40 143090
READ_PAIRED:0x1|READ_UNMAPPED:0x4|SECOND_OF_PAIR:0x80 161781
READ_PAIRED:0x1|MATE_UNMAPPED:0x8|FIRST_OF_PAIR:0x40 161781
READ_PAIRED:0x1|MATE_UNMAPPED:0x8|SECOND_OF_PAIR:0x80 174429
READ_PAIRED:0x1|READ_UNMAPPED:0x4|FIRST_OF_PAIR:0x40 174429
READ_PAIRED:0x1|MATE_UNMAPPED:0x8|READ_STRAND:0x10|FIRST_OF_PAIR:0x40 176219
READ_PAIRED:0x1|READ_UNMAPPED:0x4|READ_STRAND:0x10|MATE_STRAND:0x20|SECOND_OF_PAIR:0x80 176219
READ_PAIRED:0x1|PROPER_PAIR_:0x2|MATE_STRAND:0x20|FIRST_OF_PAIR:0x40 915161
READ_PAIRED:0x1|PROPER_PAIR_:0x2|READ_STRAND:0x10|SECOND_OF_PAIR:0x80 915161
READ_PAIRED:0x1|PROPER_PAIR_:0x2|MATE_STRAND:0x20|SECOND_OF_PAIR:0x80 2623345
READ_PAIRED:0x1|PROPER_PAIR_:0x2|READ_STRAND:0x10|FIRST_OF_PAIR:0x40 2623345
READ_PAIRED:0x1|READ_UNMAPPED:0x4|MATE_UNMAPPED:0x8|SECOND_OF_PAIR:0x80 26232123
READ_PAIRED:0x1|READ_UNMAPPED:0x4|MATE_UNMAPPED:0x8|FIRST_OF_PAIR:0x40 26232123
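A report like this boils down to decoding each read's flag into its named bits and counting how often each combination occurs. Here is a minimal sketch in plain Java, assuming the flag values have already been read from the file; the class name SamFlagStats and its methods are hypothetical and this is not the actual SbamStats code. Sorting the rows by count, as in the output above, is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a per-combination flag report; not SbamStats itself. */
class SamFlagStats {
    /** Flag names indexed by bit position (bit 0 = 0x1, bit 10 = 0x400). */
    static final String[] NAMES = {
        "READ_PAIRED", "PROPER_PAIR_", "READ_UNMAPPED", "MATE_UNMAPPED",
        "READ_STRAND", "MATE_STRAND", "FIRST_OF_PAIR", "SECOND_OF_PAIR",
        "NOT_PRIMARY_ALIGNMENT", "READ_FAILS_VENDOR_QUALITY_CHECK", "DUPLICATE_READ"
    };

    /** Builds a label such as "READ_PAIRED:0x1|FIRST_OF_PAIR:0x40" from a flag value. */
    static String describe(int flags) {
        StringBuilder label = new StringBuilder();
        for (int bit = 0; bit < NAMES.length; bit++) {
            int value = 1 << bit;
            if ((flags & value) == 0) {
                continue; // this bit is not set for the read
            }
            if (label.length() > 0) {
                label.append('|');
            }
            label.append(NAMES[bit]).append(":0x").append(Integer.toHexString(value));
        }
        return label.toString();
    }

    /** Counts how many reads carry each combination of flags. */
    static Map<String, Long> count(int[] flagsPerRead) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (int flags : flagsPerRead) {
            counts.merge(describe(flags), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up flag values for illustration only, not taken from a real file.
        int[] flags = {65, 65, 129, 77};
        for (Map.Entry<String, Long> row : count(flags).entrySet()) {
            System.out.println(row.getKey() + " " + row.getValue());
        }
    }
}
```

For example, describe(77) yields READ_PAIRED:0x1|READ_UNMAPPED:0x4|MATE_UNMAPPED:0x8|FIRST_OF_PAIR:0x40, the same label format as the last row of the report.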
That's it!
Pierre
I just downloaded both of these scripts and they have been pretty cool so far. I am presenting in lab meeting about the summary one today. Thanks!