I've been asked to look for new, suspected, or previously uncharacterized intron-exon junctions in public RNA-Seq data.
I've used the BAMs under http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeCaltechRnaSeq/.
The following command is used to build the list of BAMs:
curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeCaltechRnaSeq/" |\
  tr ' <>"' "\n" | grep -F .bam | grep -v bai | sort | uniq | sed 's/.bam$//' | sed 's/$/ \\/'

wgEncodeCaltechRnaSeqGm12878R1x75dAlignsRep1V2 \
wgEncodeCaltechRnaSeqGm12878R1x75dAlignsRep2V2 \
wgEncodeCaltechRnaSeqGm12878R1x75dSplicesRep1V2 \
wgEncodeCaltechRnaSeqGm12878R1x75dSplicesRep2V2 \
wgEncodeCaltechRnaSeqGm12878R2x75Il200AlignsRep1V2 \
wgEncodeCaltechRnaSeqGm12878R2x75Il200AlignsRep2V2 \
wgEncodeCaltechRnaSeqGm12878R2x75Il200SplicesRep1V2 \
wgEncodeCaltechRnaSeqGm12878R2x75Il200SplicesRep2V2 \
wgEncodeCaltechRnaSeqGm12878R2x75Il400AlignsRep2V2 \
wgEncodeCaltechRnaSeqGm12878R2x75Il400SplicesRep2V2 \
(...)
This list is inserted as a variable named SAMPLES in a Makefile.
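For illustration, the copy-and-paste step could be scripted. The sketch below is not what I actually ran: the function name samples_fragment and the output file samples.mk are hypothetical. It reuses the same tr/grep/sed filtering and wraps the result in a SAMPLES assignment with no trailing backslash on the last entry, so the fragment can be pulled into a Makefile with make's include directive.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original workflow): read a
# directory-listing HTML page on stdin and print a Makefile fragment
# that defines SAMPLES, one BAM basename per continuation line.
samples_fragment() {
    tr ' <>"' '\n' |          # split the HTML into one token per line
        grep -F .bam |        # keep tokens that mention .bam
        grep -v bai |         # drop the .bam.bai index files
        sort -u |             # de-duplicate (href and link text repeat)
        sed 's/\.bam$//' |    # strip the extension
        awk 'BEGIN { printf "SAMPLES =" }
             { printf " \\\n  %s", $0 }
             END { printf "\n" }'
}

# Assumed usage (requires network access, so it is left commented out):
# curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeCaltechRnaSeq/" |
#     samples_fragment > samples.mk
# and then, in the Makefile:
# include samples.mk
```

An empty listing yields a bare "SAMPLES =" line rather than a dangling continuation, which keeps the Makefile parseable even if the download fails.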
All in one: