Parallelizing #RStats using #make
In the current post, I'll show how to use R as the main SHELL of GNU-Make instead of using a classical linux shell like 'bash'. Why would you do this ?
- awesomeness
- Make-based workflow management
- Make-based execution with --jobs. GNU make knows how to execute several recipes at once. Normally, make will execute only one recipe at a time, waiting for it to finish before executing the next. However, the '-j' or '--jobs' option tells make to execute many recipes simultaneously.
The only problem is that R doesn't accept a multiline-argument on the command line (see http://stackoverflow.com/questions/21442674) so I created a wrapper 'mockR' that save the argument '-e "code"' into a file and pipe it into R:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#!/bin/bash | |
while getopts "e:" opt; do | |
case $opt in | |
e) | |
R=$OPTARG | |
;; | |
\:) | |
echo "Argument missing: -$OPTARG" >&2 | |
exit -1 | |
;; | |
\?) | |
echo "Invalid option: -$OPTARG" >&2 | |
exit -1 | |
;; | |
esac | |
done | |
if [[ -n "$R" ]] | |
then | |
TMP=$(mktemp) | |
trap "{ rm -f ${TMP}; exit 255; }" SIGINT | |
echo "${R}" > ${TMP} | |
R --vanilla --no-readline --quiet < ${TMP} | |
rm -f ${TMP} | |
else | |
echo "Undefined script" >&2 | |
exit -1 | |
fi |
(Edit1: A comment from madscientist : Re your script; you can save yourself some wear-and-tear on your disk and avoid the need for temp files and cleanup by just piping the input directly:
echo "$R" | R --vanilla --no-readline --quiet
. Just a thought. ")(Edit2: the exit value of 'R' should also be returned by 'mockR'.)
This file is set as executable:
$ chmod u+x ./mockRIn the makefile, we tell 'make' to use 'mockR' instead of '/usr/bin/sh':
SHELL = ./mockRThe R code will be passed to 'mockR' using the argument '-e "code"'
.SHELLFLAGS= -eWe also set 'ONESHELL': "If .ONESHELL is mentioned as a target, then when a target is built all lines of the recipe will be given to a single invocation of the shell rather than each line being invoked separately"
.ONESHELL:
Example 1
We download the table 'knownGene' from the UCSC and we plot a pdf file 'countExons=f(txStart)'. Please, note that the targets are created using some R statements, NOT bash statements:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
.PHONY: all clean | |
SHELL = ./mockR | |
.SHELLFLAGS= -e | |
.ONESHELL: | |
CHROM=1 2 | |
UCSC=http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/ | |
all: $(foreach C,${CHROM}, chr${C}.pdf) | |
chr1.pdf: knownGene.txt.gz | |
gold <- read.delim(gzfile("$<")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr1') | |
pdf(file="$@") | |
plot(gold$$txStart,gold$$exonCount) | |
dev.off() | |
chr2.pdf: knownGene.txt.gz | |
gold <- read.delim(gzfile("$<")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr2') | |
pdf(file="$@") | |
plot(gold$$txStart,gold$$exonCount) | |
dev.off() | |
knownGene.txt.gz: | |
download.file("${UCSC}/$@","$@") | |
clean: | |
$(foreach F,knownGene.txt.gz $(foreach C,${CHROM}, chr${C}.pdf) ,file.remove("$F");) |
Now Invoke make
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
$ make-4.0/make -f R01.mk | |
download.file("http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database//knownGene.txt.gz","knownGene.txt.gz") | |
> download.file("http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database//knownGene.txt.gz","knownGene.txt.gz") | |
trying URL 'http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database//knownGene.txt.gz' | |
Content type 'application/x-gzip' length 4270683 bytes (4.1 Mb) | |
opened URL | |
================================================== | |
downloaded 4.1 Mb | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr1') | |
pdf(file="chr1.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr1') | |
> pdf(file="chr1.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr2') | |
pdf(file="chr2.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr2') | |
> pdf(file="chr2.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> |
Example 2
Using a theeval
and the call
function we can make the previous 'Makefile' applicable for all the chromosomes:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
.PHONY: all clean | |
SHELL = ./mockR | |
.SHELLFLAGS= -e | |
.ONESHELL: | |
CHROM=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y | |
UCSC=http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/ | |
define plot_knownGene | |
chr$(1).pdf: knownGene.txt.gz | |
gold <- read.delim(gzfile("$$<")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr$(1)') | |
pdf(file="$$@") | |
plot(gold$$$$txStart,gold$$$$exonCount) | |
dev.off() | |
endef | |
all: $(foreach C,${CHROM}, chr${C}.pdf) | |
$(foreach C,${CHROM}, $(eval $(call plot_knownGene,$C))) | |
knownGene.txt.gz: | |
download.file("${UCSC}/$@","$@") | |
clean: | |
$(foreach F,knownGene.txt.gz $(foreach C,${CHROM}, chr${C}.pdf) ,file.remove("$F");) |
Now Invoke make USING TRHEE PARALLEL JOBS
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
$ ./make -j 3 -f R02.mk all | |
download.file("http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database//knownGene.txt.gz","knownGene.txt.gz") | |
> download.file("http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database//knownGene.txt.gz","knownGene.txt.gz") | |
trying URL 'http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database//knownGene.txt.gz' | |
Content type 'application/x-gzip' length 4270683 bytes (4.1 Mb) | |
opened URL | |
================================================== | |
downloaded 4.1 Mb | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr1') | |
pdf(file="chr1.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr2') | |
pdf(file="chr2.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr3') | |
pdf(file="chr3.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr3') | |
> gold=subset(gold,chrom=='chr1') | |
> pdf(file="chr3.pdf") | |
> gold=subset(gold,chrom=='chr2') | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr4') | |
pdf(file="chr4.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> pdf(file="chr1.pdf") | |
> pdf(file="chr2.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr5') | |
pdf(file="chr5.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr6') | |
pdf(file="chr6.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr5') | |
> gold=subset(gold,chrom=='chr6') | |
> pdf(file="chr5.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr7') | |
pdf(file="chr7.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> pdf(file="chr6.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr8') | |
pdf(file="chr8.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr4') | |
> pdf(file="chr4.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr9') | |
pdf(file="chr9.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr7') | |
> pdf(file="chr7.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr10') | |
pdf(file="chr10.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr9') | |
> pdf(file="chr9.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr11') | |
pdf(file="chr11.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> gold=subset(gold,chrom=='chr8') | |
> pdf(file="chr8.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr12') | |
pdf(file="chr12.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr10') | |
> pdf(file="chr10.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr13') | |
pdf(file="chr13.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr11') | |
> pdf(file="chr11.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr14') | |
pdf(file="chr14.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr12') | |
> pdf(file="chr12.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr15') | |
pdf(file="chr15.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr13') | |
> pdf(file="chr13.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr16') | |
pdf(file="chr16.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr14') | |
> pdf(file="chr14.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr17') | |
pdf(file="chr17.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr15') | |
> pdf(file="chr15.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr18') | |
pdf(file="chr18.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr16') | |
> pdf(file="chr16.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr19') | |
pdf(file="chr19.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr17') | |
> pdf(file="chr17.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr20') | |
pdf(file="chr20.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr18') | |
> pdf(file="chr18.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr21') | |
pdf(file="chr21.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr19') | |
> pdf(file="chr19.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chr22') | |
pdf(file="chr22.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr20') | |
> pdf(file="chr20.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chrX') | |
pdf(file="chrX.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> gold=subset(gold,chrom=='chr21') | |
> pdf(file="chr21.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
gold <- read.delim(gzfile("knownGene.txt.gz")) | |
colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
gold=subset(gold,chrom=='chrY') | |
pdf(file="chrY.pdf") | |
plot(gold$txStart,gold$exonCount) | |
dev.off() | |
> gold <- read.delim(gzfile("knownGene.txt.gz")) | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chr22') | |
> pdf(file="chr22.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> colnames(gold)= c("name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds", "proteinID", "alignID") | |
> gold=subset(gold,chrom=='chrX') | |
> gold=subset(gold,chrom=='chrY') | |
> pdf(file="chrX.pdf") | |
> pdf(file="chrY.pdf") | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> | |
> plot(gold$txStart,gold$exonCount) | |
> dev.off() | |
null device | |
1 | |
> |
You can now watch the final pdf files:
That's it,
Pierre
1 comment:
Very cool.
Post a Comment