Showing posts with label graphviz. Show all posts

05 July 2014

Pushed: makefile2graph, creating a graph of dependencies from GNU-Make.

I pushed makefile2graph at https://github.com/lindenb/makefile2graph. This is a standalone C implementation of a program I first wrote in Java in 2012. The program creates a graph of dependencies from the output of GNU Make and writes a Graphviz dot file or a GEXF (Gephi XML) file.

Usage

$ make -Bnd | make2graph > output.dot
$ make -Bnd | make2graph -x > gephi.gexf.xml 

Example

Here is the output of makefile2graph for Tabix:
$ cd tabix-0.2.5
$ make -Bnd |make2graph
digraph G {
n1[label="", color="green"];
n2[label="Makefile", color="green"];
n4[label="all", color="red"];
n3[label="all-recur", color="red"];
n23[label="bedidx.c", color="green"];
n22[label="bedidx.o", color="red"];
n9[label="bgzf.c", color="green"];
n10[label="bgzf.h", color="green"];
n8[label="bgzf.o", color="red"];
n27[label="bgzip", color="red"];
n29[label="bgzip.c", color="green"];
n28[label="bgzip.o", color="red"];
n18[label="index.c", color="green"];
n17[label="index.o", color="red"];
n20[label="khash.h", color="green"];
n16[label="knetfile.c", color="green"];
n11[label="knetfile.h", color="green"];
n15[label="knetfile.o", color="red"];
n24[label="kseq.h", color="green"];
n21[label="ksort.h", color="green"];
n13[label="kstring.c", color="green"];
n14[label="kstring.h", color="green"];
n12[label="kstring.o", color="red"];
n6[label="lib", color="red"];
n7[label="libtabix.a", color="red"];
n26[label="main.c", color="green"];
n25[label="main.o", color="red"];
n5[label="tabix", color="red"];
n19[label="tabix.h", color="green"];
n2 -> n1 ; 
n4 -> n1 ; 
n3 -> n1 ; 
(..)
}
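
To give an idea of how such a converter can work, here is a minimal Python sketch. It is not the C code of makefile2graph. It assumes that the debug output of make contains lines of the form "Considering target file 'X'.", "Pruning file 'X'.", "File 'X' was considered already." and "Finished prerequisites of target file 'X'.", and it follows the nesting of targets with a stack. The function names and the sample log are hypothetical, and the node colors of the real output are not reproduced.

```python
import re

# Assumed shapes of the GNU Make debug lines (an assumption of this sketch).
# Both quoting styles, `x' and 'x', are accepted.
CONSIDER = re.compile(r"Considering target file [`'](.*)'\.")
PRUNED = re.compile(r"Pruning file [`'](.*)'\.")
FINISHED = re.compile(r"Finished prerequisites of target file [`'](.*)'\.")
ALREADY = re.compile(r"File [`'](.*)' was considered already\.")


def parse_make_debug(lines):
    """Return unique (prerequisite, target) pairs in order of first appearance."""
    edges = {}  # dict used as an ordered set
    stack = []  # targets whose prerequisites are currently being examined
    for line in lines:
        considered = CONSIDER.search(line)
        pruned = PRUNED.search(line)
        if considered:
            name = considered.group(1)
            if stack:
                edges[(name, stack[-1])] = True
            stack.append(name)  # its own prerequisites follow
        elif pruned:
            if stack:
                edges[(pruned.group(1), stack[-1])] = True
        elif FINISHED.search(line) or ALREADY.search(line):
            if stack:
                stack.pop()  # back to the parent target
    return list(edges)


def to_dot(edges):
    """Render the pairs as a Graphviz digraph (labels are not escaped)."""
    ids = {}
    for pair in edges:
        for name in pair:
            ids.setdefault(name, "n%d" % (len(ids) + 1))
    out = ["digraph G {"]
    for name, node in ids.items():
        out.append('%s[label="%s"];' % (node, name))
    for src, dst in edges:
        out.append("%s -> %s;" % (ids[src], ids[dst]))
    out.append("}")
    return "\n".join(out)


# Hypothetical, hand-written log used only to exercise the parser.
SAMPLE_LOG = [
    "Considering target file 'all'.",
    " Considering target file 'a.o'.",
    "  Considering target file 'a.c'.",
    "  Finished prerequisites of target file 'a.c'.",
    " Finished prerequisites of target file 'a.o'.",
    " Must remake target 'a.o'.",
    " Pruning file 'a.c'.",
    "Finished prerequisites of target file 'all'.",
]

if __name__ == "__main__":
    print(to_dot(parse_make_debug(SAMPLE_LOG)))
```

For a real build, the lines would be read from standard input instead of SAMPLE_LOG.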

That's it
Pierre

25 March 2013

Embedding Pubmed, Graphviz and a remote image in #LaTeX. My notebook.

I'm learning LaTeX. Today I learned how to create a new command in LaTeX.

\newcommand{name}[num]{definition}
"Basically the command requires two arguments: the name of the command you want to create, and the definition of the command". I played with LaTeX and wrote the following three commands:

Embedding a remote picture

The following LaTeX document defines a new command "\remoteimage". It takes 3 parameters: a filename, a URL and some parameters for \includegraphics. If the file doesn't exist, the URL is downloaded and saved in 'file'. The downloaded image is then included in the final LaTeX document.

Note: LaTeX files must be compiled with --enable-write18 to enable system calls.
pdflatex --enable-write18 input.tex
Result:

External Image /Latex by lindenb


GraphViz Dot

The second LaTeX document works the same way. It defines a command "\graphviz", sends the content of the 2nd argument to Graphviz dot and saves the resulting image before importing it into the LaTeX document.

Result:

GraphViz / Latex by lindenb


Pubmed

The last command defines "\pmid". It needs one Pubmed identifier. It downloads the XML record for this pmid and transforms it to LaTeX with xsltproc and the following XSLT stylesheet:

The LaTeX document includes four pubmed identifiers:

Result:

Pubmed / Latex by lindenb






That's it,

Pierre




21 November 2012

visualizing the dependencies in a Makefile

Update 2014: I wrote a C version at https://github.com/lindenb/makefile2graph.
I've just coded a tool to visualize the dependencies in a Makefile. The Java source code is available on GitHub at: https://github.com/lindenb/jsandbox/blob/master/src/sandbox/MakeGraphDependencies.java. This simple tool parses the output of
make -dq
(here option '-d' is 'Print lots of debugging information' and '-q' is 'Run no commands') and prints a Graphviz dot file.

Example

Below is a simple NGS workflow:
%.bam.bai : %.bam
 
file.vcf:  merged.bam.bai ref.fa
merged.bam : sorted1.bam sorted2.bam
sorted1.bam: lane1_1.fastq  lane1_2.fastq ref.fa
sorted2.bam: lane2_1.fastq  lane2_2.fastq ref.fa
Invoking the program:
make -d --dry-run | java -jar makegraphdependencies.jar
generates the following Graphviz dot file:
digraph G {
n9[label="sorted2.bam" ];
n3[label="merged.bam.bai" ];
n10[label="lane2_1.fastq" ];
n11[label="lane2_2.fastq" ];
n2[label="file.vcf" ];
n4[label="merged.bam" ];
n6[label="lane1_1.fastq" ];
n8[label="ref.fa" ];
n7[label="lane1_2.fastq" ];
n0[label="[ROOT]" ];
n5[label="sorted1.bam" ];
n1[label="Makefile" ];
n10->n9;
n11->n9;
n8->n9;
n4->n3;
n3->n2;
n8->n2;
n9->n4;
n5->n4;
n2->n0;
n1->n0;
n6->n5;
n8->n5;
n7->n5;
}
The result (here using the Google Chart API for Graphviz):

That's it,
Pierre

30 June 2010

XSLT+NCBI-Taxonomy=Graphviz Dot

The following post was inspired by this question on Biostar.com: http://biostar.stackexchange.com/questions/1549: "lets say I want to know which taxonomic level groups Tribolium castaneum and Drosophila melanogaster. Insects, right? (...) Now lets say I have 10 pairs of such species and I want to see how close & distant they are... How can I do this easily?"
I suggested two solutions, both using an XSLT stylesheet. I then wondered if one could use an XSLT stylesheet to draw a tree of life with the help of Graphviz. The stylesheet I wrote is available at:


Usage


xsltproc --novalid taxonomy2dot.xsl \
"http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?id=7070,32351,9605,9606&db=taxonomy&retmode=xml" |\
dot -o/home/pierre/file.svg -Tsvg

The main problem with this stylesheet was to create one and only one connection between two nodes, even if this connection was present more than once in the XML file. The trick was to use the XPath axis preceding-sibling:: to check whether the connection had already been printed.
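
The "print each connection only once" idea can also be sketched outside XSLT. The Python snippet below is only an illustration, not the stylesheet: it uses a set where the stylesheet looks back with preceding-sibling::, and the lineages are hypothetical, hand-written and heavily simplified data (nothing is fetched from the NCBI).

```python
def lineages_to_dot(lineages):
    """Build a DOT digraph from root-to-leaf lineages, printing each edge once."""
    seen = set()  # (parent, child) pairs already printed
    lines = ["digraph G {"]
    for lineage in lineages:
        # Walk consecutive pairs: (root, child), (child, grandchild), ...
        for parent, child in zip(lineage, lineage[1:]):
            if (parent, child) in seen:
                continue  # this connection was already printed
            seen.add((parent, child))
            lines.append('"%s" -> "%s";' % (parent, child))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical, simplified lineages (most taxonomic ranks are omitted).
LINEAGES = [
    ["Metazoa", "Insecta", "Coleoptera", "Tribolium castaneum"],
    ["Metazoa", "Insecta", "Diptera", "Drosophila melanogaster"],
    ["Metazoa", "Mammalia", "Homo sapiens"],
]

if __name__ == "__main__":
    print(lineages_to_dot(LINEAGES))
```

The edge from "Metazoa" to "Insecta" appears in two lineages but is written only once.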

Result



That's it !

Pierre

29 October 2008

EMBL/STRING: find interactors at 2 degrees of separation. My notebook.

Thanks (again) to the Life Scientists on FriendFeed, I've discovered the API of STRING8 (STRING 8—a global view on proteins and their functional interactions in 630 organisms, NAR 2008): STRING is a database and web resource dedicated to protein–protein interactions, including both physical and functional interactions.


I've used this API to find the partners of a protein at two degrees of separation. Here is my notebook:
First, download the network for each protein using the HTTP-based API, e.g. http://string.embl.de/api/psi-mi/interactions?identifier=Roxan. The Ensembl protein ID seems to be the most reliable (unambiguous) identifier (e.g. http://string.embl.de/api/psi-mi/interactions?identifier=ENSP00000263243). Note that the whole STRING database is also available for download.

I also wrote a basic XSLT stylesheet transforming the PSI/XML into the Graphviz dot format. The stylesheet is available here: http://code.google.com/p/lindenb/source/browse/trunk/src/xsl/psi2dot.xslt. E.g.:

xsltproc psi2dot.xslt ROXAN.xml | dot -opicture.png -Tpng



Another XSLT stylesheet, psi2sql.xslt, creates the statements to insert one or more PSI files into a MySQL database.
xsltproc --stringparam temporary "" psi2sql.xslt interaction1.xml | mysql -u login --password=password -D database -N
xsltproc --stringparam temporary "" psi2sql.xslt interaction2.xml | mysql -u login --password=password -D database -N
xsltproc --stringparam temporary "" psi2sql.xslt interaction3.xml | mysql -u login --password=password -D database -N

The parameter temporary is an argument for the stylesheet telling mysql not to work with temporary tables.

Two of the tables created (interactor and interaction) are described below:
mysql> desc interactor;
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| id         | int(11)      | NO   | PRI | NULL    | auto_increment |
| pk         | varchar(50)  | NO   | UNI | NULL    |                |
| shortLabel | varchar(255) | YES  |     | NULL    |                |
| fullName   | text         | YES  |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+

mysql> desc interaction;
+----------------+--------------+------+-----+---------+----------------+
| Field          | Type         | Null | Key | Default | Extra          |
+----------------+--------------+------+-----+---------+----------------+
| id             | int(11)      | NO   | PRI | NULL    | auto_increment |
| interactor1_id | int(11)      | NO   | MUL | NULL    |                |
| interactor2_id | int(11)      | NO   | MUL | NULL    |                |
| unitLabel      | varchar(50)  | YES  |     | NULL    |                |
| unitFullName   | varchar(100) | YES  |     | NULL    |                |
| confidence     | float        | YES  |     | NULL    |                |
| experiment_id  | int(11)      | NO   | MUL | NULL    |                |
+----------------+--------------+------+-----+---------+----------------+
7 rows in set (0.00 sec)



And here are the MySQL statements finding the proteins linked to EIF4G1 at two degrees of separation:
Create a temporary table containing the 2-degree interactions:
create temporary table t1
(
id1 int,
id2 int,
id3 int
);

insert into t1(id1,id2,id3)
select distinct
P1.id,P2.id,P3.id
from
interactor as P1,
interactor as P2,
interactor as P3,
interaction as I1,
interaction as I2
where
P1.shortLabel="EIF4G1" and
P3.shortLabel!="EIF4G1" and
((P1.id= I1.interactor1_id AND P2.id= I1.interactor2_id) or (P2.id= I1.interactor1_id AND P1.id= I1.interactor2_id)) and
((P2.id= I2.interactor1_id and P3.id= I2.interactor2_id) or (P3.id= I2.interactor1_id and P2.id= I2.interactor2_id))
;

Remove the simple interactions from the temporary table:
delete t1 from
t1,
interactor as P1,
interactor as P3,
interaction as I1
where
((t1.id1=P1.id and t1.id3=P3.id) or (t1.id1=P3.id and t1.id3=P1.id)) and
((P1.id= I1.interactor1_id and P3.id= I1.interactor2_id) or (P3.id= I1.interactor1_id and P1.id= I1.interactor2_id))
;


And dump the results:
select
P1.shortLabel as "Partner1",
P2.shortLabel as "Partner2",
P3.shortLabel as "Partner3"
from
t1,
interactor as P1,
interactor as P2,
interactor as P3
where
t1.id1 = P1.id
and
t1.id2 = P2.id
and
t1.id3=P3.id
;


Here is the result:
Partner1 Partner2 Partner3
EIF4G1 ZC3H7B HMGB1
EIF4G1 ZC3H7B KCTD12
EIF4G1 ZC3H7B FGB
EIF4G1 ZC3H7B GLUD1
EIF4G1 ZC3H7B PDGFRA
EIF4G1 ZC3H7B PXN
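
For comparison, the logic of the three SQL steps (collect the paths P1-P2-P3, drop those where P1 and P3 also interact directly, then list what is left) can be sketched in a few lines of Python on an in-memory list of pairs. This is only an illustration under my own assumptions: the function name is hypothetical and the toy interactions use made-up labels, not data from STRING.

```python
from collections import defaultdict


def two_degrees(interactions, start):
    """Return sorted (start, middle, far) paths where far is not a direct partner."""
    neighbours = defaultdict(set)
    for a, b in interactions:
        # Interactions are undirected, as in the SQL "or" clauses.
        neighbours[a].add(b)
        neighbours[b].add(a)
    direct = neighbours[start]
    paths = set()
    for middle in direct:
        for far in neighbours[middle]:
            # Skip the start itself and the "simple" (direct) interactions.
            if far != start and far not in direct:
                paths.add((start, middle, far))
    return sorted(paths)


# Made-up labels: A interacts with B and D, B with C and D, C with E.
TOY_INTERACTIONS = [("A", "B"), ("B", "C"), ("B", "D"), ("A", "D"), ("C", "E")]

if __name__ == "__main__":
    for path in two_degrees(TOY_INTERACTIONS, "A"):
        print(*path)
```

Here D is reachable through B but is also a direct partner of A, so only the path A, B, C is kept.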



That's it
Pierre

21 October 2008

Javadoc is not enough: java2dot

I just wrote a tiny tool to draw the graph of a Java class hierarchy. The input of the program is a set of jar files and the names of the classes to be displayed.

The source code is available here:

The information about each class is obtained using the java.lang.reflect API and the classes are dynamically loaded using a URLClassLoader. The output is a DOT file which is then piped into Graphviz dot.

As an example, here is the usage screen of the program, followed by the command line used to draw the hierarchy of com.hp.hpl.jena.rdf.model.Model:
java -jar ./java2dot.jar
Pierre Lindenbaum PhD. pindenbaum@yahoo.fr
Java2Dot : Compiled by lindenb on 2008-10-21 at 17:40:52 in /home/lindenb/src/lindenb/proj/tinytools
-h this screen
-jar <dir0:jar1:jar2:dir1:...> add a jar in the jar list. If directory, will add all the *ar files
-r add a pattern of classes to be ignored.
-i ignore interfaces
-m ignore classes iMplementing interfaces
-d ignore declared-classes (classes with $ in the name)
-o output file

class-1 class-2 ... class-n




java -jar ./java2dot.jar -jar ${JENADIR}/Jena-2.5.6/lib -d com.hp.hpl.jena.rdf.model.Model |\
dot -Tjpeg -ojenamodel.jpeg
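
The principle (ask the runtime for the parents of a class, then print one edge per parent) is not specific to Java. As a rough illustration only, and not the java2dot code, here is the same idea sketched in Python, where the __bases__ attribute plays the role of java.lang.reflect. The function name and the four toy classes are hypothetical.

```python
def hierarchy_to_dot(classes):
    """Walk up from the given classes and emit one DOT edge per (class, base) pair."""
    edges = []
    seen = set()
    todo = list(classes)
    while todo:
        cls = todo.pop()
        if cls in seen:
            continue  # each class is expanded only once
        seen.add(cls)
        for base in cls.__bases__:
            if base is object:
                continue  # the universal root is left out of the drawing
            edges.append((cls.__name__, base.__name__))
            todo.append(base)
    lines = ["digraph G {", "rankdir=BT;"]
    for child, parent in sorted(edges):
        lines.append('"%s" -> "%s";' % (child, parent))
    lines.append("}")
    return "\n".join(lines)


# Toy classes, only there to have something to draw.
class Base:
    pass


class Left(Base):
    pass


class Right(Base):
    pass


class Leaf(Left, Right):
    pass


if __name__ == "__main__":
    print(hierarchy_to_dot([Leaf]))
```

As with java2dot, the printed text can be piped into dot to get a picture.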



Update: A jar is available here http://lindenb.googlecode.com/files/java2dot.jar.

Pierre

21 March 2007

Geni, Graphviz, Dot & Family Tree.

Geni is a genealogy-related social networking website launched in beta mode in January 2007. Since yesterday, the family tree can now be exported as a gedcom-xml file (alpha version).

The following XSLT stylesheet transforms the gedcom file into a Graphviz/DOT input which can be used to generate the family tree.

Family Tree



<?xml version='1.0' ?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
<xsl:output method='text' omit-xml-declaration="yes" />

<xsl:template match="GEDCOM">
digraph &quot;G&quot; {
<xsl:apply-templates select="FamilyRec"/>
<xsl:apply-templates select="IndividualRec"/>
}
</xsl:template>

<xsl:template match="IndividualRec">
<xsl:value-of select="@Id"/>[ shape=box, label=&quot;<xsl:value-of select="IndivName/GivenName"/><xsl:text> </xsl:text><xsl:value-of select="IndivName/SurName"/> <xsl:if test="DeathStatus=&apos;dead;&apos;">(d)</xsl:if>&quot;

<xsl:choose>
<xsl:when test="Gender=&apos;M&apos;">
,color=blue
</xsl:when>
<xsl:when test="Gender=&apos;F&apos;">
,color=pink
</xsl:when>
<xsl:otherwise>
,color=black
</xsl:otherwise>
</xsl:choose>


];
</xsl:template>

<xsl:template match="FamilyRec">
<xsl:variable name="famId"><xsl:value-of select="@Id"/></xsl:variable>

<xsl:value-of select="$famId"/>[shape=point];

<xsl:if test="HusbFath">
<xsl:value-of select="HusbFath/Link/@Ref"/>-&gt;<xsl:value-of select="$famId"/>;
</xsl:if>

<xsl:if test="WifeMoth">
<xsl:value-of select="WifeMoth/Link/@Ref"/>-&gt;<xsl:value-of select="$famId"/>;
</xsl:if>

<xsl:for-each select="Child">
<xsl:value-of select="$famId"/>-&gt;<xsl:value-of select="Link/@Ref"/>;
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>



Pierre