YOKOFAKUN: graphviz

05 July 2014

Pushed: makefile2graph, creating a graph of dependencies from GNU-Make.

I pushed makefile2graph to https://github.com/lindenb/makefile2graph. It is a standalone C implementation of a program I first wrote in Java in 2012. The program creates a graph of dependencies from GNU Make; its output is a Graphviz dot file or a GEXF/Gephi XML file.

Usage

$ make -Bnd | make2graph > output.dot
$ make -Bnd | make2graph -x > gephi.gexf.xml

Example

Here is the output of makefile2graph for Tabix:

$ cd tabix-0.2.5
$ make -Bnd |make2graph

digraph G {
n1[label="", color="green"];
n2[label="Makefile", color="green"];
n4[label="all", color="red"];
n3[label="all-recur", color="red"];
n23[label="bedidx.c", color="green"];
n22[label="bedidx.o", color="red"];
n9[label="bgzf.c", color="green"];
n10[label="bgzf.h", color="green"];
n8[label="bgzf.o", color="red"];
n27[label="bgzip", color="red"];
n29[label="bgzip.c", color="green"];
n28[label="bgzip.o", color="red"];
n18[label="index.c", color="green"];
n17[label="index.o", color="red"];
n20[label="khash.h", color="green"];
n16[label="knetfile.c", color="green"];
n11[label="knetfile.h", color="green"];
n15[label="knetfile.o", color="red"];
n24[label="kseq.h", color="green"];
n21[label="ksort.h", color="green"];
n13[label="kstring.c", color="green"];
n14[label="kstring.h", color="green"];
n12[label="kstring.o", color="red"];
n6[label="lib", color="red"];
n7[label="libtabix.a", color="red"];
n26[label="main.c", color="green"];
n25[label="main.o", color="red"];
n5[label="tabix", color="red"];
n19[label="tabix.h", color="green"];
n2 -> n1 ; 
n4 -> n1 ; 
n3 -> n1 ; 
(..)
}

That's it
Pierre

25 March 2013

Embedding Pubmed, Graphviz and a remote image in #LaTeX. My notebook.

I'm learning LaTeX. Today I learned how to create a new command in LaTeX.

\newcommand{name}[num]{definition}

"Basically the command requires two arguments: the name of the command you want to create, and the definition of the command". I played with LaTeX and wrote the following three commands:

Embedding a remote picture

The following LaTeX document defines a new command "\remoteimage". It takes 3 parameters: a filename, a URL and some parameters for \includegraphics. If the file doesn't exist, the URL is downloaded and saved as 'file'. The downloaded image is then included in the final LaTeX document.

Note: the LaTeX files must be compiled with --enable-write18 so that LaTeX is allowed to run system calls.

pdflatex --enable-write18 input.tex
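
The "download only if the file is missing" logic behind \remoteimage can be sketched outside LaTeX too. The Python below is only an illustration of that caching idea, not the LaTeX macro itself; the function name fetch_if_missing is an assumption made up for this sketch.

```python
from pathlib import Path
from urllib.request import urlopen


def fetch_if_missing(filename, url):
    """Download `url` into `filename` unless the file already exists.

    Returns True when a download happened and False when the existing
    file was reused (hypothetical helper, mirroring the macro's logic).
    """
    target = Path(filename)
    if target.exists():
        # The image was fetched by an earlier run: nothing to do.
        return False
    with urlopen(url) as response:
        target.write_bytes(response.read())
    return True
```

A second call with the same filename does not touch the network, which is what keeps repeated compilations cheap.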

Result:

External Image /Latex by lindenb

GraphViz Dot

The second LaTeX document works the same way. It defines a command "\graphviz", sends the content of the 2nd argument to Graphviz dot and saves the resulting image before importing it into the LaTeX document.

Result:

GraphViz / Latex by lindenb

Pubmed

The last command defines "\pmid". It needs one Pubmed identifier. It downloads the XML record for this pmid and transforms it to LaTeX with xsltproc and the following XSLT stylesheet:

The LaTeX document includes four pubmed identifiers:

Result:

Pubmed / Latex by lindenb

That's it,

Pierre

21 November 2012

visualizing the dependencies in a Makefile

Update 2014: I wrote a C version at https://github.com/lindenb/makefile2graph.

I've just coded a tool to visualize the dependencies in a Makefile. The Java source code is available on GitHub at: https://github.com/lindenb/jsandbox/blob/master/src/sandbox/MakeGraphDependencies.java. This simple tool parses the output of

make -dq

(here the option '-d' means 'Print lots of debugging information' and '-q' means 'Run no commands') and prints a Graphviz dot file.
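
As a rough illustration of the parsing idea (not the Java or C code linked above), here is a minimal Python sketch. It assumes that the debug output contains lines such as "Considering target file 'x'.", "Pruning file 'x'.", "Must remake target 'x'." and "No need to remake target 'x'."; the exact wording varies between GNU Make versions, so the regular expressions are assumptions. The names parse_make_debug and edges_to_dot are made up for this sketch.

```python
import re

# Assumed line formats; older GNU Make versions open the quote with a backtick.
CONSIDER = re.compile(r"^\s*Considering target file [`']([^']+)'\.")
PRUNE = re.compile(r"^\s*Pruning file [`']([^']+)'\.")
FINISHED = re.compile(r"^\s*(?:Must remake|No need to remake) target [`']([^']+)'")


def parse_make_debug(lines):
    """Return (prerequisite, target) edges in order of first appearance."""
    edges, seen = [], set()
    stack = ["[ROOT]"]  # the target currently being considered is on top

    def add(prereq):
        edge = (prereq, stack[-1])
        if edge not in seen:
            seen.add(edge)
            edges.append(edge)

    for line in lines:
        m = CONSIDER.match(line)
        if m:
            add(m.group(1))
            stack.append(m.group(1))  # its prerequisites come next
            continue
        m = PRUNE.match(line)
        if m:
            add(m.group(1))  # already visited: record the edge, do not descend
            continue
        m = FINISHED.match(line)
        if m and len(stack) > 1 and stack[-1] == m.group(1):
            stack.pop()  # make is done with this target
    return edges


def edges_to_dot(edges):
    """Format the edges as a Graphviz dot digraph."""
    ids = {}
    for edge in edges:
        for name in edge:
            ids.setdefault(name, "n%d" % len(ids))
    out = ["digraph G {"]
    out += ['%s[label="%s"];' % (node, name) for name, node in ids.items()]
    out += ["%s->%s;" % (ids[a], ids[b]) for a, b in edges]
    out.append("}")
    return "\n".join(out)
```

The stack is the whole trick: every "Considering" line pushes a target, so the prerequisite reported next belongs to whatever target is on top.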

Example

Below is a simple NGS workflow:

%.bam.bai : %.bam
 
file.vcf:  merged.bam.bai ref.fa
merged.bam : sorted1.bam sorted2.bam
sorted1.bam: lane1_1.fastq  lane1_2.fastq ref.fa
sorted2.bam: lane2_1.fastq  lane2_2.fastq ref.fa

Invoking the program:

make -d --dry-run | java -jar makegraphdependencies.jar

generates the following Graphviz dot file:

digraph G {
n9[label="sorted2.bam" ];
n3[label="merged.bam.bai" ];
n10[label="lane2_1.fastq" ];
n11[label="lane2_2.fastq" ];
n2[label="file.vcf" ];
n4[label="merged.bam" ];
n6[label="lane1_1.fastq" ];
n8[label="ref.fa" ];
n7[label="lane1_2.fastq" ];
n0[label="[ROOT]" ];
n5[label="sorted1.bam" ];
n1[label="Makefile" ];
n10->n9;
n11->n9;
n8->n9;
n4->n3;
n3->n2;
n8->n2;
n9->n4;
n5->n4;
n2->n0;
n1->n0;
n6->n5;
n8->n5;
n7->n5;
}

The result (here using the Google Chart API for Graphviz):

That's it,
Pierre

30 June 2010

XSLT+NCBI-Taxonomy=Graphviz Dot

The following post was inspired by this question on Biostar.com: http://biostar.stackexchange.com/questions/1549: "lets say I want to know which taxonomic level groups Tribolium castaneum and Drosophila melanogaster. Insects, right? (...) Now lets say I have 10 pairs of such species and I want to see how close & distant they are... How can I do this easily?"
I suggested two solutions, both using an XSLT stylesheet. I then wondered if one could use an XSLT stylesheet to draw a tree of life with the help of Graphviz. The stylesheet I wrote is available at:

http://code.google.com/(...)/taxonomy2dot.xsl

Usage

xsltproc --novalid taxonomy2dot.xsl \
"http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?id=7070,32351,9605,9606&db=taxonomy&retmode=xml" |\
dot -o/home/pierre/file.svg -Tsvg

The main problem with this stylesheet was to create one and only one connection between two nodes, even if this connection was present more than once in the XML file. The trick was to use the XPath axis preceding-sibling:: to check whether the connection had already been printed.
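
The same "print each connection only once" problem can be sketched in Python, where a set of already-seen edges plays the role that the preceding-sibling:: check plays in the stylesheet. This is a swapped-in technique for illustration, not a translation of taxonomy2dot.xsl; the function name and the abridged lineages in the usage example are made up.

```python
def lineage_edges(lineages):
    """Return each parent -> child link once, in order of first appearance.

    `lineages` is a list of root-to-leaf name lists (hypothetical input
    format, chosen for this sketch).
    """
    seen = set()
    edges = []
    for lineage in lineages:
        # Pair every taxon with the one directly below it.
        for parent, child in zip(lineage, lineage[1:]):
            if (parent, child) not in seen:
                seen.add((parent, child))
                edges.append((parent, child))
    return edges


if __name__ == "__main__":
    # Heavily abridged, illustrative lineages (not real NCBI output).
    demo = [
        ["root", "Metazoa", "Insecta", "Tribolium castaneum"],
        ["root", "Metazoa", "Insecta", "Drosophila melanogaster"],
        ["root", "Metazoa", "Homo sapiens"],
    ]
    print("digraph G {")
    for parent, child in lineage_edges(demo):
        print('"%s" -> "%s";' % (parent, child))
    print("}")
```

The shared prefix root -> Metazoa -> Insecta appears in several lineages but is emitted a single time.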

Result

That's it !

Pierre

29 October 2008

EMBL/Strings: find interactors at 2 degrees of separation my notebook.

Thanks (again) to the Life Scientists on FriendFeed, I've discovered the API of STRING8 (STRING 8—a global view on proteins and their functional interactions in 630 organisms, NAR 2008): STRING is a database and web resource dedicated to protein–protein interactions, including both physical and functional interactions.

I've used this API to find the partners of a protein at two degrees of separation. Here is my notebook:
First, download the network for each protein using the HTTP-based API, e.g. http://string.embl.de/api/psi-mi/interactions?identifier=Roxan. The Ensembl identifiers seem to be the least ambiguous ones (e.g. http://string.embl.de/api/psi-mi/interactions?identifier=ENSP00000263243). Note that the whole STRING database is also available for download.

I also wrote a basic XSLT stylesheet transforming the PSI/XML to the Graphviz dot format. The stylesheet is available here: http://code.google.com/p/lindenb/source/browse/trunk/src/xsl/psi2dot.xslt. E.g.:

xsltproc psi2dot.xslt ROXAN.xml | dot -opicture.png -Tpng

Another XSLT stylesheet (psi2sql.xslt) creates the SQL statements that insert one or more PSI files into a MySQL database.

xsltproc  --stringparam temporary "" psi2sql.xslt interaction1.xml | mysql -u login --password=password -D database -N
xsltproc  --stringparam temporary "" psi2sql.xslt interaction2.xml | mysql -u login --password=password -D database -N
xsltproc  --stringparam temporary "" psi2sql.xslt interaction3.xml | mysql -u login --password=password -D database -N

The parameter 'temporary' is an argument for the stylesheet; it tells MySQL not to work with temporary tables.

Two of the tables created (interactor and interaction) are described below:

mysql> desc interactor;
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| id         | int(11)      | NO   | PRI | NULL    | auto_increment |
| pk         | varchar(50)  | NO   | UNI | NULL    |                |
| shortLabel | varchar(255) | YES  |     | NULL    |                |
| fullName   | text         | YES  |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+

mysql> desc interaction;
+----------------+--------------+------+-----+---------+----------------+
| Field          | Type         | Null | Key | Default | Extra          |
+----------------+--------------+------+-----+---------+----------------+
| id             | int(11)      | NO   | PRI | NULL    | auto_increment |
| interactor1_id | int(11)      | NO   | MUL | NULL    |                |
| interactor2_id | int(11)      | NO   | MUL | NULL    |                |
| unitLabel      | varchar(50)  | YES  |     | NULL    |                |
| unitFullName   | varchar(100) | YES  |     | NULL    |                |
| confidence     | float        | YES  |     | NULL    |                |
| experiment_id  | int(11)      | NO   | MUL | NULL    |                |
+----------------+--------------+------+-----+---------+----------------+
7 rows in set (0.00 sec)

And here are the MySQL statements finding the proteins linked to EIF4G1 at two degrees of separation.
First, create a temporary table containing the 2-degree interactions:

create temporary table t1
(
id1 int,
id2 int,
id3 int
);

insert into t1(id1,id2,id3)
select distinct
 P1.id,P2.id,P3.id
from
 interactor as P1,
 interactor as P2,
 interactor as P3,
 interaction as I1,
 interaction as I2
where
 P1.shortLabel="EIF4G1" and
 P3.shortLabel!="EIF4G1" and
 ((P1.id= I1.interactor1_id AND P2.id= I1.interactor2_id) or (P2.id= I1.interactor1_id AND P1.id= I1.interactor2_id)) and
 ((P2.id= I2.interactor1_id and P3.id= I2.interactor2_id) or (P3.id= I2.interactor1_id and P2.id= I2.interactor2_id))
 ;

Remove the simple interactions from the temporary table:

delete t1 from
 t1,
 interactor as P1,
 interactor as P3,
 interaction as I1
where
 ((t1.id1=P1.id and t1.id3=P3.id) or (t1.id1=P3.id and t1.id3=P1.id)) and
 ((P1.id= I1.interactor1_id and P3.id= I1.interactor2_id) or (P3.id= I1.interactor1_id and P1.id= I1.interactor2_id))
 ;

And dump the results:

select
 P1.shortLabel as "Partner1",
 P2.shortLabel as "Partner2",
 P3.shortLabel as "Partner3"
from
 t1,
 interactor as P1,
 interactor as P2,
 interactor as P3
where
 t1.id1 = P1.id
 and
 t1.id2 = P2.id
 and
 t1.id3=P3.id
;

Here is the result:

Partner1    Partner2    Partner3
EIF4G1    ZC3H7B    HMGB1
EIF4G1    ZC3H7B    KCTD12
EIF4G1    ZC3H7B    FGB
EIF4G1    ZC3H7B    GLUD1
EIF4G1    ZC3H7B    PDGFRA
EIF4G1    ZC3H7B    PXN
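
For readers who prefer code to SQL, the three steps above (collect the two-step paths, drop the partners that also interact directly, list the rest) can be sketched in plain Python. This is only an illustration of the same logic on an in-memory list of pairs; the function name and the toy network in the usage example are made up and are not STRING data.

```python
def second_degree(interactions, start):
    """Return sorted (start, middle, far) triples at two degrees of separation.

    `interactions` is an iterable of undirected (a, b) pairs. A protein
    that also interacts directly with `start` is not reported, which
    mirrors the DELETE step of the SQL version.
    """
    neighbours = {}
    for a, b in interactions:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    direct = neighbours.get(start, set())
    triples = set()
    for middle in direct:
        for far in neighbours[middle]:
            if far != start and far not in direct:
                triples.add((start, middle, far))
    return sorted(triples)


if __name__ == "__main__":
    # Toy network with made-up names: A-B, B-C, B-D, A-D.
    toy = [("A", "B"), ("B", "C"), ("B", "D"), ("A", "D")]
    for triple in second_degree(toy, "A"):
        print("\t".join(triple))
```

In the toy network D is reachable through B but is also a direct partner of A, so only C is reported.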

That's it
Pierre

21 October 2008

Javadoc is not enough: java2dot

I just wrote a tiny tool that draws a graph of a Java class hierarchy. The input of the program is a set of jar files and the names of the classes to be displayed.

The source code is available here:

http://code.google.com/p/lindenb/source/browse/trunk/proj/tinytools/src/org/lindenb/tinytools/Java2Dot.java

The information about each class is obtained using the java.lang.reflect API and the classes are dynamically loaded using a URLClassLoader. The output is a DOT file which is then piped into Graphviz dot.
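
The general idea (walk a class hierarchy by introspection and print one DOT edge per "extends" relation) can be sketched in a few lines of Python. This swaps Python's own introspection (the __bases__ attribute) in place of java.lang.reflect and is not the code of Java2Dot; the function name is an assumption.

```python
def hierarchy_to_dot(cls):
    """Return a DOT digraph linking `cls` and its ancestors to their bases."""
    edges = []
    seen = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        if current in seen:
            continue  # reached again through another branch of the hierarchy
        seen.add(current)
        for base in current.__bases__:
            edges.append((current.__qualname__, base.__qualname__))
            todo.append(base)
    lines = ["digraph G {"]
    lines += ['"%s" -> "%s";' % edge for edge in sorted(edges)]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Built-in example: bool derives from int, which derives from object.
    print(hierarchy_to_dot(bool))
```

The output can be piped into dot exactly like the output of the Java tool.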

As an example, here is the usage screen of the program, followed by the command line that was used to create the hierarchy of com.hp.hpl.jena.rdf.model.Model:

java -jar ./java2dot.jar
Pierre Lindenbaum PhD. pindenbaum@yahoo.fr
Java2Dot : Compiled by lindenb on 2008-10-21 at 17:40:52 in /home/lindenb/src/lindenb/proj/tinytools
 -h this screen
 -jar <dir0:jar1:jar2:dir1:...> add a jar in the jar list. If directory, will add all the *ar files
 -r  add a pattern of classes to be ignored.
 -i ignore interfaces
 -m ignore classes iMplementing interfaces
 -d ignore declared-classes (classes with $ in the name)
 -o  output file

 class-1 class-2 ... class-n



java -jar ./java2dot.jar -jar ${JENADIR}/Jena-2.5.6/lib -d com.hp.hpl.jena.rdf.model.Model |\
     dot -Tjpeg -ojenamodel.jpeg

Update: A jar is available here http://lindenb.googlecode.com/files/java2dot.jar.

Pierre

21 March 2007

Geni, Graphviz, Dot & Family Tree.

Geni is a genealogy-related social networking website launched in beta mode in January 2007. As of yesterday, the family tree can be exported as a GEDCOM XML file (alpha version).

The following XSLT stylesheet transforms the GEDCOM file into a Graphviz/DOT input which can be used to generate the family tree.


<?xml version='1.0' ?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
<xsl:output method='text'  omit-xml-declaration="yes" />

<xsl:template match="GEDCOM">
        digraph &quot;G&quot; {
        <xsl:apply-templates select="FamilyRec"/>
        <xsl:apply-templates select="IndividualRec"/>
        }
</xsl:template>

<xsl:template match="IndividualRec">
        <xsl:value-of select="@Id"/>[ shape=box,  label=&quot;<xsl:value-of select="IndivName/GivenName"/><xsl:text> </xsl:text><xsl:value-of select="IndivName/SurName"/> <xsl:if test="DeathStatus=&apos;dead;&apos;">(d)</xsl:if>&quot;

        <xsl:choose>
                <xsl:when test="Gender=&apos;M&apos;">
                        ,color=blue
                </xsl:when>
                <xsl:when test="Gender=&apos;F&apos;">
                        ,color=pink
                </xsl:when>
                <xsl:otherwise>
                        ,color=black
                </xsl:otherwise>
        </xsl:choose>


        ];
</xsl:template>

<xsl:template match="FamilyRec">
        <xsl:variable name="famId"><xsl:value-of select="@Id"/></xsl:variable>

        <xsl:value-of select="$famId"/>[shape=point];

        <xsl:if test="HusbFath">
                <xsl:value-of select="HusbFath/Link/@Ref"/>-&gt;<xsl:value-of select="$famId"/>;
        </xsl:if>

        <xsl:if test="WifeMoth">
                <xsl:value-of select="WifeMoth/Link/@Ref"/>-&gt;<xsl:value-of select="$famId"/>;
        </xsl:if>

        <xsl:for-each select="Child">
                <xsl:value-of select="$famId"/>-&gt;<xsl:value-of select="Link/@Ref"/>;
        </xsl:for-each>
</xsl:template>
</xsl:stylesheet>
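
For comparison, roughly the same transformation can be sketched in Python with the standard xml.etree.ElementTree module. The element and attribute names (GEDCOM, FamilyRec, IndividualRec, HusbFath, WifeMoth, Child, Link/@Ref, IndivName, Gender) are taken from the stylesheet above; the function name is made up, and the "(d)" death marker is left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Same colour convention as the stylesheet; anything else falls back to black.
COLORS = {"M": "blue", "F": "pink"}


def gedcom_to_dot(xml_text):
    """Convert a GEDCOM XML string into Graphviz DOT text."""
    root = ET.fromstring(xml_text)  # expected to be the <GEDCOM> element
    out = ['digraph "G" {']
    for fam in root.findall("FamilyRec"):
        fam_id = fam.get("Id")
        out.append("%s[shape=point];" % fam_id)
        # Each parent points to the family node ...
        for parent in ("HusbFath", "WifeMoth"):
            link = fam.find(parent + "/Link")
            if link is not None:
                out.append("%s->%s;" % (link.get("Ref"), fam_id))
        # ... and the family node points to every child.
        for link in fam.findall("Child/Link"):
            out.append("%s->%s;" % (fam_id, link.get("Ref")))
    for ind in root.findall("IndividualRec"):
        given = ind.findtext("IndivName/GivenName", default="").strip()
        surname = ind.findtext("IndivName/SurName", default="").strip()
        gender = ind.findtext("Gender", default="").strip()
        out.append('%s[shape=box, label="%s %s", color=%s];'
                   % (ind.get("Id"), given, surname, COLORS.get(gender, "black")))
    out.append("}")
    return "\n".join(out)
```

Each family becomes a small point node, so the two parents and all the children meet in one place instead of being linked pairwise.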

Pierre
