18 May 2007

A filter for RSS feeds

Stew has introduced the Google Code project that hosts OpenRevien (the source code of Postgenomic). This gave me the idea to host feedfilter there too: a program for filtering RSS feeds that I wrote quickly a few days ago. My project is hosted at http://code.google.com/p/feedfilter/. It's a small Java program running as a CGI (you don't need Tomcat or anything similar). A CGI parameter carries the expression used to filter the feed.

Examples:

PNAS - RSS feed of Early Edition articles: keep posts containing the word "protein"
feed("http://www.pnas.org/rss/ahead.xml",contains("protein"))


PNAS - RSS feed of Early Edition articles: keep posts NOT containing the words "[IMMUNOLOGY]" or "[geology]"
feed("http://www.pnas.org/rss/ahead.xml",NOT(OR(contains("[IMMUNOLOGY]"),contains("[geology]"))))

Connotea - RSS feed for the tag "bioinformatics": keep posts whose dc:creator is NOT "lindenb"
feed("http://www.connotea.org/rss/tag/bioinformatics",not(equals("dc:creator","lindenb")))

Syntax:

input: feed( <url:string>, <node> )
node:  AND( <node>, <node> )
     | OR( <node>, <node> )
     | NOT( <node> )
     | contains( <qName:string>, <value:string> )
     | contains( <value:string> )
     | equals( <qName:string>, <value:string> )
     | equals( <value:string> )
     | regex( <qName:string>, <java regular expression:string> )
     | regex( <java regular expression:string> )
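
To give an idea of how a grammar like this can be evaluated, here is a minimal Java sketch of a recursive-descent parser for the <node> part. It is not feedfilter's actual source. The class name FilterSketch and the representation of a feed item as a Map from qName to text are hypothetical, and several behaviours are assumptions made for the example: operator names are case-insensitive, the one-argument forms test every field of the item, matching is case-sensitive, regex matches anywhere in the text (find()), and quoted strings have no escape sequences. Fetching the feed and the outer feed(...) wrapper are left out.

```java
import java.util.Locale;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Predicate;
import java.util.regex.Pattern;

/** Hypothetical sketch, not feedfilter's real code: parses a <node> expression. */
public class FilterSketch {
    private final String src;
    private int pos = 0;

    private FilterSketch(String src) { this.src = src; }

    /** Turns an expression into a predicate over one feed item (qName -> text). */
    public static Predicate<Map<String, String>> parse(String expr) {
        FilterSketch p = new FilterSketch(expr);
        Predicate<Map<String, String>> node = p.node();
        p.skipWs();
        if (p.pos != expr.length()) throw new IllegalArgumentException("trailing input at " + p.pos);
        return node;
    }

    private Predicate<Map<String, String>> node() {
        String name = identifier().toLowerCase(Locale.ROOT); // assumption: names are case-insensitive
        expect('(');
        Predicate<Map<String, String>> result;
        switch (name) {
            case "and":
            case "or": {
                Predicate<Map<String, String>> left = node();
                expect(',');
                Predicate<Map<String, String>> right = node();
                result = name.equals("and") ? left.and(right) : left.or(right);
                break;
            }
            case "not": result = node().negate(); break;
            case "contains": result = leaf(String::contains); break;
            case "equals": result = leaf(String::equals); break;
            // assumption: the regex may match anywhere in the text
            case "regex": result = leaf((text, re) -> Pattern.compile(re).matcher(text).find()); break;
            default: throw new IllegalArgumentException("unknown operator: " + name);
        }
        expect(')');
        return result;
    }

    /** Reads one or two quoted strings: (value) tests every field, (qName, value) tests one. */
    private Predicate<Map<String, String>> leaf(BiPredicate<String, String> test) {
        String first = string();
        skipWs();
        if (peek() == ',') {
            pos++;
            String value = string();
            return item -> item.get(first) != null && test.test(item.get(first), value);
        }
        return item -> item.values().stream().anyMatch(text -> test.test(text, first));
    }

    private void skipWs() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    private void expect(char c) {
        skipWs();
        if (peek() != c) throw new IllegalArgumentException("expected '" + c + "' at " + pos);
        pos++;
    }

    private String identifier() {
        skipWs();
        int start = pos;
        while (pos < src.length() && Character.isLetter(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalArgumentException("expected operator name at " + pos);
        return src.substring(start, pos);
    }

    /** Reads a double-quoted string; escape sequences are not handled in this sketch. */
    private String string() {
        expect('"');
        int end = src.indexOf('"', pos);
        if (end < 0) throw new IllegalArgumentException("unterminated string at " + pos);
        String s = src.substring(pos, end);
        pos = end + 1;
        return s;
    }

    public static void main(String[] args) {
        Predicate<Map<String, String>> filter =
            parse("NOT(OR(contains(\"[IMMUNOLOGY]\"),contains(\"[geology]\")))");
        // A made-up item, for illustration only.
        Map<String, String> item = Map.of("title", "[IMMUNOLOGY] A made-up title");
        System.out.println(filter.test(item)); // prints false: the item is filtered out
    }
}
```

Each operator becomes a java.util.function.Predicate, so AND, OR and NOT map directly onto and(), or() and negate(), and a new leaf operator only needs one more case in the switch.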




Pierre

6 comments:

  1. Why not use Yahoo Pipes?
    Classic bioinfo doing too much work.

  2. You're absolutely right! :-)
    1) I forgot about Yahoo Pipes (shame on me!)
    2) I wanted to implement this tiny language with boolean operators for some other applications
    3) I also wanted to test code.google.com

  3. Another option is RSSFilter: http://www.feedforall.com/rssfilter.htm

  4. Good idea, but if you want the "web 2.0" stamp, you could try a different URL scheme that avoids explicit parameters, like feed("http://www.pnas.org/rss/ahead.xml/protein")

    If you are outside a web 2.0 framework, Apache URL rewriting can do that for you.

  5. How does this program work?

  6. I tried using Yahoo Pipes, but it didn't update in a timely manner, and it was difficult to create a simple filter. Now I use http://www.filtermyrss.com
