19 February 2008

Freebase Wikipedia Extraction (WEX)

Via the Freebase blog.

The Freebase Wikipedia Extraction (WEX), available at http://download.freebase.com/wex/, is a processed dump of the English-language Wikipedia. The wiki markup for each article is transformed into machine-readable XML, and common relational features such as templates, infoboxes, categories, article sections, and redirects are extracted in tabular form.

Freebase WEX is provided as a set of database tables in TSV format for PostgreSQL, along with tables that map Wikipedia articles to Freebase topics and to the corresponding Freebase Types.
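
Since the tables ship as TSV, they can also be read without loading them into PostgreSQL first. Here is a minimal sketch in Python, assuming the files follow PostgreSQL's default text conventions (tab-separated fields, `\N` for NULL). The function name, the column layout, and the sample rows are made up for illustration and are not taken from the actual dump.

```python
import csv
import io


def read_tsv_rows(stream, null_marker="\\N"):
    """Yield each tab-separated line as a list of fields.

    Fields equal to null_marker become None. The default marker is the
    one PostgreSQL uses for NULL in its text format; whether the WEX
    files use it is an assumption.
    """
    # QUOTE_NONE: treat quote characters as ordinary data, not delimiters.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield [None if field == null_marker else field for field in row]


# Hypothetical rows (id, title, redirect target), invented for this example.
sample = io.StringIO("1\tAlan Turing\t\\N\n2\tTuring\tAlan Turing\n")
rows = list(read_tsv_rows(sample))
```

For a real file you would pass an open file handle instead of the in-memory sample, and check the column order against whatever schema comes with the download.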

1 comment:

Deepak said...


I can't see the body of your posts in my feed reader.

Loving your work with Freebase and feeling quite envious :)