News Air: 2010-12-12

My views on TreeBASE are pretty well known. Lately I've been thinking a lot about how to "fix" TreeBASE, or indeed, move beyond it. I've made a couple of baby steps in this direction.

The first step is that I've created a group for TreeBASE papers on Mendeley. I've uploaded all the studies in TreeBASE as of December 13, 2010. Having these in Mendeley makes it easier to tidy up the bibliographic metadata, add missing identifiers (such as DOIs and PubMed ids), and correct citations to non-existent papers (which can occur if, at the time the authors uploaded their data, they planned to submit their paper to one journal, but it ended up being accepted in another). If you have a Mendeley account, feel free to join the group. If you've contributed to TreeBASE, you should find your papers already there.

The second step is playing with CouchDB (this year's new hotness), exploring ways to build a database of phylogenies that has nothing much to do with either a relational database or a triple store. CouchDB is a document store, and I'm playing with taking NeXML files from TreeBASE, converting them to something vaguely usable (i.e., JSON), and adding them to CouchDB. For fun, I'm using my NCBI to Wikipedia mapping to get images for taxa, so if TreeBASE has mapped a taxon to the NCBI taxonomy, and that taxon has a page in Wikipedia with an image, we get an image for that taxon. The reason for this is that I'd really like a phylogeny database that is visually interesting. To give you some examples, here are trees from TreeBASE (displayed using SVG), together with thumbnails of images from Wikipedia:

Everything (tree and images) is stored within a single document in CouchDB, making the display pretty trivial to construct. Obviously this isn't a proper interface, and there are things I'd need to do, such as ordering the images so that they match the placement of the taxa on the tree, but at a glance you can see what the tree is about. We could then envisage making the images clickable so you could find out more about that taxon (e.g., text from Wikipedia, lists of other trees in the database, etc.).
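As a rough sketch of the "one document per tree" idea (not the code behind the examples above), the Python below flattens a NeXML string into a single JSON-ready dictionary holding taxa, image URLs, nodes, and edges. The field names, the tiny sample file, and the `image_for_label` dictionary (standing in for the NCBI to Wikipedia mapping) are all assumptions made for illustration, and it assumes one tree per file.

```python
import json
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def nexml_to_doc(nexml_text, doc_id, image_for_label):
    """Flatten a NeXML string into one JSON-ready dict.

    image_for_label is a hypothetical {taxon label: image URL} dict,
    standing in for an NCBI-to-Wikipedia lookup.
    """
    doc = {"_id": doc_id, "taxa": [], "nodes": [], "edges": []}
    for elem in ET.fromstring(nexml_text).iter():
        name = local_name(elem.tag)
        if name == "otu":
            label = elem.get("label")
            doc["taxa"].append({
                "id": elem.get("id"),
                "label": label,
                "image": image_for_label.get(label),  # None when no image is known
            })
        elif name == "node":
            doc["nodes"].append({"id": elem.get("id"), "otu": elem.get("otu")})
        elif name == "edge":
            doc["edges"].append({"source": elem.get("source"),
                                 "target": elem.get("target")})
    return doc


# Made-up two-taxon example, not a real TreeBASE file.
SAMPLE = """<nex:nexml xmlns:nex="http://www.nexml.org/2009" xmlns="http://www.nexml.org/2009">
  <otus id="otus1">
    <otu id="t1" label="Homo sapiens"/>
    <otu id="t2" label="Pan troglodytes"/>
  </otus>
  <trees id="trees1" otus="otus1">
    <tree id="tree1">
      <node id="n1"/>
      <node id="n2" otu="t1"/>
      <node id="n3" otu="t2"/>
      <edge id="e1" source="n1" target="n2"/>
      <edge id="e2" source="n1" target="n3"/>
    </tree>
  </trees>
</nex:nexml>"""

if __name__ == "__main__":
    images = {"Homo sapiens": "http://example.org/homo.jpg"}  # placeholder URL
    print(json.dumps(nexml_to_doc(SAMPLE, "example-tree", images), indent=2))
```

The resulting JSON could then be stored with an HTTP PUT to the database URL followed by the document id, which is how CouchDB accepts new documents. Because taxa, images, and tree topology travel together, a page displaying the tree needs only that one fetch.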

We could expand this further by extracting geographical information (say, from the sequences included in the study) and making a map, or eventually a phylogeny on Google Earth (see David Kidd's recent "Geophylogenies and the Map of Life" for a manifesto, doi:10.1093/sysbio/syq043).

One of the big things missing from databases like TreeBASE is a sense of "fun", or serendipity. It's hard to find stuff, hard to discover new things, make new connections, or put things in context. And that's tragic. Try a Google image search for treebase+phylogeny:

Call me crazy, but I looked at that and thought "Wow! This phylogeny stuff is cool!" Wouldn't it be great if that's the reaction people had when they looked at a database of evolutionary trees?

Journal	Rights
PLoSOne	Embedded RDF, e.g. <license rdf:resource="http://creativecommons.org/licenses/by/2.5/" />
Nature Communications	<meta name="access" content="Yes" /> for open, <meta name="access" content="No" /> for closed
Systematic Biology	<meta name="citation_access" content="all" /> for open, this tag missing if closed
BioOne	Nothing for article, Open Access icon next to open access articles in table of contents
BMC Evolutionary Biology	<meta name ="dc.rights" content="http://creativecommons.org/licenses/by/2.0/" />
Philosophical Transactions of the Royal Society	<meta name="citation_access" content="all" /> for open access
Microbial Ecology	No metadata (links and images in HTML)
Human Genomics and Proteomics	<meta name ="dc.rights" content="http://creativecommons.org/licenses/by/2.0/" />
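
The tag patterns in the table could be checked mechanically. Here is a minimal sketch using Python's standard `html.parser`; the class and function names are hypothetical, and treating any Creative Commons licence URL as "open" is my assumption (the table only lists CC BY examples). A result of `None` means no signal was found, which covers both "closed but unmarked" journals such as Systematic Biology and journals with no metadata at all.

```python
from html.parser import HTMLParser

CC_PREFIX = "creativecommons.org/licenses/"


class AccessSniffer(HTMLParser):
    """Collects open-access hints from <meta> and <license> tags."""

    def __init__(self):
        super().__init__()
        self.is_open = None   # True = open, False = closed, None = no signal
        self.license = None   # licence URL, when one is stated

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta":
            name = (a.get("name") or "").lower()
            content = a.get("content") or ""
            if name == "access":
                # Nature Communications style: "Yes" or "No"
                self.is_open = content.lower() == "yes"
            elif name == "citation_access" and content.lower() == "all":
                # Systematic Biology / Phil. Trans. style
                self.is_open = True
            elif name == "dc.rights" and CC_PREFIX in content:
                # BMC style: licence URL in dc.rights
                self.is_open, self.license = True, content
        elif tag == "license":
            # Embedded RDF style: <license rdf:resource="..."/>
            url = a.get("rdf:resource") or ""
            if CC_PREFIX in url:
                self.is_open, self.license = True, url


def sniff_access(html):
    """Return (is_open, license_url) for a page's HTML source."""
    parser = AccessSniffer()
    parser.feed(html)
    parser.close()
    return parser.is_open, parser.license


if __name__ == "__main__":
    print(sniff_access('<meta name="access" content="No" />'))
```

Journals that signal access only with an icon or image (BioOne, Microbial Ecology in the table) cannot be detected this way, which is rather the point of the table.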
