darcusblog » Blog Archive » The Babel of Citations - geek tools and the scholar

The Babel of Citations

I’m prompted to consolidate thoughts I’ve been mulling over for a while by a recent post to the OpenDocument comment list from Alex Brown. In it, Alex correctly observes that “[t]he modelling of bibliographic citations in ODF is totally inadequate for real-world content,” and suggests instead that “ODF must either remove the existing inadequate model, or replace it with a model which is fit-for-purpose; preferably one based on existing actual or de-facto standard.”

I support Alex’s basic analysis, though I come to a somewhat different conclusion that keeps with the spirit of his post. I’d suggest removing the current support AND adding more substantial support via the new RDF-based metadata support coming in ODF 1.2. An RDF vocabulary like bibo heavily reuses existing standards like Dublin Core and FOAF, and adds only those domain-specific types and properties they are missing. It also means native RDF extensibility is included. If a developer needs to encode some data not appropriate to include in bibo or DC or FOAF as a whole, they can simply do so without breaking things.
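To make the reuse-and-extend idea a bit more concrete, here is a minimal sketch in plain Python that describes one article with bibo, Dublin Core, and FOAF terms and then adds an application-specific property alongside them. The example.org URIs, the `ex` namespace, the `readingStatus` property, and the helper function names are all hypothetical, made up for illustration. It says nothing about how ODF 1.2 actually packages metadata; it only shows the shape of the data.

```python
# Namespaces for the vocabularies named above. EX is a hypothetical
# application-specific namespace, invented for this illustration.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BIBO = "http://purl.org/ontology/bibo/"
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
EX = "http://example.org/myapp#"


def iri(value):
    """Wrap a URI in angle brackets, as N-Triples expects."""
    return "<" + value + ">"


def literal(value):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def describe_article(doc, author, title, author_name, year):
    """Return (subject, predicate, object) triples for one article."""
    return [
        (iri(doc), iri(RDF + "type"), iri(BIBO + "Article")),
        (iri(doc), iri(DCTERMS + "title"), literal(title)),
        (iri(doc), iri(DCTERMS + "date"), literal(year)),
        (iri(doc), iri(DCTERMS + "creator"), iri(author)),
        (iri(author), iri(RDF + "type"), iri(FOAF + "Person")),
        (iri(author), iri(FOAF + "name"), literal(author_name)),
    ]


def to_ntriples(triples):
    """Serialize triples one per line in N-Triples style."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


doc_uri = "http://example.org/docs/1"  # hypothetical identifiers
author_uri = "http://example.org/people/jane"

triples = describe_article(
    doc_uri, author_uri, "An Example Article", "Jane Doe", "2009")

# The extension: one extra statement in the application's own namespace.
# Consumers that only understand bibo, DC, and FOAF can ignore it.
triples.append((iri(doc_uri), iri(EX + "readingStatus"), literal("unread")))

output = to_ntriples(triples)
print(output)
```

The point of the sketch is the last statement: the extra property sits next to the standard ones without changing them, so a tool that does not know about it still reads everything else.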

Here’s the thing, though:

Simply having standard support in a document format is not enough; you need to entice developers to actually use it. And the evidence from the OOXML world is that this is not happening. I am, for example, aware of three different third-party bibliographic applications that can work with Word 2007/2008: Zotero, Mendeley, and Endnote. None of them uses the standard OOXML support for citations and bibliographies; all of them use their own custom fields.

The upshot: an absolutely unacceptable Tower of Babel. Users cannot collaborate on their documents because the citation fields are specific to different applications.

So, yes, some guidelines and standards for OOXML and ODF would be valuable. But this is not close to enough. Consider this very practical use case that shows both where this market really needs to be, and how far away it still is:

Jane starts a document using OpenOffice and Zotero, adding citations as she goes. She sends the document to her colleagues who use Word and Mendeley and/or Endnote, and still another who prefers Microsoft’s built-in support. They also add citations, and then send the document back to Jane. Jane can then add still more citations, change the citation style, and everything updates correctly for the final draft.

This use case is impossible to realize right now. Even projects like Zotero that have been built on the principle of openness, and which support both Word and OpenOffice, cannot support different users collaborating on the same document.

So what do we need?

We need application developers to build the APIs that make it easy for third-party developers to use the standard fields and metadata support.

I’m looking at you, Microsoft, where you have a citation and bibliographic API that is not really serious about opening up opportunities for third-party applications.

I’m looking at Apple, whose Pages application has a closed API for use only by Endnote.

I’m looking at OpenOffice, who I hope is successful in contributing towards some of this with the forthcoming RDF support.

I also think third-party projects and vendors like Zotero, Mendeley, and Thomson Reuters need to raise the priority of interoperability. Doing so effectively can also be in their own self-interest, as it could mean fewer development resources poured into maintaining separate processing code bases.

So in short, let’s add richer support to ODF, but let’s also see different developers contribute towards realizing the use case I outline above.

4 Comments

  1. Thomas Zander says:

    I think you will find a much higher uptake if you work with someone that is not a monopoly in their respective field.

    I personally would find funds or manpower and start working on a plugin for KOffice, as I think that will give you a real-world usecase and something that works that doesn’t have a huge political and technical burden to overcome towards completion.

    Think about it :)

    ps. yes, I’m a core KOffice developer.

  2. Bruce D'Arcus says:

    Hi Thomas. Sounds good!

    I guess a practical way to proceed is figuring out how to integrate something like Zotero into KOffice, but in a generic way that would rely on a) standard ODF features (and so generic document encoding), and b) a generic API, so that other applications could similarly integrate with KOffice, and so that a similar (hopefully lightweight) approach could also be implemented in OOo.

    That would go a long way towards enabling the sort of interoperability I suggest we need, at least within the OOo/KWord world.

    The current (Python) code that Zotero uses to integrate with OOo is here.

    BTW, I’m sorry I forgot to mention previous discussions we’ve had on this, which I earlier blogged about.

  3. [...] said, it’d be nice if applications could work on the problem I earlier outlined, as well as imagine server-based solutions that were not centralized (see laconica), so that users [...]

  4. [...] Fenner discusses issues related to a recent post of mine, and concludes, first: Both Microsoft Word and OpenOffice should open up their citation APIs to [...]