darcusblog » 2006 » January - geek tools and the scholar

Archive for January, 2006

A Model for Citation Metadata

Posted in Uncategorized on January 28th, 2006 by darcusb – 2 Comments

I’ve been saying for a while that we need a solid and widely accessible bibliographic data model for citations, and finally just decided to write one myself.

The start of the documented version of the RDF schema is here.

I’ve just decided to model the basic classes for now: the sort of thing I’d do anyway if I were writing some web app, or using RDF. One could then just incorporate these classes into other contexts: RSS feeds, Dublin Core-focused RDF, or maybe even into a Django or Rails-based web app model. RDF/OWL provides a nice way to formalize the relationships.

In earlier versions of my thinking on this, I relied much more on the structural approach I’ve been using for citation formatting. So there I’d have “base classes” like “part-InMonograph” and so forth.

But when I got to it, I found this rather limiting, as you don’t always know how an item relates to another item. If you cite a song, for example, you don’t necessarily know that it’s on an album. So I’ve left things fairly flexible.

The primary classes are Agent, Event, Reference, and Collection. The rest of the currently 55 classes are subclasses of those.

One of the nice things about OWL is that not only can I define classes and subclasses, and then annotate them with text, but I can also make statements about how my classes relate to other classes. For people in the library world, the interesting equivalences I’ve drawn here are to the new FRBR RDF vocabulary. I have made the primary biblio:Reference class a subclass of frbr:Manifestation. This would allow descriptions encoded in my more grounded vocabulary to be placed in the context of a wider and more general FRBR view as needed.
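To make the subclass statement concrete, here is a rough sketch in plain Python (standard library only, no RDF toolkit) of how a consumer could follow rdfs:subClassOf links from a specific class up to frbr:Manifestation. The biblio namespace URI and the "Book" subclass are placeholders for illustration, and the FRBR namespace URI is an assumption; none of them are quoted from the schema itself.

```python
# Hypothetical namespace for the biblio vocabulary (placeholder, not the real schema URI).
BIBLIO = "http://example.org/biblio#"
# Assumed namespace for the FRBR RDF vocabulary.
FRBR = "http://purl.org/vocab/frbr/core#"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Subclass statements as (subject, predicate, object) triples.
TRIPLES = {
    # "Book" is an illustrative subclass, not necessarily one of the 55 classes.
    (BIBLIO + "Book", RDFS_SUBCLASS, BIBLIO + "Reference"),
    # The relationship described in the post.
    (BIBLIO + "Reference", RDFS_SUBCLASS, FRBR + "Manifestation"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls by following rdfs:subClassOf."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == RDFS_SUBCLASS and obj not in found:
                found.add(obj)
                todo.append(obj)
    return found


def is_a(cls, target, triples):
    """True if cls is target or a (transitive) subclass of it."""
    return cls == target or target in superclasses(cls, triples)


if __name__ == "__main__":
    print(is_a(BIBLIO + "Book", FRBR + "Manifestation", TRIPLES))
```

A real application would hand this job to an RDF library and an OWL reasoner; the point is only that a description typed with the narrower class can also be read as a FRBR manifestation.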

URIs and My Metadata

Posted in Uncategorized on January 28th, 2006 by darcusb – Comments Off

I actually started work on this a while ago, but given Tim Berners-Lee’s polite request that we all give ourselves URIs, here’s mine: http://purl.org/net/darcusb/info#me. I registered a PURL so that I can keep the URI (and others; see below) consistent even if the actual URL changes.

This is the beginning of moving my metadata to RDF. From that FOAF document, from which I eventually hope to generate my CV automatically, you can see links to further RDF documents, most of which are dedicated to my bibliographic metadata.

Because the metadata is now fairly nicely normalized, it’s more compact and much more consistent than my previous MODS collection. When finishing my book I realized just how much of a mess that collection was, and how awkward it was to try to normalize it. That process is getting close to being done, and I have found RDF to be immensely helpful in doing it. Now all the important entities (authors, publishers, subjects, periodicals, etc.) are full resources with URIs.
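As a rough illustration of what that normalization means in practice, here is a small Python sketch that turns author names repeated as strings across records into shared resources identified by URIs. The base URI, the record layout, and the slug rule are all assumptions made up for the example; this is not the actual conversion I ran on my MODS data.

```python
import re

# Hypothetical base URI for minted resources.
BASE = "http://example.org/data/"


def slug(name):
    """Lowercase a name and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def normalize(records, base=BASE):
    """Replace author name strings with URIs pointing at shared agent resources.

    Returns (agents, normalized_records), where agents maps URI -> description.
    """
    agents = {}
    normalized = []
    for record in records:
        uris = []
        for name in record["authors"]:
            # Collapse stray whitespace so trivial variants map to one resource.
            key = " ".join(name.split())
            uri = base + "agent/" + slug(key)
            agents.setdefault(uri, {"name": key})
            uris.append(uri)
        normalized.append({"title": record["title"], "authors": uris})
    return agents, normalized


if __name__ == "__main__":
    # Made-up sample records with an inconsistently spaced author name.
    sample = [
        {"title": "First Article", "authors": ["Jane  Doe"]},
        {"title": "Second Article", "authors": ["Jane Doe", "John Smith"]},
    ]
    agents, records = normalize(sample)
    print(len(agents), "agents for", len(records), "records")
```

Once every author is a resource, a correction made in one place applies to every record that points at it, which is where the compactness and consistency come from.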

Next step? After cleaning up the metadata, I need to modify CiteProc to take RDF input.

Thought experiment: what would happen to citation practices if all academics had their CVs, including their publications lists, encoded in RDF and available at URIs?

Markup Aesthetics and Standards

Posted in Uncategorized on January 21st, 2006 by darcusb – 5 Comments

From Edd Dumbill, commenting on Apple’s latest standards faux pas:

It’s no longer enough to make your applications and hardware pretty and functional, but the guts that other people get to see must look good too.

This is one reason that people prefer RELAX NG over XML Schema, for instance. Where markup is concerned, it turns out that the excuse “only computers will read it, and we’ll provide tools to generate it” doesn’t cut it. The web’s had a view-source mentality since it started, and the aesthetics of markup matter a great deal.

This left me wondering, though: what is it about the aesthetics of markup that matters so much to so many of us? When people argue about RDF in general, it is typically about just this issue. Likewise, when I express nervousness about using XMP in OpenDocument, that concern is partly about more fundamental modeling issues (which I actually consider paramount), but also about XMP’s insistence on using the uglier aspects of RDF/XML syntax.

Why do aesthetics matter? Perhaps there is a kind of clarity in “beauty”? I know from experience that it is hard to convince time-strapped developers to implement new standards, and that it gets much easier to get them to do it, and to do it right, if the specs and the expected output are clear to them.

The problem with Apple’s repeated standards missteps, BTW, is not only aesthetic; it’s substantive. When applications produce XHTML that just consists of a bunch of dumb divs without any semantic coding, they are dramatically limiting the usefulness of those documents. As a user, I personally get offended when I see this sort of thing. Likewise, when companies like Apple violate basic principles of XML (their photocasting spec abuses namespaces for example), they are making life difficult not only for users, but for other developers. Not good!

Perhaps it’s time for Apple to hire a standards czar, one whose sole job is to vet all application development to ensure that it lives up to larger expectations.

CiteProxy: Networked Citations

Posted in General on January 8th, 2006 by darcusb – Comments Off

Alf Eaton has posted a note about his experiments with CiteProc, and his efforts to adapt it to his own workflow. With his CiteProxy script, Alf has come up with the sort of thing I was envisioning a while back: that in an ideal world, I would never have to manage my own citation metadata, but my tools would simply get that data as needed from the network.

CiteProc assumes all of a document’s metadata will be accessible from a single source: a local flat file, or an online RESTful interface. With CiteProxy, Alf has smartly realized that you can fool CiteProc by just offering it a proxy. So, send it a list of the references you want (using, say, a PubMed ID or DOI) and let the proxy go and find the metadata for you.
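The proxy idea can be sketched in a few lines of Python. To be clear, this is not Alf’s script: the `pmid:` and `doi:` identifier prefixes, the function names, and the stub lookups (which return placeholder records instead of calling any real service) are all assumptions for illustration.

```python
def lookup_pubmed(pmid):
    # Stand-in for a real network lookup; returns a placeholder record.
    return {"source": "pubmed", "id": pmid}


def lookup_doi(doi):
    # Stand-in for a real network lookup; returns a placeholder record.
    return {"source": "doi", "id": doi}


# Map an identifier scheme to the function that knows how to resolve it.
RESOLVERS = {"pmid": lookup_pubmed, "doi": lookup_doi}


def resolve(identifier, resolvers=RESOLVERS):
    """Dispatch one 'scheme:value' identifier to the matching resolver."""
    scheme, sep, value = identifier.partition(":")
    if not sep or scheme not in resolvers:
        raise ValueError(f"unsupported identifier: {identifier!r}")
    return resolvers[scheme](value)


def proxy(identifiers, resolvers=RESOLVERS):
    """Return one metadata record per requested identifier, in order.

    To the formatter this looks like a single metadata source, even though
    each record may come from a different service.
    """
    return [resolve(identifier, resolvers) for identifier in identifiers]


if __name__ == "__main__":
    for record in proxy(["pmid:123", "doi:10.1000/xyz"]):
        print(record)
```

The design choice that matters is the single entry point: the formatter only ever talks to `proxy`, so adding a new source means registering one more resolver and changing nothing else.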

I really think this approach has a lot of promise, and examples like CiteProxy are just the beginning.