darcusblog » 2004 » October - geek tools and the scholar

Archive for October, 2004

Library Applications and Cross-Over

Posted in General on October 26th, 2004 by darcusb – Comments Off

It seems to me the application space for bibliographic software—whether commercial or open source—breaks down into two broad camps. The first is to view the software as a library catalog: as a way to organize physical objects. This is the model for a variety of applications, such as Books, LibDB, Alexandria, and a long list of others I’m too busy to track down right now.

It’s also the model for this new application called Delicious Library, which someone goes so far as to call a “killer app.” I don’t see that; it strikes me as pure gimmick. And pure to open source form, there’s now a virtual copy of the application in the Mono-based project mCatalog

… which brings me to the second category of application: one for the hard-core bibliographer. Here the application does not just catalog physical objects, but all manner of resources, as well as concepts that allow one to begin to understand how they relate to each other. These tools are thus not just for managing stuff, but for dealing with ideas. Finally, they also take on the difficult task of integrating content into documents, and of formatting them according to precise publisher specs.

So different applications for different users. What I’d really like to see is for these worlds to be less distinct; for the general-purpose applications to be able to extend to handle more demanding needs, for example. I’d like to be able to store the complete range of records I need to store, and to easily add a module for bibliographic formatting.

The trick in designing for cross-over lies in a solid general-purpose data model (and good MODS import/export would help), and in a GUI designed for flexibility. Simply assuming books and videos is a recipe for a narrow application.

It strikes me that a project like LibDB is well-positioned for this sort of cross-over because it has a rich data model based on the FRBR, which represents the state-of-the-art in library metadata. Still, the FRBR wasn’t designed with things like journal articles (or songs, or legal cases) in mind, and so there needs to be some work to extend it to cover these sorts of records.
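To make the extension problem concrete, here is a minimal sketch in Python of the four FRBR Group 1 entities (work, expression, manifestation, item), plus one possible way to attach a journal article to the issue that contains it. The `part_of` link, the `host_chain` helper, all field names, and the sample titles are assumptions for illustration; none of them come from LibDB or from the FRBR report itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Work:
    """An abstract intellectual creation (FRBR 'work')."""
    title: str
    creators: List[str] = field(default_factory=list)


@dataclass
class Expression:
    """A realization of a work, e.g. a particular text or translation."""
    work: Work
    language: str = "en"


@dataclass
class Manifestation:
    """A published embodiment of an expression.

    part_of is a hypothetical extension: it lets an article point at
    the journal issue (or a chapter at the book) that contains it.
    """
    expression: Expression
    publisher: Optional[str] = None
    year: Optional[int] = None
    part_of: Optional["Manifestation"] = None
    pages: Optional[str] = None


@dataclass
class Item:
    """A single physical or digital copy of a manifestation."""
    manifestation: Manifestation
    location: str = ""


def host_chain(record: Manifestation) -> List[str]:
    """Return titles from the record up through its containing hosts."""
    titles = []
    current: Optional[Manifestation] = record
    while current is not None:
        titles.append(current.expression.work.title)
        current = current.part_of
    return titles


# Made-up sample records, purely for illustration.
issue = Manifestation(Expression(Work("Journal of Examples 1(2)")), year=2004)
article = Manifestation(
    Expression(Work("An Example Article", ["A. Author"])),
    part_of=issue,
    pages="1-10",
)
print(host_chain(article))
```

The point of the sketch is only that a containment relation has to live somewhere in the model; whether it belongs on the manifestation, the work, or both is exactly the kind of question the extension work would need to settle.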

XSLT Performance

Posted in Uncategorized on October 9th, 2004 by darcusb – Comments Off

Using Saxon on the command line yields a significant performance hit, as the JVM must start up each time. As a result, it’s sometimes hard to judge the performance of your stylesheets. It turns out Saxon 8 has an undocumented command-line switch that allows one to measure performance apart from the startup time. So, use the -3 switch, and the transformation is run three times.

Following are the processing times in milliseconds reported by Saxon in each of three successive runs on the example document included in my bib stylesheet archive:


So, the first run is three times slower than the last, and the last is not too bad! Yes, advocates of xsltproc will note it would probably handle it faster. Still, my stylesheets are doing a lot of work, and XSLT 2.0 support in xsltproc is nowhere on the horizon as near as I can tell.
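The same measurement idea works in any harness, not just with Saxon's switch: run the job several times in one process and report each run separately, so one-off start-up cost lands in the first figure only. A minimal sketch in Python, where `time_runs` and the toy workload are hypothetical stand-ins (Python has no JVM warm-up, so its figures will be closer together; the shape of the measurement is what matters):

```python
import time
from typing import Callable, List


def time_runs(job: Callable[[], object], runs: int = 3) -> List[float]:
    """Run job `runs` times in this process; return each duration in ms."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        job()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


if __name__ == "__main__":
    # Toy workload standing in for a transformation (an assumption).
    timings = time_runs(lambda: sorted(range(200_000), reverse=True))
    for n, ms in enumerate(timings, start=1):
        print(f"run {n}: {ms:.1f} ms")
```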

Citation Style Language

Posted in Uncategorized on October 5th, 2004 by darcusb – 2 Comments

I’m at the point where I really need some input on the citation style language I’ve been developing along with XSLT stylesheets. Current archive is posted here.

I really need input from people with a knowledge of both XML and other citation style languages, either bibtex .bst files or the binary styling files in applications like Endnote or Reference Manager. I also need input in particular from people who regularly use either the numbered or citekey styles common in the hard sciences, or note-based styles common in the humanities.

Examples are included in the archive.

The (annotated) citation style schema is written in RELAX NG. I will ultimately generate an XML Schema version of it, but not now.

The primary reason for this is that RNG allows me to constrain the schema more finely than I can with XML Schema (the class attribute on the root controls validation throughout the rest of the schema in RNG). So, the design of the schema reflects how the underlying XSL code is designed, and indeed how I believe it ought to be designed in any language.

If anyone wants to actually play with the schema, I strongly recommend nxml mode in emacs. It’s simply excellent, and properly represents the schema to the user (not all editors have fully implemented RNG support; for example, Oxygen doesn’t handle it that well, and presents a looser view of the schema to the user than I intend).

Standards-Based XHTML Slide Shows

Posted in Uncategorized on October 2nd, 2004 by darcusb – Comments Off

Showing the power and utter coolness of standards-based XHTML/CSS, Eric Meyer has come up with a cross-platform, cross-browser, slideshow solution that has the following characteristics:

  1. entire presentation contained in a single file, for fast switching between slides
  2. all presentation handled with CSS, which means you get the presentation view onscreen, and a nice outline representation when printed

Here’s a “slide”:

<div class="slide">
<h2>The Advantages</h2>
<ul>
<li>With one file, you get a slide show, a printable outline, and a screen presentation</li>
<li>Files are incredibly lightweight and compress easily</li>
<li>Thanks to their semantic (X)HTML, slideshow files are also highly accessible</li>
<li>New slide themes can be created simply by writing new style sheets</li>
</ul>
</div>

Now, we (I!) need an XSLT to convert Keynote files (and DocBook, or whatever other structured XML) to this format. This would be perfect for posting online versions of class lectures, something my students have been asking for for years. This is the first thing that’s come around that makes me want to do it.
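The target markup is simple enough to generate from any structured source. As a rough illustration, here is a minimal sketch in Python rather than XSLT (a deliberate stand-in, not the stylesheet I have in mind), assuming a slide is just a title plus a list of bullet strings. `render_slide` is a hypothetical helper; the element layout follows the example slide, with the list items wrapped in a `ul`.

```python
import xml.etree.ElementTree as ET
from typing import Iterable


def render_slide(title: str, bullets: Iterable[str]) -> str:
    """Build one <div class="slide"> with a heading and a bullet list."""
    slide = ET.Element("div", {"class": "slide"})
    ET.SubElement(slide, "h2").text = title
    items = ET.SubElement(slide, "ul")
    for bullet in bullets:
        ET.SubElement(items, "li").text = bullet
    # ElementTree escapes &, < and > in text content on serialization.
    return ET.tostring(slide, encoding="unicode")


print(render_slide("The Advantages", ["One file", "Lightweight & accessible"]))
```

A real converter would also have to map the source format's structure (Keynote or DocBook sections) onto slides, which is where XSLT earns its keep.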

Bib Formatting as Web Service?

Posted in Uncategorized on October 2nd, 2004 by darcusb – Comments Off

I have come to believe that innovation in open source bibliographic software must come from adopting standards (and in some cases creating them), and in modularizing the pieces that make the puzzle: data storage and query, online record access, interaction with word-processors and such, and bibliographic formatting. Trying to do everything as a monolithic application is just too difficult.

My stylesheets address the last. One question I’m left with is how to actually integrate the processing into other applications, particularly when those applications may be written in a variety of languages (PHP, C++, Ruby, Python, C, etc.), while there is only one currently available implementation of XSLT 2.0: Saxon, which is written in Java.

So I posted a note to the (excellent) xsl list, and got this suggestion from Saxon author Michael Kay:

Consider implementing the transformation as a web service and invoking it from the client application via HTTP calls. The client need never know that the transformation is done using XSLT, let alone that it’s done using an XSLT processor written in Java.

This is an intriguing idea. In essence, it’d be an XML-based, unicode-enabled, web service version of bibtex for the 21st century. Indeed, all that’s really involved in processing a document with the citations is to send a source document with the embedded MODS records—along with the citation style parameter—to the processor, and get the formatted document (or zip file, or whatever) back.
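From the client's side, such a call could be as small as one HTTP POST. A minimal sketch in Python using only the standard library; the endpoint URL, the `style` query parameter, and both function names are hypothetical, since no such service exists yet.

```python
import urllib.parse
import urllib.request


def build_format_request(
    document_xml: str,
    style: str,
    endpoint: str = "http://localhost:8080/format",
) -> urllib.request.Request:
    """Prepare a POST carrying the source document and a style name.

    The endpoint and the 'style' parameter are placeholders, not a
    real service.
    """
    url = endpoint + "?" + urllib.parse.urlencode({"style": style})
    return urllib.request.Request(
        url,
        data=document_xml.encode("utf-8"),
        headers={"Content-Type": "application/xml; charset=utf-8"},
        method="POST",
    )


def format_document(document_xml: str, style: str) -> str:
    """Send the document and return the formatted result as text."""
    request = build_format_request(document_xml, style)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

Nothing here depends on the server being Java or XSLT, which is the attraction of the approach: a PHP or Ruby client would make the same request.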

Any thoughts on how well this might work with traditional desktop applications? Could a demo of such a service be hosted on Sourceforge (I have a project set up; just haven’t announced anything yet)? Anyone out there interested in trying to implement this? I have no Java skills, and no time to learn.