darcusblog » 2005 » January - geek tools and the scholar

Archive for January, 2005

DocBook Workflow and Commenting System

Posted in Uncategorized on January 30th, 2005 by darcusb – Comments Off

My current workflow uses DocBook NG for a schema, Emacs + nXML mode for authoring and editing, CVS for version control, the eXist XML DB for storing bibliographic records, and my own XSLT stylesheets to bring it all together in formatted documents.

This doesn’t really address another important aspect of the authoring process, though: commenting and revision. Most of my colleagues would use Word with its change-tracking functionality, but I refuse to use Word, so what are the alternatives?

Lars Trieloff and I were recently chatting about the possibility of adding commenting support to a DocBook + SVN workflow. I had thought perhaps integration with Trac—along with some JS magic—might offer some interesting possibilities to do this in a user-friendly way. Now Lars is exploring which tracker to use, with Trac being one option. If you have any thoughts/experiences, pass them on.

I’m personally also interested in the possibilities of integrating a wiki markup language into such a setup. The only example I’m aware of is the system Eric van der Vlist used to author his RELAX NG book, but I don’t know any of the details.


Posted in Uncategorized on January 29th, 2005 by darcusb – Comments Off

Alf Eaton sent me a link to an interesting new Firefox extension called Piggy-Bank. There’s a related Firefox extension called Research Buddy that also has potential.

It is something like a holy grail for academics and researchers to be able to highlight content in a web page, or simply annotate it, while storing complete citation-ready metadata along with it. Nobody has quite done that yet, but these are a start.

I still hate it when I see example after example based on hard-science research workflows. I don’t use CiteSeer, and I rarely encounter BibTeX in my field (more commonly RIS or Refer). I guess it’s no wonder that most bibliographic software poorly supports the social sciences and humanities, given that it’s almost all written by people who come from hard science backgrounds.

RELAX NG vs. XML Schema

Posted in Uncategorized on January 27th, 2005 by darcusb – Comments Off

Over on the MODS list, I’ve been involved in an interesting discussion about design choices in schema development. While I had a strong impression that there were problems with XML Schema quite apart from its complexity, this discussion showed me things are worse in the XSD world than I imagined.

Basic things I take for granted in RELAX NG are simply not possible in XSD. Consider, for example, something as basic as this in RELAX NG:

Name = element name { (Name-Given | Name-Family)+ }
Name-Given = element namePart { attribute type { "given" }, text }
Name-Family = element namePart { attribute type { "family" }, text }

Nope; you can’t do it in XSD. It seems XSD does not allow one to define the same top-level element in different places (never mind that they have different attributes, or that in other cases they will be used in different contexts).

Now, what about this: you want to create a library of definitions and simply include them in various other schemas at will. However, those schemas each have different namespaces. RELAX NG does what you would expect: it allows you to leave the default namespace off of the library. When those definitions are used in a schema, they take on the default namespace of that schema.

Once again: you cannot do this in XSD.
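To make the library idea concrete, here is a minimal sketch in the RELAX NG Compact Syntax. The file names (names.rnc, book.rnc, article.rnc) and the example.org namespace URIs are invented for illustration, and the block shows three separate files one after another. It relies on my understanding that an included file with no default namespace declaration inherits the default namespace of the schema that includes it.

```rnc
# --- names.rnc (hypothetical shared library) ---
# No default namespace is declared here, so these elements take on
# the default namespace of whichever schema includes this file.
Name = element name { (Name-Given | Name-Family)+ }
Name-Given = element namePart { attribute type { "given" }, text }
Name-Family = element namePart { attribute type { "family" }, text }

# --- book.rnc (hypothetical including schema) ---
default namespace = "http://example.org/ns/book"
include "names.rnc"
start = element book { Name+ }

# --- article.rnc (another hypothetical including schema) ---
default namespace = "http://example.org/ns/article"
include "names.rnc"
start = element article { Name+ }
```

In a document validated against book.rnc, the name and namePart elements belong to the book namespace; validated against article.rnc, the same definitions belong to the article namespace, without touching the library.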

How about conditioning validation of a document based on an attribute value? Again, you cannot do it in XSD.
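For comparison, this is roughly what attribute-conditioned validation looks like in the RELAX NG Compact Syntax. It is only a sketch: the personal/corporate distinction and the organization element are made up for the example, not taken from MODS or from the list discussion.

```rnc
# Hypothetical fragment: the content allowed inside <name> depends on
# the value of its type attribute.
start = Name

Name = element name { Personal-Name | Corporate-Name }

# type="personal": one or more given/family name parts
Personal-Name =
  attribute type { "personal" },
  (Name-Given | Name-Family)+

# type="corporate": a single organization name and no name parts
Corporate-Name =
  attribute type { "corporate" },
  element organization { text }

Name-Given = element namePart { attribute type { "given" }, text }
Name-Family = element namePart { attribute type { "family" }, text }
```

Because attributes and child elements are both ordinary patterns, the choice between the two branches is driven by the attribute value: a name with type="corporate" that contains namePart children simply fails to match either branch.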

It’s really quite frustrating to watch how bad design choices of an industry standard limit the way XML designers can approach the problems of schema development. It’s also frustrating to continue to see large organizations like the Library of Congress and Apple invest tons of energy and money into a clearly broken standard, when there’s a better alternative in RELAX NG. I—an amateur developer with just over a year of experience with RELAX NG—can write better, more compliant, XSD schemas by authoring them by hand in the RELAX NG Compact Syntax and converting with Trang than it seems many professionals can working with XSD directly in an expensive commercial editor (XMLSpy). What does that say?

Atom and MODS

Posted in Uncategorized on January 27th, 2005 by darcusb – Comments Off

I’ve been chatting with BibDesk developer Mike McCracken more about using Atom to syndicate MODS content. We’ve concluded that the following is perfectly valid Atom:

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns=""
      version="draft-ietf-atompub-format-04 : do not deploy">
  <head xml:lang="en">
    <author>
      <name>Bruce D'Arcus</name>
    </author>
    <title>Public Space Readings</title>
    <link href="http:://"/>
  </head>
  <entry>
    <category label="topic" term="public space"/>
    <title>Some Article</title>
    <content type="application/xml+annote">
      <div class="bibnotes" xmlns="">
        <div class="summary">
          <p>Some notes with <q cite="doe99a@23-24">a quote that spans <span
          class="pagebreak"/> pages</q>.</p>
        </div>
      </div>
    </content>
    <content type="application/xml+mods">
      <mods ID="doe99" xmlns="http://www.loc.gov/mods/v3">
        <titleInfo>
          <title>Some Title</title>
        </titleInfo>
      </mods>
    </content>
    <link href=""/>
  </entry>
</feed>

I’m liking this idea! There are still some issues to consider though:

  • annotation markup: XHTML or a specific schema? It’s really important to me that annotations be able to encode, at the very minimum, the specific location in a text where a quote comes from (e.g. its page number(s)). So either you do it by hacking XHTML a bit (as I do above), or you write a tailored schema (which I’ve done) but lose the advantages of XHTML. I’d like the solution to be amenable to wiki languages like Textile too.
  • linking: I would like the linking between annotation and MODS record to be self-contained, and not rely (only) on Atom. In my eXist DB, my annotations are stored in a separate collection, but linked to their MODS referent(s) (one can link to more than one).
  • IDs: That suggests—in a net-enabled sharing environment—the need to think again about this issue.
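To give a sense of what a tailored annotation schema might involve, here is a rough sketch in the RELAX NG Compact Syntax. It is purely illustrative and is not the schema mentioned above: every element and attribute name in it (annotation, ref, target, quote, source, pages, pagebreak) is an assumption made up for this example.

```rnc
# Hypothetical annotation schema: names are invented for illustration.
start = Annotation

Annotation =
  element annotation {
    # self-contained links to one or more bibliographic records
    element ref { attribute target { xsd:anyURI } }+,
    element summary { Para+ }
  }

Para = element p { (text | Quote)* }

# A quote records which source it comes from and the page(s) it spans;
# an empty pagebreak element marks where the page turns mid-quote.
Quote =
  element quote {
    attribute source { xsd:anyURI },
    attribute pages { text },
    (text | element pagebreak { empty })*
  }
```

The trade-off is the one described in the first bullet: page locations and record links become first-class and can be validated, but the result is no longer plain XHTML.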

Update: It seems I was wrong about the ability to use multiple content elements in an entry. Still, the XHTML content could be moved to the summary element.

Javascript and XML-HTTP Request

Posted in Uncategorized on January 25th, 2005 by darcusb – Comments Off

I don’t really understand the technical details, but I get the feeling this discussion on the use of Javascript and XML-HTTP Request in the Ruby web app Ta-da list might be useful. The feature is used to provide near-instantaneous GUI changes (for example, adding an entry) without needing to refresh the page, and it works quite nicely. Hmm … I wonder how this could be used in a bib app? Maybe for an editing GUI for CSL files?

Update: Just adding some cool links I found on this:

OmniOutliner 3 and the Power of XML/XSLT

Posted in Uncategorized on January 24th, 2005 by darcusb – 1 Comment

Over the past two years both Microsoft and the OpenOffice community have figured out how to exploit XML and XSLT in the context of GUI productivity suites. Apple most clearly has not.

However, some Apple ISVs are starting to do just that. The best example of this is the new integrated XSLT processing embedded in OmniOutliner 3. OO has had an XML file format for the past couple of years. With OO 3, however, Omni has added:

  1. an XSLT-based export/import plug-in system
  2. in the Pro version, named character styles

So what does this mean in practical terms? It means OO Pro suddenly becomes a decent structured document authoring tool. A case in point is a plug-in I just wrote to convert OO to the S5 XHTML-based presentation system. Instead of harassing Omni to add support for the format, and telling them which kind of styling to support, I just do it myself. I have named styles for emphasis, blockquote, and citation, and map them to appropriate output via the XSLT. You can get it here. It’s written for my own workflow, and is rather spare in features.

Now just imagine what would happen if Apple were to do this throughout the OS?

YABA and Syndicating Bib Data

Posted in Uncategorized on January 21st, 2005 by darcusb – Comments Off

I’ve decided we need a new acronym: YABA. Here’s a nice example, with support for Atom and RSS syndication. However:

  • the feeds don’t contain the bib metadata itself
  • it escapes content as CDATA (I really hope Atom doesn’t make this hack necessary!)
  • do we really need YABA?

In related news, Mike McCracken at the BibDesk project is looking into syndicating MODS data via Atom. I really like this idea.

Update: Just to be clear, part of my point above is that BibTeX is not suitable for syndicating (in the more expansive sense of not just a representation of the metadata, but the metadata itself). It’s neither XML, nor would I call it a modern representation of bibliographic metadata. MODS, on the other hand …

CSS and Printing

Posted in Uncategorized on January 20th, 2005 by darcusb – Comments Off

A really interesting demonstration of how to use CSS to do printing of the sort typically done with XSL-FO. I was skeptical reading through it, but the results are actually quite decent. It’s not like FO has proven itself in the realm of fine typography anyway.

Exploiting CSS and Javascript

Posted in General on January 5th, 2005 by darcusb – Comments Off

I’ve been thinking more about using CSS and Javascript to create more elegant UIs for bibliographic metadata. In short, the problem is how to have a simple GUI that quickly provides a good overview of lots of records, while also allowing the user to focus in on more detailed information such as annotations without needing to load new pages. Zeldman has a link to this really nice use of JS and CSS. The problem is different (images), but there are similar underlying issues.

A couple of months back, I modified this example, and came up with this.

One problem with the approach is that it takes a fair bit of horizontal real estate, particularly if there are multiple author names or very long titles. Does anyone out there with better CSS skills than mine know of any clever CSS solutions to this problem? Maybe somehow shorten long fields?

Another issue is that I wonder if it’d be better to constrain the main table, and only display the notes in a panel below (probably a single record at a time). That could have the advantage of better transferring to smaller GUIs, such as … Dashboard Widgets.

I’d like to start by modifying this demo included with the eXist XML DB.

Update: After ages spent cleaning up some ugly HTML and converting to XHTML, I got the above example working with overflow:hidden. I’d really like to get some advice from some CSS and design gurus though.

Update 2: Johan Kool did what I in turn had done: borrowed code and turned it into something else. I quite like the result.