darcusblog » 2005 » February - geek tools and the scholar

Archive for February, 2005

XML, Indexing and Query

Posted in Uncategorized on February 27th, 2005 by darcusb – 1 Comment

This is hardly my area, but I came across two projects that seem to overlap. One is Zebra, the high-performance, general-purpose structured text indexing and retrieval engine from Index Data. This is a GPL solution that provides out-of-the-box support for Z39.50 and SRU/W. While I’ve not figured out how to configure it for MODS records, the thing is quite fast on the example documents included in the distribution.

The other example is XmlIndexer, a Mono-based project from Edd Dumbill that makes use of dotLucene to provide a similar sort of functionality. As with Zebra, you specify an index, and then run the tools against it.

I can’t help but wonder why nobody outside of the library world seems to have heard about SRU/W and CQL, though. It would be nice to see these worlds converge a bit more.
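For readers who have not met SRU and CQL, here is a minimal sketch of what a request looks like: an ordinary HTTP GET with the CQL query passed as a URL parameter. The parameter names (operation, version, query, maximumRecords) come from the SRU 1.1 specification; the base URL and the dc.title index are assumptions for illustration, not the address or configuration of any real server.

```python
from urllib.parse import urlencode


def build_sru_url(base_url, cql_query, max_records=10):
    """Return an SRU searchRetrieve URL carrying the given CQL query."""
    params = {
        "operation": "searchRetrieve",   # standard SRU operation name
        "version": "1.1",                # SRU protocol version
        "query": cql_query,              # the CQL query, URL-encoded below
        "maximumRecords": str(max_records),
    }
    return base_url + "?" + urlencode(params)


# Hypothetical local server; substitute the address of a real SRU endpoint.
url = build_sru_url("http://localhost:9999/mods", 'dc.title = "xml indexing"')
print(url)
```

The point is that any tool that can fetch a URL can query a library catalog this way, which is part of why wider adoption seems plausible.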

Google Maps

Posted in Uncategorized on February 15th, 2005 by darcusb – Comments Off

Really interesting discussion of the technology behind the new Google Maps. Jon Udell discusses how it exploits XML and XSLT, and the larger implications for users. Also, another useful link.

Bleeding-Edge Workflow

Posted in Uncategorized on February 10th, 2005 by darcusb – 5 Comments

Over on the OpenOffice bibliographic project user list, we’ve been having a conversation that started with me asking if anyone found the existing bibliographic support adequate. Every single answer was a resounding “no!”

One user complained that he may need to move from his current Linux-only solution to a dual-boot setup with Word and Endnote. Always wanting to encourage people to stay with free software, I told him to contact me off-list if he wanted to try a DocBook-based alternative.

So, here are my suggestions:

1) Get a good XML editor. I use both oXygen and emacs NXML mode. I’d tend to recommend oXygen for new users. It has a free 30-day demo, and is otherwise reasonably priced. Let’s assume oXygen for these instructions.

2) Download the appropriate DocBook DTD or RELAX NG schema. I despise DTD-based toolchains, so I recommend RELAX NG. You want to use v4.4 or the in-development v5, since they both include enhanced citation support. The latter is available here. When you create a new document, use this schema.

3) You need to convert a bunch of references from Endnote? No problem; download these tools.

Export your references from Endnote as Endnote or Refer, and use end2xml to convert them to MODS XML, preferably using the -s and -un options, so that each record becomes a separate file and the output is all Unicode.

4) Next, download and install the eXist XML DB. Create a collection (with the Java client) called mods, and load all your docs. Point your web browser to here and make sure everything works.

5) To be able to process your documents with the citations, you need to download CiteProc.

6) In oXygen, with your example document open, set up a “transformation scenario” that uses one of the stylesheets in the citeproc/xsl/document directory; let’s say dbng-xhtml.xsl. Make sure you choose the “Saxon 8” processor option, and give it a “citation-style” parameter; say “author-year.”

That’s it (OK, it’s a little long!). When you run the transformation, the XSLT processor will go through your document looking for citations, then request all those documents from eXist, and format them.
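To make that first pass concrete, here is a rough sketch in Python of the "look for citations, then work out which records to request" step. This is not CiteProc's code, which is XSLT. It assumes citations are marked up with DocBook biblioref elements whose linkend attribute holds the record key, that the document uses the DocBook 5 namespace, and that each MODS record is stored as key.xml; the collection URL passed in at the end is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

DOCBOOK_NS = "http://docbook.org/ns/docbook"  # DocBook 5 namespace

# A tiny made-up document; the keys doe99a and smith02 are illustrative only.
SAMPLE = """<article xmlns="http://docbook.org/ns/docbook">
  <para>One claim <citation><biblioref linkend="doe99a"/></citation>,
  later extended <citation><biblioref linkend="smith02"/>
  <biblioref linkend="doe99a"/></citation>.</para>
</article>"""


def cited_keys(docbook_xml):
    """Return the unique citation keys in the order they first appear."""
    root = ET.fromstring(docbook_xml)
    keys = []
    for ref in root.iter("{%s}biblioref" % DOCBOOK_NS):
        key = ref.get("linkend")
        if key and key not in keys:
            keys.append(key)
    return keys


def record_urls(keys, collection_url):
    """Map each key to the URL of its record (assumed naming: key.xml)."""
    return [f"{collection_url}/{key}.xml" for key in keys]


keys = cited_keys(SAMPLE)
# Hypothetical address of the mods collection; adjust to your own setup.
urls = record_urls(keys, "http://localhost:8080/exist/rest/db/mods")
print(keys)
print(urls)
```

Each record is requested only once however often it is cited, which is why the keys are de-duplicated before the URLs are built.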

RefDB is another DocBook solution for bibliographic storage and formatting.

PS: If you’re writing a dissertation, think about version-controlling it with cvs or subversion.

DocBook Review System

Posted in Uncategorized on February 3rd, 2005 by darcusb – Comments Off

Lars Trieloff has announced a promising new XML-based DocBook review system. It exploits Subversion, XSLT and Trac to allow user comments to be integrated into version-controlled document source. I may be using this for my next big writing project!

Textile, Citations and Favelets

Posted in Uncategorized on February 1st, 2005 by darcusb – Comments Off

Been chatting with the author of PyTextile about how to implement my desire for citation-related coding for quotes and such in textile. Combining his idea with another from an email I got would give us:

Some notes with ``a quote that 
spans %(pagebreak)% pages'':doe99a#page=23-24
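As a sketch of how a processor might pick that syntax apart, the regular expression below pulls out the quoted text, the citation key and the optional page range. It is not PyTextile code; the function name and the exact grammar (a word-character key, an optional #page=N or #page=N-M suffix) are my own assumptions about the proposal.

```python
import re

# ``quoted text'':key#page=N-M, where the #page part is optional.
CITED_QUOTE = re.compile(
    r"``(?P<quote>.+?)'':(?P<key>\w+)(?:#page=(?P<pages>\d+(?:-\d+)?))?",
    re.DOTALL,  # a quote may span several lines
)


def find_cited_quotes(text):
    """Return a dict with quote, key and pages for each cited quote."""
    results = []
    for match in CITED_QUOTE.finditer(text):
        results.append({
            # Collapse line breaks and repeated spaces inside the quote.
            "quote": " ".join(match.group("quote").split()),
            "key": match.group("key"),
            "pages": match.group("pages"),  # None when no page is given
        })
    return results


sample = (
    "Some notes with ``a quote that \n"
    "spans %(pagebreak)% pages'':doe99a#page=23-24"
)
print(find_cited_quotes(sample))
```

The %(pagebreak)% span is left in the quote untouched, so the usual textile span handling can deal with it afterwards.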

In related news, a cool idea to use XMLHttpRequest to add textile support to arbitrary HTML text areas.