darcusblog » 2007 » June - geek tools and the scholar

Archive for June, 2007

Endnote and the Case for Zotero

Posted in Uncategorized on June 24th, 2007 by darcusb – 1 Comment

Another year, and another Endnote release. Its developers have announced a new feature: what they call groups. These are basically user-defined folders.

Here’s the thing: Zotero has had this feature from the beginning. Zotero also still goes way beyond Endnote in its support for notes, tagging and so forth.

So let’s see … you can pay a $99 upgrade fee for a generally more limited application, or you can get a superior application for free, and contribute to a genuine movement.

That movement is one where users fully shape the direction of the application. Indeed, most of the people involved in coding or designing Zotero and related pieces are scholars; users who know what they want and need, and want to create something better than existing alternatives like Endnote.

A second principle that flows from the user-led orientation of Zotero is that data must be free.

Finally, a consequence of the open data and open code approach to Zotero is that it creates new opportunities. One simple example is word-processor integration. As a start, one of the Zotero developers added support for Word. Then a Zotero user came along and improved that, and then still another outside developer added equivalent (and compatible) support for OpenOffice (to be released soon). I expect to see a whole ecosystem of similar innovations build up over time.

A few years ago I actually gave up the Endnote ghost. I had been beta testing the first version that ran natively on Mac OS X, and had been really frustrated by the poor quality of what I saw. As I was starting work on a book manuscript, I was finding Word crashing regularly, and I knew it had something to do with Endnote. So I complained, about that and a whole lot of other things.

A project lead actually told me at the time something like “if you don’t like Endnote, use something else.” So I did! Not finding any good alternatives, I asked the question “what would it take to create a better and more open alternative to Endnote?” When enough people start to ask the same question, the answer is something like Zotero.

What is a citation?

Posted in Uncategorized on June 7th, 2007 by darcusb – 10 Comments

As a part of recent Zotero and ODF discussions of citations, we’ve stumbled on a tricky issue that affects a lot of different pieces of the puzzle of robust and reliable citation formatting. The question for today is, what is a citation?

Let’s examine an example rendered in author-date style:

(Doe, 1999:25; see also Smith, 2000)

So how to model this? What to do with the cited page number, and the “see also” bit? How to encode it so that it’s easy to sort and reformat?

One option is to just say that a citation may have one or more references, each of which can contain parameters for cited pages and prefixes or suffixes. Graphically (think in terms of RDF), this might look like:

This is rather complex; probably more than it needs to be.

Another option is to treat the cited page not as a parameter of some reference abstraction, but to more directly encode what the user intends: namely to cite the item fragment itself.

That’s simpler. Hmm …

Come to think of it, I’m not sure it’s correct; the “see also” prefix isn’t really for the source, but rather for the reference to the source. So maybe we can’t throw out the reference abstraction just yet.

Moreover, we’re left with another, even more fundamental, problem. The “see also” prefix in fact marks a different kind of reference. As such, it gets sorted differently within the citation. So if you have a style that says to sort references within the citation according to author-date, it needs to group such references at the end of the citation, sort within that group, and attach a prefix to the entire group.

So a formatting system that fully supports real-world citation styles really ought to understand that this is a different kind of reference; something like:

Admittedly, option 3 would require some cleverness to make this all fully intuitive and natural for the user. It would also likely result in limiting the flexibility users are accustomed to with other applications like Endnote. However, in that case, users are basically forced to handle all this themselves, so it seems a small loss in flexibility is balanced by a fairly big payoff in automation.
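To make the grouping idea a little more concrete, here is a minimal Python sketch of a formatter that treats “see also” references as their own kind. All of the names here (`Reference`, `kind`, `GROUP_ORDER`, `GROUP_PREFIX`) are made up for illustration; this is not how Zotero or ODF actually models citations, just one way the sort-then-prefix behaviour described above might work.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reference:
    """One reference inside a citation (hypothetical model)."""
    author: str
    year: int
    pages: Optional[str] = None  # cited fragment, e.g. "25"
    kind: str = "direct"         # "direct" or "see-also" (assumed labels)


# Assumed style rules: direct references come first, "see also" ones last,
# and the prefix is attached once to the whole group.
GROUP_ORDER = ["direct", "see-also"]
GROUP_PREFIX = {"direct": "", "see-also": "see also "}


def format_reference(ref: Reference) -> str:
    text = f"{ref.author}, {ref.year}"
    if ref.pages:
        text += f":{ref.pages}"
    return text


def format_citation(refs: List[Reference]) -> str:
    parts = []
    for kind in GROUP_ORDER:
        # Sort author-date within each group, not across the whole citation.
        group = sorted(
            (r for r in refs if r.kind == kind),
            key=lambda r: (r.author, r.year),
        )
        if not group:
            continue
        body = "; ".join(format_reference(r) for r in group)
        parts.append(GROUP_PREFIX[kind] + body)
    return "(" + "; ".join(parts) + ")"


citation = [
    Reference("Smith", 2000, kind="see-also"),
    Reference("Doe", 1999, pages="25"),
]
print(format_citation(citation))
```

The user only says what kind of reference each one is; the ordering and the placement of the prefix fall out of the style rules, which is the automation payoff mentioned above.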

OOo: Quality Through Obsolescence?

Posted in General on June 2nd, 2007 by darcusb – Comments Off

Michael Meeks (from an interview) on what many of us have seen for a long time as serious problems within OpenOffice.org:

I would stress that there are people inside Sun that do ‘get it’. People that are open, and helpful, and really good. But there are also a large number who are very traditional, very staid people, particularly in quality assurance. You can’t argue with them, because they’re in their own self-reinforcing world view. They say specifications are necessary for product quality, and you say “That’s fine, but look at the quality. It’s still not very good.” They say more specifications are necessary! The answer is always more of the same, and you can’t argue with that. It leads to obsolescence - quality through obsolescence, is what I like to call it.

Michael notes a lot of progress on the OOo organizational front of late, such as the move to more frequent releases. But clearly the deeper organizational dysfunctions are really, seriously, weighing on the capacity for OOo to innovate. I really hope they don’t slow down implementation of the new metadata support in ODF 1.2. It really has the potential to be a killer innovation opportunity for OOo, but not if it gets delayed for five years by business as usual.

I’m cautiously optimistic, though.

Joost and Metadata

Posted in General on June 2nd, 2007 by darcusb – Comments Off

There’s been a lot of focus on both the promise and likely pitfalls of Joost’s attempt to bring television to the web. But this post emphasizes that all the focus on the multimedia aspect of the effort misses what may well be Joost’s killer feature: its metadata support.

They have a lot of really good people working on building out a killer metadata system based on open standards like RDF, and exploiting open source code. Moreover, they are contributing back to the open source world (witness TripleSoup at the Apache Foundation).

Janko speculates on the goals:

So what can these metadata frameworks be used for? Timestamped comments and tags are certainly one interesting possibility. Combine this with FOAF-like social networking structures, and you got yourself a whole new way to explore TV programming.

Perhaps Joost will be RDF’s first recognized “killer app”?

Rules for RDF Modelling

Posted in Uncategorized on June 2nd, 2007 by darcusb – 2 Comments

I saw this post from one of the Joost developers a while back, but forgot about it. He talks about modeling choices they made in designing a scalable system. Summing up, he concludes by saying:

  • Don’t use RDF collections. Use one-to-many properties that result in “collections” instead.
  • If you need ordering, define the sorting algorithm instead of putting the ordering in your data.
  • If you have (sort-of) one-to-one relationships in your model, and one or both sides of the relationship is identified by a bnode, merge the concepts into one and distinguish using properties.

Intuitively, this seems right to me. But I’m having a hard time solving one particular problem within these constraints: how to represent ordered contributors (authors, editors, and so forth).

I had been leaning towards using collections to represent this. For example:

<http://ex.net/1> dcterms:creator <http://ex.net/2> .

<http://ex.net/2> a foaf:Group ; vcard:sort-string "Doe, Jane; Smith, John" ; b:members ( <http://ex.net/3> <http://ex.net/4> ) .

So I’m trying to wrap my head around how one would both preserve order and avoid collections and sequence properties (the “define the sorting algorithm instead of putting the ordering in your data” bit).
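Here is a small Python sketch of what the “define the sorting algorithm” rule might look like for contributors, mostly to show where it breaks down. The data layout, the names attached to each URI, and the `sort_key` function are all assumptions for illustration; nothing here comes from the Joost post.

```python
# Hypothetical data: contributors hang off the work through a plain
# one-to-many property, so a store would hand them back in no particular order.
contributors = {
    "http://ex.net/4": {"family": "Smith", "given": "John"},
    "http://ex.net/3": {"family": "Doe", "given": "Jane"},
}


def sort_key(person):
    # Assumed sorting rule: family name, then given name.
    return (person["family"], person["given"])


def ordered_names(people):
    """Return display names in the order the sorting rule defines."""
    ranked = sorted(people.values(), key=sort_key)
    return [f'{p["family"]}, {p["given"]}' for p in ranked]


print("; ".join(ordered_names(contributors)))
```

This happens to reproduce the order in the example above because that byline is alphabetical. If the publication actually listed Smith before Doe, no rule computed from the names alone could recover that, which is exactly the problem.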