Twitterati

I'm part of the JISC-funded LinkSphere project at Reading University. They are building an online social media tool to do cross-institutional repository searching and to facilitate research relationships. While the guys there are getting up to speed on programming the first demos, we set our research assistant, Claire Ross, a task: why not write up a paper on how people use social media? And why not do one on Twitter? And why not study how Digital Humanities folks use it within conference settings (thereby giving a nice corpus on which to base the study)?

The results are here: a nice, full paper which has been submitted to a journal for consideration. Let us know if you have any comments!

iPad: Dodging The "Doonesbury Bullet"…


Doonesbury cartoons flayed the Newton to death for its (dreadful) handwriting recognition...

Here's a little story I'd like to share on why I believe Apple's iPad will succeed where Microsoft Windows-based TabletPCs have failed to gain more than a tiny niche-market share. I believe it offers a classic illustration of "geek" versus "consumer" thinking...

I happened to be having an online conversation several months ago with a former colleague who had been one of the Handwriting Recognition (HR) experts on the Microsoft TabletPC team. His view was that Apple would have real trouble launching a Tablet device - because Microsoft holds a number of key patents in the area of handwriting recognition, and he could see no way in which they'd be able to get around those to create a usable device.

It's true that Microsoft has indeed done some great work around handwriting recognition. My wife Tanya wrote the first draft of a 400,000-word book using it, on a Windows TabletPC. It wasn't absolutely perfect. But it was perfectly usable. And I'm sure MSFT has a ton of patents around it.

Apple, of course, has been seriously bitten in the past by handwriting recognition - or lack of it. The Newton was launched in 1993, as the first in a new category of device - the Personal Digital Assistant. Theory was, you'd write on the screen with a stylus, Newton would recognize your writing, and turn it into typed text on the screen. Newton might not do a perfect job at first, but it would "learn" your handwriting as it went along, and rapidly improve.

It was a disaster. Newton's recognition mistakes were so legendary that the widely-syndicated Doonesbury cartoon strip poked fun at them for months, at the end of which time Newton was a laughing-stock. It eventually died a well-deserved death.

I bet Steve Jobs vowed at that time that the company would never again ship a device which depended on handwriting recognition for its success. Anyone on the iPad team who suggested putting it in would be given The Glasgow Farewell. (Pick a window - you're leaving!)

Classic. If there's an obstacle, go around.

To the geeks at Microsoft, though, handwriting recognition was one of The Last Great Problems of Computing - a really interesting and complex area. Lots of languages, too! They tackled it head-on.

And, you know what? By applying brute force, effort, huge investment, and some really smart people - they solved it! I've never tried Windows HR with any language other than English - where it does work really well. But I'm sure it does a great job on other languages, too - even Chinese and Japanese.

And you know what else? It won't matter! Because when you pack a tablet device with enough power to run Windows and Office, do handwriting recognition, full-screen video, and everything else, you end up with a machine that is too thick, too heavy, uses too much power, and runs too hot. And it doesn't help that the hardware manufacturers who're building them all get off on a 16:9 aspect ratio video trip at the same time.

You've built a fleet of Hummers, when the market just wants a Prius...


iPad: It's definitely not a Newton...

Steve Jobs knows how to hook consumers. The first Macintoshes had 128K of RAM, and only a 400K floppy disk - no hard drive at all. But they reset people's expectations of what a computer should look like, how easy it should be to use, and what you could do with one. As they got better, people just kept upgrading, with no resentment. Today, Macintosh laptops and desktops are better than Windows machines. You pay more, for sure. But if you can afford it, it's worth it.

The Windows TabletPC philosophy was: "If we can build it, they will come". It's a valid gamble, if you have deep enough pockets. Occasionally, it even comes off.

Apple's philosophy, on the other hand, is: "First, we get them to come. Then we can take them with us." If you create a market with enough customers, they will tell you what they want next. They'll tell you what's missing. You build on a relationship with a LOT of customers. And you make a LOT of money while you're doing it. And oh, by the way - there's a new business model that goes along with it so you make more money AFTER you've sold the device.

I would not be in the least surprised to find that in a few years there's a high-end iPad that is a powerful computer used for many tasks other than media consumption. I've already speculated elsewhere that we might see a stylus for the iPad sooner rather than later. Apple's website says the iPad's touch-recognition capability is high-precision; so a stylus ought to give a lot more precision than a finger for applications like drawing, for example.

Geeks have been complaining since the launch: "It doesn't do multitasking!" "There's no support for Flash!" "It's not an open environment!"

All I have to say is this: It's a Prius, not a Ferrari - yet. And you'll see plenty on the road...



iPad: No egg freckles on its face...

A four layer model for image-based editions

Perhaps the most iconic sort of project in the literary digital humanities is the electronic edition. Unfortunately, these projects, which seek to preserve and provide access to important and endangered cultural artifacts, are themselves endangered. Centuries of experimentation with the production and preservation of paper have generated physical artifacts that, although fragile, can be placed in specially controlled environments and more or less ignored until a researcher wants to see them. Digital artifacts, on the other hand, have only the most rudimentary preservation procedures, and most require regular care by specialists who must convert, transfer, and migrate them to formats readable by new technologies that are not usually backwards compatible. A new model is required. The multi-layered model pictured here will, we believe, be attractive to the community of digital librarians and scholars, because it clearly defines the responsibilities of each party and requires each to do only what they do best.

Level 1:  Digitization of Source materials

Four-layered model for image-based editions

The creation of an electronic edition often begins with the transfer of analog objects to binary, computer-readable files. Over the last ten years, these content files (particularly image files) have proven to be among the most stable in digital collections. While interface code must regularly be updated to conform to the requirements of new operating systems and browser specifications, text and image file formats remain relatively unchanged, and even 20-year-old GIFs can be viewed on most modern computers. The problem, then, lies not so much with the maintenance of these files as with their curation and distribution. For various reasons (mostly bureaucratic and pecuniary rather than technical), libraries have often attempted to limit access to digital content to paths that pass through proprietary interfaces. This protectionist approach prevents scholars from using the material in unexpected (though perhaps welcome) ways, and also endangers the continued availability of the content as the software that controls the proprietary gateways becomes obsolete. Moreover, these limitations are rarely able to prevent those with technical expertise (sometimes only the ability to read JavaScript code) from accessing the content in any case, and so nothing is gained, and (potentially) everything is lost, by this approach.

More recently, projects like the Homer Multitext Project, the Archimedes Palimpsest, and the Shakespeare Quartos Archive have taken a more liberal approach to the distribution of their content. While each provides an interface specially designed for the needs of its audience, the content providers have also made their images available under a Creative Commons license at stable and open URIs. Granting agencies could require that content providers commit to maintaining their assets at stable URIs for a specified period of time (perhaps 10-15 years). At the end of this period, the content provider would have the opportunity to either renew the agreement or move the images to a different location. The formats used should be as open and as commonly used as possible. Ideally, the library should also provide several formats for each item in the collection. A library might, for instance, choose to provide a full-size 300 MB uncompressed TIFF image, a slightly smaller JPEG2000 image served via a Djatoka installation, or a set of tiles for use by "deep zooming" image viewers such as OpenLayers.
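To make the idea of "several formats per item at a stable URI" concrete, here is a minimal sketch of what a client might do with such an arrangement. Every hostname, path, and naming convention in it is invented for illustration; it is not the layout used by any of the projects named above.

```typescript
// A minimal sketch of resolving a single stable item URI into the several
// derivative formats described above. All URIs and path conventions here
// are hypothetical -- invented purely for illustration.

interface ItemDerivatives {
  master: string;   // full-size uncompressed TIFF
  jp2: string;      // JPEG2000, e.g. served via a Djatoka-style image server
  tiles: string;    // root of a tile pyramid for "deep zooming" viewers
}

// Map a stable base URI onto its (hypothetical) derivative locations.
function derivativesFor(baseUri: string): ItemDerivatives {
  return {
    master: `${baseUri}/master.tif`,
    jp2: `${baseUri}/image.jp2`,
    tiles: `${baseUri}/tiles/`,
  };
}

// A scholar's script can then fetch the archival master directly, without
// passing through any proprietary interface.
async function fetchMaster(baseUri: string): Promise<Blob> {
  const { master } = derivativesFor(baseUri);
  const response = await fetch(master);
  if (!response.ok) {
    throw new Error(`Could not retrieve ${master}: HTTP ${response.status}`);
  }
  return response.blob();
}

// Example (hypothetical URI):
// fetchMaster("https://images.example.org/quartos/Hamlet_Q1_bodley_co1_001")
//   .then(blob => console.log(`Fetched ${blob.size} bytes`));
```

The point of the sketch is simply that, once the URIs are stable and open, any downstream tool (an interface at level 3, a metadata project at level 2, or a scholar's one-off script) can reach the content without the content provider's software sitting in the way.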

Level 2:  Metadata

The files and directories in level 1 should be as descriptive as possible and named using a regular and easily identifiable progression (e.g. "Hamlet_Q1_bodley_co1_001.tif"); however, all metadata external to the file itself should be considered part of level 2. Following Greene and Meissner's now-famous principle of "More Product, Less Process", we propose that all but the most basic work of identifying content should be located in the second level of the model, and possibly performed by institutions or individuals not associated with the content provider at level 1. The equipment for digitizing most analog material is now widely available, and many libraries have developed relatively inexpensive and efficient procedures for the work, but in many cases there is considerable lag time between the moment the digital surrogates are generated and the moment they are made publicly available. Many content providers feel an obligation to ensure that their assets are properly catalogued and labeled before making them available to their users. While the impulse towards quality assurance and thorough work is laudable, a perfectionist policy that delays publication of preliminary work is better suited to immutable print media than to an extensible digital archive. In our model, content providers need not wait to provide content until it has been processed and catalogued.
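As a small illustration of the "regular and easily identifiable progression" idea, here is a sketch that parses a filename like the example above into its apparent components. The field labels are my own guess at what the segments encode (work, edition, holding library, copy, image number), not a documented convention of TILE or the Shakespeare Quartos Archive.

```typescript
// Parse a level-1 filename such as "Hamlet_Q1_bodley_co1_001.tif" into its
// apparent parts. The labels below are an assumed reading of the segments,
// not a documented naming standard.

interface ParsedItemName {
  work: string;       // e.g. "Hamlet"
  edition: string;    // e.g. "Q1"
  library: string;    // e.g. "bodley"
  copy: string;       // e.g. "co1"
  image: number;      // e.g. 1
  extension: string;  // e.g. "tif"
}

function parseItemName(filename: string): ParsedItemName | null {
  const match = filename.match(/^([^_]+)_([^_]+)_([^_]+)_([^_]+)_(\d+)\.(\w+)$/);
  if (!match) return null;
  const [, work, edition, library, copy, image, extension] = match;
  return { work, edition, library, copy, image: Number(image), extension };
}

// parseItemName("Hamlet_Q1_bodley_co1_001.tif")
// => { work: "Hamlet", edition: "Q1", library: "bodley",
//      copy: "co1", image: 1, extension: "tif" }
```

The value of such a progression is that level-2 projects can identify and enumerate level-1 content mechanically, before any formal cataloguing has been done.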

Note also that debates about the proper choice or use of metadata may be contained at this level without delaying at least basic access to the content. By entirely separating metadata from content, we permit multiple transcriptions and metadata sets (perhaps with conflicting interpretations) to point to the same item's URI. Rather than providing, for example, a single transcription of an image (inevitably the work of the original project team, reflecting a particular set of scholarly presuppositions and biases), this model allows those with objections to a particular transcription to generate another, competing one. Each metadata set is equally privileged by the technology, allowing users, rather than content providers, to decide which metadata set is most trustworthy or usable.
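A sketch of what this separation might look like in practice follows. The record shape, the URI, and the sample readings are all invented for illustration and are not drawn from any actual project's data model.

```typescript
// A minimal sketch of level-2 records kept entirely apart from level-1
// content: competing transcriptions simply point at the same stable item
// URI, and it is the user, not the content provider, who chooses between
// them. Record shape, URI, and sample readings are all hypothetical.

interface TranscriptionRecord {
  itemUri: string;      // the stable level-1 URI this record describes
  contributor: string;  // who produced the transcription
  text: string;         // the transcription itself
  note?: string;        // editorial assumptions, disagreements, etc.
}

const records: TranscriptionRecord[] = [
  {
    itemUri: "https://images.example.org/quartos/Hamlet_Q1_bodley_co1_001",
    contributor: "original project team",
    text: "To be, or not to be, I there's the point,",
    note: "Diplomatic transcription; original spelling retained.",
  },
  {
    itemUri: "https://images.example.org/quartos/Hamlet_Q1_bodley_co1_001",
    contributor: "independent scholar",
    text: "To be, or not to be, ay, there's the point,",
    note: "Modernizes the quarto's 'I' to 'ay'.",
  },
];

// Neither record is privileged by the technology; a reader filters by
// whatever criterion they trust.
function transcriptionsFor(itemUri: string): TranscriptionRecord[] {
  return records.filter(r => r.itemUri === itemUri);
}
```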

In my next blog entry I will discuss the next (and final) two layers of this model:  interfaces and user-generated data.

TILE directors begin blogging

Last week, the TILE team held their six-month project meeting in Bloomington, Indiana. At this meeting we further refined the scope of the project and agreed to deliver the following tools by July 2010:

  • An extension of the image markup features of the Ajax XML Encoder (AXE). The extension will feature a newly designed, more user-friendly web interface and will permit editors to link regions of any shape to tags selected from a metadata schema supplied by the editor. Additionally, editors will be able to link non-contiguous regions and specify the relationship between them (a rough sketch of the kind of link record this might produce appears after this list).
  • An automated region-recognizing plugin for AXE that can be modified to recognize regions of any type but which will initially be designed to identify all of the text lines in an image of horizontally-oriented text.
  • A jQuery plugin that permits text annotation of an HTML document.
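
As a rough illustration of the first item above, here is the kind of region-to-tag link record the AXE extension might produce. The field names, shapes, and relationship labels are hypothetical; this is not the actual AXE/TILE format.

```typescript
// A hypothetical sketch of region-to-tag links: regions of arbitrary shape
// on an image, each tagged from an editor-supplied schema, with optional
// links between non-contiguous regions. Not the real AXE/TILE serialization,
// just an illustration of the kind of data involved.

type Point = { x: number; y: number };

interface Region {
  id: string;
  imageUri: string;  // the image the region is drawn on
  shape: Point[];    // polygon vertices: any shape, not just rectangles
  tag: string;       // drawn from the editor's metadata schema
}

interface RegionLink {
  from: string;          // id of the first region
  to: string;            // id of the second, possibly non-contiguous, region
  relationship: string;  // editor-defined, e.g. "continues" or "glosses"
}

// Example: a line of text broken across two pages, marked as one region
// continuing the other.
const regions: Region[] = [
  {
    id: "r1", imageUri: "page001.tif", tag: "line",
    shape: [{ x: 10, y: 40 }, { x: 600, y: 40 }, { x: 600, y: 70 }, { x: 10, y: 70 }],
  },
  {
    id: "r2", imageUri: "page002.tif", tag: "line",
    shape: [{ x: 10, y: 40 }, { x: 300, y: 40 }, { x: 300, y: 70 }, { x: 10, y: 70 }],
  },
];

const links: RegionLink[] = [{ from: "r1", to: "r2", relationship: "continues" }];
```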

Also, in order to better communicate the work of the project to our partners as well as the larger digital humanities community, we have decided to blog weekly about an important issue relating to the project, or to text & image linking in particular. This week, I (Doug Reside) will post a series of articles about a new structural model for multimodal editions. We welcome your feedback.

Emerging technologies and the need to experiment

About a month ago I posted a copy of my report Emerging technologies for the provision of access to archives on Scribd. It’s already edging up towards a thousand reads, so I thought it was time I put a link in from here.

The basic message is we need to experiment and find the spaces both within and between our institutions to foster such experimentation. Is that asking too much? Anyway… read, enjoy, use!