Sunday, 10 November 2013

Making the most of possibilities of digitisation: an #rbscg13 write-up

Rare Books and Special Collections Group Conference (programme (.docx)). The theme this year was digitisation (last year was fundraising and advocacy). The three recurring themes across all of the papers were audience, metadata, and the 'thinginess of things', with some very useful practical advice thrown in. I'm not going to give a blow-by-blow account of every paper, but rather pick out the bits I've been chewing over since.

All three themes were addressed in Simon Tanner's keynote, which adeptly summed up the current situation as well as challenging us to think more imaginatively about the future and shaking us out of complacency about what we're doing now.

Tanner started by using the analogy of an ant mill to describe the fate of too many digitisation projects. He emphasised that it's vital to plan properly, to seriously consider audience and opportunity costs (what aren't you doing if you are doing digitisation instead?), and to keep asking the 'so what' question to keep you focussed on why you're doing what you're doing. He also shared lots of useful resources.

Sian Prosser presented a case study on cataloguing and digitising a comparatively small collection of manuscript fragments. She emphasised that even though there weren't many fragments, a very high level of specialist knowledge was required to describe them well, and that any project of this type is likely to take more time than envisaged.
  • TEI by example. Sian's project used TEI to mark up the descriptions. TEI by example is a set of free online tutorials.
  • Ransom Center Fragments. This is a very useful Flickr site displaying images from a large collection of manuscript fragments at the Harry Ransom Center, University of Texas at Austin.
My favourite paper from the conference was Rowena Willard-Wright on 'Transforming our data for the internet on a tight budget'. Willard-Wright works for English Heritage and described their work on digitising and improving their catalogue, including photographing objects, improving old records, cataloguing from scratch, and creating varied means of access to the records. Her talk really exposed how infuriating and frustrating the problems faced by all cataloguers are. As she spoke I wrote:
Willard-Wright's description of migrating and updating catalogue data is very familiar: data has been lost and garbled in transfers over years. Cataloguing has been and still is viewed as arcane, and a thing not worth funding, because catalogues were and are seen as not for general consumption. I.e. they are perceived as being the exact opposite of their whole point. This problem that has been seen with the English Heritage catalogue is *exactly* what is being exposed as libraries move from traditional OPACs to resource discovery/next-generation systems. And, most infuriatingly, the things that cataloguers have known and have been saying forever (e.g. consistency matters, access points matter) are suddenly being "discovered" as if they're new.
Willard-Wright was talking from the perspective of a museum catalogue, which is in some ways very different to a library catalogue.  Museums don't have such a tradition of the publicly accessible comprehensive catalogue, and write much more descriptive and less codified entries for their objects.

The English Heritage cataloguing project used teams of volunteers with very well-defined tasks. They write clear, concise, engaging, small chunks of description - i.e. entries that conform to the principles of good writing for the web. The volunteers aren't necessarily experts on the objects, and they're writing for audiences who aren't necessarily experts either. However, there's a recognition that the audience may have additional knowledge or stories to share, and for this reason a 'tell the curators something about this' button is being built into the public catalogue. I absolutely love this - it's baffled me for years that so few library catalogues have a 'tell us if there's a mistake' button. Copac is a notable exception. I fear that many libraries don't have one because, if it was ever mentioned in a meeting, someone piped up and said "but think of all the extra work", which likely trumped "think of how handy that will be for our readers, and how useful for us to make use of their knowledge".

That lack of connection to the audience was hammered home for me in another way throughout Willard-Wright's talk.  The museum descriptions are being written for general audiences.  Rare books records contain descriptions that are, frankly, written for librarians.  Not even, really, for most researchers. Yes, we include all sorts of useful information, but we code it up in impenetrable ways, and there's all sorts of information we don't include accessibly.  This has maybe been less of an issue in the past, when catalogue records were only seen by those initiated into our arcane world. But now catalogue records go along with beautiful/intriguing/important digitised books that all sorts of people might want to see, and our gibberish means *nothing*, and doesn't explain any of the basics. (How many records for the first folio show clearly that this is a first folio? Or the Nuremberg Chronicle?)

During Willard-Wright's paper Jill Dye commented that "The only difference between an online catalogue and a digitisation project is adding a photo?", and I think that in one way she's right: it's completely wrong to think that a digitisation project stands apart from cataloguing. However, making materials accessible in any way, but especially if they're freely available online, demands a new attitude to description. We really need to step up our game.

Melissa Terras is director of the UCL Centre for Digital Humanities, and she presented a wide-ranging paper highlighting some of the possibilities of high-end digital imaging. She works on these projects in collaboration with computer scientists and engineers, and they're often adapting techniques already used elsewhere (such as in medical imaging). As well as drawing our attention to current projects, Terras made some important points about the theory and practice. Digitisation means that many more sorts of metadata are needed so that we can properly interpret the images. As Hannah Thomas put it, Terras' work is "not just about creating a surrogate but about using the image to discover new things, inspire new research". Terras made the point that digital images are *not* exact reproductions of originals. Terras asked us to talk to her if we know of collections or items that would benefit from advanced digitisation and imaging work; part of her role is to connect the various people involved.
  • The one resource I'll share is this breathtaking (there were audible gasps in the room) video of the digital flattening of the Great Parchment Book. Watch it. It's amazing.
Alixe Bovey spoke from the academic's perspective, and addressed some of the threats posed by digitisation. She was heavily involved in the campaign to try to prevent the sale of some of the Mendham Collection books last year. Bovey passionately explained that the digital is not the same as the physical, and we all need to communicate this better. With the Mendham sale, the existence of digitised copies of the titles on databases such as EEBO and ECCO was used as justification for the sale, ignoring the copy-specific details of the Mendham copies, as well as the failings of the scans themselves. Earlier digitisations have been particularly lacking. Never mind the poor black and white reproductions of scanned microfilm: they tended, for example, not to include any blank or apparently blank pages in the source copy (see this post), ignored bindings, and made it difficult to determine the original size of the book. But we're not past such difficulties even with the best modern digitisations; they tend not to include scale rules (see this post for difficulties of determining size), and give little indication of other factors such as weight, quality of materials used, or even smell.

Anne Welsh spoke very pragmatically from the point of view of libraries and library staff themselves.  She pointed out that we are continually needing to update and improve what we've done before: both content format and types of description.  She faced the fact that we can't do everything, and used the example of the University of Manchester Library Digitisation Strategy Group's 'Criteria for ensuring value to the Library for partnerships' (pdf link), which considers the value to the library of any potential projects.

Nicolas Pickwoad spoke about one element of early books which is too often overlooked in digitisations: book bindings. Most bookbinding digitisations (Pickwoad mentioned the Uppsala Probok project as an exception) show only beautiful, expensive, fancy and/or fine bindings, turning bookbinding digitisation into "a decorative arts ghetto". This doesn't represent most early book bindings, which are less extravagant, but can tell us a very great deal about the book's history, and may often be the most interesting.

There's also a vicious circle at work: bindings aren't so often described in catalogue records, so scholars can't ask for them, so there's not so much research, so it's not seen as a priority... At least when books are viewed in person, the binding will be seen 'by accident' as it were. If they're not included in digital surrogates they disappear altogether. Like many specialist aspects of digitisation, imaging bindings has special requirements, including lighting that shows structures accurately.
  • I'm keen to keep an eye on Pickwoad's Ligatus project, which is working on guidelines and terms for describing bindings better. It's hoping to develop vocabulary and multilevel descriptions for bookbinding including the ability to record negatives (e.g. 'no clasps'). This is key, because otherwise you just can't tell whether a feature is absent or it's just a bad record.

So all in all, my summary would be that we need to be using better, subtler and more flexible descriptive frameworks and presentational tools to make digitised materials accessible and available to the audiences who want to see them.  Digitisation can help with some, but not all, problems, and we need to advocate loudly for the intrinsic physical value of the things we want to digitise, to try to stem the tide of feeling that a copy is as good as, and entirely replaces, the original.
There was lots of tweeting throughout the conference: