Is a common framework for library data a dead end?

Is the expectation that libraries should share a common framework for data a utopia that hinders real progress in the sector?

In MARC, libraries had a notionally common exchange format, and it continues to serve the needs of the sector after almost fifty years. Admittedly, it is an exchange format that was co-opted for use as a storage format and broken into several dialects used in different regions and sub-sectors, but it is nevertheless a format that could easily be used to transfer data between providers. Except, that is, when you tried to import data without transformation because you expected fields to be interpreted in the same way, if not across MARC as a whole, then at least within a specific dialect.

“Local flavour” seems a laughable concept, but libraries don’t react well to coercion in the form of validation and using fields for what they’re supposed to be used for. Nor is there a great understanding of normalization, or indeed any standard database stuff. What librarians understand is getting stuff done that is useful for users; oftentimes it seems like a good idea to shoehorn a bit of useful stuff in there — like that course code for the course readings. Sure, it helps the user, but it breaks the data.

Sometimes eager vendors claim that things can be done and try to press data in to fix them; sometimes local system administrators bodge things together without really considering the consequences. In fact, thinking through how “local flavour” can create vendor lock-in, because the data becomes part of a system, should preclude any great flights of fancy on the data front. In a non-networked world, such local fixes made some sense; in the modern context, they don’t.

It is hard to criticise libraries for trying to do things creatively that made the systems more useful, but a lot of the innovation was a response to lacklustre search capabilities in systems provided by vendors saddled with an aging data format. And so the snake swallows its own tail. It is hard to criticise library systems vendors for trying to satisfy the consistency requirements in the legacy format at the same time as struggling with basically bad data.

I’d argue that local flavour is unavoidable and inevitable when people try to provide support for different communities. It is anathema because we have expectations around a “shared format” where each element in the format is usable for every community and every community interprets the elements in the same way. Obviously, there was an expectation that “bibliographic data” is one thing, without reference to the use cases that drive the creation of the data in the first place.

In creating a new format, it seems inevitable that the expectation that bibliographic data is one thing will to some extent persist, at the same time as we try to add space for different communities’ local flavour. Local flavour, however, lies deeper than this.

Local flavour is doing what you need to do with your data in order to get your work done. What each step of that work is concerned with is entirely subjective, person-driven and frankly not something we are ever going to be able to “fix”. Every engineer who has seen some data has seen enough weirdness in it to know that two people’s understandings and applications of field X are rarely congruent except in the most banal cases (and even then you are always prepared to be surprised).

To understand why local flavour is important, you need to understand that data is something that is there to be used. In every other domain, a database is designed to be useful to the company/institution using the data. The expectation that it works for them is central in database design. And databases are and should be in constant flux because uses are constantly changing (see [1]).

Libraries are no different.

While a minimal exchange format is attractive, exchanging a few simple details about items is relatively banal. But basing a data-driven architecture on such an anaemic foundation isn’t going to provide the solution needed to do the job.

In Resource Description Framework (RDF), we finally have a framework that allows data to be expressed in a way that a) doesn’t conflict where sources disagree (i.e. local flavour in interpretation), and b) allows unlimited additional statements to be made about resources (i.e. local flavour in expression). Caught up in all of this, BIBFRAME seems to be aiming to provide a one-stop shop for all terminology regarding what can be said. Even if this is not the intention, having all of the terms controlled by one body (even when there are perfectly acceptable pre-existing terms in well-established vocabularies) does not encourage implementers to look beyond simple field mappings, and thus brings none of the RDF goodness to the table.
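To make the two kinds of local flavour a little more concrete, here is a rough sketch in plain Python. It is not real RDF tooling; it only imitates the idea of statements (subject, predicate, object) that each carry a note of who asserted them, roughly what named graphs do in RDF. The library names, the example.org identifier, the title, the dates and the `local:courseCode` predicate are all invented for illustration; `dcterms:title` and `dcterms:issued` are borrowed from Dublin Core simply as familiar pre-existing terms.

```python
from collections import defaultdict
from typing import NamedTuple


class Statement(NamedTuple):
    source: str     # who asserts this (hypothetical library names)
    subject: str    # the resource being described
    predicate: str  # the property, written as a prefixed name
    obj: str        # the value


# A made-up identifier for a made-up book.
BOOK = "http://example.org/book/1"

statements = {
    Statement("library-a", BOOK, "dcterms:title", "An Example Book"),
    Statement("library-a", BOOK, "dcterms:issued", "1999"),
    # Local flavour in interpretation: library-b disagrees about the date.
    # Both statements are kept; neither overwrites the other.
    Statement("library-b", BOOK, "dcterms:issued", "2000"),
    # Local flavour in expression: library-b adds a statement of its own
    # using a local (hypothetical) predicate, without touching shared fields.
    Statement("library-b", BOOK, "local:courseCode", "ENG101"),
}


def values_by_source(subject: str, predicate: str) -> dict:
    """Collect every asserted value for a property, grouped by who said it."""
    result = defaultdict(set)
    for s in statements:
        if s.subject == subject and s.predicate == predicate:
            result[s.source].add(s.obj)
    return dict(result)


print(values_by_source(BOOK, "dcterms:issued"))
print(values_by_source(BOOK, "local:courseCode"))
```

The point of the sketch is only that the course code lives alongside the shared description rather than being shoehorned into a field meant for something else, and that a consumer can decide which source to trust instead of having the disagreement flattened on import.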

For libraries to be able to move ahead now, they need to understand that they don’t need the help of a library systems vendor; they need the help of a non-library systems vendor who is well versed in the realities of the Semantic Web, RDF and data design.

[1] Monash C. Data model churn | DBMS 2 : DataBase Management System Services [Internet]. [cited 2013 Aug 8]. Available from: http://www.dbms2.com/2013/08/04/data-model-churn/

2 comments on “Is a common framework for library data a dead end?”
  1. There is also tons of library bibliographic data that is not in MARC, never has been, and will probably only ever have a passing acquaintance with Bibframe (or RDA for that matter): repository data, article databases, not to mention bookseller data, DBpedia, etc.

  2. In Germany we have a long tradition of merging bibliographic data by frameworks. The “Gesamtkatalog” project required a governmental framework in the 19th century: the “Preußische Instruktionen” (PI) were born, and with them the “modern librarian” who humbly obeys rules. Librarians here are used to frameworks: PI, RAK, soon RDA. And we had a very exotic MARC dialect, MAB, now stalled.
    Technologies like RDF modeling, but also Bibframe, are fantastic opportunities now for expressing catalog information and for building global library communication. Librarians can’t communicate through MARC formats today any more, because they are hindered by legacy software. The Web plays the dominant role and MARC is not web compatible.
    Bibframe could serve as a “web ticket” for the metadata elements librarians are familiar with. They can hand over Bibframe to IT professionals, instructing software developers how to build catalogs for them, just as MARC was the language once before.
    I agree it’s kind of a bad habit that LoC wants a one-dimensional model in the Web age. But librarians are used to living behind fences and they agree in accepting LoC as the global leader for their professional communication. It’s up to other initiatives to open library data and their services for the web as a database, for example LODLAM projects. I don’t care too much for ILS vendors. They got paid for what they should do for librarians. A dead end? Yes and no. To master the future and to serve their patrons successfully, librarians must learn how to cooperate and transfer their knowledge and their rules to new, enhanced systems on the Web. Bumpy road ahead.
