Is the expectation that libraries should share a common framework for data a utopia that hinders real progress in the sector?
In MARC, libraries had a notionally common exchange format, and it continues to serve the needs of the sector after almost fifty years. Admittedly, it is an exchange format that was co-opted for use as a storage format and broken into several dialects used in different regions and sub-sectors, but it nevertheless remained a format that could easily be used to transfer data between providers. Except when you tried to import data without transformation, expecting field interpretations to match: if not across MARC as a whole, then at least within a specific dialect.
“Local flavour” seems a laughable concept, but libraries don’t react well to coercion, whether in the form of validation or of restricting fields to their intended use. Nor is there a great understanding of normalization, or indeed of standard database practice in general. What librarians understand is getting stuff done that is useful for users; oftentimes it seems like a good idea to shoehorn a bit of useful stuff in there, like a course code for the course readings. Sure, it helps the user, but it breaks the data.
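To make the shoehorning concrete, here is a minimal sketch in Python. The tag `987` and the record layout are my invention for illustration; the only real convention assumed is that MARC 21 reserves the 9XX block for local definition, which is precisely why two institutions can use the same tag for entirely different things.

```python
# Hypothetical sketch of the "shoehorning" problem: a course code
# stuffed into a MARC local-use field. Tags in the 9XX range are
# reserved for local definition in MARC 21, so every institution
# is free to disagree about what they mean.

def add_course_code(record, code, tag="987"):
    """Stash a course code in a local 9XX field (the tag is our invention)."""
    record.setdefault(tag, []).append({"a": code})
    return record

record = {
    "245": [{"a": "Introduction to Cataloguing"}],  # title statement
}
add_course_code(record, "LIB101")

# Useful locally: the discovery layer can now link the item to a course.
print(record["987"])  # [{'a': 'LIB101'}]

# But on export, a receiving system that uses 987 for, say, binding
# notes will silently misread the data. The field survives the
# transfer; the meaning does not.
```

The point of the sketch is that nothing here is syntactically invalid, so no validator will catch it: the breakage only appears when a second interpretation of the field meets the first.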
Sometimes eager vendors claim that stuff can be done and press data into service to make it so; local system administrators bodge things together without really considering the consequences. In fact, thinking through how “local flavour” creates vendor lock-in, because the data becomes part of a system, should preclude any great flights of fancy on the data front. Nevertheless, in a non-networked world this made some sense; in the modern context it doesn’t.
It is hard to criticise libraries for trying to do things creatively that made the systems more useful, but a lot of the innovation was a response to lacklustre search capabilities in systems provided by vendors saddled with an aging data format. And so the snake swallows its own tail. It is hard to criticise library systems vendors for trying to satisfy the consistency requirements in the legacy format at the same time as struggling with basically bad data.
I’d argue that local flavour is unavoidable and inevitable when people try to support different communities. It is anathema only because we have expectations around a “shared format” in which every element is usable by every community and every community interprets the elements the same way. There was, obviously, an expectation that “bibliographic data” is one thing, without reference to the use cases that drive the creation of the data in the first place.
In creating a new format, it seems inevitable that we carry over, to some extent, the expectation that bibliographic data is one thing, even as we try to add space for different communities’ local flavour. Local flavour, however, lies deeper than this.
Local flavour is doing what you need to do with your data in order to get your work done. What each stage in this process is concerned with is entirely subjective, person-driven and frankly not something we are ever going to be able to “fix”. Every engineer who has worked with data has seen enough weirdness to know that two people’s understandings and applications of field X are rarely congruent in all but the most banal cases (and even then you should always be prepared to be surprised).
To understand why local flavour is important, you need to understand that data is something that is there to be used. In every other domain, a database is designed to be useful to the company/institution using the data. The expectation that it works for them is central in database design. And databases are and should be in constant flux because uses are constantly changing (see ).
Libraries are no different.
While a minimal exchange format is attractive, exchanging a few simple details about items is relatively banal. But basing a data-driven architecture on such an anaemic foundation isn’t going to provide the solution needed to do the job.
In the Resource Description Framework (RDF), we finally have a framework that allows data to be expressed in a way that a) doesn’t conflict where sources disagree (i.e. local flavour in interpretation), and b) allows unlimited additional statements to be made about resources (i.e. local flavour in expression). Amid all of this, BIBFRAME seems to be aiming to provide a one-stop shop for all terminology regarding what can be said; even if this is not the intention, having all of the terms controlled by one body (even when there are perfectly acceptable pre-existing terms in well-established vocabularies) does not encourage implementers to look beyond simple field mappings, and thus brings none of the RDF goodness to the table.
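The two RDF properties claimed above can be sketched without any RDF library at all, since an RDF graph is at bottom just a set of (subject, predicate, object) triples. The URIs below are illustrative placeholders, not real vocabularies:

```python
# Minimal sketch of why RDF suits local flavour: data is a set of
# (subject, predicate, object) triples, so two sources can say
# different (even disagreeing) things about the same resource, and
# anyone can add further statements without touching a shared schema.
# All URIs here are invented for illustration.

book = "http://example.org/book/1"

library_a = {
    (book, "http://example.org/terms/title", "Moby-Dick"),
    (book, "http://example.org/terms/genre", "Novel"),
}

library_b = {
    (book, "http://example.org/terms/genre", "Adventure fiction"),  # disagrees with A
    (book, "http://example.org/local/courseCode", "LIT205"),        # purely local statement
}

# Merging is set union: nothing is overwritten, nothing conflicts.
merged = library_a | library_b

genres = {o for s, p, o in merged if p.endswith("/genre")}
print(sorted(genres))  # ['Adventure fiction', 'Novel']
```

Both genre statements survive the merge side by side, attributable to their sources, and Library B's course code rides along without requiring Library A's schema to know anything about it. That, rather than any particular vocabulary, is the RDF goodness at stake.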
For libraries to be able to move ahead now, they need to understand that they don’t need the help of a library systems vendor; they need the help of a non-library systems vendor who is well versed in the realities of the semantic Web, RDF and data design.