I have had the great pleasure of attending the second Linked Open Data in Libraries, Archives and Museums (LODLAM) summit in Montréal, Canada. This is a quick summary of my impressions from the summit.
For those unfamiliar with the LODLAM summit, it is an unconference on the semantic Web and has been held twice (I was also lucky enough to attend the previous LODLAM). The structure of the summit is largely open and decided along the way. The actual format used is Open Space Technology [beware, the Wikipedia article reads like a sales pitch for this method], which the LODLAM main man, Jon Voss, adapts by specifying a few ground rules. The main point of this method is that it makes it possible to create sessions ad hoc and discuss the things people feel like discussing.
The unconference methodology used is intended to facilitate and encourage dynamic, relevant sessions, and it succeeds in this to some extent. A traditional conference runs the risk of being irrelevant because its topics are typically decided many months in advance; unconferences have risks of their own.
One important factor for LODLAM is that it is international, and this presents a number of challenges that are quite difficult to address. It's quite apparent to me that the unconference format used here suits confident people, and cultures where directness is viewed positively. If you abide by principles of indirectness to maintain polite discourse, you either have to suspend them and join in, or be marginalized.
One of the things that helps is moderation, but since the people proposing the sessions aren't (rightly) obliged to moderate, a moderator doesn't always simply volunteer unrequested. On several occasions, I was reminded of some of the less productive meetings I have been involved in.
That said, the really great thing about LODLAM was the people. I'll give no special mentions here, but it was really very nice to meet everyone: such a positive atmosphere, with people who are passionate about what we do. It's always nice to hook up with old friends and make new ones.
Won battles & the roadmap
In terms of what has been achieved, we saw several examples of the work that has been done, and it's clear that a lot of development has happened over the last two years. Back in 2011, most people had either produced datasets or were in the process of publishing them. Moving forward to 2013, some have produced fully linked-data-driven sites while others have published datasets, so we're all on the same page now.
Comparing the situation in 2011 with the current situation, there is a lot more knowledge about linked data and the techniques that can be used to produce linked data sites. At the same time, I wonder to what extent we really have moved on so much. I get the pervading impression that people still see RDF as a database, and that the job of publishing data is largely one of representing what is there. We still hear the word "record", and we still hear complaints about issues that belong somewhere outside the space of the RDF and Web stack.
I have also noticed a rise in confidence in the domain with a bitter twist of arrogance; while we were all happily learning together before, there's now a trend towards this approach or that approach that seems exclusive. It amused me to hear that there are "solved problems" within linked data, particularly on the technical side. It's odd because this really is not the case; there are so many fundamental issues at play that it's difficult for me personally to understand how anyone could imagine that anything here is hard and fast. By way of example, I mentioned at one point that one could use technologies to match links, and heard that this was uninteresting and a solved problem. My point for discussion was that the issue at stake isn't getting the links, it's what is being linked. "A solved problem" on the application side typically results in having to "eat what you're given", and at the moment, we're very much at the level of simplistic type and term matching.
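To make concrete what I mean by "simplistic term matching": the typical link-generation approach boils down to something like the sketch below, where candidate owl:sameAs links are proposed purely because normalised labels coincide. All URIs and labels here are invented for illustration; real matchers add type constraints and scoring, but the principle is the same.

```python
# A minimal sketch of naive label matching between two toy datasets
# (URI -> label). Every URI and label below is made up for illustration.

def normalise(label):
    """Lower-case and strip punctuation so 'Ibsen, Henrik' matches 'ibsen henrik'."""
    cleaned = "".join(c for c in label.lower() if c.isalnum() or c.isspace())
    return " ".join(cleaned.split())

def match_by_label(ours, theirs):
    """Return candidate owl:sameAs pairs wherever normalised labels coincide."""
    index = {}
    for uri, label in theirs.items():
        index.setdefault(normalise(label), []).append(uri)
    links = []
    for uri, label in ours.items():
        for candidate in index.get(normalise(label), []):
            links.append((uri, "owl:sameAs", candidate))
    return links

ours = {"http://example.org/person/1": "Ibsen, Henrik"}
theirs = {"http://other.example/auth/42": "ibsen, henrik",
          "http://other.example/auth/43": "Hamsun, Knut"}
print(match_by_label(ours, theirs))
```

The point of the sketch is how little it knows about *what* is being linked: two labels agreeing says nothing about whether the things themselves are the same, which is exactly the unsolved part.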
One of the things I react to when discussing things with people outside, say, Libris and the ex-Talis community is that there isn't much development beyond this. Libraries, archives and museums have done a lot of work on semantics, but the road ahead is still long. My group are by no means experts here, but we do have the benefit of having worked exclusively with RDF since 2009, having published several large datasets, and having created a cataloguing workflow and linked-data-driven site in 2010.
We got seriously burned by relying on a technology platform that collapsed when the company running it went under, and this reliance on "semantic tech" showed us how naïve we'd been. We'd concentrated on creating nice, semantified records for search and presentation on the Web. Content negotiation supplied machine and human views, and yet, when we discovered that we needed functions that weren't part of the package we had, we had to resort to costly workarounds to get things to work.
I suppose that my personal view, and possibly that of the team I work with, is that the semantic layer replaces the old database layer in applications, but with the obvious caveat that this isn't database technology. If you want a database with database functionality, get a database, not a triplestore. At the same time, if you're using linked data, it should be obvious that the most important aspect is HTTP, followed by the schemaless nature of RDF. The rest isn't particularly interesting right now, as the tools just aren't there and probably won't be for quite some time (here I'm referencing discussions about reasoning and NLP).
I suggested a session intended to discuss the nature of interfaces for linked-data cataloguing, that is to say, how we want to present ways of creating triples in a pure RDF/linked-data workflow for registering things. The session was good from my point of view, as we had some time to look at the issues at the core of creating RDF, and I think everyone had come across the same issue: the management of RDF isn't trivially solved by any current solution. In other discussions, I have been reprimanded for saying this because, d'oh, "NAMED GRAPHS", while on closer inspection this doesn't solve very much at all, and does so very inelegantly (except if you think of RDF not as statements but as collections of statements that form a record, which is also inelegant). Here, we seemed to suspend disbelief for a second and look at the ways we can actually generate linked data intelligently. It's really good to talk to savvy cataloguers who know current standards and interfaces well; we're not exactly spoilt for interfaces in Norway, so the discussion (which I have jotted down) was particularly healthy for me!
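To show why I find the named-graph answer inelegant rather than wrong, here is the idea reduced to its essentials, with no triple store involved: each statement becomes a quad (subject, predicate, object, graph), and a "record" is simply every quad sharing the same graph URI. The class and all URIs below are my own invention for illustration.

```python
# The named-graph "record" workaround in miniature: triples are tagged with
# a graph URI, and a record is reassembled by filtering on that URI.

class QuadStore:
    def __init__(self):
        self.quads = []

    def add(self, s, p, o, graph):
        """Assert one triple inside a named graph."""
        self.quads.append((s, p, o, graph))

    def record(self, graph):
        """Reassemble the 'record': all triples asserted in one named graph."""
        return [(s, p, o) for s, p, o, g in self.quads if g == graph]

store = QuadStore()
g = "http://example.org/record/1"
store.add("http://example.org/book/1", "dct:title", "Peer Gynt", g)
store.add("http://example.org/book/1", "dct:creator", "http://example.org/person/1", g)
print(store.record(g))
```

Notice that the graph URI is doing none of the semantic work: it's just a bucket label, which is why it manages RDF without really solving the management problem.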
I guess that the next session I attended was an eye-opener: it was suggested by a researcher and concerned using PDFs as a delivery platform for research using linked data. The idea appealed to me, as I have used XMP since 2006 and I'm a fan of embedded metadata. However, the idea in question stems from a research paper suggesting that PDF attachments should be used. I suspect that a lot of the issues here are covered by Research Objects, so I won't go into detail. What I took away was that every research case takes its own perspective, and we can't, via a standard interface, account for the needs of every researcher.
After a delicious lunch, I attended a really interesting session on linked-data interfaces. The thought-provoking part is that we're still largely discussing record presentation and mashups of records. It triggered me into thinking that linked data is a machine interface, and that any human interface needs to relate to it in a very abstract way. In truth, many of us do this quite explicitly, and most of the people I talked with were using some indexing method to abstract away from RDF. The discussion was both lively and worthwhile; it was good to hear about the challenges and solutions people are finding.
A session on BIBFRAME followed, which was a tad disappointing; to be honest, I follow the mailing list and have heard it all before. While I feel Kevin Ford did a good job of trying to answer the questions posed, there's only so much that can be said right now. I had a bit of a rant at this point (I do actually care), which I guess opened a bit of a floodgate, as Corey Harper also vented. BIBFRAME is such a missed opportunity to do some real good and seed real change in the way we work with bibliographic data; it's missed because it is not really going about this transition in the right way (playing well with others), instead trying to keep things within our own domain. The real fear is that this won't attract new players to our domain, and won't inspire existing developers to do anything new. Advice: get back to the HTTP and RDF of things.
End of day one was celebrated on multiple terraces with multiple beers and most excellent company, particularly from New York. During the discussions, we touched upon the idea that LODLAM was a venue for vapourware: software that is hyped but never comes to fruition. I'm sad to agree. We have a fixation on having running code, but when the running code isn't really running, and doesn't really work as we claim, it's all spin. Give that stuff to your managers, but keep it out of the conferences.
Day two was a very pleasant day, during which I sat with a guy from Florida and we looked at his data. It's really nice to see someone who can sit and do the same kind of work we do, quickly and effortlessly, and take on board the bits of advice that can be gleaned from the group. The productive side was that we got things working, but we also got a nice view of the way he thought about collections in his domain (museums), which harked back to the session on interfaces: what we present to the public shouldn't be record by record (that should be a banal thing to do anyway), but a meaningful aggregation of content. An eye-opener made possible by looking at data through someone else's eyes; you really don't get that in other fora.
After this, I attended the final content session, on the future of the conference, and it was nice to see that the level of engagement hadn't sunk even after so much jetlag and sleep deprivation. I would attend a regional LODLAM for the Scandinavian countries. We couldn't invite the Finns, of course, because they're too good at this stuff 😉
The closing session was a short affair, and featured a standing ovation for Jon Voss — a really nice guy. (Thanks Jon!)
My feelings after the event are a bit mixed; I'm very glad to have had the chance to meet so many engaged and engaging people, but I don't really feel that we've moved on. I desperately want to see the next generation of stuff that we glimpse in LIBRIS XL and that I'm struggling with in the back-end of our systems.
I spoke with Richard Wallis, and I think he's right that there's still a lot of Web to be learnt before the semantics we're doing becomes the semantic Web.
It was worth going to LODLAM, because of the ideas around interfaces, collections and abstractions, and the networking that can be done. LODLAM is undeniably a place that is most accessible to North Americans, and it has been a great venue for meeting people I know from Twitter and elsewhere, and for seeing exactly what the state of play is in these forward-thinking countries.