The problem of using a “local” upper ontology

In BIBFRAME, Europeana’s EDM and VIVO, among others, it has become a common pattern to use a common data model, often termed a “top ontology” but more properly, for logicians, an “upper ontology”. At SWIB13 we saw a number of presentations that take heterogeneous data and convert it to a common platform using a top-level ontology; only one of these, YSO, referred to a true upper ontology (in the web ontologist’s sense of the word).

A true upper ontology has ONE major benefit (yes, one, because the other “benefits” are context-driven and not common): you can align ontologies, because the upper ontology is shared and the entailments of saying X is of class Y apply down the chain. The problem is that choosing an upper ontology isn’t a no-brainer; in fact, it has serious implications because it dictates how you model things. YSO’s choice of DOLCE is interesting because it is heavily skewed towards a cognitive view of the world; because, by their own admission, they chose a cultural orientation, the usefulness for sharing knowledge about the world becomes moot: your culturally coloured understanding of the world makes it impossible to coalesce the knowledge with other knowledge bases that do not share this cultural bias. (In fact, whichever upper ontology you choose causes these problems, so the MAJOR benefit isn’t a real thing…) Nevertheless, YSO’s approach is sensible because they’re coalescing heterogeneous ontologies to create domain ontologies within the domain of their middle ontology, with DOLCE on top.
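To make the “entailments apply down the chain” point concrete, here is a minimal sketch in plain Python. It is an illustration, not how any of the projects above work: the class names (libA:Author, libB:Creator, upper:Agent and so on) are hypothetical and not taken from DOLCE or YSO, and each class is assumed to have a single parent to keep the example short.

```python
# Hypothetical subclass links. "upper:" stands in for a shared upper ontology;
# "libA:" and "libB:" stand in for two independently built ontologies.
SUBCLASS_OF = {
    "libA:Author": "upper:Agent",
    "libB:Creator": "libB:Contributor",
    "libB:Contributor": "upper:Agent",
}

# Hypothetical instance data, typed only with the local classes.
INSTANCE_OF = {
    "ex:ibsen": "libA:Author",
    "ex:munch": "libB:Creator",
}


def superclasses(cls):
    """Return every class reachable from cls by following subclass links."""
    found = []
    seen = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        if cls in seen:  # guard against accidental cycles
            break
        seen.add(cls)
        found.append(cls)
    return found


def entailed_types(instance):
    """Asserted type plus every type entailed through the subclass chain."""
    asserted = INSTANCE_OF[instance]
    return [asserted] + superclasses(asserted)


def instances_of(cls):
    """All instances whose asserted or entailed types include cls."""
    return sorted(i for i in INSTANCE_OF if cls in entailed_types(i))


if __name__ == "__main__":
    print(entailed_types("ex:munch"))
    print(instances_of("upper:Agent"))
```

Neither data set mentions upper:Agent on its instances, yet asking for instances of upper:Agent finds both. That is the alignment benefit, and it only holds if both ontologies agree that their classes belong under the same upper class in the first place.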

It’s more interesting that BIBFRAME, VIVO and EDM choose to coalesce data from diverse sources in heterogeneous formats into a homogeneous RDF-based format (and here I do mean format), because a core profile won’t allow for the inclusion of data that falls “outside” the format/profile. There are probably many reasons behind formats/profiles, but largely they simplify the work of their users by creating a closed world of reliably structured data. And I contend that this is against any understanding of linked open data.

A system that claims to use linked open data should competently handle this heterogeneity because it is inherent in a web of data that is created by diverse groups in diverse contexts — shoehorning this diversity into a new schema, even a relatively forgiving one like EDM, does something to the data. That “outside” data can be placed in the mix (but ignored by the applications that use the profile) doesn’t change this fact.
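As a rough sketch of what “placed in the mix but ignored” means in practice, the following Python compares a profile-driven reader with one that accepts whatever statements exist. The profile contents, the local:shelfMark predicate and the sample triples are all made up for illustration and do not reflect any actual EDM, BIBFRAME or VIVO profile.

```python
# Hypothetical profile: the only predicates the application knows about.
PROFILE_PREDICATES = {"dc:title", "dc:creator"}

# Hypothetical data: two statements fit the profile, one falls outside it.
TRIPLES = [
    ("ex:book1", "dc:title", "Peer Gynt"),
    ("ex:book1", "dc:creator", "ex:ibsen"),
    ("ex:book1", "local:shelfMark", "N 839.8"),
]


def split_by_profile(triples, profile):
    """Separate triples the profile recognises from those it ignores."""
    inside = [t for t in triples if t[1] in profile]
    outside = [t for t in triples if t[1] not in profile]
    return inside, outside


def describe_all(triples, subject):
    """Profile-free view: group every statement about subject by predicate."""
    description = {}
    for s, p, o in triples:
        if s == subject:
            description.setdefault(p, []).append(o)
    return description


if __name__ == "__main__":
    inside, outside = split_by_profile(TRIPLES, PROFILE_PREDICATES)
    print("used by the profiled application:", inside)
    print("present but ignored:", outside)
    print("profile-free description:", describe_all(TRIPLES, "ex:book1"))
```

The shelf mark is still in the store, but an application built on the profile never sees it; the profile-free description keeps it at the cost of not knowing in advance which predicates will turn up.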

As a developer, I see that it’s easier to create a “profiled” schema that can be expressed in an input form/presentation interface than to create triples within an arbitrary-but-valid structure and express these in an interface. The input side is actually relatively easily solved by providing an interface that allows arbitrary creation of triples, but the user needs to understand that what they are doing is arbitrarily ordered facet description. Presentation is different, because it involves a lot of trust, and trust is a part of the semantic web stack; but the problem of using someone else’s ontology doesn’t go away: it’s still like using someone else’s underwear, you just need to trust that other person and their underwear.
