[Transparency note: I work for a company that provides services directly related to the mentioned technologies; I attempt to avoid mentioning specific solutions]
I wrote a post with some thoughts about schemas and schemalessness, and about how, irrespective of how hard I try, I always come back to a point where I have a schema (I gave a lightning talk at ELAG14 on a very similar topic and I spoke about cataloguing interfaces at the end of my talk at SWIB11, so this is a quandary with a bit of a pedigree for me). Jakob Voß nicely summarizes the situation in the second sentence of his comment on the post:
Once you start to express anything in data, there is a hidden data model in your mind.
I had a discussion of the post with Laura Akerman on Twitter that helped me come to a more nuanced understanding. As Laura rightly points out, the loose thoughts I penned are difficult to understand; our discussion helped me focus the problem and work towards a solution (this is exactly the kind of thing I want from the Net: interaction with people who think about stuff). Jakob also said “I’d even say there is no data without schema.” I think that this is the crux of my issue with data entry for schemaless data formats: I want schemalessness to pervade data entry too.
Here, my discussion with Laura turned to choose-your-own-adventure-style books, like the Fighting Fantasy series and others that I read as a child. Laura mentioned decision-tree-driven interfaces, but I’m talking about something a bit more complex, because a decision tree implies a pre-existing, albeit flexible, schema.
What I’m thinking about is an approach that allows a user to build arbitrary data structures without reference to a specific schema; something that would potentially allow two different individuals to arrive at entirely different results using one and the same interface. Now, evil tongues would argue that libraries already have this technology in the shape of MARC, but I disagree; what we want is a way of creating schemaless data that is nevertheless structured, rather than an unstructured soup of strings.
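To make "schemaless but structured" a little more concrete, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: the `Node` class, its `add` and `shape` methods, and the property names are hypothetical and do not come from any existing cataloguing system. The point is only that one and the same interface accepts any property, and that the structure emerges from what each person enters.

```python
from dataclasses import dataclass, field
from typing import Union

Value = Union[str, "Node"]


@dataclass
class Node:
    """A structured value with no predefined set of properties (hypothetical)."""
    properties: dict[str, list[Value]] = field(default_factory=dict)

    def add(self, name: str, value: Value) -> "Node":
        # Any property name is accepted; nothing is checked against a schema.
        self.properties.setdefault(name, []).append(value)
        return self

    def shape(self) -> dict:
        # Summarise the structure that emerged, ignoring the literal values.
        return {
            name: [v.shape() if isinstance(v, Node) else "text" for v in values]
            for name, values in self.properties.items()
        }


# Two people describe the same item through the same interface.
# The values are placeholders, not real catalogue data.
first = (
    Node()
    .add("title", "An example gamebook")
    .add("creator", Node().add("name", "An example author"))
)
second = (
    Node()
    .add("label", "An example gamebook")
    .add("series", "Fighting Fantasy")
)

print(first.shape())
print(second.shape())
```

Both results are structured (nested properties rather than a soup of strings), yet their shapes differ, because neither person was steered by a schema.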
I recently became familiar with adaptive case management. From the Wikipedia article it might be slightly hard to understand why this is relevant, but imagine data entry as a process that one can enter at any point (early in the logistics flow, late in the descriptive flow, etc.), where enrichment can happen via a number of sources. How the data is treated and how it progresses is controlled not by a linear process but by flexible subparts that are goal-oriented and often pattern-based, and the entire process is itself organized in relation to the individual case.
This kind of approach allows structured, semi-structured and unstructured processes to be treated uniformly, because the case-management approach allows users to interact with various resources via a common interface. This abstracts away from unusual circumstances by framing the circumstance in a predictable way. Breaking the work down into sub-processes and ensuring that these are treated uniformly, whether they are drawn from existing process libraries or added as new processes per case, makes this approach very flexible.
Over time, perhaps the process library becomes so large that it forms a schema of sorts. The major difference is that how the data is entered is not schema-oriented, but can be put together in novel ways; the sub-processes in any flow are controlled only by the goals of the given case. This might include some standard information that must be present in any flow, but the flow can otherwise be different in each and every case (though in practice it is unlikely that every case will differ).
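A rough sketch of how this might look, again in Python and again purely as an assumption on my part: the names `library`, `register`, `run_case` and `observed_fields`, and the goals and fields used, are hypothetical and are not taken from any adaptive case management product. Each case names its own goals, the goals pick sub-processes from a shared library, and the "schema of sorts" is only something that can be observed afterwards.

```python
from typing import Callable

Case = dict                      # the data gathered so far for one case
Step = Callable[[Case], None]    # a reusable sub-process

# Hypothetical process library: each goal maps to one sub-process.
library: dict[str, Step] = {}


def register(goal: str, step: Step) -> None:
    library[goal] = step


def set_field(name: str, value: str) -> Step:
    # Build a trivial sub-process that records one piece of information.
    def step(case: Case) -> None:
        case[name] = value
    return step


def run_case(case: Case, goals: list[str]) -> Case:
    # The goals of the case, not a fixed pipeline, decide which
    # sub-processes run and in which order.
    for goal in goals:
        library[goal](case)
    return case


def observed_fields(cases: list[Case]) -> dict[str, int]:
    # Count how often each field turns up across cases.
    counts: dict[str, int] = {}
    for case in cases:
        for key in case:
            counts[key] = counts.get(key, 0) + 1
    return counts


register("identify", set_field("identifier", "placeholder-id"))
register("describe", set_field("title", "placeholder title"))
register("shelve", set_field("location", "placeholder shelf"))

# One case enters early in the logistics flow, the other in the descriptive flow.
logistics_case = run_case({}, ["identify", "shelve"])
descriptive_case = run_case({}, ["describe", "identify"])

print(observed_fields([logistics_case, descriptive_case]))
```

Here "identify" plays the part of the standard information that every flow includes, while the rest differs per case; the field counts are a description of what happened, not a constraint on what may be entered.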