What are the characteristics of a credible Semantic Web site? That was the question on a semantic web group on LinkedIn. I attempt an answer out here.
Is there anything called a Semantic Web app?
My immediate thought was: does anyone know at all? Is there a minimum set of features that would make an application SemWeb compliant? Of course, there is Tim Berners-Lee's vision of the Semantic Web out here. The Wikipedia article here does a good job of laying out the overall idea.
But there is no consensus, that I am aware of at least, on the minimum characteristics required for anything to be called a SemWeb app.
Without accepted criteria, anything goes
Without an established threshold, it becomes easy for the hype machine to mislead and set wrong expectations. Those of you who followed the startup Twine will know what I am talking about. Not everything with the SemWeb label is remotely what Tim Berners-Lee's vision implied.
Here is an answer that I proposed.
Criterion 1 – Data Portability
Use commonly agreed-upon standards to mark up information, so that it can be mashed up in contexts the original data provider did not anticipate. This is not a trivial exercise. Data is often locked in proprietary formats and behind antique APIs. An entire industry of data integration tools exists to address this problem.
The metadata surrounding the data is one aspect. The other is how much of this metadata is actually available at the point of consumption for consumers to leverage. This is more onerous than it sounds. This would be a topic for another blog post!
Check DataPortability for initiatives in this area.
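To make the data portability idea a little more concrete, here is a minimal sketch in Python. It assumes JSON-LD with the schema.org vocabulary as the "agreed-upon standard", which is only one possible choice and not something this post prescribes. The legacy record, its field names, and the `to_jsonld` helper are all hypothetical.

```python
import json

# Hypothetical record as it might come out of a proprietary system.
legacy_record = {
    "nm": "Ada Lovelace",
    "dob": "1815-12-10",
    "site": "https://example.org/ada",
}

# Hypothetical mapping from private field names to schema.org properties.
FIELD_MAP = {"nm": "name", "dob": "birthDate", "site": "url"}


def to_jsonld(record, field_map, type_name="Person"):
    """Re-express a private record using a shared vocabulary (assumed: schema.org)."""
    doc = {"@context": "https://schema.org", "@type": type_name}
    for private_key, value in record.items():
        # Fields without a known mapping are dropped rather than guessed at.
        if private_key in field_map:
            doc[field_map[private_key]] = value
    return doc


portable = to_jsonld(legacy_record, FIELD_MAP)
print(json.dumps(portable, indent=2))
```

The point is not the format itself but that a consumer who has never seen the original system can still tell what `name` and `birthDate` mean.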
Criterion 2 – Ubiquity
Make the marked-up data above available through the widest possible channels. Though this is a content delivery criterion, I feel it's critical to deriving the benefits of the SemWeb. There is no point hiding semantically rich data behind proprietary APIs and endpoints. HTTP and the REST route should be the protocol; XMPP is another delivery channel if your data is time-sensitive.
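One way the HTTP route can widen reach is to serve the same resource in several representations and let the client choose through its Accept header. The sketch below is a deliberately simplified take on that idea: the list of supported media types is an assumption, `pick_media_type` is a hypothetical helper, and partial wildcards such as `text/*` are not handled.

```python
# Assumed set of representations the server can produce, in order of preference.
SUPPORTED = ["application/ld+json", "text/turtle", "text/html"]


def pick_media_type(accept_header, supported=SUPPORTED):
    """Return the supported media type the client prefers, or None if none match."""
    prefs = []
    for part in accept_header.split(","):
        fields = part.strip().split(";")
        media = fields[0].strip().lower()
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if media:
            prefs.append((q, media))
    # Stable sort: highest q first, ties keep the order the client listed.
    prefs.sort(key=lambda pref: -pref[0])
    for q, media in prefs:
        if q <= 0:
            continue
        if media == "*/*":
            return supported[0]
        if media in supported:
            return media
    return None


print(pick_media_type("text/html;q=0.5, text/turtle;q=0.8"))
```

A browser asking for HTML and a crawler asking for Turtle would then both be served from the same URL.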
Criterion 3 – Expose data graph
I use this term for lack of alternatives. Data does not live in a silo. Defining a grammar for your data via an ontology is just one aspect. There is always a reference to some other element that will enhance or clarify its meaning.
Mapping and translating between ontologies is a possibility too. Still, the idea of a data graph needs to be present. Make it explicit by providing links from key entities and facts within your data, say by linking to DBpedia if the concepts involved are public. If the information is private to your organization, then make it possible for consumers to hop from one application's data to another's within your organization.
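As a rough sketch of what exposing the graph could look like, the Python snippet below models data as plain (subject, predicate, object) triples and links a local entity to its DBpedia counterpart with `owl:sameAs`. The `https://example.org/id/` namespace, the local vocabulary terms, and the `external_links` helper are made up for illustration.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
LOCAL = "https://example.org/id/"  # hypothetical namespace for our own data

triples = [
    (LOCAL + "office-berlin", LOCAL + "vocab/locatedIn", LOCAL + "berlin"),
    # The explicit hop out of our silo: our "berlin" is DBpedia's Berlin.
    (LOCAL + "berlin", OWL_SAME_AS, "http://dbpedia.org/resource/Berlin"),
    (LOCAL + "berlin", LOCAL + "vocab/label", "Berlin"),
]


def external_links(triples, local_prefix):
    """Return triples whose object is an IRI outside our own namespace."""
    return [
        (s, p, o)
        for s, p, o in triples
        if o.startswith(("http://", "https://")) and not o.startswith(local_prefix)
    ]


for s, p, o in external_links(triples, LOCAL):
    print(s, "->", o)
```

For data private to an organization, the same pattern would apply with another internal application's namespace in place of DBpedia.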
Criterion 4 – Allow inferences
Too many SW apps stop at searching and aggregation. I feel some basic amount of inference should be allowed. The outcome of a deeply linked data graph should be to lay bare non-obvious connections and to make patterns hidden within the data apparent.
I remember seeing the term Serendipity Quotient, a measure of how many non-apparent connections or insights can be revealed. This could look similar to data mining, but I think the similarity is superficial. The insights from SW apps would also draw on unstructured data, unlike data mining, which is more attuned to structured data.
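A very basic form of inference can be sketched in a few lines of Python. The example below treats a hypothetical `locatedIn` predicate as transitive and keeps deriving new facts until nothing changes. The facts and the `infer_transitive` helper are illustrative assumptions; a real application would more likely lean on an existing reasoner or rule engine.

```python
def infer_transitive(triples, predicate):
    """Add every fact implied by treating `predicate` as transitive."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the current pairs so we can safely add to `facts` below.
        pairs = [(s, o) for s, p, o in facts if p == predicate]
        for a, b in pairs:
            for c, d in pairs:
                if b == c and (a, predicate, d) not in facts:
                    facts.add((a, predicate, d))
                    changed = True
    return facts


stated = [
    ("office", "locatedIn", "Berlin"),
    ("Berlin", "locatedIn", "Germany"),
    ("Germany", "locatedIn", "Europe"),
]

inferred = infer_transitive(stated, "locatedIn")
print(("office", "locatedIn", "Europe") in inferred)
```

Nobody stated that the office is in Europe, yet the graph yields it once the links are followed.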
Note that we are not trying to be dogmatic about which data formats or inference mechanisms are used.
Infancy of the SemWeb
Going by these criteria, I think we are yet to see a proper SemWeb app. These are early days, and the apps are our first attempts at building something as ambitious as globally linked data that allows machines to be infused with intelligence.
We also have to account for the fact that many of these criteria may already be implemented behind the scenes to pull off the kind of smart behavior we have come to expect from the SemWeb.
Your chance to add meaning!
With that I would like to pose some questions. Do you agree with the criteria above? What would you add/remove/embellish to this list? Are there apps that do all of the above?