Characteristics of a Semantic Web Application

What are the characteristics of a credible Semantic Web site? That was the question posed in a Semantic Web group on LinkedIn. I attempt an answer here.

Is there anything called a Semantic Web app?

My immediate thought was: does anyone know at all? Is there a minimum set of features that would make an application SemWeb-compliant? Of course, there is Tim Berners-Lee's vision of the Semantic Web here. The Wikipedia article here does a good job of laying out the overall idea.

But there is no consensus, that I am aware of at least, on the minimum characteristics required for anything to be called a SemWeb app.

Without accepted criteria, anything goes

Without an established threshold, it becomes easy for the hype machine to mislead and set the wrong expectations. Those of you who followed the startup Twine will know what I am talking about. Not everything carrying the SemWeb label is remotely close to what Tim Berners-Lee's vision implied.

Here is an answer that I proposed.

Criterion 1 – Data Portability

Use commonly agreed-upon standards to mark up information, so that it can be mashed up in contexts the original data provider did not anticipate. This is not a trivial exercise. Often data is locked in proprietary formats and behind antique APIs. An entire industry of data integration tools exists to address this problem.
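As a minimal sketch of what such markup could look like, the snippet below takes a record in a made-up in-house format and re-expresses it as JSON-LD using schema.org terms. The field names in `legacy_record`, the `to_jsonld` helper, and the example.org URL are assumptions for illustration; only the schema.org terms (`Person`, `Organization`, `name`, `worksFor`) come from a real vocabulary.

```python
import json

# Hypothetical record as it might come out of a proprietary system.
legacy_record = {
    "emp_nm": "Ada Lovelace",
    "emp_org": "Analytical Engines Ltd",
    "emp_page": "http://example.org/people/ada",
}


def to_jsonld(record):
    """Map the in-house field names onto schema.org terms."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "@id": record["emp_page"],
        "name": record["emp_nm"],
        "worksFor": {"@type": "Organization", "name": record["emp_org"]},
    }


doc = to_jsonld(legacy_record)
print(json.dumps(doc, indent=2))
```

Any other shared vocabulary would serve equally well here; the point is only that a consumer who has never seen the in-house field names can still interpret the output.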

The metadata surrounding the data is one aspect. The other is how much of this metadata is actually available at the point of consumption for consumers to leverage. This is more onerous than it sounds, and would be a topic for another blog post!

Check DataPortability for initiatives in this area.

Criterion 2 – Ubiquity

Make the marked-up data above available through the widest possible range of channels. Though this is a content delivery criterion, I feel it's critical to deriving the benefits of the SemWeb. There is no point hiding semantically rich data behind proprietary APIs and endpoints. HTTP, used in a RESTful style, should be the default protocol; XMPP is another delivery channel if your data is time-sensitive.
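One common way to serve the same data to many kinds of clients over plain HTTP is content negotiation on the `Accept` header. The sketch below is a deliberately simplified version of it: it honours q-values and wildcards but ignores the finer precedence rules of the HTTP specification. The `SUPPORTED` list and the function names are assumptions made for illustration, not part of any particular framework.

```python
# Representations this hypothetical endpoint can produce, in server preference order.
SUPPORTED = ["text/turtle", "application/ld+json", "application/rdf+xml", "text/html"]


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so ties keep the order the client listed them in.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def negotiate(header, supported=SUPPORTED):
    """Pick the representation to serve, or None if nothing is acceptable."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return supported[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "text/"
            for candidate in supported:
                if candidate.startswith(prefix):
                    return candidate
        elif media_type in supported:
            return media_type
    return None
```

A browser sending `text/html` would get a human-readable page, while a crawler asking for `text/turtle` or `application/ld+json` would get the machine-readable graph from the same URL.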

Criterion 3 – Expose data graph

I use this term for lack of alternatives. Data does not live in a silo. Defining a grammar for your data via an ontology is just one aspect. There is always a reference to some other element that will enhance or clarify its meaning.

Mapping and translating between ontologies is a possibility too. Still, the idea of a data graph needs to be present. Make it explicit by providing links from key entities and facts within your data, say by linking to DBpedia if the concepts involved are public. If the information is private to your organization, then make it possible to hop from one piece of data to the next across applications within your organization.
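To make the linking idea concrete, here is a small sketch that emits `owl:sameAs` statements in N-Triples syntax, tying local identifiers to DBpedia resources. The example.org URIs, the `local_entities` mapping, and both helper functions are assumptions for illustration. The code only builds a URI from a label; it does not check that DBpedia actually has a resource at that address, which a real pipeline would need to verify.

```python
from urllib.parse import quote

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"

# Hypothetical local entities mapped to the labels of their public counterparts.
local_entities = {
    "http://example.org/people/timbl": "Tim Berners-Lee",
    "http://example.org/topics/semweb": "Semantic Web",
}


def dbpedia_uri(label):
    """Build a DBpedia resource URI from a label (spaces become underscores)."""
    return DBPEDIA_RESOURCE + quote(label.replace(" ", "_"))


def same_as_triples(entities):
    """Yield one N-Triples line per entity, linking it to DBpedia."""
    for local_uri, label in entities.items():
        yield f"<{local_uri}> <{OWL_SAME_AS}> <{dbpedia_uri(label)}> ."


for line in same_as_triples(local_entities):
    print(line)
```

For private data the same pattern applies, with the DBpedia prefix swapped for whatever identifier scheme the other in-house application exposes.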

Criterion 4 – Allow inferences

Too many SW apps stop at search and aggregation. I feel some basic amount of inference should be supported. A deeply linked data graph should lay bare non-obvious connections and make patterns hidden within the data apparent.
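As one possible illustration of "basic" inference, the sketch below runs a naive forward-chaining loop over a set of triples, using two rules modeled on the RDFS subclass semantics: subclass relations are transitive, and an instance of a class is also an instance of its superclasses. The `ex:` names and the tiny graph are invented for the example, and a real application would more likely lean on an existing reasoner than on a hand-rolled loop like this.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# A tiny hypothetical graph of (subject, predicate, object) triples.
facts = {
    ("ex:Photographer", SUBCLASS, "ex:Artist"),
    ("ex:Artist", SUBCLASS, "ex:Person"),
    ("ex:alice", TYPE, "ex:Photographer"),
}


def infer(triples):
    """Apply two RDFS-style rules until no new triples appear."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                if o != s2 or p2 != SUBCLASS:
                    continue
                if p == SUBCLASS:
                    # A subClassOf B, B subClassOf C  =>  A subClassOf C
                    new.add((s, SUBCLASS, o2))
                elif p == TYPE:
                    # x type A, A subClassOf B  =>  x type B
                    new.add((s, TYPE, o2))
        if new <= closure:
            return closure
        closure |= new


derived = infer(facts) - facts
for triple in sorted(derived):
    print(triple)
```

Nothing in the input says that `ex:alice` is a person; that statement only appears once the rules have been applied, which is the kind of non-obvious connection this criterion is after.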

I remember seeing the term Serendipity Quotient, a measure of how many non-apparent connections or insights can be revealed. This could be seen as similar to data mining, but I think the similarity is superficial. SW apps would also draw insights from unstructured data, unlike data mining, which is more attuned to structured data.

Note that we are not trying to be dogmatic about which data formats or inference mechanisms are used.

Infancy of the SemWeb

Going by these criteria, I think we are yet to see a proper SemWeb app. These are early days, and today's apps are our first attempts at building something as ambitious as globally linked data that allows machines to be infused with intelligence.

We also have to account for the fact that many of these criteria may be already implemented behind the scenes to pull off the kind of smart behavior we have come to expect from the SemWeb.

Your chance to add meaning!

With that I would like to pose some questions. Do you agree with the criteria above? What would you add/remove/embellish to this list? Are there apps that do all of the above?

Comments

  1. Nice article! Been asked something similar as well a few days ago, for semantic web *sites*, though, not apps. I'd also say that an app that claims to be semwebby should at least expose its data in some machine-readable form. A true semantic web app would IMO also consume and repurpose data provided by other sites. Sort-of related: http://bnode.org/blog/2009/02/18/linked-data-va… (a fully-fledged semweb app would cover multiple sectors in the spiral, but not necessarily all). http://challenge.semanticweb.org/ lists some additional criteria and example apps.

  2. Actually I tried to come up with a definition in my PhD thesis. It is based on the criteria from the ISWC Semantic Web Challenge (bengee mentioned this, already, as I see). However, FWIW, you may wanna have a look at [1], Section 2.1, Definition 2.2 (Semantic Web Application). Cheers, Michael [1] https://online.tu-graz.ac.at/tug_online/edit.ge

  3. Benjamin – Thanks for the feedback, appreciate it 🙂 I liked the concise representation of criteria in the SemWeb challenge link you gave. The linked data value spiral's categorization is interesting too. One item, "accelerator products", is intriguing – do you have specific use cases or products to illustrate this?

  4. Michael – first of all thanks for the comment. I tried to download your thesis but was unable to do so for some reason. Would it be possible for you to send a softcopy? Am eager to learn how this definition can be nailed with some rigor!

  5. I'm *trying* to work on something in that direction at paggr.com, which is a netvibes-like system where individual items and widgets can be dragged on each other, for filtering or annotation. Let's say you drag a photo on a person and also on an event, and the person is linked to a slide deck which you then also link to the event. This visual linking generates RDF in the background and could then let you (or others) retrieve stuff like "photo set of speakers", "photos from session X", etc. Certain use cases you didn't think of while you were working with the data, but which are readily available once you do.
