Following up on
this blog post, and the oral adaptation of it I gave as a talk at the Cambridge Semantic Web Gathering on 12 August 2008 at MIT, here are some bits from my speaking notes about the shortcomings of various current Semantic Web pieces as solutions to the problem of doing for graphs of data what VisiCalc did for columns of numbers. This is a discussion among practitioners, so don't feel bad if you don't have any idea what I'm talking about here.
RDF
Ingenious data decomposition idea, but:
- too low-level; the assembly language of data, where we need Java or Ruby
- "resource" is not the issue; there's no such thing as "metadata", it's all data; "meta" is a perspective
- lists need to be effortless, not painful and obscure
- nodes need to be represented, not just implied; they need types and literals in a more pervasive, integrated way
SPARQL (and Freebase's MQL)
These are just appeasement:
- old query paradigm: fishing in dark water with superstitiously tied lures; only works well in carefully stocked lakes
- we don't ask questions by defining answer shapes and then hoping they're dredged up whole
Linked Data
Noble attempt to ground the abstract, but:
- URI dereferencing/namespace/open-world issues focus too much technical attention on cross-source cases where the human issues dwarf the technical ones anyway
- FOAF query over the people in this room? forget it.
- link asymmetry doesn't scale
- identity doesn't scale
- generating RDF from non-graph sources: more appeasement, right where the win from actually converting could be biggest!
Giant Global Graph
Hugely motivating and powerful idea, worthy of a superhero (Graphius!), but:
- giant and global parts are too hard, and starting global makes every problem harder
- local projects become unmanageable in global context (Cyc, Freebase data-modeling lists...)
And my thus my plea, again. Forget "semantic" and "web", let's fix the database tech first:
- node/arc data-model, path-based exploratory query-model
- data-graph applications built easily on top of this common model; building them has to be easy, because if it's hard, they'll be bad
- given good database tech, good web data-publishing tech will be trivial!
- given good tools for graphs, the problems of uniting them will be only as hard as they have to be