File systems and databases provide different ways of organizing data
to help find structure and meaning in what you've stored, but they're
not the only approaches possible. Moreover, the structure they
provide is really for one purpose: to simplify accessing it. Once you
realize it's the access, not the structure, that matters, the whole
debate changes character.
-- Rob Pike Responds
Yesterday I searched for "snort license". Surprisingly, this is what Google returned:
Snort — License: GNU General Public License
According to http://en.wikipedia.org/wiki/Snort_(software)
which means that Google has special code to detect queries about licenses. Nice! (Actually I cringed.)
Data Search
Obviously, the next step is extending this to arbitrary properties, e.g., "rob pike's email", and including synonyms in the search.
There's a joke that there are only two hard things in CS, cache invalidation and naming things (due to Phil Karlton), and resolving such queries is as big a naming problem as it gets. (General question answering is of course harder, but we aren't there yet.)
Basically, you want to be as sloppy as possible in recognizing the query terms; the system should work even if you talk to it on the phone.
For that, we need something even beyond megadatabases: we need data search systems.
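To make "sloppy" a bit more concrete, here is a minimal sketch of how a query like "rob pike's email" might be split into a thing and a property, with the property resolved through synonyms and fuzzy string matching. Everything in it is an assumption for illustration: the synonym table, the function names, and the use of Python's difflib as a stand-in for real query recognition. It says nothing about how Google actually does it.

```python
import difflib
import re

# Hypothetical synonym table: canonical property -> accepted spellings.
SYNONYMS = {
    "email": ["email", "e-mail", "email address"],
    "license": ["license", "licence", "licensing"],
    "address": ["address", "home address", "street address"],
}

# Reverse index from every spelling to its canonical property.
CANONICAL = {alias: prop for prop, aliases in SYNONYMS.items() for alias in aliases}


def parse_query(query):
    """Split "<thing>'s <property>" or "<thing> <property>" into two parts."""
    q = query.lower().strip()
    m = re.match(r"(.+?)'s?\s+(.+)", q)  # handles "pike's" and "elvis'"
    if m:
        return m.group(1), m.group(2)
    # No possessive: assume the last word names the property.
    thing, _, prop = q.rpartition(" ")
    return thing, prop


def resolve_property(word, cutoff=0.6):
    """Map a possibly misspelled property name to a canonical one, or None."""
    matches = difflib.get_close_matches(word, list(CANONICAL), n=1, cutoff=cutoff)
    return CANONICAL[matches[0]] if matches else None


def interpret(query):
    """Return (thing, canonical property) for a free-form query."""
    thing, prop = parse_query(query)
    return thing, resolve_property(prop)
```

With this table, interpret("Snort licence") and interpret("snort license") resolve to the same property, and a typo like "emial" still lands on "email". A real system would need far more than edit distance, but the shape of the problem is the same: many surface forms, one property.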
Data Search Systems
A data search system answers queries about properties of things (e.g., "elvis' address"). Input, queries, and output are close to natural language.
- Thanks to the R in REST, data search systems can always fall back to fuzzy results, or add additional information that may or may not be helpful to the user.
- Data search systems do away with notions about ultra-precise naming and use an acoustic approach to query recognition, and ontological knowledge to find synonyms.
- Data search systems snarf the whole world wide web, looking for patterns humans use to express data in writing, such as "bedrooms: 2" or "2 bedrooms". (See humane metadata syntax.)
- seems to be in the best position to make real data search possible, but technical advances will soon allow a kid with an OLPC to kick their ass.
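The pattern-snarfing idea from the list above can be sketched in a few lines. This is a toy under stated assumptions: the two regular expressions, the crude plural stripping, and the function names are mine, and they only cover the "bedrooms: 2" and "2 bedrooms" shapes with integer values.

```python
import re

# "bedrooms: 2" style: a single word, a colon, then an integer.
LABEL_FIRST = re.compile(r"\b([A-Za-z]+)\s*:\s*(\d+)\b")
# "2 bedrooms" style: an integer, whitespace, then a single word.
VALUE_FIRST = re.compile(r"\b(\d+)\s+([A-Za-z]+)\b")


def normalize(name):
    """Lowercase and strip a plural "s" (crude; keeps words ending in "ss")."""
    name = name.lower()
    if name.endswith("s") and not name.endswith("ss"):
        name = name[:-1]
    return name


def extract_properties(text):
    """Collect {property: value} pairs from both writing styles."""
    found = {}
    for name, value in LABEL_FIRST.findall(text):
        found[normalize(name)] = int(value)
    for value, name in VALUE_FIRST.findall(text):
        # Label-first matches win if both styles name the same property.
        found.setdefault(normalize(name), int(value))
    return found
```

Both extract_properties("Bedrooms: 2") and extract_properties("a flat with 2 bedrooms") yield the same record, which is the point: the two spellings are one fact. The value-first pattern is deliberately greedy and will happily pick up "2 minutes" as a property too, so anything real would need filtering on top.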
Dig we must!