Things do heat up in the summertime, and some say competition is brewing among the natural language vendors that offer search services.
Over at the Conceptualist, Sahar Sarid comments on whether 30 years of research is enough to beat Google. Citing Michael Reisman's piece for MIT Technology Review, he argues that semantic search is important, but that digging relationships out of text is not as useful as personalization and understanding the user’s intent.
I cannot argue against the importance of understanding the user’s intent, but personally, I don’t think any search technology, with or without a personalization feature, is “enough” to beat Google. Google is so much more than a search engine at this stage that its business will be hard to upset.
On the other hand, there certainly seems to be a competition brewing, at least in the views of some bloggers and in the opinions of the technology press. And the competition is about semantic search offerings, or so they say. Over at Read/WriteWeb, Bernard Lunn claimed that the money seems to be riding on the NLP systems. That does not feel right to him, nor to me.
These NLP systems, along with the AI of the Semantic Web and lexical resources such as WordNet, are each great and powerful systems in themselves. Yet each is like the old Roman numeral system: these modern linguistic systems have an effect on people using the Internet similar to the one Roman numerals had on the ancient Romans.
Roman numerals were a numbering system that prevented an entire civilization from doing any higher math. You can read what Thomas Frey has to say about it here. In a similar way, the proponents of the NLP and AI systems of 30 years ago have tried to prevent research into other viable semantic methods.
They have blocked the widespread development of semantic techniques that are capable of processing real and conceptual relationships between words, names, and topics or subjects of interest, in favor of extracting part-of-speech relations. This matters a great deal: because language deals with everything, and human semantics are universal, getting the fundamentals wrong here mucks up the entire works. It makes things more complex than they need to be, and more expensive. That is the state of affairs today.
The ways and means of NLP systems and functional grammars, and all their adherents and proponents, are preventing semantic search from surfacing. This goes unnoticed by everyone until someone shouts loud enough to rise above the din of the crowd. And there is a greater pressure at work than the burden of unwieldy systems, with better cover than market confusion.
After pumping giga-dollars of tax and industry money into the research labs of the prominent schools and the work of their scientists and students, governments need the venture capitalists to cough up the giga-bucks needed to actually produce something and capture some kind of market. I am not saying that this in itself is good or bad; it is just the way of capitalism, after all, and it meets the objective of the industry-academia-government partnerships that dominate the field.
Yet, by focusing research and market development funding on NLP- and AI-based systems, “gatekeepers” have nearly prevented independent theories and very creative developers from getting funding and from “going commercial,” just by playing their role as gatekeepers. By such actions, they continue to stymie and hobble viable research directions and other quite feasible possibilities for semantic search. Thomas Frey wrote an essay about that too; you can find it here.
So I predict that although these companies are making inroads, and are making NLP systems more adaptable and usable, they will fail as “semantic search” systems because they are not doing semantic search at all. Or perhaps the public is as fickle as it seems and can be fooled, in which case I could be wrong.
While Hakia and Lexxe have excellent implementations, and I have no doubt that Powerset’s offering will also have strengths, not one of them qualifies as semantic search in my book. With regard to Powerset, Michael Reisman reported Barney Pell’s claim that Powerset has innovations that make the system more adaptable, so that it can extract deep relationships from text. No one is saying what that “deep relationship” is, mainly because it is not deep at all; it is a surface-level linguistic feature.
Not one of these so-called NLP wonders can answer a third-grade question, as I previously wrote here. Neither can they pass a simple test for semantic search capabilities, the most revealing of which is the capability to construe the meaning of a query given in another language, like this.
Commercial NLP-based systems such as Hakia, Lexxe and Powerset can only do this with regard to English grammar, and how well they handle all forms of grammar is highly questionable and often open to dispute.
People should remember that the relationships these systems deliver are grammatical relationships. These relations cannot even be classed as semantic, except insofar as they relate terms to parts of speech. Knowing the concept of noun or verb, and extracting the relation between the subject and object of a sentence, reveals little about the possible associations and relevance between words, structures, and concepts of the mind.