July 2, 2007 5:00 AM PDT

Powerset: Re-indexing the Web

My first thought when stepping into the Powerset offices: "Overfunded." The company, which aims to create a better search engine than Google, already has some of the search giant's trappings: fancy offices (though rented), a game room, and a victor's arrogance. Yet if the Powerset team can pull off what it's set out to do, it will indeed revolutionize search and the way people use the Web, not to mention its economics.

Only natural

Powerset does "natural language search." That means that instead of searching the Web based on keywords, as Google does, it searches on meaning. Powerset understands what a search query means, and it understands what every sentence it has indexed is about, too. The company's shining example (which is getting a little old) is this: If you enter the query, "politicians who died from disease," Powerset will return a list that begins, "Edward Heath," with the supporting snippet from Wikipedia, "Sir Edward Heath died from pneumonia." It returns this because it knows that Heath was Prime Minister of the United Kingdom (and thus a politician), and that pneumonia is a disease.

Powerset's well-worn show-off query.

(Credit: Powerset)

Understanding Web content this way is, as they say, nontrivial. Powerset acquired an exclusive license to a 35-year-old Xerox research exercise called XLE, which does the job. Powerset COO Steve Newcomb told me that recent breakthroughs in both the XLE algorithms and in technology (the predictable Moore's Law) have made it economically feasible to index the Web for meaning.

(Newcomb said it took a year and a half of negotiating to strike the license deal with PARC, Xerox's spun-out research arm. The deal includes provisions that prevent any other company--like Google--from getting access to the technology even if the other company acquires Xerox or PARC.)

Building a semantic index, as opposed to simply a semantic search query parser, is fundamentally new and different, and if Powerset can pull it off, it will make Web searches more accurate and useful. No longer will users have to experiment with subtle variations in search queries to get useful results. Slight differences in wording that mean the same thing will pull up the same results. Also, Powerset technology enables the display of results that are more readable than Google's: Powerset highlights passages that answer the query, instead of simply flagging keywords that match.
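
To make the keyword-versus-meaning distinction concrete, here is a toy sketch in Python. It is not Powerset's or XLE's method, which the company has not made public; the fact list, the is-a table, and the function names are all hypothetical, hand-written for illustration. It only shows why a keyword match misses the Edward Heath sentence while a lookup over typed facts finds it.

```python
# Toy illustration only: hand-written data, NOT Powerset's or XLE's method.
SENTENCES = ["Sir Edward Heath died from pneumonia."]

# Hypothetical background knowledge (is-a links) an indexer would need.
IS_A = {
    "Edward Heath": {"politician"},
    "pneumonia": {"disease"},
}

# Hypothetical semantic index: (subject, relation, object, source sentence).
FACTS = [("Edward Heath", "died_from", "pneumonia", SENTENCES[0])]


def keyword_search(query):
    """Return sentences that contain every query word (case-insensitive)."""
    words = query.lower().split()
    return [s for s in SENTENCES if all(w in s.lower() for w in words)]


def semantic_search(subject_type, relation, object_type):
    """Return (subject, sentence) pairs whose fact matches the typed pattern."""
    return [
        (subj, src)
        for subj, rel, obj, src in FACTS
        if rel == relation
        and subject_type in IS_A.get(subj, set())
        and object_type in IS_A.get(obj, set())
    ]


# The keyword query finds nothing: the sentence never says "politicians".
print(keyword_search("politicians who died from disease"))
# The typed pattern matches, because Heath is a politician and pneumonia a disease.
print(semantic_search("politician", "died_from", "disease"))
```

The hard part, which this sketch skips entirely, is turning free-form queries and billions of Web sentences into such facts automatically.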

The hype curve

Am I skeptical that this will work? Of course. For the past several months, Powerset has been slowly peeling back layers of its work, trying to stay just ahead of the building sentiment that it's more hype than reality. The demo is impressive, to be sure. But Whatsit-style queries are just one kind of search. And to date, no outsiders have been turned loose on Powerset's engine. Only Powerset execs drive during the public demos.

That changes in September, when Powerset will launch PowerLabs, a special site for early Powerset testers that will unleash the search technology on limited corpuses of knowledge, like Wikipedia. After a few months of beta testers banging on the algorithm--and Powerset tweaking its engine--it will shut down PowerLabs, turn its technology loose on the Web itself for a few months, and then launch Powerset proper.

Compute-bound

Powerset's search technology is more expensive to run than Google's. It takes more computing power to parse semantics than to simply index, and nearly 20 percent of Powerset's ongoing budget is spent on compute resources, Newcomb told me. That's an awful lot for a Web startup, and although the price of compute cycles keeps dropping, Powerset's technology will always cost more than other search engines.

So it remains to be seen how Powerset will make a buck, even if it is better than Google. Perhaps Google AdWords on Powerset's highly precise search results will do the trick. I believe there is margin to spare in Google's advertising business, so even though Powerset queries are more expensive than Google's, the economics might work. Powerset can also be turned loose on corporate databases for the big bucks. Imagine what it could do for lawyers.

2 comments
Still think Google is easier
by BlueSpirit_23 July 2, 2007 8:44 AM PDT
If this search engine depends on meaning, that means the queries should be meaningful, and I think this is not the easier way all the time. Sometimes it is just easier to search a single keyword rather than composing a meaningful query or sentence. Still, I'll wait till I can try it myself and see how intelligent it can be.
Others worth looking at
by carlact July 2, 2007 11:18 AM PDT
I've talked to several companies in this space and am equally frustrated by Powerset's hype and level of control over the demo. There are other companies doing equally fascinating and sophisticated work in semantics/natural language who are much more open about letting you play around with their product. CognitionSearch is structured into a small number of specific categories at the moment but delivers impressive results - and is actually being used by a real customer. And I've found myself using Hakia a lot more lately. Powerset is definitely one to watch but I don't think they're going to be the only game in town in this space - something they need to admit themselves as well.
--Carla Thompson, Analyst, Guidewire Group