In the beginning, there was little content on the web, so we could organize it hierarchically, in directories. Then the number of pages grew, and we needed search to find our way around. But searching for text in a page wasn't enough to return relevant results, so we looked at the links: a page's authority can be estimated from the authority of the pages that point to it. Now we have relevance, but search is still textual. The next step is to grasp the meaning of a page and of its parts, to build a semantic web algorithmically.
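To make the link-authority idea concrete, here's a minimal PageRank-style sketch in Python. The four-page link graph and the damping value are made-up illustrations of the general technique, not anything Google has disclosed about its live system:

    # Made-up four-page link graph: page -> pages it links to.
    links = {
        "home": ["about", "blog"],
        "about": ["home"],
        "blog": ["home", "about", "review"],
        "review": ["blog"],
    }

    def pagerank(links, damping=0.85, iterations=50):
        # Start with uniform authority, then repeatedly let each page
        # pass a share of its authority to the pages it links to.
        pages = list(links)
        rank = {p: 1.0 / len(pages) for p in pages}
        for _ in range(iterations):
            new_rank = {p: (1 - damping) / len(pages) for p in pages}
            for page, outlinks in links.items():
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            rank = new_rank
        return rank

    for page, score in sorted(pagerank(links).items(), key=lambda x: -x[1]):
        print(page, round(score, 3))

The published formula is more involved (it handles pages with no outgoing links, for instance), but the core idea is just this fixed-point iteration over the link graph.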
Google already knows how to separate a page's navigation from its actual content, how to tell bad links (affiliate links, paid links) from good ones, and how to identify the theme of a page and its relevant keywords, but it can't yet divide the content into meaningful parts. Product reviews, for example, live in many places: on sites specialized in reviews that use a format Google can understand (like Amazon or IMDB), on blogs, news sites, and forums. Google's next step is to understand that a blog post reviews a product, to gather the relevant passages, and to work out whether the review is positive or negative and what's good or bad about the product. Then the next time you type [review iPod nano], you'll see a list of sites that review the product, grouped into positive and negative reviews that share a view of the product. You'll see in an aggregate chart whether most people love or hate the iPod nano, and why.
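As a back-of-the-envelope illustration of that grouping, here's a toy sentiment tally in Python. The word lists and the review snippets are invented, and anything Google actually ships would rely on statistical models trained on real data rather than hand-picked keywords:

    import re
    from collections import Counter

    # Toy word lists; a real system would learn these from labeled data.
    POSITIVE = {"love", "great", "excellent", "amazing", "good"}
    NEGATIVE = {"hate", "bad", "poor", "terrible", "broken"}

    def classify(review):
        # Count positive vs. negative keywords in the review text.
        words = set(re.findall(r"[a-z]+", review.lower()))
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        return "positive" if score > 0 else "negative" if score < 0 else "neutral"

    # Made-up review snippets for illustration.
    reviews = [
        "I love the iPod nano: great screen, excellent battery.",
        "The click wheel feels broken; terrible build quality.",
        "Amazing sound for the size, and a good value.",
    ]

    print(Counter(classify(r) for r in reviews))
    # Counter({'positive': 2, 'negative': 1})

That Counter is, in miniature, the aggregate chart: a tally of how many reviewers are for or against the product.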
Google could also find job descriptions, addresses, email addresses, events, and biographies, and let you search them using query-dependent parameters. The next step for Google is to structure unstructured information: to turn the web into a Google Base.
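To sketch what structuring unstructured information could mean in practice, here's a toy extractor that pulls email addresses and dates out of free text with regular expressions. The patterns and the job-ad snippet are invented for illustration; a real system would use learned extractors rather than hand-written rules:

    import re

    # Hand-written patterns as a stand-in for learned extractors.
    EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
    DATE = re.compile(
        r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
        r"[a-z]* \d{1,2}, \d{4}\b"
    )

    def extract(text):
        # Turn a free-text snippet into a small structured record.
        return {
            "emails": EMAIL.findall(text),
            "dates": DATE.findall(text),
        }

    snippet = ("Senior engineer wanted. Send a resume to jobs@example.com "
               "before March 15, 2007. Interviews start April 2, 2007.")
    print(extract(snippet))
    # {'emails': ['jobs@example.com'], 'dates': ['March 15, 2007', 'April 2, 2007']}

Run the same kind of extraction over the whole web and you get rows in a database instead of pages of prose, which is exactly the Google Base idea.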
{ Photo by Joao Bambu. }