Wednesday, March 17, 2010

Data Mining Using Google

Today's xkcd comic is about quantitative Google queries. Randall Munroe looked up the number of search results for queries like "My IQ is X", where X is a variable, and plotted a graph for each query. The results aren't reliable, since Google only shows an estimate of the number of search results, but it's an interesting way to mine Google's index of the web.



If you are familiar with Google Spreadsheets, try creating a sheet that lets you enter a query like "My IQ is X", a variable name, and the values for that variable. The result should be a graph that shows the number of Google search results for each instance of your query. Use importXML with an XPath expression to extract the number of Google search results from the results page: "//p[@id='resultStats']/b[3]". Here's an example.
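
If you would rather prototype the idea outside a spreadsheet, the two steps (expanding the query for each value, then pulling the count out with the XPath expression) could look roughly like the Python sketch below. It doesn't contact Google at all: SAMPLE_PAGE is made-up markup that only assumes the results page has a p element with the id resultStats whose third b child holds the count, and the helper names expand_query and result_count are invented for illustration.

```python
import xml.etree.ElementTree as ET


def expand_query(template, variable, values):
    """Return one query string per value, e.g. '"My IQ is 100"'.

    Note: every occurrence of `variable` in the template is replaced.
    """
    return [template.replace(variable, str(value)) for value in values]


def result_count(page_markup):
    """Extract the result count from well-formed results-page markup.

    Uses the same path as the spreadsheet formula; ElementTree needs the
    leading "." to search relative to the root element.
    """
    root = ET.fromstring(page_markup)
    node = root.find(".//p[@id='resultStats']/b[3]")
    if node is None or node.text is None:
        return None
    # Counts are displayed with thousands separators, e.g. "1,230,000".
    return int(node.text.replace(",", ""))


# Hypothetical, simplified markup (an assumption, not a real Google response).
SAMPLE_PAGE = (
    "<html><body>"
    '<p id="resultStats">Results <b>1</b> - <b>10</b> of about '
    "<b>1,230,000</b> for <b>My IQ is 140</b></p>"
    "</body></html>"
)

if __name__ == "__main__":
    for query in expand_query('"My IQ is X"', "X", [100, 120, 140]):
        print(query)
    print(result_count(SAMPLE_PAGE))
```

A real results page is rarely well-formed XML and its markup can change at any time, so treat the parsing step as a placeholder for whatever the page actually contains.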

{ Image licensed as Creative Commons Attribution-NonCommercial. }
