Note: This tool may conflict with Google's TOS, but I thought it was interesting enough to tell you about it.
Many people would like to have feeds for Google search results. They could use them to monitor certain keywords or to build web applications. But Google hasn't shown any interest in providing this feature; moreover, it dropped support for the SOAP API.
In an interesting twist, someone realized that Google does have a way to return results as an XML file, but retrieving them takes some work. The URL looks the same as the standard URL for a Google search, except that you have to add two new parameters:
* ch=[value of a checksum]
* client=navclient-auto
Basically, you'll pretend you're Google Toolbar (that's the explanation for the client parameter) and add a checksum for the query, computed with an algorithm similar to the one that produces the checksum used to look up the PageRank value. Unlike the API, this method has no limitations (although Google might realize you're not Google Toolbar).
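To make the URL format concrete, here is a minimal sketch in Python that only assembles such a URL. It makes a few assumptions: the function name build_toolbar_style_url is made up for illustration, the base address is the ordinary Google search URL with the usual q parameter, and the checksum is passed in as an already computed string, because the checksum algorithm itself isn't described in this post. The sketch doesn't send any request.

```python
from urllib.parse import urlencode

# Assumption: the ordinary Google search endpoint, as "the standard URL" suggests.
SEARCH_URL = "https://www.google.com/search"


def build_toolbar_style_url(query: str, checksum: str) -> str:
    """Return a search URL carrying the two extra parameters from the post.

    `checksum` must be computed elsewhere; this sketch does not implement
    the checksum algorithm and does not contact Google.
    """
    params = {
        "q": query,                  # the normal search query
        "client": "navclient-auto",  # pretend to be Google Toolbar
        "ch": checksum,              # checksum of the query (computed elsewhere)
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # "CHECKSUM_PLACEHOLDER" is a hypothetical stand-in, not a valid checksum.
    print(build_toolbar_style_url("search feeds", "CHECKSUM_PLACEHOLDER"))
```

With a placeholder checksum the request would most likely be rejected, so treat this as an illustration of how the parameters fit together rather than a working client.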
The code and some demos are available here.
Homework:
1. How does this code breach Google's TOS more than screen scraping does?
2. Do you know where this feature is used in Google Toolbar?