Wednesday, September 5, 2007

Google Spreadsheets Lets You Import Online Data

Google Spreadsheets wins the prize for the coolest new features launched this summer in Google's web applications.

The magical autofill lets you type the first elements from a series, select them, drag the fill handle (a small blue square) across the range that you want to fill and obtain the rest of the items automatically. It works for time series (days of the week, months), numerical series (arithmetical progressions), a combination between text and numbers (for example: type Q1, Q2 and Google Spreadsheets will add Q3, Q4 etc.) or dates.


You can also try entering related words, products, companies, countries and you may get surprising results; just press Ctrl while dragging the fill handle. It seems that autofill uses data from Google Sets. Look what I got when I typed "pop", "rock", "rap":


The other great new feature uses the "online" part from from "online spreadsheet". Now you can add data from many different sources available online: feeds, HTML files or simply text files (CSV/TSV). The data is automatically refreshed, although it's not very clear how often.

For HTML or XML files, you have to type an XPath expression to describe the data:
=importXML("URL","XPath expression")

For example, to get the Google search results for [live], you need to check the source code and notice that the class attribute has the value "l":
=importXML("http://www.google.com/search?q=live", "//a[@class='l']/@href")


There's a special function to import tables and lists from HTML files:
=importHtml(URL, element, index)

Element can be either "list" or "table", while the index tells an element's order in the page. Here's how to import all the definitions for [live] found by Google:
=importHTML("http://www.google.com/search?q=define:live", "list", 1)


To import structured text files, use this function:
=importData("URL")

This is especially useful if you want to import data from multiple CSV files available online.


Atom and RSS feeds can be imported using this function:
=GoogleReader(URL)

The function has an advanced form that lets you add only some of the items from the feed. For example, here's how to obtain the latest headline from New York Times:
=GoogleReader ("http://graphics8.nytimes.com/services/xml/rss/nyt/HomePage.xml", "items title", "false", 1)

If we couple these new additions with the two other functions that retrieve information from the web and consider that each published spreadsheet is available as a feed, it's clear that you can connect multiple spreadsheets and easily reference data available online.

No comments:

Post a Comment