What is the fastest way to update a Google spreadsheet with a lot of data through the Spreadsheet API?


Solution 1

I was able to speed up the batch request shown in the official API guide (http://code.google.com/apis/spreadsheets/data/3.0/developers_guide.html#SendingBatchRequests) by skipping the QUERY step that precedes the UPDATE. This is what the example does:

// Prepare the update
// getCellEntryMap is what makes the update fast.
Map<String, CellEntry> cellEntries = getCellEntryMap(ssSvc, cellFeedUrl, cellAddrs);

CellFeed batchRequest = new CellFeed();
for (CellAddress cellAddr : cellAddrs) {
    URL entryUrl = new URL(cellFeedUrl.toString() + "/" + cellAddr.idString);
    CellEntry batchEntry = new CellEntry(cellEntries.get(cellAddr.idString));
    batchEntry.changeInputValueLocal(cellAddr.idString);
    BatchUtils.setBatchId(batchEntry, cellAddr.idString);
    BatchUtils.setBatchOperationType(batchEntry, BatchOperationType.UPDATE);
    batchRequest.getEntries().add(batchEntry);
}

// Submit the update
Link batchLink = cellFeed.getLink(Link.Rel.FEED_BATCH, Link.Type.ATOM);
CellFeed batchResponse = ssSvc.batch(new URL(batchLink.getHref()), batchRequest);

And this is what I changed it to:

// Build the entries locally instead of querying the server for them first
CellFeed batchRequest = new CellFeed();
for (CellInfo cellAddr : cellsInfo) {
    CellEntry batchEntry = new CellEntry(cellAddr.row, cellAddr.col, cellAddr.idString);
    batchEntry.setId(String.format("%s/%s", worksheet.getCellFeedUrl().toString(), cellAddr.idString));
    BatchUtils.setBatchId(batchEntry, cellAddr.idString);
    BatchUtils.setBatchOperationType(batchEntry, BatchOperationType.UPDATE);
    batchRequest.getEntries().add(batchEntry);
}

// The feed is still fetched once, to obtain the batch link
CellFeed cellFeed = ssSvc.getFeed(worksheet.getCellFeedUrl(), CellFeed.class);
Link batchLink = cellFeed.getLink(Link.Rel.FEED_BATCH, Link.Type.ATOM);

// Set If-Match only for the batch call, then clear it again
ssSvc.setHeader("If-Match", "*");
CellFeed batchResponse = ssSvc.batch(new URL(batchLink.getHref()), batchRequest);
ssSvc.setHeader("If-Match", null);

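Note that the If-Match header has to be set to "*" for this to work. The entries are built locally rather than fetched, so they presumably carry no version information for the server to check; "If-Match: *" asks the server to apply the update whatever the cell's current version is.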
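The modified loop relies on a CellInfo holder that the answer does not show. Below is a minimal sketch of what it might look like, assuming it only needs the three fields the loop reads (row, col, idString) and that idString follows the "R<row>C<col>" form used for batch ids in the API samples. The class itself and the entryId and block helpers are hypothetical names for illustration; they are not part of the GData client library.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for one cell's position; not part of the GData client. */
final class CellInfo {
    final int row;          // 1-based row number
    final int col;          // 1-based column number
    final String idString;  // R1C1-style id, e.g. "R2C3"

    CellInfo(int row, int col) {
        this.row = row;
        this.col = col;
        this.idString = "R" + row + "C" + col;
    }

    /** Builds the entry id used in the batch request: cell feed URL + "/" + idString. */
    String entryId(String cellFeedUrl) {
        return cellFeedUrl + "/" + idString;
    }

    /** Lists every cell of a rows x cols block that starts at R1C1, row by row. */
    static List<CellInfo> block(int rows, int cols) {
        List<CellInfo> cells = new ArrayList<>(rows * cols);
        for (int r = 1; r <= rows; r++) {
            for (int c = 1; c <= cols; c++) {
                cells.add(new CellInfo(r, c));
            }
        }
        return cells;
    }
}
```

For example, CellInfo.block(125, 20) would produce 2500 entries that can be passed to the loop above as cellsInfo.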

Solution 2

The speedup posted by David Tolioupov works. Here is some extra information that helped me.

For an example of how to use the CellFeed, see CellDemo.java: http://gdata-java-client.googlecode.com/svn-history/r51/trunk/java/sample/spreadsheet/cell/CellDemo.java

The example is detailed enough that it helped me optimize my code.

As stated by David Tolioupov, create the CellEntry this way:

CellEntry batchEntry = new CellEntry(cellAddr.row, cellAddr.col, cellAddr.idString);
batchEntry.setId(String.format("%s/%s", cellFeedUrl.toString(), cellAddr.idString)); 

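From the example (note that it fetches each entry from the server with getEntry, which is the step the approach above avoids):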

/**
 * Returns a CellEntry with batch id and operation type that will tell the
 * server to update the specified cell with the given value. The entry is
 * fetched from the server in order to get the current edit link (for
 * optimistic concurrency).
 * 
 * @param row the row number of the cell to operate on
 * @param col the column number of the cell to operate on
 * @param value the value to set in the cell in case of an update
 * 
 * @throws ServiceException when the request causes an error in the Google
 *         Spreadsheets service.
 * @throws IOException when an error occurs in communication with the Google
 *         Spreadsheets service.
 */
private CellEntry createUpdateOperation(int row, int col, String value)
    throws ServiceException, IOException {
  String batchId = "R" + row + "C" + col;
  URL entryUrl = new URL(cellFeedUrl.toString() + "/" + batchId);
  CellEntry entry = service.getEntry(entryUrl, CellEntry.class);
  entry.changeInputValueLocal(value);
  BatchUtils.setBatchId(entry, batchId);
  BatchUtils.setBatchOperationType(entry, BatchOperationType.UPDATE);

  return entry;
}

All that is required is the cellFeedUrl; with it you can create the request and send it.

Solution 3

If you are updating entire rows, you can try working with list-based feeds:

http://code.google.com/intl/fr-FR/apis/spreadsheets/data/3.0/developers_guide.html#UpdatingListRows

It will allow you to update values (not formulas).

If you still have performance problems, you should switch to something like a relational database server or Google's Datastore (if you are working with Google App Engine).

Author: Daniel

Updated on June 04, 2022

Comments

  • Daniel
    Daniel almost 2 years

    I am using the Google Spreadsheet API to update a spreadsheet with a lot of data (hundreds of rows and around twenty columns).

    I have tested making a batch call to update 2500 cells. The call takes around 40 seconds to complete, with the request being about 1 MB and the response about 2 MB.

    Is there any way to get it to work faster?

  • Daniel
    Daniel over 12 years
    Thanks. But I understood there is no way of making a batch update of rows, so that would mean hundreds of requests: google.com/support/forum/p/apps-apis/… Using a different datastore is not an option for me, as my app is built to manipulate Google spreadsheets and not to store this data.
  • Martin Dimitrov
    Martin Dimitrov over 11 years
    Great answer. Thanks. It dropped insert time to 4.5 sec.
  • Adam Wallner
    Adam Wallner almost 11 years
    The If-Match header is the key! Thanks.