How to stop reading a spreadsheet with xlrd at the first empty row?
To check if a single cell is empty, inspect its ctype
attribute. To check an entire row, use the all
function on a list comprehension:
workbook = xlrd.open_workbook(filepath)
sheet = workbook.sheets()[0]
rows = sheet.get_rows()
next(rows) # skip first row
for row in rows:
if all([cell.ctype in (xlrd.XL_CELL_EMPTY, xlrd.XL_CELL_BLANK)
for cell in row]):
break
# process this non-empty row here...
auslander
Geospatial analyst/devop with a preference for Esri but pragmatic enough to use open source where it's faster and easier.
Updated on September 11, 2020Comments
-
auslander over 3 years
I'm using xlrd to sniff through a directory structure and pull out spreadsheets, reading the second row (row 1) down to "do stuff." The issue is that I don't know how to stop the reading/printing at the first empty row. I understand that rows are not "empty" objects, but I'd appreciate a little help showing how to check if all cells are empty. Here is the code I'm working with:
import xlrd import os def excel_file_filter(filename, extensions=['.xls', '.xlsx']): return any(filename.endswith(e) for e in extensions) def get_filenames(root): filename_list = [] for path, subdirs, files in os.walk(root): for filename in filter(excel_file_filter, files): filename_list.append(os.path.join(path, filename)) return filename_list spreadsheets = get_filenames('C:\\Temp') for s in spreadsheets: with xlrd.open_workbook(s) as wb: cs = wb.sheet_by_index(0) num_cols = cs.ncols for row_index in range(1, cs.nrows): print('Row: {}'.format(row_index)) for col_index in range(0, num_cols): cell_object = cs.cell(row_index, col_index) if cell_obj is not xlrd.empty_cell: print('Col #: {} | Value: {}'.format(col_index, cell_obj))
What ends up happening is that it prints all the way through almost 1000 rows, when only the first say, 25 rows have content in them. The amount of content between the spreadsheets varies, so a generic solution (that does not rely on other optional libraries) that helps me understand how to detect an empty row and then break, would be appreciated.