I have a Google Sheet where a column may contain no information. While iterating through the rows and looking at that column, if the column is blank, it's not returni
I've dabbled in Sheets API v4, and this is indeed the behavior when you read a range of cells with empty data. It seems to be by design. As stated in the Reading data docs:
Empty trailing rows and columns are omitted.
So one way around it is to write a placeholder that represents an empty value, such as zero, into the cells that would otherwise be blank.
If you pull a range from the Google Sheets API v4, empty row data IS included if it's at the beginning or in the middle of the selected range. Only cells that have no data at the end of the range are omitted. Using this assumption, you can 'fill' the no-data cells in your app code.
For instance, if you selected A1:A5 and A1 has no value, it will still be returned in row data as {}.
If A5 is missing then you'll have an array of length 4 and so know to fill the empty A5. If A4 & A5 are empty then you'll have an array of length 3 and so on.
If none of the range contains data then you'll receive an empty object.
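The 'fill in your app code' idea above could be sketched roughly as follows. This is only an illustration: `pad_rows` is a hypothetical helper, and it assumes `values` is the list of rows you got back from the API (possibly shorter than the range you asked for) and that you know the number of rows and columns you requested.

```python
def pad_rows(values, n_rows, n_cols, fill=None):
    """Pad a ragged list of rows out to n_rows x n_cols.

    values: rows as returned by the API, with trailing empties omitted.
    fill:   placeholder used for every omitted cell (an assumption; pick
            whatever your downstream code expects).
    """
    padded = []
    for i in range(n_rows):
        # Rows past the end of the response were omitted entirely.
        row = list(values[i]) if i < len(values) else []
        # Cells past the end of a row were omitted as trailing empties.
        row = row + [fill] * (n_cols - len(row))
        padded.append(row)
    return padded


# Example: a 3x3 range where the end of row 2 and all of row 3 are empty.
rows = pad_rows([["a", "b", "c"], ["d"]], n_rows=3, n_cols=3)
print(rows)
```

Rows that are already full length are returned unchanged, so the helper is safe to call on every response.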
The way I solved this issue was by converting the values into a Pandas dataframe. I fetched the particular columns that I wanted from my Google Sheet, converted those values into a dataframe, did some data formatting, then converted the dataframe back into a list. Because Pandas fills the omitted trailing rows and columns with null values, each column is preserved. The non-trailing blanks still come back as empty strings, so I also converted those to null values to keep things consistent.
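# Imports needed by this snippet (oauth2client, httplib2, google-api-python-client, pandas)
import pandas as pd
from googleapiclient import discovery
from httplib2 import Http
from oauth2client.service_account import ServiceAccountCredentials

# Placeholders: set these to your own service account key file and scopes
KEY_FILE_LOCATION = 'path to your service account key file'
SCOPES = ['https://www.googleapis.com/auth/spreadsheets.readonly']

# Authenticate and create the service for the Google Sheets API
credentials = ServiceAccountCredentials.from_json_keyfile_name(KEY_FILE_LOCATION, SCOPES)
http = credentials.authorize(Http())
discoveryUrl = 'https://sheets.googleapis.com/$discovery/rest?version=v4'
service = discovery.build('sheets', 'v4',
                          http=http, discoveryServiceUrl=discoveryUrl)

spreadsheetId = 'id of your sheet'
rangeName = 'range of your dataset'
result = service.spreadsheets().values().get(
    spreadsheetId=spreadsheetId, range=rangeName).execute()
values = result.get('values', [])

# Convert values into a dataframe; short rows are padded with null values
df = pd.DataFrame(values)

# Replace all non-trailing blank values ('') returned by the Google Sheets API
# with null values
df_replace = df.replace([''], [None])

# Convert back to a list to insert into Redshift
processed_dataset = df_replace.values.tolist()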
I know that this is super late, but in case someone else has this problem in the future and would like a fix for it, I'll share what I did to work past it. I increased the range of cells I was requesting by one column. Then, in the Google Spreadsheet I was reading from, I filled that extra column with "."s and protected it so the periods can't be changed. This gives you an array with everything you are looking for, including null results, but it does increase each row's length by 1. If that bothers you, you can just make a new array without the last element of each row.
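Making the new array without the last element might look roughly like this. It is a minimal sketch: `drop_sentinel_column` is a hypothetical helper, and it assumes every row really does end with the protected "." cell, raising an error if one does not.

```python
def drop_sentinel_column(rows, sentinel="."):
    """Return rows without their final sentinel cell.

    Assumes each row ends with the sentinel value written in the extra
    column; raises ValueError otherwise so a misaligned row is noticed.
    """
    cleaned = []
    for row in rows:
        if not row or row[-1] != sentinel:
            raise ValueError("row does not end with the sentinel: %r" % (row,))
        cleaned.append(row[:-1])
    return cleaned


# Example: two data columns plus the "." column; blanks arrive as "".
print(drop_sentinel_column([["a", "", "."], ["", "b", "."]]))
```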
Just add:
values.add("");
before:
cells = values.get(0);
This ensures that you do not query an empty list when a cell or a row is blank.
I experienced the same issue using v4 of the Sheets API, but was able to work around it by adding an extra column at the end of my range and using the valueRenderOption argument of the values.get API.
Given three columns, A, B and C, any of which might contain a null value, add an additional column, D, and put an arbitrary value in it, such as 'blank'.
Ensure you capture the new column in your range and add the additional parameter, valueRenderOption: 'FORMATTED_VALUE'.
You should end up with a call similar to this:
sheets.spreadsheets.values.get({
  spreadsheetId: SOME_SHEET_ID,
  range: "AUTOMATION!A:D",
  valueRenderOption: 'FORMATTED_VALUE'
}, (err, res) => {})
This should then give you a consistent length array for each value, returning a blank string "" in the place of the empty cell value.
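If you would rather have nulls than blank strings once the rows are a consistent length, a small post-processing step could do it. The sketch below is in Python rather than JavaScript, and `blanks_to_none` is a hypothetical helper name; it assumes the rows look like the response described above, with "" standing in for each empty cell.

```python
def blanks_to_none(rows):
    """Replace every empty-string cell with None, leaving other cells as-is."""
    return [[None if cell == "" else cell for cell in row] for row in rows]


# Example row from A:D, where B was empty and D holds the filler value.
print(blanks_to_none([["a", "", "c", "blank"]]))
```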