I am grabbing data from the Google Sheets API and running a KS test on it. However, I only want to run the KS test on the number in each string, and the strings also contain words.
Given strings coming from Google Sheets API, run kstest on the last number of each string.
A better way would be to get the numbers straight from the Google Sheets API, store them, and feed them to stats.kstest.
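As a rough sketch of that approach: if each field lives in its own column of the sheet (an assumption, since the question only shows comma-joined strings), the `values` list in a `spreadsheets.values.get` response can be turned into floats directly. The helper name `column_as_floats` and the sample `response` dict below are hypothetical and only illustrate the shape of the data.

```python
def column_as_floats(response, col):
    """Collect column `col` (0-based) of a Sheets values response as floats."""
    numbers = []
    for row in response.get('values', []):
        # Rows may be shorter than expected, so skip missing or empty cells.
        if col >= len(row) or row[col] == '':
            continue
        numbers.append(float(row[col]))
    return numbers

# Hypothetical response, shaped like the result of spreadsheets.values.get
response = {
    'range': 'Sheet1!A1:E3',
    'majorDimension': 'ROWS',
    'values': [
        ['2020-09-15 00:05:13', 'chemsense', 'co', 'concentration', '-0.51058'],
        ['2020-09-15 00:05:43', 'chemsense', 'co', 'concentration', '-0.75889'],
        ['2020-09-15 00:06:09', 'chemsense', 'co', 'concentration'],
    ],
}

x = column_as_floats(response, 4)
print(x)
```

The resulting list can then be passed to stats.kstest without any string splitting.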
You can split the string using str.split, then convert the resulting item to float.
>>> s = '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,'
>>> s.split(',')
['2020-09-15 00:05:43', 'chemsense', 'co', 'concentration', '-0.75889', '']
>>> s.split(',')[4] # get the number (5th item in the list)
'-0.75889'
>>> float(s.split(',')[4]) # convert to float type
-0.75889
>>> round(float(s.split(',')[4]), 2) # round to 2 decimal places
-0.76
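If the number is always the last non-empty field rather than specifically the fifth one, a small helper avoids hard-coding the index. This is only a sketch: the name `last_number` is made up here, and it assumes the final non-empty field is always numeric.

```python
def last_number(line, sep=','):
    """Return the last non-empty field of `line` as a float."""
    fields = [f.strip() for f in line.split(sep)]
    # Drop empty fields, such as the one produced by a trailing separator.
    fields = [f for f in fields if f]
    if not fields:
        raise ValueError(f"no fields in {line!r}")
    return float(fields[-1])

s = '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,'
print(last_number(s))
```

float() raises ValueError if the last field is not a number, which is usually preferable to silently testing bad data.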
from scipy import stats

# Assuming the strings coming back from the API are in a list
# (named `lines` rather than `str`, which would shadow the built-in type)
lines = [
    '2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,',
    '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,',
    '2020-09-15 00:06:09,chemsense,co,concentration,-1.23385,',
    '2020-09-15 00:06:33,chemsense,co,concentration,-1.23191,',
    '2020-09-15 00:06:58,chemsense,co,concentration,-0.94495,',
    '2020-09-15 00:07:23,chemsense,co,concentration,-1.16024,'
]

# Extract the fifth comma-separated field of each string as a float
x = []
for s in lines:
    x.append(float(s.split(',')[4]))

# Compare the sample against the standard normal distribution
print(stats.kstest(x, 'norm'))
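Note that kstest with 'norm' and no extra arguments compares the sample against the standard normal distribution (mean 0, standard deviation 1). If the question is instead whether the data look normal around their own mean, one option is to pass the location and scale through `args`. The following is a sketch under that assumption; estimating the parameters from the same sample makes the reported p-value only approximate.

```python
import statistics
from scipy import stats

x = [-0.51058, -0.75889, -1.23385, -1.23191, -0.94495, -1.16024]

# Compare against the standard normal N(0, 1), as in the answer above.
standard = stats.kstest(x, 'norm')

# Compare against a normal with the sample's own mean and standard deviation.
mu = statistics.mean(x)
sigma = statistics.stdev(x)
fitted = stats.kstest(x, 'norm', args=(mu, sigma))

print(standard.statistic, standard.pvalue)
print(fitted.statistic, fitted.pvalue)
```

Which comparison is appropriate depends on the hypothesis being tested, so treat the second call as an alternative to consider rather than a replacement.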