Question
A similar question has been asked before but has received no responses
I have looked through a number of forums for a solution. The other questions involve a year but mine does not - it is simply H:M:S
I scraped this data from the web, which returned:
Time - 36:42 38:34 1:38:32 1:41:18
Data samples here: Source data 1 and Source data 2
I need this time in minutes, like so: 36.70 38.57 98.53 101.30
To do this, I tried the following:
time_mins = []
for i in time_list:
    h, m, s = i.split(':')
    math = (int(h) * 3600 + int(m) * 60 + int(s))/60
    time_mins.append(math)
But that didn't work because 36:42 is not in the format H:M:S, so I tried to convert 36:42 using this
df1.loc[1:,6] = df1[6]+ timedelta(hours=0)
and this
df1['minutes'] = pd.to_datetime(df1[6], format='%H:%M:%S')
but have had no luck.
Can I do it at the extraction stage? I have to do it for over 500 rows.
row_td = soup.find_all('td')
If not, how can I do it after conversion into a data frame?
Thanks in advance
Answer 1:
If your input (time delta string) only contains hours/minutes/seconds (no days etc.), you could use a custom function that you apply to the column:
import pandas as pd
df = pd.DataFrame({'Time': ['36:42', '38:34', '1:38:32', '1:41:18']})
def to_minutes(s):
    # split string s on ':', reverse so that seconds come first
    # multiply the result as type int with elements from tuple (1/60, 1, 60) to get minutes for each value
    # return the sum of these multiplications
    return sum(int(a)*b for a, b in zip(s.split(':')[::-1], (1/60, 1, 60)))
df['Minutes'] = df['Time'].apply(to_minutes)
# df['Minutes']
# 0 36.700000
# 1 38.566667
# 2 98.533333
# 3 101.300000
# Name: Minutes, dtype: float64
Edit: it took me a while to find it but this is a variation of this question. And my answer here is based on this reply.
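For comparison, here is a minimal sketch of the same conversion using only the standard library's `datetime.timedelta`. It assumes every string is either M:S or H:M:S; the name `to_minutes_td` and the zero-padding step are illustrative choices, not part of the answer above.

```python
from datetime import timedelta

def to_minutes_td(s):
    # Split on ':' and left-pad with zeros so 'M:S' is treated as '0:M:S'
    parts = [int(p) for p in s.split(':')]
    parts = [0] * (3 - len(parts)) + parts
    h, m, sec = parts
    # Let timedelta do the unit arithmetic, then convert seconds to minutes
    return timedelta(hours=h, minutes=m, seconds=sec).total_seconds() / 60

times = ['36:42', '38:34', '1:38:32', '1:41:18']
minutes = [round(to_minutes_td(t), 2) for t in times]
print(minutes)  # [36.7, 38.57, 98.53, 101.3]
```

Because the function takes a single string, it could be applied to a column in the same way as `to_minutes`.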
Answer 2:
You were on the right track. The version below modifies your code slightly and returns the minutes.
Create a function
import numpy as np
def get_time(i):
    ilist = i.split(':')
    if(len(ilist)==3):
        h, m, s = i.split(':')
    else:
        m, s = i.split(':')
        h = 0
    math = (int(h) * 3600 + int(m) * 60 + int(s))/60
    return np.round(math, 2)
Call the function using split
x = "36:42 38:34 1:38:32 1:41:18"
x = x.split(" ")
xmin = [get_time(i) for i in x]
xmin
Output
[36.7, 38.57, 98.53, 101.3]
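The question also asks whether the conversion can happen at the extraction stage. One possible sketch, assuming the time strings have already been pulled out of the scraped cells as plain text: the `cells` list below is a hypothetical stand-in for that text, and `cell_to_minutes` is an illustrative name. It uses the built-in `round` rather than NumPy and raises an error on anything that is not M:S or H:M:S.

```python
def cell_to_minutes(text):
    """Convert 'M:S' or 'H:M:S' cell text to minutes, rounded to 2 places."""
    parts = text.strip().split(':')
    if len(parts) == 2:
        parts = ['0'] + parts  # no hour field, so treat it as 0 hours
    if len(parts) != 3:
        raise ValueError(f"unexpected time format: {text!r}")
    h, m, s = (int(p) for p in parts)
    return round((h * 3600 + m * 60 + s) / 60, 2)

# Hypothetical stand-in for the text of the scraped time cells
# (stray whitespace included on purpose).
cells = [' 36:42', '38:34 ', '1:38:32', '1:41:18']
time_mins = [cell_to_minutes(c) for c in cells]
print(time_mins)  # [36.7, 38.57, 98.53, 101.3]
```

Failing loudly on unexpected input is a design choice here: with over 500 scraped rows, a silent wrong value is harder to spot than an exception.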
Answer 3:
I have no experience with pandas, but here is something you may find useful
...
time_mins = []
for i in time_list:
    parts = i.split(':')
    minutes_multiplier = 1/60
    math = 0
    for part in reversed(parts):
        math += (minutes_multiplier * int(part))
        minutes_multiplier *= 60
    time_mins.append(math)
...
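As a self-contained illustration of the loop above, here is a sketch that wraps the same multiplier idea in a function. The sample `time_list` is made up for the example, and the rounding to two places is an added step. Since the multiplier starts at 1/60 for the last field and grows by a factor of 60 per field, the number of fields does not need to be checked.

```python
def to_minutes(time_str):
    total = 0.0
    multiplier = 1 / 60  # the last field is seconds: 1/60 of a minute
    for part in reversed(time_str.split(':')):
        total += multiplier * int(part)
        multiplier *= 60  # next field to the left is worth 60 times more
    return total

# Made-up sample input: M:S, H:M:S and a seconds-only value
time_list = ['36:42', '1:38:32', '45']
time_mins = [round(to_minutes(t), 2) for t in time_list]
print(time_mins)  # [36.7, 98.53, 0.75]
```

The unrounded results can carry small floating-point error because 1/60 is not exactly representable, which is why the example rounds before printing.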
Answer 4:
I had earlier commented that @NileshIngle's response above was not working because it raised
NameError: name 'h' is not defined.
A simple correction was required: assign h before m and s, since h is the first variable referenced.
h = 0 # move this above
m, s = i.split(':')
def get_time(i):
    ilist = i.split(':')
    if(len(ilist)==3):
        h, m, s = i.split(':')
    else:
        h = 0
        m, s = i.split(':')
    math = (int(h) * 3600 + int(m) * 60 + int(s))/60
    return np.round(math, 2)
I would like to thank @MrFuppes, @NileshIngle and @KaustubhBadrike for taking the time to respond. I have learned three different methods.
Source: https://stackoverflow.com/questions/61995881/how-to-convert-objects-or-string-to-time-format-when-the-input-string-object-is