问题
Here I have an example file of multiple objects each measured at the same time-points (also ND.T represents each unique time point). I would like to split this file into separate files (using a python script) containing all objects unique to each time-point still containing the header.
Original file:
ID ND.T Time [s] Position X [%s] Position Y [%s] Speed [%s] Area [%s] Width [%s] MeanIntensity
1 1 3.87 417.57 11.46 0.06 339.48 14.1 245.65
1 2 8.72 417.37 11.68 0.04 342.61 14.15 239.34
1 3 13.39 417.57 11.66 0.04 344.17 14.3 239.48
2 1 3.87 439.01 6.59 0.02 342.61 11.66 204.47
2 2 8.72 438.97 6.65 0.007 342.61 10.7 197.96
2 3 13.39 438.94 6.66 0.03 345.74 11.03 214.74
Time_3.87.csv
ID ND.T Time [s] Position X [%s] Position Y [%s] Speed [%s] Area [%s] Width [%s] MeanIntensity
1 1 3.87 417.57 11.46 0.06 339.48 14.1 245.65
2 1 3.87 439.01 6.59 0.02 342.61 11.66 204.47
Time_8.72.csv
ID ND.T Time [s] Position X [%s] Position Y [%s] Speed [%s] Area [%s] Width [%s] MeanIntensity
1 2 8.72 417.37 11.68 0.04 342.61 14.15 239.34
2 2 8.72 438.97 6.65 0.007 342.61 10.7 197.96
Time_13.39.csv
ID ND.T Time [s] Position X [%s] Position Y [%s] Speed [%s] Area [%s] Width [%s] MeanIntensity
1 3 13.39 417.57 11.66 0.04 344.17 14.3 239.48
2 3 13.39 438.94 6.66 0.03 345.74 11.03 214.74
Example 2:
ID ND.T Time [s] Position X [%s] Position Y [%s] Speed [%s] Area [%s] Width [%s] MeanIntensity
1 1 3.87 417.57 11.46 0.06 339.48 14.1 245.65
1 2 8.72 417.37 11.68 0.04 342.61 14.15 239.34
1 3 13.39 417.57 11.66 0.04 344.17 14.3 239.48
1 4 18.1 417.73 11.71 0.04 337.92 14.14 225.17
1 5 22.81 417.83 11.89 0.03 344.17 14.64 233.3
1 6 27.48 417.69 11.83 0.02 345.74 14.23 238
1 7 32.16 417.65 11.94 0.03 345.74 14.71 230.75
2 1 3.87 439.01 6.59 0.02 342.61 11.66 204.47
2 2 8.72 438.97 6.65 0.007 342.61 10.7 197.96
2 3 13.39 438.94 6.66 0.03 345.74 11.03 214.74
2 4 18.1 438.9 6.53 0.04 342.61 10.46 202.9
2 5 22.81 438.97 6.7 0.02 342.61 10.3 194.32
2 6 27.48 438.89 6.71 0.006 350.43 11 219.41
2 7 32.16 438.87 6.74 0.05 348.87 10.36 219.58
回答1:
You can use pandas to achieve this:
import pandas as pd
df = pd.read_csv(your_file)
df.groupby('Time [s]').apply(lambda x: x.to_csv(str(x.name) + '.csv'))
The above will load your csv using read_csv and then group on the Time [s] column and use this to name the file
You can see that the df is grouped on the Time [s]:
In [108]:
df.groupby('Time [s]').apply(lambda x: print(x))
ID ND.T Time [s] Position X [%s] Position Y [%s] Speed [%s] \
0 1 1 3.87 417.57 11.46 0.06
3 2 1 3.87 439.01 6.59 0.02
Area [%s] Width [%s] MeanIntensity
0 339.48 14.10 245.65
3 342.61 11.66 204.47
ID ND.T Time [s] Position X [%s] Position Y [%s] Speed [%s] \
0 1 1 3.87 417.57 11.46 0.06
3 2 1 3.87 439.01 6.59 0.02
Area [%s] Width [%s] MeanIntensity
0 339.48 14.10 245.65
3 342.61 11.66 204.47
ID ND.T Time [s] Position X [%s] Position Y [%s] Speed [%s] \
1 1 2 8.72 417.37 11.68 0.040
4 2 2 8.72 438.97 6.65 0.007
Area [%s] Width [%s] MeanIntensity
1 342.61 14.15 239.34
4 342.61 10.70 197.96
ID ND.T Time [s] Position X [%s] Position Y [%s] Speed [%s] \
2 1 3 13.39 417.57 11.66 0.04
5 2 3 13.39 438.94 6.66 0.03
Area [%s] Width [%s] MeanIntensity
2 344.17 14.30 239.48
5 345.74 11.03 214.74
Out[108]:
Empty DataFrame
Columns: []
Index: []
Here groupby will group on 'Time [s]' column, we then call apply to apply a lambda
where we call the method to_csv
on each grouping, we can access the group name using name
attribute which is of dtype
int
so we cast to str
and construct our csv name:
In [109]:
df.groupby('Time [s]').apply(lambda x: print(str(x.name) + '.csv'))
3.87.csv
8.72.csv
13.39.csv
Out[109]:
Empty DataFrame
Columns: []
Index: []
来源:https://stackoverflow.com/questions/37683085/splitting-csv-file-of-multiple-objects-over-time-by-time-point