Question
I'm pretty new to Stack Overflow, so please bear with me if the formatting looks odd.
I have a large dataset with 100+ columns, structured like this:
countrya countryb year variable1 variable2 ...... variable100
I want to split the 100 variables into 100 new dataframes and save each one as a CSV file.
Below is the code I have for creating one new CSV file.
dfm1=pd.melt(df, id_vars=['countrya','countryb','year'], value_vars=['variable1'],
value_name='variable1')
dfm1 = dfm1.drop('variable', axis=1)  # drop returns a new frame, so assign the result
dfm1.to_csv('newdf1.csv')
How can I automate the process? Thank you!
Answer 1:
Here is one way. First, create the data frame.
import pandas as pd
df = pd.DataFrame({
    'country_a': [1, 2, 3],
    'country_b': [4, 5, 6],
    'year': [2018, 2019, 2020],
    'var_a': ['a', 'b', 'c'],
    'var_b': ['x', 'y', 'z']
})
print(df)
   country_a  country_b  year var_a var_b
0          1          4  2018     a     x
1          2          5  2019     b     y
2          3          6  2020     c     z
Second, iterate over the variable columns and write each one, together with the base columns, to its own file.
base_fields = df.columns[:3].to_list() # columns in every file
var_fields = df.columns[3:] # var_a, var_b, ...
for var_field in var_fields:
    file_name = f'{var_field}.csv'
    with open(file_name, 'wt') as handle:
        fields = base_fields + [var_field]
        df.loc[:, fields].to_csv(handle)
    print(f'wrote {fields} to {file_name}')
wrote ['country_a', 'country_b', 'year', 'var_a'] to var_a.csv
wrote ['country_a', 'country_b', 'year', 'var_b'] to var_b.csv
                                            ^          ^
                                            last field and file name change
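A variant of the loop above, as a minimal sketch: it passes the file path straight to `to_csv` instead of opening a handle, and uses `index=False` so the row index is not written as an extra column. The function name `write_column_files` and its `id_cols` and `out_dir` parameters are hypothetical, introduced only for illustration.

```python
from pathlib import Path

import pandas as pd


def write_column_files(df, id_cols, out_dir="."):
    """Write one CSV per non-id column and return the paths written.

    Hypothetical helper: id_cols lists the columns repeated in every file.
    """
    id_cols = list(id_cols)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    # Keep the original column order; skip the identifier columns.
    for col in [c for c in df.columns if c not in id_cols]:
        path = out_dir / f"{col}.csv"
        # index=False keeps the row index out of the output file.
        df[id_cols + [col]].to_csv(path, index=False)
        written.append(path)
    return written
```

Called as `write_column_files(df, ['country_a', 'country_b', 'year'])` on the example frame, this should produce `var_a.csv` and `var_b.csv` in the current directory.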
Answer 2:
You can loop over all the variables and call a function for each one (assuming your sample code is correct).
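def split(df, variable_name):
    # Keep melt's default 'value' column name here: newer pandas versions may
    # reject a value_name that matches an existing column.
    dfm1 = pd.melt(df, id_vars=['countrya', 'countryb', 'year'], value_vars=[variable_name])
    # Drop the 'variable' column that melt adds, then name the values after the source column.
    dfm1 = dfm1.drop('variable', axis=1).rename(columns={'value': variable_name})
    dfm1.to_csv('newdf_{}.csv'.format(variable_name))

for variable_name in ['variable1', 'variable2']:
    split(df, variable_name)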
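To avoid typing out 100 names, the list of variables can be derived from the frame itself. The following is a rough sketch under the assumption that the identifier columns are `countrya`, `countryb` and `year`, as in the question. `ID_VARS`, `melt_one` and `split_all` are hypothetical names used only for illustration.

```python
import pandas as pd

# Assumed identifier columns, taken from the question's layout.
ID_VARS = ['countrya', 'countryb', 'year']


def melt_one(df, variable_name, id_vars=ID_VARS):
    """Return the id columns plus a single value column, built with pd.melt."""
    melted = pd.melt(df, id_vars=id_vars, value_vars=[variable_name])
    # melt adds a 'variable' column holding the source column name; drop it
    # and rename the default 'value' column back to the source name.
    return melted.drop(columns='variable').rename(columns={'value': variable_name})


def split_all(df, id_vars=ID_VARS):
    """Map every non-id column name to its own melted DataFrame."""
    names = [c for c in df.columns if c not in id_vars]
    return {name: melt_one(df, name, id_vars) for name in names}
```

Each frame can then be saved with something like `for name, frame in split_all(df).items(): frame.to_csv(f'newdf_{name}.csv', index=False)`.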
Source: https://stackoverflow.com/questions/63386562/pandas-melt-100-variables-into-100-new-dataframes