I want to rename two columns using Spark's withColumnRenamed function. Of course, I can write:
data = sqlCont
I couldn't find an easy PySpark solution either, so I built my own, similar to pandas' df.rename(columns={'old_name_1': 'new_name_1', 'old_name_2': 'new_name_2'}).
def rename_columns(df, columns):
    """Rename columns of a Spark DataFrame from a {old_name: new_name} dict."""
    if isinstance(columns, dict):
        for old_name, new_name in columns.items():
            # withColumnRenamed returns a new DataFrame, so rebind df each time
            df = df.withColumnRenamed(old_name, new_name)
        return df
    else:
        raise ValueError("'columns' should be a dict, like {'old_name_1':'new_name_1', 'old_name_2':'new_name_2'}")
So your call will look like data = rename_columns(data, {'x1': 'x3', 'x2': 'x4'}).
It saves me a few lines of code; I hope it helps you too.
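As a sketch of how the helper chains the renames, here is the same idea written with functools.reduce. The FakeDF class below is a hypothetical stand-in for a real Spark DataFrame (it only tracks column names), used so the example runs without a Spark session; with pyspark installed you would pass a real DataFrame instead.

```python
from functools import reduce

class FakeDF:
    """Hypothetical stand-in for a Spark DataFrame; only tracks column names."""
    def __init__(self, columns):
        self.columns = list(columns)

    def withColumnRenamed(self, old, new):
        # Like Spark, return a new object rather than mutating in place.
        return FakeDF([new if c == old else c for c in self.columns])

def rename_columns(df, columns):
    """Fold the dict of renames over the DataFrame, one rename per step."""
    if isinstance(columns, dict):
        return reduce(lambda d, kv: d.withColumnRenamed(*kv), columns.items(), df)
    raise ValueError("'columns' should be a dict, like {'old_name':'new_name'}")

data = FakeDF(['x1', 'x2'])
data = rename_columns(data, {'x1': 'x3', 'x2': 'x4'})
print(data.columns)  # ['x3', 'x4']
```

The reduce form is equivalent to the loop above; each step takes the DataFrame produced by the previous rename, which matches how withColumnRenamed works in Spark.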