I am pretty new to Python and I am trying to cleanse some data. I've attached a link to the data file (two tabs: Raw data and Desired outcome). Please help!
What I am t
Use:
import pandas as pd

# Read the Excel file from sheet_name='Raw data', skipping the first 23 rows, which are not needed
data_xls = pd.read_excel("Example2.xlsx", sheet_name='Raw data', skiprows=23)
# Create the new column names, matching the columns of the desired output
dummy_col_names = ['Internal Link Tracking (non','Campaign Name','Creative','Action','Action 2']
# Use str.split with expand=True to create a dataframe
dummy_df = data_xls['Internal Link Tracking (non-promotions) - ENT (c20)'].str.split('-',expand = True)
# Rename columns as per dummy column list
dummy_df.columns = dummy_col_names
# Drop the original column, which is no longer needed after the split
data_xls.drop('Internal Link Tracking (non-promotions) - ENT (c20)', axis=1, inplace=True)
# Use pd.concat along axis=1 to concat both data_xls and dummy_df along columns
data_xls = pd.concat((data_xls,dummy_df),sort=False,axis=1)
# To preserve the column order of the desired output, use the following code
col_names = data_xls.columns.tolist()
data_xls = data_xls[col_names[:1]+dummy_col_names+col_names[1:-5]]
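
If you want to try the same split, drop, concat and reorder steps without the Excel file, here is a minimal sketch on a small in-memory DataFrame. The sample rows and the column names `Date`, `Tracking`, `Page` and `Visits` are made up for illustration and are not taken from your file. It also passes `n=4` to `str.split`, which is an addition on my part: it caps the number of splits so a stray hyphen in the last field does not produce a sixth column.

```python
import pandas as pd

# Hypothetical sample rows standing in for the 'Raw data' sheet
raw = pd.DataFrame({
    "Date": ["2019-01-01", "2019-01-02"],
    "Tracking": ["home-spring-banner-click-open", "home-summer-tile-view-close"],
    "Visits": [10, 20],
})

# Names for the five pieces of the hyphen-separated column (illustrative)
new_cols = ["Page", "Campaign Name", "Creative", "Action", "Action 2"]

# n=4 allows at most four splits, so no more than five columns come back
parts = raw["Tracking"].str.split("-", n=4, expand=True)
parts.columns = new_cols

# Drop the original column, then place the new columns right after the first one
rest = raw.drop(columns="Tracking")
result = pd.concat([rest.iloc[:, :1], parts, rest.iloc[:, 1:]], axis=1)

print(result.columns.tolist())
```

Note that assigning `parts.columns` still fails if every row has fewer than five pieces, so it is worth checking `parts.shape` on the real data first.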