Python Pandas, write DataFrame to fixed-width file (to_fwf?)

前端 未结 7 895
予麋鹿
予麋鹿 2020-11-30 11:25

I see that Pandas has read_fwf, but does it have something like DataFrame.to_fwf? I\'m looking for support for field width, numerical precision, a

相关标签:
7条回答
  • 2020-11-30 11:40

    Python, Pandas : write content of DataFrame into text File

    The question aboves answer helped me. It is not the best, but until to_fwf exists this will do the trick for me...

    np.savetxt(r'c:\data\np.txt', df.values, fmt='%d')
    

    or

    np.savetxt(r'c:\data\np.txt', df.values, fmt='%10.5f')
    
    0 讨论(0)
  • 2020-11-30 11:43

    Based on others' answer, here is the snippet I wrote, not the best in coding and performance:

    import pandas as pd
    import pickle
    import numpy as np
    from tabulate import tabulate
    
    
    left_align_gen = lambda length, value: eval(r"'{:<<<length>>}'.format('''<<value>>'''[0:<<length>>])".replace('<<length>>', str(length)).replace('<<value>>', str(value)))
    right_align_gen = lambda length, value: eval(r"'{:><<length>>}'.format('''<<value>>'''[0:<<length>>])".replace('<<length>>', str(length)).replace('<<value>>', str(value)))
    
    # df = pd.read_pickle("dummy.pkl")
    with open("df.pkl", 'rb') as f:
        df = pickle.load(f)
    
    # field width defines here, width of each field
    widths=(22, 255, 14, 255, 14, 255, 255, 255, 255, 255, 255, 22, 255, 22, 255, 255, 255, 22, 14, 14, 255, 255, 255, 2, )
    
    # format datetime
    df['CREATED_DATE'] = df['CREATED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
    df['LAST_MODIFIED_DATE'] = df['LAST_MODIFIED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
    df['TERMS_ACCEPTED_DATE'] = df['TERMS_ACCEPTED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
    df['PRIVACY_ACCEPTED_DATE'] = df['PRIVACY_ACCEPTED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
    
    
    # print(type(df.iloc[0]['CREATED_DATE']))
    # print(df.iloc[0])
    record_line_list = []
    # for row in df.iloc[:10].itertuples():
    for row in [tuple(x) for x in df.to_records(index=False)]:
        record_line_list.append("".join(left_align_gen(length, value) for length, value in zip(widths, row)))
    
    with open('output.txt', 'w') as f:
        f.write('\n'.join(record_line_list))
    

    Github gist

    0 讨论(0)
  • 2020-11-30 11:45

    found a very simple solution! (Python). In the code snapped I am trying to write a DataFrame to a positional File. "finalDataFrame.values.tolist()" will return u a list in which each row of the DataFrame is turn into an another list just a [['Camry',2019,'Toyota'],['Mustang','2016','Ford']]. after that with the help of for loop and if statement i am trying to set its fix length. rest is obvious!

     with open (FilePath,'w') as f:
        for i in finalDataFrame.values.tolist():
            widths=(0,0,0,0,0,0,0)
            if i[2] == 'nan':
                i[2]=''
                for h in range(7):
                    i[2]= i[2] + ' '
            else:
                x=7-len(str(i[2]))
                a=''
                for k in range(x):
                   a=a+' '
                i[2]=str(i[2])+a
    
            if i[3] == '':
                i[3]=''
                for h in range(25):
                    i[3]=i[3]+' '
            else:
                x = 25 - len(i[3])
                print(x)
                a = ''
                for k in range(x):
                    a = a + ' '
                print(a)
                i[3] = i[3] + a
    
    
            i[4] = str(i[4])[:10]
    
            q="".join("%*s" % i for i in zip(widths, i))
            f.write(q+'\n')
    
    0 讨论(0)
  • 2020-11-30 11:52

    I'm sure you found a workaround for this issue but for anyone else who is curious... If you write the DF into a list, you can write it out to a file by giving the 'format as a string'.format(list indices) eg:

    df=df.fillna('')
    outF = 'output.txt'      
    dbOut = open(temp, 'w')
    v = df.values.T.tolist()        
    for i in range(0,dfRows):       
        dbOut.write(( \
        '{:7.2f}{:>6.2f}{:>2.0f}{:>4.0f}{:>5.0f}{:6.2f}{:6.2f}{:6.2f}{:6.1f {:>15}{:>60}'\
        .format(v[0][i],v[1][i],v[2][i],v[3][i],v[4][i],v[5][i],v[6][i],v[7][i],v[8][i],\
        v[9][i],v[10][i]) ))
        dbOut.write("\n")
    dbOut.close
    

    Just make sure to match up each index with the correct format :)

    Hope that helps!

    0 讨论(0)
  • 2020-11-30 11:53

    pandas.DataFrame.to_string() is all you need. The only trick is on how to manage the index.

    Ignore Index

    If you don’t care about the index:

    # write
    df.to_string(filepath, index=False)
    
    # read
    df = pd.read_fwf(filepath)
    

    Handle Index

    If you want to retrieve a pandas.Index or a pandas.MultiIndex:

    # write
    df.reset_index().to_string(filepath, index=False)
    
    # read
    df = pd.read_fwf(filepath).set_index(index_names)
    

    If your Index has no name when writing, reset_index() should assign it to column "index".

    If your MultiIndex has no names, it should be assigned to columns ["level_0", "level_1", ...].

    0 讨论(0)
  • 2020-11-30 11:57

    For custom format for each column you can set format for whole line. fmt param provides formatting for each line

    with open('output.dat') as ofile:
         fmt = '%.0f %02.0f %4.1f %3.0f %4.0f %4.1f %4.0f %4.1f %4.0f'
         np.savetxt(ofile, df.values, fmt=fmt)
    
    0 讨论(0)
提交回复
热议问题