Convert comma decimal separators to dots within a DataFrame

清酒与你 2020-11-27 19:07

I am importing a CSV file like the one below, using pandas.read_csv:

df = pd.read_csv(Input, delimiter=";")

Example of CSV f

3 answers
  • 2020-11-27 19:12

    I think the earlier answer, which passes decimal="," to pandas read_csv, is the preferred option.

    However, I found it to be incompatible with the Python parsing engine. For example, when using skiprows=, read_csv will fall back to this engine, so as far as I know you can't use skiprows= and decimal= in the same read_csv call. Also, I haven't been able to get the decimal= argument to work at all (though that is probably my own mistake).
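
    Whether this limitation applies may depend on the pandas version. The following is a minimal sketch for checking it on your own installation, assuming a reasonably recent pandas release; the sample data and the names raw and df_py are made up for illustration. It forces the Python engine explicitly and combines skiprows= with decimal=:

```python
import io

import pandas as pd

# Hypothetical sample: one junk line, then a semicolon-delimited table
# that uses a comma as the decimal separator.
raw = "exported by some tool\nname;price\nfoo;120,00\nbar;42,50\n"

# Force the Python parsing engine and combine skiprows= with decimal=.
df_py = pd.read_csv(
    io.StringIO(raw),
    sep=";",
    decimal=",",
    skiprows=1,
    engine="python",
)

print(df_py.dtypes)
print(df_py)
```

    If the price column comes back as float64, the combination works on your version; if it comes back as object, the per-column conversion described in this answer is still needed.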

    The long way round I used to achieve the same result is with list comprehensions, .replace and .astype. The major downside of this method is that it needs to be done one column at a time:

    df = pd.DataFrame({'a': ['120,00', '42,00', '18,00', '23,00'], 
                    'b': ['51,23', '18,45', '28,90', '133,00']})
    
    df['a'] = [x.replace(',', '.') for x in df['a']]
    
    df['a'] = df['a'].astype(float)
    

    Now, column a will have float type cells. Column b still contains strings.

    Note that the .replace used here is Python's built-in str.replace, not pandas' Series.replace. The pandas version only replaces values that match exactly, unless it is given a regex.
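
    The one-column-at-a-time limitation can be worked around. As a rough sketch, assuming every column holds comma-decimal strings with no thousands separators, either of the two approaches below converts the whole frame at once. The names converted and converted_regex are just illustrative:

```python
import pandas as pd

df = pd.DataFrame({'a': ['120,00', '42,00', '18,00', '23,00'],
                   'b': ['51,23', '18,45', '28,90', '133,00']})

# Option 1: the pandas string accessor does substring replacement.
# regex=False makes ',' a literal character rather than a pattern.
converted = df.apply(
    lambda col: col.str.replace(',', '.', regex=False).astype(float)
)

# Option 2: DataFrame.replace with regex=True also replaces substrings,
# instead of requiring the whole cell to match.
converted_regex = df.replace(',', '.', regex=True).astype(float)

print(converted.dtypes)
```

    Both results contain only float columns. If some columns are not comma-decimal strings, select the relevant columns first rather than applying this to the whole frame.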

  • 2020-11-27 19:30

    pandas.read_csv has a decimal parameter for this (see the read_csv documentation).

    I.e. try with:

    df = pd.read_csv(Input, delimiter=";", decimal=",")
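
    Since the question's CSV file is not shown in full, here is a self-contained sketch with made-up data (csv_text and its column names are assumptions, not the asker's file). It reads semicolon-delimited text from memory so it can be run without any file on disk:

```python
import io

import pandas as pd

# Hypothetical stand-in for the asker's file: semicolon-delimited,
# with a comma as the decimal separator.
csv_text = "id;price\n1;120,00\n2;42,50\n"

df = pd.read_csv(io.StringIO(csv_text), delimiter=";", decimal=",")

print(df.dtypes)   # the price column should be parsed as a float
print(df)
```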
    
  • 2020-11-27 19:38

    This answers the question of how to change a decimal comma to a decimal dot with pandas.

    $ cat test.py 
    import pandas as pd
    df = pd.read_csv("test.csv", quotechar='"', decimal=",")
    df.to_csv("test2.csv", sep=',', encoding='utf-8', quotechar='"', decimal='.')
    

    Here the decimal separator is specified as a comma for reading and as a dot for writing. So:

    $ cat test.csv 
    header,header2
    1,"2,1"
    3,"4,0"
    $ cat test2.csv 
    ,header,header2
    0,1,2.1
    1,3,4.0
    

    As you can see, the decimal separator has changed to a dot.
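
    The extra leading column in test2.csv is the DataFrame index, which to_csv writes by default. A small variation on the script, sketched here with in-memory buffers instead of files (src, out and result are illustrative names), passes index=False to leave it out:

```python
import io

import pandas as pd

# Same content as test.csv above, held in memory.
src = 'header,header2\n1,"2,1"\n3,"4,0"\n'

df = pd.read_csv(io.StringIO(src), quotechar='"', decimal=",")

# Write back with a dot as the decimal separator and no index column.
out = io.StringIO()
df.to_csv(out, sep=",", quotechar='"', decimal=".", index=False)
result = out.getvalue()

print(result)
```

    With index=False the output keeps only the header and header2 columns, so it has the same shape as the input.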
