Returning a DataFrame from a Python function

-上瘾入骨i 2021-02-05 21:03

I am trying to create and return a DataFrame from a Python function:

def create_df():
    data = {'state': ['Ohio','Ohio','Ohio','Nevada','Nevada'],


        
4 Answers
  • 2021-02-05 21:09

    I'm kind of late here, but what about creating a global variable within the function? It should save a step for you.

    import pandas as pd

    def create_df():
        # Bind df at module level instead of as a local variable
        global df

        data = {
            'state': ['Ohio','Ohio','Ohio','Nevada','Nevada'],
            'year': [2000,2001,2002,2001,2002],
            'pop': [1.5,1.7,3.6,2.4,2.9]
        }

        df = pd.DataFrame(data)
    

    Then when you run create_df(), you'll be able to just use df.

    Of course, be careful in your naming strategy if you have a large program so that the value of df doesn't change as various functions execute.

    EDIT: I noticed I got some points for this. Here's another (probably worse) way to do this using exec. This also allows for multiple dataframes to be created, if desired.

    import pandas as pd
    
    def create_df():
        data = {'state': ['Ohio','Ohio','Ohio','Nevada','Nevada'],
               'year': [2000,2001,2002,2001,2002],
               'pop': [1.5,1.7,3.6,2.4,2.9]}
        df = pd.DataFrame(data)
        return df
    
    ### We'll create three dataframes for an example
    for i in range(3):
        exec(f'df_{i} = create_df()')
    

    Then, you can test them out:

    Input: df_0

    Output:

        state  year  pop
    0    Ohio  2000  1.5
    1    Ohio  2001  1.7
    2    Ohio  2002  3.6
    3  Nevada  2001  2.4
    4  Nevada  2002  2.9
    

    Input: df_1

    Output:

        state  year  pop
    0    Ohio  2000  1.5
    1    Ohio  2001  1.7
    2    Ohio  2002  3.6
    3  Nevada  2001  2.4
    4  Nevada  2002  2.9
    

    Etc.
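
    As a hedged aside that is not part of the original answer: the same "several DataFrames" result can be sketched without exec by keeping the frames in a dictionary. The name `frames` is made up for this example; the keys stand in for the generated variable names.

```python
import pandas as pd

def create_df():
    data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
            'year': [2000, 2001, 2002, 2001, 2002],
            'pop': [1.5, 1.7, 3.6, 2.4, 2.9]}
    return pd.DataFrame(data)

# frames is a hypothetical name; each key plays the role of df_0, df_1, df_2
frames = {f'df_{i}': create_df() for i in range(3)}

# Look a frame up by key instead of by a generated variable name
print(frames['df_0'])
```

    Each call to create_df() builds a separate object, so changing one entry does not affect the others.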

  • 2021-02-05 21:11

    When you call create_df(), Python runs the function but does not store the returned value in any variable. That is why you got the error.

    Assign the result of create_df() to a variable, like this: df = create_df()
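
    A minimal sketch of that fix, assuming the function builds the same data that the other answers in this thread use (the question's own code is cut off, so the 'year' and 'pop' columns are taken from those answers):

```python
import pandas as pd

def create_df():
    # Build the frame locally and hand it back to the caller
    data = {
        'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
        'year': [2000, 2001, 2002, 2001, 2002],
        'pop': [1.5, 1.7, 3.6, 2.4, 2.9],
    }
    return pd.DataFrame(data)

# Without this assignment the returned frame would simply be discarded
df = create_df()
print(df.shape)  # (5, 3)
```

    The name on the left-hand side does not have to match the name used inside the function; the function's local variables are not visible to the caller.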

  • 2021-02-05 21:27

    This function explicitly returns two DataFrames:

    import pandas as pd
    import numpy as np

    def return_2DF():
        date = pd.date_range('today', periods=20)
        # The random data has two columns, so it needs exactly two column labels
        DF1 = pd.DataFrame(np.random.rand(20, 2), index=date, columns=list('xy'))
        DF2 = pd.DataFrame(np.random.rand(20, 4), index=date, columns='A B C D'.split())
        # Returning two values packs them into a tuple
        return DF1, DF2
    

    Calling the function and unpacking the two DataFrames:

    one, two = return_2DF()
    
  • 2021-02-05 21:33

    You can return a DataFrame from a function after making a copy of the one passed in, like this:

    def my_function(dataframe):
      my_df=dataframe.copy()
      my_df=my_df.drop(0)
      return my_df
    
    new_df=my_function(old_df)
    print(type(new_df))
    

    Output: <class 'pandas.core.frame.DataFrame'>
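
    To make the effect of the copy visible, here is a self-contained sketch. The answer never defines old_df, so the three-row frame below is made up for illustration.

```python
import pandas as pd

def my_function(dataframe):
    my_df = dataframe.copy()   # work on a copy so the caller's frame is untouched
    my_df = my_df.drop(0)      # drop the row whose index label is 0
    return my_df

# old_df is a hypothetical input frame, not taken from the original answer
old_df = pd.DataFrame({'a': [1, 2, 3]})
new_df = my_function(old_df)

print(len(old_df), len(new_df))  # 3 2
```

    In this particular function, drop already returns a new object rather than changing the frame in place, so the copy matters mainly when the function body modifies the frame directly, for example by assigning a new column.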
