What are Python pandas equivalents for R functions like str(), summary(), and head()?

别跟我提以往 2020-11-29 21:26

I'm only aware of the describe() function. Are there any other functions similar to str(), summary(), and head()?

7 answers
  • 2020-11-29 21:55

    In pandas, the info() method produces output very similar to R's str():

    > str(train)
    'data.frame':   891 obs. of  13 variables:
     $ PassengerId: int  1 2 3 4 5 6 7 8 9 10 ...
     $ Survived   : int  0 1 1 1 0 0 0 0 1 1 ...
     $ Pclass     : int  3 1 3 1 3 3 1 3 3 2 ...
     $ Name       : Factor w/ 891 levels "Abbing, Mr. Anthony",..: 109 191 358 277 16 559 520 629 417 581 ...
     $ Sex        : Factor w/ 2 levels "female","male": 2 1 1 1 2 2 2 2 1 1 ...
     $ Age        : num  22 38 26 35 35 NA 54 2 27 14 ...
     $ SibSp      : int  1 1 0 1 0 0 0 3 0 1 ...
     $ Parch      : int  0 0 0 0 0 0 0 1 2 0 ...
     $ Ticket     : Factor w/ 681 levels "110152","110413",..: 524 597 670 50 473 276 86 396 345 133 ...
     $ Fare       : num  7.25 71.28 7.92 53.1 8.05 ...
     $ Cabin      : Factor w/ 148 levels "","A10","A14",..: 1 83 1 57 1 1 131 1 1 1 ...
     $ Embarked   : Factor w/ 4 levels "","C","Q","S": 4 2 4 4 4 3 4 4 4 2 ...
     $ Child      : num  0 0 0 0 0 NA 0 1 0 1 ...
    
    
    train.info()
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 891 entries, 0 to 890
    Data columns (total 12 columns):
    PassengerId    891 non-null int64
    Survived       891 non-null int64
    Pclass         891 non-null int64
    Name           891 non-null object
    Sex            891 non-null object
    Age            714 non-null float64
    SibSp          891 non-null int64
    Parch          891 non-null int64
    Ticket         891 non-null object
    Fare           891 non-null float64
    Cabin          204 non-null object
    Embarked       889 non-null object
    dtypes: float64(2), int64(5), object(5)
    memory usage: 83.6+ KB
    
  • 2020-11-29 21:55

    This provides output similar to R's str(). It presents unique values instead of initial values.

    def rstr(df): return df.shape, df.apply(lambda x: [x.unique()])
    
    print(rstr(iris))
    
    ((150, 5), sepal_length    [[5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.4, 4.8, 4.3,...
    sepal_width     [[3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 2.9, 3.7,...
    petal_length    [[1.4, 1.3, 1.5, 1.7, 1.6, 1.1, 1.2, 1.0, 1.9,...
    petal_width     [[0.2, 0.4, 0.3, 0.1, 0.5, 0.6, 1.4, 1.5, 1.3,...
    class            [[Iris-setosa, Iris-versicolor, Iris-virginica]]
    dtype: object)
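
A variant of the same idea, sketched on the assumption that a tabular result is easier to scan than nested arrays. The helper name `rstr_table`, its `n` parameter, and the tiny stand-in frame are illustrative and not part of the answer above. It returns one row per column with the dtype, the number of unique values, and the first few unique values.

```python
import pandas as pd


def rstr_table(df, n=5):
    """Return one row per column: dtype, unique count, first n unique values."""
    rows = [
        {
            "column": col,
            "dtype": str(df[col].dtype),
            "n_unique": df[col].nunique(dropna=False),
            "first_unique": df[col].unique()[:n].tolist(),
        }
        for col in df.columns
    ]
    return pd.DataFrame(rows).set_index("column")


# Small stand-in for the iris data used above (values are illustrative).
iris = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 4.7, 5.1],
        "class": ["Iris-setosa", "Iris-setosa", "Iris-versicolor", "Iris-virginica"],
    }
)

summary = rstr_table(iris)
print(iris.shape)
print(summary)
```

Returning a DataFrame rather than a tuple keeps the result sortable and filterable, for example to find columns with very few distinct values.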
    
  • 2020-11-29 21:59

    I still prefer str() because it lists some example values. A confusing aspect of info() is that its behavior depends on display options such as pandas.options.display.max_info_columns.

    I think the best alternative is to call info with some other parameters that will force a fixed behavior:

    df.info(verbose=True, show_counts=True)  # show_counts replaces null_counts (pandas >= 1.2)
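
The name of the counts option differs between pandas releases: older versions use `null_counts`, versions from 1.2 on use `show_counts`, and `null_counts` was removed in 2.0. A version-tolerant call could look like the sketch below. The sample frame and the fallback logic are assumptions for illustration. The report is written to a buffer so that it can be inspected or logged.

```python
import io

import pandas as pd

# Illustrative data with one missing value.
df = pd.DataFrame({"Age": [22.0, None, 26.0], "Sex": ["male", "female", "female"]})

# Capture the report in a buffer instead of printing to stdout.
buf = io.StringIO()
try:
    # Newer pandas releases spell the option show_counts.
    df.info(verbose=True, show_counts=True, buf=buf)
except TypeError:
    # Older releases only accept null_counts.
    buf = io.StringIO()
    df.info(verbose=True, null_counts=True, buf=buf)

report = buf.getvalue()
print(report)
```

Passing `verbose` and the counts option explicitly means the output no longer depends on the display options mentioned above.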
    

    And for your other functions:

    summary(df)     | df.describe()
    head(df)        | df.head()
    dim(df)         | df.shape
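
One difference worth knowing: R's summary() also reports on non-numeric columns, while describe() covers only numeric columns by default. Passing `include="all"` gets closer. A minimal sketch with made-up data:

```python
import pandas as pd

# Illustrative data: one numeric column with a missing value, one text column.
df = pd.DataFrame(
    {
        "Age": [22.0, 38.0, 26.0, None],
        "Sex": ["male", "female", "female", "male"],
    }
)

numeric_summary = df.describe()            # numeric columns only (default)
full_summary = df.describe(include="all")  # also count/unique/top/freq for other columns
first_rows = df.head(2)                    # like head(df, 2) in R
n_rows, n_cols = df.shape                  # like dim(df) in R

print(full_summary)
```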
    
  • 2020-11-29 22:08

    For a Python equivalent to the str() function in R, I use the dtypes attribute, which gives the data type of each column.

    In [22]: df2.dtypes
    Out[22]: 
    Survived      int64
    Pclass        int64
    Sex          object
    Age         float64
    SibSp         int64
    Parch         int64
    Ticket       object
    Fare        float64
    Cabin        object
    Embarked     object
    dtype: object
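
dtypes alone omits the sample values that str() prints. The sketch below combines the dtype with the first few values of each column. The helper `str_like` and its output format are hypothetical, loosely modeled on R's layout, and the small frame is a stand-in.

```python
import pandas as pd


def str_like(df, n=5):
    """Print dimensions, then each column's dtype and first n values."""
    lines = [f"DataFrame: {df.shape[0]} obs. of {df.shape[1]} variables:"]
    for col in df.columns:
        sample = " ".join(str(v) for v in df[col].head(n).tolist())
        lines.append(f" $ {col}: {df[col].dtype}  {sample} ...")
    text = "\n".join(lines)
    print(text)
    return text


# Illustrative stand-in for df2.
df2 = pd.DataFrame({"Survived": [0, 1, 1], "Fare": [7.25, 71.28, 7.92]})
out = str_like(df2)
```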
    
  • 2020-11-29 22:08

    I don't know much about R, but here are some leads:

    str => 
    

    This is the difficult one. For functions you can use dir(); calling dir() on a dataset lists all of its attributes and methods, so that may not be what you want.

    summary => describe. 
    

    See the parameters to customize the results.

    head => you can use head(), or use slices. 
    

    Use head() as you already do. To get the first 10 rows of a dataset called ds, use ds[:10]; for the last 10 rows (the tail), use ds[-10:]. Note that ds[:-10] returns everything except the last 10 rows.
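
The slicing forms can be checked against head() and tail() directly. A small sketch with a made-up frame `ds`, showing that `ds[-10:]` selects the last ten rows while `ds[:-10]` drops them:

```python
import pandas as pd

# Illustrative frame with 25 rows.
ds = pd.DataFrame({"x": range(25)})

first_ten = ds[:10]          # same rows as ds.head(10)
last_ten = ds[-10:]          # same rows as ds.tail(10)
all_but_last_ten = ds[:-10]  # not the tail: drops the final 10 rows

print(first_ten.equals(ds.head(10)), last_ten.equals(ds.tail(10)))
print(len(all_but_last_ten))
```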

  • 2020-11-29 22:09
    • summary() ~ describe()
    • head() ~ head()

    I'm not sure about the str() equivalent.
