pandas | 易学教程

Python Pandas set_index function: KeyError: “None of [] are in the columns”

Question: I am currently going through the book "Hands-On Machine Learning..." by Aurélien Géron. However, I am getting the following error message (it is somewhat cumbersome to reproduce because two CSV downloads are required: OECD and IMF). Error message: File "C:\Users\xxx\Miniconda3\lib\site-packages\pandas\core\frame.py", line 4548, in set_index raise KeyError(f"None of {missing} are in the columns") KeyError: "None of ['Country'] are in the columns" The code: import matplotlib.pyplot
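The excerpt is cut off before the loading code, so the actual cause cannot be confirmed from it. One common way to hit this KeyError is reading a file with the wrong delimiter, so that the whole header lands in a single column. The sketch below illustrates that under loud assumptions: the in-memory text, its tab separator, and the numbers are hypothetical stand-ins, not the OECD or IMF data.

```python
import io

import pandas as pd

# Hypothetical stand-in for a downloaded file: tab-separated, made-up values.
raw = "Country\tGDP per capita\nFrance\t1.0\nSpain\t2.0\n"

# Read with the default comma delimiter: the header collapses into one column,
# so set_index("Country") would raise the KeyError from the question.
wrong = pd.read_csv(io.StringIO(raw))
print(list(wrong.columns))

# Read with the matching delimiter: 'Country' is now a real column.
right = pd.read_csv(io.StringIO(raw), delimiter="\t")
print(list(right.columns))

indexed = right.set_index("Country")
```

Printing `df.columns` right before the `set_index` call is usually the quickest way to see whether the expected column name actually exists.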

Finding rows with same column values in pandas dataframe

Question: I have two dataframes with different numbers of columns, where four columns can have the same values in both dataframes. I want to make a new column in df1 that takes the value 1 if there is a row in df2 that has the same values for columns 'A', 'B', 'C', and 'D' as a row in df1. If there isn't such a row, I want the value to be 0. Columns 'E' and 'F' are not important for checking the values. Is there a pandas function that can do this, or do I have to do this in a loop? For example: df1 = A B C D E F 1 1
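A loop is probably not needed. A minimal sketch, assuming small made-up frames (the example data in the excerpt is truncated) and a hypothetical output column name `in_df2`, is a left merge on the four key columns with `indicator=True`:

```python
import pandas as pd

keys = ["A", "B", "C", "D"]

# Made-up data for illustration only.
df1 = pd.DataFrame({
    "A": [1, 2, 3], "B": [1, 2, 3], "C": [1, 2, 3], "D": [1, 2, 3],
    "E": [7, 8, 9], "F": [0, 0, 0],
})
df2 = pd.DataFrame({
    "A": [1, 3, 3], "B": [1, 3, 3], "C": [1, 3, 3], "D": [1, 3, 3],
})

# Deduplicate the key rows of df2 so the left merge cannot add rows to df1.
matches = df2[keys].drop_duplicates()

# indicator=True adds a '_merge' column: 'both' means the key row exists in df2.
merged = df1.merge(matches, on=keys, how="left", indicator=True)
df1["in_df2"] = (merged["_merge"] == "both").astype(int).to_numpy()
print(df1)
```

Dropping duplicates from `df2` first matters: without it, a key that occurs twice in `df2` would duplicate the matching row of `df1` and the flag column would no longer line up.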

python pandas use map with regular expressions

Question: I have a dict: dealer = { 'ESSELUNGA': 'Spesa', 'DECATHLON 00000120': 'Sport', 'LEROY MERLIN': 'Casa', 'CONAD 8429': 'Spesa', 'IKEA': 'Casa', 'F.LLI MADAFFARI': 'Spesa', 'SUPERMERCATO IL GIGANT': 'Spesa', 'NATURASI SPA': 'Spesa', 'ESSELUNGA SETTIMO MILANE': 'Spesa' } and I want to map it to a pandas df: entries.Categoria = entries.Commerciante.map(dealer) Is there a way to use regex matching when mapping on the "Commerciante" column? That way I could rewrite dealer like this: dealer = { 'ESSELUNGA': 'Spesa
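`Series.map` with a dict only does exact key lookups, so one possible approach is to pass a function that tries each pattern in turn. The sketch below is an assumption-laden illustration: the shortened pattern dict, the helper name `categorize`, and the sample rows (including `UNKNOWN SHOP`) are hypothetical, since the asker's rewritten dict is truncated.

```python
import re

import pandas as pd

# Hypothetical shortened patterns; the first matching pattern wins.
dealer_patterns = {
    r"ESSELUNGA": "Spesa",
    r"DECATHLON": "Sport",
    r"LEROY MERLIN": "Casa",
    r"CONAD": "Spesa",
    r"IKEA": "Casa",
}

# Made-up sample rows for illustration.
entries = pd.DataFrame({
    "Commerciante": [
        "ESSELUNGA SETTIMO MILANE",
        "DECATHLON 00000120",
        "CONAD 8429",
        "UNKNOWN SHOP",
    ]
})


def categorize(name):
    """Return the category of the first pattern found in name, else None."""
    for pattern, category in dealer_patterns.items():
        if re.search(pattern, name):
            return category
    return None


entries["Categoria"] = entries["Commerciante"].map(categorize)
print(entries)
```

Because the dict is scanned in insertion order, more specific patterns should be listed before more general ones if they can overlap.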

show error bar in multi line plot using matplotlib

Question: I've created a multi-line plot using matplotlib, and now I want to show the min-max value for each parameter on the X-axis. My code is below: import numpy as np import pandas as pd from pandas import DataFrame import matplotlib.pyplot as plt from matplotlib import pyplot as plt import seaborn as sns df = pd.DataFrame({'Time': ['D=0','D=2','D=5','D=X'], 'Latency': [74.92, 75.32, 79.64, 100], 'Delay': [18.2,80,82,84] }) plt.plot( 'Time', 'Latency', data=df, marker='s', color='black', markersize=4,
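One way to draw a min-max range is `Axes.errorbar` with an asymmetric `yerr` of shape (2, N): the first row is the distance down to the minimum, the second the distance up to the maximum. The sketch below reuses the question's dataframe; the `latency_min` and `latency_max` arrays are hypothetical, since the excerpt does not include the min-max data.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Time": ["D=0", "D=2", "D=5", "D=X"],
    "Latency": [74.92, 75.32, 79.64, 100],
    "Delay": [18.2, 80, 82, 84],
})

# Hypothetical min/max per point; replace with the measured values.
latency_min = np.array([70.0, 72.0, 75.0, 95.0])
latency_max = np.array([80.0, 78.0, 85.0, 105.0])

# Row 0: distance below each point, row 1: distance above each point.
yerr = np.vstack([
    df["Latency"].to_numpy() - latency_min,
    latency_max - df["Latency"].to_numpy(),
])

fig, ax = plt.subplots()
ax.errorbar(df["Time"], df["Latency"], yerr=yerr, marker="s",
            color="black", markersize=4, capsize=3, label="Latency")
ax.plot(df["Time"], df["Delay"], marker="o", label="Delay")
ax.legend()
plt.close(fig)  # call fig.savefig(...) or plt.show() before this to view it
```

The same pattern can be repeated for `Delay` with its own min and max arrays.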

Plotly dash refreshing global data on reload

Question: Imagine I have a Dash application where I want the global data to refresh on page reload. I'm using a function to serve the layout as described here. However, I'm not sure how or where I should define df so that I can use it in callbacks (for example, to subset the df based on some input and pass it to a layout table). My code below reloads the data on page refresh, but the callback cannot access the df. I'm very new to Dash, so apologies in advance for potentially dumb

Moving average on pandas.groupby object that respects time

Question: Given a pandas dataframe in the following format: toy = pd.DataFrame({ 'id': [1,2,3, 1,2,3, 1,2,3], 'date': ['2015-05-13', '2015-05-13', '2015-05-13', '2016-02-12', '2016-02-12', '2016-02-12', '2018-07-23', '2018-07-23', '2018-07-23'], 'my_metric': [395, 634, 165, 144, 305, 293, 23, 395, 242] }) # Make sure 'date' has datetime format toy.date = pd.to_datetime(toy.date) The my_metric column contains some (random) metric for which I wish to compute a time-dependent moving average, conditional on the
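A time-aware moving average per group can be sketched with a datetime index and an offset window passed to `rolling`. The excerpt is cut off before the exact condition, so grouping by `id` and the `"365D"` window below are assumptions chosen for illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    "id": [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "date": ["2015-05-13", "2015-05-13", "2015-05-13",
             "2016-02-12", "2016-02-12", "2016-02-12",
             "2018-07-23", "2018-07-23", "2018-07-23"],
    "my_metric": [395, 634, 165, 144, 305, 293, 23, 395, 242],
})
toy["date"] = pd.to_datetime(toy["date"])

# An offset window ("365D") needs a sorted datetime index; the window then
# covers the preceding 365 days of each row instead of a fixed row count.
rolled = (
    toy.sort_values("date")
       .set_index("date")
       .groupby("id")["my_metric"]
       .rolling("365D")   # assumed window length
       .mean()
)
print(rolled)
```

For `id` 1, the 2016-02-12 row averages 395 and 144 because the 2015-05-13 observation is within 365 days, while the 2018-07-23 row stands alone.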

Handling error “TypeError: Expected tuple, got str” loading a CSV to pandas multilevel and multiindex (pandas)

Question: I'm trying to load a CSV file (this file) to create a dataframe with a multi-level index and multi-level columns. It has 5 (five) index levels and 3 (three) levels in the columns. How can I do this? Here is the code: df = pd.read_csv('./teste.csv' ,index_col=[0,1,2,3,4] ,header=[0,1,2,3] ,skipinitialspace=True ,tupleize_cols=True) df.columns = pd.MultiIndex.from_tuples(df.columns) Expected output: variables u \ level 1 days 1 2 times 00h 06h 12h 18h 00h wsid lat lon start prcp_24 329 -43.969397 -19.883945 2007-03-18 10:00:00 72.0 0
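When `header` and `index_col` are both lists, `read_csv` builds the column and row `MultiIndex` objects itself, so no `MultiIndex.from_tuples` step should be needed; the `tupleize_cols` argument was also removed in pandas 1.0. The sketch below shows this as a round trip on a tiny made-up frame with two levels each, because the linked `teste.csv` is not available here.

```python
import io

import pandas as pd

# Made-up frame: two column levels and two index levels (the real file has more).
columns = pd.MultiIndex.from_product(
    [["u", "v"], ["00h", "12h"]], names=["variables", "times"]
)
index = pd.MultiIndex.from_tuples(
    [(329, -43.97), (330, -44.0)], names=["wsid", "lat"]
)
df = pd.DataFrame(
    [[72.0, 0.0, 1.5, 2.0], [0.0, 3.0, 4.0, 5.0]],
    index=index, columns=columns,
)

# Write to an in-memory CSV, then read it back.
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)

# One list entry per header row and per index column.
loaded = pd.read_csv(buffer, header=[0, 1], index_col=[0, 1])
print(loaded.columns.nlevels, loaded.index.nlevels)
```

For the file in the question, the lists would need as many entries as there are header rows and index columns in that file.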

subtracting two columns from pandas dataframe and store the result in third column [closed]

Question: I have a DataFrame, df, with 3 columns and I want to perform subtraction as follows: df['available'] = df['recommended'] - df['manual input'] But I am getting an error stating: unsupported operand type(s) for -: 'int' and 'str' I have also tried doing df['available'] = df[
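The error message suggests that one of the columns holds strings rather than numbers. A minimal sketch, assuming `manual input` is the string column and using made-up values (including a hypothetical non-numeric entry `"n/a"`), is to convert it with `pd.to_numeric` before subtracting:

```python
import pandas as pd

# Made-up data: 'manual input' arrives as strings, one of them not a number.
df = pd.DataFrame({
    "recommended": [10, 20, 30],
    "manual input": ["3", "5", "n/a"],
})

# errors="coerce" turns unparseable entries into NaN instead of raising.
df["manual input"] = pd.to_numeric(df["manual input"], errors="coerce")

df["available"] = df["recommended"] - df["manual input"]
print(df)
```

Checking `df.dtypes` first shows which column is `object` and therefore needs the conversion.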