pandas

Python Pandas set_index function: KeyError: “None of [] are in the columns”

自古美人都是妖i submitted on 2021-02-19 06:47:06
Question: I am currently going through the book "Hands-On Machine Learning..." by Aurélien Géron. However, I am getting the following error message (it is somewhat cumbersome to reproduce because the following two CSV downloads are required: OECD, IMF). Error message: File "C:\Users\xxx\Miniconda3\lib\site-packages\pandas\core\frame.py", line 4548, in set_index raise KeyError(f"None of {missing} are in the columns") KeyError: "None of ['Country'] are in the columns" The code: import matplotlib.pyplot
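
A minimal sketch of how this KeyError can arise and how it might be diagnosed. The frame below is hypothetical (it is not the OECD or IMF data from the question): its header carries a trailing space, so 'Country' is not literally among the columns. Whether that is the cause in the original code is an assumption; printing `df.columns` is simply a reasonable first check.

```python
import pandas as pd

# Hypothetical data: the first header is "Country " (trailing space),
# which is one common reason a seemingly present column is "missing".
df = pd.DataFrame({"Country ": ["France", "Japan"], "GDP per capita": [1, 2]})

message = None
try:
    df.set_index("Country")
except KeyError as exc:
    # Same error as in the question: the label is not among the columns.
    message = str(exc)

# Inspect the real column labels before calling set_index.
print(df.columns.tolist())

# Normalise the labels, after which set_index succeeds.
df.columns = df.columns.str.strip()
indexed = df.set_index("Country")
```

If the labels look correct but the frame has a single wide column, the delimiter or encoding passed to `read_csv` is the next thing worth checking.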

Finding rows with same column values in pandas dataframe

帅比萌擦擦* submitted on 2021-02-19 06:37:07
Question: I have two dataframes with different numbers of columns, where four columns can have the same values in both dataframes. I want to make a new column in df1 that takes the value 1 if there is a row in df2 that has the same values for columns 'A', 'B', 'C', and 'D' as a row in df1. If there isn't such a row, I want the value to be 0. Columns 'E' and 'F' are not important for checking the values. Is there a pandas function that can do this, or do I have to do this in a loop? For example: df1 = A B C D E F 1 1
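
One loop-free way to do this is a left merge with `indicator=True`. The sketch below uses hypothetical frames (the example data in the question is truncated), and the column name `match` is an assumption.

```python
import pandas as pd

# Hypothetical frames; only columns A-D take part in the comparison.
df1 = pd.DataFrame({
    "A": [1, 2, 3], "B": [1, 2, 3], "C": [1, 2, 3], "D": [1, 2, 3],
    "E": [9, 9, 9], "F": [8, 8, 8],
})
df2 = pd.DataFrame({
    "A": [1, 3, 3], "B": [1, 3, 3], "C": [1, 3, 3], "D": [1, 3, 3],
    "G": [0, 0, 0],
})

keys = ["A", "B", "C", "D"]

# Deduplicate df2's keys so repeated matches do not duplicate rows of df1.
flags = df1.merge(df2[keys].drop_duplicates(), on=keys, how="left", indicator=True)

# "_merge" is "both" where a matching row exists in df2, "left_only" otherwise.
df1["match"] = (flags["_merge"] == "both").astype(int).to_numpy()
```

The left merge keeps the row order of df1, which is why the flags can be assigned back positionally.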

python pandas use map with regular expressions

走远了吗. submitted on 2021-02-19 06:36:05
Question: I have a dict: dealer = { 'ESSELUNGA': 'Spesa', 'DECATHLON 00000120': 'Sport', 'LEROY MERLIN': 'Casa', 'CONAD 8429': 'Spesa', 'IKEA': 'Casa', 'F.LLI MADAFFARI': 'Spesa', 'SUPERMERCATO IL GIGANT': 'Spesa', 'NATURASI SPA': 'Spesa', 'ESSELUNGA SETTIMO MILANE': 'Spesa' } and I want to map it to a pandas df: entries.Categoria = entries.Commerciante.map(dealer) Is there a way to use regex to match map on the "Commerciante" column? In this way I can rewrite dealer as this: dealer = { 'ESSELUNGA': 'Spesa
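
A possible sketch: `Series.map` also accepts a function, so each name can be tested against the dict keys as regular expressions. The shortened keys and the sample rows below are assumptions, since the rewritten dict in the question is cut off.

```python
import re

import pandas as pd

# Assumed shortened patterns; the first pattern that matches wins,
# so dict order matters when patterns overlap.
dealer = {
    r"ESSELUNGA": "Spesa",
    r"DECATHLON": "Sport",
    r"LEROY MERLIN": "Casa",
    r"CONAD": "Spesa",
    r"IKEA": "Casa",
}

# Hypothetical entries for illustration.
entries = pd.DataFrame({
    "Commerciante": [
        "ESSELUNGA SETTIMO MILANE",
        "DECATHLON 00000120",
        "CONAD 8429",
        "BAR SPORT",
    ]
})


def categorize(name):
    """Return the category of the first pattern found in name, else None."""
    for pattern, category in dealer.items():
        if re.search(pattern, name):
            return category
    return None


entries["Categoria"] = entries["Commerciante"].map(categorize)
```

Rows that match no pattern are left empty, which makes unmapped dealers easy to find afterwards.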

show error bar in multi line plot using matplotlib

落花浮王杯 submitted on 2021-02-19 06:14:53
Question: I've created a multi-line plot using matplotlib, and now I want to show the min-max value for each parameter on the x-axis. My code is below: import numpy as np import pandas as pd from pandas import DataFrame import matplotlib.pyplot as plt from matplotlib import pyplot as plt import seaborn as sns df = pd.DataFrame({'Time': ['D=0','D=2','D=5','D=X'], 'Latency': [74.92, 75.32, 79.64, 100], 'Delay': [18.2,80,82,84] }) plt.plot( 'Time', 'Latency', data=df, marker='s', color='black', markersize=4,
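
A hedged sketch of one way to draw min-max bars with `Axes.errorbar`. The question does not give the minimum and maximum values, so `latency_min` and `latency_max` below are hypothetical placeholders; only the Latency series is shown.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Time": ["D=0", "D=2", "D=5", "D=X"],
    "Latency": [74.92, 75.32, 79.64, 100],
    "Delay": [18.2, 80, 82, 84],
})

# Hypothetical observed extremes for each x position (not from the question).
latency_min = np.array([70.0, 72.0, 75.0, 95.0])
latency_max = np.array([80.0, 78.0, 85.0, 105.0])

# errorbar expects distances from the plotted value, shaped (2, N):
# first row = distance down to the minimum, second row = distance up to the maximum.
yerr = np.vstack([df["Latency"] - latency_min, latency_max - df["Latency"]])

x = np.arange(len(df))
fig, ax = plt.subplots()
ax.errorbar(x, df["Latency"], yerr=yerr, marker="s", color="black",
            markersize=4, capsize=3, label="Latency")
ax.set_xticks(x)
ax.set_xticklabels(df["Time"])
ax.legend()
```

The same call can be repeated for the Delay series with its own extremes.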

show error bar in multi line plot using matplotlib

让人想犯罪 __ submitted on 2021-02-19 06:13:44
Question: I've created a multi-line plot using matplotlib, and now I want to show the min-max value for each parameter on the x-axis. My code is below: import numpy as np import pandas as pd from pandas import DataFrame import matplotlib.pyplot as plt from matplotlib import pyplot as plt import seaborn as sns df = pd.DataFrame({'Time': ['D=0','D=2','D=5','D=X'], 'Latency': [74.92, 75.32, 79.64, 100], 'Delay': [18.2,80,82,84] }) plt.plot( 'Time', 'Latency', data=df, marker='s', color='black', markersize=4,

Plotly dash refreshing global data on reload

家住魔仙堡 submitted on 2021-02-19 06:11:14
Question: Imagine I have a dash application where I want the global data to refresh on page reload. I'm using a function to serve the layout as described here. However, I'm not sure how/where I should define df such that I can use it in callbacks (like in a case where I'd like to subset the df based on some input and pass it to a layout table). My code below reloads the data on page refresh, but the callback cannot access the df. I'm very new to dash so apologies in advance for potentially dumb

Plotly dash refreshing global data on reload

自古美人都是妖i submitted on 2021-02-19 06:11:10
Question: Imagine I have a dash application where I want the global data to refresh on page reload. I'm using a function to serve the layout as described here. However, I'm not sure how/where I should define df such that I can use it in callbacks (like in a case where I'd like to subset the df based on some input and pass it to a layout table). My code below reloads the data on page refresh, but the callback cannot access the df. I'm very new to dash so apologies in advance for potentially dumb

Moving average on pandas.groupby object that respects time

百般思念 submitted on 2021-02-19 06:01:12
Question: Given a pandas dataframe in the following format: toy = pd.DataFrame({ 'id': [1,2,3, 1,2,3, 1,2,3], 'date': ['2015-05-13', '2015-05-13', '2015-05-13', '2016-02-12', '2016-02-12', '2016-02-12', '2018-07-23', '2018-07-23', '2018-07-23'], 'my_metric': [395, 634, 165, 144, 305, 293, 23, 395, 242] }) # Make sure 'date' has datetime format toy.date = pd.to_datetime(toy.date) The my_metric column contains some (random) metric I wish to compute a time-dependent moving average of, conditional on the
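
The question is cut off before it states the condition, so the sketch below rests on an assumption: the average is taken per `id` over a trailing 365-day window. With a DatetimeIndex, `rolling` accepts an offset string such as "365D", which makes the window respect the actual dates rather than the row count.

```python
import pandas as pd

toy = pd.DataFrame({
    "id": [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "date": ["2015-05-13", "2015-05-13", "2015-05-13",
             "2016-02-12", "2016-02-12", "2016-02-12",
             "2018-07-23", "2018-07-23", "2018-07-23"],
    "my_metric": [395, 634, 165, 144, 305, 293, 23, 395, 242],
})
toy["date"] = pd.to_datetime(toy["date"])

# Assumed grouping and window: per id, trailing 365 days.
# Time-based windows need the dates sorted and used as the index.
rolled = (
    toy.sort_values("date")
       .set_index("date")
       .groupby("id")["my_metric"]
       .rolling("365D")
       .mean()
)
```

The result is indexed by (`id`, `date`). For `id` 1, the 2016-02-12 value averages the 2015 and 2016 observations, while the 2018 value stands alone because nothing else falls inside its window.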

Handling error “TypeError: Expected tuple, got str” loading a CSV to pandas multilevel and multiindex (pandas)

生来就可爱ヽ(ⅴ<●) submitted on 2021-02-19 05:44:09
Question: I'm trying to load a CSV file (this file) to create a multiindex and multilevel DataFrame. It has 5 (five) indexes and 3 (three) levels in the columns. How can I do this? Here is the code: df = pd.read_csv('./teste.csv' ,index_col=[0,1,2,3,4] ,header=[0,1,2,3] ,skipinitialspace=True ,tupleize_cols=True) df.columns = pd.MultiIndex.from_tuples(df.columns) Expected output: variables u \ level 1 days 1 2 times 00h 06h 12h 18h 00h wsid lat lon start prcp_24 329 -43.969397 -19.883945 2007-03-18 10:00:00 72.0 0
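
A reduced sketch of reading a CSV straight into a frame with a MultiIndex on both axes. It does not use the question's file: a small hypothetical frame (two index levels, two column levels) is written with `to_csv` and read back, with `header` listing the column-header rows and `index_col` listing the index columns. Note that `tupleize_cols` was removed in pandas 1.0, so it is left out here.

```python
import io

import pandas as pd

# Hypothetical stand-in for the question's data, reduced to 2 + 2 levels.
index = pd.MultiIndex.from_tuples(
    [(329, "2007-03-18"), (330, "2007-03-19")], names=["wsid", "start"]
)
columns = pd.MultiIndex.from_tuples(
    [("u", "00h"), ("u", "06h"), ("v", "00h")], names=["variables", "times"]
)
original = pd.DataFrame(
    [[72.0, 0.5, 1.5], [10.0, 0.1, 2.5]], index=index, columns=columns
)

# Write the frame to an in-memory CSV, then read it back.
buffer = io.StringIO(original.to_csv())

# One entry in header per column level, one entry in index_col per index level.
loaded = pd.read_csv(buffer, header=[0, 1], index_col=[0, 1])
```

For the file in the question, the two lists would need to match the real number of header rows and index columns, which the truncated output does not fully show.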

subtracting two columns from pandas dataframe and store the result in third column [closed]

我的梦境 submitted on 2021-02-19 05:42:54
Question: I have a DataFrame, df, with 3 columns and I want to perform subtraction as follows: df['available'] = df['recommended'] - df['manual input'] But I am getting an error stating: unsupported operand type(s) for -: 'int' and 'str' I have also tried doing df['available'] = df[
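
The error message indicates that one operand holds strings. A minimal sketch, assuming it is the 'manual input' column that was read as text: convert it with `pd.to_numeric` before subtracting. The sample values are hypothetical.

```python
import pandas as pd

# Hypothetical data: 'manual input' arrived as strings, one of them invalid.
df = pd.DataFrame({
    "recommended": [10, 20, 30],
    "manual input": ["3", "5", "oops"],
})

# errors="coerce" turns values that cannot be parsed into NaN instead of raising.
df["manual input"] = pd.to_numeric(df["manual input"], errors="coerce")

df["available"] = df["recommended"] - df["manual input"]
```

Checking `df.dtypes` first shows which columns are stored as `object` and therefore need converting.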