Question
I made an API GET request to the financialmodelingprep API for the historical close prices of a given company's stock. It returns every recorded date for the stock. The problem is that I need only the last date of each of the last 5 years, in order to compare it to the financial statements. Does anyone know how to filter the dataset to get the last date of each year without specifying the exact date? The goal is to export the table to CSV format and later combine it with other companies.
Is there a better way to get the result that I need?
symbols = ["MMM",
"ABT",
"ABBV",
"ABMD",
"ACN",
]
import requests
import pandas as pd
import datetime
API_KEY = 'my_key'
api_stock_price_url = "https://financialmodelingprep.com/api/v3/historical-price-full/" + symbols[0] + "?serietype=line&apikey=" + API_KEY
company_stock_price = pd.read_json(api_stock_price_url)
date_and_close = pd.json_normalize(company_stock_price["historical"])
# convert the date strings to datetime before using them as the index
company_stock_price["date"] = pd.to_datetime(date_and_close["date"])
company_stock_price["close"] = date_and_close["close"]
# make the date the index (this also removes the date column)
company_stock_price = company_stock_price.set_index("date")
# delete the unwanted column
del company_stock_price['historical']
The returned company_stock_price table:
symbol close
date
2020-12-04 MMM 172.460007
2020-12-03 MMM 171.830002
2020-12-02 MMM 171.850006
2020-12-01 MMM 170.520004
2020-11-30 MMM 172.729996
... ... ...
1970-09-14 MMM 0.322600
1970-09-11 MMM 0.321700
1970-09-10 MMM 0.323500
1970-09-09 MMM 0.324000
1970-09-08 MMM 0.318800
12675 rows × 2 columns
The desired output would look something like this:
symbol close
date
2020-12-31 MMM 172.460007
2019-12-31 MMM 131.112123
2018-12-31 MMM 123.123123
2017-12-31 MMM 111.111111
2016-12-31 MMM 101.111111
The problem in this case is that I cannot specify the exact date, because some of the S&P 500 companies (which I am going to loop over) are missing the stock price for that date in the returned API responses.
Answer 1:
df2 = df.groupby(pd.DatetimeIndex(df['date']).year,
as_index=False).agg({'date': max}).reset_index(drop=True)
date symbol close
0 1970-09-14 MMM 0.322600
1 2020-12-04 MMM 172.460007
Here the dataframe is grouped by the year of the date column, and the rows with the maximum date per year are returned. You can then sort by date and take the last five rows:
df2.sort_values('date').iloc[-5:]
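As a complement, here is a minimal sketch of the same group-by-year idea for a table shaped like the one in the question, with the date as a DatetimeIndex. The sample values and the helper name last_close_per_year are made up for illustration; they are not part of the original answer or of the API response.

```python
import pandas as pd

# Hypothetical sample in the same shape as the question's table
# (date index, newest first); the values are invented for illustration.
prices = pd.DataFrame(
    {
        "symbol": ["MMM"] * 6,
        "close": [172.46, 171.83, 175.00, 174.10, 190.50, 189.90],
    },
    index=pd.to_datetime(
        ["2020-12-04", "2020-12-03", "2019-12-31", "2019-12-30",
         "2018-12-28", "2018-12-27"]
    ),
)
prices.index.name = "date"


def last_close_per_year(df, n_years=5):
    """Return the row with the latest available date for each of the
    most recent n_years calendar years present in df."""
    df = df.sort_index()                           # oldest first
    year_end = df.groupby(df.index.year).tail(1)   # last row of each year
    return year_end.tail(n_years).sort_index(ascending=False)


year_end = last_close_per_year(prices)
print(year_end)
```

Because the last available row of each year is taken, no exact date such as December 31 has to be specified. The per-company results could then be stacked with pd.concat and written out with DataFrame.to_csv.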
Answer 2:
You can get the most recent date and use .loc to select the desired row, like so:
recent_date = company_stock_price["date"].max()
latest_stock_price = company_stock_price.loc[company_stock_price['date'] == recent_date]
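In the question's table the date is the index rather than a column, so a variant that reads the date from the index may be closer to what is needed. This is a rough sketch on a hypothetical two-row sample, assuming the index is a DatetimeIndex:

```python
import pandas as pd

# Hypothetical two-row sample with the date as a DatetimeIndex
company_stock_price = pd.DataFrame(
    {"symbol": ["MMM", "MMM"], "close": [172.460007, 171.830002]},
    index=pd.to_datetime(["2020-12-04", "2020-12-03"]),
)
company_stock_price.index.name = "date"

# Most recent date present in the index
recent_date = company_stock_price.index.max()
# Double brackets keep the result a one-row DataFrame rather than a Series
latest_stock_price = company_stock_price.loc[[recent_date]]
print(latest_stock_price)
```

Note that this only returns the single most recent row, not one row per year.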
Answer 3:
You can use the pandas indexer iloc, which selects rows of a DataFrame by integer position. You can index it much like a list, e.g. df.iloc[-1]
This is an example of how it works:
mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
{'a': 100, 'b': 200, 'c': 300, 'd': 400},
{'a': 1000, 'b': 2000, 'c': 3000, 'd': 4000 }]
df = pd.DataFrame(mydict)
df.iloc[-1]
outputs the last row:
a 1000
b 2000
c 3000
d 4000
Name: 2, dtype: int64
Using df.iloc[0] outputs the first row:
a 1
b 2
c 3
d 4
Name: 0, dtype: int64
Now, if you wanted the last date of every 5-year range, you could loop over the data and take one value per range. It would be something like this:
# assumes company_stock_price has a DatetimeIndex sorted newest first
closes = company_stock_price['close']
every5YearsData = []
seen_blocks = set()
for date, close in closes.items():
    block = date.year // 5  # id of the 5-year range this date falls in
    if block not in seen_blocks:
        # rows are newest first, so the first row seen in a range is its last date
        seen_blocks.add(block)
        every5YearsData.append(close)
Source: https://stackoverflow.com/questions/65183759/return-only-the-last-day-of-the-year-with-pandas