Get financial data using Python

Submitted by 点点圈 on 2021-02-11 16:31:43

Question


I have managed to write some Python code with Selenium that navigates to a webpage containing financial data in several tables.

I want to extract the data and put it into Excel.

The tables appear to be plain HTML tables; a sample of the markup is below:

                <tr>
                <td class="bc2T bc2gt">Last update</td>
                <td class="bc2V bc2D">03/15/2018</td><td class="bc2V bc2D">03/14/2019</td><td class="bc2V bc2D">03/12/2020</td><td class="bc2V bc2D" style="background-color:#DEFEFE;">05/22/2020</td><td class="bc2V bc2D" style="background-color:#DEFEFE;">05/20/2020</td><td class="bc2V bc2D" style="background-color:#DEFEFE;">05/18/2020</td> 
            </tr>
    </table>

The table has the following class name: <table class='BordCollapseYear2' style="margin-right:20px; font-size:12px; width:100%;" cellspacing=0>

Is there a way I can extract this data? Ideally I want this to be dynamic so that it can extract information for different companies.

I've never used it before, but I've seen the BeautifulSoup library mentioned a few times.

https://www.marketscreener.com/MICROSOFT-CORPORATION-4835/financials/

Take Microsoft (linked above) as an example: I'd want to extract the income statement data, the balance sheet, and so on.


Answer 1:


This script scrapes every table found on the page and pretty-prints it:

import requests
from bs4 import BeautifulSoup

url = 'https://www.marketscreener.com/MICROSOFT-CORPORATION-4835/financials/'

soup = BeautifulSoup(requests.get(url).content, 'html.parser')

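# maps each table's title to its list of rows
all_data = {}
# for every financial table found on the page...
for table in soup.select('table.BordCollapseYear2'):
    # the table title is the nearest preceding <b> element
    table_name = table.find_previous('b').text
    all_data[table_name] = []
    # ...scrape every row
    for tr in table.select('tr'):
        row = [td.get_text(strip=True, separator=' ') for td in tr.select('td')]
        # keep only complete rows: one label plus six yearly values
        if len(row) == 7:
            all_data[table_name].append(row)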

# pretty-print all data:
for k, v in all_data.items():
    print('Table name: {}'.format(k))
    print('-' * 160)
    for row in v:
        print(('{:<25}'*7).format(*row))
    print()

Prints:

Table name: Valuation
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Fiscal Period: June      2017                     2018                     2019                     2020                     2021                     2022                     
Capitalization 1         532 175                  757 640                  1 026 511                1 391 637                -                        -                        
Entreprise Value (EV) 1  485 388                  700 112                  964 870                  1 315 823                1 299 246                1 276 659                
P/E ratio                25,4x                    46,3x                    26,5x                    32,3x                    29,7x                    25,8x                    
Yield                    2,26%                    1,70%                    1,37%                    1,10%                    1,18%                    1,31%                    
Capitalization / Revenue 5,51x                    6,87x                    8,16x                    9,81x                    8,89x                    7,95x                    
EV / Revenue             5,02x                    6,34x                    7,67x                    9,28x                    8,30x                    7,30x                    
EV / EBITDA              12,7x                    15,4x                    17,7x                    20,2x                    18,3x                    15,9x                    
Cours sur Actif net      7,46x                    9,15x                    10,0x                    12,1x                    10,1x                    8,49x                    
Nbr of stocks (in thousands)7 720 510                7 683 198                7 662 818                7 583 440                -                        -                        
Reference price (USD)    68,9                     98,6                     134                      184                      184                      184                      
Last update              07/20/2017               07/19/2018               07/18/2019               05/08/2020               04/30/2020               04/30/2020               

Table name: Annual Income Statement Data
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Fiscal Period: June      2017                     2018                     2019                     2020                     2021                     2022                     
Net sales 1              96 657                   110 360                  125 843                  141 818                  156 534                  174 945                  
EBITDA 1                 38 117                   45 319                   54 641                   65 074                   70 966                   80 445                   
Operating profit (EBIT) 129 339                   35 058                   42 959                   52 544                   57 045                   65 289                   
Operating Margin         30,4%                    31,8%                    34,1%                    37,1%                    36,4%                    37,3%                    
Pre-Tax Profit (EBT) 1   23 149                   36 474                   43 688                   52 521                   57 042                   65 225                   
Net income 1             21 204                   16 571                   39 240                   43 693                   47 223                   53 905                   
Net margin               21,9%                    15,0%                    31,2%                    30,8%                    30,2%                    30,8%                    
EPS 2                    2,71                     2,13                     5,06                     5,68                     6,18                     7,11                     
Dividend per Share 2     1,56                     1,68                     1,84                     2,02                     2,16                     2,41                     
Last update              07/20/2017               07/19/2018               07/18/2019               05/22/2020               05/22/2020               05/22/2020               

Table name: Balance Sheet Analysis
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Fiscal Period: June      2017                     2018                     2019                     2020                     2021                     2022                     
Net Debt 1               -                        -                        -                        -                        -                        -                        
Net Cash position 1      46 787                   57 528                   61 641                   75 814                   92 392                   114 978                  
Leverage (Debt / EBITDA) -1,23x                   -1,27x                   -1,13x                   -1,17x                   -1,30x                   -1,43x                   
Free Cash Flow 1         31 378                   32 252                   38 260                   41 953                   46 887                   53 155                   
ROE (Net Profit / Equities)29,4%                    19,4%                    42,4%                    36,6%                    34,5%                    36,1%                    
Shareholders' equity 1   72 195                   85 215                   92 524                   119 417                  136 690                  149 484                  
ROA (Net Profit / Asset) 9,76%                    6,51%                    14,4%                    18,5%                    14,6%                    14,7%                    
Assets 1                 217 276                  254 580                  272 703                  235 800                  323 445                  366 702                  
Book Value Per Share 2   9,24                     10,8                     13,4                     15,2                     18,2                     21,6                     
Cash Flow per Share 2    5,04                     5,63                     6,73                     7,03                     8,02                     9,79                     
Capex 1                  8 129                    11 632                   13 925                   15 698                   17 922                   19 507                   
Capex / Sales            8,41%                    10,5%                    11,1%                    11,1%                    11,4%                    11,2%                    
Last update              07/20/2017               07/19/2018               07/18/2019               05/22/2020               05/22/2020               05/04/2020               
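
In the output above, labels that are 25 characters or longer fill the fixed 25-character column completely, so the first value is printed directly after them: `Operating profit (EBIT) 129 339` is the label `Operating profit (EBIT) 1` followed by `29 339`. A minimal sketch of one way around this is to size each column from its widest cell. `format_rows` and its `gap` parameter are hypothetical names, not part of the original answer:

```python
def format_rows(rows, gap=2):
    """Left-align every column, padding it to its widest cell plus `gap` spaces."""
    if not rows:
        return []
    n_cols = max(len(row) for row in rows)
    widths = [0] * n_cols
    # first pass: find the widest cell in each column
    for row in rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(cell))
    # second pass: pad each cell and join the row into one line
    lines = []
    for row in rows:
        padded = [cell.ljust(widths[i] + gap) for i, cell in enumerate(row)]
        lines.append(''.join(padded).rstrip())
    return lines


# a few cells taken from the "Annual Income Statement Data" output
sample = [
    ['Fiscal Period: June', '2017', '2018'],
    ['Operating profit (EBIT) 1', '29 339', '35 058'],
    ['Operating Margin', '30,4%', '31,8%'],
]
for line in format_rows(sample):
    print(line)
```

In the script, the inner print loop could then become `for line in format_rows(v): print(line)`, assuming `v` is the list of rows for one table as before.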

EDIT (to save all_data as a CSV file):

import csv

with open('data.csv', 'w', newline='') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    for k, v in all_data.items():
        spamwriter.writerow([k])
        for row in v:
            spamwriter.writerow(row)

[Screenshot from LibreOffice showing the saved CSV file; image not available]
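
The scraped cells are still strings in the site's display format: a space as the thousands separator (`1 026 511`), a decimal comma (`25,4x`, `2,26%`) and `-` for missing values, so a spreadsheet will likely treat them as text. Here is a rough sketch of converting them to numbers before writing the CSV. `to_number` is a hypothetical helper, and it only handles the formats visible in the printed output:

```python
def to_number(cell):
    """Convert a scraped cell such as '1 026 511', '25,4x' or '2,26%' to a float.

    Returns None for the '-' placeholder and the unchanged string for anything
    that does not parse as a number (labels, dates). Percentages keep their
    displayed magnitude, so '2,26%' becomes 2.26, not 0.0226.
    """
    text = cell.strip()
    if text in ('', '-'):
        return None
    # drop the trailing unit, remove thousands separators, use a decimal point
    cleaned = text.rstrip('x%').replace('\xa0', '').replace(' ', '').replace(',', '.')
    try:
        return float(cleaned)
    except ValueError:
        return cell


# keep the label in the first column as text, convert the remaining cells
row = ['Capitalization 1', '532 175', '1 026 511', '-']
converted = [row[0]] + [to_number(c) for c in row[1:]]
print(converted)  # ['Capitalization 1', 532175.0, 1026511.0, None]
```

`csv.writer` writes `None` as an empty field, so converted rows can be passed to `writerow` unchanged. Note that the year headers (`2017`, `2018`, ...) would also become floats under this sketch, so the header row is probably best left unconverted.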



Source: https://stackoverflow.com/questions/61974854/get-financial-data-using-python
