How to get the last table from this site?

人盡茶涼 submitted on 2019-12-11 15:56:43

Question


I'm trying to get the last table from this site with Python. Below is my current attempt.

The table is titled "Dados Colocação, nos Termos do Anexo VII da Instrução CVM nº 400, de 2003".

import requests

lin_cvm_oferta = 'http://web.cvm.gov.br/app/esforcosrestritos/#/enviarFormularioEncerramento?type=dmlldw%3D%3D&ofertaId=MTE3NDE%3D&state=eyJhbm8iOiJNakF4T0E9PSIsInZhbG9yIjoiTVRVPSIsImNvbXVuaWNhZG8iOiJNUT09Iiwic2l0dWFjYW8iOiJNZz09In0%3D'
html = requests.get(lin_cvm_oferta).text
print(html)

But when I print the HTML, it doesn't contain any of the table data.

I already got the first part of the table as JSON, with help from @JackFleeting in this other question (here). PS: I know there is a similar solution here, but I don't want to use Selenium.


Answer 1:


This one is different from your previous question: the page loads its data with a POST request, not a GET. You have to use the Network/XHR tab of your browser's developer tools to extract the URL and the payload, and then post it like this:

import requests      
import json  

url = 'http://web.cvm.gov.br/app/esforcosrestritos/comunicado/getUltimoComunicado'

payload = {"id":931,"dataInclusao":"2016-05-20T09:26:00Z", "dataInicio":"2016-05-18T00:00:00Z","dataEnceramento":"2016-07-05T00:00:00Z", "numeroEmissao":1,"quantidadeSerie":140,"valorMobiliario":{"id":11,
    "dataInclusao":"2015-12-01T00:00:00Z",
    "descricao":"CERTIFICADOS DE RECEBÍVEIS IMOBILIÁRIOS - CRI",
    "relacionadoFundoInvestimento":False,"situacao":"ATIVO"},
    "tipoEspecie":{"id":3,"descricao":"Sem Preferência"},
    "tipoClasse":{"id":4,"descricao":"Não Aplicável"},
    "tipoOferta":{"id":1,"descricao":"Primária"},"tipoForma":{"id":3,"descricao":"Nominativa e Escritural"},"ofertante":{"id":1860,"nomeResponsavel":"RB CAPITAL COMPANHIA DE SECURITIZAÇÃO","cnpj":2773542000122,"paginaWeb":"http://www.rbcapital.com/","tipoSocietario":{"id":4,"descricao":"Sociedade Anônima de Capital Aberto"}},"emissor":{"id":1859,"nomeResponsavel":"RB CAPITAL COMPANHIA DE SECURITIZAÇÃO","cnpj":2773542000122,"paginaWeb":"http://www.rbcapital.com/","tipoSocietario":{"id":4,"descricao":"Sociedade Anônima de Capital Aberto"}},"lider":{"id":931,"nrPfPj":17298092000130,"dataRegistro":"1998-10-15T00:00:00Z","codigoTipoPessoa":"PJ","codigoTipoParticipante":12},"instituicoesIntermediarias":[{"id":1089,"nrPfPj":59588111000103,"dataRegistro":"1991-08-12T00:00:00Z","codigoTipoPessoa":"PJ","codigoTipoParticipante":12,"denominacaoSocial":"BANCO VOTORANTIM SA"},{"id":1090,"nrPfPj":90400888000142,"dataRegistro":"1990-12-20T00:00:00Z","codigoTipoPessoa":"PJ","codigoTipoParticipante":12,"denominacaoSocial":"BANCO SANTANDER (BRASIL) S.A."}],
               "valorPrecoUnitario":"1.000,00","inativo":False,
               "qtdValoresMobiliarios":0,"valorTotalOferta":0,"variasSeries":True}


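headers = {'content-type': 'application/json'}

# Send the payload as a JSON body, the way the page's own POST request does
resp = requests.post(url, data=json.dumps(payload), headers=headers)
resp.raise_for_status()  # stop here if the server returned an HTTP error
data = resp.json()  # parse the JSON response into Python objects
print(data)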
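
As background on why the plain GET in the question comes back without the table: everything after the `#` in that link is a URL fragment, and HTTP clients such as requests do not send fragments to the server. The page appears to be rendered client-side by JavaScript, which reads the fragment and then fetches the data itself. A small sketch using only the standard library shows the split (the URL is shortened here for readability, and the variable names are just for illustration):

```python
from urllib.parse import urlsplit

# Shortened version of the link from the question (the "state" parameter is omitted)
page_url = ('http://web.cvm.gov.br/app/esforcosrestritos/'
            '#/enviarFormularioEncerramento?type=dmlldw%3D%3D&ofertaId=MTE3NDE%3D')

parts = urlsplit(page_url)

# Only scheme, host, path and query travel over HTTP; the fragment stays in the browser
print('path sent to server:', parts.path)
print('query sent to server:', repr(parts.query))
print('fragment kept client-side:', parts.fragment)
```

Note that the `?type=...&ofertaId=...` part sits after the `#`, so it belongs to the fragment and is not sent as a query string either.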

Note that the payload you copy from the browser is JSON, which writes booleans in lowercase (true, false). In a Python dict literal they have to be written as True and False, as I did above; json.dumps converts them back to lowercase when the request is sent.
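
If you'd rather not edit the booleans by hand, one option is to keep the copied payload as a raw string and let `json.loads` do the conversion. This is only a sketch: `raw_payload` below is a shortened excerpt using a few fields from the payload above, not the full request body.

```python
import json

# Shortened excerpt of the payload, pasted exactly as the browser shows it
raw_payload = '{"id": 931, "inativo": false, "variasSeries": true, "valorPrecoUnitario": "1.000,00"}'

payload = json.loads(raw_payload)  # JSON true/false become Python True/False
body = json.dumps(payload)         # serializing turns them back into true/false

print(payload)
print(body)
```

The resulting `payload` dict can then be passed to `json.dumps` in the POST call exactly as in the answer's code.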



Source: https://stackoverflow.com/questions/58341926/how-to-get-the-last-table-from-this-site
