How to scrape poorly structured HTML tables with BeautifulSoup in Python?
Question: This website https://itportal.ogauthority.co.uk/information/well_data/lithostratigraphy_hierarchy/rptLithoStrat_1Page2.html seems to have a poorly organized HTML table. The only thing that identifies the table cells is the width given inside each tr tag. I want to scrape the information from all 60 pages. How can I scrape each row of the tables correctly? I know the header has 10 columns, but some tr tags have 5 td tags and others have more or fewer, so it's not easy to
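
One possible way to approach this, sketched below, is to use the widths themselves as coordinates: add up the widths of the td tags in a row to get each cell's left offset, then match that offset against the left edges of the header columns. This is only a sketch under assumptions that are not confirmed by the page: that each td carries a plain integer `width` attribute, that every cell starts at the left edge of some header column, and that the header widths are known. The names `column_starts`, `place_cells` and `parse_rows` are hypothetical helpers, not part of BeautifulSoup.

```python
from bisect import bisect_right
from itertools import accumulate


def column_starts(header_widths):
    """Left edge of each header column, e.g. [10, 20, 30] -> [0, 10, 30]."""
    return [0] + list(accumulate(header_widths))[:-1]


def place_cells(cells, starts):
    """Put (width, text) cells into a fixed-length row by their left offset."""
    row = [""] * len(starts)
    offset = 0
    for width, text in cells:
        # Index of the last header column whose left edge is <= this offset.
        index = bisect_right(starts, offset) - 1
        row[index] = text
        offset += width
    return row


def parse_rows(html):
    """Yield a list of (width, text) pairs for every <tr> in the HTML."""
    # Imported here so the two helpers above work without beautifulsoup4.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    for tr in soup.find_all("tr"):
        cells = []
        for td in tr.find_all("td"):
            # Assumption: width is a plain integer attribute such as width="30".
            cells.append((int(td.get("width", 0)), td.get_text(strip=True)))
        yield cells
```

With the header widths in hand, `starts = column_starts(header_widths)` followed by `[place_cells(cells, starts) for cells in parse_rows(html)]` would give rows that all have the same number of columns, with empty strings where a row has no cell. If the widths on the real pages are written as percentages or live in a `style` attribute, the `int(...)` conversion would need adjusting.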