Compute the average height and average width of div tags

春和景丽 2021-01-26 22:23

I need to get the average height and width of the div elements in an HTML document.

I tried this solution, but it doesn't work:

import numpy as np
average_width = np.         


        
1 Answer
  • 2021-01-26 22:27

    There may be a better way, but here are two options.

    Way 1

    Below is my tested code to extract the width and height:

    from bs4 import BeautifulSoup
    
    html_doc = '''<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:45px; top:81px; width:127px; height:9px;">
        <span style="font-family: EICMDA+AdvTrebu-R; font-size:8px">Journal of     Infection (2015) 
        </span>
        <span style="font-family: EICMDB+AdvTrebu-B; font-size:8px">xx</span>
        <span style="font-family: EICMDA+AdvTrebu-R; font-size:8px">, 1</span>
        <span style="font-family: EICMDD+AdvPS44A44B; font-size:7px">e</span>
        <span style="font-family: EICMDA+AdvTrebu-R; font-size:8px">4
        <br/>
        </span>
    </div>'''
    
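    soup = BeautifulSoup(html_doc, 'html.parser')
    # Collect the inline style attribute of every div that has one.
    my_att = [i.attrs['style'] for i in soup.find_all("div", style=True)]
    # Split into declarations on ';' and drop empty or whitespace-only pieces.
    dd = ';'.join(my_att).split(";")
    dd_cln = [i.strip() for i in dd if i.strip()]
    # Map property -> value; with several divs, later values overwrite earlier ones.
    my_dict = dict(i.split(':', 1) for i in dd_cln)
    print(my_dict['width'])  # 127px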
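
Way 1 reads a single width. To get an average, each div's style has to be parsed separately so that values from different divs do not overwrite one another. The following is a minimal sketch of that idea using only the standard library; the helper names `parse_style` and `average_px` and the sample style strings are made up for illustration, and it assumes dimensions are given as plain pixel values in inline styles.

```python
from statistics import mean


def parse_style(style):
    """Turn an inline style string into a {property: value} dict."""
    declarations = (d.strip() for d in style.split(";"))
    pairs = (d.split(":", 1) for d in declarations if ":" in d)
    return {name.strip().lower(): value.strip() for name, value in pairs}


def average_px(styles, prop):
    """Average a pixel-valued property over the styles that define it."""
    values = []
    for style in styles:
        value = parse_style(style).get(prop, "")
        if value.endswith("px"):
            try:
                values.append(float(value[:-2]))
            except ValueError:
                pass  # skip values that are not plain numbers
    return mean(values) if values else None


# Hypothetical style strings, one per div.
styles = [
    "position:absolute; left:45px; top:81px; width:127px; height:9px;",
    "position:absolute; left:45px; top:95px; width:200px; height:11px;",
    "color:red",  # no width or height, so it is ignored
]

print(average_px(styles, "width"))   # 163.5
print(average_px(styles, "height"))  # 10.0
```

With BeautifulSoup, the `styles` list would come from `[d['style'] for d in soup.find_all("div", style=True)]`. Divs that size themselves through a stylesheet rather than an inline style are not covered by this approach.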
    

    Way 2: Use a regular expression.

    Working code:

    import numpy as np
    import re
    from bs4 import BeautifulSoup
    
    html_doc = '''<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:45px; top:81px; width:127px; height:9px;">
        <span style="font-family: EICMDA+AdvTrebu-R; font-size:8px">Journal of     Infection (2015) 
        </span>
        <span style="font-family: EICMDB+AdvTrebu-B; font-size:8px">xx</span>
        <span style="font-family: EICMDA+AdvTrebu-R; font-size:8px">, 1</span>
        <span style="font-family: EICMDD+AdvPS44A44B; font-size:7px">e</span>
        <span style="font-family: EICMDA+AdvTrebu-R; font-size:8px">4
        <br/>
        </span>
    </div>'''
    
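    soup = BeautifulSoup(html_doc, 'html.parser')
    my_att = [i.attrs['style'] for i in soup.find_all("div", style=True)]
    css = ';'.join(my_att)
    print(css)
    # (?<![\w-]) keeps "max-width" and "line-height" from matching.
    # Build lists, not lazy map objects, so np.mean works on Python 3.
    width_list = [float(w) for w in re.findall(r'(?<![\w-])width:\s*(\d+(?:\.\d+)?)px', css)]
    height_list = [float(h) for h in re.findall(r'(?<![\w-])height:\s*(\d+(?:\.\d+)?)px', css)]
    print(np.mean(height_list))
    print(np.mean(width_list))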
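
One pitfall with the regex approach is that a pattern looking only for `width:` also matches inside `max-width:`, and `height:` inside `line-height:`. The sketch below shows one way to guard against that with a negative lookbehind. The helper name `px_values` and the sample CSS string are hypothetical, and `statistics.mean` stands in for `np.mean` only to keep the example dependency-free; `np.mean` accepts the same lists.

```python
import re
from statistics import mean

# (?<![\w-]) rejects a match preceded by a letter, digit, underscore or hyphen,
# so "max-width" and "line-height" are not counted.
PATTERN = r"(?<![\w-]){prop}\s*:\s*(\d+(?:\.\d+)?)px"


def px_values(css, prop):
    """Return every pixel value declared for exactly this property."""
    return [float(v) for v in re.findall(PATTERN.format(prop=re.escape(prop)), css)]


# Hypothetical joined style attributes from two divs.
css = ("width:127px; height:9px;max-width: 300px; line-height:12px; "
       "width: 73.5px; height:11px")

widths = px_values(css, "width")
heights = px_values(css, "height")
print(widths, mean(widths))    # [127.0, 73.5] 100.25
print(heights, mean(heights))  # [9.0, 11.0] 10.0
```

The pattern also tolerates optional whitespace around the colon and decimal pixel values, which the stricter `(\d+)(?=px;)` form would miss.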
    