Using Beautiful Soup to convert CSS attributes to individual HTML attributes?

南笙 2021-01-06 21:36

I'm trying to write a program that will take an HTML file and make it more email friendly. Right now all the conversion is done manually because none of the online converte

2 answers
  • 2021-01-06 22:28

    Instead of reinventing the wheel, use the StoneageHTML package: http://pypi.python.org/pypi/StoneageHTML

  • 2021-01-06 22:34

    For this type of thing, I'd recommend an HTML parser (like BeautifulSoup or lxml) in conjunction with a specialized CSS parser. I've had success with the cssutils package. You'll have a much easier time than trying to come up with regular expressions to match any possible CSS you might find in the wild.

    For example:

    >>> import cssutils
    >>> css = 'width:150px;height:50px;float:right;'
    >>> s = cssutils.parseStyle(css)
    >>> s.width
    u'150px'
    >>> s.height
    u'50px'
    >>> s.keys()
    [u'width', u'height', u'float']
    >>> s.cssText
    u'width: 150px;\nheight: 50px;\nfloat: right'
    >>> del s['width']
    >>> s.cssText
    u'height: 50px;\nfloat: right'
    

    So, using this, you can pretty easily extract and manipulate the CSS properties you want and plug them into the HTML directly with BeautifulSoup. Be a little careful with the newline characters that show up in the cssText attribute, though. I think cssutils is designed more for writing out standalone CSS files, but it's flexible enough to mostly work for what you're doing here.
