I am parsing pages using Simple DOM parser. It is neat, but I would like to get the applied CSS styles for each element: not only the inline styles, but every style that applies to it.
As Martin says, in doing this you're almost writing a browser in PHP - it's a big ask! As with any big project, the key is to break it down into more manageable steps (although some of these aren't exactly straightforward).
You'll need to:
I wouldn't say it's impossible, as things like mPDF do almost the same thing (and may provide a good starting point), but I don't think there's a neat quick fix.
That's a pretty tall order. Consider this simple example:
<style>
p .foo {
color: yellow;
}
span > *[href] {
color: red;
}
img + .foo {
color: green;
}
span #bar {
color: blue;
}
.baz #bar {
color: black;
}
</style>
<p class="baz">Lorem ipsum <span>dolor sit
<img src="x.png"><a id="bar" class="foo" href="#top">amet</a>,</span>
consectetur adipiscing elit.
</p>
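What color is the link? Each of the five rules matches the link element directly, so the answer depends on selector specificity and source order. Even if you only consider CSS1 selectors (no child, adjacent-sibling, or attribute selectors, all of which are part of CSS2.1), you still have three rules to process.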
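To make the question concrete, here is a minimal sketch (in Python, just to show the logic) of the specificity step of the cascade for the five rules above. It assumes every rule already matches the link, as stated, and it ignores `!important`, inline styles, pseudo-classes and pseudo-elements. The names `specificity`, `winning_color` and `RULES` are made up for this illustration.

```python
import re

def specificity(selector):
    """Return (ids, classes_and_attributes, element_names) for a simple selector."""
    # Remove attribute selectors first so values like [href="#top"] are not
    # miscounted as ids or classes.
    rest, attrs = re.subn(r"\[[^\]]*\]", " ", selector)
    rest, ids = re.subn(r"#[\w-]+", " ", rest)
    rest, classes = re.subn(r"\.[\w-]+", " ", rest)
    # Whatever names remain are element (type) selectors; "*" counts for nothing.
    types = len(re.findall(r"[A-Za-z][\w-]*", rest))
    return (ids, classes + attrs, types)

def winning_color(rules):
    """Pick the value of the most specific rule; later rules win ties."""
    best = max(enumerate(rules), key=lambda item: (specificity(item[1][0]), item[0]))
    return best[1][1]

# The five rules from the example, in source order.
RULES = [
    ("p .foo", "yellow"),
    ("span > *[href]", "red"),
    ("img + .foo", "green"),
    ("span #bar", "blue"),
    (".baz #bar", "black"),
]

for selector, color in RULES:
    print(selector, specificity(selector), color)
print(winning_color(RULES))  # prints black
```

The first three rules all come out at (0, 1, 1); `span #bar` is (1, 0, 1) and `.baz #bar` is (1, 1, 0), so the last rule wins. This covers only the ranking. Deciding whether a selector matches a given element in the first place is the part that needs a real selector engine.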
As Gumbo says, without a full CSS parser and interpreter, this cannot be solved. I haven't seen one written in PHP yet, although it should be theoretically possible to write one.
(There are classes for CSS parsing, yes - see the answers to this question - but those would only tell you "for this file, you have these CSS declarations". The interpreter is the hardest part, and I'm not aware of one written in PHP.)
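To illustrate what the parsing half gives you on its own, here is a rough sketch in Python of a declaration extractor. It is an assumption-laden toy, not any existing library's API: `parse_rules` is a hypothetical helper, and it only handles flat stylesheets (no `@media` blocks, no braces or semicolons inside strings).

```python
import re

def parse_rules(css):
    """Split a flat stylesheet into (selector, {property: value}) pairs."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # drop comments
    rules = []
    for selector_group, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        declarations = {}
        for declaration in body.split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip().lower()] = value.strip()
        # A comma-separated selector group yields one entry per selector.
        for selector in selector_group.split(","):
            rules.append((selector.strip(), dict(declarations)))
    return rules

sheet = "p .foo { color: yellow; } span #bar, .baz #bar { color: blue }"
parsed = parse_rules(sheet)
print(parsed)
```

The result is just a list of selectors and declarations. Nothing in it says which elements of a particular document each selector matches or which declaration wins, and that is the interpreter part.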
Your best bet would be to render the page in a real rendering engine (e.g. Gecko or WebKit) and query the computed CSS properties from it. That, unfortunately, is far beyond the scope of a simple PHP class.
You might want to check out the CSS part of the QUAIL accessibility library - we needed that feature too and have basically been building a pseudo-browser based on DOMDocument. Because of some of the weird things with XPath in DOMDocument, we had to hack an additional attribute onto every node on the page that acts as a pointer into a central array of computed styles, but we're about 70% of the way there in terms of passing the W3C tests.
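The pointer-attribute idea can be sketched roughly like this. This is not QUAIL's code: it uses Python's `xml.dom.minidom` as a stand-in for PHP's DOMDocument, and the attribute name `data-style-id` and the helpers `index_elements` and `style_of` are invented for the example.

```python
from xml.dom import minidom

STYLE_ATTR = "data-style-id"  # hypothetical attribute name

def index_elements(document):
    """Tag every element with an index into a central list of style dicts."""
    computed_styles = []

    def visit(node):
        if node.nodeType == node.ELEMENT_NODE:
            node.setAttribute(STYLE_ATTR, str(len(computed_styles)))
            computed_styles.append({})  # filled in later by the cascade
        for child in node.childNodes:
            visit(child)

    visit(document.documentElement)
    return computed_styles

def style_of(element, computed_styles):
    """Follow an element's pointer attribute to its computed-style entry."""
    return computed_styles[int(element.getAttribute(STYLE_ATTR))]

doc = minidom.parseString(
    '<p class="baz">Lorem <span><a id="bar" class="foo">amet</a></span></p>'
)
styles = index_elements(doc)
link = doc.getElementsByTagName("a")[0]
style_of(link, styles)["color"] = "black"
print(link.getAttribute(STYLE_ATTR), styles)
```

Keeping the styles in one central list means any node found later, however it was located, can be mapped back to its computed style through the attribute alone.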