I am trying to find out whether there are any guiding principles for deciding which pages should be gzip-compressed, and where to draw the line and send plain HTML content instead.
It would be he
We decided to gzip all content, since spending time working out what to gzip and what not to gzip did not seem worth the effort. The overhead of gzipping everything is not significantly higher than that of gzipping nothing.
This webpage suggests:
"Servers choose what to gzip based on file type, but are typically too limited in what they decide to compress. Most web sites gzip their HTML documents. It's also worthwhile to gzip your scripts and stylesheets, but many web sites miss this opportunity. In fact, it's worthwhile to compress any text response including XML and JSON. Image and PDF files should not be gzipped because they are already compressed. Trying to gzip them not only wastes CPU but can potentially increase file sizes."
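As a rough illustration of the rule in that quote, here is a minimal Python sketch that decides from the Content-Type header alone. The helper name `should_gzip` and the exact list of MIME types are my own assumptions for the example, not something taken from the quoted page or from any particular server; real servers usually expose this as a configuration setting instead.

```python
# Hypothetical helper: decide whether a response body is worth gzipping,
# based only on its Content-Type. The type list is an illustrative assumption.
COMPRESSIBLE_TYPES = {
    "application/json",
    "application/xml",
    "application/javascript",
}


def should_gzip(content_type: str) -> bool:
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    # Text responses (HTML, CSS, plain text, ...) compress well; images and
    # PDFs are already compressed, so they fall through to False.
    return mime.startswith("text/") or mime in COMPRESSIBLE_TYPES
```

For example, `should_gzip("text/html; charset=utf-8")` is true, while `should_gzip("image/png")` and `should_gzip("application/pdf")` are false.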
If you care about CPU time, I would suggest not gzipping content that is already compressed. But remember, when adding complexity to a system, that programmers and sysadmins are expensive, while servers are cheap.
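If you want to check the effect on your own content before deciding, a quick measurement is easy to do. The sketch below uses Python's standard `gzip` module; the sample HTML is made up, and random bytes stand in for an already-compressed file such as an image (that substitution is an assumption, since both are essentially incompressible).

```python
import gzip
import os

# Made-up, repetitive HTML: typical of text responses that compress well.
html = b"<p>Hello, world!</p>\n" * 500

# Random bytes as a stand-in for already-compressed data (images, PDFs).
blob = os.urandom(10_000)

html_gz = gzip.compress(html)
blob_gz = gzip.compress(blob)

# The HTML shrinks a lot; the incompressible data comes out slightly larger
# because gzip still adds its header, trailer and block overhead.
print(f"html: {len(html)} -> {len(html_gz)} bytes")
print(f"blob: {len(blob)} -> {len(blob_gz)} bytes")
```

The CPU cost can be measured the same way by wrapping the `gzip.compress` calls in a timer.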