An example of how micro-optimization leads to macro slowdowns:
If you're seriously considering manually inlining functions, consider manually unrolling loops.
JMPs are expensive, and if you can eliminate loops by unrolling, and eliminate all the conditional blocks while you're at it, you'll eliminate all that time wasted merely seeking around the CPU's instruction cache.
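(For the avoidance of doubt, here's a sketch of the technique being mocked, using a made-up function that sums a fixed-length array. Any compiler at -O2 will do this for you anyway, which is rather the point.)

    /* The sensible version: a loop, and therefore JMPs. */
    int sum4(const int *a) {
        int total = 0;
        for (int i = 0; i < 4; i++)  /* compare-and-branch on every iteration */
            total += a[i];
        return total;
    }

    /* The hand-unrolled version: no loop, no branches, no conditionals. */
    int sum4_unrolled(const int *a) {
        return a[0] + a[1] + a[2] + a[3];
    }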
Computing values at runtime is slow too, as is pulling things out of a database, so you should inline all that data into your code as well.
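(Again, a sketch of what that ends up looking like; the product table here is invented. Note that every price change now means a recompile and a redeploy.)

    #include <stddef.h>

    /* Hypothetical: the entire "products" table baked into the binary
       instead of living in a database. */
    static const struct { int id; int price_cents; } products[] = {
        { 42, 1999 },
        { 43, 2499 },
    };

    int price_of(int id) {
        for (size_t i = 0; i < sizeof products / sizeof products[0]; i++)
            if (products[i].id == id)
                return products[i].price_cents;
        return -1;  /* not found */
    }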
Actually, loading up an interpreter merely to execute code and copy memory out to a user is hideously wasteful. Why don't we just pre-compute every possible page and store each one in memory, ready to go, so serving it is just a memcpy? Surely that's fast!
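(Sketch number three, with an invented page table and a made-up serve() function; assumes out_len > 0 and a valid page_id.)

    #include <string.h>

    /* Hypothetical: every page anyone could ever request,
       pre-rendered at build time. */
    static const char *const pages[] = {
        "<html><body>Home</body></html>",
        "<html><body>About</body></html>",
    };

    /* "Serving" a page is now just a memcpy into the caller's buffer. */
    size_t serve(int page_id, char *out, size_t out_len) {
        size_t n = strlen(pages[page_id]);
        if (n >= out_len)
            n = out_len - 1;  /* truncate to fit */
        memcpy(out, pages[page_id], n);
        out[n] = '\0';
        return n;
    }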
Ah, but now we've got that slow thing called the internet between us, hindering the user experience and limiting how much content we can serve. How about we pre-compute all the pages in advance, archive them, and run them on the user's local machine? That'll be really fast!
But that's going to waste CPU cycles, lots of them, what with page load time, browser rendering, and so on. Let's skip the middleman and just deliver the pages on printed media! Genius!
/me watches your company fall flat on its face while you spend 10 years precomputing (by hand) and printing pages nobody wants to see.
This may sound silly to you, but to the rest of us, what you proposed is just that ridiculous.
Optimisation is good, but draw the line somewhere sensible, so the future people who work on the code don't track you down in your sleep for leaving them such a crappy, unmaintainable codebase.
Note: yes, I use Gentoo. How did you guess?