I could not find an objective study of ARC's performance impact on a real-life project. The official documentation says:

> The compiler efficiently eliminates many extraneous retain/release calls
Of course temporary variables are strong by default. That's explicit and clearly documented, and if you think it through, it's what people will typically want.
    MyWidget *widget = [[MyWidget alloc] init]; // 1
    [widget mill];                              // 2
If `widget` isn't strong, the `MyWidget` created on line 1 can be deallocated (and, if `widget` is weak, zeroed) before line 2 ever runs!
Now, it's certainly true that if you use lots of temporary variables -- for example, if you are rigorously obeying the Law of Demeter -- in the middle of a tight loop, and if you're assuming that those temporary variables have no performance cost at all because the world has plenty of registers, then you're going to be surprised.
And that might be the corner you're inhabiting right now.
But that's an exotic and special place! Most code isn't in the middle of a tight loop. Most tight loops aren't performance bottlenecks. And most tight loops don't need lots of intermediate variables.
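To make that concrete, here's a sketch of the kind of Demeter-obeying loop where the cost could show up (the `Order`/`Customer`/`Address` chain is entirely hypothetical):

    // Each temporary below is a strong reference under ARC, so the naive
    // compilation is a retain on assignment plus a release at end of scope
    // for every temporary, on every iteration.
    for (Order *order in orders) {
        Customer *customer = [order customer];   // retain/release pair (often elided)
        Address *address   = [customer address]; // another pair
        [labels addObject:[address mailingLabel]];
    }

In practice the ARC optimizer removes many of those pairs; profile before assuming they cost anything.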
Conversely, ARC can do the autorelease optimization in ways that you can't do manually (though perhaps the optimizer can). So, if there's a function returning an autoreleased variable in your tight loop, you may be better off with ARC.
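The autorelease win comes from a runtime handshake: a method that would return an autoreleased object under manual retain/release can hand the object directly to an ARC caller, skipping the autorelease pool entirely. Roughly (the `Widget` names are hypothetical; the two runtime functions are the real ones the compiler emits):

    // Conceptually, in the callee the compiler emits:
    Widget *makeWidget(void) {
        Widget *w = [[Widget alloc] init];
        return objc_autoreleaseReturnValue(w); // may skip the pool entirely...
    }

    // ...when the caller immediately emits the matching:
    Widget *w = objc_retainAutoreleasedReturnValue(makeWidget());

When both sides cooperate, the object never touches the autorelease pool, which is exactly the kind of optimization you can't express by hand in MRC code.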
Premature optimization is a bad idea. You may be in an inescapable performance corner, but most people are not. I spend most of my time in OS X, to be sure, but it's been years since I've had a performance issue where the answer wasn't a better algorithm.
(Finally, if ARC is causing a 70% performance hit in your application, then you're doing an awful lot of memory management on your critical path! Think about that: you're spending 70% of your time allocating and releasing objects. That sounds like a textbook case for a Flyweight, an object cache, or a recycling pool.)
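If that really is where the time goes, a minimal recycling pool might look like this. This is a sketch, not a drop-in class: it assumes a hypothetical `MyWidget` with a `-reset` method to clear per-use state, and it is not thread-safe.

    #import <Foundation/Foundation.h>

    @interface WidgetPool : NSObject
    - (MyWidget *)checkout;        // reuse a pooled widget if one is available
    - (void)checkin:(MyWidget *)w; // return a widget instead of letting it deallocate
    @end

    @implementation WidgetPool {
        NSMutableArray *_free;
    }
    - (instancetype)init {
        if ((self = [super init])) _free = [NSMutableArray array];
        return self;
    }
    - (MyWidget *)checkout {
        MyWidget *w = [_free lastObject];
        if (w) {
            [_free removeLastObject];
            [w reset]; // hypothetical: clear per-use state before reuse
            return w;
        }
        return [[MyWidget alloc] init]; // pool empty: fall back to allocation
    }
    - (void)checkin:(MyWidget *)w {
        [_free addObject:w];
    }
    @end

The point is simply that allocation disappears from the hot path: the loop body becomes `checkout` / `checkin` against already-live objects.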