So I was reading this article about an attempt to remove the global interpreter lock (GIL) from the Python interpreter to improve multithreading performance and saw somethin
There is overhead, and it can be significant in rare cases (micro-benchmarks, for example ;), regardless of the optimizations that are in place (of which there are many). The normal case, though, is optimized for uncontended manipulation of the object's reference count.
So the question is, if reference counting is so lousy for threading, how does Objective-C do it?
There are multiple locks in play and, effectively, a retain/release on any given object selects an arbitrary lock (but always the same lock) for that object. This reduces lock contention without requiring one lock per object.
(And what Catfish_man said; some classes will implement their own reference counting scheme to use class-specific locking primitives to avoid contention and/or optimize for their specific needs.)
The implementation details are more complex.
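Setting those details aside, the "several locks, always the same one for a given object" idea can be sketched in plain C. This is only an illustration, not the runtime's actual code: the names (`stripes`, `stripe_index`, `obj_retain`, `obj_release`), the lock count of 8, and keeping the count inside the object are all assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: every name and constant here is an assumption. */
#define STRIPE_COUNT 8 /* power of two, so the index can be masked */

/* A small fixed pool of locks shared by all objects. */
static pthread_mutex_t stripes[STRIPE_COUNT] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};

typedef struct {
    unsigned long refcount; /* guarded by this object's stripe lock */
} object_t;

/* Derive the lock index from the object's address: the choice looks
 * arbitrary, but a given object always maps to the same lock. */
static size_t stripe_index(const void *obj) {
    uintptr_t addr = (uintptr_t)obj;
    /* Drop the low bits, which are usually zero because of alignment. */
    return (size_t)((addr >> 4) & (STRIPE_COUNT - 1));
}

void obj_retain(object_t *obj) {
    pthread_mutex_t *lock = &stripes[stripe_index(obj)];
    pthread_mutex_lock(lock);
    obj->refcount++;
    pthread_mutex_unlock(lock);
}

/* Returns 1 when the count reaches zero (the caller would then free
 * the object), 0 otherwise. */
int obj_release(object_t *obj) {
    pthread_mutex_t *lock = &stripes[stripe_index(obj)];
    pthread_mutex_lock(lock);
    assert(obj->refcount > 0); /* over-release is a caller bug */
    unsigned long remaining = --obj->refcount;
    pthread_mutex_unlock(lock);
    return remaining == 0;
}
```

Two threads only contend when their objects happen to map to the same lock, so contention drops as the number of locks grows, while the memory cost stays fixed no matter how many objects exist.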
Is Objective-C's reference counting actually technically unsafe with threads?
Nope -- it is safe with regard to threads.
In reality, typical code will call retain and release quite infrequently, compared to other operations. Thus, even if there were significant overhead on those code paths, it would be amortized across all the other operations in the app (where, say, pushing pixels to the screen is really expensive, by comparison).
If an object is shared across threads (bad idea, in general), then the locking overhead protecting the data access and manipulation will generally be vastly greater than the retain/release overhead because of the infrequency of retaining/releasing.
As far as Python's GIL overhead is concerned, I would bet that it has more to do with how often the reference count is incremented and decremented as a part of normal interpreter operations.
In addition to what bbum said, a lot of the most frequently thrown around objects in Cocoa override the normal reference counting mechanisms and store a refcount inline in the object, which they manipulate with atomic add and subtract instructions rather than locking.
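As a rough illustration of that inline, lock-free approach, here is a minimal C11 sketch. It is not Cocoa's implementation: the type `counted_t` and the functions `counted_init`, `counted_retain`, and `counted_release` are hypothetical names, and the memory orderings are simply a common choice for reference counts.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object with its reference count stored inline. */
typedef struct {
    atomic_ulong refcount;
    int payload; /* stand-in for the object's real data */
} counted_t;

void counted_init(counted_t *c, int payload) {
    atomic_init(&c->refcount, 1); /* the creator holds the first reference */
    c->payload = payload;
}

/* Retain: a single atomic add, no lock taken. */
void counted_retain(counted_t *c) {
    atomic_fetch_add_explicit(&c->refcount, 1, memory_order_relaxed);
}

/* Release: a single atomic subtract. Returns true when this call
 * dropped the last reference, so the caller may destroy the object. */
bool counted_release(counted_t *c) {
    unsigned long previous =
        atomic_fetch_sub_explicit(&c->refcount, 1, memory_order_acq_rel);
    assert(previous > 0); /* over-release is a caller bug */
    return previous == 1;
}
```

Because each operation is one atomic instruction on the object's own memory, threads working on different objects never wait on each other.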
(edit from the future: Objective-C now automatically does this optimization on modern Apple platforms, by mixing the refcount in with the 'isa' pointer)
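To give a feel for what "mixing the refcount in with the pointer" can look like, here is a hedged C11 sketch. The bit layout is invented for the example (low 48 bits for the class, high 16 bits for the count) and does not match any real platform's layout; all names are hypothetical, and overflow of the small count field, which a real runtime has to handle, is ignored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define CLASS_BITS 48
#define CLASS_MASK ((UINT64_C(1) << CLASS_BITS) - 1)
#define COUNT_ONE  (UINT64_C(1) << CLASS_BITS) /* one reference, shifted */

typedef struct {
    _Atomic uint64_t isa_bits; /* class bits and refcount in one word */
} packed_object_t;

void packed_init(packed_object_t *o, uint64_t class_bits) {
    assert((class_bits & ~CLASS_MASK) == 0); /* must fit in the low bits */
    atomic_init(&o->isa_bits, class_bits | COUNT_ONE);
}

uint64_t packed_class(packed_object_t *o) {
    return atomic_load(&o->isa_bits) & CLASS_MASK;
}

uint64_t packed_count(packed_object_t *o) {
    return atomic_load(&o->isa_bits) >> CLASS_BITS;
}

/* Adding COUNT_ONE bumps only the high bits; the class bits are untouched. */
void packed_retain(packed_object_t *o) {
    atomic_fetch_add(&o->isa_bits, COUNT_ONE);
}

/* Returns 1 when the last reference was dropped, 0 otherwise. */
int packed_release(packed_object_t *o) {
    uint64_t previous = atomic_fetch_sub(&o->isa_bits, COUNT_ONE);
    assert((previous >> CLASS_BITS) > 0); /* over-release is a caller bug */
    return (previous >> CLASS_BITS) == 1;
}
```

The appeal of this arrangement is that no extra storage or lock is needed per object: the count rides along in a word every object already has.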