I'm a Java developer who's toyed around with Python on and off. I recently stumbled upon this article which mentions common mistakes Java programmers make when they pick u
Great answer by BrenBarn, but I would change 'If it doesn't need access to the class or the instance, make it a function' to:
'If it doesn't need access to the class or the instance but is thematically related to the class (typical examples: helper functions and conversion functions used by other methods of the class or by alternate constructors), then use staticmethod; otherwise make it a module function.'
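To make that concrete, here is a minimal sketch of a conversion helper kept on the class because an alternate constructor uses it. The Temperature class and all of its method names are hypothetical, made up purely for illustration:

```python
class Temperature:
    """Hypothetical class, used only to illustrate the comment above."""

    def __init__(self, celsius):
        self.celsius = celsius

    @staticmethod
    def _fahrenheit_to_celsius(fahrenheit):
        # Needs neither the class nor an instance, but is thematically
        # tied to Temperature, so it is grouped with it.
        return (fahrenheit - 32.0) * 5.0 / 9.0

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # Alternate constructor: needs the class, so it is a classmethod.
        return cls(cls._fahrenheit_to_celsius(fahrenheit))


boiling = Temperature.from_fahrenheit(212.0)
```

Whether the helper lives here or at module level is a judgment call; the behaviour is the same either way.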
The most straightforward way to think about it is to think in terms of what type of object the method needs in order to do its work. If your method needs access to an instance, make it a regular method. If it needs access to the class, make it a classmethod. If it doesn't need access to the class or the instance, make it a function. There is rarely a need to make something a staticmethod, but if you find you want a function to be "grouped" with a class (e.g., so it can be overridden) even though it doesn't need access to the class, I guess you could make it a staticmethod.
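As a rough sketch of that rule of thumb (every name below is invented for the example, nothing here comes from a real library):

```python
def clamp(value, low, high):
    # Needs neither a class nor an instance: a plain module-level function.
    return max(low, min(high, value))


class Counter:
    step = 1

    def __init__(self, start=0):
        self.value = start

    def increment(self):
        # Needs the instance: a regular method.
        self.value = clamp(self.value + self.step, 0, 10)
        return self.value

    @classmethod
    def starting_at(cls, start):
        # Needs the class (so subclasses get their own type back): a classmethod.
        return cls(start)

    @staticmethod
    def label(value):
        # Needs neither, but is grouped with the class so it can be overridden.
        return "count=" + str(value)

    def describe(self):
        return self.label(self.value)


class LoudCounter(Counter):
    @staticmethod
    def label(value):
        return "COUNT=" + str(value) + "!"
```

Here LoudCounter(3).describe() picks up the overridden label, which is about the only thing the staticmethod buys you over a module function.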
I would add that putting functions at the module level doesn't "pollute" the namespace. If the functions are meant to be used, they're not polluting the namespace, they're using it just as it should be used. Functions are legitimate objects in a module, just like classes or anything else. There's no reason to hide a function in a class if it doesn't have any reason to be there.
The best answer depends on how the function is going to be used. In my case, I write application packages that will be used in Jupyter notebooks. My main goal is to make things easy for the user.
The main advantage of module-level function definitions is that the user can import the defining module under an alias using the "as" keyword. This allows the user to call the functions in the same way that they would call a function in numpy or matplotlib.
One of the disadvantages of Python is that names cannot be protected against further assignment. However, if "import numpy as np" appears at the top of the notebook, it's a strong hint that "np" should not be used as a common variable name. You can accomplish the same thing with class names, obviously, but user familiarity counts for a lot.
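A small sketch of both points, using the standard library's math module as a stand-in for a user package (the alias m and the variable names are arbitrary choices for this example):

```python
import math as m  # stand-in for "import mypackage as mp" in a notebook

# Functions are called through the alias, numpy-style.
area = m.pi * 2.0 ** 2

# Nothing prevents the alias from being rebound later on.
saved = m
m = "oops"
try:
    m.sqrt(4.0)
except AttributeError as exc:
    error = exc  # a str has no sqrt attribute

m = saved  # restore the alias
```

The import line at the top is only a convention that discourages reusing the name; the language itself does not enforce it.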
Inside the packages, however, I prefer to use static methods. My software architecture is object oriented, and I write with Eclipse, which I use for multiple target languages. It's convenient to open the source file and see the class definition at the top level, method definitions indented one level, and so on. The audience for the code at this level is mainly other analysts and developers, so it's better to avoid language-specific idioms.
I don't have a lot of confidence in Python namespace management, especially when using design patterns where (say) an object passes a reference to itself so that the called object can call a method defined on the caller. So I try not to force it too far. I use a lot of fully qualified names and explicit instance variables (with self) where in other languages I could count on the interpreter or the compiler managing the scope more closely. It's easier to do this with classes and static methods, which is why I think they are the better choice for complex packages where abstraction and information hiding are most useful.
This is not really an answer, but rather a lengthy comment:
Even more puzzling is that this code:
class A: def foo(x): print(x) A.foo(5)
Fails as expected in Python 2.7.3 but works fine in 3.2.3 (although you can't call the method on an instance of A, only on the class.)
I'll try to explain what happens here.
This is, strictly speaking, an abuse of the "normal" instance method protocol.
What you define here is a method, but with the first (and only) parameter not named self, but x. Of course you can call the method on an instance of A, but you'll have to call it like this:
A().foo()
or
a = A()
a.foo()
so the instance is given to the function as first argument.
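A quick way to see this for yourself; the sketch below mirrors the class from the question but returns x instead of printing it, so the result can be inspected:

```python
class A:
    def foo(x):  # unconventional parameter name, kept from the question
        return x


a = A()
received = a.foo()   # no explicit argument: the instance itself arrives as x
bound = a.foo        # a bound method that remembers its instance
```

Here received is the very same object as a, and bound.__self__ is a as well.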
The possibility to call regular methods via the class has always been there and works by
a = A()
A.foo(a)
Here, because you call the method on the class rather than on the instance, the first argument is not supplied automatically; you have to provide it yourself. As long as this is an instance of A, everything is OK. Passing something else is IMO an abuse of the protocol, and it explains the difference between Py2 and Py3:
In Py2, A.foo gets transformed to an unbound method and thus requires its first argument to be an instance of the class it "lives" in. Calling it with something else will fail.
In Py3, this check has been dropped and A.foo is just the original function object. So you can call it with anything as first argument, but I wouldn't do it. The first parameter of a method should always be named self and have the semantics of self.
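For Python 3 only, the behaviour can be checked with a short sketch like the following. Class B is my own addition, showing what I would write instead if the function really is meant to be called without an instance:

```python
import types


class A:
    def foo(x):
        return x


# In Python 3, looking the method up on the class yields the plain function.
is_plain_function = isinstance(A.foo, types.FunctionType)
result = A.foo(5)  # works, but abuses the instance method protocol

# On an instance the instance is passed first, so an extra argument fails.
try:
    A().foo(5)
except TypeError as exc:
    instance_call_error = exc


class B:
    @staticmethod
    def foo(x):  # explicit: no instance is expected here
        return x
```

With the staticmethod version, both B.foo(5) and B().foo(5) work and the intent is visible to the reader.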