A few times I accidentally modified the input to a function. Since Python has no constant references, I\'m wondering what coding techniques might help me avoid making this mista
Making copies of parameters 'just-in-case' is a bad idea: you end up paying for it in lousy performance; or you have to keep track of the sizes of your arguments instead.
Better to get a good understanding of objects and names and how Python deals with them. A good start being this article.
The importart point being that
def modi_list(alist):
alist.append(4)
some_list = [1, 2, 3]
modi_list(some_list)
print(some_list)
has exactly the same affect as
some_list = [1, 2, 3]
same_list = some_list
same_list.append(4)
print(some_list)
because in the function call no copying of arguments is taking place, no creating pointers is taking place... what is taking place is Python saying alist = some_list
and then executing the code in the function modi_list()
. In other words, Python is binding (or assigning) another name to the same object.
Finally, when you do have a function that is going to modify its arguments, and you don't want those changes visible outside the function, you can usually just do a shallow copy:
def dont_modi_list(alist):
alist = alist[:] # make a shallow copy
alist.append(4)
Now some_list
and alist
are two different list objects that happen to contain the same objects -- so if you are just messing with the list object (inserting, deleting, rearranging) then you are fine, buf if you are going to go even deeper and cause modifications to the objects in the list then you will need to do a deepcopy()
. But it's up to you to keep track of such things and code appropriately.
You can use a metaclass as follows:
import copy, new
class MakeACopyOfConstructorArguments(type):
def __new__(cls, name, bases, dct):
rv = type.__new__(cls, name, bases, dct)
old_init = dct.get("__init__")
if old_init is not None:
cls.__old_init = old_init
def new_init(self, *a, **kw):
a = copy.deepcopy(a)
kw = copy.deepcopy(kw)
cls.__old_init(self, *a, **kw)
rv.__init__ = new.instancemethod(new_init, rv, cls)
return rv
class Test(object):
__metaclass__ = MakeACopyOfConstructorArguments
def __init__(self, li):
li[0]=3
print li
li = range(3)
print li
t = Test(li)
print li
First general rule: don't modify containers: create new ones.
So don't modify your incoming dictionary, create a new dictionary with a subset of the keys.
self.fields = dict( key, value for key, value in fields.items()
if accept_key(key, data) )
Such methods are typically slightly more efficient then going through and deleting the bad elements anyways. More generally, its often easier to avoid modifying objects and instead create new ones.
Second general rule: don't modify containers after passing them off.
You can't generally assume that containers to which you have passed data have made their own copies. As result, don't try to modify the containers you've given them. Any modifications should be done before handing off the data. Once you've passed the container to somebody else you are no longer the sole master of it.
Third general rule: don't modify containers you didn't create.
If you get passed some sort of container, you do not know who else might be using the container. So don't modify it. Either use the unmodified version or invoke rule1, creating a new container with the desired changes.
Fourth general rule: (stolen from Ethan Furman)
Some functions are supposed to modify the list. That is their job. If this is the case make that apparent in the function name (such as the list methods append and extend).
Putting it all together:
A piece of code should only modify a container when it is the only piece of code with access to that container.
There is a best practice for this, in Python, and is called unit testing.
The main point here is that dynamic languages allows for much rapid development even with full unit tests; and unit tests are a much tougher safety net than static typing.
As writes Martin Fowler:
The general argument for static types is that it catches bugs that are otherwise hard to find. But I discovered that in the presence of SelfTestingCode, most bugs that static types would have were found just as easily by the tests.