As I mentioned in the comments, I'm not sure why this is needed.
But one could simply override __setitem__ of a dictionary class, although this will most likely cause problems down the line. A minimal example of this would be:
class autodict(dict):
    def __init__(self, *args, **kwargs):
        super(autodict, self).__init__(*args, **kwargs)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        return val

    def __setitem__(self, key, val):
        pass

x = autodict({'a': 1, 'b': 2})
x['c'] = 3
print(x)
This prints {'a': 1, 'b': 2}, so the x['c'] = 3 assignment is silently ignored.
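To make the "problems down the line" concrete: overriding only __setitem__ does not stop the other mutating methods, because dict.update(), setdefault() and del do not go through the overridden method. The sketch below shows the leak and then one possible way to close it. The StrictAutodict class and its _blocked helper are hypothetical names of mine, not part of the example above, and they raise an error instead of silently ignoring the write, which is a design choice rather than a requirement.

```python
class autodict(dict):
    """The minimal version: only item assignment is ignored."""
    def __setitem__(self, key, val):
        pass

x = autodict({'a': 1, 'b': 2})
x['c'] = 3             # ignored by the overridden __setitem__
x.update({'d': 4})     # NOT ignored: update() bypasses __setitem__
x.setdefault('e', 5)   # NOT ignored either
del x['a']             # deletion was never overridden
print(x)               # {'b': 2, 'd': 4, 'e': 5}


class StrictAutodict(dict):
    """Hypothetical variant: every mutating method raises TypeError."""
    def _blocked(self, *args, **kwargs):
        raise TypeError("this dictionary is read-only")

    # Point every mutating entry point at the same blocking method.
    __setitem__ = __delitem__ = _blocked
    update = setdefault = pop = popitem = clear = _blocked
    __ior__ = _blocked  # covers "y |= {...}" on Python 3.9+

y = StrictAutodict({'a': 1, 'b': 2})
try:
    y.update({'d': 4})
except TypeError as err:
    print("blocked:", err)
print(y)               # {'a': 1, 'b': 2}
```

Even this is not truly immutable: calling dict.update(y, {...}) directly on the base class still gets through, so treat it as a guard against accidents rather than a guarantee.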
Some benefits
Creation is somewhere between 40 and 1000 times faster using dictionary inheritance compared to named tuples. (See below for crude speed tests.)
The in operator checks keys on dictionaries, but on named tuples it checks the values, not the field names, so it does not work so well when used like this:
'a' in nt  # False
'a' in x   # True
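A small sketch of why the two behave differently: a named tuple is still a tuple, so in searches its values, while a dictionary searches its keys. The Point class here is a hypothetical stand-in for whatever named tuple you actually use.

```python
from collections import namedtuple

Point = namedtuple("Point", ["a", "b"])  # hypothetical example type
nt = Point(a=1, b=2)
x = {"a": 1, "b": 2}

print("a" in x)           # True: dictionaries test membership against keys
print("a" in nt)          # False: tuples test membership against values
print(1 in nt)            # True: 1 is one of the stored values
print("a" in nt._fields)  # True: field names live in the _fields attribute
```

So a field-name check on a named tuple has to go through _fields (or hasattr), which is less direct than the dictionary form.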
You can use dictionary-style key access instead of (for lack of a better term) JavaScript-style attribute access:
x['a'] == nt.a
Although that's a matter of taste.
You also don't have to be picky about keys, since dictionaries accept essentially any hashable object as a key (foo here is the named tuple helper defined in the timing code below):
x[1] = 'a number'
nt = foo({1 : 'a number'})
With named tuples, the second line fails with ValueError: Type names and field names must be valid identifiers: '1'
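The key restriction can be reproduced in isolation. The sketch below calls namedtuple directly instead of going through foo, and also shows the rename=True option of namedtuple, which replaces invalid field names with positional ones such as _0. Whether renamed fields are acceptable depends on your use case; the type names used here are only illustrative.

```python
from collections import namedtuple

data = {1: 'a number'}

# A plain dictionary accepts the integer key without complaint.
x = dict(data)
print(x[1])

# namedtuple converts field names to strings and rejects '1'.
try:
    namedtuple("MyNamedTuple", list(data.keys()))
except ValueError as err:
    message = str(err)
    print("ValueError:", message)

# rename=True swaps invalid names for positional ones (_0, _1, ...).
Renamed = namedtuple("Renamed", list(data.keys()), rename=True)
nt = Renamed(*data.values())
print(Renamed._fields)  # ('_0',)
print(nt._0)            # a number
```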
Optimizations (timing the thing)
Now, this is a crude benchmark, and the numbers will vary a lot depending on the system, the position of the moon in the sky, etc. But as a rough indication:
import time
from collections import namedtuple

class autodict(dict):
    def __init__(self, *args, **kwargs):
        super(autodict, self).__init__(*args, **kwargs)
        #self.update(*args, **kwargs)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        return val

    def __setitem__(self, key, val):
        # Silently ignore item assignment
        pass

    def __type__(self, *args, **kwargs):
        # Note: __type__ is not a special method Python looks up, so this has no effect
        return dict

def foo(bar):
    # A brand-new namedtuple class is created on every call
    MyNamedTuple = namedtuple("MyNamedTuple", [k for k in bar.keys()])
    d = {k: v for k, v in bar.items()}
    return MyNamedTuple(**d)

# Create one million single-field named tuples
start = time.time()
for i in range(1000000):
    nt = foo({'x'+str(i) : i})
end = time.time()
print('Named tuples:', end - start,'seconds.')

# Create one million single-key autodicts
start = time.time()
for i in range(1000000):
    x = autodict({'x'+str(i) : i})
end = time.time()
print('Autodict:', end - start,'seconds.')
Results in:
Named tuples: 59.21987843513489 seconds.
Autodict: 1.4844810962677002 seconds.
The dictionary setup is, in my book, insanely quicker. That most likely has to do with the extra work in the named tuple setup (a new namedtuple class is built on every call to foo, on top of the loops over the keys and items), and it can probably be remedied fairly easily. But for a basic understanding, this is a big difference. The example obviously doesn't test larger one-time creations or access times. It only asks: "if you use these options to create data sets over a period of time, how much time would you lose?" :)
Bonus: What if you have a large base dictionary, and want to freeze it?
base_dict = {'x'+str(i) : i for i in range(1000000)}

start = time.time()
nt = foo(base_dict)
end = time.time()
print('Named tuples:', end - start,'seconds.')

start = time.time()
x = autodict(base_dict)
end = time.time()
print('Autodict:', end - start,'seconds.')
Well, the difference was bigger than I expected: the dictionary version was about 1038.5 times faster.
(I was using the CPU for other things at the time, but I think this is fair game.)
Named tuples: 154.0662612915039 seconds.
Autodict: 0.1483476161956787 seconds.
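As an aside, and a different technique from the dictionary subclass timed above: the standard library's types.MappingProxyType wraps an existing dictionary in a read-only view without copying it. I have not benchmarked it against the two options here, so this is only a sketch of how it behaves, using a small stand-in for base_dict.

```python
from types import MappingProxyType

base_dict = {'x' + str(i): i for i in range(5)}  # small stand-in for the large dictionary
frozen = MappingProxyType(base_dict)

print(frozen['x1'])    # reads work like a normal dictionary
print('x1' in frozen)  # so does the in operator

try:
    frozen['c'] = 3    # writes through the proxy are rejected
except TypeError as err:
    print("TypeError:", err)

# The proxy is a view, not a copy: changes to the underlying dict show through.
base_dict['new'] = 42
print(frozen['new'])
```

Since it is only a view, anyone who still holds a reference to base_dict can modify the data, so this freezes the interface rather than the contents.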