I have a data structure which essentially amounts to a nested dictionary. Let's say it looks like this:
{'new jersey': {'mercer county': {'plumbers':
As for "obnoxious try/catch blocks":
d = {}
d.setdefault('key',{}).setdefault('inner key',{})['inner inner key'] = 'value'
print(d)
yields
{'key': {'inner key': {'inner inner key': 'value'}}}
You can use this to convert from your flat dictionary format to structured format:
fd = {('new jersey', 'mercer county', 'plumbers'): 3,
      ('new jersey', 'mercer county', 'programmers'): 81,
      ('new jersey', 'middlesex county', 'programmers'): 81,
      ('new jersey', 'middlesex county', 'salesmen'): 62,
      ('new york', 'queens county', 'plumbers'): 9,
      ('new york', 'queens county', 'salesmen'): 36}
d = {}  # start from an empty dict so only the converted data ends up in it
# items() works on both Python 2 and 3; iteritems() is Python 2 only
for (k1, k2, k3), v in fd.items():
    d.setdefault(k1, {}).setdefault(k2, {})[k3] = v
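The loop above hard-codes three key levels. If the key tuples can have other lengths, the same setdefault idea can be wrapped in a loop. This is only a sketch: the helper name `nest` is made up for illustration, and it assumes every key is a non-empty tuple and that no key is a prefix of another.

```python
def nest(flat):
    """Build a nested dict from a dict keyed by tuples of any length."""
    nested = {}
    for keys, value in flat.items():
        node = nested
        # Walk down, creating intermediate dicts for all but the last key.
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        # The last key holds the actual value.
        node[keys[-1]] = value
    return nested


fd = {('new jersey', 'mercer county', 'plumbers'): 3,
      ('new jersey', 'mercer county', 'programmers'): 81,
      ('new york', 'queens county', 'salesmen'): 36}

nested = nest(fd)
print(nested['new jersey']['mercer county']['programmers'])  # 81
```

Because each level is created with setdefault, rows that share a state or county end up in the same inner dict instead of overwriting each other.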
Just because I haven't seen one this small, here's a dict that gets as nested as you like, no sweat:
from collections import defaultdict
# yo dawg, i heard you liked dicts
def yodict():
    return defaultdict(yodict)
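For what it's worth, here is roughly how that could be used, plus one caveat worth knowing about: just reading a missing key creates it. The `to_plain` helper is not part of the original snippet; it is a hypothetical convenience for turning the result back into ordinary dicts.

```python
from collections import defaultdict


def yodict():
    return defaultdict(yodict)


d = yodict()
# Intermediate levels spring into existence on assignment.
d['new jersey']['mercer county']['plumbers'] = 3
d['new jersey']['mercer county']['programmers'] = 81

# Caveat: a plain lookup of a missing key also creates it.
_ = d['new york']['queens county']
print('new york' in d)  # True


def to_plain(node):
    """Recursively convert nested defaultdicts into regular dicts."""
    if isinstance(node, defaultdict):
        return {k: to_plain(v) for k, v in node.items()}
    return node


plain = to_plain(d)
print(plain['new jersey']['mercer county']['plumbers'])  # 3
```

Converting to plain dicts before handing the data to other code brings back the usual KeyError on missing keys, so typos in lookups do not silently add empty branches.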
Since you have a star-schema design, you might want to structure it more like a relational table and less like a dictionary.
import collections
class Jobs(object):
    """One row of the fact table: a job count for a state, county and title."""
    def __init__(self, state, county, title, count):
        self.state = state
        self.county = county
        self.title = title
        self.count = count
facts = [
    Jobs('new jersey', 'mercer county', 'plumbers', 3),
    # ...
]
def groupBy(facts, name):
    # Sum the counts of all facts that share the same value for attribute `name`.
    total = collections.defaultdict(int)
    for f in facts:
        key = getattr(f, name)
        total[key] += f.count
    return total
That kind of thing can go a long way toward creating a data warehouse-like design without the SQL overhead.
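A self-contained sketch of how this fact-table approach could be exercised. The rows are assumed to mirror the flat dictionary from the earlier answer; the variable names `by_state` and `by_title` are just for illustration.

```python
import collections


class Jobs(object):
    """One fact row: a job count for a (state, county, title) combination."""
    def __init__(self, state, county, title, count):
        self.state = state
        self.county = county
        self.title = title
        self.count = count


def groupBy(facts, name):
    """Sum counts over all facts sharing the same value of attribute `name`."""
    total = collections.defaultdict(int)
    for f in facts:
        total[getattr(f, name)] += f.count
    return total


# Assumption: same data as the flat dictionary shown earlier in the thread.
facts = [
    Jobs('new jersey', 'mercer county', 'plumbers', 3),
    Jobs('new jersey', 'mercer county', 'programmers', 81),
    Jobs('new jersey', 'middlesex county', 'programmers', 81),
    Jobs('new jersey', 'middlesex county', 'salesmen', 62),
    Jobs('new york', 'queens county', 'plumbers', 9),
    Jobs('new york', 'queens county', 'salesmen', 36),
]

by_state = groupBy(facts, 'state')
by_title = groupBy(facts, 'title')
print(by_state['new jersey'])  # 227
print(by_title['plumbers'])    # 12
```

The point of the design is that any dimension (state, county, title) can be rolled up with the same function, whereas a nested dict bakes in one fixed drill-down order.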