I have a data structure which essentially amounts to a nested dictionary. Let\'s say it looks like this:
{\'new jersey\': {\'mercer county\': {\'plumbers\':
You could create a YAML file and read it in using PyYaml.
Step 1: Create a YAML file, "employment.yml":
new jersey:
mercer county:
pumbers: 3
programmers: 81
middlesex county:
salesmen: 62
programmers: 81
new york:
queens county:
plumbers: 9
salesmen: 36
Step 2: Read it in Python
import yaml
file_handle = open("employment.yml")
my_shnazzy_dictionary = yaml.safe_load(file_handle)
file_handle.close()
and now my_shnazzy_dictionary
has all your values. If you needed to do this on the fly, you can create the YAML as a string and feed that into yaml.safe_load(...)
.
I find setdefault
quite useful; It checks if a key is present and adds it if not:
d = {}
d.setdefault('new jersey', {}).setdefault('mercer county', {})['plumbers'] = 3
setdefault
always returns the relevant key, so you are actually updating the values of 'd
' in place.
When it comes to iterating, I'm sure you could write a generator easily enough if one doesn't already exist in Python:
def iterateStates(d):
# Let's count up the total number of "plumbers" / "dentists" / etc.
# across all counties and states
job_totals = {}
# I guess this is the annoying nested stuff you were talking about?
for (state, counties) in d.iteritems():
for (county, jobs) in counties.iteritems():
for (job, num) in jobs.iteritems():
# If job isn't already in job_totals, default it to zero
job_totals[job] = job_totals.get(job, 0) + num
# Now return an iterator of (job, number) tuples
return job_totals.iteritems()
# Display all jobs
for (job, num) in iterateStates(d):
print "There are %d %s in total" % (job, num)
You can use Addict: https://github.com/mewwts/addict
>>> from addict import Dict
>>> my_new_shiny_dict = Dict()
>>> my_new_shiny_dict.a.b.c.d.e = 2
>>> my_new_shiny_dict
{'a': {'b': {'c': {'d': {'e': 2}}}}}
I used to use this function. its safe, quick, easily maintainable.
def deep_get(dictionary, keys, default=None):
return reduce(lambda d, key: d.get(key, default) if isinstance(d, dict) else default, keys.split("."), dictionary)
Example :
>>> from functools import reduce
>>> def deep_get(dictionary, keys, default=None):
... return reduce(lambda d, key: d.get(key, default) if isinstance(d, dict) else default, keys.split("."), dictionary)
...
>>> person = {'person':{'name':{'first':'John'}}}
>>> print (deep_get(person, "person.name.first"))
John
>>> print (deep_get(person, "person.name.lastname"))
None
>>> print (deep_get(person, "person.name.lastname", default="No lastname"))
No lastname
>>>
As others have suggested, a relational database could be more useful to you. You can use a in-memory sqlite3 database as a data structure to create tables and then query them.
import sqlite3
c = sqlite3.Connection(':memory:')
c.execute('CREATE TABLE jobs (state, county, title, count)')
c.executemany('insert into jobs values (?, ?, ?, ?)', [
('New Jersey', 'Mercer County', 'Programmers', 81),
('New Jersey', 'Mercer County', 'Plumbers', 3),
('New Jersey', 'Middlesex County', 'Programmers', 81),
('New Jersey', 'Middlesex County', 'Salesmen', 62),
('New York', 'Queens County', 'Salesmen', 36),
('New York', 'Queens County', 'Plumbers', 9),
])
# some example queries
print list(c.execute('SELECT * FROM jobs WHERE county = "Queens County"'))
print list(c.execute('SELECT SUM(count) FROM jobs WHERE title = "Programmers"'))
This is just a simple example. You could define separate tables for states, counties and job titles.
defaultdict()
is your friend!
For a two dimensional dictionary you can do:
d = defaultdict(defaultdict)
d[1][2] = 3
For more dimensions you can:
d = defaultdict(lambda :defaultdict(defaultdict))
d[1][2][3] = 4