Convert tuple keys of a dict into a new dict

暖寄归人 2021-01-24 01:46

I have a dict like this:

{
    ('America', 25, 'm', 'IT'): 10000,
    ('America', 22, 'm', 'IT'): 8999,
    ('Japan',   24, 'f', 'I


        
3 Answers
  • 2021-01-24 02:15

    You can use a dict comprehension to do this:

    >>> data = {
    ...     ('America', 25, 'm', 'IT'): 10000,
    ...     ('America', 22, 'm', 'IT'): 8999,
    ...     ('Japan',   24, 'f', 'IT'): 9999,
    ...     ('Japan',   23, 'f', 'IT'): 9000
    ... }
    >>> {x: value for (w, x, y, z), value in data.items() if w == "America" and y == "m" and z == "IT"}
    {25: 10000, 22: 8999}
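
    If the same filter is needed for several combinations, the comprehension above could be wrapped in a small function. This is only a sketch: the name `ages_for` and its parameters are made up for illustration and do not appear in the question.

```python
def ages_for(data, country, sex, job):
    """Return {age: value} for every key matching country, sex and job.

    Assumes each key is a 4-tuple of (country, age, sex, job).
    """
    return {
        age: value
        for (c, age, s, j), value in data.items()
        if (c, s, j) == (country, sex, job)
    }


data = {
    ('America', 25, 'm', 'IT'): 10000,
    ('America', 22, 'm', 'IT'): 8999,
    ('Japan',   24, 'f', 'IT'): 9999,
    ('Japan',   23, 'f', 'IT'): 9000,
}

american_men_in_it = ages_for(data, 'America', 'm', 'IT')
japanese_women_in_it = ages_for(data, 'Japan', 'f', 'IT')
print(american_men_in_it)
print(japanese_women_in_it)
```

    A combination with no matching keys simply yields an empty dict.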
    
  • 2021-01-24 02:16

    Since I like namedtuples, here is an alternative suggestion:

    Store your dictionary as a list or set of namedtuples, e.g.,

    >>> from collections import namedtuple
    >>> Entry = namedtuple('Entry', ('country', 'age', 'sex', 'job', 'count'))
    

    To convert your existing dictionary dt:

    >>> nt = [Entry(*k, v) for k, v in dt.items()]
    

    Now, you could fetch the desired entries in a quite readable way, e.g.,

    >>> results = [i for i in nt if (i.country=='America' and i.sex=='m' and i.job=='IT')]
    

    Or, for example, to get the counts:

    >>> [i.count for i in nt if (i.country=='America' and i.sex=='m' and i.job=='IT')]
    [10000, 8999]
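
    The repeated `i.country == ... and i.sex == ...` conditions could also be folded into a helper that takes the criteria as keyword arguments. A minimal sketch, assuming the same `Entry` fields as above; the helper name `select` is hypothetical:

```python
from collections import namedtuple

Entry = namedtuple('Entry', ('country', 'age', 'sex', 'job', 'count'))


def select(entries, **criteria):
    """Return the entries whose fields equal every given keyword criterion."""
    return [
        e for e in entries
        if all(getattr(e, field) == wanted for field, wanted in criteria.items())
    ]


dt = {
    ('America', 25, 'm', 'IT'): 10000,
    ('America', 22, 'm', 'IT'): 8999,
    ('Japan',   24, 'f', 'IT'): 9999,
    ('Japan',   23, 'f', 'IT'): 9000,
}

# Flatten each (key tuple, value) pair into one namedtuple record.
nt = [Entry(*key, value) for key, value in dt.items()]

matches = select(nt, country='America', sex='m', job='IT')
age_to_count = {e.age: e.count for e in matches}
print(age_to_count)
```

    Note that a field named `count` hides the built-in `tuple.count` method on these records, which is harmless here but worth knowing.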
    

    Edit: Performance

    I was not sure whether performance mattered to you, since you asked for an "easier way to do it". You are right, though: the plain dict comprehension is faster:

    dt = {
        ('America', 25, 'm', 'IT'): 10000,
        ('America', 22, 'm', 'IT'): 8999,
        ('Japan',   24, 'f', 'IT'): 9999,
        ('Japan',   23, 'f', 'IT'): 9000
    }
    
    nt = [Entry(*k, v) for k, v in dt.items()]
    
    %timeit {i.age:i.count for i in nt if (i.country=='America' and i.sex=='m' and i.job=='IT')}
    
    100000 loops, best of 3: 3.62 µs per loop
    
    %timeit {x: value for (w, x, y, z), value in dt.items() if w == "America" and y == "m" and z == "IT"}
    
    100000 loops, best of 3: 2.42 µs per loop
    

    But if you have a larger dataset and query it over and over again, I would also consider pandas or SQLite.

    import pandas as pd

    # One row per dict item: the four key fields followed by the value.
    df = pd.DataFrame([list(x[0]) + [x[1]] for x in dt.items()])
    df.columns = ['country', 'age', 'sex', 'job', 'count']
    df
    


    df.loc[(df.country=='America') & (df.sex=='m') & (df.job=='IT')]
    


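    For the SQLite route, something along these lines might work with the standard-library `sqlite3` module. It is a sketch under assumptions: the table name `entries` and the column name `cnt` are invented here, and an in-memory database stands in for a real file.

```python
import sqlite3

dt = {
    ('America', 25, 'm', 'IT'): 10000,
    ('America', 22, 'm', 'IT'): 8999,
    ('Japan',   24, 'f', 'IT'): 9999,
    ('Japan',   23, 'f', 'IT'): 9000,
}

# In-memory database; pass a file path instead to persist the data.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE entries '
    '(country TEXT, age INTEGER, sex TEXT, job TEXT, cnt INTEGER)'
)
conn.executemany(
    'INSERT INTO entries VALUES (?, ?, ?, ?, ?)',
    [(*key, value) for key, value in dt.items()],
)

# Parameterized query; ORDER BY makes the row order deterministic.
rows = conn.execute(
    'SELECT age, cnt FROM entries '
    'WHERE country = ? AND sex = ? AND job = ? ORDER BY age',
    ('America', 'm', 'IT'),
).fetchall()
conn.close()

result = dict(rows)  # each row is an (age, cnt) pair
print(result)
```

    For repeated lookups on a large table, an index on the filtered columns would likely help, but that is beyond what this small example needs.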
  • 2021-01-24 02:37

    You can replace your entire try/except with this:

    res.setdefault((country, sex, job), {})[age] = cnt
    

    Or you could make res a defaultdict(dict) and then it becomes:

    res[country, sex, job][age] = cnt
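
    Put together, the `defaultdict(dict)` variant might look like the sketch below. The question's own loop is cut off above, so the iteration over `data.items()` and the grouping by `(country, sex, job)` are assumptions inferred from the one-liners in this answer.

```python
from collections import defaultdict

data = {
    ('America', 25, 'm', 'IT'): 10000,
    ('America', 22, 'm', 'IT'): 8999,
    ('Japan',   24, 'f', 'IT'): 9999,
    ('Japan',   23, 'f', 'IT'): 9000,
}

# Missing (country, sex, job) groups are created as empty dicts on first access.
res = defaultdict(dict)
for (country, age, sex, job), cnt in data.items():
    res[country, sex, job][age] = cnt

# Convert back to a plain dict so later lookups of unknown keys raise KeyError.
res = dict(res)
print(res)
```

    With a plain dict instead of a `defaultdict`, the loop body would be the `setdefault` line shown earlier and nothing else changes.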
    