Python pretty print dictionary of lists, abbreviate long lists

Submitted by 北慕城南 on 2020-01-09 10:01:27

Question


I have a dictionary of lists and the lists are quite long. How can I print it in a way that only a few elements of the list show up? Obviously, I can write a custom function for that but is there any built-in way or library that can achieve this? For example when printing large data frames, pandas prints it nicely in a short way.

This example better illustrates what I mean:

obj = {'key_1': ['EG8XYD9FVN',
  'S2WARDCVAO',
  'J00YCU55DP',
  'R07BUIF2F7',
  'VGPS1JD0UM',
  'WL3TWSDP8E',
  'LD8QY7DMJ3',
  'J36U3Z9KOQ',
  'KU2FUGYB2U',
  'JF3RQ315BY'],
 'key_2': ['162LO154PM',
  '3ROAV881V2',
  'I4T79LP18J',
  'WBD36EM6QL',
  'DEIODVQU46',
  'KWSJA5WDKQ',
  'WX9SVRFO0G',
  '6UN63WU64G',
  '3Z89U7XM60',
  '167CYON6YN']}

Desired output: something like this:

{'key_1':
    ['EG8XYD9FVN', 'S2WARDCVAO', '...'],
 'key_2':
    ['162LO154PM', '3ROAV881V2', '...']
}

Answer 1:


If it weren't for the pretty printing, the reprlib module would be the way to go: Safe, elegant and customizable handling of deeply nested and recursive / self-referencing data structures is what it has been made for.

However, it turns out that combining the reprlib and pprint modules isn't trivial. At least I couldn't come up with a clean way that doesn't break some of the pretty-printing aspects.

So instead, here's a solution that just subclasses PrettyPrinter to crop / abbreviate lists as necessary:

from pprint import PrettyPrinter


obj = {
    'key_1': [
        'EG8XYD9FVN', 'S2WARDCVAO', 'J00YCU55DP', 'R07BUIF2F7', 'VGPS1JD0UM',
        'WL3TWSDP8E', 'LD8QY7DMJ3', 'J36U3Z9KOQ', 'KU2FUGYB2U', 'JF3RQ315BY',
    ],
    'key_2': [
        '162LO154PM', '3ROAV881V2', 'I4T79LP18J', 'WBD36EM6QL', 'DEIODVQU46',
        'KWSJA5WDKQ', 'WX9SVRFO0G', '6UN63WU64G', '3Z89U7XM60', '167CYON6YN',
    ],
    # Test case to make sure we didn't break handling of recursive structures
    'key_3': [
        '162LO154PM', '3ROAV881V2', [1, 2, ['a', 'b', 'c'], 3, 4, 5, 6, 7],
        'KWSJA5WDKQ', 'WX9SVRFO0G', '6UN63WU64G', '3Z89U7XM60', '167CYON6YN',
    ]
}


class CroppingPrettyPrinter(PrettyPrinter):

    def __init__(self, *args, **kwargs):
        self.maxlist = kwargs.pop('maxlist', 6)
        return PrettyPrinter.__init__(self, *args, **kwargs)

    def _format(self, obj, stream, indent, allowance, context, level):
        if isinstance(obj, list):
            # If object is a list, crop a copy of it according to self.maxlist
            # and append an ellipsis
            if len(obj) > self.maxlist:
                cropped_obj = obj[:self.maxlist] + ['...']
                return PrettyPrinter._format(
                    self, cropped_obj, stream, indent,
                    allowance, context, level)

        # Let the original implementation handle anything else
        # Note: The explicit PrettyPrinter._format call is required on Python 2,
        # where PrettyPrinter is an old-style class; on Python 3, super() works too
        return PrettyPrinter._format(
            self, obj, stream, indent, allowance, context, level)


p = CroppingPrettyPrinter(maxlist=3)
p.pprint(obj)

Output with maxlist=3:

{'key_1': ['EG8XYD9FVN', 'S2WARDCVAO', 'J00YCU55DP', '...'],
 'key_2': ['162LO154PM', '3ROAV881V2', 'I4T79LP18J', '...'],
 'key_3': ['162LO154PM',
           '3ROAV881V2',
           [1, 2, ['a', 'b', 'c'], '...'],
           '...']}

Output with maxlist=5 (triggers splitting the lists on separate lines):

{'key_1': ['EG8XYD9FVN',
           'S2WARDCVAO',
           'J00YCU55DP',
           'R07BUIF2F7',
           'VGPS1JD0UM',
           '...'],
 'key_2': ['162LO154PM',
           '3ROAV881V2',
           'I4T79LP18J',
           'WBD36EM6QL',
           'DEIODVQU46',
           '...'],
 'key_3': ['162LO154PM',
           '3ROAV881V2',
           [1, 2, ['a', 'b', 'c'], 3, 4, '...'],
           'KWSJA5WDKQ',
           'WX9SVRFO0G',
           '...']}

Notes:

  • This will create cropped copies of lists. Each copy holds at most maxlist + 1 references, so the extra memory only becomes significant when maxlist itself is large.
  • This only deals with the special case of lists. Equivalent behavior would have to be implemented for dicts, tuples, sets, frozensets, ... for this class to be of general use.
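
One way to cover more container types is to crop the data before it reaches the printer, rather than inside the printer. The following is only a sketch of that alternative, not part of the answer's class: the helper name `abbreviate` and its `max_items` parameter are made up for illustration. It handles dicts (all keys kept, values cropped), lists and tuples, leaves sets out, and does not guard against self-referencing structures.

```python
from itertools import islice
from pprint import pformat


def abbreviate(value, max_items=3):
    """Return a cropped copy of `value` (hypothetical helper, not a pprint API)."""
    if isinstance(value, dict):
        # Keep every key, but crop whatever the values contain
        return {key: abbreviate(item, max_items) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Copy only the first max_items elements, then mark the cut
        head = [abbreviate(item, max_items) for item in islice(value, max_items)]
        if len(value) > max_items:
            head.append('...')
        return tuple(head) if isinstance(value, tuple) else head
    # Anything else (strings, numbers, sets, ...) is passed through unchanged
    return value


data = {
    'key_1': ['EG8XYD9FVN', 'S2WARDCVAO', 'J00YCU55DP', 'R07BUIF2F7'],
    'key_2': ('162LO154PM', [1, 2, 3, 4, 5], '3ROAV881V2'),
}
print(pformat(abbreviate(data, max_items=2)))
```

Because the cropping happens up front, the standard pprint functions can be used unchanged, and the original data is left untouched.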



Answer 2:


You could use the pprint module:

import pprint

pprint.pprint(obj)

Would output:

{'key_1': ['EG8XYD9FVN',
           'S2WARDCVAO',
           'J00YCU55DP',
           'R07BUIF2F7',
           'VGPS1JD0UM',
           'WL3TWSDP8E',
           'LD8QY7DMJ3',
           'J36U3Z9KOQ',
           'KU2FUGYB2U',
           'JF3RQ315BY'],
 'key_2': ['162LO154PM',
           '3ROAV881V2',
           'I4T79LP18J',
           'WBD36EM6QL',
           'DEIODVQU46',
           'KWSJA5WDKQ',
           'WX9SVRFO0G',
           '6UN63WU64G',
           '3Z89U7XM60',
           '167CYON6YN']}

And,

pprint.pprint(obj, depth=1)

Would output:

{'key_1': [...], 'key_2': [...]}

And,

pprint.pprint(obj, compact=True)

Would output:

{'key_1': ['EG8XYD9FVN', 'S2WARDCVAO', 'J00YCU55DP', 'R07BUIF2F7',
           'VGPS1JD0UM', 'WL3TWSDP8E', 'LD8QY7DMJ3', 'J36U3Z9KOQ',
           'KU2FUGYB2U', 'JF3RQ315BY'],
 'key_2': ['162LO154PM', '3ROAV881V2', 'I4T79LP18J', 'WBD36EM6QL',
           'DEIODVQU46', 'KWSJA5WDKQ', 'WX9SVRFO0G', '6UN63WU64G',
           '3Z89U7XM60', '167CYON6YN']}



Answer 3:


You could use IPython.lib.pretty.

from IPython.lib.pretty import pprint

> pprint(obj, max_seq_length=5)
{'key_1': ['EG8XYD9FVN',
  'S2WARDCVAO',
  'J00YCU55DP',
  'R07BUIF2F7',
  'VGPS1JD0UM',
  ...],
 'key_2': ['162LO154PM',
  '3ROAV881V2',
  'I4T79LP18J',
  'WBD36EM6QL',
  'DEIODVQU46',
  ...]}

> pprint({i: list(range(i + 5)) for i in range(100)}, max_seq_length=10)
{0: [0, 1, 2, 3, 4],
 1: [0, 1, 2, 3, 4, 5],
 2: [0, 1, 2, 3, 4, 5, 6],
 3: [0, 1, 2, 3, 4, 5, 6, 7],
 4: [0, 1, 2, 3, 4, 5, 6, 7, 8],
 5: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 6: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...],
 7: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...],
 8: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...],
 9: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...],
 ...}

For older versions of IPython, you might exploit RepresentationPrinter:

from IPython.lib.pretty import RepresentationPrinter
import sys

def compact_pprint(obj, max_seq_length=10):
    printer = RepresentationPrinter(sys.stdout)
    printer.max_seq_length = max_seq_length
    printer.pretty(obj)
    printer.flush()



Answer 4:


This recursive function I wrote prints a nested dictionary with indentation. You can choose the starting indentation too. Note that it prints each list in full rather than abbreviating it.

def pretty(d, indent=0):
    # Print each key, then its value one tab level deeper
    for key in sorted(d.keys()):
        print('\t' * indent + str(key))
        if isinstance(d[key], dict):
            # Recurse into nested dictionaries
            pretty(d[key], indent + 1)
        else:
            print('\t' * (indent + 1) + str(d[key]))

The output of your dictionary is:

key_1
    ['EG8XYD9FVN', 'S2WARDCVAO', 'J00YCU55DP', 'R07BUIF2F7', 'VGPS1JD0UM', 'WL3TWSDP8E', 'LD8QY7DMJ3', 'J36U3Z9KOQ', 'KU2FUGYB2U', 'JF3RQ315BY']
key_2
    ['162LO154PM', '3ROAV881V2', 'I4T79LP18J', 'WBD36EM6QL', 'DEIODVQU46', 'KWSJA5WDKQ', 'WX9SVRFO0G', '6UN63WU64G', '3Z89U7XM60', '167CYON6YN']
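
A variant of this approach that also crops long lists could look like the sketch below. It assumes Python 3; the name `pretty_lines` and the `max_items` parameter are hypothetical additions, not part of the original answer. It yields lines instead of printing them directly, which makes the result easier to reuse.

```python
def pretty_lines(d, indent=0, max_items=2):
    """Yield indented lines for a nested dict, cropping long lists (hypothetical variant)."""
    for key in sorted(d):
        yield '\t' * indent + str(key)
        value = d[key]
        if isinstance(value, dict):
            # Recurse into nested dictionaries one tab level deeper
            yield from pretty_lines(value, indent + 1, max_items)
        elif isinstance(value, list) and len(value) > max_items:
            # Show only the first max_items elements, followed by an ellipsis
            shown = ', '.join(repr(item) for item in value[:max_items])
            yield '\t' * (indent + 1) + '[' + shown + ', ...]'
        else:
            yield '\t' * (indent + 1) + str(value)


sample = {'key_1': ['EG8XYD9FVN', 'S2WARDCVAO', 'J00YCU55DP'], 'meta': {'count': 3}}
print('\n'.join(pretty_lines(sample)))
```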



Answer 5:


Use reprlib. The formatting is not that pretty, but it actually abbreviates.

> import reprlib
> reprlib.repr([list(range(100000)) for _ in range(10)])
'[[0, 1, 2, 3, 4, 5, ...], [0, 1, 2, 3, 4, 5, ...], [0, 1, 2, 3, 4, 5, ...], [0, 1, 2, 3, 4, 5, ...], [0, 1, 2, 3, 4, 5, ...], [0, 1, 2, 3, 4, 5, ...], ...]'
> reprlib.repr({i: list(range(100000)) for i in range(10)})
'{0: [0, 1, 2, 3, 4, 5, ...], 1: [0, 1, 2, 3, 4, 5, ...], 2: [0, 1, 2, 3, 4, 5, ...], 3: [0, 1, 2, 3, 4, 5, ...], ...}'
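
The limits seen above (six list elements, four dictionary entries) are the defaults of `reprlib.Repr` and can be changed per instance. The following is a small sketch of that, with shortened sample data and an arbitrary choice of limits; the variable name `shortener` is just illustrative.

```python
import reprlib

shortener = reprlib.Repr()
shortener.maxlist = 2    # show at most two elements per list
shortener.maxdict = 10   # show up to ten dictionary entries

data = {
    'key_1': ['EG8XYD9FVN', 'S2WARDCVAO', 'J00YCU55DP', 'R07BUIF2F7'],
    'key_2': ['162LO154PM', '3ROAV881V2', 'I4T79LP18J', 'WBD36EM6QL'],
}
text = shortener.repr(data)
print(text)
```

The result is a single-line string in which each cut is marked with an unquoted `...`, so it still has to be wrapped or indented separately if a multi-line layout is wanted.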


Source: https://stackoverflow.com/questions/38533282/python-pretty-print-dictionary-of-lists-abbreviate-long-lists
