How can I extract keywords from a Python format string?

梦谈多话 2020-11-29 06:13

I want to provide automatic string formatting in an API such that:

my_api("path/to/{self.category}/{self.name}", ...)

can be replaced wit

4 Answers
  • 2020-11-29 06:23

    You can use the string.Formatter() class to parse out the fields in a string, with the Formatter.parse() method:

    from string import Formatter
    
    fieldnames = [fname for _, fname, _, _ in Formatter().parse(yourstring) if fname]
    

    Demo:

    >>> from string import Formatter
    >>> yourstring = "path/to/{self.category}/{self.name}"
    >>> [fname for _, fname, _, _ in Formatter().parse(yourstring) if fname]
    ['self.category', 'self.name']
    >>> yourstring = "non-keyword {keyword1} {{escaped brackets}} {} {keyword2}"
    >>> [fname for _, fname, _, _ in Formatter().parse(yourstring) if fname]
    ['keyword1', 'keyword2']
    

    You can parse those field names further with the str._formatter_field_name_split() method (Python 2) or the _string.formatter_field_name_split() function (Python 3). This is an internal implementation detail that is not otherwise exposed; Formatter.get_field() uses it internally. The function returns the first part of the name, the one that is looked up in the arguments passed to str.format(), plus a generator for the rest of the field.

    The generator yields (is_attribute, name) tuples; is_attribute is true if the next name is to be treated as an attribute, false if it is an item to look up with obj[name]:

    try:
        # Python 3
        from _string import formatter_field_name_split
    except ImportError:
        formatter_field_name_split = str._formatter_field_name_split
    from string import Formatter
    
    field_references = {formatter_field_name_split(fname)[0]
     for _, fname, _, _ in Formatter().parse(yourstring) if fname}
    

    Demo:

    >>> from string import Formatter
    >>> from _string import formatter_field_name_split
    >>> yourstring = "path/to/{self.category}/{self.name}"
    >>> {formatter_field_name_split(fname)[0]
    ...  for _, fname, _, _ in Formatter().parse(yourstring) if fname}
    {'self'}
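
To illustrate the (is_attribute, name) tuples described above, here is a minimal sketch that lists every lookup step for one field name. It assumes CPython 3, where the internal _string module is importable; describe_field is a hypothetical helper name, not part of any library.

```python
from _string import formatter_field_name_split  # CPython-internal helper, may change


def describe_field(field_name):
    """Return the root name plus the attribute/item lookups that follow it."""
    first, rest = formatter_field_name_split(field_name)
    # is_attribute is true for ".name" steps and false for "[key]" steps.
    steps = [("attribute" if is_attribute else "item", name)
             for is_attribute, name in rest]
    return first, steps


root, steps = describe_field("self.items[0].name")
print(root)   # self
print(steps)  # [('attribute', 'items'), ('item', 0), ('attribute', 'name')]
```

Numeric indexes such as [0] come back as integers, which is what lets "{a[0]}" index into a list.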
    

    Take into account that this function is part of the internal implementation details of the Formatter() class and can be changed or removed from Python without notice, and may not even be available in other Python implementations.
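
If depending on an internal function is a concern, the root names can be approximated with public API only. The following is a sketch under the assumption that splitting on the first "." or "[" is good enough for your format strings; root_field_names is a hypothetical helper, and it also descends into nested fields inside format specs such as {value:{width}}.

```python
import re
from string import Formatter


def root_field_names(template):
    """Return the top-level argument names a format string refers to, in order."""
    roots = []
    for _, field_name, format_spec, _ in Formatter().parse(template):
        if field_name:
            # Everything before the first "." or "[" is the name looked up
            # in the arguments passed to str.format().
            root = re.split(r"[.\[]", field_name, maxsplit=1)[0]
            if root not in roots:
                roots.append(root)
        if format_spec:
            # A format spec may itself contain replacement fields.
            for nested in root_field_names(format_spec):
                if nested not in roots:
                    roots.append(nested)
    return roots


print(root_field_names("path/to/{self.category}/{self.name}"))  # ['self']
```

Explicit positional fields such as {0} are reported as the string '0'; unnamed {} fields are skipped.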

  • 2020-11-29 06:24

    You can call "path/to/{self.category}/{self.name}".format(self=self); str.format() then resolves the attribute lookups on self for you. You could thus intercept those lookups in __getattr__ on the object you pass in.

  • 2020-11-29 06:37

    Building off Martijn's answer, an easier format for the comprehensive list that I've used is:

    >>> yourstring = "path/to/{self.category}/{self.name}"
    >>> [x[1] for x in yourstring._formatter_parser() if x[1]]
    ['self.category', 'self.name']
    

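    It's functionally exactly the same, just much easier to digest. Note that str._formatter_parser() exists only in Python 2; in Python 3 the equivalent internal function is _string.formatter_parser(yourstring), which is what Formatter().parse() calls.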

  • 2020-11-29 06:49

    If all placeholders are named, you can pass a special dictionary to str.format_map() that intercepts every key lookup and records the key in a list.

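    def format_keys(str_):
        class HelperDict(dict):
            """Dict that records every key str.format_map() looks up."""
            def __init__(self):
                super().__init__()
                self._keys = []
            def __getitem__(self, key):
                # Record the key; implicitly returns None, which renders as 'None'.
                self._keys.append(key)
        d = HelperDict()
        str_.format_map(d)  # the formatted result is discarded
        return d._keys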
    

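    Note that unnamed placeholders are not supported: str.format_map() rejects positional fields (Python 3 raises ValueError), whereas a plain .format() call with no arguments raises IndexError. Because every lookup returns None, fields with attribute access or a format spec, such as {self.name} or {width:>10}, will also fail.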
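
A variant of the same idea that tolerates attribute access, indexing, and format specs is sketched below. It assumes Python 3; _Anything and RecordingDict are hypothetical names, and the approach relies on dict.__missing__, which str.format_map() honors for dict subclasses.

```python
class _Anything:
    """Placeholder that accepts attribute access, indexing and any format spec."""

    def __getattr__(self, name):
        if name.startswith("__"):
            raise AttributeError(name)
        return self

    def __getitem__(self, key):
        return self

    def __format__(self, spec):
        return ""


class RecordingDict(dict):
    """Dict that records every missing key requested during formatting."""

    def __init__(self):
        super().__init__()
        self.requested = []

    def __missing__(self, key):
        self.requested.append(key)
        return _Anything()


def format_keys(template):
    recorder = RecordingDict()
    template.format_map(recorder)  # result discarded; only the lookups matter
    return recorder.requested


print(format_keys("path/to/{self.category}/{self.name}"))  # ['self', 'self']
```

Keys are recorded once per lookup, so repeated fields appear more than once; wrap the result in set() if only the distinct names are needed.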
