Correct way to detect sequence parameter?

情书的邮戳 2020-11-28 13:48

I want to write a function that accepts a parameter which can be either a sequence or a single value. The type of value is str, int, etc., but I don't want

12 answers
  • 2020-11-28 13:55

    I'm new here, so I don't know the correct way to do this. I want to respond to the answers so far:

    The problem with all of the above mentioned ways is that str is considered a sequence (it's iterable, has __getitem__, etc.) yet it's usually treated as a single item.

    For example, a function may accept an argument that can be either a filename or a list of filenames. What's the most Pythonic way for the function to tell the former from the latter?

    Should I post this as a new question? Edit the original one?

  • 2020-11-28 13:56

    I think what I would do is check whether the object has certain methods that indicate it is a sequence. I'm not sure if there is an official definition of what makes a sequence. The best I can think of is, it must support slicing. So you could say:

    # Python 2 only: __getslice__ was removed in Python 3, where slicing goes through __getitem__
    is_sequence = '__getslice__' in dir(X)
    

    You might also check for the particular functionality you're going to be using.

    As pi pointed out in the comment, one issue is that a string is a sequence, but you probably don't want to treat it as one. You could add an explicit test that the type is not str.
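    A minimal Python 3 sketch of that explicit test, assuming the collections.abc.Sequence ABC is an acceptable definition of "sequence". The helper name is_sequence comes from the snippet above; also excluding bytes and bytearray is an assumption about what callers want:

```python
from collections.abc import Sequence

def is_sequence(obj):
    """Return True for list-like sequences, False for strings and scalars."""
    # str, bytes and bytearray are Sequences too, so rule them out first.
    if isinstance(obj, (str, bytes, bytearray)):
        return False
    return isinstance(obj, Sequence)

print(is_sequence(["a.txt", "b.txt"]))  # True
print(is_sequence(("a.txt",)))          # True
print(is_sequence("a.txt"))             # False
print(is_sequence(42))                  # False
```

    Sets, dicts and generators are not Sequences, so this helper reports False for them as well.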

  • 2020-11-28 13:57

    The problem with all of the above mentioned ways is that str is considered a sequence (it's iterable, has __getitem__, etc.) yet it's usually treated as a single item.

    For example, a function may accept an argument that can be either a filename or a list of filenames. What's the most Pythonic way for the function to tell the former from the latter?

    Based on the revised question, it sounds like what you want is something more like:

    def to_sequence(arg):
        ''' 
        determine whether an arg should be treated as a "unit" or a "sequence"
        if it's a unit, return a 1-tuple with the arg
        '''
        def _multiple(x):
            # Relies on Python 2, where str has no __iter__; in Python 3 it does
            return hasattr(x, "__iter__")
        if _multiple(arg):  
            return arg
        else:
            return (arg,)
    
    >>> to_sequence("a string")
    ('a string',)
    >>> to_sequence( (1,2,3) )
    (1, 2, 3)
    >>> to_sequence( xrange(5) )
    xrange(5)
    

    This isn't guaranteed to handle all types, but it handles the cases you mention quite well, and should do the right thing for most of the built-in types.

    When using it, make sure whatever receives the output of this can handle iterables.
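    On Python 3 the __iter__ test no longer separates strings from lists, because str defines __iter__ there. A hedged sketch of the same idea for Python 3, assuming that str and bytes should count as units (the _ATOMIC_TYPES name is hypothetical):

```python
from collections.abc import Iterable

# Types that are iterable but should still be treated as a single unit.
# This tuple is an assumption; extend it to suit your own code.
_ATOMIC_TYPES = (str, bytes)

def to_sequence(arg):
    """Return arg unchanged if it is a non-string iterable, else a 1-tuple."""
    if isinstance(arg, Iterable) and not isinstance(arg, _ATOMIC_TYPES):
        return arg
    return (arg,)

print(to_sequence("a string"))   # ('a string',)
print(to_sequence((1, 2, 3)))    # (1, 2, 3)
print(to_sequence(range(5)))     # range(0, 5)
print(to_sequence(7))            # (7,)
```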

  • 2020-11-28 14:01

    You're asking the wrong question. You don't try to detect types in Python; you detect behavior.

    1. Write one function that handles a single value (let's call it _use_single_val).
    2. Write another function that handles a sequence parameter (let's call it _use_sequence).
    3. Write a third parent function that calls the two above. (call it use_seq_or_val). Surround each call with an exception handler to catch an invalid parameter (i.e. not single value or sequence).
    4. Write unit tests to pass correct & incorrect parameters to the parent function to make sure it catches the exceptions properly.
    
        def _use_single_val(v):
            print(v + 1)  # raises TypeError if v does not support "+ 1"
    
        def _use_sequence(s):
            print(s[0])   # raises TypeError if s is not indexable
    
        def use_seq_or_val(item):
            try:
                _use_single_val(item)
                return  # handled as a single value
            except TypeError:
                pass
    
            try:
                _use_sequence(item)
                return  # handled as a sequence
            except TypeError:
                pass
    
            # Neither handler accepted the item
            raise TypeError("item not a single value or sequence")
    

    EDIT: Revised to handle the "sequence or single value" asked about in the question.
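    Step 4 might look like the following sketch. It assumes a variant of use_seq_or_val that returns a label instead of printing, so the tests have something to assert on; that return value and the test names are illustrative and not part of the answer itself:

```python
import unittest

def _use_single_val(v):
    return v + 1   # raises TypeError if v does not support "+ 1"

def _use_sequence(s):
    return s[0]    # raises TypeError if s is not indexable

def use_seq_or_val(item):
    """Return which handler accepted item: "single" or "sequence"."""
    try:
        _use_single_val(item)
    except TypeError:
        pass
    else:
        return "single"
    try:
        _use_sequence(item)
    except TypeError:
        pass
    else:
        return "sequence"
    raise TypeError("item not a single value or sequence")

class UseSeqOrValTest(unittest.TestCase):
    def test_single_value(self):
        self.assertEqual(use_seq_or_val(41), "single")

    def test_sequence(self):
        self.assertEqual(use_seq_or_val([1, 2, 3]), "sequence")

    def test_string_is_handled_as_sequence(self):
        # "abc" + 1 raises TypeError, but "abc"[0] works.
        self.assertEqual(use_seq_or_val("abc"), "sequence")

    def test_invalid_parameter(self):
        with self.assertRaises(TypeError):
            use_seq_or_val(None)
```

    Saved to a file, the tests can be run with python -m unittest. The third test makes the thread's recurring point visible: behavior-based detection still treats a string as a sequence.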

  • 2020-11-28 14:05

    In cases like this, I prefer to just always take the sequence type or always take the scalar. Strings won't be the only types that would behave poorly in this setup; rather, any type that has an aggregate use and allows iteration over its parts might misbehave.
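    One way to read this advice, as a sketch: the function only ever accepts an iterable of filenames, and a caller holding a single name wraps it explicitly. The function names here are hypothetical, and the varargs wrapper is just one possible convenience:

```python
import os.path

def extensions(filenames):
    """Always takes an iterable of filenames, never a bare string."""
    return [os.path.splitext(name)[1] for name in filenames]

def extensions_of(*filenames):
    """Varargs wrapper: callers pass one or more names positionally."""
    return extensions(filenames)

print(extensions(["a.txt"]))            # ['.txt']
print(extensions(["a.txt", "b.py"]))    # ['.txt', '.py']
print(extensions_of("a.txt", "b.py"))   # ['.txt', '.py']
```

    With this design no type detection is needed at all; the cost is that every caller with a single item has to write the brackets.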

  • 2020-11-28 14:08

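    As of 2.6, use abstract base classes. (On Python 3.3 and later they live in collections.abc; the collections.Sequence alias used below was removed in Python 3.10.)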

    >>> import collections
    >>> isinstance([], collections.Sequence)
    True
    >>> isinstance(0, collections.Sequence)
    False
    

    Furthermore, ABCs can be customized to account for exceptions, such as not considering strings to be sequences. Here is an example:

    import abc
    import collections
    
    class Atomic(object):
        __metaclass__ = abc.ABCMeta
        @classmethod
        def __subclasshook__(cls, other):
            return not issubclass(other, collections.Sequence) or NotImplemented
    
    Atomic.register(basestring)
    

    After registration the Atomic class can be used with isinstance and issubclass:

    assert isinstance("hello", Atomic) == True
    

    This is still much better than a hard-coded list, because you only need to register the exceptions to the rule, and external users of the code can register their own.

    Note that in Python 3 the syntax for specifying metaclasses changed and the basestring abstract superclass was removed, which requires something like the following to be used instead:

    import abc
    import collections.abc  # collections.Sequence was removed in Python 3.10
    
    class Atomic(metaclass=abc.ABCMeta):
        @classmethod
        def __subclasshook__(cls, other):
            return not issubclass(other, collections.abc.Sequence) or NotImplemented
    
    Atomic.register(str)
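
    A hedged sketch of how the Python 3 version might be used to normalize an argument, and of external code registering its own exception to the rule. The as_list helper and the Point class are hypothetical, and collections.abc.Sequence is used because the collections.Sequence alias no longer exists on Python 3.10+:

```python
import abc
import collections.abc

class Atomic(metaclass=abc.ABCMeta):
    """Anything that is not a Sequence, plus explicitly registered types."""
    @classmethod
    def __subclasshook__(cls, other):
        return not issubclass(other, collections.abc.Sequence) or NotImplemented

Atomic.register(str)

def as_list(arg):
    """Wrap atomic values in a list; copy real sequences into a list."""
    if isinstance(arg, Atomic):
        return [arg]
    return list(arg)

class Point(tuple):
    """Hypothetical user type: a tuple subclass meant to be one value."""

Atomic.register(Point)  # external code opting its own type in

print(as_list("a.txt"))             # ['a.txt']
print(as_list(["a.txt", "b.txt"]))  # ['a.txt', 'b.txt']
print(as_list(3))                   # [3]
print(as_list(Point((1, 2))))       # [(1, 2)]
```

    Note that anything that is not a Sequence counts as Atomic under this hook, which includes dicts, sets and generators.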
    

    If desired, it's possible to write code which is compatible with both Python 2.6+ and 3.x, but doing so requires a slightly more complicated technique that dynamically creates the needed abstract base class, thereby avoiding syntax errors due to the metaclass syntax difference. This is essentially the same as what the with_metaclass() function in Benjamin Peterson's six module does.

    import abc
    try:
        from collections.abc import Sequence  # Python 3.3+
    except ImportError:
        from collections import Sequence  # Python 2.6+
    
    class _AtomicBase(object):
        @classmethod
        def __subclasshook__(cls, other):
            return not issubclass(other, Sequence) or NotImplemented
    
    class Atomic(abc.ABCMeta("NewMeta", (_AtomicBase,), {})):
        pass
    
    try:
        unicode = unicode
    except NameError:  # 'unicode' is undefined, assume Python >= 3
        Atomic.register(str)  # str includes unicode in Py3, make both Atomic
        Atomic.register(bytes)  # bytes will also be considered Atomic (optional)
    else:
        # basestring is the abstract superclass of both str and unicode types
        Atomic.register(basestring)  # make both types of strings Atomic
    

    In versions before 2.6, there are type checkers in the operator module (these functions were removed in Python 3).

    >>> import operator
    >>> operator.isSequenceType([])
    True
    >>> operator.isSequenceType(0)
    False
    