Iterating over a dictionary in Python and stripping whitespace

佛祖请我去吃肉 2021-01-13 01:10

I am working with the web scraping framework Scrapy and I am a bit of a noob when it comes to Python. So I am wondering how do I iterate over all of the scraped items which

7 Answers
  • 2021-01-13 01:25

    Not a direct answer to the question, but I would suggest you look at Item Loaders and input/output processors. A lot of your cleanup can be taken care of there.

    An example which strips each entry would be:

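    from scrapy.loader import ItemLoader
    from itemloaders.processors import MapCompose  # older Scrapy: scrapy.loader.processors

    class StrippingItemLoader(ItemLoader):  # renamed so it does not shadow the base class
        # Strip whitespace from every collected value (unicode.strip on Python 2)
        default_output_processor = MapCompose(str.strip)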
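
To illustrate the idea behind a processor that maps a function over every collected value, here is a pure-Python sketch. `map_compose` is a hypothetical, simplified stand-in written for this example; it is not Scrapy's implementation and leaves out most of what the real processor does.

```python
def map_compose(*functions):
    """Hypothetical, simplified stand-in for a map-and-compose processor."""
    def process(values):
        for function in functions:
            # Apply the current function to every value, dropping None results.
            values = [result for result in map(function, values) if result is not None]
        return values
    return process


strip_each = map_compose(str.strip)
cleaned = strip_each(["  Widget ", "\tBlue\n"])
print(cleaned)
```

Each function is applied to every value in turn, so the stripping happens once, in the loader, instead of in every spider callback.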
    
  • 2021-01-13 01:31

    What you should note is that lstrip() returns a copy of the string rather than modifying the object. To actually update your dictionary, you'll need to assign the stripped value back to the item.

    For example:

    for k, v in your_dict.iteritems():
        your_dict[k] = v.lstrip()
    

    Note the use of .iteritems() which returns an iterator instead of a list of key value pairs. This makes it somewhat more efficient.

    I should add that in Python 3, .items() returns a "view" rather than a list, and .iteritems() no longer exists, so use .items() there.
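
A minimal Python 3 sketch of the point made in this answer, using made-up sample data: `lstrip()` leaves the stored value alone until the result is assigned back.

```python
item = {"title": "  Widget", "price": "  9.99"}

# lstrip() returns a new string; the value stored in the dict is untouched.
stripped = item["title"].lstrip()
unchanged = item["title"]

# Assigning back updates the dictionary. Only values change, never the set
# of keys, so this is safe to do while iterating over item.items().
for key, value in item.items():
    item[key] = value.lstrip()

print(item)
```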

  • 2021-01-13 01:31

    Although @zquare had the best answer for this question, I feel I need to chime in with a Pythonic method that also accounts for dictionary values that are not strings. Mind you, this is not recursive: it only works on flat, one-level dictionaries.

    d.update({k: v.lstrip() for k, v in d.items() if isinstance(v, str) and v.startswith(' ')})
    

    This updates the original dictionary value if the value is a string and starts with a space.

    UPDATE: If you want to use regular expressions and avoid startswith and endswith, you can use this:

    import re
    rex = re.compile(r'^\s|\s$')
    d.update({k: v.strip() for k, v in d.items() if isinstance(v, str) and rex.search(v)})
    

    This version strips the value if it has a leading or trailing whitespace character.
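
Here is the regular-expression version run on a small sample dictionary. The data and the name `edge_whitespace` are assumptions for illustration. Note that `strip()` already returns the string unchanged when there is nothing to strip, so the regex check mainly avoids rewriting entries that are already clean.

```python
import re

# Matches a whitespace character at the very start or the very end.
edge_whitespace = re.compile(r"^\s|\s$")

d = {"name": "  Alice ", "city": "Oslo", "age": 30, "note": "line\n"}

# Non-string values (such as the int) are skipped by the isinstance check.
d.update({k: v.strip() for k, v in d.items()
          if isinstance(v, str) and edge_whitespace.search(v)})

print(d)
```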

  • 2021-01-13 01:34

    Assuming you would like to strip the values of yourDict creating a new dict called newDict:

    newDict = dict(zip(yourDict.keys(), [v.strip() if isinstance(v,str) else v for v in yourDict.values()]))
    

    This code handles values of mixed types: non-string values such as int or float are copied over unchanged.
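
As a sketch, the same result can also be written as a dict comprehension, which some readers may find easier to follow than `dict(zip(...))`. The sample data is made up; the check at the end confirms that both forms agree and that the original dictionary is left alone.

```python
yourDict = {"name": " Alice ", "age": 30, "score": 4.5, "tags": None}

# Form used in the answer above.
viaZip = dict(zip(yourDict.keys(),
                  [v.strip() if isinstance(v, str) else v for v in yourDict.values()]))

# Equivalent dict comprehension.
newDict = {k: v.strip() if isinstance(v, str) else v for k, v in yourDict.items()}

print(newDict == viaZip)
```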

  • 2021-01-13 01:35

    Try

    for k,v in item.items():
       item[k] = v.replace(' ', '')
    

    or as a dict comprehension, as suggested by monkut:

    newDic = {k: v.replace(' ', '') for k, v in item.items()}
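
One thing worth checking before using `replace(' ', '')`: it removes every space in the value, not only the leading and trailing ones. A short comparison on made-up data:

```python
item = {"title": "  New York City ", "code": " NYC"}

# replace(' ', '') removes every space, including the ones between words.
no_spaces = {k: v.replace(" ", "") for k, v in item.items()}

# strip() removes only leading and trailing whitespace.
trimmed = {k: v.strip() for k, v in item.items()}

print(no_spaces)
print(trimmed)
```

Both forms assume every value is a string; a non-string value would raise an AttributeError.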
    
  • 2021-01-13 01:35

    I use the following. You can pass any object as an argument, including a string, list or dictionary.

    # strip any type of object
    def strip_all(x):
      if isinstance(x, str): # if using python2 replace str with basestring to include unicode type
        x = x.strip()
      elif isinstance(x, list):
        x = [strip_all(v) for v in x]
      elif isinstance(x, dict):
        # iterate over a snapshot, because the dict is modified inside the loop
        for k, v in list(x.items()):
          x.pop(k)  # also strip keys
          x[ strip_all(k) ] = strip_all(v)
      return x
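
If modifying the input in place is not wanted, a non-mutating variant is one option. This is a sketch, not the answerer's code: the name `stripped_copy` and the sample data are assumptions for illustration.

```python
def stripped_copy(x):
    """Return a whitespace-stripped copy of x without modifying the input."""
    if isinstance(x, str):
        return x.strip()
    if isinstance(x, list):
        return [stripped_copy(v) for v in x]
    if isinstance(x, dict):
        # Build a new dict, stripping keys and values alike.
        return {stripped_copy(k): stripped_copy(v) for k, v in x.items()}
    return x  # any other type is returned as-is


raw = {" name ": " Alice ", "tags": [" a", "b "], "meta": {" id": 7}}
clean = stripped_copy(raw)
print(clean)
```

Because a new dictionary is built, nothing is popped or inserted during iteration. If two keys become equal after stripping, the later one overwrites the earlier.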
    