Get first word of all strings in lists

粉色の甜心 2021-01-22 13:37

I have a CSV file which I'm reading in like below. I need to get the first word of each string. I know how to get the first letter, but I'm not sure how to get whole words.

4 answers
  • 2021-01-22 13:50

    You can use a set comprehension to collect the unique first words (sorted here, because sets are unordered):

    >>> l = [['diffuse systemic sclerosis', 'back', 'public on july 15 2008'],
    ...      ['diffuse systemic sclerosis', 'forearm', 'public on may 9 2014']]
    
    >>> sorted({i.split()[0] for j in l for i in j})
    ['back', 'diffuse', 'forearm', 'public']
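
    If the nested lists come straight from csv.reader, the same comprehension can be applied to the rows it yields. A minimal sketch, assuming made-up CSV text built from the sample rows and a hypothetical helper named unique_first_words (neither is from the question):

```python
import csv
import io

# Hypothetical sample data standing in for the CSV file from the question.
CSV_TEXT = (
    "diffuse systemic sclerosis,back,public on july 15 2008\n"
    "diffuse systemic sclerosis,forearm,public on may 9 2014\n"
)


def unique_first_words(rows):
    """Return the sorted unique first words of every non-blank cell."""
    return sorted(
        {cell.split()[0] for row in rows for cell in row if cell.strip()}
    )


# csv.reader yields one list of strings per line, like the nested list above.
rows = list(csv.reader(io.StringIO(CSV_TEXT)))
print(unique_first_words(rows))  # ['back', 'diffuse', 'forearm', 'public']
```

    For a real file, io.StringIO(CSV_TEXT) would be replaced by a file opened with open(path, newline='').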
    
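  • from functools import reduce  # needed in Python 3, where reduce is not a builtin

    l = [
        ['diffuse systemic sclerosis', 'back', 'public on july 15 2008'],
        ['diffuse systemic sclerosis', 'forearm', 'public on may 9 2014']
        ]
    d = lambda o: [a.split().pop(0) for a in o]  # first word of each string in one list
    r = lambda a,b: d(a) + d(b)  # combine the first words of two lists
    print("\n".join(set(reduce(r, l))))  # a set is unordered, so the line order may vary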
    >>> 
    public
    forearm
    diffuse
    back
    
  • 2021-01-22 13:55

    You can use a list comprehension with the split() method:

    >>> l=['diffuse systemic sclerosis', 'back', 'public on july 15 2008']
    >>> [i.split()[0] for i in l]
    ['diffuse', 'back', 'public']
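
    One caveat: split()[0] raises IndexError when a string is empty or contains only whitespace. A minimal sketch of a guarded version, assuming a hypothetical helper named first_word and an empty-string fallback (both are illustrative choices, not part of the original answer):

```python
def first_word(text, default=""):
    """Return the first whitespace-separated word, or `default` if there is none."""
    parts = text.split(maxsplit=1)
    return parts[0] if parts else default


l = ['diffuse systemic sclerosis', 'back', '', '   ']
print([first_word(i) for i in l])  # ['diffuse', 'back', '', '']
```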
    
  • 2021-01-22 14:14

    You can use str.split in a list comprehension, noting that you can specify maxsplit to reduce the number of operations:

    L = ['diffuse systemic sclerosis', 'back', 'public on july 15 2008']
    
    res = [i.split(maxsplit=1)[0] for i in L]
    # ['diffuse', 'back', 'public']
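
    To see what maxsplit changes, compare the two calls on one of the sample strings. This is only an illustration; the variable names are arbitrary:

```python
s = 'public on july 15 2008'

full = s.split()               # splits at every run of whitespace
limited = s.split(maxsplit=1)  # stops after the first split

print(full)     # ['public', 'on', 'july', '15', '2008']
print(limited)  # ['public', 'on july 15 2008']
```

    Both results have the same first element, but the second call does not split the remainder of the string.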
    

    You can also perform the same operations functionally:

    from operator import itemgetter, methodcaller
    
    splitter = methodcaller('split', maxsplit=1)
    res = list(map(itemgetter(0), map(splitter, L)))
    

    Across multiple lists, if you wish to preserve the order in which unique first words are first seen, you can use the itertools unique_everseen recipe, also found in the more_itertools library:

    from itertools import chain
    from more_itertools import unique_everseen
    
    L1 = ['diffuse systemic sclerosis', 'back', 'public on july 15 2008']
    L2 = ['diffuse systemic sclerosis', 'forearm', 'public on may 9 2014']
    
    res = list(unique_everseen(i.split(maxsplit=1)[0] for i in chain(L1, L2)))
    
    # ['diffuse', 'back', 'public', 'forearm']
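
    If installing a third-party library is not an option, a similar order-preserving result can be sketched with dict.fromkeys, relying on dicts keeping insertion order (guaranteed from Python 3.7). This swaps in a different technique from unique_everseen and assumes the first words are hashable, which strings always are:

```python
from itertools import chain

L1 = ['diffuse systemic sclerosis', 'back', 'public on july 15 2008']
L2 = ['diffuse systemic sclerosis', 'forearm', 'public on may 9 2014']

# Lazily yield the first word of every string across both lists.
first_words = (i.split(maxsplit=1)[0] for i in chain(L1, L2))

# dict keys are unique and keep first-seen order, so this deduplicates in order.
res = list(dict.fromkeys(first_words))
print(res)  # ['diffuse', 'back', 'public', 'forearm']
```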
    