Slicing
Positive and negative indexing are basic usage, so they are not covered here.
```python
>>> a = list(range(11))
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> a[2:8]
[2, 3, 4, 5, 6, 7]
# Slicing with negative indices
>>> a[-4:-2]
[7, 8]
# Slicing with a step
>>> a[::2]
[0, 2, 4, 6, 8, 10]
>>> a[::-2]
[10, 8, 6, 4, 2, 0]
# Assigning to a slice
>>> b = [1, 2, 3, 4, 5]
>>> b[2:3] = [0, 0]
>>> b
[1, 2, 0, 0, 4, 5]
>>> b[1:1] = [8, 9]
>>> b
[1, 8, 9, 2, 0, 0, 4, 5]
>>> b[1:-1] = []
>>> b
[1, 5]
# Slicing with the slice() function
>>> a = list(range(6))
>>> a
[0, 1, 2, 3, 4, 5]
>>> sa = slice(-3, None)
>>> sa
slice(-3, None, None)
>>> a[sa]
[3, 4, 5]
```
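Because slice() returns an ordinary object, it can be given a name and reused, which often reads better than repeating bare numbers. The following is only a sketch: the fixed-width record layout and the names `record`, `NAME`, and `PRICE` are invented for illustration.

```python
# Hypothetical fixed-width record; the layout is made up for this example.
record = "apple     0012.50"

NAME = slice(0, 10)      # columns 0-9 hold the name
PRICE = slice(10, None)  # everything after column 9 holds the price

name = record[NAME].strip()
price = float(record[PRICE])

# slice.indices(length) resolves negative and None bounds for a given length.
last_three = slice(-3, None)
bounds = last_three.indices(6)

print(name, price, bounds)  # apple 12.5 (3, 6, 1)
```

The same slice object works on any sequence type (list, tuple, str), so it can be defined once and shared.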
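Iteration
Basic iteration
Given a list or a tuple, we can traverse it with a for loop. This kind of traversal is called iteration.
In many languages iteration is done by index, for example in Java: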
```java
for (int i = 0; i < list.length; i++) {
    n = list[i];
}
```
Python's for loop is not limited to lists and tuples. It works on any other iterable object as well, such as a dict.
```python
>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> for key in d:
...     print(key)
...
a
c
b
```
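Note that before Python 3.7 a dict did not guarantee any particular order, so the keys could come out in a different order from the one in which they were written (a, c, b for the dict above). Since Python 3.7, dicts preserve insertion order.
By default, iterating over a dict yields its keys. To iterate over the values, use for value in d.values(); to iterate over keys and values together, use for k, v in d.items().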
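The three ways of iterating over a dict can be put side by side in a short sketch. The variable names are arbitrary, and the printed order assumes Python 3.7 or later.

```python
d = {'a': 1, 'b': 2, 'c': 3}

keys = [k for k in d]                    # iterating a dict yields its keys
values = [v for v in d.values()]         # values only
pairs = [(k, v) for k, v in d.items()]   # (key, value) tuples

print(keys)    # ['a', 'b', 'c'] on Python 3.7+
print(values)  # [1, 2, 3] on Python 3.7+
print(pairs)   # [('a', 1), ('b', 2), ('c', 3)] on Python 3.7+
```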
Strings are iterable as well, for example:
```python
>>> for ch in 'ABC':
...     print(ch)
...
A
B
C
```
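Checking whether an object is iterable:
Use the Iterable type from the collections.abc module (importing it from collections directly no longer works since Python 3.10).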
```python
>>> from collections.abc import Iterable
>>> isinstance('abc', Iterable)  # is a str iterable?
True
>>> isinstance([1, 2, 3], Iterable)  # is a list iterable?
True
>>> isinstance(123, Iterable)  # is an int iterable?
False
```
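One caveat with the isinstance() check: it only detects classes that define `__iter__()`. An object that supports iteration through `__getitem__()` alone is not recognized, so calling iter() and catching TypeError is the more reliable test. The sketch below illustrates this; the class `Squares` and the helper `is_iterable` are hypothetical names made up for the example.

```python
from collections.abc import Iterable


class Squares:
    """Hypothetical sequence-like class that defines only __getitem__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * index


def is_iterable(obj):
    """Return True if iter(obj) succeeds, False if it raises TypeError."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


s = Squares()
print(isinstance(s, Iterable))  # False: the class has no __iter__ method
print(is_iterable(s))           # True: iter() falls back to __getitem__
print(list(s))                  # [0, 1, 4]
print(is_iterable(123))         # False
```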
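To loop by index in Python the way the Java code above does, use the built-in enumerate function. It turns a list into index-element pairs, so the for loop can iterate over the index and the element at the same time: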
```python
>>> for i, value in enumerate(['A', 'B', 'C']):
...     print(i, value)
...
0 A
1 B
2 C
```
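enumerate also accepts an optional start argument, which is convenient when the numbering should begin at 1 rather than 0. A small sketch with arbitrary variable names:

```python
letters = ['A', 'B', 'C']

# start=1 shifts the counter only; the elements themselves are unchanged.
numbered = [(i, value) for i, value in enumerate(letters, start=1)]

print(numbered)  # [(1, 'A'), (2, 'B'), (3, 'C')]
```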
Iterator
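Objects that can be used directly in a for loop are collectively called iterables: Iterable.
Objects that can be passed to next() and keep returning the next value are called iterators: Iterator.
Every generator is an Iterator, whereas list, dict, and str are Iterable but not Iterator.
The official documentation defines the two terms as follows: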
iterable
An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() or __getitem__() method. Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), ...). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop. See also iterator, sequence, and generator.
iterator
An object representing a stream of data. Repeated calls to the iterator’s __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again. Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container.
An Iterable such as a list, dict, or str can be turned into an Iterator with the iter() function:
```python
# The iterator object in Python 2.7
>>> t = (123, 'abc', 3.14)
>>> i = iter(t)
>>> i
<tupleiterator object at 0x10b2d2210>
>>> i.next()
123
>>> i.next()
'abc'
>>> i.next()
3.14
>>> i.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
# When you, or a looping construct such as a for statement, need the next
# item, calling the iterator's next() method returns it. Once every item has
# been taken, StopIteration is raised. This does not mean an error occurred;
# it only tells the caller that the iteration is complete.

# In Python 3 the same call fails because iterator objects no longer have a
# next() method (it became __next__()); use the built-in next() function instead.
>>> t = (123, 'abc', 3.14)
>>> i = iter(t)
>>> i
<tuple_iterator object at 0x1022288d0>
>>> i.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple_iterator' object has no attribute 'next'
>>> next(i)
123
>>> next(i)
'abc'
>>> next(i)
3.14
>>> next(i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

# Check whether the iterator itself is iterable
>>> from collections.abc import Iterable
>>> isinstance(i, Iterable)
True
```
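The "multiple iteration passes" caveat from the glossary entry quoted earlier can be checked directly. The sketch below uses arbitrary variable names and compares a list, which hands out a fresh iterator on every pass, with an iterator, which is used up after one pass.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
it = iter(numbers)

print(isinstance(numbers, Iterable), isinstance(numbers, Iterator))  # True False
print(isinstance(it, Iterable), isinstance(it, Iterator))            # True True
print(iter(it) is it)  # True: an iterator's __iter__() returns itself

# A container can be traversed again and again...
list_pass_1 = list(numbers)
list_pass_2 = list(numbers)

# ...but an iterator is exhausted after the first pass.
iter_pass_1 = list(it)
iter_pass_2 = list(it)

print(list_pass_1, list_pass_2)  # [1, 2, 3] [1, 2, 3]
print(iter_pass_1, iter_pass_2)  # [1, 2, 3] []
```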
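An Iterator object in Python represents a stream of data. It can be passed to next() again and again to return the next item, until no data is left and StopIteration is raised. You can think of this stream as an ordered sequence whose length cannot be known in advance: the only way to get the next item is to call next() and compute it on demand. An Iterator is therefore lazy; it computes a value only when the next item is requested.
An Iterator can even represent an infinite stream of data, such as all the natural numbers, which a list could never store.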
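As a rough illustration of an infinite, lazily evaluated stream, the standard library's itertools.count yields the natural numbers one at a time, and itertools.islice takes a finite prefix without ever building the whole sequence. The variable names below are arbitrary.

```python
from itertools import count, islice

naturals = count(0)  # 0, 1, 2, ... never raises StopIteration

# Only the five requested values are ever computed.
first_five = list(islice(naturals, 5))

# The stream remembers its position, so the next value continues from there.
next_value = next(naturals)

print(first_five)  # [0, 1, 2, 3, 4]
print(next_value)  # 5
```

Calling list(naturals) directly would never finish, which is exactly why the stream has to be consumed on demand.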
Under the hood, Python's for loop works by calling next() repeatedly. For example:
```python
for x in [1, 2, 3, 4, 5]:
    pass

# ...is equivalent to:

# First obtain an Iterator object:
it = iter([1, 2, 3, 4, 5])
# Loop:
while True:
    try:
        # Get the next value:
        x = next(it)
    except StopIteration:
        # Leave the loop once StopIteration is raised
        break
```
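Since the for loop relies only on iter() and next(), any class that implements `__iter__()` and `__next__()` can be used in one. The class below is a hypothetical example written for illustration; `Countdown` is not part of any library.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        if self.current <= 0:
            # Signals the for loop that there is nothing left.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


result = [n for n in Countdown(3)]
print(result)  # [3, 2, 1]
```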
Zipping with iterators
```python
# Zipping lists
>>> a = [1, 2, 3]
>>> b = ['a', 'b', 'c']
>>> z = zip(a, b)
>>> z
<zip object at 0x102229b48>
# In Python 2, z is printed directly as a list: [(1, 'a'), (2, 'b'), (3, 'c')]
# In Python 3 it is displayed as shown above.
# In Python 3, zip() returns a lazy iterator; pass it to list() to see all the results.
>>> list(z)
[(1, 'a'), (2, 'b'), (3, 'c')]
>>> isinstance(z, Iterable)
True
```
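The reverse operation, splitting a list of pairs back into separate sequences, can be done with the same zip() function by unpacking the pairs with `*`. A minimal sketch with arbitrary variable names:

```python
pairs = [(1, 'a'), (2, 'b'), (3, 'c')]

# zip(*pairs) is the same as zip((1, 'a'), (2, 'b'), (3, 'c')),
# so it regroups the first items together and the second items together.
numbers, letters = zip(*pairs)

print(numbers)  # (1, 2, 3)
print(letters)  # ('a', 'b', 'c')
```

Note that the results are tuples, not lists.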
- Grouping adjacent list elements:
```python
>>> a = list(range(1, 7))
>>> a
[1, 2, 3, 4, 5, 6]
>>> zip(*([iter(a)] * 2))
<zip object at 0x102229dc8>
>>> z = zip(*([iter(a)] * 2))
>>> list(z)
[(1, 2), (3, 4), (5, 6)]
# This gives a very handy grouping function
>>> group_adjacent = lambda a, k: zip(*([iter(a)] * k))
>>> z = list(range(9))
>>> gz = group_adjacent(z, 3)
>>> list(gz)
[(0, 1, 2), (3, 4, 5), (6, 7, 8)]
>>> gz = group_adjacent(z, 1)
>>> list(gz)
[(0,), (1,), (2,), (3,), (4,), (5,), (6,), (7,), (8,)]
```
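The trick works because `[iter(a)] * k` repeats the same iterator object k times, so zip() pulls k consecutive items for each tuple. One thing to watch for is that zip() stops at the shortest input, so leftover elements are silently dropped when the length is not a multiple of k. A possible variant, sketched here with the hypothetical name `group_adjacent_padded`, uses itertools.zip_longest to pad the last group instead.

```python
from itertools import zip_longest


def group_adjacent_padded(items, k, fillvalue=None):
    """Group items into k-tuples, padding the last one with fillvalue."""
    return zip_longest(*([iter(items)] * k), fillvalue=fillvalue)


data = list(range(7))

truncated = list(zip(*([iter(data)] * 3)))     # the trailing 6 is dropped
padded = list(group_adjacent_padded(data, 3))  # the trailing 6 is kept

print(truncated)  # [(0, 1, 2), (3, 4, 5)]
print(padded)     # [(0, 1, 2), (3, 4, 5), (6, None, None)]
```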
- Sliding window
```python
>>> a = list(range(1, 7))
>>> a
[1, 2, 3, 4, 5, 6]
>>> n_grams = lambda a, n: zip(*([iter(a[i:]) for i in range(n)]))
>>> list(n_grams(a, 3))
[(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6)]
# This is very similar to the sliding window used in NLP
```
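The lambda above slices `a`, so it needs a sequence and makes n partial copies of it. For inputs that are only iterable (a generator or a file, say), one alternative is a generator that keeps the current window in a collections.deque. This is a sketch rather than the original author's method, and the name `sliding_window` is chosen for the example.

```python
from collections import deque
from itertools import islice


def sliding_window(iterable, n):
    """Yield successive overlapping n-tuples from any iterable."""
    it = iter(iterable)
    # Fill the first window; maxlen makes the deque drop the oldest item
    # automatically whenever a new one is appended.
    window = deque(islice(it, n), maxlen=n)
    if len(window) == n:
        yield tuple(window)
    for item in it:
        window.append(item)
        yield tuple(window)


windows = list(sliding_window(range(1, 7), 3))
from_generator = list(sliding_window((c for c in 'abcd'), 2))

print(windows)         # [(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6)]
print(from_generator)  # [('a', 'b'), ('b', 'c'), ('c', 'd')]
```

If the input has fewer than n items, nothing is yielded.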
- Flattening lists
```python
# Method 1
>>> a = [[1, 2], [3, 4], [5, 6]]
>>> import itertools
>>> list(itertools.chain.from_iterable(a))
[1, 2, 3, 4, 5, 6]
# The itertools module consists entirely of functions for working with
# iteration. They return iterators rather than lists, so values are computed
# only when the result is actually iterated over, for example by a for loop.

# Method 2
>>> sum(a, [])
[1, 2, 3, 4, 5, 6]

# Method 3
>>> [x for l in a for x in l]
[1, 2, 3, 4, 5, 6]

>>> a = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
>>> [x for l1 in a for l2 in l1 for x in l2]
[1, 2, 3, 4, 5, 6, 7, 8]

>>> a = [1, 2, [3, 4], [[5, 6], [7, 8]]]
>>> flatten = lambda x: [y for l in x for y in flatten(l)] if type(x) is list else [x]
>>> flatten(a)
[1, 2, 3, 4, 5, 6, 7, 8]
```
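A lazy counterpart of the recursive `flatten` lambda can be written as a generator with `yield from`, which produces elements one at a time instead of building intermediate lists. The name `flatten_iter` is made up for this sketch, and like the lambda it treats only list objects as nested containers.

```python
def flatten_iter(nested):
    """Lazily yield the leaves of arbitrarily nested lists."""
    for item in nested:
        if isinstance(item, list):
            # Delegate to the recursive call and re-yield everything it yields.
            yield from flatten_iter(item)
        else:
            yield item


a = [1, 2, [3, 4], [[5, 6], [7, 8]]]
flat = list(flatten_iter(a))

print(flat)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

As a side note on method 2, `sum(a, [])` creates a new list for every addition, so its cost grows quadratically with the number of sublists; it is best kept for small inputs.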
Comprehensions
- List comprehensions
```python
>>> list(range(1, 11))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# But what if we want [1x1, 2x2, 3x3, ..., 10x10]?
>>> L = []
>>> for x in range(1, 11):
...     L.append(x * x)
...
>>> L
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

# A single line can replace the loop above
>>> [x * x for x in range(1, 11)]
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

# A condition can be added
>>> L1 = ['Hello', 'World', 18, 'Apple', None]
>>> L2 = [s.lower() for s in L1 if isinstance(s, str)]
>>> L2
['hello', 'world', 'apple']

# The flatten() function from earlier
>>> flatten = lambda x: [y for l in x for y in flatten(l)] if type(x) is list else [x]
```
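It may help to distinguish the two places an `if` can appear in a comprehension. A trailing `if` filters elements out, while an `if ... else` before the `for` is an ordinary conditional expression that transforms every element and must have an `else` branch. A short sketch with arbitrary variable names:

```python
# Trailing if: a filter, so odd numbers are dropped.
evens = [x for x in range(1, 11) if x % 2 == 0]

# Leading if/else: a conditional expression, so every element is kept
# and odd numbers are negated.
signed = [x if x % 2 == 0 else -x for x in range(1, 11)]

print(evens)   # [2, 4, 6, 8, 10]
print(signed)  # [-1, 2, -3, 4, -5, 6, -7, 8, -9, 10]
```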
- Others
Comprehensions can also build dictionaries; these are known as dict comprehensions:
```python
>>> n = {x: x ** 2 for x in range(5)}
>>> n
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
>>> m = {x: 'A' + str(x) for x in range(5)}
>>> m
{0: 'A0', 1: 'A1', 2: 'A2', 3: 'A3', 4: 'A4'}
```
Inverting a dictionary with a dict comprehension:
```python
>>> m = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> m  # output from Python < 3.7; newer versions show the keys in insertion order
{'a': 1, 'd': 4, 'c': 3, 'b': 2}
>>> {v: k for k, v in m.items()}
{1: 'a', 2: 'b', 3: 'c', 4: 'd'}
```
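Inverting a dict this way only round-trips cleanly when the values are unique and hashable. If two keys share a value, the one iterated last overwrites the other. The sketch below, with made-up data, shows the collision and one way to keep every key by grouping them into lists.

```python
m = {'a': 1, 'b': 2, 'c': 1}  # 'a' and 'c' share the value 1

# Plain inversion: the key seen last for each value wins.
inverted = {v: k for k, v in m.items()}

# Grouping keeps every key that maps to a given value.
grouped = {}
for k, v in m.items():
    grouped.setdefault(v, []).append(k)

print(inverted)  # {1: 'c', 2: 'b'} on Python 3.7+
print(grouped)   # {1: ['a', 'c'], 2: ['b']} on Python 3.7+
```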
Generators
Replacing the [] of a list comprehension with () gives a generator:
```python
>>> L = [x * x for x in range(10)]
>>> L
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> g = (x * x for x in range(10))
>>> g
<generator object <genexpr> at 0x102a0dba0>

# The values of g can be retrieved one at a time with next()
>>> next(g)
0
>>> next(g)
1
>>> next(g)
4
...
>>> next(g)
81
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
# Like the Iterator objects discussed earlier, a generator can also be iterated with a for loop
```
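In practice next() is rarely called by hand. A generator expression is usually handed straight to a function that consumes an iterable, such as sum(), and when it is the only argument the extra parentheses can be omitted. The sketch below also shows that a generator can be consumed only once; the variable names are arbitrary.

```python
squares = (x * x for x in range(10))

total = sum(squares)      # consumes the generator
leftover = list(squares)  # nothing is left on a second pass

# As the sole argument, the generator expression needs no extra parentheses.
total_inline = sum(x * x for x in range(10))

print(total)         # 285
print(leftover)      # []
print(total_inline)  # 285
```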
With a generator it is easy to produce the first n Fibonacci numbers:
```python
# As a plain function
>>> def fib(max):
...     n, a, b = 0, 0, 1
...     while n < max:
...         print(b)
...         a, b = b, a + b
...         n = n + 1
...     return 'done'
...
>>> fib(5)
1
1
2
3
5
'done'

# As a generator
>>> def fib(max):
...     n, a, b = 0, 0, 1
...     while n < max:
...         yield b
...         a, b = b, a + b
...         n = n + 1
...     return 'done'
...
>>> f = fib(5)
>>> f
<generator object fib at 0x102a0db48>
>>> for n in f:
...     print(n)
...
1
1
2
3
5
```
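However, when a generator is consumed by a for loop, the value of its return statement is not available. To get the return value you have to catch StopIteration; the return value is stored in the exception's value attribute: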
```python
>>> g = fib(6)
>>> while True:
...     try:
...         x = next(g)
...         print('g:', x)
...     except StopIteration as e:
...         print('Generator return value:', e.value)
...         break
...
g: 1
g: 1
g: 2
g: 3
g: 5
g: 8
Generator return value: done
```
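Another way to reach the return value, available since Python 3.3, is `yield from`: the expression re-yields everything the inner generator yields and then evaluates to that generator's return value. The sketch below repeats the `fib` generator so that it is self-contained; the wrapper `fib_with_report` and the `report` list are hypothetical names introduced for the example.

```python
def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1
    return 'done'


def fib_with_report(max, report):
    """Re-yield fib(max) and record its return value in report."""
    # yield from evaluates to the return value of the inner generator.
    result = yield from fib(max)
    report.append(result)


report = []
values = list(fib_with_report(6, report))

print(values)  # [1, 1, 2, 3, 5, 8]
print(report)  # ['done']
```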
References:
The official documentation, Stack Overflow, Baidu, Google, and others.