From the python.org tutorial
Slice indices have useful defaults; an omitted first index defaults to zero, an omitted second index defaults to the size
The docs simply aren't correct about the default values as you've pointed out. However, they're consistent other than that minor error. You can view the docs I am referring to here: https://docs.python.org/3/library/stdtypes.html#common-sequence-operations
The slice of s from i to j with step k is defined as the sequence of items with index x = i + n*k such that 0 <= n < (j-i)/k. In other words, the indices are i, i+k, i+2*k, i+3*k and so on, stopping when j is reached (but never including j).
When you do:
>>> a = "hello"
>>> y = a[0:5:-1]
we have that i == 0
, j == 5
, and k == -1
. So we are grabbing items at index x = i + n*k
for n
starting at 0
and going up to (j-i)/k
. However, observe that (j-i)/k == (5-0)/-1 == -5
. There are no n
such that 0 <= n < -5
, so you get the empty string:
>>> y
''
a[start:stop][::step]
when in doubt (it's almost always what we want)It's almost always the case that when you pass a negative step to something like x[start:stop:step]
, what you want to happen is for the sub selection to happen first, and then just go backwards by step
(i.e. we usually want x[start:stop][::step]
.
Futhermore, to add to the confusion, it happens to be the case that
x[start:stop:step] == x[start:stop][::step]
if step > 0
. For example:
>>> x = list(range(10))
>>> x
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> x[2:6:2]
[2, 4]
>>> x[2:6][::2]
[2, 4]
>>> x[1:10][::3]
[1, 4, 7]
>>> x[1:10:3]
[1, 4, 7]
Unfortunately, this doesn't hold when step < 0
, even though it's tempting to think that it should.
After being burned by this a couple times, I realized it's just safer to always do the step clause after you perform the start:stop
slice. So I almost always start with y = x[start:stop][::step]
, at least when prototyping or creating a new module where correctness/readability is the primiary concern. This is less performant than doing a single slice, but if performance is an issue, then you can do the less readable:
y = x[start:stop:step] if step > 0 else x[stop:start:step]
HTH.
For Python slicing for a sequence[start:stop:step], have derived these rules:
# start:stop: +step Rules
# start:stop:-step Rules
a[0:5:-1]
does not make much sense, since when you use this notation the indices mean: a[start:end:step]
. When you use a negative step your end value needs to be at an "earlier" position than your start value.
All it does is slice. You pick. start stop and step so basically you're saying it should start at the beginning until the beginning but going backwards (-1).
If you do it with -2 it will skip letters:
>>> a[::-2]
'olh'
When doing [0:5:-1]
your'e starting at the first letter and going back directly to 5 and thus it will stop. only if you try [-1::-1]
will it correctly be able to go to the beginning by doing steps of negative 1.
Edit to answer comments
As pointed out the documentation says
an omitted second index defaults to the size of the string being sliced.
Lets assume we have str
with len(str) = 5
. When you slice the string and omit, leave out, the second number it defaults to the length of the string being sliced, in this case - 5.
i.e str[1:] == str[1:5]
, str[2:] == str[2:5]
. The sentence refers to the length of the original object and not the newly sliced object.
Also, this answer is great
You'll notice that the third slice argument, the step
, is not presented in the part of the tutorial you quoted. That particular snippet assumes a positive step.
When you add in the possibility of a negative step, the behavior is actually pretty intuitive. An empty start
parameter refers to whichever end of the sequence one would start at to step through the whole sequence in the direction indicated by the step
value. In other words it refers to the lowest index (to count up) if you have a positive step, and the highest index (to count down) if you have a negative step. Likewise, an empty end
parameter refers to whichever end of the sequence one would end up at after stepping through in the appropriate direction.
I think the docs are perhaps a little misleading on this, but the optional arguments of slicing if omitted are the same as using None
:
>>> a = "hello"
>>> a[::-1]
'olleh'
>>> a[None:None:-1]
'olleh'
You can see that these 2 above slices are identical from the CPython bytecode:
>>> import dis
>>> dis.dis('a[::-1]') # or dis.dis('a[None:None:-1]')
1 0 LOAD_NAME 0 (a)
3 LOAD_CONST 0 (None)
6 LOAD_CONST 0 (None)
9 LOAD_CONST 2 (-1)
12 BUILD_SLICE 3
15 BINARY_SUBSCR
16 RETURN_VALUE
For a negative step
, the substituted values for None
are len(a) - 1
for the start
and -len(a) - 1
for the end
:
>>> a[len(a)-1:-len(a)-1:-1]
'olleh'
>>> a[4:-6:-1]
'olleh'
>>> a[-1:-6:-1]
'olleh'
This may help you visualize it:
h e l l o
0 1 2 3 4 5
-6 -5 -4 -3 -2 -1