Slicing is one of Python's most distinctive features. Before we begin, let's review what we know about it.
Slicing is mainly used on sequence objects to extract the elements that fall within an index interval.
A slice is written as [i : i+n : m], where i is the start index of the slice and can be omitted if the slice starts at the beginning of the list; i+n is the end position (exclusive) and can be omitted if the slice runs to the end of the list; and m is the step, which is optional, defaults to 1, and cannot be 0. When m is negative, the list is traversed in reverse.
The basic meaning of a slice is: starting from index i of the sequence, take n elements to the right, keeping every m-th one.
Here are some typical examples of slicing syntax:
# @ Python cat
li = [1, 4, 5, 6, 7, 9, 11, 14, 16]
# For any X >= len(li), the following are all equivalent
li[0:X] == li[0:] == li[:X] == li[:] == li[::] == li[-X:X] == li[-X:]
li[1:5] == [4, 5, 6, 7]  # start at index 1, take 5-1=4 elements
li[1:5:2] == [4, 6]  # start at index 1, take 5-1=4 elements, keep every 2nd
li[-1:] == [16]  # take the last element
li[-4:-2] == [9, 11]  # start at the 4th from the end, take -2-(-4)=2 elements
li[:-2] == li[-len(li):-2] == [1, 4, 5, 6, 7, 9, 11]  # start at the beginning, take -2-(-len(li))=7 elements
# When the step is negative, the list is reversed first and then sliced
li[::-1] == [16, 14, 11, 9, 7, 6, 5, 4, 1]  # reverse the entire list
li[::-2] == [16, 11, 7, 5, 1]  # reverse the entire list, keep every 2nd
li[:-5:-1] == [16, 14, 11, 9]  # reverse the entire list, take -5-(-len(li))=4 elements
li[:-5:-3] == [16, 9]  # reverse the entire list, take -5-(-len(li))=4 elements, keep every 3rd
# The step of a slice cannot be 0
li[::0]  # ValueError: slice step cannot be zero
Languages such as C/C++, Java, and JavaScript do offer some slicing-like functions, for example for extracting fragments of arrays or strings, but they have no general-purpose syntax for it.
According to Wikipedia, Fortran was the first language to support slicing syntax (1966), and Python is one of the most representative languages that have it.
Languages like Perl, Ruby, Go, and Rust also have slicing, but it is not as flexible or free as Python's, which supports steps, negative indexes, and default indexes.
The basic use of slicing covers most needs, but Python slicing also has more advanced uses, such as slices as placeholders (to implement list assignment, deletion, and concatenation), slicing of custom objects, slicing of iterators (itertools.islice()), slicing of file objects, and more. Related reading: Advanced Python: A comprehensive slice of advanced features!
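As a quick illustration of a few of the advanced uses just mentioned, here is a minimal sketch. The variable names are made up for the example, and it only touches slice assignment, slice deletion, and itertools.islice().

```python
from itertools import islice

nums = [1, 2, 3, 4, 5]

# Slice as a placeholder: replace a sub-range in place (lengths may differ).
nums[1:3] = [20, 30, 40]
print(nums)  # [1, 20, 30, 40, 4, 5]

# Assigning to an empty slice inserts without replacing anything.
nums[:0] = [0]
print(nums)  # [0, 1, 20, 30, 40, 4, 5]

# Deleting through a slice removes the selected range.
del nums[1:3]
print(nums)  # [0, 30, 40, 4, 5]

# Generators do not support [] slicing; itertools.islice() fills that gap.
evens = (n for n in range(100) if n % 2 == 0)
first_three = list(islice(evens, 3))
print(first_three)  # [0, 2, 4]
```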
So much for the introduction and review of slicing.
Now to the question in the article's title: why doesn't Python's slicing syntax raise index-out-of-range errors?
When a value is fetched with a single index, an "IndexError: list index out of range" error is raised if the index is out of range.
>>> li = [1, 2]
>>> li[5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
For a non-empty sequence object of length length, the valid indexes are 0 to (length-1). If negative indexes are taken into account, the valid range for a single index is the closed interval [-length, length-1].
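The closed interval can be checked directly. The sketch below probes every index around the boundaries of a small list; the helper name is_valid_index is made up for this illustration.

```python
def is_valid_index(seq, i):
    """Return True if seq[i] succeeds, False if it raises IndexError."""
    try:
        seq[i]
        return True
    except IndexError:
        return False


li = [1, 2]
length = len(li)

# Probe one step beyond each end of the claimed valid range.
valid = [i for i in range(-length - 1, length + 1) if is_valid_index(li, i)]
print(valid)  # [-2, -1, 0, 1]
```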
However, when an index in a Python slice falls outside this range, the program does not report an error.
>>> li = [1, 2]
>>> li[1:5]  # the right index is out of range
[2]
>>> li[5:6]  # both the left and right indexes are out of range
[]
In fact, this phenomenon is described in the official documentation:
The slice of s from i to j is defined as the sequence of items with index k such that i <= k < j. If i or j is greater than len(s), use len(s). If i is omitted or None, use 0. If j is omitted or None, use len(s). If i is greater than or equal to j, the slice is empty.
In other words:
- When the left or right index value is greater than the length value of the sequence, the length value is used as the index value.
- When the left index is omitted or is None, 0 is used as the left index.
- When the right index is omitted or is None, the sequence length is used as the right index.
- When the left index value is greater than or equal to the right index value, the result of slicing is empty.
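The four rules above can be written out as a small function. This is only a sketch for the simplest case (non-negative indexes, default step of 1, result returned as a list); clamp_slice is a hypothetical helper for illustration, not how CPython implements slicing.

```python
def clamp_slice(s, i=None, j=None):
    """Mimic s[i:j] for non-negative i and j with the default step of 1."""
    n = len(s)
    if i is None:
        i = 0          # omitted left index -> 0
    if j is None:
        j = n          # omitted right index -> len(s)
    i = min(i, n)      # an index greater than len(s) is replaced by len(s)
    j = min(j, n)
    if i >= j:
        return []      # left >= right -> empty result
    return [s[k] for k in range(i, j)]


li = [1, 2]
print(clamp_slice(li, 1, 5))  # [2]
print(clamp_slice(li, 5, 6))  # []
print(clamp_slice(li))        # [1, 2]
```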
Applying these rules to the earlier example, we get:
>>> li = [1, 2]
>>> li[1:5] # equivalent to li[1:2]
[2]
>>> li[5:6] # equivalent to li[2:2]
[]
It all comes down to one thing: The Python interpreter shields you from any actions that might cause an index to be out of bounds. You can write freely, but the end result will be strictly within the legal index range.
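One way to see the bounds that end up being used is the built-in slice.indices() method, which takes a sequence length and returns the (start, stop, step) triple that the slice describes for a sequence of that length. The variable names below are only for the example.

```python
li = [1, 2]

# slice.indices(length) normalizes a slice against a given length.
right_over = slice(1, 5).indices(len(li))
both_over = slice(5, 6).indices(len(li))
defaults = slice(None, None).indices(len(li))
print(right_over)  # (1, 2, 1)
print(both_over)   # (2, 2, 1)
print(defaults)    # (0, 2, 1)

# Negative indexes are resolved relative to the end, then clamped as well.
negative_over = slice(-10, 10).indices(len(li))
print(negative_over)  # (0, 2, 1)
```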
I'm actually a little puzzled by this behavior. Why doesn't Python simply report that the index is out of bounds? Why does it adjust the boundary values of the slice, and why must it return a value, even if that value is an empty sequence?
When we write "li[5:6]" we mean, at least literally, "fetch the values at indexes 5 up to 6", which is like saying "fetch the sixth and seventh books from the left on the shelf".
If the program had followed our instructions faithfully, it would have reported an error: sorry, there aren't enough books on the shelf.
To be honest, I have not found an explanation for this, and this article does not claim to offer any insight into Python's design. On the contrary, one of its main goals is to ask for answers.
In Go, the same scenario produces the error "runtime error: slice bounds out of range".
In Rust, the same scenario produces the error "byte index 5 is out of bounds of…".
In other languages that support slicing syntax, there may be a similar design to Python. But I don’t know if…
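To make the contrast concrete, here is what a strict variant could look like in Python. strict_slice is a hypothetical helper written for illustration, not part of Python or any library; its bounds check is loosely modeled on the Go behavior described above, and it handles only non-negative bounds with the default step.

```python
def strict_slice(seq, start, stop):
    """Like seq[start:stop], but raise IndexError instead of clamping.

    Hypothetical helper: only non-negative bounds and step 1 are handled.
    """
    n = len(seq)
    if not (0 <= start <= stop <= n):
        raise IndexError(
            f"slice bounds out of range [{start}:{stop}] with length {n}"
        )
    return seq[start:stop]


li = [1, 2]
print(strict_slice(li, 1, 2))  # [2]

try:
    strict_slice(li, 1, 5)
except IndexError as exc:
    print(exc)  # slice bounds out of range [1:5] with length 2
```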
Finally, returning to the question in the title, "Why doesn't Python slicing raise index-out-of-range errors?", I really want to ask two questions:
- How is Python able to return a result when an index in the slicing syntax is out of bounds, and how does it compute that result?
- Why does Python's slicing syntax allow indexes to go out of bounds, and why wasn't it designed to raise an index error?
The answer to the first question is clearly written in the official documentation.
For the second question, this article has no answer.
I may find the answer soon, but it may take a long time. Anyway, that’s enough for this article.
If you enjoy studying the nitty-gritty of Python design and are interested in finding answers to the “why” question, follow the “Why Python” series.
Recommended reading from past topics:
(1) Why does Python recommend snake_case naming?
(2) Why does Python use # for comments?
(3) Why did the father of Python dislike lambda anonymous functions?
(4) Why does Python not support switch statements?
(5) Python: which is faster, [] or list()? Why is it faster? How much faster?
(6) Why does Python not support the i++ increment syntax or provide the ++ operator?
This article is part of the "Why Python" series, which focuses on the syntax, design, and development of Python. The series tries to show the charm of Python by asking "why" questions. All posts will be archived on GitHub at github.com/chinesehuaz…