Question
How large can the input I supply to the input() function be?
Unfortunately, I found no easy way to test it. After a lot of copy-pasting I couldn't get input() to fail on any input I supplied, and I eventually gave up.
The documentation for the input function doesn't mention anything regarding this:
If the prompt argument is present, it is written to standard output without a trailing newline. The function then reads a line from input, converts it to a string (stripping a trailing newline), and returns that. When EOF is read, EOFError is raised.
So, I'm guessing there is no limit? Does anyone know whether there is one and, if so, what it is?
Answer 1:
Of course there is, it can't be limitless*. The key sentence from the documentation that I believe needs highlighting is:
[...] The function then reads a line from input, converts it to a string (stripping a trailing newline) [...]
(emphasis mine)
Since it converts the input you supply into a Python str object, it essentially translates to: "Its size has to be less than or equal to the largest string Python can create".
No explicit size is given probably because this is an implementation detail. Enforcing one maximum size on all other implementations of Python wouldn't make much sense.
*In CPython, at least, the largest size of a string is bounded by how big its index is allowed to be (see PEP 353). That is, how big the number in the brackets [] is allowed to be when you try and index it:
>>> s = ''
>>> s[2 ** 63]
IndexError                                Traceback (most recent call last)
<ipython-input-10-75e9ac36da20> in <module>()
----> 1 s[2 ** 63]
IndexError: cannot fit 'int' into an index-sized integer
(Try the previous with 2 ** 63 - 1, that's the positive acceptable limit; -2 ** 63 is the negative limit.)
For indices, Python's own arbitrary-precision int is not what is used internally; instead, the index is a Py_ssize_t, which is a signed 32/64-bit integer on 32/64-bit machines respectively. So that appears to be the hard limit.
(As the error message states, int and index-sized integer are two different things.)
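The same boundary can be probed without hard-coding 2 ** 63. The following is a minimal sketch, assuming CPython, where sys.maxsize reports the largest value a Py_ssize_t can hold; the helper name index_error_message is made up for illustration.

```python
import sys

# On a 64-bit CPython build this prints True; on a 32-bit build the
# limit is 2 ** 31 - 1 instead.
print(sys.maxsize == 2 ** 63 - 1)


def index_error_message(index):
    """Index an empty string and return the text of the resulting IndexError."""
    try:
        ''[index]
    except IndexError as exc:
        return str(exc)


# Fits in a Py_ssize_t, so the index is accepted and merely out of range.
print(index_error_message(sys.maxsize))
# One past the limit: the conversion to an index-sized integer fails.
print(index_error_message(sys.maxsize + 1))
```

Both cases raise IndexError; only the message differs, which is what separates "index too large for this string" from "index too large to represent at all".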
It also seems like input() explicitly checks if the input supplied is larger than PY_SSIZE_T_MAX (the maximum size of Py_ssize_t) before converting:
if (len > PY_SSIZE_T_MAX) {
    PyErr_SetString(PyExc_OverflowError,
                    "input: input too long");
    result = NULL;
}
Then it converts the input to a Python str with PyUnicode_Decode.
To put that in perspective: if the average book is 500,000 characters long and the estimate for the total number of books is around 130 million, you could theoretically input around:
>>> ((2 ** 63) - 1) // (500000 * 130000000)
141898
times those characters; it would probably take you some time, though :-) (and you'd be limited by available memory first!)
Answer 2:
We can find the answer experimentally quite easily. Make two files:
make_lines.py:
num_lines = 34
if __name__ == '__main__':
    for i in range(num_lines):
        print('a' * (2 ** i))
read_input.py:
from make_lines import num_lines
for i in range(num_lines):
    print(len(input()))
Then run this command in Linux or OSX (I don't know the Windows equivalent):
python3 make_lines.py | python3 read_input.py
On my computer it manages to finish but struggles by the end, slowing down other processes significantly. The last thing it prints is 8589934592, i.e. 8 GiB. You can find out the value for yourself according to your definition of what's acceptable in terms of time and memory limits.
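Where a shell pipe isn't convenient (on Windows, say), a similar experiment can be run inside a single process by temporarily swapping sys.stdin for an in-memory stream. This is only a sketch under that assumption: the helper read_line_of_length is hypothetical, and because the stream is not a real terminal, input() reads it through the stream's readline() rather than the console path, so it measures memory limits rather than terminal behaviour.

```python
import io
import sys


def read_line_of_length(n):
    """Feed a line of n characters to input() via a fake stdin; return the length read."""
    original_stdin = sys.stdin
    sys.stdin = io.StringIO('a' * n + '\n')
    try:
        return len(input())
    finally:
        # Always restore the real stdin, even if input() raises.
        sys.stdin = original_stdin


if __name__ == '__main__':
    # Small exponents keep the demo quick; raise them to push memory harder.
    for exponent in (0, 10, 20):
        print(exponent, read_line_of_length(2 ** exponent))
```

Increasing the exponents step by step gives the same kind of answer as the piped version: the practical ceiling is whatever your machine's memory and your patience allow.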
Source: https://stackoverflow.com/questions/40598483/how-big-can-the-input-to-the-input-function-be