Question
I couldn't figure out how to perform line.startswith("substring") for a set of substrings, so I tried a few variations on the code at the bottom. I have the luxury of known 4-character beginning substrings, so I compare a slice of the line against them, but I'm pretty sure I've got the syntax wrong, since this doesn't reject any lines.
(Context: my aim is to throw out header lines when reading in a file. Header lines start with a limited set of strings, but I can't just check for the substring anywhere, because a valid content line may include a keyword later in the string.)
cleanLines = []
line = "sample input here"
if not line[0:3] in ["node", "path", "Path"]: #skip standard headers
    cleanLines.append(line)
Answer 1:
Your problem stems from the fact that string slicing is exclusive of the stop index:
In [7]: line = '0123456789'
In [8]: line[0:3]
Out[8]: '012'
In [9]: line[0:4]
Out[9]: '0123'
In [10]: line[:3]
Out[10]: '012'
In [11]: line[:4]
Out[11]: '0123'
Slicing a string between i and j returns the substring starting at i and ending at (but not including) j.
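To tie this back to the code in the question, here is a small check showing why a three-character slice can never match a four-character header. The sample header line is made up for illustration:

```python
# Hypothetical header line, used only for illustration.
line = "node 1 2 3"
headers = ["node", "path", "Path"]

three = line[0:3]  # stop index 3 is excluded, so this is only 3 characters
four = line[0:4]   # the first 4 characters

print(three, three in headers)  # nod False
print(four, four in headers)    # node True
```

Because every entry in the list is four characters long, the comparison against line[0:3] is always false, so no line is ever rejected.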
Just to make your code run faster, you might want to test membership in sets, instead of in lists:
cleanLines = []
line = "sample input here"
blacklist = set(["node", "path", "Path"])
if line[:4] not in blacklist:  # skip standard headers
    cleanLines.append(line)
Now, what you're actually doing with that code is a startswith check, which is not tied to any fixed prefix length:
In [12]: line = '0123456789'
In [13]: line.startswith('0')
Out[13]: True
In [14]: line.startswith('0123')
Out[14]: True
In [15]: line.startswith('03')
Out[15]: False
So you could do this to exclude headers:
cleanLines = []
line = "sample input here"
headers = ["node", "path", "Path"]
if not any(line.startswith(header) for header in headers):  # skip standard headers
    cleanLines.append(line)
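As a further option, str.startswith also accepts a tuple of prefixes, so the any(...) loop can be collapsed into a single call. The sketch below assumes the lines have already been read into a list; the function name strip_headers and the sample lines are made up for illustration:

```python
# The prefixes must be in a tuple; passing a list raises TypeError.
HEADERS = ("node", "path", "Path")

def strip_headers(lines, headers=HEADERS):
    """Return the lines that do not begin with any of the header prefixes."""
    return [line for line in lines if not line.startswith(headers)]

# Hypothetical sample input.
sample = [
    "node id x y",
    "Path length",
    "12 0.5 the path is clear",  # keyword appears later, so the line is kept
]
clean_lines = strip_headers(sample)
print(clean_lines)  # ['12 0.5 the path is clear']
```

This form also works when the prefixes have different lengths, which the fixed-width slice comparison does not handle.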
Source: https://stackoverflow.com/questions/33573706/check-if-string-begins-with-one-of-several-substrings-in-python