Question
Hi, I'm pretty new to programming and Python, and this is my first post, so I apologize for any poor form.
I am scraping a website's download counts and get the following error when I try to convert the list of numeric strings to integers in order to sum them:
ValueError: invalid literal for int() with base 10: '1,015'
I have tried .replace(), but it does not seem to do anything.
I also tried to build an if statement that takes the commas out of any string containing them (based on "Does Python have a string contains substring method?").
Here's my code:
downloadCount = pageHTML.xpath('//li[@class="download"]/text()')
downloadCount_clean = []
for download in downloadCount:
    downloadCount_clean.append(str.strip(download))
for item in downloadCount_clean:
    if "," in item:
        item.replace(",", "")
print(downloadCount_clean)
downloadCount_clean = map(int, downloadCount_clean)
total = sum(downloadCount_clean)
Answer 1:
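Strings are immutable in Python. So when you call item.replace(",", ""), the method returns the string you want, but the result is not stored anywhere (and so not in item). Even assigning it back to item would only rebind the loop variable; to change the list, the element itself has to be reassigned by index.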
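To make the point concrete, here is a minimal sketch using the value from the error message. The variable names unchanged, cleaned and number are only illustrative and do not appear in the question.

```python
item = "1,015"

# str.replace returns a NEW string; the original is left untouched.
item.replace(",", "")        # the result is thrown away here
unchanged = item             # still "1,015"

# Keep the returned string by assigning it to a name.
cleaned = item.replace(",", "")
number = int(cleaned)        # int() now accepts the string

print(unchanged, cleaned, number)
```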
EDIT:
I suggest this:
for i in range(len(downloadCount_clean)):
    if "," in downloadCount_clean[i]:
        downloadCount_clean[i] = downloadCount_clean[i].replace(",", "")
SECOND EDIT:
For a bit more simplicity and/or elegance:
for index, value in enumerate(downloadCount_clean):
    downloadCount_clean[index] = int(value.replace(",", ""))
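The stripping, comma removal and conversion could also be folded into a single pass. The following is only a sketch: the sample list stands in for whatever pageHTML.xpath(...) actually returns, and parse_count is a hypothetical helper name, not something from the question.

```python
# Hypothetical sample data standing in for the scraped text nodes.
downloadCount = [" 1,015 ", "42\n", "  7", "12,345,678"]


def parse_count(text):
    """Strip surrounding whitespace and thousands separators, then convert to int."""
    return int(text.strip().replace(",", ""))


# Build a new list of ints instead of modifying the strings in place.
downloadCount_clean = [parse_count(download) for download in downloadCount]
total = sum(downloadCount_clean)

print(downloadCount_clean)
print(total)
```

Building a new list this way sidesteps the in-place assignment entirely, so there is no loop variable to forget to store.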
Answer 2:
For simplicity's sake:
>>> aList = ["abc", "42", "1,423", "def"]
>>> bList = []
>>> for i in aList:
...     bList.append(i.replace(',', ''))
...
>>> bList
['abc', '42', '1423', 'def']
Or, working with just a single list:
>>> aList = ["abc", "42", "1,423", "def"]
>>> for i, x in enumerate(aList):
...     aList[i] = x.replace(',', '')
...
>>> aList
['abc', '42', '1423', 'def']
Not sure whether this one breaks any Python rules or not :)
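Note that a list like the sample one still contains entries such as "abc" that int() would reject even after the commas are gone. If the scraped data might include such entries, one possible approach is to skip them, as in this sketch. It assumes non-negative whole numbers only (a leading minus sign would be skipped too); numbers and total are illustrative names.

```python
aList = ["abc", "42", "1,423", "def"]

numbers = []
for x in aList:
    cleaned = x.replace(",", "")
    # isdecimal() is True only for non-empty strings made up of decimal digits.
    if cleaned.isdecimal():
        numbers.append(int(cleaned))

total = sum(numbers)
print(numbers, total)
```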
Source: https://stackoverflow.com/questions/39644486/how-can-i-remove-all-extra-characters-from-list-of-strings-to-convert-to-ints