Question
I am using a SAX parser and am trying to pass the 'content' I retrieve to a function in another module.
After handling startElement and endElement, I have the code below:
def characters(self, content):
    text = format.formatter(content)
format.formatter is expected to read the data passed in as 'content', apply processing such as removing junk characters, and return the result. I do that with the string replace function:
remArticles = {' ! ':'', ' $ ':''}
for line in content:
    for i in remArticles:
        line = line.replace(i, remArticles[i])
        #FormattedFileForIndexing.write(line)
        return line
However, the output is not what I expected.
It would be great if someone could help with this.
The source looks something like:
"Oh! That's lots and 1000s of $$$$"
Expected: Oh That's lot of 1000s
Answer 1:
You are iterating over each character, not each line:
def characters(content):
    remArticles = {'!': '', '$': ''}  # remove spaces from " ! "
    for i in remArticles:
        content = content.replace(i, remArticles[i])
    return content
You are also trying to match ! and $ with spaces around them, which, according to your expected output, is incorrect.
In [6]: content = "Oh! That's lots and 1000s of $$$$"
In [7]: characters(content)
Out[7]: "Oh That's lots and 1000s of "
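To tie this back to the SAX setup in the question, here is a minimal sketch of how such a cleanup function might be called from a handler's characters method. The names CleaningHandler, formatter, and REM_ARTICLES are hypothetical, as is the one-element sample document. The sketch assumes Python 3 and that the text chunks are simply collected in a list.

```python
import xml.sax

# Characters to strip; the same mapping as in the answer above.
REM_ARTICLES = {'!': '', '$': ''}


def formatter(content):
    """Remove every key of REM_ARTICLES from the given string."""
    for old, new in REM_ARTICLES.items():
        content = content.replace(old, new)
    return content


class CleaningHandler(xml.sax.ContentHandler):
    """Hypothetical handler that collects cleaned text chunks."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # The parser may deliver one element's text in several calls,
        # so each chunk is cleaned and stored on its own.
        self.chunks.append(formatter(content))


handler = CleaningHandler()
xml.sax.parseString(b"<doc>Oh! That's lots and 1000s of $$$$</doc>", handler)
cleaned = "".join(handler.chunks)
print(repr(cleaned))
```

Because only single characters are removed here, it does not matter how the parser splits the text between calls. A multi-character pattern could straddle two chunks, in which case the text would need to be buffered and cleaned in endElement instead.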
Plain replace is also faster than the join-based format_this from the other answer, by roughly 3x on this input:
In [20]: timeit characters(content)
1000000 loops, best of 3: 746 ns per loop
In [21]: timeit format_this(content)
100000 loops, best of 3: 2.57 µs per loop
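The timings above come from IPython's timeit magic. Outside IPython, a comparable measurement could be sketched with the standard timeit module as below. The repetition count n is an arbitrary choice, and absolute numbers will differ by machine and Python version, so no particular result is claimed here.

```python
import timeit

content = "Oh! That's lots and 1000s of $$$$"


def characters(content):
    # One str.replace call per unwanted character.
    for old, new in {'!': '', '$': ''}.items():
        content = content.replace(old, new)
    return content


def format_this(content):
    # Character-by-character filter, rebuilt with join.
    bad_keys = {'!', '$'}
    return "".join([c for c in content if c not in bad_keys])


n = 10000  # arbitrary number of repetitions for this sketch
t_replace = timeit.timeit(lambda: characters(content), number=n)
t_join = timeit.timeit(lambda: format_this(content), number=n)

print("replace: %.1f ns per call" % (t_replace / n * 1e9))
print("join:    %.1f ns per call" % (t_join / n * 1e9))
```

Both functions return the same string for this input, so the comparison is between two equivalent implementations.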
Answer 2:
How about this:
def format_this(content):
    bad_keys = {'!', '$'}
    return "".join([element for element in content if element not in bad_keys])

if __name__ == '__main__':
    content = "Oh! That's lots and 1000s of $$$$"
    formatted_content = format_this(content)
    print(formatted_content)

>>> Oh That's lots and 1000s of
Answer 3:
Your return line is indented too far (assuming the question shows your actual code), so the function returns after the first replacement. De-indent that return by 4 spaces so that it aligns with the for keyword, not with the body of the for loop.
Added: {' ! ':'', ' $ ':''} matches exclamation marks and dollar signs only if they have spaces before and after them. But the sample input given in the question is "Oh! That's lots and 1000s of $$$$", where those punctuation marks are not surrounded by spaces on both sides, so nothing will be replaced.
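Both problems can be reproduced in a few lines. The sketch below assumes the question's return sat inside the inner loop; the function names buggy and fixed are hypothetical and used only for illustration.

```python
sample = "Oh! That's lots and 1000s of $$$$"


def buggy(content):
    # Mirrors the structure assumed for the question's code.
    rem_articles = {' ! ': '', ' $ ': ''}
    for line in content:  # iterates over characters, not lines
        for key in rem_articles:
            line = line.replace(key, rem_articles[key])
            return line  # exits on the very first pass


def fixed(content):
    # Keys without surrounding spaces, and return placed after the loop.
    rem_articles = {'!': '', '$': ''}
    for key in rem_articles:
        content = content.replace(key, rem_articles[key])
    return content


# Neither space-padded key occurs in the sample input.
print(' ! ' in sample, ' $ ' in sample)  # False False
print(repr(buggy(sample)))               # 'O'
print(repr(fixed(sample)))               # "Oh That's lots and 1000s of "
```

The buggy version returns only the first character of the input, because the return statement runs during the first iteration of both loops.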
Source: https://stackoverflow.com/questions/28091913/pass-content-to-function-of-another-module-in-python