Is str.replace(..).replace(..) ad nauseam a standard idiom in Python?

迷失自我 2020-12-20 11:19

For instance, say I wanted a function to escape a string for use in HTML (as in Django's escape filter):

    def escape(string):
        """
        Retu         


        
9 Answers
  • 2020-12-20 11:38

    Do you have an application that is running too slowly, and did you profile it and find that a line like this snippet is the cause? Bottlenecks occur in unexpected places.

    The current snippet traverses the string five times, doing one thing each time. You are suggesting traversing it once, probably doing five things each time (or at least doing something each time). It isn't clear to me that this will automatically do a better job. The current algorithm is O(n*m) (assuming the string is longer than the strings in the rules), where n is the length of the string and m is the number of substitution rules. You could, I think, reduce the complexity to something like O(n*log(m)), and in the specific case we're in, where the originals are all single characters (which does not hold for chained calls to replace in general), to O(n). But this doesn't matter, since m is 5 while n is unbounded.

    If m is held constant, then the complexity of both solutions really goes to O(n). It is not clear to me that it is worth trying to turn five simple passes into one complex one, whose actual running time I cannot guess at the moment. If something about it made it scale better, I would consider it a much more worthwhile task.
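
    One way to settle this is to measure rather than guess. The sketch below is an illustration, not code from the question: it assumes the five single-character rules of an HTML escape, and the names `escape_chained`, `escape_translate` and `_ESCAPES` are hypothetical. It uses `str.translate` as one possible single-pass variant and `timeit` to compare the two on a sample string.

```python
import timeit

def escape_chained(text):
    # Five passes over the string, one per rule.
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("'", "&#39;")
                .replace('"', "&quot;"))

# Translation table for the single-pass variant. This only works because
# every original is a single character.
_ESCAPES = str.maketrans({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    "'": "&#39;",
    '"': "&quot;",
})

def escape_translate(text):
    # One pass over the string, looking each character up in the table.
    return text.translate(_ESCAPES)

# Assumed sample input; real timings depend on the data and interpreter.
sample = '<a href="x">Tom & Jerry\'s</a>' * 100

print("chained:  ", timeit.timeit(lambda: escape_chained(sample), number=1000))
print("translate:", timeit.timeit(lambda: escape_translate(sample), number=1000))
```

    Which variant wins is not obvious in advance, which is the point: both are O(n) for a fixed set of rules, so only a measurement on representative input says whether the rewrite is worth it.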

    Doing everything in one pass rather than in consecutive passes also raises questions about how conflicting rules are handled and in what order they are applied. With a chain of replace calls, the answers are clear.
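
    To make the ordering point concrete, here is a minimal sketch with just two rules (the function names are hypothetical). In a chain, each replace sees the output of the previous one, so the order in which the calls are written fully determines the result.

```python
def escape_amp_first(text):
    # '&' is handled first, so the '&' inside '&lt;' is never touched again.
    return text.replace("&", "&amp;").replace("<", "&lt;")

def escape_amp_last(text):
    # '&' is handled last, so the '&' produced by the '<' rule is escaped
    # a second time.
    return text.replace("<", "&lt;").replace("&", "&amp;")

print(escape_amp_first("a < b"))  # a &lt; b
print(escape_amp_last("a < b"))   # a &amp;lt; b
```

    A single-pass implementation has to decide explicitly whether replacement output is ever rescanned; the chain answers that question simply by how it is written.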

  • 2020-12-20 11:38

    You can use reduce:

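    from functools import reduce  # reduce is a builtin in Python 2 only

    # Apply each (old, new) pair in order. '&' must come first so that the
    # ampersands introduced by the later rules are not escaped again.
    reduce(lambda s, r: s.replace(*r),
           [('&', '&amp;'),
            ('<', '&lt;'),
            ('>', '&gt;'),
            ("'", '&#39;'),
            ('"', '&quot;')],
           string)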
    
  • 2020-12-20 11:39

    Apparently it's pretty common to implement that via regex. You can find an example of this at ASPN and here.
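
    The linked examples are not reproduced here. As a rough sketch of what a regex-based version might look like (an assumption, not the code from either link; `_HTML_ESCAPES`, `_PATTERN` and `escape_regex` are hypothetical names), one pattern matches any of the originals and a callback looks up the replacement:

```python
import re

# Assumed rule set: the five characters of an HTML escape.
_HTML_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    "'": "&#39;",
    '"': "&quot;",
}

# One alternation over all originals; re.escape guards against any
# character that is special in regex syntax.
_PATTERN = re.compile("|".join(re.escape(key) for key in _HTML_ESCAPES))

def escape_regex(text):
    # Single pass: each match is replaced via a dictionary lookup, and the
    # replacement text is not scanned again.
    return _PATTERN.sub(lambda match: _HTML_ESCAPES[match.group(0)], text)

print(escape_regex("1 < 2 & 3 > 2"))  # 1 &lt; 2 &amp; 3 &gt; 2
```

    With multi-character originals where one is a prefix of another, the order of the alternatives matters, so sorting the keys longest first would be a sensible precaution.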
