How to have gsub handle multiple patterns and replacements

為{幸葍}努か submitted on 2020-01-03 04:58:10

Question


A while ago I created a function in PHP to "twitterize" the text of tweets pulled via Twitter's API.

Here's what it looked like:

function twitterize($tweet){
$patterns = array ( "/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%@.\w_]*)#?(?:[\w]*))?)/", 
                    "/(?<=^|(?<=[^a-zA-Z0-9-\.]))@([A-Za-z_]+[A-Za-z0-9_]+)/",
                    "/(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/");
$replacements = array ("<a href='\\0' target='_blank'>\\0</a>", "<a href='http://twitter.com/\\1' target='_blank'>\\0</a>", "<a href='http://twitter.com/search?q=\\1&src=hash' target='_blank'>\\0</a>");

return preg_replace($patterns, $replacements, $tweet);

}

Now I'm a little stuck with Ruby's gsub. I tried:

def twitterize(text)
patterns = ["/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%@.\w_]*)#?(?:[\w]*))?)/", "/(?<=^|(?<=[^a-zA-Z0-9-\.]))@([A-Za-z_]+[A-Za-z0-9_]+)/", "/(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/"]
replacements =  ["<a href='\\0' target='_blank'>\\0</a>",
                "<a href='http://twitter.com/\\1' target='_blank'>\\0</a>",
                "<a href='http://twitter.com/search?q=\\1&src=hash' target='_blank'>\\0</a>"]

return text.gsub(patterns, replacements)
end

That didn't work and raised an error:

No implicit conversion of Array into String

After reading the Ruby documentation for gsub and trying a few of its examples, I still couldn't find a solution to my problem: how can I have gsub handle multiple patterns and multiple replacements at once?


Answer 1:


Well, as you can read in the docs, gsub does not handle multiple patterns and replacements at once. That's what is causing your error, which is otherwise quite explicit (read it as "give me a String, not an Array!").
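
To make the point above concrete, here is a minimal sketch of what gsub accepts and what it rejects. The sample text and the shortened mention pattern are assumptions for illustration, not the patterns from the question.

```ruby
# Illustrative input and a simplified mention pattern (both hypothetical).
text = "hello @alice"

# One pattern and one replacement string per call works as expected.
# Inside a single-quoted string, \1 is the first capture group and \0 the whole match.
single = text.gsub(/@(\w+)/, '<a href="http://twitter.com/\1">\0</a>')

# Passing arrays raises a TypeError, because gsub wants a String
# (or a Hash, or a block) as the replacement, not an Array.
error =
  begin
    text.gsub([/@(\w+)/], ["x"])
    nil
  rescue TypeError => e
    e
  end

puts single
puts error.class
```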

You can write that like this:

def twitterize(text)
  patterns = [/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%@.\w_]*)#?(?:[\w]*))?)/, /(?<=^|(?<=[^a-zA-Z0-9-\.]))@([A-Za-z_]+[A-Za-z0-9_]+)/, /(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/]
  replacements =  ["<a href='\\0' target='_blank'>\\0</a>",
            "<a href='http://twitter.com/\\1' target='_blank'>\\0</a>",
            "<a href='http://twitter.com/search?q=\\1&src=hash' target='_blank'>\\0</a>"]

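  # gsub takes a single pattern per call, so apply each pattern/replacement pair in turn.
  # Work on a copy so the caller's string is not modified (gsub! would also fail on a frozen string).
  result = text.dup
  patterns.each_with_index do |pattern, i|
    result.gsub!(pattern, replacements[i])
  end

  result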
end

This can be refactored into more elegant, Ruby-ish code, but I think it'll do the job.
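
One possible shape for that refactoring, as a rough sketch: keep each pattern next to its replacement and fold over the pairs with reduce. The names RULES and twitterize_pairs are hypothetical, and the two patterns are deliberately simplified stand-ins for the ones in the question.

```ruby
# Hypothetical, simplified rules: each entry pairs one pattern with its replacement.
# The lookbehind skips "@" or "#" that follow a word character, dot, or hyphen
# (for example the "@" inside an e-mail address).
RULES = [
  [/(?<![\w.-])@(\w+)/, "<a href='http://twitter.com/\\1' target='_blank'>\\0</a>"],
  [/(?<![\w.-])#(\w+)/, "<a href='http://twitter.com/search?q=\\1&src=hash' target='_blank'>\\0</a>"]
].freeze

# Applies every rule in order. Non-bang gsub returns a new string each time,
# so the caller's string is never modified.
def twitterize_pairs(text, rules = RULES)
  rules.reduce(text) do |result, (pattern, replacement)|
    result.gsub(pattern, replacement)
  end
end

puts twitterize_pairs("hi @bob #ruby")
```

Pairing the two arrays this way also removes the index bookkeeping, so a pattern and its replacement cannot drift out of step when a rule is added or removed.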




Answer 2:


The error occurred because you passed an array of replacements where gsub expects a string. Its syntax is:

text.gsub(matching_pattern,replacement_text)

You need to do something like this:

replaced_text = text.gsub(pattern1, replacement1)
replaced_text = replaced_text.gsub(pattern2, replacement2)

and so on, where pattern1 is one of your matching patterns and replacement1 is the replacement text you want for it.
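
Chained calls run each pattern over the output of the previous one, so a later pattern can also match text that an earlier replacement inserted. If that matters, a single pass is an alternative that neither answer proposes: combine the patterns with Regexp.union and choose the replacement in a block. The sketch below assumes simplified mention and hashtag patterns, and the names MENTION, HASHTAG and twitterize_once are hypothetical.

```ruby
# Hypothetical, simplified patterns (not the full ones from the question).
MENTION = /(?<![\w.-])@(\w+)/
HASHTAG = /(?<![\w.-])#(\w+)/

# Scans the text once. The block receives each matched substring
# ("@name" or "#tag") and returns the markup that replaces it.
def twitterize_once(text)
  text.gsub(Regexp.union(MENTION, HASHTAG)) do |match|
    name = match[1..-1] # drop the leading "@" or "#"
    if match.start_with?("@")
      "<a href='http://twitter.com/#{name}' target='_blank'>#{match}</a>"
    else
      "<a href='http://twitter.com/search?q=#{name}&src=hash' target='_blank'>#{match}</a>"
    end
  end
end

puts twitterize_once("hi @bob #ruby")
```

Because every match is found in the original text, the markup produced for one rule is never rescanned by another.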



Source: https://stackoverflow.com/questions/28423345/how-to-have-gsub-handle-multiple-patterns-and-replacements
