Best way to escape and unescape strings in Ruby?

Anonymous (unverified), submitted 2019-12-03 02:11:02

Question:

Does Ruby have any built-in method for escaping and unescaping strings? In the past, I've used regular expressions; however, it occurs to me that Ruby probably does such conversions internally all the time. Perhaps this functionality is exposed somewhere.

So far I've come up with these functions. They work, but they seem a bit hacky:

    def escape(s)
      s.inspect[1..-2]
    end

    def unescape(s)
      eval %Q{"#{s}"}
    end

Is there a better way?

Answer 1:

If you don't want to use eval, but are willing to use the YAML module, you can use it instead:

    require 'yaml'

    def unescape(s)
      YAML.load(%Q(---\n"#{s}"\n))
    end

The advantage of YAML over eval is that it is presumably safer. The cane tool disallows all usage of eval. I've seen recommendations to use $SAFE along with eval, but that is not currently available in JRuby.
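
As a usage sketch of the YAML approach, the following wraps the escaped text in double quotes and parses it as a one-scalar YAML document. The method name `unescape_yaml` is made up for this example, and `YAML.safe_load` is swapped in for `YAML.load` on the assumption that only a plain string is expected back. An unescaped double quote inside the input would end the YAML scalar early, so this only suits input that is already escaped consistently.

```ruby
require 'yaml'

# Hypothetical helper: interprets backslash escapes by letting the YAML
# parser read the text as a double-quoted scalar.
def unescape_yaml(s)
  YAML.safe_load(%Q(---\n"#{s}"\n))
end

# The single-quoted literal holds a literal backslash-t and backslash-n,
# which the parser turns into a real tab and a real newline.
puts unescape_yaml('a\tb\nc')
```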

For what it is worth, Python does have native support for unescaping backslashes.



Answer 2:

There are a bunch of escaping methods. Some of them:

    # Regexp escapings
    >> Regexp.escape('\*?{}.')
    => \\\*\?\{\}\.
    >> URI.escape("test=100%")
    => "test=100%25"
    >> CGI.escape("test=100%")
    => "test%3D100%25"

So it really depends on the problem you need to solve. But I would avoid using inspect for escaping.
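
To make the point about choosing the escaper for the target context concrete, here is a small sketch. `URI.escape` was removed in Ruby 3.0, so `URI.encode_www_form_component` stands in for it here; that substitution and the sample strings are my own, not part of the answer.

```ruby
require 'uri'

# Escape text so it can be embedded literally in a regular expression.
pattern_source = Regexp.escape('1+1=2?')

# Escape text so it can be used as a URL query component.
query_value = URI.encode_www_form_component('test=100%')

puts pattern_source
puts query_value

# The escaped pattern matches the original text literally.
puts Regexp.new(pattern_source).match?('1+1=2?')
```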

Update: there is String#dump; inspect uses that, and it looks like it is what you need:

    >> "\n\t".dump
    => "\"\\n\\t\""

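Since Ruby 2.5 there is also `String#undump`, the inverse of `dump`, which gives a round trip without `eval`. A minimal sketch follows; the helper names `escape_str` and `unescape_str` are invented for illustration, and stripping the surrounding quotes with `[1..-2]` assumes an ASCII-compatible encoding such as UTF-8, where `dump` returns a plain quoted literal.

```ruby
# Hypothetical helpers built on String#dump / String#undump (Ruby 2.5+).

# Returns the escaped form without the surrounding double quotes.
def escape_str(s)
  s.dump[1..-2]
end

# Re-adds the quotes that undump expects, then reverses the escaping.
def unescape_str(s)
  %Q("#{s}").undump
end

original = "tab\there\nline two"
escaped  = escape_str(original)

puts escaped
puts unescape_str(escaped) == original
```

Unlike the `eval` version, `undump` only interprets string escapes; it does not execute code embedded in the input.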

Answer 3:

Caleb's function was the nearest thing to the reverse of String#inspect I was able to find; however, it contained two bugs:

  • \\ was not handled correctly.
  • \x.. retained the backslash.

I fixed the above bugs and this is the updated version:

    UNESCAPES = {
      'a' => "\x07", 'b' => "\x08", 't' => "\x09",
      'n' => "\x0a", 'v' => "\x0b", 'f' => "\x0c",
      'r' => "\x0d", 'e' => "\x1b", "\\\\" => "\x5c",
      "\"" => "\x22", "'" => "\x27"
    }

    def unescape(str)
      # Unescape all the things
      str.gsub(/\\(?:([#{UNESCAPES.keys.join}])|u([\da-fA-F]{4}))|\\0?x([\da-fA-F]{2})/) {
        if $1
          if $1 == '\\' then '\\' else UNESCAPES[$1] end
        elsif $2 # unescape \u0000 unicode
          ["#$2".hex].pack('U*')
        elsif $3 # unescape \0xff or \xff
          [$3].pack('H2')
        end
      }
    end

    # To test it (stops at end of input instead of passing nil to unescape)
    while (line = STDIN.gets)
      puts unescape(line)
    end


Answer 4:

YAML's ::unescape doesn't seem to escape quote characters, e.g. ' and ". I'm guessing this is by design, but it makes me sad.

You definitely do not want to use eval on arbitrary or client-supplied data.

This is what I use. Handles everything I've seen and doesn't introduce any dependencies.

    UNESCAPES = {
      'a' => "\x07", 'b' => "\x08", 't' => "\x09",
      'n' => "\x0a", 'v' => "\x0b", 'f' => "\x0c",
      'r' => "\x0d", 'e' => "\x1b", "\\\\" => "\x5c",
      "\"" => "\x22", "'" => "\x27"
    }

    def unescape(str)
      # Unescape all the things
      str.gsub(/\\(?:([#{UNESCAPES.keys.join}])|u([\da-fA-F]{4}))|\\0?x([\da-fA-F]{2})/) {
        if $1
          if $1 == '\\' then '\\' else UNESCAPES[$1] end
        elsif $2 # unescape \u0000 unicode
          ["#$2".hex].pack('U*')
        elsif $3 # unescape \0xff or \xff
          [$3].pack('H2')
        end
      }
    end


Answer 5:

Ruby's inspect can help:

    "a\nb".inspect
    => "\"a\\nb\""

Normally if we print a string with an embedded line-feed, we'd get:

    puts "a\nb"
    a
    b

If we print the inspected version:

    puts "a\nb".inspect
    "a\nb"

Assign the inspected version to a variable and you'll have the escaped version of the string.

To undo the escaping, eval the string:

    puts eval("a\nb".inspect)
    a
    b

I don't really like doing it this way. It's more of a curiosity than something I'd do in practice.



Answer 6:

I suspect that Shellwords.escape will do what you're looking for:

https://ruby-doc.org/stdlib-1.9.3/libdoc/shellwords/rdoc/Shellwords.html#method-c-shellescape
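
For reference, a quick sketch of what `Shellwords.escape` does. It targets Bourne-shell quoting of command-line arguments rather than backslash sequences such as `\n`, so it only fits if the string is headed for a shell. The sample file name is made up.

```ruby
require 'shellwords'

filename = "it's a file.txt"
escaped  = Shellwords.escape(filename)

# Shell-safe form of the argument: special characters get a backslash prefix.
puts escaped

# Splitting the escaped text the way a shell would yields the original as one word.
p Shellwords.split(escaped)
```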


