Ruby — looking for some sort of “Regexp unescape” method

Posted by 你说的曾经没有我的故事 on 2019-12-12 17:34:08

Question


I have a bunch of strings with special escape codes that I want to store unescaped. For example, the interpreter shows

"\\014\"\\000\"\\016smoothing\"\\011mean\"\\022color\"\\011zero@\\016" but I want it to show (when inspected) as "\014\"\000\"\016smoothing\"\011mean\"\022color\"\011zero@\016"

What's the method to unescape them? I imagine that I could make a regex to remove 1 backslash from every consecutive n backslashes, but I don't have a lot of regex experience and it seems there ought to be a "more elegant" way to do it.

For example, when I puts MyString it displays the output I'd like, but I don't know how I might capture that into a variable.

Thanks!

Edited to add context: I have this class that is used to marshal and restore some data. When I restore some old strings, it raises a type error, which I've determined is because they weren't, for some inexplicable reason, stored as base64. Instead, they appear to have just been escaped. I don't want that, because trying to restore them gives TypeError: incompatible marshal file format (can't be read) format version 4.8 required; 92.48 given, since Marshal looks at the first characters of the string to determine the format.

require 'base64'
class MarshaledStuff < ActiveRecord::Base

  validates_presence_of :marshaled_obj

  def contents
    obj = self.marshaled_obj
    return Marshal.restore(Base64.decode64(obj))
  end

  def contents=(newcontents)
    self.marshaled_obj = Base64.encode64(Marshal.dump(newcontents))
  end
end
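
A quick way to see where the "92.48" in that error comes from is to compare the first two bytes of a real Marshal dump with the first two bytes of an escaped string. This is only a sketch: the `escaped` value below is a made-up sample in the same style as the stored strings, not actual data from the table.

```ruby
# A genuine Marshal dump starts with the version bytes 4 and 8.
dumped = Marshal.dump("smoothing")
p dumped.bytes.first(2)   # [4, 8]

# Hypothetical legacy value: single quotes keep the backslashes literal,
# so the string starts with the characters "\" and "0".
escaped = '\004\b"\016smoothing'
p escaped.bytes.first(2)  # [92, 48]

# Marshal reads those two bytes as the format version and rejects them.
message =
  begin
    Marshal.load(escaped)
    nil
  rescue TypeError => e
    e.message
  end
puts message
```

Byte 92 is the backslash and byte 48 is the digit "0", which is why Marshal reports version 92.48 instead of 4.8.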

Edit 2: Changed wording -- I was thinking they were "double-escaped" but it was only single-escaped. Whoops!


Answer 1:


If your strings give you the correct output when you print them, then they are already escaped correctly. The extra backslashes you see are probably there because you are displaying them in the interactive interpreter, which adds backslashes when it displays variables to make them less ambiguous.

> x
=> "\\"
> puts x
\
=> nil
> x.length
=> 1

Note that even though it looks like x contains two backslashes, the length of the string is one. The extra backslash is added by the interpreter and is not really part of the string.
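
The same length check can tell a string that really contains a backslash followed by digits apart from one that contains a single control character. A minimal sketch, with variable names chosen only for illustration:

```ruby
literal = '\014'  # single quotes: a backslash followed by "0", "1", "4"
control = "\014"  # double quotes: one form-feed character (octal 014)

p literal.length  # 4
p control.length  # 1

p literal.bytes   # [92, 48, 49, 52]
p control.bytes   # [12]
```

If a stored string behaves like `literal` here, the backslashes are part of the data and have to be converted explicitly.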

If you still think there's a problem, please be more specific about how you are displaying the strings that you mentioned in your question.


Edit: In your example, the only things that need unescaping are the octal escape codes. You could try this:

# Replace each three-digit octal escape (\000 to \377) with the byte it encodes.
x = x.gsub(/\\[0-3][0-7]{2}/){ |c| c[1,3].to_i(8).chr }
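
To check that this kind of substitution produces something Marshal can read again, here is a rough round-trip sketch. The helper name `unescape_octal` and the way the escaped string is simulated are assumptions made for the example, and `String#b` needs Ruby 2.0 or later.

```ruby
# Hypothetical helper: turns literal "\NNN" octal escapes into raw bytes.
# Working on a binary copy (String#b) avoids encoding errors for bytes >= 128.
def unescape_octal(str)
  str.b.gsub(/\\([0-3][0-7]{2})/) { Regexp.last_match(1).to_i(8).chr }
end

original = Marshal.dump(["smoothing", "mean"])

# Simulate the legacy storage: keep printable ASCII (except the backslash)
# and write every other byte as a three-digit octal escape.
escaped = original.bytes.map { |b|
  (0x20..0x7e).cover?(b) && b != 92 ? b.chr : format('\\%03o', b)
}.join

restored = Marshal.load(unescape_octal(escaped))
p restored  # ["smoothing", "mean"]
```

This only round-trips cleanly if every backslash in the stored text introduces an octal escape. Data that was escaped some other way, or that contains other escape forms, would need a different rule.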


Source: https://stackoverflow.com/questions/2602498/ruby-looking-for-some-sort-of-regexp-unescape-method
