Challenge: Regex-only tokenizer for shell-assignment-like config lines

别等时光非礼了梦想. Posted on 2019-12-09 02:33:32

I'm not sure about the full specs and what exactly you want in the value capturing group, but this should work for your test cases:

/
^\s*+

(?:export\s++)?
(?<key>\w++)

\s*+
=
\s*+

(?<value>
  (?>  "(?:[^"\\]+|\\.)*+"
  |    '(?:[^'\\]+|\\.)*+'
  |    `(?:[^`\\]+|\\.)*+`
  |    [^#\n\r]++
  )
)

\s*+
(?:\#.*+)?
$
/mx;

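It handles trailing comments and double-quoted, single-quoted, and backtick-quoted values with backslash escapes. The value group keeps the surrounding quotes, and an unquoted value keeps any whitespace that precedes a trailing comment.

This is Perl/PCRE flavor: it relies on possessive quantifiers, atomic groups, and named groups. Under /x, an unescaped # outside a character class starts a comment, so a literal # has to be written as \#.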
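
As a small illustration of the /x quoting point, here is a sketch (the variable names and sample strings are made up for this example). Under /x, a # outside a character class begins a comment that runs to the end of the line, while a # inside a character class is an ordinary character:

```perl
use strict;
use warnings;

# Under /x, "#" outside a character class starts a comment,
# so it must be escaped to match a literal "#".
my $escaped = qr/
    ^ \w++      # a word
    \s*+
    \# .*+      # a literal "#" followed by the rest of the line
    $
/x;

# Inside a character class, "#" is an ordinary character even under /x.
my $in_class = qr/
    ^ [^#]++    # everything up to (not including) the first "#"
/x;

for my $line ('value # note', 'value') {
    my $has_comment = $line =~ $escaped ? 'yes' : 'no';
    my ($before) = $line =~ /($in_class)/;
    print "$line => comment: $has_comment, before: '$before'\n";
}
```

Note that the text captured before the # still carries its trailing space, which is the same behavior the unquoted alternative of the main pattern has.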


Example usage in Perl:

my $re = qr/
    ^\s*+

    (?:export\s++)?
    (?<key>\w++)

    \s*+
    =
    \s*+

    (?<value>
      (?>  "(?:[^"\\]+|\\.)*+"
      |    '(?:[^'\\]+|\\.)*+'
      |    `(?:[^`\\]+|\\.)*+`
      |    [^#\n\r]++
      )
    )

    \s*+
    (?:\#.*+)?
    $
/mx;

my $str = <<'_TESTS_';
RAILS_ENV=development     # Don't forget to change this for TechCrunch
HOSTNAME=`cat /etc/hostname`
plist=`cat "/Applications/Sublim\`e Text 2.app/Content's/Info.plist"`

# Optional bonus input: "#" present in the string
FORMAT="  ##0.00 passe\`" #comment

listen_addresses = 127.0.0.1 #localhost only by default
# listen_addresses = 0.0.0.0 commented out, should not match

TEST="foo'bar\"baz#"
TEST='foo\'bar"baz#\\'
_TESTS_


for(split /[\r\n]+/, $str){
    print "line: $_\n";
    print /$re/? "match: $1, $2\n": "no match\n";
    print "\n";
}

Output:

line: RAILS_ENV=development     # Don't forget to change this for TechCrunch
match: RAILS_ENV, development

line: HOSTNAME=`cat /etc/hostname`
match: HOSTNAME, `cat /etc/hostname`

line: plist=`cat "/Applications/Sublim\`e Text 2.app/Content's/Info.plist"`
match: plist, `cat "/Applications/Sublim\`e Text 2.app/Content's/Info.plist"`

line: # Optional bonus input: "#" present in the string
no match

line: FORMAT="  ##0.00 passe\`" #comment
match: FORMAT, "  ##0.00 passe\`"

line: listen_addresses = 127.0.0.1 #localhost only by default
match: listen_addresses, 127.0.0.1

line: # listen_addresses = 0.0.0.0 commented out, should not match
no match

line: TEST="foo'bar\"baz#"
match: TEST, "foo'bar\"baz#"

line: TEST='foo\'bar"baz#\\'
match: TEST, 'foo\'bar"baz#\\'
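
Since the value group keeps the surrounding quotes and, for unquoted values, the whitespace in front of a trailing comment, the capture may need post-processing if the bare value is wanted. A minimal sketch follows; clean_value is a hypothetical helper that is not part of the answer above, and its unescaping rule (drop the backslash in front of any character) is an assumption that does not reproduce real shell quoting exactly:

```perl
use strict;
use warnings;

# Hypothetical helper: normalize a value captured by the pattern.
sub clean_value {
    my ($raw) = @_;

    # Double- or single-quoted: drop the quotes, then turn "\x" into "x".
    if ($raw =~ /\A(["'])(.*)\1\z/s) {
        my $inner = $2;
        $inner =~ s/\\(.)/$1/gs;
        return $inner;
    }

    # Backticks denote a command; leave them untouched.
    return $raw if $raw =~ /\A`/;

    # Unquoted: strip whitespace left in front of a trailing comment.
    $raw =~ s/\s+\z//;
    return $raw;
}

print clean_value('development     '), "|\n";
print clean_value(q{"foo'bar\"baz#"}), "|\n";
print clean_value('`cat /etc/hostname`'), "|\n";
```

The trailing | in each print only makes the end of the cleaned value visible. Whether single-quoted values should be unescaped at all depends on the config format being parsed.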