Python: Split a string, respect and preserve quotes [duplicate]

Posted by 假如想象 on 2019-11-29 02:47:52

Question


Using Python, I want to split the following string:

a=foo, b=bar, c="foo, bar", d=false, e="false"

This should result in the following list:

['a=foo', 'b=bar', 'c="foo, bar"', 'd=false', 'e="false"']

When using shlex in POSIX mode and splitting on ", ", the argument for c gets treated correctly. However, shlex removes the quotes. I need them because false is not the same as "false", for instance.

My code so far:

import shlex

mystring = 'a=foo, b=bar, c="foo, bar", d=false, e="false"'

splitter = shlex.shlex(mystring, posix=True)
splitter.whitespace += ','
splitter.whitespace_split = True
print(list(splitter))  # ['a=foo', 'b=bar', 'c=foo, bar', 'd=false', 'e=false']

Answer 1:


>>> import re
>>> s = r'a=foo, b=bar, c="foo, bar", d=false, e="false", f="foo\", bar"'
>>> re.findall(r'(?:[^\s,"]|"(?:\\.|[^"])*")+', s)
['a=foo', 'b=bar', 'c="foo, bar"', 'd=false', 'e="false"', 'f="foo\\", bar"']
  1. The regex pattern "[^"]*" matches a simple quoted string.
  2. "(?:\\.|[^"])*" matches a quoted string and skips over escaped quotes because \\. consumes two characters: a backslash and any character.
  3. [^\s,"] matches a non-delimiter.
  4. Combining patterns 2 and 3 inside (?: | )+ matches a sequence of non-delimiters and quoted strings, which is the desired result.
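The difference between patterns 1 and 2, and the combined pattern from step 4, can be checked with a short script. This is only a sketch: the names SIMPLE_QUOTED, ESCAPED_QUOTED, TOKEN_RE and split_preserving_quotes are hypothetical and chosen for illustration; the regular expressions themselves are the ones from the answer above.

```python
import re

# Pattern 1: a quoted string with no support for escaped quotes.
SIMPLE_QUOTED = re.compile(r'"[^"]*"')
# Pattern 2: a quoted string in which backslash + any character is one unit.
ESCAPED_QUOTED = re.compile(r'"(?:\\.|[^"])*"')
# Step 4: one or more non-delimiters (pattern 3) or quoted strings (pattern 2).
TOKEN_RE = re.compile(r'(?:[^\s,"]|"(?:\\.|[^"])*")+')


def split_preserving_quotes(text):
    """Return the tokens of text, keeping quoted parts and their quotes."""
    return TOKEN_RE.findall(text)


quoted = r'"foo\", bar"'
# Pattern 1 stops at the first quote it sees, even though it is escaped.
print(SIMPLE_QUOTED.match(quoted).group())
# Pattern 2 consumes the escaped quote and runs to the real closing quote.
print(ESCAPED_QUOTED.match(quoted).group())

s = r'a=foo, b=bar, c="foo, bar", d=false, e="false", f="foo\", bar"'
for token in split_preserving_quotes(s):
    print(token)
```

Because the alternation tries \\. before [^"], a backslash is always consumed together with the character that follows it, which is what keeps an escaped quote from closing the string.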



Answer 2:


Regex can solve this easily enough:

import re

mystring = 'a=foo, b=bar, c="foo, bar", d=false, e="false"'

splitString = re.split(r',?\s(?=\w+=)', mystring)

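The pattern splits on an optional comma followed by a single whitespace character, but only where a lookahead finds one or more word characters and an equals sign, i.e. the start of the next key. The ", " inside c="foo, bar" is not followed by key=, so the quoted value stays intact and the quotes are preserved.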
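A quick way to see where this approach works and where it does not is to wrap the split in a small function. This is a sketch under one assumption: the names SPLIT_RE and split_on_keys are hypothetical and not part of the answer. The second call shows a limitation worth knowing: the lookahead does not know about quotes, so a quoted value that itself contains something like ", y=" is split in the middle.

```python
import re

# Optional comma, one whitespace character, then a lookahead for "key=".
SPLIT_RE = re.compile(r',?\s(?=\w+=)')


def split_on_keys(text):
    """Split text wherever a separator is followed by word characters and '='."""
    return SPLIT_RE.split(text)


# The input from the question: the quoted comma is left alone.
print(split_on_keys('a=foo, b=bar, c="foo, bar", d=false, e="false"'))

# A quoted value containing ", y=" is (incorrectly) split as well.
print(split_on_keys('a="x, y=1", b=2'))
```

If quoted values may contain key=value-like text, a pattern that matches quoted strings explicitly is the safer choice.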



Source: https://stackoverflow.com/questions/16710076/python-split-a-string-respect-and-preserve-quotes
