Is it possible to write a regular expression that matches a particular pattern and then does a replace with a part of the pattern?

梦谈多话 2021-01-28 10:01

I'm working with some comma-delimited text files. Each file consists of approximately 400 rows and 94 columns, all comma-delimited and within double quotes:

         


        
2 Answers
  • 2021-01-28 10:22

    Use lookarounds (a negative lookbehind and a negative lookahead):

    (?<!"),(?!")
    

    Replace each match with a pipe.

    The pattern means:

    (?<!")    - character before is not a "
    ,         - match a comma
    (?!")     - character after is not a "
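
The same lookaround pattern can be tried outside Perl. Below is a minimal sketch in Python's `re` module, assuming every field is wrapped in double quotes as the question describes; the function name `replace_inner_commas` and the sample line are made up for illustration.

```python
import re

# A comma that has no double quote directly before or after it,
# i.e. a comma sitting inside a quoted field rather than between fields.
INNER_COMMA = re.compile(r'(?<!"),(?!")')


def replace_inner_commas(line, repl="|"):
    """Replace commas inside quoted fields; leave separator commas alone."""
    return INNER_COMMA.sub(repl, line)


if __name__ == "__main__":
    # Hypothetical sample row, not taken from the asker's file.
    print(replace_inner_commas('"H","YES","4,5","N"'))
    # prints: "H","YES","4|5","N"
```

Note that this relies on the all-fields-quoted assumption: a separator between two unquoted fields (as in `1,2`) has no quote on either side, so it would be replaced as well.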
    
  • 2021-01-28 10:37

    Rather than interfere with what is evidently source data, i.e. the stuff inside the quotes, you might consider replacing the field-separator commas instead:

    s/,([^,"]*|"[^"]*")(?=(,|$))/|$1/g
    

    Note that this also handles non-quoted fields.

    On this data: "H",9,"YES","NO","4,5","Y","N"

    $ perl -pe 's/,([^,"]*|"[^"]*")(?=(,|$))/|$1/g' commasep
    "H"|9|"YES"|"NO"|"4,5"|"Y"|"N"
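
For readers not using Perl, the substitution above might be reproduced with Python's `re.sub`. This is only a sketch under the same assumptions as the Perl one-liner (no escaped quotes inside fields); `commas_to_pipes` is a hypothetical helper name, and Perl's `$1` becomes `\1` in the Python replacement string.

```python
import re

# A separator comma followed by one whole field: either an unquoted run
# without commas or quotes, or a complete "quoted" field. The lookahead
# requires the field to end at the next separator or at end of line.
FIELD_SEP = re.compile(r',([^,"]*|"[^"]*")(?=(,|$))')


def commas_to_pipes(line):
    """Turn field-separator commas into pipes, keeping quoted data intact."""
    return FIELD_SEP.sub(r'|\1', line)


if __name__ == "__main__":
    print(commas_to_pipes('"H",9,"YES","NO","4,5","Y","N"'))
    # prints: "H"|9|"YES"|"NO"|"4,5"|"Y"|"N"
```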
    

    The piped line can afterwards be split on "|":

    $ perl -ne 's/,([^,"]*|"[^"]*")(?=(,|$))/|$1/g;print join "---",split "\\|"' commasep
    "H"---9---"YES"---"NO"---"4,5"---"Y"---"N"
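
The replace-then-split step can be sketched in Python as well. The helper name `split_fields` is hypothetical, and the sketch assumes no field contains a literal `|`, since such a field would be split in two.

```python
import re

# Same separator pattern as in the Perl one-liner above.
FIELD_SEP = re.compile(r',([^,"]*|"[^"]*")(?=(,|$))')


def split_fields(line):
    """Replace separator commas with pipes, then split on the pipes."""
    piped = FIELD_SEP.sub(r'|\1', line.rstrip("\n"))
    return piped.split("|")


if __name__ == "__main__":
    fields = split_fields('"H",9,"YES","NO","4,5","Y","N"\n')
    print("---".join(fields))
    # prints: "H"---9---"YES"---"NO"---"4,5"---"Y"---"N"
```

The fields keep their surrounding double quotes, exactly as in the Perl output; stripping them would be a separate step.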
    