Remove namespace prefix with sed

说谎 2021-01-22 19:27

I want to convert this piece of XML:

<v1:table>
  <v1:tr>
    <v1:td>Apples</v1:td>
    <v1:td>Bananas</v1:td>
  </v1:tr>
</v1:table>

2 Answers
  • 2021-01-22 19:45

    Here's how you could do it with hxpipe and hxunpipe from the W3C HTML-XML-utils (packaged for many distributions):

    $ hxpipe infile | sed 's/^\([()]\)v1:/\1/g' | hxunpipe
    <table>
      <tr>
        <td>Apples</td>
        <td>Bananas</td>
      </tr>
    </table>
    

    hxpipe parses XML/HTML and turns it into an awk/sed-friendly, line-based format:

    $ hxpipe infile
    (v1:table
    -\n  
    (v1:tr
    -\n    
    (v1:td
    -Apples
    )v1:td
    -\n    
    (v1:td
    -Bananas
    )v1:td
    -\n  
    )v1:tr
    -\n
    )v1:table
    -\n
    

    where lines starting with ( and ) are opening and closing tags. Removing the v1: that directly follows a leading ( or ) (which is what the sed command above does) therefore achieves the desired effect. Text lines always start with a -, so they can never match and there can't be any false positives.
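
The "no false positives" point can be checked without installing hxpipe. The sketch below is only an illustration: the sample lines are hand-written in hxpipe's format (an assumption, not real tool output), and the variable names `hxpipe_sample` and `stripped` are made up for this example. The second cell deliberately contains the literal text `v1:Bananas`.

```shell
# Hand-written lines in hxpipe's format (hypothetical sample, not real hxpipe output).
# The second text line contains "v1:" on purpose.
hxpipe_sample=$(printf '%s\n' '(v1:td' '-Apples' ')v1:td' '(v1:td' '-v1:Bananas' ')v1:td')

# Same substitution as above: drop "v1:" only when it follows a leading ( or ).
stripped=$(printf '%s\n' "$hxpipe_sample" | sed 's/^\([()]\)v1:/\1/')

printf '%s\n' "$stripped"
```

The tag lines come out as `(td` and `)td`, while the text line `-v1:Bananas` is left unchanged because it starts with `-` and never matches the anchored pattern.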

  • 2021-01-22 19:49

    This sed works for your example:

    sed -E 's~(</?)v1:~\1~g' file
    
    <table>
      <tr>
        <td>Apples</td>
        <td>Bananas</td>
      </tr>
    </table>
    

    However, note that sed is not the best tool for parsing HTML/XML, because it knows nothing about the document structure. Consider using a proper HTML/XML parser instead.
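
To make that caveat concrete, here is a small sketch of one way the regex can misfire. The input line is hypothetical (it is not from the question): it has a comment that mentions a prefixed tag verbatim. The variable names `input` and `output` are likewise just for illustration.

```shell
# Hypothetical input: one real element plus a comment that quotes a prefixed tag.
input='<v1:td>Apples</v1:td><!-- see <v1:td> docs -->'

# The same substitution as above, reading from stdin instead of a file.
output=$(printf '%s\n' "$input" | sed -E 's~(</?)v1:~\1~g')

printf '%s\n' "$output"
# Prints: <td>Apples</td><!-- see <td> docs -->
```

The element is fixed as intended, but the text inside the comment is rewritten as well, since sed only sees characters and cannot tell markup from comment content.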
