Populate array from XML end tags

遇见更好的自我 2021-01-17 06:18

I am trying to create an array of field names that I can use later in my script. Regular expressions are kicking my butt. I haven't written code in a long time. The fiel

1 answer
  •  野的像风
    2021-01-17 06:51

    Your sample data isn't XML. Your slashes are backwards. Assuming it is XML you're trying to parse, the answer is 'don't use regular expressions'.

    They're simply not able to cope with the recursion and nesting to the degree necessary.

    So with that in mind, assuming your sample data is actually well-formed XML and the backslashes are a typo, something like XML::Twig will do it quite handily:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    
    use XML::Twig;
    
    my $twig = XML::Twig -> parse ( \*DATA );
    
    #extract a single field value
    print $twig -> root -> first_child_text('title'),"\n";
    #get a field name
    print $twig -> root -> first_child -> tag,"\n";
    #can also use att() if you have attributes
    
    
    print "Field names:\n";
    #children() returns all the children of the current (in this case root) node
    #We use map to access all, and tag to read their 'name'. 
    #att or trimmed_text would do other parts of the XML. 
    print join ( "\n", map { $_ -> tag } $twig -> root -> children );
    
    __DATA__
    <root>
    <record>DEFECT000179</record><state>Approved</state><title>Something is broken</title>
    </root>
    

    This prints:

    Something is broken
    record
    Field names:
    record
    state
    title
    

    You also have a variety of other really useful tools, such as pretty_print for formatting your output XML, twig_handlers that let you manipulate XML as you parse (particularly handy for purge), cut and paste to move nodes around, and get_xpath to let you use an xpath expression to find elements based on path and attributes.
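
    As a rough illustration of two of those tools, here is a minimal sketch of twig_handlers (with purge) and get_xpath. The <defects> root, the <defect> wrapper elements and the id attribute are hypothetical names made up for this example; only the record/state/title fields come from the sample data above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use XML::Twig;

# Hypothetical input: the <defects>/<defect> elements and the id attribute
# are assumptions for illustration, not part of the original data.
my $xml = <<'XML';
<defects>
  <defect id="1"><record>DEFECT000179</record><state>Approved</state><title>Something is broken</title></defect>
  <defect id="2"><record>DEFECT000180</record><state>Open</state><title>Something else is broken</title></defect>
</defects>
XML

my @titles;

# twig_handlers: the sub runs each time a <defect> element has been fully parsed.
my $twig = XML::Twig->new(
    twig_handlers => {
        defect => sub {
            my ( $t, $elt ) = @_;
            push @titles, $elt->first_child_text('title');
            $t->purge;    # discard what has been parsed so far to save memory
        },
    },
);
$twig->parse($xml);

# get_xpath: parse again without purging, then select by path and attribute.
my $doc     = XML::Twig->new->parse($xml);
my @records = map { $_->first_child_text('record') }
    $doc->get_xpath('//defect[@id="2"]');

print "Titles: ", join( ", ", @titles ), "\n";
print "Record with id 2: @records\n";
```

    The handler approach matters mostly for large files, where purge keeps memory use flat; for small inputs a plain parse followed by get_xpath is usually simpler.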

    Edit: Based on comments, if you really want to extract data from:

    <record>DEFECT000179<\record><state>Approved<\state><title>Something is broken<\title>

    The problem with your regex is that .* is greedy: it matches as much as it can, so it runs through to the last > on the line. You either need to use a negated match, like:

    m,<([^>]+)>,g
    

    Or a nongreedy match:

    m,<(.*?)>,g
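
    To make the difference concrete, here is a small sketch that runs the greedy, negated and nongreedy patterns over the same string. The $line value is a made-up sample, not your data.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical sample line containing three tags.
my $line = '<record><state><title>';

# Greedy: .* runs to the last '>', so the whole line ends up in one capture.
my @greedy = $line =~ m,<(.*)>,g;

# Negated character class: the capture cannot cross a '>'.
my @negated = $line =~ m,<([^>]+)>,g;

# Nongreedy: .*? stops at the first '>' that lets the match succeed.
my @lazy = $line =~ m,<(.*?)>,g;

print "greedy:  @greedy\n";     # greedy:  record><state><title
print "negated: @negated\n";    # negated: record state title
print "lazy:    @lazy\n";       # lazy:    record state title
```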
    

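    Oh, and since your data contains a backslash, you need to escape it in the pattern:

    my $firstLineOfXMLFile = '<record>DEFECT000179<\record><state>Approved<\state><title>Something is broken<\title>';
    # <\\ matches a literal '<\', so only the "end tags" are captured
    my @fieldNames = $firstLineOfXMLFile =~ m(<\\(.*?)>)g;
    print join( "\n", @fieldNames ), "\n";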

    Will do the trick. (But seriously: deliberately creating something that looks like XML but isn't is a really bad thing to do.)
        <div class="layui-col-md4">
              
     <dl class="fly-panel fly-list-one">