How to get Matlab to read correct amount of xml nodes

前端未结

关注

 2  518

I\'m reading a simple xml file using matlab\'s xmlread internal function.


                      
              相关标签:


      
      
        
          2条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  滥情空心        
                
              
                            
                2021-01-12 03:28
              
            
            
                                                                       
I felt that @cholland answer was good, but I didn't like the extra xml work. So here is a solution to strip the whitespace from a copy of the xml file which is the root cause of the unwanted elements.

fid = fopen('tmpCopy.xml','wt');
str = regexprep(fileread(filename),'[\n\r]+',' ');
str = regexprep(str,'>[\s]*<','><');
fprintf(fid,'%s', str);
fclose(fid);

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  醉梦人生        
                
              
                            
                2021-01-12 03:42
              
            
            
                                                                       
The XML parser behind the scenes is creating #text nodes for all whitespace between the node elements.  Whereever there is a newline or indentation it will create a #text node with the newline and following indentation spaces in the data portion of the node.  So in the xml example you provided when it is parsing the child nodes of the "ref" element it returns 5 nodes


Node 1: #text with newline and indentation spaces
Node 2: "requestor" node which in turn has a #text child with "John Doe" in the data    portion 
Node 3: #text with newline and indentation spaces
Node 4: "project" node which in turn has a #text child with "X" in the data portion
Node 5: #text with newline and indentation spaces


This function removes all of these useless #text nodes for you.  Note that if you intentionally have an xml element composed of nothing but whitespace then this function will remove it but for the 99.99% of xml cases this should work just fine.

function removeIndentNodes( childNodes )

numNodes = childNodes.getLength;
remList = [];
for i = numNodes:-1:1
   theChild = childNodes.item(i-1);
   if (theChild.hasChildNodes)
      removeIndentNodes(theChild.getChildNodes);
   else
      if ( theChild.getNodeType == theChild.TEXT_NODE && ...
           ~isempty(char(theChild.getData()))         && ...
           all(isspace(char(theChild.getData()))))
         remList(end+1) = i-1; % java indexing
      end
   end
end
for i = 1:length(remList)
   childNodes.removeChild(childNodes.item(remList(i)));
end

end


Call it like this

tree = xmlread( xmlfile );
removeIndentNodes( tree.getChildNodes );

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复