How to escape XML entities in JavaScript?


In JavaScript (server-side Node.js) I'm writing a program which generates XML as output.

I am building the XML by concatenating a string:

str += '&l


                      
10 answers

花落未央, 2020-11-27 15:15
              
            
            
                                                                       
HTML encoding is simply replacing the &, ", ', < and > characters with their entity equivalents. Order matters: if you don't replace the & characters first, you'll double-encode some of the entities:

if (!String.prototype.encodeHTML) {
  String.prototype.encodeHTML = function () {
    return this.replace(/&/g, '&amp;')
               .replace(/</g, '&lt;')
               .replace(/>/g, '&gt;')
               .replace(/"/g, '&quot;')
               .replace(/'/g, '&apos;');
  };
}


(As @Johan B.W. de Vries pointed out, this will have issues with tag names. To clarify, I assumed that this was being used for values only.)
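
To illustrate the "values only" assumption, here is a minimal sketch of building an element by string concatenation, as in the question. The helper names `escapeXmlValue` and `xmlElement` are hypothetical, and the element name is assumed to be trusted and valid.

```javascript
// Hypothetical helper: escapes the five XML special characters in a value.
// '&' is replaced first so entities produced by later steps are not re-escaped.
function escapeXmlValue(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: only the text content is escaped; `name` is assumed
// to be a trusted, valid XML element name.
function xmlElement(name, value) {
  return '<' + name + '>' + escapeXmlValue(value) + '</' + name + '>';
}

console.log(xmlElement('title', 'Tom & Jerry <3'));
// <title>Tom &amp; Jerry &lt;3</title>
```

Keeping the escaping in a plain function rather than on String.prototype is a design choice for the sketch; the replacement chain itself is the same as in the answer.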

Conversely, if you want to decode HTML entities¹, make sure you decode &amp; to & after everything else so that you don't double-decode any entities:

if (!String.prototype.decodeHTML) {
  String.prototype.decodeHTML = function () {
    return this.replace(/&apos;/g, "'")
               .replace(/&quot;/g, '"')
               .replace(/&gt;/g, '>')
               .replace(/&lt;/g, '<')
               .replace(/&amp;/g, '&');
  };
}


¹ Just the basics, not including &copy; to © or other such things.
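
The ordering issue is easiest to see with text that contains a literal entity. A small sketch comparing the two orders (the function names are illustrative and not part of the answer):

```javascript
// Decodes &amp; last, as recommended above.
function decodeAmpLast(s) {
  return s.replace(/&lt;/g, '<').replace(/&gt;/g, '>').replace(/&amp;/g, '&');
}

// Decodes &amp; first: the '&' it produces can combine with the following
// text into a new entity, which is then decoded a second time.
function decodeAmpFirst(s) {
  return s.replace(/&amp;/g, '&').replace(/&lt;/g, '<').replace(/&gt;/g, '>');
}

// The escaped form of the literal text "&lt;"
var encoded = '&amp;lt;';

console.log(decodeAmpLast(encoded));  // &lt;  (the original text)
console.log(decodeAmpFirst(encoded)); // <     (decoded twice)
```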



As far as libraries are concerned, Underscore.js (or Lodash if you prefer) provides an _.escape method to perform this functionality.
                                                                        
                                                        
            
            
              
                
忘了有多久, 2020-11-27 15:22
              
            
            
                                                                       
I originally used the accepted answer in production code and found that it was really slow under heavy use. Here is a much faster solution (it runs at over twice the speed):

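// Note: relies on browser DOM APIs (document, XMLSerializer), which Node.js
// does not provide out of the box.
var escapeXml = (function () {
    var doc = document.implementation.createDocument("", "", null);
    var el = doc.createElement("temp");
    el.textContent = "temp";
    el = el.firstChild; // the text node inside <temp>
    var ser = new XMLSerializer();
    return function (text) {
        // Serializing a text node escapes &, < and > (quotes are left as-is).
        el.nodeValue = text;
        return ser.serializeToString(el);
    };
})();

console.log(escapeXml("<>&")); // &lt;&gt;&amp;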

                                                                        
                                                        
            
            
              
                
无人及你, 2020-11-27 15:27
              
            
            
                                                                       
If some of the text may already be escaped, you could try this, since it will not double-escape existing entities like many other solutions do:

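function escape(text) {
    // Match either a single special character, or an '&' together with the
    // rest of an existing entity (e.g. &amp; or &#39;).
    return String(text).replace(/['"<>]|&(#?\w+;)?/g, (match, entity) => {
        if (entity)
            return match // already an entity, leave it alone

        switch (match) {
            case '\'': return '&apos;'
            case '"': return '&quot;'
            case '<': return '&lt;'
            case '>': return '&gt;'
            case '&': return '&amp;'
        }
    })
}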
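
One trade-off of skipping anything that looks like an entity is worth seeing in an example: the function cannot tell already-escaped input from raw text that happens to contain an entity-like sequence. A minimal sketch, with a hypothetical `escapeUnlessEntity` standing in for the approach:

```javascript
// Hypothetical stand-in for the "skip existing entities" approach:
// an '&' that starts something entity-shaped (e.g. &amp; or &#39;) is kept,
// every other special character is escaped.
function escapeUnlessEntity(text) {
  var map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&apos;' };
  return String(text).replace(/&(?:#?\w+;)?|[<>"']/g, function (match) {
    // Matches longer than one character are existing entities.
    return match.length > 1 ? match : map[match];
  });
}

// Escaping twice gives the same result as escaping once.
var once = escapeUnlessEntity('a < b & c');
console.log(once);                              // a &lt; b &amp; c
console.log(escapeUnlessEntity(once) === once); // true

// But raw text that literally contains "&lt;" is passed through unchanged,
// so an XML parser will later read it as "<".
console.log(escapeUnlessEntity('Write &lt; for a less-than sign'));
// Write &lt; for a less-than sign
```

If the input is known to be raw text, escaping every & unconditionally avoids this ambiguity.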

                                                                        
                                                        
            
            
              
                
长发绾君心, 2020-11-27 15:28
              
            
            
                                                                       
You can use the method below. I have added it to the String prototype for easier access.
I have also used a negative look-ahead so that it won't mess things up if you call the method twice or more.

Usage:

 var original = "Hi&there";
 var escaped = original.EncodeXMLEscapeChars();  //Hi&amp;there


Decoding is handled automatically by the XML parser.

Method:

// String extension to format a string for XML content.
// Replaces XML special characters with their entity equivalents.
String.prototype.EncodeXMLEscapeChars = function () {
    var OutPut = String(this);
    if (OutPut.trim() !== "") {   // native trim(), so jQuery is not required
        OutPut = OutPut.replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;").replace(/'/g, "&#39;");
        // Escape '&' only when it does not already start one of the entities below.
        OutPut = OutPut.replace(/&(?!(amp;)|(lt;)|(gt;)|(quot;)|(#39;)|(apos;))/g, "&amp;");
        OutPut = OutPut.replace(/([^\\])((\\\\)*)\\(?![\\/{])/g, "$1\\\\$2");  // replaces odd backslash(\\) with even.
    }
    else {
        OutPut = "";
    }
    return OutPut;
};
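
The negative look-ahead can be tried in isolation. The sketch below uses a condensed form of the pattern (the variable name `bareAmpersand` is an assumption for illustration) and shows that only the listed entities are recognized:

```javascript
// '&' is escaped only when it is NOT followed by one of the listed entity names.
var bareAmpersand = /&(?!(?:amp|lt|gt|quot|apos|#39);)/g;

console.log('Hi&there'.replace(bareAmpersand, '&amp;'));
// Hi&amp;there

// Running it on already-escaped text changes nothing.
console.log('Hi&amp;there'.replace(bareAmpersand, '&amp;'));
// Hi&amp;there

// Entities outside the list, such as &copy;, are treated as a bare '&'.
console.log('R&D &copy; 2020'.replace(bareAmpersand, '&amp;'));
// R&amp;D &amp;copy; 2020
```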

                                                                        
                                                        
            
            
              
                
终归单人心, 2020-11-27 15:28
              
            
            
                                                                       
Caution: chained regex replacements can go wrong if you have XML inside XML.

Instead, loop over the string once and substitute each special character as you meet it.

That way, you can't run over the same character twice.


function _xmlAttributeEscape(inputString)
{
    var output = [];

    for (var i = 0; i < inputString.length; ++i)
    {
        switch (inputString[i])
        {
            case '&':
                output.push("&amp;");
                break;
            case '"':
                output.push("&quot;");
                break;
            case "<":
                output.push("&lt;");
                break;
            case ">":
                output.push("&gt;");
                break;
            default:
                output.push(inputString[i]);
        }


    }

    return output.join("");
}
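
To make the "XML inside XML" case concrete, here is a short sketch using a single-pass escaper. The name `escapeXmlOnce` and the lookup table are assumptions for illustration; unlike the function above, it also escapes the apostrophe.

```javascript
// Lookup table for the five predefined XML entities.
var XML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&apos;' };

// Hypothetical single-pass escaper: each input character is visited once,
// so text produced by an earlier substitution is never examined again.
function escapeXmlOnce(input) {
  var output = '';
  for (var ch of String(input)) {
    output += Object.prototype.hasOwnProperty.call(XML_ESCAPES, ch) ? XML_ESCAPES[ch] : ch;
  }
  return output;
}

// An XML fragment embedded as the text of another element.
var inner = '<note text="a &amp; b"/>';
var outer = '<wrapper>' + escapeXmlOnce(inner) + '</wrapper>';

console.log(outer);
// <wrapper>&lt;note text=&quot;a &amp;amp; b&quot;/&gt;</wrapper>
```

Here the inner `&amp;` is deliberately escaped again, so a parser reading the wrapper's text gets the original fragment back.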

                                                                        
                                                        
            
            
              
                
暖寄归人, 2020-11-27 15:29
              
            
            
                                                                       
Technically, &, < and > aren't valid characters in an XML name (such as an element or attribute name). If you can't trust the key variable, you should filter them out.
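
A minimal sketch of such a filter, assuming names are restricted to ASCII letters, digits, '_', '-' and '.' (the XML specification allows many more characters, so this is a deliberately conservative subset). The function name `sanitizeXmlName` is hypothetical.

```javascript
// Hypothetical helper: keeps only a conservative ASCII subset of the
// characters XML allows in names, and makes sure the result starts with
// a letter or underscore.
function sanitizeXmlName(key) {
  var name = String(key).replace(/[^A-Za-z0-9_.-]/g, '');
  if (!/^[A-Za-z_]/.test(name)) {
    name = '_' + name;
  }
  return name;
}

console.log(sanitizeXmlName('user<name>')); // username
console.log(sanitizeXmlName('1st&2nd'));    // _1st2nd
```

Silently dropping characters can make two different keys collide, so rejecting invalid keys with an error may be the safer choice in some programs.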

If you want them escaped as HTML entities, you could use something like http://www.strictly-software.com/htmlencode.
                                                                        
                                                        
            
            
              
                