Preventing HTML character entities in locale files from getting munged by Rails 3 XSS protection

遇见更好的自我 2021-02-03 23:58

We're building an app, our first using Rails 3, and we're having to build I18n in from the outset. Being perfectionists, we want real typography to be used in our views: dashe

5 Answers
  • 2021-02-04 00:21

    Well. I bookmarked this question yesterday because of the i18n angle, but didn't answer it as I'm a Python person who's never used Rails. I'm still not going to answer it, but given you aren't being overrun by helpful Railsians who could point you at a good way of getting around Rails' innards, here's my perspective nonetheless.

    First of all, I think it's great that you're thinking about the problem from the outset. That's pretty rare. Second, I completely agree that using raw strings, or singling out strings with entities for special treatment, sounds like a brittle, ugly, bug-prone hack.

    Now if I understand Rails correctly (I read this i18n guide), the YAML files contain the localised strings for each language. In this case, I'd strongly recommend using regular characters in them (in UTF-8). Otherwise, maintaining localizations, or even reading through a translation file (think of languages in non-Latin scripts!), is going to be hell.

    Yeah, it would mean you have to figure out input methods, but the solution is clean and straightforward.
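
    To make this concrete, here is a minimal sketch using only Ruby's standard library. It assumes `ERB::Util.html_escape` is a fair stand-in for the escaping Rails applies in views, and the `signup.tagline` and `signup.legacy` keys are made up for illustration. Literal UTF-8 punctuation passes through escaping unchanged, while entity references get their ampersand escaped:

    ```ruby
    require "yaml"
    require "erb"

    # Hypothetical locale data: one string uses literal UTF-8 punctuation,
    # the other uses HTML character entities.
    locale_yaml = <<~YAML
      en:
        signup:
          tagline: "Sign up – it’s free…"
          legacy: "Sign up &ndash; it&rsquo;s free&hellip;"
    YAML

    translations = YAML.safe_load(locale_yaml)
    tagline = translations.dig("en", "signup", "tagline")
    legacy  = translations.dig("en", "signup", "legacy")

    # Escaping only touches &, <, >, " and ', so the UTF-8 string is unchanged.
    escaped_tagline = ERB::Util.html_escape(tagline)
    # In the entity version every "&" becomes "&amp;".
    escaped_legacy = ERB::Util.html_escape(legacy)

    puts escaped_tagline
    puts escaped_legacy
    ```

    The first string needs no `raw` or `html_safe` at all, which is the point of keeping real characters in the locale file.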

  • 2021-02-04 00:25

    If you don't want to risk a mistake by simply adding .html_safe to everything (through alias_method_chain or whatever), the best solution is to use it only where it is necessary.

    On our site we use a markup language to get HTML output from the i18n locale files, since the people who translate those files are translators, not developers.

    If there are only a few places where you need your HTML to really be HTML, use .html_safe:

    t('views.signup.organisation_details').html_safe
    

    The simple markup language we have works pretty well for us, but that is really case-specific :)
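
    The answer does not say which markup language the site uses, so the following is purely a hypothetical sketch of the idea: translators write `*emphasis*`, the string is escaped first, and only then is the marker turned into a tag. `render_markup` is an invented name, and in a Rails view the result would still need `.html_safe`.

    ```ruby
    require "erb"

    # Hypothetical mini-markup: *text* becomes <strong>text</strong>.
    # Escaping happens first, so any HTML a translator types stays inert.
    def render_markup(text)
      escaped = ERB::Util.html_escape(text)
      escaped.gsub(/\*([^*]+)\*/, '<strong>\1</strong>')
    end

    puts render_markup("Enter your *organisation* details <here>")
    ```

    Because the only tags in the output are the ones the converter itself emits, marking the result as safe does not trust anything the translator typed.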

  • 2021-02-04 00:29

    I don't think it is a good idea to use "raw"; you could try a YAML string like this:

    en:
      hello:
        This generates a text paragraph for HTML. " " à @ ' All this text, which you can find in
        these lines, is being concatenated together to one single text node, and then put
        into the body of the <p> ... </p> tag. ↂↀऊᎣᏍᏮ⁜℺℻⊛⍟⎬⎨⏏♞♝⚫⚬✱✰✭❺❻➣➱➲⬡⬕
    

    Generated HTML:

    This generates a text paragraph for HTML. &quot; &quot; à @ ' All this text, which you can find in these lines, is being concatenated together to one single text node, and then put into the body of the &lt;p&gt; ... &lt;/p&gt; tag. ↂↀऊᎣᏍᏮ⁜℺℻⊛⍟⎬⎨⏏♞♝⚫⚬✱✰✭❺❻➣➱➲⬡⬕
    

    Browser view:

    This generates a text paragraph for HTML. " " à @ ' All this text, which you can find in these lines, is being concatenated together to one single text node, and then put into the body of the <p> ... </p> tag. ↂↀऊᎣᏍᏮ⁜℺℻⊛⍟⎬⎨⏏♞♝⚫⚬✱✰✭❺❻➣➱➲⬡⬕
    
  • 2021-02-04 00:29

    Are you aware of the html_safe method that can be used in helpers? I am not sure I fully understand the problem, since I have never worked with I18n, but would it be possible to use a custom helper that decides whether the characters should be escaped? If they should not be escaped, it returns "string".html_safe; if they should, it returns "string".

    Or possibly override the "t" helper and add your escaping conditions plus .html_safe.
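
    One way to sketch such a helper with only the standard library: escape everything, then restore just the character entity references. The name `escape_keeping_entities` is hypothetical, and in Rails the returned string would still have to be marked with `.html_safe` before rendering.

    ```ruby
    require "erb"

    # Escape the whole string, then un-escape only entity references such as
    # &ndash; (named), &#8212; (decimal) or &#x2014; (hexadecimal).
    # Tags and bare ampersands stay escaped.
    def escape_keeping_entities(text)
      escaped = ERB::Util.html_escape(text)
      escaped.gsub(/&amp;(#\d+|#x\h+|[A-Za-z][A-Za-z0-9]*);/, '&\1;')
    end

    puts escape_keeping_entities("Dashes &ndash; and <b>tags</b> & more")
    ```

    The trade-off of this approach is that a translator can no longer display the literal text `&ndash;`; whether that matters depends on the content.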

  • 2021-02-04 00:35

    There is a ticket in Lighthouse for this problem, and the resolution is to append _html to the i18n key in the locales/xx.yml file and use the t alias [1] to denote an html_safe string. For example:

    en:
      hello: "This is a string with an accent: &oacute;"
    

    becomes:

    en:
      hello_html: "This is a string with an accent: &oacute;"
    

    And it would create the following output:

    This is a string with an accent: ó

    This would prevent you from having to write raw t('views.signup.organisation_details') and would result in a cleaner output of: t('views.signup.organisation_details_html'). And while exchanging raw for _html doesn't seem like the greatest of trades, it does make things clear that you're outputting what is assumed to be an html_safe string.


    [1] I've tested the code suggested in the Lighthouse ticket. I found that you have to use the t alias specifically. If you use I18n.t or I18n.translate, the translation is not treated as html_safe despite the _html suffix:

    I18n.t('hello_html') 
    I18n.translate('hello_html') 
    # Produces => "This is a string with an accent: &oacute;"
    
    t('hello_html')      
    # Produces => "This is a string with an accent: ó"
    

    I don't think this is the intended behavior per the RoR TranslationHelper documentation.
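
    The key-name convention itself is simple to model. The sketch below is not Rails' implementation, only a standard-library approximation under the assumption that the view helper treats keys ending in `_html` (or named `html`) as already safe and escapes everything else. `html_key?` and `view_t` are invented names. It would also be consistent with the observation above if the marking happens in the view helper rather than in `I18n.t` itself.

    ```ruby
    require "erb"

    # True when the last segment of a dotted key is "html" or ends in "_html".
    def html_key?(key)
      last = key.to_s.split(".").last
      last == "html" || last.end_with?("_html")
    end

    # Hypothetical view-side lookup: escape unless the key opts in to raw HTML.
    def view_t(translations, key)
      value = translations.fetch(key)
      html_key?(key) ? value : ERB::Util.html_escape(value)
    end

    translations = {
      "hello"      => "This is a string with an accent: &oacute;",
      "hello_html" => "This is a string with an accent: &oacute;",
    }

    puts view_t(translations, "hello")
    puts view_t(translations, "hello_html")
    ```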
