Why do Arabic characters behave as separate characters when styling a single Arabic character?

心在旅途 2020-12-31 06:56

Basically, what I am trying to build is a highlighter for misused Arabic characters.

To make it easy to understand, I will try to explain a similar functionality but f

6 answers
  • 2020-12-31 07:27

    This is a longstanding bug in WebKit browsers (Chrome, Safari): HTML markup breaks joining behavior. Explicit use of ZWJ (zero-width joiner) used to help (see question Partially colored Arabic word in HTML), but it seems that the bug has become worse.

    As a clumsy (but probably the only) workaround, you could use contextual forms for Arabic letters. This can be tested first using just static HTML markup and CSS, e.g.

    بطﻠ<span style="color:red">ﺔ</span>
    

    Here I am using, inside the span element, ﺔ U+FE94 ARABIC LETTER TEH MARBUTA FINAL FORM instead of the normal U+0629 ARABIC LETTER TEH MARBUTA and ﻠ U+FEE0 ARABIC LETTER LAM MEDIAL FORM instead of U+0644 ARABIC LETTER LAM.

    To implement this in JavaScript, whenever you insert markup into a word of Arabic letters, you would need to change the characters immediately before and after the break (caused by the markup) to their initial, medial, or final presentation forms, according to their positions in the word.
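
    The substitution described above might be sketched as follows. This is a minimal illustration, not a complete shaper: the `FORMS` table covers only a handful of letters, and the names `FORMS`, `contextualForm`, and `highlightLetter` are hypothetical helpers introduced here. It assumes the word contains only base letters (no diacritics or lam-alef ligatures).

    ```javascript
    // Arabic Presentation Forms-B for a few letters only.
    // ASSUMPTION: partial, illustrative table; a real tool needs every letter.
    // ini/med are null for letters that never join to the following letter.
    const FORMS = {
      "\u0627": { iso: "\uFE8D", fin: "\uFE8E", ini: null, med: null },         // alef
      "\u0628": { iso: "\uFE8F", fin: "\uFE90", ini: "\uFE91", med: "\uFE92" }, // beh
      "\u0629": { iso: "\uFE93", fin: "\uFE94", ini: null, med: null },         // teh marbuta
      "\u0637": { iso: "\uFEC1", fin: "\uFEC2", ini: "\uFEC3", med: "\uFEC4" }, // tah
      "\u0644": { iso: "\uFEDD", fin: "\uFEDE", ini: "\uFEDF", med: "\uFEE0" }, // lam
      "\u0647": { iso: "\uFEE9", fin: "\uFEEA", ini: "\uFEEB", med: "\uFEEC" }, // heh
      "\u064A": { iso: "\uFEF1", fin: "\uFEF2", ini: "\uFEF3", med: "\uFEF4" }, // yeh
    };

    // Pick the form of word[i] from its neighbours in the unbroken word.
    function contextualForm(word, i) {
      const entry = FORMS[word[i]];
      if (!entry) return word[i]; // character not in the table: leave untouched
      const prev = FORMS[word[i - 1]];
      const next = FORMS[word[i + 1]];
      const joinsPrev = Boolean(prev && prev.ini);  // previous letter connects forward
      const joinsNext = Boolean(entry.ini && next); // this letter connects forward
      if (joinsPrev && joinsNext) return entry.med;
      if (joinsPrev) return entry.fin;
      if (joinsNext) return entry.ini;
      return entry.iso;
    }

    // Wrap word[index] in a red span and reshape the letters touching the break.
    function highlightLetter(word, index) {
      const shaped = word.split("");
      for (const i of [index - 1, index, index + 1]) {
        if (i >= 0 && i < word.length) shaped[i] = contextualForm(word, i);
      }
      return (
        shaped.slice(0, index).join("") +
        '<span style="color:red">' + shaped[index] + "</span>" +
        shaped.slice(index + 1).join("")
      );
    }
    ```

    For example, `highlightLetter("\u0628\u0637\u0644\u0629", 3)` produces the same markup as the static example above: the lam becomes U+FEE0 and the highlighted teh marbuta becomes U+FE94.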

  • 2020-12-31 07:27

    I know this solution is not very elegant, but it kind of works, so tell me what you think:

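    <script>
    // Highlight the first "t" of the English word in red.
    function check1() {
        var word = document.getElementById('englishWord').value;
        document.getElementById('englishanswer').innerHTML =
            word.replace(/t/, '<span style="color:red">T</span>');
    }
    // Replace the first heh (U+0647) with a red tatweel + teh marbuta
    // (U+0640 U+0629), then show the plain corrected word on a second line.
    function check2() {
        var word = document.getElementById('arabicWord').value;
        document.getElementById('arabicanswer').innerHTML =
            word.replace(/\u0647/, '<span style="color:red">\u0640\u0629</span>') +
            '<br>' + word.replace(/\u0647/, '\u0629');
    }
    </script>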
    
    <fieldset>
    <legend>English:</legend>
    <input id='englishWord' value='test'/>
    <input type='submit' value='Check' onclick='check1()'/>
    <p id='englishanswer'></p>
    </fieldset>
    
    <fieldset style="direction:rtl">
    <legend>عربي</legend>
    <input id='arabicWord' value='بطلـه'/>
    <input type='submit' value='Check' onclick='check2()'/>
    <p id='arabicanswer'></p>
    </fieldset>
    
  • 2020-12-31 07:31

    Instead of using span, use the HTML5 ruby element and add the Arabic tatweel character "ـ" (U+0640), the character that extends letters (Shift+J).

    So your code becomes:

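    function check2() {
        // \u0640 is the tatweel, \u0629 the teh marbuta; one tatweel goes
        // before the ruby element and one inside it.
        arabicanswer.innerHTML =
            arabicWord.value.replace(/\u0647/,
                '\u0640<ruby style="color:red"> \u0640\u0629</ruby>') +
            '<br>' + arabicWord.value.replace(/\u0647/, '\u0629');
    }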
    

    and here is an updated fiddle: http://jsfiddle.net/fjz5C/28/

  • 2020-12-31 07:33

    As Jukka K. Korpela indicated, this is a bug in most WebKit-based browsers (Chrome, Safari, etc.).

    A simple hack, other than the TAMDEED (tatweel) character or substituting contextual forms for Arabic letters, is to put a zero-width joiner (&zwj; or &#x200d;) before and after the tag boundary, so that the letters on either side keep the joined forms they would have in the unbroken word, e.g.

    <p>عرب&#x200d;<span style="color: Red;">&#x200d;ي</span></p>  
    

    demo: jsfiddle
    see also the webkit bug report.
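
    A rough sketch of doing the same insertion from JavaScript is below. The function name `highlightWithZwj` is hypothetical, and the sketch assumes the letters on both sides of each tag boundary really do join in the unbroken word; it has no joining table, so a ZWJ placed next to a non-joining letter would shape the neighbour wrongly. Whether a given browser version honours the ZWJ across markup is not guaranteed.

    ```javascript
    const ZWJ = "\u200D"; // ZERO WIDTH JOINER

    // Wrap word[index] in a red span and put a ZWJ on both sides of every
    // tag boundary that has a letter on its other side.
    // ASSUMPTION: the letters across each boundary join in the plain word.
    function highlightWithZwj(word, index) {
      const before = word.slice(0, index);
      const after = word.slice(index + 1);
      const lead = before ? ZWJ : "";  // boundary at the opening tag
      const trail = after ? ZWJ : "";  // boundary at the closing tag
      return (
        before + lead +
        '<span style="color: red;">' + lead + word[index] + trail + "</span>" +
        trail + after
      );
    }
    ```

    Calling `highlightWithZwj("\u0639\u0631\u0628\u064A", 3)` yields markup equivalent to the static example above, with one ZWJ before the span and one at the start of its content.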

  • 2020-12-31 07:39

    I would try adding a tatweel to the character before and after. It won't actually fix the problem, but it will make it difficult to notice, since it will force the lam into medial form and the taa marbuta into final form. If it works, that would be a lot less brittle than actually converting the letters to their medial or final forms.

    You seem to have other problems, though. I went to your website and put in a misspelling of hadha, just to see what it would do with it, and it caused the ha to disconnect in both words, which doesn't make sense if the only problem is the formatting tags. (I'm using Firefox on a Mac.)


    Good luck!

  • 2020-12-31 07:41

    You should take care of the initial, medial, final, and isolated forms of each character. The complete list is available here

    Use U+FE94 (the final form) instead of U+0629:

    arabicWord.value.replace(/\u0647/, '<span style="color:red">\uFE94</span>') +
    