Yielding spellchecker

让人想犯罪 __ submitted on 2019-12-03 06:03:06

Your suggested approach (separating each word in a span and storing additional data in it) seems at first glance to be the most sensible one. On the editor level, you just need to ensure all text is inside some span, and that each span contains only a single word (splitting it if necessary). On the word level, just listen for changes in the spans (binding input and propertyChange) and act according to each span's class/data.

However, the real pain is keeping the caret position consistent. When you change the contents of either a textarea or an element with contentEditable, the caret moves rather unpredictably, and there's no easy (cross-browser) way of keeping track of it. I searched for solutions both here at SO and elsewhere, and the simplest working solution I found was this blog post. Unfortunately, it only applies to textarea, so the "each word in a span" solution couldn't be used with it.
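
For a plain textarea, reading and restoring the caret can be sketched as below. This is not the code from the blog post: `readCaret` and `placeCaret` are hypothetical names, and the sketch assumes a browser where the field exposes `selectionStart` and `setSelectionRange` (older versions of IE needed a different approach).

```javascript
// Hypothetical stand-ins for the caret helpers used later in this answer.
// Assumption: `field` is a <textarea> or text <input> that supports
// selectionStart and setSelectionRange.
function readCaret(field) {
    // selectionStart is the offset of the caret (or of the selection start)
    return field.selectionStart;
}

function placeCaret(field, position) {
    field.focus();
    // A collapsed range (start === end) places the caret without selecting text
    field.setSelectionRange(position, position);
}
```

The usual pattern is to call `readCaret` before replacing the field's value and `placeCaret` afterwards, adjusting the position if the text length changed.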

So, I suggest the following approach:

  • Keep a list of words in an Array, where each word stores both the current value and the original;
  • When the contents of the textarea changes, keep the set of unchanged words and redo the rest;
  • Only apply the spell check if the caret is just after a non-word character (room for improvement) and you're not hitting backspace;
  • If the user was unsatisfied with the correction, hitting backspace once will undo it, and it won't be checked again unless modified.
    • If many corrections were done at once (for instance, if a lot of text was copy-pasted), each backspace will undo one correction until none are left.
    • Hitting any other key will commit the correction, so if the user is still unsatisfied he'll have to go back and change it again.
    • Note: unlike the OP's requirements, the changed version will be autocorrected again if the user inputs a non-word character; he'll need to hit backspace once to "protect" it.
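
The undo rules above can be sketched without any DOM code. This is only an illustrative model: `makeWord` and `undoLast` are hypothetical helpers, and the sample words and corrections are made up.

```javascript
// Hypothetical model: each entry keeps the text currently shown and the text
// the user originally typed, so a correction can be reverted later.
function makeWord(typed, corrected) {
    return { word: corrected, original: typed };
}

// Revert the most recent correction; returns the index undone, or -1 if none.
function undoLast(words, stack) {
    if (stack.length === 0) return -1;
    var index = stack.pop();
    words[index].word = words[index].original;
    return index;
}

// Example: two words were corrected at once (e.g. after a paste).
var words = [makeWord("teh", "the"), makeWord("cat", "cat"), makeWord("adn", "and")];
var stack = [0, 2]; // indices of corrected words, oldest first

undoLast(words, stack); // first backspace reverts "and" back to "adn"
undoLast(words, stack); // second backspace reverts "the" back to "teh"
```

Keeping the corrections on a stack is what makes each backspace undo exactly one of them, most recent first.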

I created a simple proof-of-concept at jsFiddle. Details below. Note that you can combine it with other approaches (for instance, detecting a "down arrow" key and displaying a menu with some auto-correcting options).


Steps of the proof-of-concept explained in detail:

  • Keep a list of words in an Array, where each word stores both the current value and the original;

    var words = [];
    

    This regex splits the text into words (each word has a word property and an sp one; the latter stores the non-word characters immediately following it):

    delimiter:/^(\w+)(\W+)(.*)$/,
    ...
    regexSplit:function(regex,text) {
        var ret = [];
        for ( var match = regex.exec(text) ; match ; match = regex.exec(text) ) {
            ret.push({
                word:match[1],
                sp:match[2],
                length:match[1].length + match[2].length
            });
            text = match[3];
        }
        if ( text )
            ret.push({word:text, sp:'', length:text.length});
        return ret;
    }
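
To see what the split produces, here is a standalone copy of the same logic outside the plugin namespace, run on a made-up sample string.

```javascript
// Standalone copy of the splitting logic shown above
var delimiter = /^(\w+)(\W+)(.*)$/;

function regexSplit(regex, text) {
    var ret = [];
    for (var match = regex.exec(text); match; match = regex.exec(text)) {
        ret.push({
            word: match[1],
            sp: match[2],
            length: match[1].length + match[2].length
        });
        text = match[3]; // continue with the remainder of the text
    }
    // Whatever could not be matched becomes a final entry with no separator
    if (text) ret.push({ word: text, sp: "", length: text.length });
    return ret;
}

var parts = regexSplit(delimiter, "hello world, foo");
// parts[0] -> { word: "hello", sp: " ",  length: 6 }
// parts[1] -> { word: "world", sp: ", ", length: 7 }
// parts[2] -> { word: "foo",   sp: "",   length: 3 }
```

One thing to be aware of: because the pattern is anchored on a leading word character, text that starts with a non-word character (a leading space, for example) is not split and ends up as a single trailing entry.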
    
  • When the contents of the textarea changes, keep the set of unchanged words and redo the rest;

        // Split all the text
        var split = $.autocorrect.regexSplit(options.delimiter, $this.val());
        // Find unchanged words in the beginning of the field
        var start = 0;
        while ( start < words.length && start < split.length ) {
            if ( !words[start].equals(split[start]) )
                break;
            start++;
        }
        // Find unchanged words in the end of the field
        var end = 0;
        while ( 0 < words.length - end && 0 < split.length - end ) {
            if ( !words[words.length-end-1].equals(split[split.length-end-1]) ||
                 words.length-end-1 < start )
                break;
            end++;
        }
        // Autocorrects words in-between
        var toSplice = [start, words.length-end - start];
        for ( var i = start ; i < split.length-end ; i++ )
            toSplice.push({
                word:check(split[i], i),
                sp:split[i].sp,
                original:split[i].word,
                equals:function(w) {
                    return this.word == w.word && this.sp == w.sp;
                }
            });
        words.splice.apply(words, toSplice);
        // Updates the text, preserving the caret position
        updateText();
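
The prefix/suffix comparison can be tried in isolation. The sketch below is a variation, not the plugin code: `sameWord`, `unchangedBounds` and `w` are hypothetical helpers, and the suffix scan here is bounded against the prefix in both lists.

```javascript
// Entries are compared by their word and sp fields, like the equals method above
function sameWord(a, b) {
    return a.word === b.word && a.sp === b.sp;
}

// Counts how many entries match at the start and at the end of two word lists
function unchangedBounds(oldWords, newWords) {
    var start = 0;
    while (start < oldWords.length && start < newWords.length &&
           sameWord(oldWords[start], newWords[start])) {
        start++;
    }
    var end = 0;
    // Stop before the suffix would overlap the prefix in either list
    while (end < oldWords.length - start && end < newWords.length - start &&
           sameWord(oldWords[oldWords.length - end - 1],
                    newWords[newWords.length - end - 1])) {
        end++;
    }
    return { start: start, end: end };
}

function w(word, sp) { return { word: word, sp: sp }; }

var before = [w("the", " "), w("quick", " "), w("fox", "")];
var after = [w("the", " "), w("quikc", " "), w("brown", " "), w("fox", "")];
var bounds = unchangedBounds(before, after);
// bounds.start is 1 and bounds.end is 1, so only after[1] and after[2]
// need to be checked and spliced in place of before[1]
```

Only the words between the two bounds go through the checker, which keeps the work proportional to the size of the edit rather than to the whole text.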
    
  • Only apply the spell check if the caret is just after a non-word character (room for improvement) and you're not hitting backspace;

    var caret = doGetCaretPosition(this);
    var atFirstSpace = caret >= 2 &&
                       /\w\W/.test($this.val().substring(caret-2,caret));
    function check(word, index) {
        var w = (atFirstSpace && !backtracking ) ?
                options.checker(word.word) :
                word.word;
        if ( w != word.word )
            stack.push(index); // stack stores a list of auto-corrections
        return w;
    }
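
The caret test is easy to check on its own. `justFinishedWord` below is a hypothetical wrapper around the same expression used for atFirstSpace, with made-up inputs.

```javascript
// True when the two characters before the caret are a word character followed
// by a non-word character, i.e. the user has just finished typing a word.
function justFinishedWord(text, caret) {
    return caret >= 2 && /\w\W/.test(text.substring(caret - 2, caret));
}

justFinishedWord("teh ", 4);  // true:  "h " precedes the caret
justFinishedWord("teh  ", 5); // false: two spaces precede the caret
justFinishedWord("teh", 3);   // false: the caret is still inside the word
```

This is why a correction fires on the first space or punctuation mark after a word, but not on further spaces typed after it.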
    
  • If the user was unsatisfied with the correction, hitting backspace once will undo it, and it won't be checked again unless modified.

    $(this).keydown(function(e) {
        if ( e.which == 8 ) {
            if ( stack.length > 0 ) {
                var last = stack.pop();
                words[last].word = words[last].original;
                updateText(last);
                return false;
            }
            else
                backtracking = true;
            stack = [];
        }
    });
    
  • The code for updateText simply joins all words again into a string and sets the value back to the textarea. The caret is preserved if nothing was changed, or placed just after the last autocorrection done/undone, to account for changes in the text length:

    function updateText(undone) {
        var caret = doGetCaretPosition(element);
        var text = "";
        for ( var i = 0 ; i < words.length ; i++ )
            text += words[i].word + words[i].sp;
        $this.val(text);
        // If a word was autocorrected, put the caret right after it
        if ( stack.length > 0 || undone !== undefined ) {
            var last = undone !== undefined ? undone : stack[stack.length-1];
            caret = 0;
            for ( var i = 0 ; i < last ; i++ )
                caret += words[i].word.length + words[i].sp.length;
            caret += words[last].word.length + 1;
        }
        setCaretPosition(element,caret);
    }
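
The caret arithmetic from updateText, pulled out into a hypothetical `caretAfterWord` function so it can be checked against a small made-up word list:

```javascript
// Offset just past the word at index `last`, plus one for the separator
// character that follows it (the same arithmetic as in updateText above).
function caretAfterWord(words, last) {
    var caret = 0;
    for (var i = 0; i < last; i++) {
        caret += words[i].word.length + words[i].sp.length;
    }
    return caret + words[last].word.length + 1;
}

var sample = [
    { word: "the", sp: " " },
    { word: "quick", sp: ", " },
    { word: "fox", sp: "" }
];
caretAfterWord(sample, 1); // 10: "the " is 4 characters, "quick" is 5, plus 1
```

The `+ 1` assumes a single separator character was just typed. With a longer separator such as ", " the caret lands after its first character.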
    
  • The final plugin structure:

    $.fn.autocorrect = function(options) {
        options = $.extend({
            delimiter:/^(\w+)(\W+)(.*)$/,
            checker:function(x) { return x; }
        }, options);
        return this.each(function() {
            var element = this, $this = $(this);
            var words = [];
            var stack = [];
            var backtracking = false;
            function updateText(undone) { ... }
            $this.bind("input propertyChange", function() {
                stack = [];
                // * Only apply the spell check if the caret...
                // * When the contents of the `textarea` changes...
                backtracking = false;
            });
            // * If the user was unsatisfied with the correction...
        });
    };
    $.autocorrect = {
        regexSplit:function(regex,text) { ... }
    };
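
The default checker returns each word unchanged, so the plugin needs a real one passed in through the checker option. A minimal sketch, assuming a tiny hand-written correction table (`corrections` and `simpleChecker` are hypothetical; a real checker would consult a dictionary):

```javascript
// Hypothetical correction table, for illustration only
var corrections = { teh: "the", adn: "and", recieve: "receive" };

function simpleChecker(word) {
    var key = word.toLowerCase();
    // Return the replacement if one is known, otherwise the word unchanged.
    // hasOwnProperty avoids matching inherited keys such as "constructor".
    return Object.prototype.hasOwnProperty.call(corrections, key) ?
           corrections[key] :
           word;
}

// In a page with jQuery and the plugin loaded, it would be wired up like this:
// $("textarea").autocorrect({ checker: simpleChecker });
```

Note that this simple lookup does not preserve capitalization: "Teh" comes back as "the".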
    

Assuming you're only submitting the word left of the caret, could you disable the spellchecker until a whitespace character is typed or the textbox caret is moved?

I'm not sure if that's the kind of answer you wanted.
