parse natural language

粉色の甜心 2021-02-06 17:17

To start: I know this system will have flaws!

NOTE: I'm adding a few other languages because I don't find this problem specific to PHP.

3 answers
  •  予麋鹿 (original poster)
    2021-02-06 17:47

    This is certainly not the most efficient solution, but here is one. You can definitely improve it, for example by caching the regular expressions, but you get the idea. The last item in every sub-array is the operation.


    var s = 'Turn my kitchen lights on and my bedroom lights on and living room lights off and my test and another test off',
        r = s.replace(/^Turn|\s*my/g, '').match(/.+? (on|off)/g).map(function(item) {
            var items = item.trim().replace(/^and\s*/, '').split(/\s*and\s*/),
                last = items.pop().split(' '),
                op = last.pop();
            return items.concat([last.join(' '), op]);
        });
    
    console.log(r);
    

    Mind explaining the logic you used? I'm reading the code, but I was curious whether you could put it into words.

    The logic is quite simple actually, perhaps too simple:

    var s = 'Turn my kitchen lights on and my bedroom lights on and living room lights off and my test and another test off',
        r = s
            .replace(/^Turn|\s*my/g, '') //remove noisy words
            .match(/.+? (on|off)/g) //capture all groups of [some things][on|off]
            //for each of those groups, generate a new array from the returned results
            .map(function(item) {
                var items = item.trim()
                        .replace(/^and\s*/, '') //remove and[space] at the beginning of string
                        //split on and to get all things, for instance if we have
                        //test and another test off, we want ['test', 'another test off']
                        .split(/\s*and\s*/),
                    //split the last item on spaces, with previous example we would get
                    //['another', 'test', 'off']
                    last = items.pop().split(' '),
                    op = last.pop(); //on/off will always be the last item in the array, pop it
                //items now contains ['test'], concatenate with the array passed as argument
                return items.concat(
                    [
                        //last is ['another', 'test'], rejoin it together to give 'another test'
                        last.join(' '),
                        op //this is the operation
                    ]
                );
            });
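
One weakness of the steps above is that the patterns also match inside longer words: `and` splits "stand", and ` on` matches the start of " one". The sketch below wraps the same pipeline in a function and adds `\b` word boundaries. The function name `parseCommands` and the `\b` tweaks are assumptions added for illustration, not part of the original answer.

```javascript
// Hypothetical helper: same steps as the answer's pipeline, with \b word
// boundaries added so that only whole words are treated as keywords.
function parseCommands(sentence) {
    var groups = sentence
        .replace(/^Turn\b|\s*\bmy\b/g, '')   // remove noisy words (whole words only)
        .match(/.+? (on|off)\b/g) || [];     // groups of [some things][on|off]; [] if none

    return groups.map(function (item) {
        var items = item.trim()
                .replace(/^and\b\s*/, '')    // drop a leading "and"
                .split(/\s*\band\b\s*/),     // split on the whole word "and"
            last = items.pop().split(/\s+/), // e.g. ['another', 'test', 'off']
            op = last.pop();                 // the operation is always the last word
        return items.concat([last.join(' '), op]);
    });
}

console.log(parseCommands('Turn my stand lights on'));
// [ [ 'stand lights', 'on' ] ]
```

With the original patterns, the same input would be split into 'st' and 'lights' because of the "and" inside "stand".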
    

    EDIT: At the time I posted the answer, I hadn't realized how complex and flexible you needed this to be. The solution I provided only works for sentences structured like my example, with identifiable noisy words and a specific command order. For anything more complex, you will have no choice but to write a parser, as @SpaceDog suggested. I will try to come up with something as soon as I have enough time.
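
As a rough starting point for the parser route, here is a minimal token-based sketch. It is not necessarily what @SpaceDog had in mind; the vocabulary lists and the names `NOISE`, `STATES`, `tokenize` and `parse` are all hypothetical. It still assumes the state word (on/off) comes after its targets, but each word is classified individually, so new word classes can be added without rewriting regular expressions.

```javascript
// Hypothetical vocabulary; extend these lists to suit the application.
var NOISE = ['turn', 'my', 'the', 'please'];
var STATES = ['on', 'off'];

// Lowercase the sentence and split it into alphanumeric words.
function tokenize(sentence) {
    return sentence.toLowerCase().match(/[a-z0-9]+/g) || [];
}

function parse(sentence) {
    var commands = [],  // finished { targets, state } objects
        targets = [],   // targets collected for the current command
        words = [];     // words of the target currently being read

    // Close the target being read, if any, and queue it.
    function flushTarget() {
        if (words.length) {
            targets.push(words.join(' '));
            words = [];
        }
    }

    tokenize(sentence).forEach(function (token) {
        if (NOISE.indexOf(token) !== -1) {
            return;                         // ignore noise words
        }
        if (token === 'and') {
            flushTarget();                  // "and" separates targets
            return;
        }
        if (STATES.indexOf(token) !== -1) {
            flushTarget();                  // a state word ends a command
            if (targets.length) {
                commands.push({ targets: targets, state: token });
            }
            targets = [];
            return;
        }
        words.push(token);                  // anything else belongs to a target
    });

    return commands;                        // trailing words without a state are dropped
}

console.log(parse('Turn my test and another test off'));
// [ { targets: [ 'test', 'another test' ], state: 'off' } ]
```

Handling other word orders (for example "turn off the kitchen lights") would need an extra rule for a state word that appears before its targets.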
