To start: I know this system will have flaws!
NOTE: I'm adding a few other languages because I don't find this problem specific to PHP.
Parsing natural language is non-trivial; if you want a true natural language parser, I'd recommend that you try to use an existing project or library. Here's a web-based parser, based on the Stanford Parser. Or Wikipedia is a good jumping-off point.
Having said that, if you're willing to restrict the syntax and the keywords involved you might be able to simplify it. First you need to know what's important -- you have 'things' (lights, fan) in 'places' (bedroom, kitchen) that need to go into a specific state ('on', 'off').
I would get the string into an array of words, either using strtok or just explode on ' '.
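As a rough JavaScript equivalent of that splitting step (the function name tokenize is just for illustration, and lowercasing is my own assumption so that later keyword matching is case-insensitive):

```javascript
// Split a command into lowercase words, dropping empty entries
// caused by leading, trailing, or repeated whitespace.
function tokenize(sentence) {
  return sentence
    .toLowerCase()
    .split(/\s+/)
    .filter(function (word) { return word.length > 0; });
}

console.log(tokenize('Turn my kitchen lights on'));
// → [ 'turn', 'my', 'kitchen', 'lights', 'on' ]
```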
Now that you have an array of words, start at the end and go backwards looking for a 'state' -- on or off. Then follow that backwards looking for a 'thing', and finally a 'place'. If you hit another state then you can start again.
Let me try and do that in pseudocode:
    // array of words is inArray
    currentPlace = null;
    currentThing = null;
    currentState = null;
    for (i = (inArray.length - 1); i >= 0; i--) {
        word = inArray[i];
        if (isState(word)) {
            currentState = word;
            currentPlace = null;
            currentThing = null;
        } else if (currentState) {
            if (isThing(word)) {
                currentThing = word;
                currentPlace = null;
            } else if (currentThing) {
                if (isPlace(word)) {
                    currentPlace = word;
                    // Apply currentState to currentThing in currentPlace
                }
                // skip non-place, thing or state word.
            }
            // Skip when we don't have a thing to go with our state
        }
        // Skip when we don't have a current state and we haven't found a state
    }
And, having written that, it's pretty clear that it should have used a state machine and switch statements -- which goes to show I should have designed it on paper first. If you get any more complex, you'll want to use a state machine to implement the logic -- states would be 'lookingForState', 'lookingForThing', etc.
Also, you don't really need currentPlace as a variable, but I'll leave it as it makes the logic clearer.
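Here is a minimal sketch of what that state-machine version might look like in JavaScript. The keyword lists and the names STATES, THINGS, PLACES and parseBackwards are placeholders I made up for illustration, and the third mode name 'lookingForPlace' is my guess at what the 'etc' would be; treat it as a starting point rather than a finished implementation.

```javascript
// Hypothetical keyword lists; replace with your own vocabulary.
var STATES = ['on', 'off'];
var THINGS = ['lights', 'fan'];
var PLACES = ['kitchen', 'bedroom'];

// Scan the words backwards and collect {place, thing, state} actions.
function parseBackwards(words) {
  var mode = 'lookingForState';
  var state = null;
  var thing = null;
  var actions = [];

  for (var i = words.length - 1; i >= 0; i--) {
    var word = words[i];

    // A state word always starts a new clause, whatever mode we are in.
    if (STATES.indexOf(word) !== -1) {
      state = word;
      thing = null;
      mode = 'lookingForThing';
      continue;
    }

    switch (mode) {
      case 'lookingForState':
        break; // nothing to attach this word to yet
      case 'lookingForThing':
        if (THINGS.indexOf(word) !== -1) {
          thing = word;
          mode = 'lookingForPlace';
        }
        break;
      case 'lookingForPlace':
        if (THINGS.indexOf(word) !== -1) {
          thing = word; // a new thing under the same state
        } else if (PLACES.indexOf(word) !== -1) {
          actions.push({ place: word, thing: thing, state: state });
        }
        break;
    }
  }
  return actions;
}

console.log(parseBackwards('turn my kitchen lights on and my bedroom lights off'.split(' ')));
// → [ { place: 'bedroom', thing: 'lights', state: 'off' },
//     { place: 'kitchen', thing: 'lights', state: 'on' } ]
```

Because the scan runs from the end of the sentence, the actions come out in reverse order of how they were spoken.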
EDIT
If you want to support 'turn the lights in the bedroom on' you'll need to adjust the logic (you need to save the 'place' when you don't have a thing yet). If you also want to support 'turn on the lights in the bedroom' you'll need to go even further.
Thinking about it, I wonder if you can just do:
    have a currentState variable and arrays for currentPlaces and currentThings
    for each word
        if it's a state:
            store it in currentState
        if it's a thing, or place:
            add it to the appropriate array
        if currentState is set and there is content in currentPlaces and currentThings:
            apply currentState to all currentThings in all currentPlaces
That's not quite there, but one of those implementations might give you a starting point.
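For what it's worth, here is a rough JavaScript sketch of that second idea. The keyword lists and the name parseForward are made up for illustration, and clearing everything after applying an action is my own assumption (the pseudocode doesn't say when to reset), so this inherits the "not quite there" caveat.

```javascript
// Hypothetical keyword lists; replace with your own vocabulary.
var STATES = ['on', 'off'];
var THINGS = ['lights', 'fan'];
var PLACES = ['kitchen', 'bedroom'];

// Read the words forwards, collecting things and places until a
// state, at least one thing and at least one place are all known.
function parseForward(words) {
  var currentState = null;
  var currentThings = [];
  var currentPlaces = [];
  var actions = [];

  words.forEach(function (word) {
    if (STATES.indexOf(word) !== -1) {
      currentState = word;
    } else if (THINGS.indexOf(word) !== -1) {
      currentThings.push(word);
    } else if (PLACES.indexOf(word) !== -1) {
      currentPlaces.push(word);
    }

    if (currentState && currentThings.length && currentPlaces.length) {
      currentPlaces.forEach(function (place) {
        currentThings.forEach(function (thing) {
          actions.push({ place: place, thing: thing, state: currentState });
        });
      });
      // Assumption: start fresh once an action has been applied.
      currentState = null;
      currentThings = [];
      currentPlaces = [];
    }
  });
  return actions;
}

console.log(parseForward('turn on the lights in the bedroom'.split(' ')));
// → [ { place: 'bedroom', thing: 'lights', state: 'on' } ]
console.log(parseForward('turn the bedroom lights on'.split(' ')));
// → [ { place: 'bedroom', thing: 'lights', state: 'on' } ]
```

Note that with this reset rule a sentence such as 'turn on the lights in the bedroom and kitchen' fires as soon as 'bedroom' is seen, so 'kitchen' is lost.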
EDIT 2
OK, I tested it out and there are a few issues due to the way English is structured. The problem is that if you want to support both 'Turn on ...' and 'Turn ... on' then you need to use my second pseudocode, but that doesn't work easily because of the 'and's in the sentence. For example:
Turn my kitchen lights on and my bedroom and living room lights off.
The first 'and' joins two statements, the second 'and' joins two places. The correct way to do this is to diagram the sentence to work out what applies to what.
There are two quick options. First, you could insist on using a different word or phrase to join two commands:
Turn my kitchen lights on then my bedroom and living room lights off.
Turn my kitchen lights on and also my bedroom and living room lights off.
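A small sketch of how splitting on those joining phrases might look in JavaScript. The function name splitCommands and the exact list of joining phrases ('then', 'and also') are assumptions taken from the two example sentences:

```javascript
// Split a sentence into separate commands on 'then' or 'and also',
// leaving a plain 'and' alone so it can still join places.
function splitCommands(sentence) {
  return sentence
    .replace(/[.!?]+\s*$/, '')           // drop trailing punctuation
    .split(/\s+(?:and also|then)\s+/i)   // split on the joining phrases
    .map(function (part) { return part.trim(); })
    .filter(function (part) { return part.length > 0; });
}

console.log(splitCommands('Turn my kitchen lights on then my bedroom and living room lights off.'));
// → [ 'Turn my kitchen lights on', 'my bedroom and living room lights off' ]
```

Each resulting piece can then be parsed on its own, so any remaining 'and' only ever joins things or places.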
JavaScript Example of first psuedocode.
Note, you'll probably need to heavily pre-process the string if there's any chance of punctuation, etc. You might also want to look at replacing 'living room' (and similar two word phrases) with 'livingroom' rather than just matching one word and hoping for the best like I'm doing. Also, the code could be simplified a bit, but I wanted to keep it close to the psuedocode example.
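One possible way to do that pre-processing, as a sketch only. The PHRASES table and the name preprocess are hypothetical, and stripping everything except letters is a simplification that would also drop digits:

```javascript
// Hypothetical table of two-word phrases and their one-word forms.
var PHRASES = {
  'living room': 'livingroom',
  'dining room': 'diningroom'
};

// Lowercase, strip punctuation, merge known phrases, then split into words.
function preprocess(sentence) {
  var cleaned = sentence
    .toLowerCase()
    .replace(/[^a-z\s]/g, ' ')   // punctuation (and digits) become spaces
    .replace(/\s+/g, ' ')        // collapse runs of whitespace
    .trim();

  Object.keys(PHRASES).forEach(function (phrase) {
    cleaned = cleaned.split(phrase).join(PHRASES[phrase]);
  });

  return cleaned.length ? cleaned.split(' ') : [];
}

console.log(preprocess('Turn my Living Room lights off, please.'));
// → [ 'turn', 'my', 'livingroom', 'lights', 'off', 'please' ]
```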
EDIT 3
New JavaScript Example
This handles some extra sentences and is cleaned up a bit better. It still relies on the 'state' coming at the end of each clause, as that's what it uses as a trigger to apply the actions (this version could probably read forwards instead of backwards). Also, it will not handle something like:
Turn my kitchen fan and my bedroom lights on and living room lights off.
You have to do something more complex to understand the relationship between 'kitchen' and 'fan' and 'bedroom' and 'lights'.
Some combination of those techniques is probably enough to do something fairly impressive, as long as whoever's entering / speaking the commands follows some basic rules.
That's certainly not the most efficient solution, but here's one. You can definitely improve it, like caching regular expressions, but you get the idea. The last item in every sub-array is the operation.
DEMO
    var s = 'Turn my kitchen lights on and my bedroom lights on and living room lights off and my test and another test off',
        r = s.replace(/^Turn|\s*my/g, '').match(/.+? (on|off)/g).map(function(item) {
            var items = item.trim().replace(/^and\s*/, '').split(/\s*and\s*/),
                last = items.pop().split(' '),
                op = last.pop();
            return items.concat([last.join(' '), op]);
        });
    console.log(r);
Mind explaining the logic you used? I mean, I'm reading the code, but I was curious whether you could explain it better.
The logic is quite simple actually, perhaps too simple:
    var s = 'Turn my kitchen lights on and my bedroom lights on and living room lights off and my test and another test off',
        r = s
            .replace(/^Turn|\s*my/g, '') //remove noisy words
            .match(/.+? (on|off)/g) //capture all groups of [some things][on|off]
            //for each of those groups, generate a new array from the returned results
            .map(function(item) {
                var items = item.trim()
                        .replace(/^and\s*/, '') //remove and[space] at the beginning of string
                        //split on and to get all things, for instance if we have
                        //test and another test off, we want ['test', 'another test off']
                        .split(/\s*and\s*/),
                    //split the last item on spaces, with previous example we would get
                    //['another', 'test', 'off']
                    last = items.pop().split(' '),
                    op = last.pop(); //on/off will always be the last item in the array, pop it
                //items now contains ['test'], concatenate with the array passed as argument
                return items.concat(
                    [
                        //last is ['another', 'test'], rejoin it together to give 'another test'
                        last.join(' '),
                        op //this is the operation
                    ]
                );
            });
EDIT: At the time I posted the answer, I hadn't realized how complex and flexible you needed this to be. The solution I provided only works for sentences structured as in my example, with identifiable noisy words and a specific command order. For something more complex, you will have no other choice but to create a parser like @SpaceDog suggested. I will try to come up with something as soon as I have enough time.
I have been working on parsing menus and recipes (not finished) and this is my approach:
Key words that you need (light/bulbs/etc., on/off).
Extra words that some people might use (bright, colorful, etc.).
E.g.: Turn the lights on in the bedroom and in the kitchen
If what_2 is empty, then what_2 is lights on.
Keep in mind that sometimes you need to fill up the array with the next results (depending on how the sentence is structured, but it is rare). I add a "+" or "-" to it so I know whether I have to go forwards or backwards to find the missing parts while parsing it.