I am using Solr with Ruby on Rails. It's all working well; I just need to know if there's any existing code to sanitize user input, such as a query starting with ? or *.
If you are using Solarium with PHP, you can use the Solarium_Escape::term() method:
/**
 * Escape a term
 *
 * A term is a single word.
 * All characters that have a special meaning in a Solr query are escaped.
 *
 * If you want to use the input as a phrase please use the {@link phrase()}
 * method, because a phrase requires much less escaping.
 *
 * @link http://lucene.apache.org/java/docs/queryparsersyntax.html#Escaping%20Special%20Characters
 *
 * @param string $input
 * @return string
 */
static public function term($input)
{
    $pattern = '/(\+|-|&&|\|\||!|\(|\)|\{|}|\[|]|\^|"|~|\*|\?|:|\\\)/';
    return preg_replace($pattern, '\\\$1', $input);
}
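Since the question is about Rails, here is a rough Ruby port of the same regex. It is only a sketch: the names SOLR_SPECIAL and escape_solr_term are made up for this example and are not part of any Solr client gem, and it assumes the input is meant to be a single term, as in the PHP version.

```ruby
# Hypothetical helper, not from any library: a Ruby port of the
# Solarium term-escaping pattern quoted above.
SOLR_SPECIAL = /(\+|-|&&|\|\||!|\(|\)|\{|\}|\[|\]|\^|"|~|\*|\?|:|\\)/

def escape_solr_term(input)
  # The block form sidesteps backslash/backreference quirks in
  # replacement strings: each match is prefixed with one backslash.
  input.to_s.gsub(SOLR_SPECIAL) { |match| "\\#{match}" }
end

puts escape_solr_term("*foo?")
puts escape_solr_term("title:(a && b)")
```

You would call this on the raw user input before interpolating it into the query string, so a leading ? or * is treated as a literal character rather than a wildcard.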
I don't know of any code that does this, but in theory it could be done by looking at the parsing code in Lucene and searching for throw new ParseException (only 16 matches!).
In practice, I think you're better off just catching any Solr exceptions in your code and showing an "invalid query" message or something like that.
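A minimal sketch of that approach in Ruby follows. SolrQueryError is a hypothetical stand-in for whatever exception your Solr client actually raises on a malformed query, and search_or_message is an invented helper name; swap in the real error class from your client.

```ruby
# Hypothetical stand-in for the error class your Solr client raises
# when the server rejects a query.
class SolrQueryError < StandardError; end

# Runs the block (the actual Solr call) and converts a query failure
# into a user-facing message instead of an unhandled exception.
def search_or_message(query)
  { results: yield(query), error: nil }
rescue SolrQueryError
  { results: [], error: "Invalid query" }
end

# Usage: the block simulates a client that rejects the query.
outcome = search_or_message("*foo") { |_q| raise SolrQueryError, "parse error" }
puts outcome[:error]
```

In a Rails controller the same rescue could set a flash message and re-render the search form instead of returning a hash.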
EDIT: Here are a couple of "sanitizers":