Question
I need a regex to use with the PHP preg_replace function on a search form input that feeds an SQL full-text search in a multilingual UTF-8 MySQL database. I considered using PHP's filter_var with FILTER_SANITIZE_STRING, but I ended up with preg_replace.
I want these features:
- keep spaces, but collapse several in a row into one
- keep double quotes, but collapse several in a row into one (so that I can use phrase search in IN BOOLEAN MODE)
- keep -, + and ~, but collapse several in a row into one
- as it has to be multilingual, it should accept Unicode (UTF-8) letters too
- I do not have/need accents to be considered.
This is what I have done:
$q = addslashes($q);
$q = preg_replace('/[^\w\d\s\s+\p{L}]/u', "", $q);
But the output does not satisfy me: double quotes (") and minus signs (-) are not handled the way I want. How can I write a safe query string to use in my search box?
Are there any better practices than using preg_replace?
Answer 1:
You need two preg_replace calls.
1- Replace invalid characters with nothing:
$q = preg_replace('/[^\p{L}\d\s~+"-]+/u', '', $q);
2- Collapse runs of the same character (whitespace, ~, +, ", -) into a single one:
$q = preg_replace('/([\s~+"-])\1+/u', '$1', $q);
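The two steps above can be wrapped in one function, roughly as sketched below. The function name sanitize_search_query is hypothetical and not part of the original answer. The u modifier is added on the assumption that the input is UTF-8, and the null check covers the case where preg_replace fails on invalid UTF-8.

```php
<?php
// Hypothetical helper: the name and the trim() call are illustrative assumptions.
function sanitize_search_query(string $q): string
{
    // Step 1: remove everything except letters, digits, whitespace and ~ + " -
    $q = preg_replace('/[^\p{L}\d\s~+"-]+/u', '', $q);
    if ($q === null) {
        // preg_replace returns null on error, e.g. malformed UTF-8 input
        return '';
    }

    // Step 2: collapse runs of the same kept character into a single one
    $q = preg_replace('/([\s~+"-])\1+/u', '$1', $q);

    // Strip leading/trailing whitespace left over from removed characters
    return trim($q ?? '');
}

echo sanitize_search_query('héllo   ""wörld"" --foo <script>'), PHP_EOL;
// prints: héllo "wörld" -foo script
```

This only normalises the search string. It does not replace addslashes or escaping, so the result should still be passed to MySQL as a bound parameter of a prepared statement rather than concatenated into the SQL text.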
Source: https://stackoverflow.com/questions/17517313/need-regex-for-utf8-multilingual-search-query