In SQLite I want a case-insensitive "SELECT LIKE name"
It works fine for normal Latin names, but when the name is in UTF-8 with non-Latin characters then the se
For SQLite you have 2 options:
$pdo = new PDO("sqlite::memory:");

# BEGIN
// Translate a LIKE mask into an anchored, case-insensitive UTF-8 regex.
function lexa_ci_utf8_like($mask, $value) {
    $mask = str_replace(
        array("%", "_"),
        array(".*?", "."),
        preg_quote($mask, "/")
    );
    $mask = "/^$mask$/ui";
    return preg_match($mask, $value);
}
// Override the built-in two-argument like() that backs the LIKE operator.
$pdo->sqliteCreateFunction('like', "lexa_ci_utf8_like", 2);
# END

$pdo->exec("create table t1 (x)");
$pdo->exec("insert into t1 (x) values ('[Привет España Dvořák]')");

header("Content-Type: text/plain; charset=utf-8");
$q = $pdo->query("select x from t1 where x like '[_РИ%Ñ%ŘÁ_]'");
print $q->fetchColumn();
Use a no-case collation, such as: LIKE name COLLATE NOCASE
If you need specific characters that are not part of ASCII to be compared with case folding, NOCASE will not work, as such folding is not supported by SQLite. You will have to provide your own collation function using your Unicode library of choice and sqlite3_create_collation().
EDIT: also, this might be interesting:
How to sort text in sqlite3 with specified locale?
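From PHP, the custom collation mentioned above can be registered through PDO instead of calling sqlite3_create_collation() directly. A minimal sketch, assuming the pdo_sqlite and mbstring extensions are available; the collation name UTF8_NOCASE is made up for this example. As far as I know a collation affects = and ORDER BY comparisons, while the LIKE operator goes through the like() function, so the sketch compares with = only.

```php
<?php
// Sketch only: UTF8_NOCASE is a hypothetical collation name.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// A collation callback receives two strings and returns <0, 0 or >0.
// Lowercasing both sides with mbstring gives a simple Unicode case fold.
$pdo->sqliteCreateCollation('UTF8_NOCASE', function ($a, $b) {
    return strcmp(mb_strtolower($a, 'UTF-8'), mb_strtolower($b, 'UTF-8'));
});

$pdo->exec("create table t1 (x)");
$pdo->exec("insert into t1 (x) values ('Привет'), ('España'), ('Dvořák')");

// The explicit COLLATE on the right-hand operand selects the custom collation.
$q = $pdo->prepare("select x from t1 where x = ? collate UTF8_NOCASE");
$q->execute(array('ПРИВЕТ'));
$match = $q->fetchColumn();

print $match . "\n"; // prints: Привет
```

The same collation name can be used in ORDER BY x COLLATE UTF8_NOCASE or in a column definition.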
An improved version of LIKE overloading via a UDF:
$db->sqliteCreateFunction('like',
    function ($pattern, $data, $escape = null) use ($db)
    {
        static $modifiers = null;

        // Build the regex modifiers once; add "i" unless case_sensitive_like is on.
        if (isset($modifiers) !== true)
        {
            $modifiers = ((strncmp($db->query('PRAGMA case_sensitive_like;')->fetchColumn(), '1', 1) === 0) ? '' : 'i') . 'suS';
        }

        if (isset($data) === true)
        {
            // Only translate wildcards if the quoted pattern contains any.
            if (strpbrk($pattern = preg_quote($pattern, '~'), '%_') !== false)
            {
                $regex = array
                (
                    '~%+~S' => '.*',
                    '~_~S' => '.',
                );

                // With an ESCAPE character, skip escaped wildcards, then unescape them.
                // The (string) cast avoids passing null to preg_quote() when there is no ESCAPE clause.
                if (strlen($escape = preg_quote((string) $escape, '~')) > 0)
                {
                    $regex = array
                    (
                        '~(?<!' . $escape . ')%+~S' => '.*',
                        '~(?<!' . $escape . ')_~S' => '.',
                        '~(?:' . preg_quote($escape, '~') . ')([%_])~S' => '$1',
                    );
                }

                $pattern = preg_replace(array_keys($regex), $regex, $pattern);
            }

            return (preg_match(sprintf('~^%s$~%s', $pattern, $modifiers), $data) > 0);
        }

        return false;
    }
);
Respects the case_sensitive_like PRAGMA and correctly handles x LIKE y ESCAPE z syntax.
I also wrote another version that does basic and extended romanization of the x and y values, so that an accented character will match its unaccented counterpart, for instance: SELECT 'Á' LIKE 'à%';
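For illustration only, here is one way accent-insensitive matching of this kind could be approached. This is not the code from the gist: it is a rough sketch that assumes the intl and mbstring extensions, uses a hypothetical helper called fold_accents(), and only strips accents that decompose into a base letter plus combining marks (so letters like ł or ø are left alone). It also ignores the ESCAPE clause.

```php
<?php
// Hypothetical helper: lowercase a string and strip combining accents.
function fold_accents($s) {
    // NFD decomposition turns 'Á' into 'A' followed by a combining acute accent.
    $decomposed = Normalizer::normalize($s, Normalizer::FORM_D);
    // Drop the combining marks (Unicode category Mn), then lowercase.
    return mb_strtolower(preg_replace('/\p{Mn}+/u', '', $decomposed), 'UTF-8');
}

$pdo = new PDO('sqlite::memory:');

// Override the two-argument like() so both sides are folded before matching.
$pdo->sqliteCreateFunction('like', function ($pattern, $value) {
    if ($pattern === null || $value === null) {
        return null; // NULL on either side yields NULL, as with the built-in LIKE
    }
    $regex = str_replace(
        array('%', '_'),
        array('.*', '.'),
        preg_quote(fold_accents($pattern), '/')
    );
    return preg_match("/^$regex$/us", fold_accents($value));
}, 2);

$result = $pdo->query("SELECT 'Á' LIKE 'à%'")->fetchColumn();
print $result; // prints: 1
```

A fuller romanization (ß to ss, ø to o and so on) would need a transliteration table or library on top of this.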
You can star the gist to keep an eye on occasional updates.