I\'m using a django application which does some \'startswith\' ORM operations comparing longtext
columns with a unicode string. This results in a LIKE BI
A trick: If you don't want to change the type of your column to binary, try to write your WHERE
statement like this:
WHERE field = 'yourstring' AND field LIKE BINARY 'yourstring'
instead of:
WHERE field LIKE BINARY 'yourstring'
Indeed, it will check the first condition very quickly, and try the second one only if the first one is true.
It worked well on my project for this test of equality, and I think you can adapt this to the "starts with" test.
For the next person who runs across this - in our relatively small database the query:
SELECT * FROM table_name WHERE field LIKE 'some-field-search-value';
... Result row
Returns 1 row in set (0.00 sec)
Compared to:
SELECT * FROM table_name WHERE field LIKE BINARY 'some-field-search-value';
... Result row
Returns 1 row in set (0.32 sec)
Long story short, at least for our database (MySQL 5.5 / InnoDB) there is a very significant difference in performance between the two lookups.
Apparently though this is a bug in MySQL 5.5: http://bugs.mysql.com/bug.php?id=63563 and in my testing against the same database in MySQL 5.1 the LIKE BINARY query still uses the index (while in 5.5 it does a full table scan.)
If performance seems to become a problem, it might be a good idea to create a copy of the first eg. 255 characters of the longtext, add an index on that and use the startswith
with that.
BTW, this page says: "if you need to do case-sensitive matching, declare your column as BINARY; don't use LIKE BINARY in your queries to cast a non-binary column. If you do, MySQL won't use any indexes on that column." It's an old tip but I think this is still valid.