Mysql german accents not-sensitive search in full-text searches

大城市里の小女人 提交于 2020-01-02 07:38:33

问题


Let`s have a example hotels table:

CREATE TABLE `hotels` (
  `HotelNo` varchar(4) character set latin1 NOT NULL default '0000',
  `Hotel` varchar(80) character set latin1 NOT NULL default '',
  `City` varchar(100) character set latin1 default NULL,
  `CityFR` varchar(100) character set latin1 default NULL,
  `Region` varchar(50) character set latin1 default NULL,
  `RegionFR` varchar(100) character set latin1 default NULL,
  `Country` varchar(50) character set latin1 default NULL,
  `CountryFR` varchar(50) character set latin1 default NULL,
  `HotelText` text character set latin1,
  `HotelTextFR` text character set latin1,
  `tagsforsearch` text character set latin1,
  `tagsforsearchFR` text character set latin1,
  PRIMARY KEY  (`HotelNo`),
  FULLTEXT KEY `fulltextHotelSearch` (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`,`HotelText`,`HotelTextFR`,`tagsforsearch`,`tagsforsearchFR`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci;

In this table for example we have only one hotel with Region name = "Graubünden" (please note umlaut ü character)

And now I want to achieve same search match for phrases: 'graubunden' and 'graubünden'

This is simple with use of MySql built in collations in regular searches as follows:

SELECT *  
FROM `hotels` 
WHERE `Region` LIKE CONVERT(_utf8 '%graubunden%' USING latin1) 
COLLATE latin1_german1_ci

This works fine for 'graubunden' and 'graubünden' and as a result I receive proper result, but problem is when we make MySQL full text search

Whats wrong with this SQL statement?:

SELECT 
 *
FROM 
 hotels 
WHERE 
 MATCH (`HotelNo`,`Hotel`,`Address`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`, `HotelText`, `HotelTextFR`, `tagsforsearch`, `tagsforsearchFR`)
AGAINST( CONVERT('+graubunden' USING latin1)  COLLATE latin1_german1_ci IN BOOLEAN MODE)            
ORDER BY Country ASC, Region ASC, City ASC

This doesn`t return any result. Any ideas where the dog is buried ?


回答1:


When you define individual CHARACTER SETS for your columns, you override the collation you set default on table level.

Each of your columns has default latin1 collation (which is latin1_swedish_ci). You can see it by running SHOW CREATE TABLE.

In FULLTEXT queries, indexed columns have COERCIBILITY of 0, that is all fulltext queries are converted to the collation used in the index, not vice versa.

You need to remove CHARACTER SET definitions from your columns or explicitly set all columns to latin1_german_ci:

CREATE TABLE `hotels` (
  `HotelNo` varchar(4) NOT NULL default '0000',
  `Hotel` varchar(80) NOT NULL default '',
  `City` varchar(100) default NULL,
  `CityFR` varchar(100) default NULL,
  `Region` varchar(50) default NULL,
  `RegionFR` varchar(100) default NULL,
  `Country` varchar(50) default NULL,
  `CountryFR` varchar(50) default NULL,
  `HotelText` text,
  `HotelTextFR` text,
  `tagsforsearch` text,
  `tagsforsearchFR` text,
  PRIMARY KEY  (`HotelNo`),
  FULLTEXT KEY `fulltextHotelSearch` (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`,`HotelText`,`HotelTextFR`,`tagsforsearch`,`tagsforsearchFR`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci;

INSERT
INTO    hotels (hotelText, HotelTextFR, tagsforsearch, tagsforsearchFR)
VALUES  ('text', 'text', 'graubünden', 'tags');

SELECT  *
FROM    hotels
WHERE   MATCH (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`, `HotelText`, `HotelTextFR`, `tagsforsearch`, `tagsforsearchFR`)
AGAINST (CONVERT('+graubunden' USING latin1) COLLATE latin1_german1_ci IN BOOLEAN MODE)
ORDER BY
        Country ASC, Region ASC, City ASC;


来源:https://stackoverflow.com/questions/2722044/mysql-german-accents-not-sensitive-search-in-full-text-searches

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!