MySQL >> Normalizing a comma-delimited field

Submitted by 限于喜欢 on 2019-12-13 09:51:46

Question


I currently have a table called "RESOURCES" with a keywords field called "RES_Tags". The "RES_Tags" field contains a comma-delimited list of keywords for each record.

I need to normalize this table/field.

I have already set up the following tables: TAGS, TAGS_TO_RESOURCES.

Please see the schema here: http://sqlfiddle.com/#!9/edac4/1

What is a query that will allow me to parse the keywords in RES_Tags, write them into the TAGS table without creating duplicates and then write a listing in the TAGS_TO_RESOURCES table?


Answer 1:


Please copy your code into the actual posting, and provide the code you've tried to use to solve the problem.

The SUBSTRING_INDEX function returns the part of a string that precedes a given number of occurrences of a delimiter (here a comma). When a negative count is passed, it counts delimiters from the right-hand side instead, so -1 returns only the last item of what would otherwise be a multi-item list (for counts >= 2).
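
To make the nesting used in the statements below concrete, here is a minimal sketch of how SUBSTRING_INDEX behaves. The sample string 'red,green,blue' is made up for illustration and does not come from the RESOURCES table.

```sql
-- Positive count: everything to the left of the 2nd comma.
SELECT SUBSTRING_INDEX('red,green,blue', ',', 2);    -- 'red,green'

-- Negative count: everything to the right of the last comma.
SELECT SUBSTRING_INDEX('red,green,blue', ',', -1);   -- 'blue'

-- Nested: cut the list down to its first 2 items, then keep the last of those.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('red,green,blue', ',', 2), ',', -1);  -- 'green'

-- Asking for the 5th item of a 3-item list returns the last item again.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('red,green,blue', ',', 5), ',', -1);  -- 'blue'
```

The last query shows why rows with fewer tags than the requested position produce repeated values, which is one reason a de-duplication step is still needed afterwards.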

Per our discussion, I've tweaked my approach and added an example that uses auto-increment. (This is run in the 'Build Schema' part of the fiddle.)

create table TAGS
(`T_ID` int auto_increment primary key, `T_Name` varchar(18))
;

insert ignore into TAGS (T_Name)
  SELECT 
    SUBSTRING_INDEX(RES_Tags, ',', 1) as X
    FROM RESOURCES
;

insert ignore into TAGS (T_Name)
  SELECT 
    SUBSTRING_INDEX(
      SUBSTRING_INDEX(RES_Tags, ',', 2)
      ,',',-1)
  FROM RESOURCES
;

insert ignore into TAGS (T_Name)
  SELECT 
    SUBSTRING_INDEX(
      SUBSTRING_INDEX(RES_Tags, ',', 3)
      ,',',-1)  as X
  FROM RESOURCES
;

insert ignore into TAGS (T_Name)
  SELECT 
    SUBSTRING_INDEX(
      SUBSTRING_INDEX(RES_Tags, ',', 4)
      ,',',-1)  as X
  FROM RESOURCES
;

insert ignore into TAGS (T_Name)
  SELECT 
    SUBSTRING_INDEX(
      SUBSTRING_INDEX(RES_Tags, ',', 5)
      ,',',-1)  as X
  FROM RESOURCES
;

insert ignore into TAGS (T_Name)
  SELECT 
    SUBSTRING_INDEX(
      SUBSTRING_INDEX(RES_Tags, ',', 6)
      ,',',-1)  as X
  FROM RESOURCES
;

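-- Rebuild TAGS with trimmed, de-duplicated names (T_Name has no unique index, so insert ignore alone does not remove duplicates).
create table New_TAGS like TAGS;
insert into New_TAGS (T_Name)
  select distinct trim(T_Name)
  from TAGS;

drop table TAGS;
rename table New_TAGS to TAGS;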
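
As a possible alternative to the six separate statements above (not the method used in this answer), the positions 1 to 6 can be supplied by a small derived table so that one statement covers every position. This is only a sketch: it assumes TAGS is empty beforehand, that no row holds more than six tags, and the alias names r and n are arbitrary.

```sql
INSERT INTO TAGS (T_Name)
SELECT DISTINCT
       -- n-th item of the list, with surrounding spaces removed
       TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(r.RES_Tags, ',', n.n), ',', -1))
FROM RESOURCES AS r
JOIN (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3
      UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6) AS n
  -- number of items in a row = number of commas + 1, so positions past the end are skipped
  ON n.n <= 1 + CHAR_LENGTH(r.RES_Tags) - CHAR_LENGTH(REPLACE(r.RES_Tags, ',', ''))
WHERE r.RES_Tags <> '';
```

Because the join condition stops at the real number of items in each row, the repeated last-item values do not appear, and SELECT DISTINCT on the trimmed name takes the place of the New_TAGS rebuild.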

See the MySQL documentation of the substring functions. This is possibly a duplicate of an earlier question.




Answer 2:


  1. Based on RESOURCES.RES_Tags, generate a set of INSERT INTO TAGS ... statements. Prevent duplicates either with a UNIQUE constraint on TAGS combined with ON DUPLICATE KEY ..., or with INSERT ... SELECT ... WHERE NOT EXISTS (...):

a) On the fly, prepend one character to RES_Tags and append a different one (say, - at the start and + at the end), without saving the result back to the database (a,b,c becomes -a,b,c+).

b) On the fly, replace each ',' with text that closes the previous value and opens the next one; replace '-' with the opening part only and '+' with the closing part only (e.g. - becomes insert into tags(tag) values(", + becomes ") and , becomes "), ("). To keep the tags unique, the statement also needs one of the mechanisms mentioned in step 1.

  2. Execute the SQL generated in step 1 (e.g. insert into tags(tag) values("a"), ("b"), ("c")).

  3. Link each resource with its tags using:

    INSERT INTO TAGS_TO_RESOURCES(resource_id, tag_id)
    SELECT RESOURCES.id, TAGS.id
    FROM RESOURCES
    INNER JOIN TAGS
    ON INSTR(CONCAT(',', RESOURCES.RES_Tags, ','), CONCAT(',', TAGS.tag, ',')) > 0;
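
The three steps above could be sketched roughly as follows. This is an illustration under assumptions, not a tested solution for the fiddle schema: it uses TAGS(T_ID, T_Name) as created in answer 1, takes RESOURCES.id and TAGS_TO_RESOURCES(resource_id, tag_id) from this answer (they may not match the real column names), assumes TAGS holds no duplicates yet, and builds the statement with CONCAT rather than the - and + marker characters. The index name uq_tags_name is made up.

```sql
-- A unique index lets INSERT IGNORE skip tags that already exist.
ALTER TABLE TAGS ADD UNIQUE INDEX uq_tags_name (T_Name);

-- Steps 1a/1b: build one INSERT statement per RESOURCES row without modifying the table.
SELECT CONCAT(
         'INSERT IGNORE INTO TAGS (T_Name) VALUES (''',
         REPLACE(RES_Tags, ',', '''), ('''),
         ''');'
       ) AS stmt
FROM RESOURCES;
-- For RES_Tags = 'a,b,c' the generated text is:
-- INSERT IGNORE INTO TAGS (T_Name) VALUES ('a'), ('b'), ('c');

-- Step 2: run the generated statements.

-- Step 3: link resources to tags. FIND_IN_SET returns the 1-based position of
-- T_Name in the comma-separated list, or 0 when it is absent.
INSERT INTO TAGS_TO_RESOURCES (resource_id, tag_id)
SELECT r.id, t.T_ID
FROM RESOURCES AS r
INNER JOIN TAGS AS t
  ON FIND_IN_SET(t.T_Name, r.RES_Tags) > 0;
```

The generated statements are plain string concatenation, so a tag containing a quote character would break them, and tags stored with a space after the comma would need trimming before either the insert or the FIND_IN_SET match behaves as intended.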
    


Source: https://stackoverflow.com/questions/47189346/mysql-normalizing-a-comma-delimited-field
