I have to create a report on some student completions. The students each belong to one client. Here are the tables (simplified for this question).
CREATE TAB
Until now, I wanted to keep those comma separated lists in my SQL db - well aware of all warnings!
I kept thinking that they have benefits over lookup tables (which provide a way to a normalized data base). After some days of refusing, I've seen the light:
In short, there is a reason why there is no native SPLIT() function in MySQL.
SELECT
tab1.std_name, tab1.stdCode, tab1.payment,
SUBSTRING_INDEX(tab1.payment, '|', 1) as rupees,
SUBSTRING(tab1.payment, LENGTH(SUBSTRING_INDEX(tab1.payment, '|', 1)) + 2,LENGTH(SUBSTRING_INDEX(tab1.payment, '|', 2))) as date
FROM (
SELECT DISTINCT
si.std_name, hfc.stdCode,
if(isnull(hfc.payDate), concat(hfc.coutionMoneyIn,'|', year(hfc.startDtae), '-', monthname(hfc.startDtae)), concat(hfc.payMoney, '|', monthname(hfc.payDate), '-', year(hfc.payDate))) AS payment
FROM hostelfeescollection hfc
INNER JOIN hostelfeecollectmode hfm ON hfc.tranId = hfm.tranId
INNER JOIN student_info_1 si ON si.std_code = hfc.stdCode
WHERE hfc.tranId = 'TRAN-AZZZY69454'
) AS tab1
I used the above logic but modified it slightly. My input is of format : "apple:100|pinapple:200|orange:300" stored in a variable @updtAdvanceKeyVal
Here is the function block :
set @res = "";
set @i = 1;
set @updtAdvanceKeyVal = updtAdvanceKeyVal;
REPEAT
-- set r = replace(SUBSTRING(SUBSTRING_INDEX(@updtAdvanceKeyVal, "|", @i),
-- LENGTH(SUBSTRING_INDEX(@updtAdvanceKeyVal, "|", @i -1)) + 1),"|","");
-- wrapping the function in "replace" function as above causes to cut off a character from
-- the 2nd splitted value if the value is more than 3 characters. Writing it in 2 lines causes no such problem and the output is as expected
-- sample output by executing the above function :
-- orange:100
-- pi apple:200 !!!!!!!!strange output!!!!!!!!
-- tomato:500
set @r = SUBSTRING(SUBSTRING_INDEX(@updtAdvanceKeyVal, "|", @i),
LENGTH(SUBSTRING_INDEX(@updtAdvanceKeyVal, "|", @i -1)) + 1);
set @r = replace(@r,"|","");
if @r <> "" then
set @key = SUBSTRING_INDEX(@r, ":",1);
set @val = SUBSTRING_INDEX(@r, ":",-1);
select @key, @val;
end if;
set @i = @i + 1;
until @r = ""
END REPEAT;
MySQL's only string-splitting function is SUBSTRING_INDEX(str, delim, count). You can use this, to, for example:
Return the item before the first separator in a string:
mysql> SELECT SUBSTRING_INDEX('foo#bar#baz#qux', '#', 1);
+--------------------------------------------+
| SUBSTRING_INDEX('foo#bar#baz#qux', '#', 1) |
+--------------------------------------------+
| foo |
+--------------------------------------------+
1 row in set (0.00 sec)
Return the item after the last separator in a string:
mysql> SELECT SUBSTRING_INDEX('foo#bar#baz#qux', '#', -1);
+---------------------------------------------+
| SUBSTRING_INDEX('foo#bar#baz#qux', '#', -1) |
+---------------------------------------------+
| qux |
+---------------------------------------------+
1 row in set (0.00 sec)
Return everything before the third separator in a string:
mysql> SELECT SUBSTRING_INDEX('foo#bar#baz#qux', '#', 3);
+--------------------------------------------+
| SUBSTRING_INDEX('foo#bar#baz#qux', '#', 3) |
+--------------------------------------------+
| foo#bar#baz |
+--------------------------------------------+
1 row in set (0.00 sec)
Return the second item in a string, by chaining two calls:
mysql> SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('foo#bar#baz#qux', '#', 2), '#', -1);
+----------------------------------------------------------------------+
| SUBSTRING_INDEX(SUBSTRING_INDEX('foo#bar#baz#qux', '#', 2), '#', -1) |
+----------------------------------------------------------------------+
| bar |
+----------------------------------------------------------------------+
1 row in set (0.00 sec)
In general, a simple way to get the nth element of a #
-separated string (assuming that you know it definitely has at least n elements) is to do:
SUBSTRING_INDEX(SUBSTRING_INDEX(your_string, '#', n), '#', -1);
The inner SUBSTRING_INDEX
call discards the nth separator and everything after it, and then the outer SUBSTRING_INDEX
call discards everything except the final element that remains.
If you want a more robust solution that returns NULL
if you ask for an element that doesn't exist (for instance, asking for the 5th element of 'a#b#c#d'
), then you can count the delimiters using REPLACE and then conditionally return NULL
using IF():
IF(
LENGTH(your_string) - LENGTH(REPLACE(your_string, '#', '')) / LENGTH('#') < n - 1,
NULL,
SUBSTRING_INDEX(SUBSTRING_INDEX(your_string, '#', n), '#', -1)
)
Of course, this is pretty ugly and hard to understand! So you might want to wrap it in a function:
CREATE FUNCTION split(string TEXT, delimiter TEXT, n INT)
RETURNS TEXT DETERMINISTIC
RETURN IF(
(LENGTH(string) - LENGTH(REPLACE(string, delimiter, ''))) / LENGTH(delimiter) < n - 1,
NULL,
SUBSTRING_INDEX(SUBSTRING_INDEX(string, delimiter, n), delimiter, -1)
);
You can then use the function like this:
mysql> SELECT SPLIT('foo,bar,baz,qux', ',', 3);
+----------------------------------+
| SPLIT('foo,bar,baz,qux', ',', 3) |
+----------------------------------+
| baz |
+----------------------------------+
1 row in set (0.00 sec)
mysql> SELECT SPLIT('foo,bar,baz,qux', ',', 5);
+----------------------------------+
| SPLIT('foo,bar,baz,qux', ',', 5) |
+----------------------------------+
| NULL |
+----------------------------------+
1 row in set (0.00 sec)
mysql> SELECT SPLIT('foo###bar###baz###qux', '###', 2);
+------------------------------------------+
| SPLIT('foo###bar###baz###qux', '###', 2) |
+------------------------------------------+
| bar |
+------------------------------------------+
1 row in set (0.00 sec)
Well, nothing I used worked, so I decided creating a real simple split function, hope it helps:
DECLARE inipos INTEGER;
DECLARE endpos INTEGER;
DECLARE maxlen INTEGER;
DECLARE item VARCHAR(100);
DECLARE delim VARCHAR(1);
SET delim = '|';
SET inipos = 1;
SET fullstr = CONCAT(fullstr, delim);
SET maxlen = LENGTH(fullstr);
REPEAT
SET endpos = LOCATE(delim, fullstr, inipos);
SET item = SUBSTR(fullstr, inipos, endpos - inipos);
IF item <> '' AND item IS NOT NULL THEN
USE_THE_ITEM_STRING;
END IF;
SET inipos = endpos + 1;
UNTIL inipos >= maxlen END REPEAT;