Store multidimensional array in database: relational or multidimensional?

后端未结

关注

 6  1773

栀梦 2021-02-14 02:14

I have read numerous posts along the lines of multidimensional to single dimension, multidimensional database, and so on, but none of the answers helped. I did

6条回答

清歌不尽 (楼主)

2021-02-14 02:38

The goal is to retrieve an element with PHP by name and all its descendants.

If that is all you need, you can use a LIKE search

SELECT *
FROM Table1
WHERE CELL LIKE 'AEE%';

With an index beginning with CELL this is a range check, which is fast.

If your data doesn't look like that, you can create a path column which looks like a directory path and contains all nodes "on the way/path" from root to the element.

| id | CELL | parent_id | path     |
|====|======|===========|==========|
|  1 | A    |      NULL | 1/       |
|  2 | AA   |         1 | 1/2/     |
|  3 | AAA  |         2 | 1/2/3/   |
|  4 | AAC  |         2 | 1/2/4/   |
|  5 | AB   |         1 | 1/5/     |
|  6 | AE   |         1 | 1/6/     | 
|  7 | AEA  |         6 | 1/6/7/   |
|  8 | AEE  |         6 | 1/6/8/   |
|  9 | AEEB |         8 | 1/6/8/9/ |

To retrieve all descendants of 'AE' (including itself) your query would be

SELECT *
FROM tree t
WHERE path LIKE '1/6/%';

or (MySQL specific concatenation)

SELECT t.*
FROM tree t
CROSS JOIN tree r -- root
WHERE r.CELL = 'AE'
  AND t.path LIKE CONCAT(r.path, '%');

Result:

| id | CELL | parent_id |     path |
|====|======|===========|==========|
|  6 | AE   |         1 | 1/6/     |
|  7 | AEA  |         6 | 1/6/7/   |
|  8 | AEE  |         6 | 1/6/8/   |
|  9 | AEEB |         8 | 1/6/8/9/ |

Demo

Performance

I have created 100K rows of fake data on MariaDB with the sequence plugin using the following script:

drop table if exists tree;
CREATE TABLE tree (
  `id` int primary key,
  `CELL` varchar(50),
  `parent_id` int,
  `path` varchar(255),
  unique index (`CELL`),
  unique index (`path`)
);

DROP TRIGGER IF EXISTS `tree_after_insert`;
DELIMITER //
CREATE TRIGGER `tree_after_insert` BEFORE INSERT ON `tree` FOR EACH ROW BEGIN
    if new.id = 1 then
        set new.path := '1/';
    else    
        set new.path := concat((
            select path from tree where id = new.parent_id
        ), new.id, '/');
    end if;
END//
DELIMITER ;

insert into tree
    select seq as id
        , conv(seq, 10, 36) as CELL
        , case 
            when seq = 1 then null
            else floor(rand(1) * (seq-1)) + 1 
        end as parent_id
        , null as path
    from seq_1_to_100000
;
DROP TRIGGER IF EXISTS `tree_after_insert`;
-- runtime ~ 4 sec.

Tests

Count all elements under the root:

SELECT count(*)
FROM tree t
CROSS JOIN tree r -- root
WHERE r.CELL = '1'
  AND t.path LIKE CONCAT(r.path, '%');
-- result: 100000
-- runtime: ~ 30 ms

Get subtree elements under a specific node:

SELECT t.*
FROM tree t
CROSS JOIN tree r -- root
WHERE r.CELL = '3B0'
  AND t.path LIKE CONCAT(r.path, '%');
-- runtime: ~ 30 ms

Result:

| id    | CELL | parent_id | path                                |
|=======|======|===========|=====================================|
|  4284 | 3B0  |       614 | 1/4/11/14/614/4284/                 |
|  6560 | 528  |      4284 | 1/4/11/14/614/4284/6560/            |
|  8054 | 67Q  |      6560 | 1/4/11/14/614/4284/6560/8054/       |
| 14358 | B2U  |      6560 | 1/4/11/14/614/4284/6560/14358/      |
| 51911 | 141Z |      4284 | 1/4/11/14/614/4284/51911/           |
| 55695 | 16Z3 |      4284 | 1/4/11/14/614/4284/55695/           |
| 80172 | 1PV0 |      8054 | 1/4/11/14/614/4284/6560/8054/80172/ |
| 87101 | 1V7H |     51911 | 1/4/11/14/614/4284/51911/87101/     |

PostgreSQL

This also works for PostgreSQL. Only the string concatenation syntax has to be changed:

SELECT t.*
FROM tree t
CROSS JOIN tree r -- root
WHERE r.CELL = 'AE'
  AND t.path LIKE r.path || '%';

Demo: sqlfiddle - rextester

How does the search work

If you look at the test example, you'll see that all paths in the result begin with '1/4/11/14/614/4284/'. That is the path of the subtree root with CELL='3B0'. If the path column is indexed, the engine will find them all efficiently, because the index is sorted by path. It's like you would want to find all the words that begin with 'pol' in a dictionary with 100K words. You wouldn't need to read the entire dictionary.

0 讨论(0)

查看其它6个回答