Creating Taxonomy Table in MySQL

Submitted by 断了今生、忘了曾经 on 2019-12-03 03:07:14
Ante

I have worked with similar data, and I modeled it in two parts. The examples below use PostgreSQL syntax.

The first part is the taxonomy structure (Family, Genus, Species, ...):

CREATE TABLE taxonomic_units (
  id         serial        PRIMARY KEY,
  name       varchar(20)   NOT NULL,
  parent_id  integer       REFERENCES taxonomic_units(id)
);

1 | Life    | NULL
2 | Domain  | 1
...
7 | Family  | 6
8 | Genus   | 7
9 | Species | 8

The second part describes and stores the botanical data itself:

CREATE TABLE taxons (
  id                 serial        PRIMARY KEY,
  suptaxon_id        integer       REFERENCES taxons(id),
  taxonomic_unit_id  integer       NOT NULL REFERENCES taxonomic_units(id),
  name               varchar(50)   NOT NULL,
  authority          varchar(50)
);

100 | NULL | 8 | Ocimum      | L.
101 | 100  | 9 | basilicum   | L.
102 | 100  | 9 | gratissimum | L.
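With this layout, the full lineage of a taxon can be read back by following suptaxon_id upward. Here is a minimal sketch of that, assuming SQLite (through Python's sqlite3 module) instead of PostgreSQL so that it runs without a server; the recursive query itself should carry over to PostgreSQL largely unchanged. The lineage function is a hypothetical helper, and only two ranks and two taxa are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxonomic_units (
  id        INTEGER PRIMARY KEY,
  name      TEXT NOT NULL,
  parent_id INTEGER REFERENCES taxonomic_units(id)
);
CREATE TABLE taxons (
  id                INTEGER PRIMARY KEY,
  suptaxon_id       INTEGER REFERENCES taxons(id),
  taxonomic_unit_id INTEGER NOT NULL REFERENCES taxonomic_units(id),
  name              TEXT NOT NULL,
  authority         TEXT
);
-- Only two ranks are loaded in this sketch, so Genus has no parent here.
INSERT INTO taxonomic_units VALUES (8, 'Genus', NULL);
INSERT INTO taxonomic_units VALUES (9, 'Species', 8);
INSERT INTO taxons VALUES (100, NULL, 8, 'Ocimum', 'L.');
INSERT INTO taxons VALUES (101, 100, 9, 'basilicum', 'L.');
""")


def lineage(conn, taxon_id):
    """Return (rank, name) pairs from the topmost ancestor down to the taxon."""
    return conn.execute(
        """
        WITH RECURSIVE chain(id, suptaxon_id, taxonomic_unit_id, name, depth) AS (
            -- Start at the requested taxon.
            SELECT id, suptaxon_id, taxonomic_unit_id, name, 0
            FROM taxons
            WHERE id = ?
            UNION ALL
            -- Step to the parent taxon until suptaxon_id is NULL.
            SELECT t.id, t.suptaxon_id, t.taxonomic_unit_id, t.name, c.depth + 1
            FROM taxons t
            JOIN chain c ON t.id = c.suptaxon_id
        )
        SELECT u.name, c.name
        FROM chain c
        JOIN taxonomic_units u ON u.id = c.taxonomic_unit_id
        ORDER BY c.depth DESC
        """,
        (taxon_id,),
    ).fetchall()


print(lineage(conn, 101))
```

For taxon 101 this prints the genus row followed by the species row, which is enough to assemble the binomial name Ocimum basilicum.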

I'm not sure I really buy into that article. Graph structures would be needed when the categories themselves are mutable, for example if taxonomists suddenly decided to add three new levels between genus and species.

From the article:

... the management of hierarchical data is not what a relational database is intended for.

Actually, it's exactly what it is intended for:

http://en.wikipedia.org/wiki/Hierarchical_database_model

The hierarchical data model lost traction as Codd's relational model became the de facto standard used by virtually all mainstream database management systems.

I would first write a view that joined all of your tables so that you would have these as your columns:

Life Domain Kingdom Phylum Class Order Family Genus Species

Now you can query that view any way you like and not have to worry about any joins. Easy :)
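A rough sketch of such a view, cut down to the Genus and Species columns only. It assumes a self-referencing taxons table like the one from the earlier answer, and runs on SQLite through Python's sqlite3 module rather than on MySQL. The view name genus_species is made up for the illustration, and the rank ids 8 and 9 are hard-coded to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxons (
  id                INTEGER PRIMARY KEY,
  suptaxon_id       INTEGER REFERENCES taxons(id),
  taxonomic_unit_id INTEGER NOT NULL,
  name              TEXT NOT NULL,
  authority         TEXT
);
INSERT INTO taxons VALUES (100, NULL, 8, 'Ocimum', 'L.');
INSERT INTO taxons VALUES (101, 100, 9, 'basilicum', 'L.');

-- One row per species, with its genus pulled in by a self-join.
-- Assumption: rank id 8 is Genus and rank id 9 is Species.
CREATE VIEW genus_species AS
SELECT g.name AS genus,
       s.name AS species,
       s.authority AS authority
FROM taxons s
JOIN taxons g ON g.id = s.suptaxon_id
WHERE s.taxonomic_unit_id = 9
  AND g.taxonomic_unit_id = 8;
""")

# Callers query the view and never write the join themselves.
rows = conn.execute(
    "SELECT genus, species FROM genus_species WHERE genus = ?", ("Ocimum",)
).fetchall()
print(rows)
```

A view covering every rank from Life down to Species would need one more self-join per level, so it only stays manageable while the set of ranks is fixed.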

You can download complete taxonomy data from http://itis.gov and the data is updated more or less monthly. The data they provide includes a Materialized Path -- every species in the database has a string of all the levels above it, like a breadcrumbs string or a filesystem path.

I used this data to design a demo in my presentation Models for Hierarchical Data. I converted the materialized path data into a Closure Table.
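To give an idea of what that conversion involves, here is a small sketch in Python with SQLite. The path format (ids joined by "/"), the sample ids, and the names taxon_closure, closure_rows and descendants are all assumptions made for the illustration; this is not the actual ITIS layout or the code from the presentation.

```python
import sqlite3

# Hypothetical sample: node id -> materialized path of ids from the root
# down to the node itself.
paths = {
    1: "1",
    2: "1/2",
    3: "1/2/3",
    4: "1/2/4",
}


def closure_rows(paths):
    """Yield one (ancestor, descendant, depth) row per ancestor in each path."""
    for node_id, path in paths.items():
        ancestors = [int(part) for part in path.split("/")]
        # Walking the path backwards gives depth 0 for the node itself,
        # 1 for its parent, 2 for its grandparent, and so on.
        for depth, ancestor in enumerate(reversed(ancestors)):
            yield (ancestor, node_id, depth)


conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE taxon_closure (
  ancestor   INTEGER NOT NULL,
  descendant INTEGER NOT NULL,
  depth      INTEGER NOT NULL,
  PRIMARY KEY (ancestor, descendant)
)
""")
conn.executemany("INSERT INTO taxon_closure VALUES (?, ?, ?)", closure_rows(paths))


def descendants(conn, ancestor_id):
    """Return every node below ancestor_id, excluding the node itself."""
    rows = conn.execute(
        "SELECT descendant FROM taxon_closure "
        "WHERE ancestor = ? AND depth > 0 ORDER BY descendant",
        (ancestor_id,),
    )
    return [row[0] for row in rows]


print(descendants(conn, 2))
```

The closure table stores one row per ancestor-descendant pair, so fetching a whole subtree becomes a single indexed lookup at the cost of extra rows.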

It sounds more like a graph. I'd wonder if Neo4j would be a better choice.

orangepips

There are several ways to represent hierarchical data in a relational database, although a NoSQL solution might be easier to work with, as @duffymo mentioned. Assuming an RDBMS, see my question on the topic, which enumerates half a dozen possibilities. For your situation, I would lead with a materialized path, which makes the family tree easy to see. If the hierarchy changes regularly, I would probably also model it as an adjacency list and update the materialized path using a trigger.
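A minimal sketch of the adjacency list plus trigger idea, assuming SQLite trigger syntax (run through Python's sqlite3 module) rather than MySQL. The table taxon_tree, the trigger taxon_tree_path and the "/"-separated path format are illustrative choices, and only inserts are handled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxon_tree (
  id        INTEGER PRIMARY KEY,
  parent_id INTEGER REFERENCES taxon_tree(id),  -- adjacency list
  name      TEXT NOT NULL,
  path      TEXT                                -- materialized path, set by the trigger
);

-- After each insert, derive the new row's path from its parent's path.
-- A root row (parent_id IS NULL) gets just its own id.
CREATE TRIGGER taxon_tree_path AFTER INSERT ON taxon_tree
BEGIN
  UPDATE taxon_tree
  SET path = COALESCE(
        (SELECT p.path FROM taxon_tree AS p WHERE p.id = NEW.parent_id) || '/',
        ''
      ) || NEW.id
  WHERE id = NEW.id;
END;
""")

# Parents must be inserted before their children so their path already exists.
conn.executemany(
    "INSERT INTO taxon_tree (id, parent_id, name) VALUES (?, ?, ?)",
    [(1, None, "Lamiaceae"), (2, 1, "Ocimum"), (3, 2, "basilicum")],
)

paths = dict(conn.execute("SELECT id, path FROM taxon_tree"))
print(paths)

# Everything below Ocimum (id 2) is a prefix match on its path.
below_ocimum = [
    row[0]
    for row in conn.execute("SELECT name FROM taxon_tree WHERE path LIKE '1/2/%'")
]
print(below_ocimum)
```

Moving a subtree to a new parent would also need an UPDATE trigger that rewrites the paths of all descendants, which is left out of this sketch.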
