How to store multiple options in a single table?

猫巷女王i 2020-11-21 23:29

I want to design an application for result computation.

First, I need to know how to store records in a MySQL database in such a way that students can have as many co

2 answers
  •  暗喜 (original poster)
    2020-11-22 00:14

    Please read up on Data Normalization, General Indexing concepts, and Foreign Key constraints to keep data clean with referential integrity. This will get you going.

    Storing data in arrays may seem natural to you on paper, but the db engine will mostly be unable to use an index on such data, so performance suffers. Moreover, you will find on Day 2 that getting to and maintaining your data is a nightmare.

    The following should get you going with a good start as you tinker. Joins too.

    create table student
    (   studentId int auto_increment primary key,
        fullName varchar(100) not null
        -- etc
    );
    
    create table dept
    (   deptId int auto_increment primary key,
        deptName varchar(100) not null -- Economics
        -- etc
    );
    
    create table course
    (   courseId int auto_increment primary key,
        deptId int not null,
        courseName varchar(100) not null,
        -- etc
        CONSTRAINT fk_crs_dept FOREIGN KEY (deptId) REFERENCES dept(deptId)
    );
    
    create table SCJunction
    (   -- Student/Course Junction table (a.k.a Student is taking the course)
        -- also holds the attendance and grade
        id int auto_increment primary key,
        studentId int not null,
        courseId int not null,
        term int not null, -- term (I am using 100 in below examples for this term)
        attendance int not null, -- whatever you want, 100=always there, 0=he must have been partying,
        grade int not null, -- just an idea   
        -- See (Note Composite Index) at bottom concerning next two lines.
        unique key(studentId,courseId,term), -- no duplicates allowed for the combo (note student can re-take it next term)
        key (courseId,studentId),
        CONSTRAINT fk_sc_student FOREIGN KEY (studentId) REFERENCES student(studentId),
        CONSTRAINT fk_sc_courses FOREIGN KEY (courseId) REFERENCES course(courseId)
    );
    

    Create Test Data

    insert student(fullName) values ('Henry Carthage'),('Kim Billings'),('Shy Guy'); -- id's 1,2,3
    
    insert dept(deptName) values ('History'),('Math'),('English'); -- id's 1,2,3
    
    insert course(deptId,courseName) values (1,'Early Roman Empire'),(1,'Italian Nation States'); -- id's 1 and 2 (History dept)
    insert course(deptId,courseName) values (2,'Calculus 1'),(2,'Linear Algebra A'); -- id's 3 and 4 (Math dept)
    insert course(deptId,courseName) values (3,'World of Chaucer'); -- id 5 (English dept)
    
    -- show why FK constraints are important based on data at the moment
    insert course(deptId,courseName) values (66,'Fly Fishing 101'); -- will generate error 1452. That dept 66 does not exist
    -- That error is a good error to have. Better than faulty data
    
    -- Have Kim (studentId=2) enrolled in a few courses
    insert SCJunction(studentId,courseId,term,attendance,grade) values (2,1,100,-1,-1); -- Early Roman Empire, term 100 (made up), unknown attendance/grade
    insert SCJunction(studentId,courseId,term,attendance,grade) values (2,4,100,-1,-1); -- Linear Algebra A
    insert SCJunction(studentId,courseId,term,attendance,grade) values (2,5,100,-1,-1); -- World of Chaucer
    
    -- Have Shy Guy (studentId=3) enrolled in one course only. He is shy
    insert SCJunction(studentId,courseId,term,attendance,grade) values (3,5,100,-1,-1); -- World of Chaucer, term 100 (made up), unknown attendance/grade
    -- note if you run that line again, the Error 1062 Duplicate entry happens. Can't take same course more than once per term
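
    To see both "good errors" without a MySQL server at hand, here is a minimal sketch. It assumes Python's built-in sqlite3 module with an in-memory SQLite database standing in for MySQL, so the DDL is adapted (integer primary key instead of auto_increment) and the errors surface as sqlite3.IntegrityError rather than MySQL errors 1452 and 1062. The helper try_insert is a made-up name for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.executescript("""
create table student (
    studentId integer primary key,
    fullName varchar(100) not null
);
create table dept (
    deptId integer primary key,
    deptName varchar(100) not null
);
create table course (
    courseId integer primary key,
    deptId int not null,
    courseName varchar(100) not null,
    foreign key (deptId) references dept(deptId)
);
create table SCJunction (
    id integer primary key,
    studentId int not null,
    courseId int not null,
    term int not null,
    attendance int not null,
    grade int not null,
    unique (studentId, courseId, term),
    foreign key (studentId) references student(studentId),
    foreign key (courseId) references course(courseId)
);
insert into student(fullName) values ('Henry Carthage'),('Kim Billings'),('Shy Guy');
insert into dept(deptName) values ('History'),('Math'),('English');
insert into course(deptId,courseName) values
    (1,'Early Roman Empire'),(1,'Italian Nation States'),
    (2,'Calculus 1'),(2,'Linear Algebra A'),(3,'World of Chaucer');
insert into SCJunction(studentId,courseId,term,attendance,grade) values (3,5,100,-1,-1);
""")


def try_insert(sql):
    """Run one insert; return None on success or the constraint error text."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


# Dept 66 does not exist, so the FK rejects the row.
fk_error = try_insert(
    "insert into course(deptId,courseName) values (66,'Fly Fishing 101')")
# Same student, course and term a second time: the unique key rejects it.
dup_error = try_insert(
    "insert into SCJunction(studentId,courseId,term,attendance,grade) values (3,5,100,-1,-1)")
# Same student and course in a later term is allowed (a re-take).
retake_error = try_insert(
    "insert into SCJunction(studentId,courseId,term,attendance,grade) values (3,5,101,-1,-1)")

print("dept 66:", fk_error)
print("same term again:", dup_error)
print("next term:", retake_error)
```

    The first two inserts are rejected and leave the tables untouched, while the re-take in term 101 goes through.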
    

    Some simple questions.

    What course is in what department?

    Show all courses. This uses table aliases (abbreviations) to cut down on typing and (sometimes) improve readability.

    select c.courseId,c.courseName,d.deptId,d.deptName
    from course c
    join dept d
    on c.deptId=d.deptId
    order by d.deptName,c.courseName -- note the order
    +----------+-----------------------+--------+----------+
    | courseId | courseName            | deptId | deptName |
    +----------+-----------------------+--------+----------+
    |        5 | World of Chaucer      |      3 | English  |
    |        1 | Early Roman Empire    |      1 | History  |
    |        2 | Italian Nation States |      1 | History  |
    |        3 | Calculus 1            |      2 | Math     |
    |        4 | Linear Algebra A      |      2 | Math     |
    +----------+-----------------------+--------+----------+
    

    Who is taking the World of Chaucer course this term?

    (knowing the courseId=5)

    The query below benefits from one of our composite indexes in SCJunction. A composite index is an index on more than one column.

    select s.StudentId,s.FullName
    from SCJunction j
    join student s
    on j.studentId=s.studentId
    where j.courseId=5 and j.term=100
    +-----------+--------------+
    | StudentId | FullName     |
    +-----------+--------------+
    |         2 | Kim Billings |
    |         3 | Shy Guy      |
    +-----------+--------------+
    

    Kim Billings is enrolled in what this term?

    select s.StudentId,s.FullName,c.courseId,c.courseName
    from SCJunction j
    join student s
    on j.studentId=s.studentId
    join course c
    on j.courseId=c.courseId
    where s.studentId=2 and j.term=100
    order by c.courseId DESC -- descending, just for the fun of it
    +-----------+--------------+----------+--------------------+
    | StudentId | FullName     | courseId | courseName         |
    +-----------+--------------+----------+--------------------+
    |         2 | Kim Billings |        5 | World of Chaucer   |
    |         2 | Kim Billings |        4 | Linear Algebra A   |
    |         2 | Kim Billings |        1 | Early Roman Empire |
    +-----------+--------------+----------+--------------------+
    

    Kim is overwhelmed, so drop the math class

    delete from SCJunction
    where studentId=2 and courseId=4 and term=100
    

    Re-run the select statement above showing what Kim is taking:

    +-----------+--------------+----------+--------------------+
    | StudentId | FullName     | courseId | courseName         |
    +-----------+--------------+----------+--------------------+
    |         2 | Kim Billings |        5 | World of Chaucer   |
    |         2 | Kim Billings |        1 | Early Roman Empire |
    +-----------+--------------+----------+--------------------+
    

    Ah, much easier term. Dad won't be happy though.

    Note such things as SCJunction.term. Much can be written about that. I will mostly skip over it at the moment, other than to say it should also be in an FK somewhere. You may want your term to look more like SPRING2015 and not an int.
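
    One way the term FK could look, as a rough sketch only. The term table, its termId and termCode columns, and the FALL2015 value are assumptions of mine, not part of the schema above. Python's sqlite3 with an in-memory SQLite database stands in for MySQL, and the student and course FKs are left out to keep it short.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table and column names for
# the term lookup are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs

conn.executescript("""
create table term (
    termId integer primary key,
    termCode varchar(20) not null unique   -- e.g. SPRING2015
);
create table SCJunction (
    id integer primary key,
    studentId int not null,
    courseId int not null,
    termId int not null,
    unique (studentId, courseId, termId),
    foreign key (termId) references term(termId)
);
insert into term(termCode) values ('SPRING2015'), ('FALL2015');
insert into SCJunction(studentId, courseId, termId) values (2, 1, 1), (2, 5, 1);
""")

# Look enrolments up by the readable term code instead of a bare int.
enrolled = conn.execute("""
    select j.studentId, j.courseId, t.termCode
    from SCJunction j
    join term t on j.termId = t.termId
    where t.termCode = 'SPRING2015'
    order by j.courseId
""").fetchall()
print(enrolled)

# A term that was never defined cannot sneak into the junction table.
try:
    conn.execute(
        "insert into SCJunction(studentId, courseId, termId) values (2, 1, 99)")
    unknown_term_rejected = False
except sqlite3.IntegrityError:
    unknown_term_rejected = True
print("unknown term rejected:", unknown_term_rejected)
```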

    As far as ids go, this is the way I would do it, and it is personal preference. It does require knowing id numbers, or looking them up. Others could choose a courseId like HIST101 rather than 17. Those are far more readable (but barely slower in the index). So do what is best for you.
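
    For comparison, a small sketch of the readable-key variant. The courseCode column and the ENGL201 code are made up for illustration (only HIST101 appears above), and SQLite via Python's sqlite3 again stands in for MySQL.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; courseCode is a hypothetical
# natural key replacing the int courseId.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs

conn.executescript("""
create table course (
    courseCode varchar(10) primary key,   -- natural key such as HIST101
    courseName varchar(100) not null
);
create table SCJunction (
    id integer primary key,
    studentId int not null,
    courseCode varchar(10) not null,
    term int not null,
    unique (studentId, courseCode, term),
    foreign key (courseCode) references course(courseCode)
);
insert into course(courseCode, courseName) values
    ('HIST101', 'Early Roman Empire'),
    ('ENGL201', 'World of Chaucer');      -- ENGL201 is a made-up code
insert into SCJunction(studentId, courseCode, term) values
    (2, 'HIST101', 100), (2, 'ENGL201', 100);
""")

# No join to course is needed just to read which courses student 2 takes.
codes = [row[0] for row in conn.execute(
    "select courseCode from SCJunction "
    "where studentId = 2 and term = 100 order by courseCode")]
print(codes)
```

    The trade-off is the one described above: the junction rows read well on their own, at the cost of a wider string key in every index that contains it.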

    Note Composite Index

    A Composite Index (INDEX means KEY, and vice-versa) is one that combines multiple columns for fast data retrieval. The orders are flipped for the two composites in the SCJunction table so that, depending on the universe of queries that go after your data, the db engine can choose which index to use for fastest retrieval based on the left-most column you are going after.

    As for the unique key (composite #1), the comment next to it about enforcing no duplicates (meaning no junk data) is rather self-explanatory. For instance, student 1, course 1, term 1 cannot exist twice in that table.

    A crucial concept to understand is the concept of left-most ordering of column names in an index.

    For queries that go after studentId only, the key that has studentId listed first (left-most) is used. For queries that go after courseId only, the key that has courseId left-most is used. For queries that go after both studentId and courseId, the db engine can decide which composite key to use.

    When I say "go after", I mean in the on clause or where clause condition.

    Were one not to have those two composite keys (with columns 1 and 2 flipped), then queries where the column sought is not left-most in any index would not benefit from key usage, and would suffer a slow table scan to return the data.

    So, those two indexes combine the following 2 concepts:

    • Fast data retrieval based on the left-most column or on both columns (studentId and courseId)
    • Enforcing non-duplication of data in that table based on studentId, courseId, and term values
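
    The left-most rule can be watched in action with a small sketch. Big caveat: this uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 as a stand-in, and MySQL's optimizer and its EXPLAIN output are different, so treat it as an illustration of the idea rather than of MySQL itself. The index names and the plan helper are made up.

```python
import sqlite3

# Assumption: SQLite's planner stands in for MySQL's; only the
# left-most-column idea carries over, not the exact output.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table SCJunction (
    id integer primary key,
    studentId int not null,
    courseId int not null,
    term int not null,
    attendance int not null,
    grade int not null
);
-- the two composite keys from the answer, given hypothetical names
create unique index uq_student_course_term on SCJunction(studentId, courseId, term);
create index ix_course_student on SCJunction(courseId, studentId);
""")


def plan(where_clause):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute(
        "explain query plan select * from SCJunction where " + where_clause
    ).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


by_student = plan("studentId = 2")  # studentId is left-most in the unique key
by_course = plan("courseId = 5")    # courseId is left-most in the second key
by_term = plan("term = 100")        # term is left-most in neither key

for label, text in [("studentId", by_student),
                    ("courseId", by_course),
                    ("term", by_term)]:
    print(label, "->", text)
```

    The first two plans should name the index whose left-most column matches the condition, while the term-only query falls back to scanning the table.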

    The Takeaway

    The important takeaway is that Junction tables make for quick index retrieval, and sane management of data versus comma-delimited data (array mindset) crammed into a column, and all the misery of using such a construct.
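
    To make the misery concrete, here is a hedged sketch comparing the two mindsets. The studentCsv table, its courseIds column, student 4 and course 15 are all invented for the comparison, and SQLite via Python's sqlite3 stands in for MySQL.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; studentCsv is a hypothetical
# table showing the comma-delimited anti-pattern.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table studentCsv (
    studentId integer primary key,
    fullName varchar(100) not null,
    courseIds varchar(100) not null       -- course ids crammed into one column
);
insert into studentCsv(studentId, fullName, courseIds) values
    (2, 'Kim Billings', '1,4,5'),
    (3, 'Shy Guy', '5'),
    (4, 'Hypothetical Student', '15');

-- junction approach, trimmed to the columns this sketch needs
create table SCJunction (
    studentId int not null,
    courseId int not null,
    term int not null,
    unique (studentId, courseId, term)
);
insert into SCJunction(studentId, courseId, term) values
    (2,1,100),(2,4,100),(2,5,100),(3,5,100),(4,15,100);
""")

# Naive substring search: '15' also contains '5', so student 4 matches wrongly.
naive = [r[0] for r in conn.execute(
    "select studentId from studentCsv "
    "where courseIds like '%5%' order by studentId")]

# A correct search must pad delimiters and inspect every row's string.
padded = [r[0] for r in conn.execute(
    "select studentId from studentCsv "
    "where ',' || courseIds || ',' like '%,5,%' order by studentId")]

# The junction table answers the same question with a plain equality test.
junction = [r[0] for r in conn.execute(
    "select studentId from SCJunction "
    "where courseId = 5 and term = 100 order by studentId")]

print("naive like:  ", naive)
print("padded like: ", padded)
print("junction:    ", junction)
```

    The padded version gets the right answer, but only by string-matching every row, and dropping a course means rewriting the string. With the junction table it is a one-row delete.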
