Compound course prerequisites (One or more of a,b,c and either x or y as well as z style)

烈酒焚心 submitted on 2019-12-05 13:12:36

In my opinion, modeling conjunction and disjunction in a single table is always awkward: it leads either to a violation of normal form or to an unpredictable number of self-joins. As I understand it, your prerequisites can always be expressed as alternatives of conjunctions. So the following:

Math AND English AND (Physics1 OR Physics2)

may be as well expressed as:

(Math AND English AND Physics1) OR (Math AND English AND Physics2)

This leads to the conclusion that you probably need an intermediate table describing sets of prerequisites. A course is available when any of its sets is satisfied, and a set is satisfied when all of the subjects in it are completed.
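As a rough illustration of the rewrite above, here is a minimal Python sketch that expands a requirement written as an AND of OR-groups into the equivalent list of alternative sets. The function names `expand_to_sets` and `is_available` and the list-of-lists input shape are assumptions made for this example; they are not part of the schema discussed in the thread.

```python
from itertools import product


def expand_to_sets(groups):
    """Expand an AND of OR-groups into alternative prerequisite sets.

    `groups` is a list of lists: each inner list is one OR-group, and all
    groups must be satisfied (AND). Each returned frozenset is one
    alternative whose members must all be completed.
    """
    return [frozenset(choice) for choice in product(*groups)]


def is_available(alternatives, completed):
    """A course is available when any alternative is a subset of `completed`."""
    return any(alt <= set(completed) for alt in alternatives)


# Math AND English AND (Physics1 OR Physics2)
alternatives = expand_to_sets([["Math"], ["English"], ["Physics1", "Physics2"]])
```

Note that the number of alternatives is the product of the group sizes, so requirements with several large OR-groups expand into many sets.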

So the structure may look like this:

   Prerequisite:
   +---------------+---------------+
   |      Id       |     Name      |
   |---------------|---------------|         PrerequisiteSets:
   |      1        |   Math        |         +---------------+-----------------+
   |      2        |   English     |         |  SetNumber    | Prerequisite_FK |
   |      3        |   Art         |         |---------------|-----------------|
   |      4        |   Physics     |         |      1        |        1        |
   |      5        |   Psychology  |         |      1        |        2        |
   +---------------+---------------+         |      1        |        4        |
                                             |      2        |        1        |
                                             |      2        |        2        |
   Course:                                   |      2        |        5        |
   +---------------+---------------+         +---------------+-----------------+
   |      Id       |     Name      |
   |---------------|---------------|
   |      1        |   Course1     |
   |      2        |   Course2     |
   |      3        |   Course3     |
   |      4        |   Course4     |
   |      5        |   Course5     |
   +---------------+---------------+

   CoursePrerequisite:
   +---------------+---------------+
   |  Course_FK    |  SetNumber    |
   |---------------|---------------|
   |      5        |       1       |
   |      5        |       2       |
   +---------------+---------------+

For example, Course5 can be satisfied by either SetNumber 1 (Math, English, Physics) or SetNumber 2 (Math, English, Psychology).

Unfortunately it's too late here for me to help you with exact queries now, but in case you need it I can extend my answer tomorrow. Good luck though! :-)

EDIT

To build the queries, I'd start from the observation that a particular set is matched when all prerequisites in the set are contained in the given prerequisites. In other words, the number of distinct prerequisites in the set must equal the number of prerequisites in that set that also appear in the given list. Basically (assuming the SetNumber/Prerequisite_FK pair is unique in the table):

select
  SetNumber,
  count(Prerequisite_FK) as NumberOfRequired,
  sum(case when Prerequisite.Name in ('Math','English','Art') then 1 else 0 end)
    as NumberOfMatching
from PrerequisiteSets
  inner join Prerequisite on PrerequisiteSets.Prerequisite_FK = Prerequisite.ID
group by SetNumber
having
   count(Prerequisite_FK)
   =
   sum(case when Prerequisite.Name in ('Math','English','Art') then 1 else 0 end)

Getting the final courses now boils down to selecting every course for which at least one set number appears in the results of the query above. Something like this (it can definitely be expressed better and optimized with joins, but the general idea is the same):

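select Id, Name
from Course
where Id in
  (select Course_FK from CoursePrerequisite
   where SetNumber in
   (
      -- the query from above, reduced to its first column (SetNumber)
      select SetNumber
      from PrerequisiteSets
        inner join Prerequisite on PrerequisiteSets.Prerequisite_FK = Prerequisite.Id
      group by SetNumber
      having count(Prerequisite_FK)
             = sum(case when Prerequisite.Name in ('Math','English','Art') then 1 else 0 end)
   )
  )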
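To try the idea end to end, the following is a minimal sketch that loads the example tables into an in-memory SQLite database and runs the two queries combined. SQLite, the `available_courses` helper, and the choice to pass the completed subjects as bound parameters are assumptions made for this example; the original answer does not name a database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Prerequisite (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Course (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE PrerequisiteSets (
        SetNumber INTEGER NOT NULL,
        Prerequisite_FK INTEGER NOT NULL REFERENCES Prerequisite(Id),
        UNIQUE (SetNumber, Prerequisite_FK)
    );
    CREATE TABLE CoursePrerequisite (
        Course_FK INTEGER NOT NULL REFERENCES Course(Id),
        SetNumber INTEGER NOT NULL
    );
    INSERT INTO Prerequisite VALUES
        (1,'Math'),(2,'English'),(3,'Art'),(4,'Physics'),(5,'Psychology');
    INSERT INTO Course VALUES
        (1,'Course1'),(2,'Course2'),(3,'Course3'),(4,'Course4'),(5,'Course5');
    INSERT INTO PrerequisiteSets VALUES (1,1),(1,2),(1,4),(2,1),(2,2),(2,5);
    INSERT INTO CoursePrerequisite VALUES (5,1),(5,2);
""")


def available_courses(completed):
    """Return names of courses for which at least one set is fully covered."""
    marks = ",".join("?" for _ in completed)  # one placeholder per subject
    sql = f"""
        SELECT c.Name
        FROM Course c
        WHERE c.Id IN (
            SELECT cp.Course_FK
            FROM CoursePrerequisite cp
            WHERE cp.SetNumber IN (
                SELECT ps.SetNumber
                FROM PrerequisiteSets ps
                JOIN Prerequisite p ON ps.Prerequisite_FK = p.Id
                GROUP BY ps.SetNumber
                HAVING COUNT(ps.Prerequisite_FK)
                     = SUM(CASE WHEN p.Name IN ({marks}) THEN 1 ELSE 0 END)
            )
        )
        ORDER BY c.Name
    """
    return [row[0] for row in conn.execute(sql, list(completed))]
```

With this data, a student who completed Math, English and Art matches neither set, while adding Physics or Psychology makes Course5 available. Courses that have no rows in CoursePrerequisite are never returned by this query, so courses without prerequisites would need separate handling.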

Kuba Wyrostek has suggested below to enumerate all prerequisite combinations for each course into distinct sets. While this would work, I need to do this for ~6k rows, each with many enumerations. Is there a more efficient way to accomplish this?


Storing sets is the obvious choice; I agree with Kuba. But I suggest a slightly different approach:

prereqs:                     courses:
+------+------------+        +------+------------+
| p_id |   Name     |        | c_id |   Name     |
|------|------------|        |------|------------|
|   1  | Math       |        |   1  | Course1    |
|   2  | English    |        |   2  | Course2    |
|   3  | Art        |        |   3  | Course3    |
|   4  | Physics    |        |   4  | Course4    |
|   5  | Psychology |        |   5  | Course5    |
+------+------------+        +------+------------+

compound_sets:               compound_sets_prereqs:
+-------+-------+-------+    +-------+-------+
| s_id  | c_id  |  cnt  |    | s_id  | p_id  |
|-------|-------|-------|    |-------|-------|
|   1   |   1   |   1   |    |   1   |   1   |
|   2   |   1   |   2   |    |   1   |   2   |
|   3   |   2   |   1   |    |   2   |   3   |
|   4   |   2   |  null |    |   2   |   4   |
|   5   |   3   |  null |    |   2   |   5   |
+-------+-------+-------+    |   3   |   1   |
                             |   3   |   4   |
                             |   4   |   1   |
                             |   4   |   2   |
                             |   5   |   2   |
                             |   5   |   3   |
                             +-------+-------+

The "cnt" column above stores the minimum number of required matches; a NULL value means that all prerequisites in the set have to match. So in my example we have the following requirements:


Course1: ( Math or English ) and ( at least two out of Art, Physics and Psychology )
Course2: ( Math or Physics ) and ( both Math and English )
Course3: both English and Art

Here's the SQL:

select t.c_id
     , c.name
  from (  select c_id
               , sets_cnt

               -- flag the set if it meets the requirements
               , case when matched >= min_cnt then 1 else 0 end  flag

            from (  select c.c_id
                         , cs.s_id

                         -- the number of matched prerequisites
                         , count(p.p_id) matched

                         -- if the cnt is null - we need
                         -- to match all prerequisites
                         , coalesce( cnt, count(csp.p_id) ) min_cnt

                          -- the total number of sets the course has
                          , (  select count(1)
                                 from compound_sets t
                                where t.c_id = c.c_id
                            ) sets_cnt

                      from courses c

                      join compound_sets cs
                        on cs.c_id = c.c_id

                      join compound_sets_prereqs csp
                        on cs.s_id = csp.s_id

                      left join (  select p_id
                                     from prereqs p
                                    -- this data comes from the outside
                                    where p.name in ( 'Physics',
                                                      'English',
                                                      'Math',
                                                      'Psychology' )
                                ) p
                        on csp.p_id = p.p_id

                     group by c.c_id, cs.s_id, cs.cnt
                 ) t
       ) t
     , courses c
 where t.c_id = c.c_id 
 group by t.c_id, c.name, sets_cnt

-- check that all sets of this course meet the requirements
having count( case when flag = 1 then 1 else null end ) = sets_cnt
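As a way to experiment with this model, here is a minimal sketch that loads the example tables into an in-memory SQLite database and evaluates the same rule: a course qualifies when every one of its sets reaches its minimum count. The query is a simplified rewrite using common table expressions rather than the exact statement above, and SQLite plus the `qualifying_courses` helper are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prereqs (p_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE courses (c_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE compound_sets (
        s_id INTEGER PRIMARY KEY,
        c_id INTEGER NOT NULL REFERENCES courses(c_id),
        cnt  INTEGER              -- NULL means "all prerequisites in the set"
    );
    CREATE TABLE compound_sets_prereqs (
        s_id INTEGER NOT NULL REFERENCES compound_sets(s_id),
        p_id INTEGER NOT NULL REFERENCES prereqs(p_id),
        UNIQUE (s_id, p_id)
    );
    INSERT INTO prereqs VALUES
        (1,'Math'),(2,'English'),(3,'Art'),(4,'Physics'),(5,'Psychology');
    INSERT INTO courses VALUES
        (1,'Course1'),(2,'Course2'),(3,'Course3'),(4,'Course4'),(5,'Course5');
    INSERT INTO compound_sets VALUES
        (1,1,1),(2,1,2),(3,2,1),(4,2,NULL),(5,3,NULL);
    INSERT INTO compound_sets_prereqs VALUES
        (1,1),(1,2),(2,3),(2,4),(2,5),(3,1),(3,4),(4,1),(4,2),(5,2),(5,3);
""")


def qualifying_courses(completed):
    """Return names of courses whose sets all meet their minimum match count."""
    marks = ",".join("?" for _ in completed)
    sql = f"""
        WITH given AS (
            SELECT p_id FROM prereqs WHERE name IN ({marks})
        ),
        set_status AS (
            SELECT cs.c_id, cs.s_id,
                   COUNT(g.p_id) AS matched,
                   COALESCE(cs.cnt, COUNT(csp.p_id)) AS min_cnt
            FROM compound_sets cs
            JOIN compound_sets_prereqs csp ON csp.s_id = cs.s_id
            LEFT JOIN given g ON g.p_id = csp.p_id
            GROUP BY cs.c_id, cs.s_id, cs.cnt
        )
        SELECT c.name
        FROM courses c
        JOIN set_status s ON s.c_id = c.c_id
        GROUP BY c.c_id, c.name
        HAVING SUM(CASE WHEN s.matched >= s.min_cnt THEN 1 ELSE 0 END) = COUNT(*)
        ORDER BY c.name
    """
    return [row[0] for row in conn.execute(sql, list(completed))]
```

Because each set carries its own threshold, "at least two out of three" is stored as one set with cnt = 2 instead of being enumerated into three separate combinations. Courses without any set (Course4 and Course5 here) do not appear in the result.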

This is a cut-and-paste from one of my training labs for advanced SQL. I hope it's the right one; I can't test it right now, but it sounds similar to your task.

It's just using pizzas and toppings, I usually do it before lunch :-)

CREATE TABLE Pizzas
(Pizza# INTEGER NOT NULL PRIMARY KEY,
 PizzaName VARCHAR(30) NOT NULL UNIQUE
);

INSERT INTO Pizzas VALUES(1, 'Margherita')
;INSERT INTO Pizzas VALUES(2, 'Salami')
;INSERT INTO Pizzas VALUES(3, 'Prosciutto')
;INSERT INTO Pizzas VALUES(4, 'Funghi')
;INSERT INTO Pizzas VALUES(5, 'Hawaii')
;INSERT INTO Pizzas VALUES(6, 'Calzone')
;INSERT INTO Pizzas VALUES(7, 'Quattro Stagioni')
;INSERT INTO Pizzas VALUES(8, 'Marinara')
;INSERT INTO Pizzas VALUES(9, 'Vegetaria')
;INSERT INTO Pizzas VALUES(10, 'Diavola')
;INSERT INTO Pizzas VALUES(11, 'Tonno')
;INSERT INTO Pizzas VALUES(12, 'Primavera')
;INSERT INTO Pizzas VALUES(13, 'Gorgonzola')
;INSERT INTO Pizzas VALUES(14, 'Fantasia')
;INSERT INTO Pizzas VALUES(15, 'Quattro Formaggi')
;INSERT INTO Pizzas VALUES(16, 'Napolitane')
;INSERT INTO Pizzas VALUES(17, 'Duplicato')
;


CREATE TABLE Toppings
(Topping# INTEGER NOT NULL PRIMARY KEY,
 Topping VARCHAR(30) NOT NULL UNIQUE
);

INSERT INTO Toppings VALUES(1, 'Tomatoes')
;INSERT INTO Toppings VALUES(2, 'Mozzarella')
;INSERT INTO Toppings VALUES(3, 'Salami')
;INSERT INTO Toppings VALUES(4, 'Mushrooms')
;INSERT INTO Toppings VALUES(5, 'Chillies')
;INSERT INTO Toppings VALUES(6, 'Pepper')
;INSERT INTO Toppings VALUES(7, 'Onions')
;INSERT INTO Toppings VALUES(8, 'Garlic')
;INSERT INTO Toppings VALUES(9, 'Olives')
;INSERT INTO Toppings VALUES(10, 'Capers')
;INSERT INTO Toppings VALUES(11, 'Tuna')
;INSERT INTO Toppings VALUES(12, 'Squid')
;INSERT INTO Toppings VALUES(13, 'Pineapple')
;INSERT INTO Toppings VALUES(14, 'Spinach')
;INSERT INTO Toppings VALUES(15, 'Scallop')
;INSERT INTO Toppings VALUES(16, 'Ham')
;INSERT INTO Toppings VALUES(17, 'Gorgonzola')
;INSERT INTO Toppings VALUES(18, 'Asparagus')
;INSERT INTO Toppings VALUES(19, 'Fried egg')
;INSERT INTO Toppings VALUES(20, 'Anchovies')
;INSERT INTO Toppings VALUES(21, 'Corn')
;INSERT INTO Toppings VALUES(22, 'Artichock')
;INSERT INTO Toppings VALUES(23, 'Seafood')
;INSERT INTO Toppings VALUES(24, 'Brokkoli')
;INSERT INTO Toppings VALUES(25, 'Anchovis')
;INSERT INTO Toppings VALUES(26, 'Parmesan')
;INSERT INTO Toppings VALUES(27, 'Goat cheese')
;


CREATE TABLE PizzaToppings
(Pizza# INTEGER NOT NULL,
 Topping# INTEGER NOT NULL,
 UNIQUE (Pizza#, Topping#)
) PRIMARY INDEX(Pizza#);

INSERT INTO PizzaToppings VALUES(1, 1)
;INSERT INTO PizzaToppings VALUES(1, 2)
;INSERT INTO PizzaToppings VALUES(2, 1)
;INSERT INTO PizzaToppings VALUES(2, 2)
;INSERT INTO PizzaToppings VALUES(2, 3)
;INSERT INTO PizzaToppings VALUES(3, 1)
;INSERT INTO PizzaToppings VALUES(3, 2)
;INSERT INTO PizzaToppings VALUES(3, 16)
;INSERT INTO PizzaToppings VALUES(4, 1)
;INSERT INTO PizzaToppings VALUES(4, 2)
;INSERT INTO PizzaToppings VALUES(4, 4)
;INSERT INTO PizzaToppings VALUES(5, 1)
;INSERT INTO PizzaToppings VALUES(5, 2)
;INSERT INTO PizzaToppings VALUES(5, 13)
;INSERT INTO PizzaToppings VALUES(5, 16)
;INSERT INTO PizzaToppings VALUES(6, 1)
;INSERT INTO PizzaToppings VALUES(6, 2)
;INSERT INTO PizzaToppings VALUES(6, 4)
;INSERT INTO PizzaToppings VALUES(6, 11)
;INSERT INTO PizzaToppings VALUES(6, 22)
;INSERT INTO PizzaToppings VALUES(7, 1)
;INSERT INTO PizzaToppings VALUES(7, 2)
;INSERT INTO PizzaToppings VALUES(7, 4)
;INSERT INTO PizzaToppings VALUES(7, 6)
;INSERT INTO PizzaToppings VALUES(7, 16)
;INSERT INTO PizzaToppings VALUES(8, 1)
;INSERT INTO PizzaToppings VALUES(8, 2)
;INSERT INTO PizzaToppings VALUES(8, 8)
;INSERT INTO PizzaToppings VALUES(8, 9)
;INSERT INTO PizzaToppings VALUES(8, 12)
;INSERT INTO PizzaToppings VALUES(8, 15)
;INSERT INTO PizzaToppings VALUES(8, 16)
;INSERT INTO PizzaToppings VALUES(8, 23)
;INSERT INTO PizzaToppings VALUES(9, 1)
;INSERT INTO PizzaToppings VALUES(9, 2)
;INSERT INTO PizzaToppings VALUES(9, 5)
;INSERT INTO PizzaToppings VALUES(9, 6)
;INSERT INTO PizzaToppings VALUES(9, 7)
;INSERT INTO PizzaToppings VALUES(9, 8)
;INSERT INTO PizzaToppings VALUES(9, 9)
;INSERT INTO PizzaToppings VALUES(9, 14)
;INSERT INTO PizzaToppings VALUES(9, 18)
;INSERT INTO PizzaToppings VALUES(10, 1)
;INSERT INTO PizzaToppings VALUES(10, 2)
;INSERT INTO PizzaToppings VALUES(10, 5)
;INSERT INTO PizzaToppings VALUES(10, 7)
;INSERT INTO PizzaToppings VALUES(10, 9)
;INSERT INTO PizzaToppings VALUES(10, 10)
;INSERT INTO PizzaToppings VALUES(11, 1)
;INSERT INTO PizzaToppings VALUES(11, 2)
;INSERT INTO PizzaToppings VALUES(11, 7)
;INSERT INTO PizzaToppings VALUES(11, 11)
;INSERT INTO PizzaToppings VALUES(12, 1)
;INSERT INTO PizzaToppings VALUES(12, 2)
;INSERT INTO PizzaToppings VALUES(12, 3)
;INSERT INTO PizzaToppings VALUES(12, 4)
;INSERT INTO PizzaToppings VALUES(13, 1)
;INSERT INTO PizzaToppings VALUES(13, 2)
;INSERT INTO PizzaToppings VALUES(13, 16)
;INSERT INTO PizzaToppings VALUES(13, 17)
;INSERT INTO PizzaToppings VALUES(13, 24)
;INSERT INTO PizzaToppings VALUES(14, 1)
;INSERT INTO PizzaToppings VALUES(14, 2)
;INSERT INTO PizzaToppings VALUES(14, 10)
;INSERT INTO PizzaToppings VALUES(14, 19)
;INSERT INTO PizzaToppings VALUES(14, 20)
;INSERT INTO PizzaToppings VALUES(14, 21)
;INSERT INTO PizzaToppings VALUES(15, 1)
;INSERT INTO PizzaToppings VALUES(15, 2)
;INSERT INTO PizzaToppings VALUES(15, 17)
;INSERT INTO PizzaToppings VALUES(15, 26)
;INSERT INTO PizzaToppings VALUES(15, 27)
;INSERT INTO PizzaToppings VALUES(16, 1)
;INSERT INTO PizzaToppings VALUES(16, 2)
;INSERT INTO PizzaToppings VALUES(16, 4)
;INSERT INTO PizzaToppings VALUES(16, 5)
;INSERT INTO PizzaToppings VALUES(16, 16)
;INSERT INTO PizzaToppings VALUES(17, 1)
;INSERT INTO PizzaToppings VALUES(17, 2)
;INSERT INTO PizzaToppings VALUES(17, 4)
;INSERT INTO PizzaToppings VALUES(17, 6)
;INSERT INTO PizzaToppings VALUES(17, 16)
;


REPLACE VIEW PizzaView AS
SELECT
  P.Pizza#
 ,P.PizzaName
 ,T.Topping
FROM
  Pizzas P
JOIN
  PizzaToppings PT
ON P.Pizza# = PT.Pizza#
JOIN
  Toppings T ON PT.Topping# = T.Topping#
;



/***
1. Return all pizzas which are a superset of the searched toppings.

*At least* ('tomatoes', 'mozzarella', 'salami') and maybe additional toppings:

Salami, Primavera
***/

/*** 1. ***/
SELECT
  Pizza#
  ,PizzaName
  ,COUNT(*) AS #Toppings
FROM
  PizzaView
WHERE
  Topping IN ('tomatoes', 'mozzarella', 'salami')
GROUP BY 1,2
HAVING COUNT(*) = 3
;


/***
2. Return all pizzas which are a subset of the searched toppings.

*At most* toppings ('tomatoes', 'mozzarella', 'salami'), but no other toppings:

Salami, Margherita
***/

/*** 2. ***/
SELECT
  Pizza#
  ,PizzaName
  ,COUNT(*) AS #Toppings
FROM
  PizzaView
GROUP BY 1,2
HAVING
  SUM(CASE WHEN Topping IN ('tomatoes', 'mozzarella', 'salami') THEN 0 ELSE 1 END) = 0
ORDER BY #Toppings DESC
;


/***
3. Return all pizzas which are made of exactly the searched toppings.

*All toppings* ('tomatoes', 'mozzarella', 'salami'), but no other toppings

Salami
***/

/*** 3. ***/
SELECT
  Pizza#
  ,PizzaName
  ,COUNT(*) AS #Toppings
FROM
  PizzaView
GROUP BY 1,2
HAVING
  SUM(CASE WHEN Topping IN ('tomatoes', 'mozzarella', 'salami') THEN 1 ELSE -1 END) = 3
ORDER BY #Toppings
;


/***
4. Return all pizzas which are a superset of the searched toppings.

*At least* toppings ('tomatoes' and 'mozzarella') and ('olives' or 'capers')

Diavola, Fantasia, Marinara, Vegetaria
***/

/*** 4. ***/
SELECT
  Pizza#
  ,PizzaName
  ,COUNT(*) AS #Toppings
  ,SUM(CASE WHEN Topping IN ('olives', 'capers') THEN 1 ELSE 0 END) AS #Optional
FROM PizzaView
GROUP BY 1,2
HAVING
  SUM(CASE WHEN Topping IN ('tomatoes', 'mozzarella') THEN 1 ELSE 0 END) = 2
AND
  #Optional >= 1
ORDER BY 4 DESC
;


/***
5. Return all pizzas which are a superset of the searched toppings.

*At least* toppings ('tomatoes' and 'olives') and maybe additional toppings, but no 'capers'

Marinara, Vegetaria
***/

/*** 5. ***/
SELECT
  Pizza#
  ,PizzaName
  ,COUNT(*) AS #Toppings
FROM
  PizzaView
GROUP BY 1,2
HAVING
  SUM(CASE
        WHEN Topping IN ('tomatoes', 'olives') THEN 1
        WHEN Topping IN ('capers') THEN -1
        ELSE 0
      END) = 2
ORDER BY #Toppings DESC
;




/*** Instead of a list of toppings a table with searched toppings
***/
CREATE SET TABLE searched
(grp INTEGER NOT NULL,
 topping VARCHAR(30) NOT NULL
);

DELETE FROM searched;

INSERT INTO searched VALUES(1,'tomatoes');
INSERT INTO searched VALUES(1,'mozzarella');
INSERT INTO searched VALUES(1,'salami');

/*** 1. ***/
SELECT
  Pizza#
  ,PizzaName
  ,COUNT(*) AS #Toppings
FROM
  PizzaView p
JOIN searched g
ON p.Topping = g.Topping
GROUP BY 1,2
HAVING
  COUNT(*) = (SELECT COUNT(*) FROM searched)
;


/*** 2. ***/
SELECT
  Pizza#
 ,PizzaName
 ,COUNT(*) AS #Toppings
FROM
  PizzaView p
LEFT JOIN searched g
ON p.Topping = g.Topping
GROUP BY 1,2
HAVING
  COUNT(*) = COUNT(g.Topping)
;


/*** 3. ***/
SELECT
  Pizza#
 ,PizzaName
 ,COUNT(*) AS #Toppings
FROM    
  PizzaView p
LEFT JOIN searched g
ON p.Topping = g.Topping
GROUP BY 1,2
HAVING
  SUM(CASE WHEN g.Topping IS NOT NULL THEN 1 ELSE -1 END) 
  = (SELECT COUNT(*) FROM searched)
;

I never needed to do #4/#5 with that searched table, but it should be possible using the logic above.
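One possible way to do #4 with the searched table is sketched below, under several assumptions that are not in the original lab: the otherwise unused grp column is given a meaning (1 = every topping required, 2 = at least one topping required), the view is replaced by a flattened `pizza_view` table holding only four of the pizzas, names are stored in lower case, and SQLite is used instead of Teradata (so the `#` identifiers are renamed).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Flattened stand-in for PizzaView (assumption: small subset of the data).
    CREATE TABLE pizza_view (
        pizza_name TEXT NOT NULL,
        topping    TEXT NOT NULL,
        UNIQUE (pizza_name, topping)
    );
    INSERT INTO pizza_view VALUES
        ('Margherita','tomatoes'),('Margherita','mozzarella'),
        ('Salami','tomatoes'),('Salami','mozzarella'),('Salami','salami'),
        ('Diavola','tomatoes'),('Diavola','mozzarella'),('Diavola','chillies'),
        ('Diavola','onions'),('Diavola','olives'),('Diavola','capers'),
        ('Fantasia','tomatoes'),('Fantasia','mozzarella'),('Fantasia','capers'),
        ('Fantasia','fried egg'),('Fantasia','anchovies'),('Fantasia','corn');

    -- Assumed meaning of grp: 1 = all required, 2 = at least one required.
    CREATE TABLE searched (
        grp     INTEGER NOT NULL,
        topping TEXT NOT NULL,
        UNIQUE (grp, topping)
    );
    INSERT INTO searched VALUES
        (1,'tomatoes'),(1,'mozzarella'),(2,'olives'),(2,'capers');
""")

# ('tomatoes' and 'mozzarella') and ('olives' or 'capers')
QUERY_4 = """
    SELECT p.pizza_name
    FROM pizza_view p
    JOIN searched g ON p.topping = g.topping
    GROUP BY p.pizza_name
    HAVING SUM(CASE WHEN g.grp = 1 THEN 1 ELSE 0 END)
             = (SELECT COUNT(*) FROM searched WHERE grp = 1)
       AND SUM(CASE WHEN g.grp = 2 THEN 1 ELSE 0 END) >= 1
    ORDER BY p.pizza_name
"""
matches = [row[0] for row in conn.execute(QUERY_4)]
```

On this reduced data set the query returns Diavola and Fantasia; Margherita and Salami are dropped because they contain neither olives nor capers. The same pattern maps to the course problem if a group stands for one OR-group of prerequisites.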

I would model this slightly differently than both Kuba and Dmitry have suggested, although both have provided the general framework to write this answer.

I apologize in advance: I'm going to break your existing model's language, since I don't feel that "Art" and "Psychology" are your Prerequisites. They are simply courses which combine to form a Prerequisite entity. So I've renamed this table Subjects.

All data models can be described as entities and relationships that can be described without an actual physical database. In this case, you have this one challenging entity, the Prerequisite. As an entity it is represented by the subjects within it and the number of subjects required from its set of courses. This would fit nicely with a course catalog where you could say for a given course, what each of its prerequisites are in a single line per prerequisite ("Art, Psychology - 1 required", "Art and Psychology - all required", etc.)

The first question is:

  • Is the number of subjects required from a given prerequisite sufficiently unique and idiosyncratic to each course, or does "Art or Psychology - 1 required" apply to a large number of courses? Does it change often per each course, or is it relatively static?

If it applies to a large number of courses or is relatively static, it should sit in the Prerequisite table. If each set of prerequisites for a course is relatively dynamic, it should sit in the CoursePrerequisite table. For now I will assume the former.

The actual subjects are a many-to-many relationship (each subject can be part of many prerequisites, each prerequisite can have many subjects) and should be modeled in a cross-reference table therein.

From here it is obvious that "Art and Psychology - choose 1" and "Art and Psychology - both required" are unique entities. So I would distinctly determine each possible set of prerequisite subjects including the number of subjects required.

Prerequisite
----------------------
PrerequisiteID
NumberOfSubjectsRequired

Subject
---------------
SubjectID
Name

PrerequisiteSubject
--------------------
PrerequisiteSubjectID
PrerequisiteID
SubjectID

Course
------
CourseID

CoursePrerequisite
------------------
CoursePrerequisiteID
PrerequisiteID
CourseID

Notice how this improves (if I may be so bold) on Dmitry's model by ensuring a distinct list of prerequisite subject combinations and allowing prerequisites such as "Art or Psychology - choose 1" to be reused among all courses. This is (in my opinion based on my understanding of your data model) the proper modeling of a prerequisite. Consider the scenario where "Art and Psychology - choose 1" is modified to also include Speech for all courses. Here you could insert one row in one place (the PrerequisiteSubject table) and it would apply to all courses therein without disturbing any of the other prerequisites.

Another advantage is in the querying: for a given set of subjects, get the prerequisites that a given student would meet (assume SubjectsTaken holds the subjects the student has taken):

  -- one row per prerequisite that shares at least one subject with SubjectsTaken
  select
        case when count(1) >= p.NumberOfSubjectsRequired then 1 else 0 end as PrerequisiteMet,
        p.PrerequisiteID
    from
        SubjectsTaken st
        inner join Subject s
            on s.Name = st.Name
        inner join PrerequisiteSubject ps
            on ps.SubjectID = s.SubjectID
        inner join Prerequisite p
            on p.PrerequisiteID = ps.PrerequisiteID
    group by
        p.PrerequisiteID, p.NumberOfSubjectsRequired

And then the courses that student could take:

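-- PrerequisitesMet is the result of the previous query (e.g. stored as a view)
select
    cp.CourseID
from PrerequisitesMet pm
right join
    CoursePrerequisite cp
    on cp.PrerequisiteID = pm.PrerequisiteID
group by
    cp.CourseID
-- every prerequisite of the course must be met
having sum(pm.PrerequisiteMet) >= count(1)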
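The two steps can be tried together with the minimal sketch below. Everything in it beyond the table and column names listed earlier is an assumption for illustration: the sample subjects, prerequisites and course IDs are made up, surrogate key columns are omitted, the first query is expressed as a common table expression, and SQLite is used with a LEFT JOIN from CoursePrerequisite in place of the RIGHT JOIN.

```python
import sqlite3

SCHEMA = """
    CREATE TABLE Prerequisite (PrerequisiteID INTEGER PRIMARY KEY,
                               NumberOfSubjectsRequired INTEGER NOT NULL);
    CREATE TABLE Subject (SubjectID INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
    CREATE TABLE PrerequisiteSubject (PrerequisiteID INTEGER NOT NULL,
                                      SubjectID INTEGER NOT NULL,
                                      UNIQUE (PrerequisiteID, SubjectID));
    CREATE TABLE CoursePrerequisite (CourseID INTEGER NOT NULL,
                                     PrerequisiteID INTEGER NOT NULL,
                                     UNIQUE (CourseID, PrerequisiteID));
    CREATE TABLE SubjectsTaken (Name TEXT PRIMARY KEY);

    -- Hypothetical sample data.
    INSERT INTO Subject VALUES (1,'Art'),(2,'Psychology'),(3,'Math');
    -- 1: Art or Psychology, 1 required; 2: Art and Psychology, both; 3: Math
    INSERT INTO Prerequisite VALUES (1,1),(2,2),(3,1);
    INSERT INTO PrerequisiteSubject VALUES (1,1),(1,2),(2,1),(2,2),(3,3);
    -- Course 101 needs prerequisites 1 and 3; 102 needs 2; 103 needs 1.
    INSERT INTO CoursePrerequisite VALUES (101,1),(101,3),(102,2),(103,1);
"""

QUERY = """
    WITH PrerequisitesMet AS (
        SELECT p.PrerequisiteID,
               CASE WHEN COUNT(*) >= p.NumberOfSubjectsRequired
                    THEN 1 ELSE 0 END AS PrerequisiteMet
        FROM SubjectsTaken st
        JOIN Subject s ON s.Name = st.Name
        JOIN PrerequisiteSubject ps ON ps.SubjectID = s.SubjectID
        JOIN Prerequisite p ON p.PrerequisiteID = ps.PrerequisiteID
        GROUP BY p.PrerequisiteID, p.NumberOfSubjectsRequired
    )
    SELECT cp.CourseID
    FROM CoursePrerequisite cp
    LEFT JOIN PrerequisitesMet pm ON pm.PrerequisiteID = cp.PrerequisiteID
    GROUP BY cp.CourseID
    HAVING COALESCE(SUM(pm.PrerequisiteMet), 0) >= COUNT(*)
    ORDER BY cp.CourseID
"""


def eligible_courses(subjects_taken):
    """Return the IDs of courses whose prerequisites are all met."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO SubjectsTaken VALUES (?)",
                     [(name,) for name in subjects_taken])
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

With this sample data, a student who has taken Art and Math meets prerequisites 1 and 3 and can take courses 101 and 103, while course 102 still needs Psychology. Extending prerequisite 1 with another subject would be a single insert into PrerequisiteSubject and would apply to both courses that reference it.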

Anyway, all of this modeling really depends on your "reusable" entities. It feels like a prerequisite should be a reusable entity, but I could be wrong.
