How to add new relations and making weight dependent in MySQL database?

问题

I am working on a health app and I'm designing a database (MySQL), which should store symptoms and diseases and bodyparts. The app will work as follows:

User chooses gender, age and bodypart (front/back)
App shows all (common/less common) symptoms of chosen bodypart
User chooses symptom
App shows all (common/less common) diseases of chosen symptom.
User chooses disease
App shows disease information

I got some help so far, but I still need some help finding a solution:

Making weight in symptom_disease dependent on the selected age and gender (order of the listed diseases should depend on the selected age and gender).
Some symptoms should have additional symptoms, that the user can choose as an extra. So for example, when the user chooses 'Head' -> 'Behavioral disturbances' (common), the app should display 2 extra checkboxes 'Depressed' and 'Drugs abuse'. The order of diseases list should depend on these inputs.

Note: The weight determines if a symptom/disease is common or less common. The common symptoms/diseases are listed above the less common symptoms/diseases.

回答1:

The weight determines if a symptom/disease is common or less common. The common symptoms/diseases are listed above the less common symptoms/diseases.

That is fine. A bit simplistic, but fine. That is ranking by weight or ORDER BY weight.

Making weight in symptom_disease dependent on the selected age and gender (order of the listed diseases should depend on the selected age and gender)

That tells me the result you want (which is also fine). But the data that is required to produce that result ...

weight per disease per symptom per age, or
weight per disease per symptom per age per gender

... is absent. You cannot produce something from nothing.

Concept

The whole concept is immature. (I am, of course, speaking about the party who gave you this job, not you, who is trying to implement it.) Eg. we know that:

white people who live in sun-drenched countries suffer melanoma, etc, that the locals don't
south-east Asians who live in Western society have more diabetes and heart disease than they do in their own countries
many diseases, eg. breast cancer, have a hereditary factor

The point is, the concept is very simplistic. When that simplistic approach is thought through carefully, hopefully by medically-trained and experienced professionals, they will come up with something that is much more indicative than a symptom_disease.weight, which then has to be modulated up or down based on age and gender. They will come up with many factors that influence weight, age and gender being just two such factors.

After the system is implemented, if not before, you are going to need:

many more factors or categories
that must each have either a weight value or a factor value
which (a) the user chooses, and (b) influences the end result (probability of the disease)

Further, during such resolution:

some of those factors might turn out to be a weight (another weight, at that factor level) rather than a factor
during that process, it may turn out that the symptom_disease.weight is actually an intermediate result, not a stored value at the symptom_disease level
that might lead to weight at the disease level, that is modulated by each factor.

Result Weight

Once the data is structured and organised, obtaining resulting weight based on the user's choice of symptoms, categories, and factors is easy. It is just a single SELECT against the database.

I would use Dynamic SQL to construct the query, and then execute it, because the alternative is to code every possible query, and select the one that is relevant to the user request.
However, for a starting point, for proof of concept, just code a few separate queries.

You don't need a function on the database side. On the code side, a function is a good idea, because it eliminates duplicate code.

Record ID

This creature poses many problems, and guarantees a non-database. Please read this Answer. You may find the whole post useful, if not, the minimum you need is from the top, up to and including False Teachers.

Likewise, prefixing the column names with a file identifier is an error. In SQL, you can qualify a column name, as and when required, with the file name. Instead of s_name and bp_name:

    SELECT  symptom.name,
            bodypart.name,
            ...

Data Model

The most important issue at this early stage, is to provide a database that is easy to extend, as per those expected changes, which means, pure Relational and adherence to Standards. You must place all data; rules about data; everything that relates to data, in the database (not in code).

Symptom Disease Data Model (Demonstrative)

In any case, right now, given what you have described, and considering my explanation above, we have two options. I have provided a separate data model for each.

1 Modulated Weight

Your client provides a weight per disease per symptom only, plus factors per age, gender, etc.

The factors will be used to modulate the weight.
Notice, the age Key is an age range, eg. 1-18; 19-25; 26-35; 35-55; etc.
Factor is a REAL between 0 and some reasonable maximum (eg. 100). Eg. 1.0 for age means the weight is unchanged with age; 0.5 means the weight is halved with age; 2.0 means the weight is doubled with age; etc.
The vectors or Dimensions are across the top. As and when new factors are identified (eg. Ethnicity), you can add another factor (Dimension), in the same way that age and gender have been added.

2 Specific Weight

Your client provides a specific weight per disease per symptom per age per gender, etc. There are no factors.

Discussion

In both models, when tables are added as and when new factors are identified, the core (five tables, top centre) do not change. However, in the second model, I have differentiated age and gender on this basis:

Gender is modelled as a definitive, a Dimension, and thus I have worked it through the core tables, as the Relational Key. Relational Keys are important for navigation (as in reducing it) and providing great power. Here in addition to that, I am saying gender is definitive.

The corollary is, you do not want definitive factors to be excluded from the Key of the core tables, because it will reduce navigation ability and speed.
Age is modelled as an additional Dimension, that can be added easily, after the core tables are stable, as an when new factors are identified.

One is not better than the other. The point I am making here is, and this applies to both models, but it is more visible in the Specific Weight model, the Dimensions that comprise the core tables need to be thought out carefully. Not because anything that you work into the Relational Keys are more difficult to change (and has an impact of existing code), but the other way around, that what is genuinely a definitive part of the core tables, that forms the core symptom::disease indicator, is stable. It must be determined and modelled early.

Eg. I don't understand why you have symptom as being a child of bodypart, I can imagine symptoms that are not specific to a bodypart, but I will leave that as is for now.
- The ugly-as-sin fix without changing the structure is to have a special bodypart indicating the whole body.
- Rather than { Head/Behavioural/...}, think about categories for symptom categories such as { Physical | Mental | Behaviour }. Or Behaviour { Smoking | Alcohol } may be a separate dimension that contributes to disease.
Eg. I can imagine symptoms that are specifically male or female only. I have modelled gender two ways, so that you can evaluate the difference, and the impact:
- Divorced and isolated in 1 Modulated Weight
- As a classifier (and therefore Identifier) of bodypart, symptom,and disease in 2 Specific Weight).

The point of all that is:

The definitive factors need to be exposed; evaluated; determined, early, such that the core tables are stable.
All factors that are relevant to disease diagnosis must be identified and implemented. Not merely Age and Gender, but Ethnicity; Hereditary diseases; Behaviour; Location; etc.

Please feel free to comment, ask questions, etc.

回答2:

There is a method you can use to break a spectrum of choices into categories according to spectrum units, in this case age.

create table Symptom_Disease(
    Sex       char( 1 ) not null,
    Age       smallint not null,
    Weight    smallint not null,
    BodyPart  int not null references BodyPart( BP_ID),
    SymptomID int not null references Symptoms( S_ID ),
    DiseaseID int not null references Disease( D_ID ),
    constraint PK_Symptom_Disease primary key( Sex, Age )
);

You have the sex, age, body part and symptom when you query:

select  sd.Sex, sd.Age, d.*, sd.weight
from    Symptom_Disease sd
join    Disease d
    on  d.D_ID = sd.D_ID
where   sd.Sex = :Sex
    and sd.BodyPart = :BodyPart
    and sd.Symptom = :Symptom
    and sd.Age =(
            select  Max( Age )
            from    Symptom_Disease
            where   Sex = sd.Sex
                and BodyPart = sd.BodyPart
                and Symptom = sd.Symptom
                and Age <= :Age )
order by weight;

For example, you might have entries (of various weights) for ages 0, 2, 12, 21, 35 and 70. The ages represent the lower cutoff for the age group. So the first record is for ages 2 or less, the second record is for ages 12 and younger down to but not including 2 and so forth. If your patient's age is 25, this query will return all the diseases of each weight for each set of (Sex, Body Part, Symptom) for age 21 -- it being the highest age less than or equal to the patient's age.

Just change sd.Symptom = :Symptom to sd.Symptom in( :S1, :S2,...) to allow for multiple symptoms. If so, you might want to add symptom to the order by clause.

来源：https://stackoverflow.com/questions/30374031/how-to-add-new-relations-and-making-weight-dependent-in-mysql-database

标签

mysql

database

database-design

relational-database

database-schema