Cleveland heart disease dataset - can’t describe the class

难免孤独 2021-01-21 13:00

I’m using the Cleveland Heart Disease dataset from UCI for classification, but I don’t understand the target attribute.

The dataset description says that

3 Answers
  •  攒了一身酷
    2021-01-21 13:41

    It basically means that the presence of heart disease is denoted by 1, 2, 3, or 4, while its absence is denoted by 0. Most of the experiments that have been conducted on this dataset are based on binary classification, i.e. presence (1, 2, 3, 4) vs. absence (0). One reason for this might be the class imbalance problem (class 0 has about 160 samples, and classes 1, 2, 3, and 4 together make up the other half) combined with the small number of samples (only around 300 in total). So, given these constraints, it makes sense to treat this data as a binary classification problem instead of a multi-class one.
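
    As a minimal sketch of the binary relabeling described above, the target values can be collapsed so that 0 stays 0 (absence) and 1 through 4 become 1 (presence). The function name `binarize_target` and the `sample` list are made up for illustration; the values are not real rows from the dataset.

```python
from collections import Counter


def binarize_target(labels):
    """Map 0 -> 0 (absence) and any of 1, 2, 3, 4 -> 1 (presence)."""
    return [1 if int(value) > 0 else 0 for value in labels]


# Hypothetical target values for illustration only, not taken from the dataset.
sample = [0, 2, 1, 0, 0, 3, 4, 0, 1, 0]
binary = binarize_target(sample)

# Compare the class counts before and after relabeling.
print("original classes:", sorted(Counter(sample).items()))
print("binary classes:  ", sorted(Counter(binary).items()))
```

    Comparing the two counts on the real target column is a quick way to check how balanced the binary version is relative to the five-class version.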
