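I put together an informal set of birth dates and occupations of some famous people and grouped the occupations into seven broad categories: sports, military, performing, literature, politics, art, and socio-economic. The CSV table looks roughly like this:
Code: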
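As a rough sketch of a layout that would be consistent with how the code reads the file: the column names (`name`, `year`, `month`, `day`, `category`), the numeric category codes, and every row below are assumptions made up for illustration, not the data used in this post. The point is only to show which positions `iloc` refers to once `index_col=0` has turned the first CSV column into the index.

```python
import io

import pandas as pd

# Hypothetical rows: names, dates and category codes are placeholders,
# not the data used in this post.
csv_text = """name,year,month,day,category
person_a,1970,3,14,0
person_b,1985,11,2,3
person_c,1992,7,28,6
"""

# Same read_csv arguments as in the post: the first row is the header
# and the first column becomes the index.
sample = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)

# After index_col=0, positions count from the column after the index:
# 0 = year, 1 = month, 2 = day, 3 = category (in this assumed layout).
months = sample.iloc[:, 1].values
days = sample.iloc[:, 2].values
categories = sample.iloc[:, 3].values

print(months, days, categories)
```

If the real file orders its columns differently, the positions passed to `iloc` have to change accordingly.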
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
import os
import mglearn
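# Use a font with Chinese glyphs so the legend labels render correctly
plt.rcParams['font.sans-serif']=['SimHei']
# Keep minus signs on the axes readable when a non-default font is active
plt.rcParams['axes.unicode_minus']=False
# Raw string, so the backslashes in the Windows path are not read as escape sequences
os.chdir(r"D:\PyProjects\MyProject\dataset")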
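dataset = pd.read_csv('test.csv', header=0, index_col=0)
# read_csv already returns a DataFrame, so no extra pd.DataFrame() wrapper is needed
data_pandas = dataset
# display() is available in Jupyter/IPython sessions
display(data_pandas)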
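# Column positions are counted after index_col=0 has turned the first CSV column into the index
M = dataset.iloc[:, 1].values  # plotted on the x axis (month)
D = dataset.iloc[:, 2].values  # plotted on the y axis (day of the month)
# One marker style per distinct value in the category column
mglearn.discrete_scatter(M, D, dataset.iloc[:, 3].values)
# Legend entries are matched to the categories in plotting order, so this list must follow that order
plt.legend(["体育", "军事", "表演", "文学", "政治", "艺术", "社会经济"], loc=4, bbox_to_anchor=(1.2, 0))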
plt.xlabel("Month")
plt.ylabel("Date")
This gives the following result:
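For readers who do not have `mglearn` installed, a similar per-category scatter plot can be sketched with plain matplotlib. This is not the `mglearn` implementation, only an approximation of what the call above is used for here; the helper name `scatter_by_category` and the sample points are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np


def scatter_by_category(x, y, labels, names=None, ax=None):
    """Draw one scatter series per distinct label, in sorted label order."""
    x, y, labels = np.asarray(x), np.asarray(y), np.asarray(labels)
    if ax is None:
        ax = plt.gca()
    handles = []
    # np.unique returns the distinct labels sorted, which fixes the legend order
    for i, value in enumerate(np.unique(labels)):
        mask = labels == value
        name = names[i] if names is not None else str(value)
        handles.append(ax.scatter(x[mask], y[mask], label=name))
    return handles


# Placeholder points, not the data used in this post
months = [3, 11, 7, 3]
days = [14, 2, 28, 9]
cats = [0, 3, 6, 0]

fig, ax = plt.subplots()
handles = scatter_by_category(months, days, cats, ax=ax)
ax.set_xlabel("Month")
ax.set_ylabel("Date")
ax.legend(loc=4, bbox_to_anchor=(1.2, 0))
# Call plt.show() or fig.savefig(...) to view the figure
```

Passing `names` lets the legend show readable category names instead of raw codes; the list has to be in the same sorted order as the distinct label values.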