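The K-Nearest Neighbor algorithm, or KNN for short, is one of the simplest predictive models. It makes few mathematical assumptions and needs no complicated processing. All it requires is the following two things: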

1. Some notion of distance

2. The assumption that points close to one another have similar properties
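To make the first requirement concrete, here is a minimal sketch of one possible distance function. The text does not name a specific metric, so the choice of Euclidean distance and the function name `euclidean_distance` are assumptions made for illustration.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two points given as equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((a_i - b_i) ** 2 for a_i, b_i in zip(a, b)))


print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```

Any other function that measures how far apart two samples are (Manhattan distance, for example) could be swapped in without changing the rest of the algorithm.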

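KNN relies only on a small number of points near the point being predicted and deliberately ignores most of the samples in the dataset. It also does not help us understand the mechanisms and principles behind the phenomenon being modeled.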

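$y = \arg\max_{v} \sum_{(x_{i},y_{i})\in D_{z}} I(v=y_{i})$

Here $v$ ranges over the class labels, $D_{z}$ is the set of the k training samples nearest to the point $z$ being classified, and $I(\cdot)$ is the indicator function, equal to 1 when its argument is true and 0 otherwise.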
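The majority-vote formula above can be sketched in a few lines of plain Python. This is an illustrative sketch, not a reference implementation: the function name `knn_predict`, the toy data, and the use of Euclidean distance are all assumptions. Ties are resolved in favor of the label encountered first among the nearest neighbors.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, z, k=3):
    """Predict the label of z by majority vote among its k nearest neighbors."""
    # Distance from z to every training point (Euclidean is an assumption here).
    scored = []
    for x_i, y_i in zip(train_points, train_labels):
        distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, z)))
        scored.append((distance, y_i))
    # D_z: the k training samples closest to z.
    d_z = sorted(scored, key=lambda pair: pair[0])[:k]
    # Count I(v = y_i) for every label v, then take the label with the most votes.
    votes = Counter(y_i for _, y_i in d_z)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three points of class "a", two of class "b".
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9)]
labels = ["a", "a", "a", "b", "b"]
print(knn_predict(points, labels, (1.1, 1.0), k=3))  # a
print(knn_predict(points, labels, (5.1, 5.0), k=3))  # b
```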

Choosing the value of k

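$x' = \frac{x - \text{Min}}{\text{Max} - \text{Min}}$

This is min-max normalization, where Min and Max are the smallest and largest values of the feature in the dataset. It rescales every feature to the range [0, 1].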
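The normalization formula matters for KNN because distances are dominated by features with large numeric ranges (in the wine data, proline is on the order of 1000 while hue is close to 1). Below is a small sketch of the formula applied to one feature column. The function name `min_max_scale`, the sample values, and the handling of a constant column are assumptions for illustration.

```python
def min_max_scale(column):
    """Rescale a list of numbers to [0, 1] using (x - Min) / (Max - Min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant feature: Max - Min is zero, so map every value to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical feature values with a wide range.
print(min_max_scale([200.0, 400.0, 1000.0]))  # [0.0, 0.25, 1.0]
```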

KNN typically uses one of two classification rules: equal voting, where every neighbor counts the same, or weighted voting.
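The difference between the two voting rules can be seen on a tiny example. The sketch below assumes inverse-distance weights (each neighbor votes with weight 1/distance), which is one common scheme but is not specified in the text. The function names and the neighbor list are hypothetical.

```python
from collections import defaultdict


def equal_vote(neighbors):
    """Every neighbor contributes one vote to its label."""
    counts = defaultdict(int)
    for _, label in neighbors:
        counts[label] += 1
    return max(counts, key=counts.get)


def weighted_vote(neighbors, eps=1e-9):
    """Each neighbor votes with weight 1 / distance (eps avoids division by zero)."""
    scores = defaultdict(float)
    for distance, label in neighbors:
        scores[label] += 1.0 / (distance + eps)
    return max(scores, key=scores.get)


# Hypothetical (distance, label) pairs for the k = 3 nearest neighbors.
neighbors = [(0.1, "b"), (2.0, "a"), (2.5, "a")]
print(equal_vote(neighbors))     # a
print(weighted_vote(neighbors))  # b
```

In scikit-learn, `KNeighborsClassifier` exposes the same choice through its `weights` parameter, with `weights="uniform"` for equal voting and `weights="distance"` for inverse-distance weighting.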

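from sklearn.datasets import load_wine
# Load the wine dataset bundled with scikit-learn
wine = load_wine()
# Names of the 13 features
print('wine features:')
print(wine.feature_names)
# Names of the 3 classes
print('wine target names:')
print(wine.target_names)
# (number of samples, number of features)
print('wine data shape:')
print(wine.data.shape)
# Feature values of the first sample
print('one wine data:')
print(wine.data[0])
# Class label of every sample
print('wine target:')
print(wine.target)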
wine features:
['alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash', 'magnesium', 'total_phenols', 'flavanoids', 'nonflavanoid_phenols', 'proanthocyanins', 'color_intensity', 'hue', 'od280/od315_of_diluted_wines', 'proline']
wine target names:
['class_0' 'class_1' 'class_2']
wine data shape:
(178, 13)
one wine data:
[1.423e+01 1.710e+00 2.430e+00 1.560e+01 1.270e+02 2.800e+00 3.060e+00
2.800e-01 2.290e+00 5.640e+00 1.040e+00 3.920e+00 1.065e+03]
wine target:
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2]

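from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import MinMaxScaler
wine = load_wine()
X = wine.data
y = wine.target
# Hold out 30% of the samples as a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=123)
# Normalize: fit the scaler on the training split only, then apply it to both splits
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Fit on the training split only, so the test samples stay unseen by the model
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
y_predict_train = knn.predict(X_train)
y_predict_test = knn.predict(X_test)
acc_train = accuracy_score(y_train, y_predict_train)
acc_test = accuracy_score(y_test, y_predict_test)
# Accuracies are printed as percentages
print("train set accuracy {:.2f}".format(100*acc_train))
print("test set accuracy {:.2f}".format(100*acc_test))

The accuracies below were recorded from a run that called knn.fit(X, y) on the full normalized dataset. Because the test samples were included in the fit, the test figure is optimistic:
train set accuracy 96.77
test set accuracy 98.15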