## 1. Principles of Collinearity

### 1.1 Perturbation Analysis

[ Perturbation Analysis Theorem ] Let a nonsingular square matrix A satisfy the equation
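Assuming this refers to the classical perturbation result for the linear system $Ax = b$, it can be sketched as follows: if $(A+\delta A)(x+\delta x) = b+\delta b$ and $\kappa(A)\,\lVert\delta A\rVert/\lVert A\rVert < 1$, then

$$
\frac{\lVert\delta x\rVert}{\lVert x\rVert} \;\le\; \frac{\kappa(A)}{1-\kappa(A)\dfrac{\lVert\delta A\rVert}{\lVert A\rVert}}\left(\frac{\lVert\delta A\rVert}{\lVert A\rVert}+\frac{\lVert\delta b\rVert}{\lVert b\rVert}\right), \qquad \kappa(A)=\lVert A\rVert\,\lVert A^{-1}\rVert.
$$

A large condition number $\kappa(A)$ means small perturbations in the data can be greatly amplified in the solution, which is exactly why `np.linalg.cond` is checked in the Python practice section below.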

## 2. Methods for Resolving Collinearity

### 2.4 LASSO Regression

LASSO regression is similar to ridge regression, except that the penalty term is changed from the L2 norm to the L1 norm.

The L1 norm is not as smooth as the L2 norm, since it has non-differentiable points, and under the L1 penalty LASSO no longer admits a closed-form solution. Compared with ridge regression, however, the LASSO estimates can shrink to exactly zero much more easily.
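To see the zeroing behavior concretely, here is a minimal sketch (the toy data and the `alpha` values are illustrative, not from the original post): with two nearly duplicate features, Lasso tends to drop one of them, while ridge keeps both with shrunken weights.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Toy data: feature 2 is a near-copy of feature 0 (illustrative setup)
rng = np.random.RandomState(0)
X = rng.rand(100, 3)
X[:, 2] = X[:, 0] + 0.01 * rng.randn(100)
y = 3 * X[:, 0] + 2 * X[:, 1] + 0.1 * rng.randn(100)

lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty: produces exact zeros
ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty: shrinks but keeps all features
print("Lasso coefficients:", lasso.coef_)
print("Ridge coefficients:", ridge.coef_)
```

Generically, Lasso selects one member of the collinear pair and zeroes the other, whereas ridge splits the weight between them.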

### 2.5 ElasticNet Regression and Others

ElasticNet regression combines the L1 and L2 penalty terms:
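In scikit-learn's parameterization the combined penalty is `alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2`, with `l1_ratio` interpolating between ridge (0) and Lasso (1). A minimal sketch (toy data and parameter values are illustrative, not from the post):

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# Toy data: coefficients 3 and 4 are truly zero
rng = np.random.RandomState(0)
X = rng.rand(100, 5)
y = X @ np.array([1.0, 2.0, 0.0, 0.0, 3.0]) + 0.1 * rng.randn(100)

# l1_ratio=0.5 gives an even mix of L1 and L2 penalties
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print(enet.coef_)
```

The L1 part still encourages exact zeros on the irrelevant features, while the L2 part stabilizes the estimates when features are correlated.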

## 3. Python Practice

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

coef0 = np.array([5, 6, 7, 8, 9, 10, 11, 12])
X1 = np.random.rand(100, 8)
y = np.dot(X1, coef0) + np.random.normal(0, 1.5, size=100)
training = np.random.choice([True, False], p=[0.8, 0.2], size=100)

lr1 = LinearRegression()
lr1.fit(X1[training], y[training])
# MSE of the coefficients
print(((lr1.coef_ - coef0) ** 2).sum() / 8)
# test-set accuracy (R2)
print(lr1.score(X1[~training], y[~training]))
# mean cross-validated accuracy
print(cross_val_score(lr1, X1, y, cv=5).mean())
```

```python
X2 = np.column_stack([X1, np.dot(X1[:, [0, 1]], np.array([1, 1])) + np.random.normal(0, 0.05, size=100)])
X2 = np.column_stack([X2, np.dot(X2[:, [1, 2, 3]], np.array([1, 1, 1])) + np.random.normal(0, 0.05, size=100)])
X3 = np.column_stack([X1, np.random.rand(100, 2)])
```

```python
>>> print(np.linalg.cond(X1))
6.29077685383
>>> print(np.linalg.cond(X2))
110.930612408
>>> print(np.linalg.cond(X3))
7.25066276479
```

```python
from sklearn.model_selection import cross_val_score  # replaces the removed sklearn.cross_validation module

lr2 = LinearRegression()
lr2.fit(X2[training], y[training])
# MSE of the coefficients
print(((lr2.coef_[:8] - coef0) ** 2).sum() / 8)
# test-set accuracy (R2)
print(lr2.score(X2[~training], y[~training]))
# mean cross-validated accuracy
print(cross_val_score(lr2, X2, y, cv=5).mean())

lr3 = LinearRegression()
lr3.fit(X3[training], y[training])
# MSE of the coefficients
print(((lr3.coef_[:8] - coef0) ** 2).sum() / 8)
# test-set accuracy (R2)
print(lr3.score(X3[~training], y[~training]))
# mean cross-validated accuracy
print(cross_val_score(lr3, X3, y, cv=5).mean())
```

```python
>>> print(lr2.coef_)
[ 10.506618    11.467777     6.35562175   7.56698262   9.44509206
   9.81032939  11.66187822  12.29728702  -5.07439399   0.02649089]
```

```python
import matplotlib.pyplot as plt

clf = LinearRegression()  # regress each feature on the others to get its R2
vif2 = np.zeros((10, 1))
for i in range(10):
    tmp = [k for k in range(10) if k != i]
    clf.fit(X2[:, tmp], X2[:, i])
    vif2[i] = 1 / (1 - clf.score(X2[:, tmp], X2[:, i]))

vif3 = np.zeros((10, 1))
for i in range(10):
    tmp = [k for k in range(10) if k != i]
    clf.fit(X3[:, tmp], X3[:, i])
    vif3[i] = 1 / (1 - clf.score(X3[:, tmp], X3[:, i]))

plt.figure()
ax = plt.gca()
ax.plot(vif2)
ax.plot(vif3)
plt.xlabel('feature')
plt.ylabel('VIF')
plt.title('VIF coefficients of the features')
plt.axis('tight')
plt.show()
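The per-feature loop above can be wrapped into a reusable helper. A minimal sketch (the `vif` function and the toy matrices `A` and `B` here are illustrative, not from the original post):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def vif(X):
    """VIF of each column: 1 / (1 - R2) from regressing it on the remaining columns."""
    X = np.asarray(X)
    out = np.empty(X.shape[1])
    for i in range(X.shape[1]):
        rest = np.delete(X, i, axis=1)
        r2 = LinearRegression().fit(rest, X[:, i]).score(rest, X[:, i])
        out[i] = 1.0 / (1.0 - r2)
    return out

# Toy example: the last column of B is (almost) the sum of the first two
rng = np.random.RandomState(0)
A = rng.rand(100, 3)
B = np.column_stack([A, A[:, 0] + A[:, 1] + 0.01 * rng.randn(100)])
print(vif(A))  # independent columns: VIF near 1
print(vif(B))  # collinear columns: VIF blows up
```

A common rule of thumb flags VIF values above 10 as signs of serious collinearity.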

```python
from sklearn.linear_model import Ridge

n_alphas = 20
alphas = np.logspace(-1, 4, num=n_alphas)
coefs = []
for a in alphas:
    ridge = Ridge(alpha=a, fit_intercept=False)
    ridge.fit(X2, y)
    coefs.append(ridge.coef_)

plt.figure()
ax = plt.gca()
ax.plot(alphas, coefs)
ax.set_xscale('log')
plt.legend(labels=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
plt.xlabel('alpha')
plt.ylabel('weights')
plt.title('Ridge coefficients as a function of the regularization')
plt.axis('tight')
plt.show()
```

```python
>>> print(coefs[0])
[  2.70748655   0.95748918   3.53687372   5.2073456    8.70186695
   9.84484102  10.67351759  11.74614246   2.46502016   3.19919212]
```
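Instead of picking `alpha` off the ridge trace by eye, it can be chosen by cross-validation. A sketch assuming scikit-learn's `RidgeCV` (the data generation below mirrors the post's setup but is regenerated here, so the numbers are not identical):

```python
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.RandomState(0)
coef0 = np.array([5, 6, 7, 8, 9, 10, 11, 12], dtype=float)
X = rng.rand(100, 8)
# add a near-duplicate column to induce collinearity, as in the post
X2 = np.column_stack([X, X[:, 0] + X[:, 1] + 0.05 * rng.randn(100)])
y = X @ coef0 + rng.normal(0, 1.5, size=100)

# RidgeCV evaluates each candidate alpha by cross-validation and keeps the best
ridge_cv = RidgeCV(alphas=np.logspace(-2, 3, 30)).fit(X2, y)
print("chosen alpha:", ridge_cv.alpha_)
print("coefficients:", ridge_cv.coef_)
```

`LassoCV` and `ElasticNetCV` offer the same pattern for the L1 and mixed penalties.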
