## Linear Regression API

```
linear_model.LinearRegression(*[, ...])         Ordinary least squares linear regression.
linear_model.Ridge([alpha, fit_intercept, ...]) Linear least squares with l2 regularization.
linear_model.RidgeCV([alphas, ...])             Ridge regression with built-in cross-validation.
linear_model.SGDRegressor([loss, penalty, ...]) Linear model fitted by minimizing a regularized empirical loss with SGD.
```

### Using ordinary least squares (`LinearRegression`) as an example of the sklearn linear regression API

`class sklearn.linear_model.LinearRegression(*, fit_intercept=True, normalize='deprecated', copy_X=True, n_jobs=None, positive=False)`

Parameters:

fit_intercept: bool, default=True
Whether to calculate the intercept for this model.

normalize: bool, default=False
Deprecated since scikit-learn 1.0; standardize the data with preprocessing.StandardScaler instead.

copy_X: bool, default=True
If True, X is copied; otherwise it may be overwritten.

n_jobs: int, default=None
The number of jobs to use for the computation.

positive: bool, default=False
When set to True, forces the coefficients to be positive.

Attributes:

coef_: array of shape (n_features,) or (n_targets, n_features)
Estimated coefficients of the linear model.

rank_: int
Rank of matrix X. Only available when X is dense.

singular_: array of shape (min(X, y),)
Singular values of X. Only available when X is dense.

intercept_: float or array of shape (n_targets,)
Independent term (intercept) in the linear model.

n_features_in_: int
Number of features seen during fit.

feature_names_in_: ndarray of shape (n_features_in_,)
Names of features seen during fit. Defined only when X has feature names that are all strings.
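A minimal usage sketch (the toy data here is illustrative, not from the original): fit `LinearRegression` on an exactly linear target and read back the fitted attributes listed above.

```python
# Fit ordinary least squares on y = 2*x0 + 3*x1 + 1 and inspect
# coef_, intercept_, and rank_.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[0, 0], [1, 1], [2, 2], [3, 5]], dtype=float)
y = 2 * X[:, 0] + 3 * X[:, 1] + 1

reg = LinearRegression(fit_intercept=True)
reg.fit(X, y)

print(reg.coef_)       # recovers approximately [2., 3.]
print(reg.intercept_)  # recovers approximately 1.0
print(reg.rank_)       # rank of the (dense) design matrix X
```

Because the target is exactly linear in the features, the least-squares solution recovers the true weights and intercept.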

### Optimization via SGD

```
sklearn.linear_model.SGDRegressor(loss="squared_loss", fit_intercept=True, learning_rate='invscaling', eta0=0.01)
```

The SGDRegressor class implements stochastic gradient descent learning. It supports different loss functions and regularization penalties for fitting linear regression models.

loss: loss type
- loss="squared_loss": ordinary least squares (renamed to "squared_error" in newer scikit-learn versions)

fit_intercept: whether to compute the intercept (bias)

learning_rate: string, optional
- learning-rate schedule
- 'constant': eta = eta0
- 'optimal': eta = 1.0 / (alpha * (t + t0))
- 'invscaling': eta = eta0 / pow(t, power_t) [default for SGDRegressor]
  - power_t=0.25: defined in the parent class
- For a constant learning rate, use learning_rate='constant' and specify the rate with eta0.

Attributes:

coef_: regression coefficients
intercept_: bias
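A minimal sketch of `SGDRegressor` with a constant learning rate (`learning_rate='constant'` plus `eta0`), as described above. The `loss` argument is omitted so the version-appropriate default squared loss is used, and the data are standardized first because feature scaling matters for SGD; the toy data is illustrative.

```python
# Fit a linear model by SGD with a fixed step size eta0.
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
X = rng.rand(200, 3)
y = X @ np.array([1.5, -2.0, 0.5]) + 0.3  # exact linear target

model = make_pipeline(
    StandardScaler(),  # scale features so a constant step size works well
    SGDRegressor(learning_rate="constant", eta0=0.01,
                 max_iter=1000, random_state=0),
)
model.fit(X, y)

sgd = model.named_steps["sgdregressor"]
print(sgd.coef_)       # regression weights (in the scaled feature space)
print(sgd.intercept_)  # bias
print(model.score(X, y))
```

Note that `coef_` lives in the standardized feature space here, which is why the pipeline keeps the scaler and the regressor together.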

### Improved linear regression: Ridge regression

```
sklearn.linear_model.Ridge(alpha=1.0, fit_intercept=True, solver="auto", normalize=False)
```

- alpha: regularization strength, also written λ
  - typical values are 0–1 or 1–10; the larger the regularization strength, the smaller the weight coefficients
- solver: automatically chooses an optimization method based on the data
  - 'sag': stochastic average gradient descent, chosen when both the dataset and the number of features are large
- normalize: whether to standardize the data
  - with normalize=False, you can standardize the data yourself with preprocessing.StandardScaler before calling fit
- Ridge.coef_: regression weights
- Ridge.intercept_: regression bias

Ridge is equivalent to SGDRegressor(penalty='l2', loss="squared_loss"), except that SGDRegressor implements plain stochastic gradient descent. Ridge is recommended because it implements SAG.
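A minimal sketch (toy data assumed) of the claim above that a larger regularization strength shrinks the weights: fit `Ridge` with several `alpha` values and compare the size of the coefficient vector.

```python
# The L2 norm of Ridge coefficients shrinks as alpha (λ) grows.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.rand(100, 5)
y = X @ np.array([3.0, -2.0, 1.0, 0.5, -1.5]) + rng.normal(0, 0.1, 100)

norms = []
for alpha in [0.1, 1.0, 10.0]:
    ridge = Ridge(alpha=alpha).fit(X, y)
    norms.append(np.linalg.norm(ridge.coef_))
    print(alpha, norms[-1])  # norm decreases as alpha increases
```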

```
sklearn.linear_model.RidgeCV(_BaseRidgeCV, RegressorMixin)
```

- coef_: regression coefficients
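A minimal sketch of `RidgeCV` (toy data assumed), which selects the regularization strength from a candidate list by built-in cross-validation and exposes the chosen value as `alpha_`.

```python
# RidgeCV picks alpha from the given candidates via cross-validation.
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.RandomState(0)
X = rng.rand(100, 4)
y = X @ np.array([2.0, -1.0, 0.5, 1.5]) + rng.normal(0, 0.1, 100)

reg = RidgeCV(alphas=[0.01, 0.1, 1.0, 10.0]).fit(X, y)
print(reg.alpha_)  # the alpha chosen by cross-validation
print(reg.coef_)   # regression coefficients
```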

## Logistic Regression API

```
linear_model.LogisticRegression([penalty, ...])   Logistic Regression (aka logit, MaxEnt) classifier.
linear_model.LogisticRegressionCV(*[, Cs, ...])   Logistic Regression CV (aka logit, MaxEnt) classifier.
linear_model.PassiveAggressiveClassifier(*)       Passive Aggressive Classifier.
linear_model.Perceptron(*[, penalty, alpha, ...]) Linear perceptron classifier.
linear_model.RidgeClassifier([alpha, ...])        Classifier using Ridge regression.
linear_model.RidgeClassifierCV([alphas, ...])     Ridge classifier with built-in cross-validation.
linear_model.SGDClassifier([loss, penalty, ...])  Linear classifiers (SVM, logistic regression, etc.) with SGD training.
linear_model.SGDOneClassSVM([nu, ...])            Solves linear One-Class SVM using Stochastic Gradient Descent.
```

`class sklearn.linear_model.LogisticRegression(penalty='l2', *, dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='lbfgs', max_iter=100, multi_class='auto', verbose=0, warm_start=False, n_jobs=None, l1_ratio=None)`

Logistic regression (aka logit, MaxEnt) classifier.

The 'newton-cg', 'sag', and 'lbfgs' solvers support only L2 regularization (or no regularization) in the primal formulation. The 'liblinear' solver supports both L1 and L2 regularization, with a dual formulation only for the L2 penalty. Elastic-net regularization is supported only by the 'saga' solver.

penalty: {'l1', 'l2', 'elasticnet', 'none'}, default='l2'

- 'none': no penalty is added;
- 'l2': add an L2 penalty term (the default);
- 'l1': add an L1 penalty term;
- 'elasticnet': add both L1 and L2 penalty terms.
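A minimal sketch of the solver/penalty pairings described above: the default 'lbfgs' solver only supports the l2 penalty (or none), while l1 requires a compatible solver such as 'liblinear' or 'saga'. The iris data here is just a convenient illustration.

```python
# Pair each penalty with a solver that supports it.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# l2 penalty (the default) with the default lbfgs solver
clf_l2 = LogisticRegression(penalty="l2", solver="lbfgs", max_iter=1000).fit(X, y)

# l1 penalty needs a solver that supports it, e.g. 'liblinear'
clf_l1 = LogisticRegression(penalty="l1", solver="liblinear").fit(X, y)

print(clf_l2.score(X, y))
print(clf_l1.score(X, y))
```

Passing `penalty="l1"` with `solver="lbfgs"` would raise an error instead.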

dual: bool, default=False

tol: float, default=1e-4

C: float, default=1.0

fit_intercept: bool, default=True

intercept_scaling: float, default=1

class_weight: dict or 'balanced', default=None

The 'balanced' mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data, as n_samples / (n_classes * np.bincount(y)).
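A minimal sketch of the 'balanced' formula quoted above, checked against scikit-learn's own helper; the toy labels are illustrative.

```python
# 'balanced' weights = n_samples / (n_classes * np.bincount(y))
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y = np.array([0, 0, 0, 0, 1, 1])  # class 0: 4 samples, class 1: 2 samples

weights = compute_class_weight("balanced", classes=np.array([0, 1]), y=y)
manual = len(y) / (2 * np.bincount(y))  # same formula, by hand

print(weights)  # class 0 -> 0.75, class 1 -> 1.5
print(manual)
```

The minority class (class 1) gets the larger weight, compensating for its lower frequency.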

random_state: int, RandomState instance, default=None

solver: {‘newton-cg’, ‘lbfgs’, ‘liblinear’, ‘sag’, ‘saga’}, default=’lbfgs’

The 'liblinear' solver is limited to one-versus-rest schemes.

max_iter: int, default=100

multi_class: {'auto', 'ovr', 'multinomial'}, default='auto'

verbose: int, default=0

warm_start: bool, default=False

n_jobs: int, default=None

l1_ratio: float, default=None

classes_: ndarray of shape (n_classes,)

coef_: ndarray of shape (1, n_features) or (n_classes, n_features)

intercept_: ndarray of shape (1,) or (n_classes,)

n_features_in_: int

feature_names_in_: ndarray of shape (n_features_in_,)

n_iter_: ndarray of shape (n_classes,) or (1, )

```
decision_function(X)          Predict confidence scores for samples.
densify()                     Convert coefficient matrix to dense array format.
fit(X, y[, sample_weight])    Fit the model according to the given training data.
get_params([deep])            Get parameters for this estimator.
predict(X)                    Predict class labels for samples in X.
predict_log_proba(X)          Predict logarithm of probability estimates.
predict_proba(X)              Probability estimates.
score(X, y[, sample_weight])  Return the mean accuracy on the given test data and labels.
set_params(**params)          Set the parameters of this estimator.
sparsify()                    Convert coefficient matrix to sparse format.
```

```
>>> from sklearn.datasets import load_iris
>>> from sklearn.linear_model import LogisticRegression
>>> X, y = load_iris(return_X_y=True)
>>> clf = LogisticRegression(random_state=0).fit(X, y)
>>> clf.predict(X[:2, :])
array([0, 0])
>>> clf.predict_proba(X[:2, :])
array([[9.8...e-01, 1.8...e-02, 1.4...e-08],
       [9.7...e-01, 2.8...e-02, ...e-08]])
>>> clf.score(X, y)
0.97...
```

## Advantages and Disadvantages of Logistic Regression

1. Advantages

(1) Well suited to classification scenarios.

(2) Computationally cheap, and easy to understand and implement.

(3) No prior assumption about the data distribution is required, which avoids the problems caused by an inaccurate assumed distribution.

(4) Besides predicting the class, it also gives an approximate probability estimate.

(5) The objective function is differentiable to any order.

2. Disadvantages

(1) Prone to underfitting; classification accuracy is not high.

(2) Performs poorly when features are missing or the feature space is very large.

## Softmax Regression

References:

1. https://www.bilibili.com/video/BV1Ca411M7KA?p=6&vd_source=c35b16b24807a6dbe33f5473659062ac
2. 黑马机器学习 (Heima Machine Learning course)
3. https://blog.csdn.net/qq_36330643/article/details/77649896
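The references above cover Softmax regression in detail. As a minimal sketch (assuming NumPy; not from the original), the softmax function that gives the model its name maps raw class scores to a probability distribution:

```python
# Softmax generalizes the logistic sigmoid to multi-class outputs.
import numpy as np

def softmax(z):
    """Convert raw scores z into a probability distribution."""
    z = z - np.max(z)  # shift scores for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs)  # sums to 1; the largest score gets the largest probability
```

Subtracting the maximum score before exponentiating leaves the result unchanged mathematically but avoids overflow for large scores.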