10. Polynomial Regression and Multiclass Strategies
2025-09-20
There are two common strategies for handling multiclass classification problems, OvO (One-vs-One) and OvR (One-vs-Rest):
| Method | Time complexity | Accuracy |
|---|---|---|
| OvO | $C_n^2 T = \frac{n(n-1)}{2} T$ | More accurate |
| OvR | $nT$ | Prone to confusion between classes |
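The counts in the table can be checked with a few lines of Python. This is a minimal sketch, assuming that $n$ is the number of classes and that $T$ stands for the cost of training one binary classifier (the table does not define $T$). The function names are hypothetical and only serve the illustration.

```python
from math import comb


def ovo_classifier_count(n_classes):
    """One-vs-One trains one binary classifier per unordered pair of classes."""
    return comb(n_classes, 2)  # n * (n - 1) / 2


def ovr_classifier_count(n_classes):
    """One-vs-Rest trains one binary classifier per class."""
    return n_classes


for n in (3, 5, 10):
    print(n, ovo_classifier_count(n), ovr_classifier_count(n))
# 3 3 3
# 5 10 5
# 10 45 10
```

For three classes, as in the Iris dataset used later, both strategies train three classifiers. The gap only opens up as the number of classes grows.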
import numpy as np
from matplotlib import pyplot as plt
# Note: np.random.randn() draws from the same distribution as np.random.normal(0, 1)
np.random.seed(0)
X = np.random.normal(0,1,size=(100,2))
y=np.array((X[:,0]**2+X[:,1]**2<2),dtype=int) # construct data that is not linearly separable
y
Output:
array([0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0])
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
plt.scatter(x_train[:,0],x_train[:,1],c=y_train)
plt.show()
The plot shows that no straight line can separate the two classes.
Polynomial Logistic Regression
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=2)
poly.fit(x_train)
Output:
PolynomialFeatures()
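To see why a degree-2 expansion helps here, it may be useful to write it out by hand. For two input features, `PolynomialFeatures(degree=2)` produces the columns $[1, x_1, x_2, x_1^2, x_1 x_2, x_2^2]$. In that expanded space the circular rule $x_1^2 + x_2^2 < 2$ becomes a linear one. The sketch below is pure Python and does not use scikit-learn. The helper names and the weight vector are hypothetical. The weights are picked by hand rather than learned, and the points come from the standard library's `random` module, so they differ from the NumPy samples above.

```python
import random


def poly2_features(x1, x2):
    """Degree-2 expansion of a 2-D point: [1, x1, x2, x1^2, x1*x2, x2^2]."""
    return [1.0, x1, x2, x1 * x1, x1 * x2, x2 * x2]


# Hand-picked (not learned) weights so that w . phi(x) = x1^2 + x2^2 - 2.
WEIGHTS = [-2.0, 0.0, 0.0, 1.0, 0.0, 1.0]


def linear_score(features, weights=WEIGHTS):
    """Plain dot product, i.e. a linear decision function in the expanded space."""
    return sum(w * f for w, f in zip(weights, features))


def label_via_expanded_space(x1, x2):
    """Label 1 when the linear score in the expanded space is negative."""
    return int(linear_score(poly2_features(x1, x2)) < 0)


rng = random.Random(0)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]

# Labels from the original circular rule versus the linear rule on expanded features.
original = [int(x1 ** 2 + x2 ** 2 < 2) for x1, x2 in points]
expanded = [label_via_expanded_space(x1, x2) for x1, x2 in points]
print(original == expanded)
```

Logistic regression fitted on the expanded features only has to find a weight vector of roughly this shape, which a linear model can do.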
x2 = poly.fit_transform(x_train)
x2_t = poly.transform(x_test)  # reuse the transformer fitted on the training data
from sklearn.linear_model import LogisticRegression
clf = LogisticRegression()
clf.fit(x2,y_train)
clf.score(x2,y_train)
Output:
0.9875
clf.score(x2_t,y_test)
Output:
0.95
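The transform and fit steps can also be chained so that the feature expansion is always fitted on the training data and merely applied to the test data. The following is one possible sketch using scikit-learn's `make_pipeline`. It rebuilds the same kind of circular data under the assumption that the seed and split match the cells above. The variable names `poly_clf`, `train_acc` and `test_acc` are hypothetical, and exact scores may vary with the library version.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Rebuild circular, not linearly separable data (assumed to mirror the cells above).
rng = np.random.RandomState(0)
X = rng.normal(0, 1, size=(100, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 2).astype(int)

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The pipeline fits PolynomialFeatures on the training data only and
# reuses that fitted transformer when scoring the test data.
poly_clf = make_pipeline(PolynomialFeatures(degree=2), LogisticRegression())
poly_clf.fit(x_train, y_train)

train_acc = poly_clf.score(x_train, y_train)
test_acc = poly_clf.score(x_test, y_test)
print(train_acc, test_acc)
```

Keeping both steps in one object avoids calling `fit_transform` on the test set by accident.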
Implementing Multiclass OvR and OvO in Code
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target
X.shape, y.shape
Output:
((150, 4), (150,))
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
plt.scatter(x_train[:,0],x_train[:,1],c=y_train)
plt.show()
# OVR
from sklearn.multiclass import OneVsRestClassifier
ovr = OneVsRestClassifier(LogisticRegression())
ovr.fit(x_train,y_train)
ovr.score(x_test,y_test)
Output:
0.9666666666666667
# OVO
from sklearn.multiclass import OneVsOneClassifier
ovo = OneVsOneClassifier(LogisticRegression())
ovo.fit(x_train,y_train)
ovo.score(x_test,y_test)
Output:
1.0
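The two wrappers differ in how they turn binary decisions into a final class. The sketch below illustrates the basic idea in pure Python. It is a simplification and not scikit-learn's actual implementation, which also uses classifier confidences to break ties in OvO. All scores and pairwise outcomes are made-up values for a single hypothetical sample, and the function names are hypothetical as well.

```python
from collections import Counter
from itertools import combinations


def ovr_predict(scores):
    """OvR: pick the class whose 'class vs rest' model is most confident."""
    return max(scores, key=scores.get)


def ovo_predict(classes, duel_winner):
    """OvO: every pairwise model casts one vote; the most-voted class wins."""
    votes = Counter(duel_winner[pair] for pair in combinations(classes, 2))
    return votes.most_common(1)[0][0]


classes = ["setosa", "versicolor", "virginica"]

# Made-up confidences from three 'one class vs the rest' models.
ovr_scores = {"setosa": 0.10, "versicolor": 0.55, "virginica": 0.50}

# Made-up winners of the three pairwise models.
duel_winner = {
    ("setosa", "versicolor"): "versicolor",
    ("setosa", "virginica"): "virginica",
    ("versicolor", "virginica"): "virginica",
}

print(ovr_predict(ovr_scores))            # versicolor
print(ovo_predict(classes, duel_winner))  # virginica
```

With these made-up numbers the two strategies disagree on the same sample, which shows how their test scores can differ even when the underlying binary model is the same.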
