Classify_breast_cancer_with_sklearn - hmchzb19 - ChinaUnix blog

Classify_breast_cancer_with_sklearn

Category: Big Data

2020-04-24 16:28:06

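sklearn ships a data set, loaded with load_breast_cancer, that has 569 samples with 30 features each. I wanted to try svm.SVC on it. The parameters I mainly experimented with were kernel and gamma: kernel set to rbf and sigmoid, and gamma set to auto and scale.
My impression from the experiments: the sigmoid kernel scores slightly lower, and rbf does a little better.
Setting gamma to auto or to scale gives the same final score.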
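One plausible reason the two gamma settings tie: the scikit-learn documentation defines gamma='auto' as 1 / n_features and gamma='scale' as 1 / (n_features * X.var()). If SVC is fit on features that were standardized column by column, the variance over the whole array is 1, so both settings resolve to the same value. The sketch below checks that arithmetic in pure Python on made-up toy data; the helper standardize_columns is hypothetical and only mimics what StandardScaler does by default (population standard deviation, ddof=0).

```python
import statistics


def standardize_columns(rows):
    """Scale every column to zero mean and unit population variance."""
    scaled_columns = []
    for col in zip(*rows):
        mean = statistics.mean(col)
        std = statistics.pstdev(col)  # population std, i.e. ddof=0
        scaled_columns.append([(v - mean) / std for v in col])
    # Transpose back to one list per sample.
    return [list(row) for row in zip(*scaled_columns)]


# Made-up toy data: 4 samples, 3 features on very different scales.
toy = [
    [1.0, 200.0, 0.01],
    [2.0, 180.0, 0.03],
    [4.0, 260.0, 0.02],
    [7.0, 240.0, 0.08],
]

scaled = standardize_columns(toy)
n_features = len(scaled[0])

# Variance over every entry of the array, like numpy's X.var().
all_values = [v for row in scaled for v in row]
overall_var = statistics.pvariance(all_values)

gamma_auto = 1.0 / n_features
gamma_scale = 1.0 / (n_features * overall_var)
print('overall variance:', overall_var)
print('gamma auto:', gamma_auto, 'gamma scale:', gamma_scale)
```

On unscaled data the overall variance is generally not 1, so the two settings would then differ.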

What puzzles me is that with the sigmoid kernel the train score is about 0.03 (3 percentage points) lower than the test score. Why?

Output:

(569, 30)
(569,)
['malignant' 'benign']
X data shape:(569, 30); no. positive:357; no. negative:212
svm with kernel "rbf" train score and test score
train score: 0.984251968503937, test score: 0.9680851063829787
svm with kernel "sigmoid" train score and test score
train score: 0.931758530183727, test score: 0.9627659574468085
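As a back-of-the-envelope check on the sigmoid scores printed above, the accuracies can be turned back into error counts. The split sizes used here (381 training and 188 test samples) are inferred from test_size=0.33 on 569 samples and agree with the printed fractions. The binomial standard error of the test accuracy is a rough approximation that assumes independent test samples; it is a sketch of how large the sampling noise might be, not an explanation of the gap.

```python
import math


def error_count(accuracy, n_samples):
    """Number of misclassified samples implied by an accuracy."""
    return round((1.0 - accuracy) * n_samples)


def binomial_std_error(accuracy, n_samples):
    """Rough standard error of an accuracy measured on n_samples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_samples)


# Split sizes inferred from test_size=0.33 on 569 samples.
n_train, n_test = 381, 188

# Sigmoid-kernel scores copied from the output above.
train_acc = 0.931758530183727
test_acc = 0.9627659574468085

train_errors = error_count(train_acc, n_train)
test_errors = error_count(test_acc, n_test)
test_se = binomial_std_error(test_acc, n_test)
gap = test_acc - train_acc

print('train errors: {} of {}'.format(train_errors, n_train))
print('test errors: {} of {}'.format(test_errors, n_test))
print('gap: {:.3f}, test std error: {:.3f}'.format(gap, test_se))
```

With only 188 test samples, each misclassification moves the test score by about half a percentage point, so a gap of this size from a single random split may partly reflect which samples happened to land in the test set.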

The code is as follows:


from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Load the data: 569 samples, 30 features, binary target.
cancer = load_breast_cancer()
X, y = cancer.data, cancer.target
print(X.shape)
print(y.shape)
print(cancer.target_names)
# Target 1 is 'benign' and target 0 is 'malignant', so "positive" here means benign.
print('X data shape:{0}; no. positive:{1}; no. negative:{2}'.format(X.shape, y[y==1].shape[0], y[y==0].shape[0]))

# Hold out 33% of the samples as the test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# Fit the scaler on the training set only, then apply the same transform to the test set.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# SVC with the RBF kernel.
svm_cf = svm.SVC(C=1, kernel='rbf', gamma='scale', tol=.001, random_state=42)
svm_cf.fit(X_train, y_train)
y_train_pred = svm_cf.predict(X_train)
y_test_pred = svm_cf.predict(X_test)
train_score = accuracy_score(y_train, y_train_pred)
test_score = accuracy_score(y_test, y_test_pred)
print('svm with kernel "rbf" train score and test score')
print('train score: {}, test score: {}'.format(train_score, test_score))

# SVC with the sigmoid kernel, same C and tolerance.
svm_sig_cf = svm.SVC(C=1, kernel='sigmoid', tol=.001, gamma='auto', random_state=42)
svm_sig_cf.fit(X_train, y_train)
y_train_pred = svm_sig_cf.predict(X_train)
y_test_pred = svm_sig_cf.predict(X_test)
train_score = accuracy_score(y_train, y_train_pred)
test_score = accuracy_score(y_test, y_test_pred)
print('\nsvm with kernel "sigmoid" train score and test score')
print('train score: {}, test score: {}'.format(train_score, test_score))
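The scores above come from a single random split, so one way to see whether the train/test ordering for the sigmoid kernel persists is to average over several splits. The following is a minimal sketch, not part of the original experiment: the choice of 5 stratified folds, the shuffle seed, and gamma='scale' for both kernels are my own assumptions. The scaler sits inside a pipeline so that it is refit on the training part of every fold.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Assumed setup: 5 stratified folds with a fixed shuffle seed.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

results = {}
for kernel in ('rbf', 'sigmoid'):
    # Scaling happens inside the pipeline, per fold, to avoid leaking test-fold statistics.
    model = make_pipeline(StandardScaler(), SVC(C=1, kernel=kernel, gamma='scale'))
    scores = cross_validate(model, X, y, cv=cv, scoring='accuracy',
                            return_train_score=True)
    results[kernel] = (scores['train_score'].mean(), scores['test_score'].mean())
    print('{}: mean train {:.3f}, mean test {:.3f}'.format(kernel, *results[kernel]))
```

If the averaged test score for sigmoid no longer exceeds the averaged train score, the gap in the single split was likely an artifact of that particular split; if it persists, the kernel itself deserves a closer look.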
