About the author

Linuxer, ex IBMer. GNU https://hmchzb19.github.io/

Category: Big Data

2020-04-24 16:28:06

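scikit-learn ships a dataset that is loaded with load_breast_cancer: 569 samples, each with 30 features. I decided to try svm.SVC on it. The main parameters I experimented with were kernel and gamma: kernel was set to rbf and sigmoid, and gamma was set to auto and scale.
From these experiments, my impression is that sigmoid scores slightly lower and rbf does a little better.
Setting gamma to auto or scale gives the same final scores. This is expected here: the features are standardized with StandardScaler before fitting, so the variance of the training data is 1, and scale, which is 1 / (n_features * X.var()), reduces to the value used by auto, 1 / n_features.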
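To make the gamma observation concrete, here is a minimal sketch that computes both values by hand on the standardized training data. It assumes the formulas given in the scikit-learn documentation for gamma='auto' and gamma='scale'; the names gamma_auto and gamma_scale are only illustrative and are not part of the SVC API.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Same split as in the post.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42)

# Standardize the training features (zero mean, unit variance per column).
X_train_std = StandardScaler().fit_transform(X_train)

n_features = X_train_std.shape[1]

# Documented definitions (illustrative variable names, not SVC attributes):
#   gamma='auto'  -> 1 / n_features
#   gamma='scale' -> 1 / (n_features * X.var())
gamma_auto = 1.0 / n_features
gamma_scale = 1.0 / (n_features * X_train_std.var())

print('variance of standardized training data:', X_train_std.var())
print('gamma auto :', gamma_auto)
print('gamma scale:', gamma_scale)
```

Without the StandardScaler step the raw features have a very different overall variance, so the two settings would no longer be expected to coincide.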

What puzzles me is that with sigmoid the train score is about 0.03 (3 percentage points) lower than the test score. Why is that?
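One way to probe this question is to check whether the gap survives other random splits. The test set here has only 188 samples, so a single sample moves the test score by roughly half a percentage point, and one split may simply be a lucky one. The sketch below is only a suggestion, not the method used in the post: it assumes ShuffleSplit with 20 repetitions and a seed of 0, both arbitrary choices, and wraps the scaler and the classifier in a pipeline so the scaler is refit on each training fold.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import ShuffleSplit, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Scaler + sigmoid-kernel SVC with the same settings as in the post.
model = make_pipeline(
    StandardScaler(),
    SVC(C=1, kernel='sigmoid', gamma='auto', tol=0.001),
)

# 20 random train/test splits with the same test fraction as the post.
# n_splits and random_state are arbitrary assumptions.
cv = ShuffleSplit(n_splits=20, test_size=0.33, random_state=0)
results = cross_validate(model, X, y, cv=cv, return_train_score=True)

# Positive gap: train accuracy above test accuracy; negative: the reverse.
gaps = results['train_score'] - results['test_score']
print('mean train score: {:.4f}'.format(results['train_score'].mean()))
print('mean test score:  {:.4f}'.format(results['test_score'].mean()))
print('train - test gap: mean {:.4f}, min {:.4f}, max {:.4f}'.format(
    gaps.mean(), gaps.min(), gaps.max()))
```

If the sign of the gap flips from split to split, the difference seen with random_state=42 is probably sampling noise; if the train score stays below the test score on most splits, the sigmoid kernel itself is the more likely cause.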


Output:

    (569, 30)
    (569,)
    ['malignant' 'benign']
    X data shape:(569, 30); no. positive:357; no. negative:212
    svm with kernel "rbf" train score and test score
    train score: 0.984251968503937, test score: 0.9680851063829787

    svm with kernel "sigmoid" train score and test score
    train score: 0.931758530183727, test score: 0.9627659574468085
The code is as follows:

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn import svm
    from sklearn.preprocessing import StandardScaler
    from sklearn.metrics import accuracy_score

    # Load the data. In this dataset label 0 is 'malignant' and label 1 is 'benign',
    # so "positive" below counts the benign samples.
    cancer = load_breast_cancer()
    X, y = cancer.data, cancer.target
    print(X.shape)
    print(y.shape)
    print(cancer.target_names)
    print('X data shape:{0}; no. positive:{1}; no. negative:{2}'.format(X.shape, y[y==1].shape[0], y[y==0].shape[0]))

    # Hold out 33% of the samples for testing.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

    # Fit the scaler on the training set only, then apply it to both sets.
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    # SVC with the RBF kernel and gamma='scale'.
    svm_cf = svm.SVC(C=1, kernel='rbf', gamma='scale', tol=.001, random_state=42)
    svm_cf.fit(X_train, y_train)
    y_train_pred = svm_cf.predict(X_train)
    y_test_pred = svm_cf.predict(X_test)
    train_score = accuracy_score(y_train, y_train_pred)
    test_score = accuracy_score(y_test, y_test_pred)

    print('svm with kernel "rbf" train score and test score')
    print('train score: {}, test score: {}'.format(train_score, test_score))

    # SVC with the sigmoid kernel and gamma='auto'.
    svm_sig_cf = svm.SVC(C=1, kernel='sigmoid', tol=.001, gamma='auto', random_state=42)
    svm_sig_cf.fit(X_train, y_train)
    y_train_pred = svm_sig_cf.predict(X_train)
    y_test_pred = svm_sig_cf.predict(X_test)
    train_score = accuracy_score(y_train, y_train_pred)
    test_score = accuracy_score(y_test, y_test_pred)

    print('\nsvm with kernel "sigmoid" train score and test score')
    print('train score: {}, test score: {}'.format(train_score, test_score))
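The listing above fits only two of the four kernel/gamma combinations mentioned in the text. As a rough sketch of how all four could be compared in one pass, the loop below wraps the scaler and the classifier in a pipeline and stores the accuracies in a dictionary. The scores dictionary and the loop structure are my own illustration, not the code used in the post.

```python
from itertools import product

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42)

# (kernel, gamma) -> (train accuracy, test accuracy); illustrative container.
scores = {}
for kernel, gamma in product(['rbf', 'sigmoid'], ['scale', 'auto']):
    # The pipeline fits the scaler on the training data only.
    model = make_pipeline(
        StandardScaler(),
        SVC(C=1, kernel=kernel, gamma=gamma, tol=0.001),
    )
    model.fit(X_train, y_train)
    scores[(kernel, gamma)] = (
        model.score(X_train, y_train),  # accuracy on the training set
        model.score(X_test, y_test),    # accuracy on the held-out set
    )
    print('kernel={!r:10} gamma={!r:8} train={:.4f} test={:.4f}'.format(
        kernel, gamma, *scores[(kernel, gamma)]))
```

Using a pipeline keeps the scaling and the classifier together, which also makes the same model object usable with cross-validation helpers.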

