
The difference between C-SVM and nu-SVM (translation)

This is an answer to the question "What is the difference between C-SVM and nu-SVM?" on Quora, which I have translated into Chinese. The original answers were written by Chinmay Pradhan, a data analytics engineer with several years of experience, and by Divyansh Khanna.

Here is the link to the discussion on Quora:

quora.com/What-is-the-d

A few words up front:

The SVM (Support Vector Machine) algorithm is a classification model; its Chinese name is 支持向量機. It received very wide attention before neural network algorithms came into large-scale use.

The following are the original answers and their translations:

/************************************************************************/

Chinmay Pradhan:

SVM uses hyperplanes to perform classification. When performing classification with SVM, there are two types of SVM:

  • C-SVM
  • nu-SVM

C and nu are regularisation parameters that impose a penalty on the misclassifications made while separating the classes, which helps improve the accuracy of the output.

C ranges from 0 to infinity and can be a bit hard to estimate and tune. A modification to this was the introduction of nu, which operates between 0 and 1 and represents a lower bound on the fraction of examples that are support vectors and an upper bound on the fraction that lie on the wrong side of the hyperplane.

Both have comparable classification power, but nu-SVM has been harder to optimise.

SVM performs classification on a dataset by means of hyperplane analysis in a high-dimensional space. When SVM is used to classify a dataset, the following two variants are worth noting: C-SVM and nu-SVM.

(Translator's note: "classification" is 分類 and "clustering" is 聚類; based on other references, the translator believes C-SVM and nu-SVM do indeed fit the definition of classification, so the translation follows the original author.)

C and nu are two regularisation parameters; they help us impose a penalty on misclassification events (raising the cost of misclassification), thereby improving the accuracy of the results.

C ranges from 0 to infinity, which often makes it hard to estimate and use in practice. To address this problem, the nu parameter was introduced; nu lies between 0 and 1 and marks a lower bound on the fraction of examples that are support vectors and an upper bound on the fraction that lie on the wrong side of the hyperplane.

The two (C-SVM and nu-SVM) have comparable classification performance, but nu-SVM has historically been harder to optimise.
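
(Translator's addition: to make the two parameterisations concrete, below is a minimal sketch using scikit-learn's SVC and NuSVC classes, which implement C-SVM and nu-SVM respectively. The synthetic dataset, the particular values C=1.0 and nu=0.3, and the final check on the support-vector fraction are illustrative choices, not something prescribed by the answers above.)

```python
# A minimal sketch comparing C-SVM and nu-SVM in scikit-learn.
# The synthetic dataset and parameter values are illustrative choices only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, NuSVC

# Toy two-class dataset.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# C-SVM: C ranges over (0, infinity); larger C penalises misclassification more heavily.
c_svm = SVC(C=1.0, kernel="rbf", gamma="scale").fit(X_train, y_train)

# nu-SVM: nu lies in (0, 1]; it upper-bounds the fraction of margin errors
# and lower-bounds the fraction of training points kept as support vectors.
nu_svm = NuSVC(nu=0.3, kernel="rbf", gamma="scale").fit(X_train, y_train)

print("C-SVM  test accuracy:", c_svm.score(X_test, y_test))
print("nu-SVM test accuracy:", nu_svm.score(X_test, y_test))

# The fraction of training points kept as support vectors should be at least nu.
frac_sv = nu_svm.support_vectors_.shape[0] / X_train.shape[0]
print("nu = 0.3, support-vector fraction =", round(frac_sv, 3))
```

With suitable parameter choices the two formulations can reach essentially the same decision boundary; the practical difference is that nu reads directly as a fraction of the training set, while C is an unbounded penalty weight.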

/************************************************************************/

Divyansh Khanna:

The nu-SVM, proposed by Scholkopf et al., has the advantage of using a parameter nu to control the number of support vectors. The parameter C in the ordinary SVM formulation is replaced by a parameter nu that is bounded between 0 and 1. Previously the parameter C could take any positive value, so this additional bound is beneficial in practice.

The parameter nu represents, respectively, a lower bound on the fraction of examples that are support vectors and an upper bound on the fraction that lie on the wrong side of the hyperplane.

Despite the new bound, nu-SVM is comparatively difficult to optimize, and its runtime often does not scale as well as C-SVM's.

nu-SVM was proposed by Scholkopf; its advantage is that it uses the parameter nu to control the number of support vectors. The parameter C in the ordinary SVM model is replaced by the parameter nu, whose value lies between 0 and 1. By contrast, C can take any positive value, so replacing C with nu is clearly beneficial.

The parameter nu marks a lower bound on the fraction of examples that are support vectors and an upper bound on the fraction that lie on the wrong side of the hyperplane.

Even though nu now comes with this new bound compared to C, nu-SVM is still relatively hard to optimise, and its runtime often scales worse than C-SVM's.
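
(Translator's addition: for reference, a standard textbook way of writing the two soft-margin primal problems is given below, assuming m training examples x_i with labels y_i in {-1, +1}, slack variables xi_i, and, in the nu-formulation, a margin variable rho; this notation is not taken from the answers above.)

```latex
% C-SVM primal: C > 0 is an unbounded weight on the total slack.
\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \xi_i
\quad \text{s.t.}\quad y_i\,(w \cdot x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0.

% nu-SVM primal (Scholkopf et al.): nu in (0, 1] replaces C, and the margin
% position rho becomes an optimisation variable.
\min_{w,\,b,\,\xi,\,\rho}\ \frac{1}{2}\|w\|^2 - \nu\rho + \frac{1}{m} \sum_{i=1}^{m} \xi_i
\quad \text{s.t.}\quad y_i\,(w \cdot x_i + b) \ge \rho - \xi_i,\ \ \xi_i \ge 0,\ \ \rho \ge 0.
```

In the nu-formulation the slack penalty is averaged over the m training examples and traded off directly against the margin variable rho, which is why nu can be read as a fraction of the training set rather than as an arbitrary positive weight.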

