The utility learns a One-Class Support Vector Machine (SVM) classifier from the input file ("-i") for classifying documents into one category ("-cat"). It writes the model ("-o") in the Bag-Of-Words format ".bowmd". Only positive examples are needed for learning. Input vectors can be weighted ("-w") using different weighting schemes.
The parameter "-nu" sets the nu parameter of the SVM (its role is similar to the cost parameter of a binary SVM); its value must be between 0 and 1. The parameter "-t" selects the kernel used for learning: 0 - linear, 1 - polynomial, 2 - radial, 3 - sigmoid.
The parameter "-cachesize" sets the size of the cache (in MB) that a non-linear SVM
can use for caching evaluated kernel functions. The parameter "-time" sets the
maximum time in seconds allowed for learning the classifier. The parameter "-v"
sets the verbosity during learning. The parameter "-subsize" sets the size
of the sub-problems used by the learning algorithm (-1 means the classifier decides). The
parameter "-ter" sets the termination criterion. Increasing it makes learning
faster, but the resulting classifier is less accurate. The parameter "-shrink"
determines whether support vectors are predicted during learning. Using this option
can increase learning time dramatically.
The parameter "-td" is used for the
Reuters21578 dataset. It determines which documents from the ModApte split of this
dataset are used for learning (0 - all, 1 - train, 2 - test).
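The parameter constraints described above can be checked before the utility is called. The following is a minimal Python sketch, not part of the utility itself: the helper name build_train_command is hypothetical, it uses only flags that appear in the usage listing, and it assumes the bounds 0 and 1 are acceptable values for "-nu" (the text does not say whether the endpoints are allowed).

```python
# Hypothetical helper: assembles an argument list for BowTrainOneClassSVM.exe.
# Only flags documented in the usage listing are used.

VALID_WEIGHTINGS = ("none", "norm", "bin", "tfidf")
VALID_KERNELS = (0, 1, 2, 3)       # 0-linear, 1-polynomial, 2-radial, 3-sigmoid
VALID_TRAIN_DOCS = (0, 1, 2)       # 0 - all, 1 - train, 2 - test


def build_train_command(input_file, category, nu=0.1, kernel=0,
                        weighting="tfidf", train_docs=0, output_file=None):
    """Return the command line as a list of arguments."""
    # Assumption: endpoints 0 and 1 are not rejected here.
    if not 0.0 <= nu <= 1.0:
        raise ValueError("-nu must be between 0 and 1")
    if kernel not in VALID_KERNELS:
        raise ValueError("-t must be one of 0, 1, 2, 3")
    if weighting not in VALID_WEIGHTINGS:
        raise ValueError("-w must be one of none, norm, bin, tfidf")
    if train_docs not in VALID_TRAIN_DOCS:
        raise ValueError("-td must be one of 0, 1, 2")

    args = [
        "BowTrainOneClassSVM.exe",
        f"-i:{input_file}",
        f"-w:{weighting}",
        f"-cat:{category}",
        f"-nu:{nu}",
        f"-t:{kernel}",
        f"-td:{train_docs}",
    ]
    if output_file is not None:
        args.append(f"-o:{output_file}")
    return args


if __name__ == "__main__":
    # Prints the command line for a linear one-class SVM on category corn.
    print(" ".join(build_train_command("Reuters21578.Bow", "corn",
                                       nu=0.2, train_docs=1)))
```

The returned list can be passed to subprocess.run on a machine where the utility is installed.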
usage:
BowTrainOneClassSVM.exe
   -i:Input-BagOfWords-FileName (default:'')
   -o:Output-One-Class-SVM-Model-FileName (default:'')
   -cat:Category-Name (default:'')
   -td:Training-Documents (0 - all, 1 - train, 2 - test) (default:0)
   -w:Weighting (none, norm, bin, tfidf) (default:'tfidf')
   -nu:Nu-Parameter (default:0.1)
   -t:SVM-Type: 0-linear, 1-polynomial, 2-radial, 3-sigmoid (default:0)
   -ker_p:Degree-of-Polynomial-Kernel (default:3)
   -ker_s:Linear-Part-in-Polynomial-Kernel (default:1)
   -ker_c:Constant-Part-in-Polynomial-Kernel (default:1)
   -ker_gamma:Gamma-for-Radial-Kernel (default:1)
   -cachesize:Memory-Cache-Size (default:50)
   -time:Upper-Time-Limit (default:-1)
   -v:Verbosity (default:0)
   -subsize:Subproblem-Size (default:-1)
   -ter:Terminating-Condition (default:0.001)
   -shrink:Shrinking (default:'T')
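The kernel options in the usage listing can be read as parameters of standard kernel functions. The Python sketch below is illustrative only: it assumes the commonly used forms (s * x.y + c)^p for the polynomial kernel and exp(-gamma * |x - y|^2) for the radial kernel, which this document does not confirm for the utility. The function names are hypothetical, and the sigmoid kernel is omitted because the usage listing does not document its parameters.

```python
import math


def dot(x, y):
    """Dot product of two equal-length dense vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # Corresponds to -t:0.
    return dot(x, y)


def polynomial_kernel(x, y, ker_p=3, ker_s=1.0, ker_c=1.0):
    # Corresponds to -t:1. Assumed form: (ker_s * x.y + ker_c) ** ker_p.
    return (ker_s * dot(x, y) + ker_c) ** ker_p


def radial_kernel(x, y, ker_gamma=1.0):
    # Corresponds to -t:2. Assumed form: exp(-ker_gamma * squared distance).
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-ker_gamma * squared_distance)


if __name__ == "__main__":
    x, y = [1.0, 2.0], [3.0, 4.0]
    print(linear_kernel(x, y))
    print(polynomial_kernel(x, y, ker_p=2))
    print(radial_kernel(x, y))
```

The default argument values mirror the defaults shown in the usage listing (degree 3, linear part 1, constant part 1, gamma 1).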
Example 1:
BowTrainOneClassSVM.exe -i:Reuters21578.Bow -w:tfidf -cat:corn -nu:0.2 -td:1
The above example learns a linear SVM classifier for the category corn using
documents from Reuters21578 tagged as training documents. The nu parameter is set to
0.2. The model is saved into the file reuters21578.BowMd.
Example 2:
BowTrainOneClassSVM.exe -i:Reuters21578.Bow -w:tfidf -cat:corn -nu:0.2 -t:1 -ker_p:2 -td:1
The above example learns an SVM classifier with a polynomial kernel for the
category corn using documents from Reuters21578 tagged as training documents.
The degree of the polynomial kernel is set to 2 with the parameter "-ker_p".
The model is saved into the file reuters21578.BowMd.