Some questions are also answered in the LIBSVM FAQ.
Please check our explanation on the LIBLINEAR webpage. Also see Appendix C of our SVM guide.
Please see the descriptions on the LIBLINEAR page.
Please cite the following paper:
R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A Library for Large Linear Classification, Journal of Machine Learning Research 9(2008), 1871-1874. Software available at http://www.csie.ntu.edu.tw/~cjlin/liblinear
The bibtex format is
@Article{REF08a,
  author  = {Rong-En Fan and Kai-Wei Chang and Cho-Jui Hsieh and Xiang-Rui Wang and Chih-Jen Lin},
  title   = {{LIBLINEAR}: A Library for Large Linear Classification},
  journal = {Journal of Machine Learning Research},
  year    = {2008},
  volume  = {9},
  pages   = {1871--1874}
}
See the change log and directory for earlier/current versions.
Generally we recommend linear SVM, as its training is faster and its accuracy is competitive. However, if you would like to have probability outputs, you may consider logistic regression.
Moreover, try L2 regularization first unless you need a sparse model. For most cases, L1 regularization does not give higher accuracy but may be slightly slower in training.
Among L2-regularized SVM solvers, try the default one (L2-loss SVC dual) first. If it is too slow, use the option -s 2 to solve the primal problem.
For document classification, our experience indicates that if you normalize each document to unit length, then not only is the training time shorter, but the accuracy is also better.
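As an illustration, the following is a minimal C sketch (our own helper, not part of LIBLINEAR) that scales one instance to unit Euclidean length, assuming the instance is stored as a LIBLINEAR feature_node array terminated by an element with index -1:

#include <math.h>
#include "linear.h"  /* struct feature_node */

/* Scale a sparse instance so that its Euclidean norm becomes 1.
   The array must end with an element whose index is -1. */
void normalize_instance(struct feature_node *x)
{
	struct feature_node *p;
	double norm = 0;

	for (p = x; p->index != -1; p++)
		norm += p->value * p->value;
	norm = sqrt(norm);
	if (norm > 0)
		for (p = x; p->index != -1; p++)
			p->value /= norm;
}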
If you need to read the same data set several times, saving the data in MATLAB/OCTAVE binary format can significantly reduce the loading time. The following MATLAB code generates a binary file rcv1_test.mat:

[rcv1_test_labels,rcv1_test_inst] = libsvmread('../rcv1_test.binary');
save rcv1_test.mat rcv1_test_labels rcv1_test_inst;

For OCTAVE users, use

save -mat7-binary rcv1_test.mat rcv1_test_labels rcv1_test_inst;

to save rcv1_test.mat in MATLAB 7 binary format. (Or you can use -binary to save in OCTAVE binary format.) Then, type

load rcv1_test.mat

to read the data. A simple experiment shows that read_sparse takes 88 seconds to read the data set rcv1 with half a million instances, but loading the MATLAB binary file takes only 7 seconds. Please type

help save

in MATLAB/OCTAVE for further information.
Very likely you use a large C or do not scale the data. If your number of features is small, you may use the option

-s 2

to solve the primal problem. More examples are in Appendix C of our SVM guide.
They should be very similar. However, sometimes the difference may not be small. Note that LIBLINEAR does not use the bias term b by default. If you observe very different results, try setting -B 1 for LIBLINEAR. This adds the bias term to the loss function as well as to the regularization term (w^Tw + b^2). The results should then be closer.
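For example, to train with the bias term:

> ./train -B 1 heart_scale

Here heart_scale is just an example data file; -B 1 appends a constant feature with value 1 to every instance.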
You can use grid.py of LIBSVM (after version 3.16) to check the cross validation accuracy of different C values. Two options must be specified: -log2g null, so that the kernel parameter gamma is not searched, and -svmtrain ./train, so that LIBLINEAR's train is used instead of LIBSVM's svm-train. For example,

> python grid.py -log2c -3,0,1 -log2g null -svmtrain ./train heart_scale

checks CV values at C=2^-3, 2^-2, 2^-1, and 2^0.
We guess that you are comparing

> time ./train -s 0 -v 5 -e 0.001 data

with the environment used in our paper, and find that LIBLINEAR is slower. Two reasons may cause the difference.
We carefully studied such issues and decided to use the current setting. For data classification, one does not need a very accurate solution, so numerical issues are less important. Moreover, log1p is not available on all platforms. Please let us know if you observe any numerical problems.
Assume k is the total number of classes and n is the number of features. In the model file, after the parameters, there is an n*k matrix W, whose columns are obtained by solving two-class problems: 1 vs rest, 2 vs rest, 3 vs rest, ..., k vs rest. For example, if there are 4 classes, the file looks like:
+-------+-------+-------+-------+
| w_1vR | w_2vR | w_3vR | w_4vR |
+-------+-------+-------+-------+
Please see the answer in the LIBSVM FAQ.
To correctly obtain decision values, you need to check the array

label

in the model.
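As an illustration, the following sketch (our own helper, not part of LIBLINEAR) prints each decision value next to the class label it belongs to, assuming a trained model and a test instance x stored as a feature_node array:

#include <stdio.h>
#include <stdlib.h>
#include "linear.h"

/* Print each decision value together with the class label it
   corresponds to. x must be terminated by an element whose
   index is -1. */
void show_decision_values(const struct model *model_,
                          const struct feature_node *x)
{
	int nr_class = model_->nr_class;
	/* mirror the library's convention: two-class problems give a
	   single decision value, except for the Crammer-Singer solver */
	int nr_w = (nr_class == 2 && model_->param.solver_type != MCSVM_CS)
	           ? 1 : nr_class;
	double *dec_values = (double *)malloc(nr_class*sizeof(double));
	double predicted = predict_values(model_, x, dec_values);
	int i;

	if (nr_w == 1)
		/* a positive value means class model_->label[0] */
		printf("label %d vs label %d: %g\n",
		       model_->label[0], model_->label[1], dec_values[0]);
	else
		for (i = 0; i < nr_w; i++)
			printf("label %d: %g\n", model_->label[i], dec_values[i]);

	printf("predicted label: %g\n", predicted);
	free(dec_values);
}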
LIBSVM uses more advanced techniques for SVM probability outputs. The code is a bit complicated, so we haven't decided whether including it is suitable.

If you really would like to have probability outputs for SVM in LIBLINEAR, you can consider using the simple probability model of logistic regression. Simply modify the following subroutine in linear.cpp

int check_probability_model(const struct model *model_)
{
	return (model_->param.solver_type==L2R_LR ||
		model_->param.solver_type==L2R_LR_DUAL ||
		model_->param.solver_type==L1R_LR);
}

to

int check_probability_model(const struct model *model_)
{
	return 1;
}
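After this change, predict_probability (and hence the -b 1 option of the predict command) should also accept SVM models; each decision value d is mapped through 1/(1+exp(-d)). A minimal usage sketch (the function name is ours):

#include <stdio.h>
#include <stdlib.h>
#include "linear.h"

/* Predict one instance and print the probability estimate of each
   class. For SVM models this is only the heuristic logistic
   mapping, not LIBSVM's calibrated probabilities. */
void predict_and_print_probability(const struct model *model_,
                                   const struct feature_node *x)
{
	double *prob = (double *)malloc(model_->nr_class*sizeof(double));
	double label = predict_probability(model_, x, prob);
	int i;

	for (i = 0; i < model_->nr_class; i++)
		printf("P(label %d) = %g\n", model_->label[i], prob[i]);
	printf("predicted label: %g\n", label);
	free(prob);
}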
Some LIBLINEAR solvers consider the primal problem, so support vectors are not obtained during the training procedure. For dual solvers, we output only the primal weight vector w, so support vectors are not stored in the model. This is different from LIBSVM.
To find the support vectors, you can modify the following loop in solve_l2r_l1l2_svc() of linear.cpp to print out the indices:

for(i=0; i<l; i++)
{
	v += alpha[i]*(alpha[i]*diag[GETI(i)] - 2);
	if(alpha[i] > 0)
		++nSV;
}

Note that we group data in the same class together before calling this subroutine, so the order of your training instances has been changed internally. You can sort your data (e.g., positive instances before negative ones) before using LIBLINEAR; then the indices will be the same.
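For example, the loop could become the following (a sketch; the printed indices refer to the permuted order unless the data were sorted as described above):

for(i=0; i<l; i++)
{
	v += alpha[i]*(alpha[i]*diag[GETI(i)] - 2);
	if(alpha[i] > 0)
	{
		printf("instance %d is a support vector\n", i+1); /* 1-based */
		++nSV;
	}
}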
This FAQ entry is about the solvers. For multiclass classification, please check "How to speed up multiclass classification using OpenMP" instead.
Because of the design of LIBLINEAR's solvers, it is not easy to achieve good speedup using OpenMP. However, by the following steps, we can still achieve some speedup for primal solvers (-s 0, 2, 11).
In Makefile, add -fopenmp to CFLAGS. Then, in the function Xv of the classes l2r_lr_fun and l2r_l2_svc_fun in linear.cpp, modify the for loop to:

#pragma omp parallel for private (i)
for(i=0;i<l;i++)

In l2r_l2_svc_fun, modify the for loop in subXv to:

#pragma omp parallel for private (i)
for(i=0;i<sizeI;i++)
Using 8 cores:

% export OMP_NUM_THREADS=8
% time ./train -s 2 rcv1_test.binary
0m45.250s
% time ./train -s 2 mnist8m.scale
59m41.300s

Using standard LIBLINEAR:

% time ./train -s 2 rcv1_test.binary
0m55.657s
% time ./train -s 2 mnist8m.scale
78m59.452s
Please take the following steps. Note that it works only for -s 0, 1, 2, 3, 5, 7.
In Makefile, add -fopenmp to CFLAGS.
In linear.cpp, replace the following segment of code
model_->w=Malloc(double, w_size*nr_class);
double *w=Malloc(double, w_size);
for(i=0;i<nr_class;i++)
{
	int si = start[i];
	int ei = si+count[i];

	k=0;
	for(; k<si; k++)
		sub_prob.y[k] = -1;
	for(; k<ei; k++)
		sub_prob.y[k] = +1;
	for(; k<sub_prob.l; k++)
		sub_prob.y[k] = -1;

	train_one(&sub_prob, param, w, weighted_C[i], param->C);

	for(int j=0;j<w_size;j++)
		model_->w[j*nr_class+i] = w[j];
}
free(w);

with
model_->w=Malloc(double, w_size*nr_class);
#pragma omp parallel for private(i)
for(i=0;i<nr_class;i++)
{
	problem sub_prob_omp;
	sub_prob_omp.l = l;
	sub_prob_omp.n = n;
	sub_prob_omp.x = x;
	sub_prob_omp.y = Malloc(double,l);

	int si = start[i];
	int ei = si+count[i];
	double *w=Malloc(double, w_size);

	int t=0;
	for(; t<si; t++)
		sub_prob_omp.y[t] = -1;
	for(; t<ei; t++)
		sub_prob_omp.y[t] = +1;
	for(; t<sub_prob_omp.l; t++)
		sub_prob_omp.y[t] = -1;

	train_one(&sub_prob_omp, param, w, weighted_C[i], param->C);

	for(int j=0;j<w_size;j++)
		model_->w[j*nr_class+i] = w[j];

	free(sub_prob_omp.y);
	free(w);
}

Using 8 cores on the set rcv1_test.multiclass.bz2:
% export OMP_NUM_THREADS=8
% time ./train -s 2 rcv1_test.multiclass
2m4.019s
% time ./train -s 1 rcv1_test.multiclass
0m45.349s

Using standard LIBLINEAR:

% time ./train -s 2 rcv1_test.multiclass
6m52.237s
% time ./train -s 1 rcv1_test.multiclass
1m51.739s
Please check this page.
Use L1 regularization if you would like to identify important features; the resulting sparse model shows which features matter. For most cases, L1 regularization does not give higher accuracy but may be slower in training.
We hope to know situations where L1 is useful. Please contact us if you have some success stories.
We don't have any application which really needs this setting. However, please email us if your application must use a sparse weight vector.
Yes. L2-loss SVR with epsilon = 0 (i.e., -p 0) reduces to regularized least-square regression (ridge regression).
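To see this, note that L2-loss SVR solves

min_w  (1/2) w^T w + C * sum_i max(|w^T x_i - y_i| - epsilon, 0)^2.

With epsilon = 0, the loss term becomes (w^T x_i - y_i)^2, so the problem is exactly regularized least-square regression with regularization parameter 1/(2C).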