LinearSVM

APIs

class pyis.python.ops.LinearSVM(self: ops.LinearSVM, model_file: str) → None

A support vector machine implementation using liblinear.

Create a LinearSVM object given the model file in libsvm format.

Parameters: model_file (str) – The model file in libsvm format.

predict(self: ops.LinearSVM, features: List[Tuple[int, float]]) → List[float]

Given the list of features triggered by an input sample, return the decision value for each class.

Parameters: features (List[Tuple[int, float]]) – List of features, each represented as a (feature id, feature value) tuple. Each feature id must appear at most once in the list.
Returns: List of decision values.
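
Because predict expects each feature id at most once, a caller may want to merge duplicates before calling it. The following is a minimal sketch, not part of pyis: merge_features is a hypothetical helper that sums the values of repeated ids (the same policy the example on this page applies when writing training data) and rejects ids below 1, which liblinear does not allow.

```python
from typing import Dict, List, Tuple


def merge_features(features: List[Tuple[int, float]]) -> List[Tuple[int, float]]:
    """Hypothetical helper (not part of pyis): sum the values of repeated feature ids."""
    merged: Dict[int, float] = {}
    for fid, fvalue in features:
        if fid < 1:
            # liblinear feature ids start at 1; id 0 is illegal.
            raise ValueError(f"feature id must be >= 1, got {fid}")
        merged[fid] = merged.get(fid, 0.0) + fvalue
    # Sort by feature id so the result is deterministic.
    return sorted(merged.items())
```

The merged list could then be passed on, for example as linear_svm.predict(merge_features(raw_features)), where raw_features is whatever list the caller has collected.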

static train(libsvm_data_file: str, model_file: str, solver_type: int = 5, eps: float = 0.1, C: float = 1.0, p: float = 0.5, bias: float = 1.0) → None

Train an SVM model using liblinear.

Parameters

libsvm_data_file (str) – Training data in libsvm data format.
model_file (str) – Target file for the generated model file.
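solver_type (int) – liblinear solver id: L2R_LR = 0, L2R_L2LOSS_SVC_DUAL = 1, L2R_L2LOSS_SVC = 2, L2R_L1LOSS_SVC_DUAL = 3, MCSVM_CS = 4, L1R_L2LOSS_SVC = 5, L1R_LR = 6, L2R_LR_DUAL = 7, L2R_L2LOSS_SVR = 11, L2R_L2LOSS_SVR_DUAL = 12, L2R_L1LOSS_SVR_DUAL = 13, ONECLASS_SVM = 21. Defaults to 5 (L1R_L2LOSS_SVC).
eps (float) – Stopping tolerance. Defaults to 0.1.
C (float) – Regularization (cost) parameter. Defaults to 1.0.
p (float) – Epsilon in the loss function of epsilon-SVR; used only by the SVR solvers. Defaults to 0.5.
bias (float) – If non-negative, a constant bias term with this value is appended to each instance; if negative, no bias term is added. Defaults to 1.0.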
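
Since solver_type is passed as a bare integer, a named constant can make call sites easier to read. The following is a minimal sketch under that assumption: SolverType is a hypothetical enum, not something pyis provides, whose values mirror liblinear's solver ids.

```python
from enum import IntEnum


class SolverType(IntEnum):
    """Hypothetical enum (not part of pyis) mirroring liblinear's solver ids."""
    L2R_LR = 0
    L2R_L2LOSS_SVC_DUAL = 1
    L2R_L2LOSS_SVC = 2
    L2R_L1LOSS_SVC_DUAL = 3
    MCSVM_CS = 4
    L1R_L2LOSS_SVC = 5
    L1R_LR = 6
    L2R_LR_DUAL = 7
    # The ids are not contiguous: the regression solvers start at 11.
    L2R_L2LOSS_SVR = 11
    L2R_L2LOSS_SVR_DUAL = 12
    L2R_L1LOSS_SVR_DUAL = 13
    ONECLASS_SVM = 21
```

A call might then read ops.LinearSVM.train(data_file, model_file, int(SolverType.L1R_L2LOSS_SVC)); the explicit int() conversion is a cautious choice in case the binding only accepts plain integers.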

Example

# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.

import io
import os
from typing import Dict

from pyis.python import ops

tmp_dir = 'tmp/doc_linear_svm/'
os.makedirs(tmp_dir, exist_ok=True)

# training
# Feature ids must start at 1; id 0 is illegal for liblinear.
xs = [
    [(1, 1.0), (2, 1.0)],
    [(2, 1.0), (3, 1.0)],
]
ys = [1, 2]

# Write the training data in libsvm format: "<label> <id>:<value> ...", ids in ascending order.
data_file = os.path.join(tmp_dir, 'svm.data.txt')
with io.open(data_file, 'w', newline='\n') as f:
    for x, y in zip(xs, ys):
        # Merge repeated feature ids by summing their values.
        features: Dict[int, float] = {}
        for fid, fvalue in x:
            features[fid] = features.get(fid, 0.0) + fvalue
        print(y, file=f, end='')
        for k in sorted(features):
            print(f" {k}:{features[k]}", file=f, end='')
        print("", file=f)

model_file = os.path.join(tmp_dir, 'svm.model.bin')
# Positional arguments: solver_type=5 (L1R_L2LOSS_SVC), eps=0.1, C=1.0, p=0.5, bias=1.0
ops.LinearSVM.train(data_file, model_file, 5, 0.1, 1.0, 0.5, 1.0)
linear_svm = ops.LinearSVM(model_file)

# inference: print the decision values for three single-feature samples
values = linear_svm.predict([(1, 1.0)])
print(values)

values = linear_svm.predict([(2, 1.0)])
print(values)

values = linear_svm.predict([(3, 1.0)])
print(values)
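
The data-writing loop in the example can also be factored into a small function that formats one training instance as a libsvm line. This is a minimal sketch: format_libsvm_line is a hypothetical helper, not part of pyis, and it assumes duplicates have already been merged, so it rejects repeated ids rather than summing them.

```python
from typing import List, Tuple


def format_libsvm_line(label: int, features: List[Tuple[int, float]]) -> str:
    """Hypothetical helper (not part of pyis): format one instance as '<label> <id>:<value> ...'."""
    ids = [fid for fid, _ in features]
    if any(fid < 1 for fid in ids):
        # liblinear feature ids start at 1; id 0 is illegal.
        raise ValueError("feature ids must be >= 1")
    if len(set(ids)) != len(ids):
        raise ValueError("feature ids must be unique")
    # Emit the id:value pairs in ascending id order.
    pairs = " ".join(f"{fid}:{fvalue}" for fid, fvalue in sorted(features))
    return f"{label} {pairs}" if pairs else str(label)
```

Each returned string is one line of the training file, so writing the file reduces to joining the lines with newline characters.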