LinearChainCRF

APIs

class pyis.python.ops.LinearChainCRF(self: ops.LinearChainCRF, model_file: str) -> None

A linear-chain conditional random field (CRF) implementation.

Create a LinearChainCRF object given the model file.

Parameters

model_file (str) – Path to the trained model file.

predict(self: ops.LinearChainCRF, len: int, features: List[Tuple[int, int, float]]) -> List[int]

Given a list of features triggered by the input sample, return the label for each token of the input.

Parameters
  • len (int) – Number of tokens in the input.

  • features (List[Tuple[int, int, float]]) – List of features, each represented as a tuple of (token index, feature id, feature value).

Returns

List of labels, one per token of the input.
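
The flat feature list that predict expects can be assembled from per-token features. The following is a minimal sketch, not part of pyis: the helper name build_feature_list and the per-token dict input format are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def build_feature_list(
    token_features: List[Dict[int, float]],
) -> List[Tuple[int, int, float]]:
    """Flatten per-token {feature id: value} dicts into
    (token index, feature id, feature value) tuples.

    Hypothetical helper for illustration; not part of pyis.
    """
    flat: List[Tuple[int, int, float]] = []
    for token_index, feats in enumerate(token_features):
        # Sort by feature id so the output order is deterministic.
        for feature_id, value in sorted(feats.items()):
            flat.append((token_index, feature_id, float(value)))
    return flat


# "hello tom": token 0 triggers feature 0, token 1 triggers feature 1.
token_features = [{0: 1.0}, {1: 1.0}]
features = build_feature_list(token_features)
print(features)  # [(0, 0, 1.0), (1, 1, 1.0)]
```

Under these assumptions, the token count and the flattened list would be passed together, for example as predict(len(token_features), features).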

static train(data_file: str, model_file: str, alg: str = 'l1sgd', max_iter: int = 150) -> None

Train a CRF model.

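Parameters
  • data_file (str) – Path to the training data file.

  • model_file (str) – Path where the trained model is written.

  • alg (str) – Training algorithm. Defaults to 'l1sgd'.

  • max_iter (int) – Maximum number of training iterations. Defaults to 150.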

Example

# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.

import os
from pyis.python import ops
from pyis.python.offline import SequenceTagging as SeqTag

tmp_dir = 'tmp/doc_linear_chain_crf/'
os.makedirs(tmp_dir, exist_ok=True)

# training
xs = [
    [ops.TextFeature(0, 1.0, 0, 0), ops.TextFeature(1, 1.0, 1, 1)],  # hello tom
    [ops.TextFeature(0, 1.0, 0, 0), ops.TextFeature(2, 1.0, 1, 1)],  # hello jerry
]
ys = [
    [0, 1],  # O NAME
    [0, 1],  # O NAME
]

data_file = os.path.join(tmp_dir, 'lccrf.data.txt')
SeqTag.text_features_to_lccrf(xs, ys, data_file)

model_file = os.path.join(tmp_dir, 'lccrf.model.bin')
ops.LinearChainCRF.train(data_file, model_file, 'l1sgd')
lccrf = ops.LinearChainCRF(model_file)

# inference: 2 tokens; each feature is (token index, feature id, feature value)
values = lccrf.predict(2, [(0, 0, 1.0), (1, 1, 1.0)])  # hello tom
print(values)
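
predict returns integer label ids, so a caller will usually want to map them back to label names. The sketch below assumes the label inventory implied by the comments in the training data (0 = O, 1 = NAME). The LABELS list and the decode_labels helper are hypothetical and not part of pyis, and the values list is a stand-in rather than real model output.

```python
from typing import List, Tuple

# Assumed label inventory, taken from the "O NAME" comments in the example.
LABELS = ["O", "NAME"]


def decode_labels(label_ids: List[int], tokens: List[str]) -> List[Tuple[str, str]]:
    """Pair each token with the name of its predicted label.

    Hypothetical helper for illustration; not part of pyis.
    """
    if len(label_ids) != len(tokens):
        raise ValueError("expected exactly one label id per token")
    return [(token, LABELS[label_id]) for token, label_id in zip(tokens, label_ids)]


# Stand-in for the list returned by lccrf.predict; the actual values depend
# on the trained model.
values = [0, 1]
decoded = decode_labels(values, ["hello", "tom"])
print(decoded)  # [('hello', 'O'), ('tom', 'NAME')]
```

Checking the lengths up front surfaces a mismatch between the token count passed to predict and the tokens being labeled, instead of silently truncating in zip.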