LinearChainCRF
APIs
- class pyis.python.ops.LinearChainCRF(self: ops.LinearChainCRF, model_file: str) -> None
A linear-chain conditional random field (CRF) implementation.
Create a LinearChainCRF object given the model file.
- Parameters
model_file (str) – Path to the trained model file.
- predict(self: ops.LinearChainCRF, len: int, features: List[Tuple[int, int, float]]) -> List[int]
Given a list of features triggered by the input sample, return the label for each token of the input.
- Parameters
len (int) – Number of tokens in the input.
features (List[Tuple[int, int, float]]) – List of features, each represented as a tuple of (token index, feature id, feature value).
- Returns
List of predicted labels, one per token.
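The feature triples that predict expects can be assembled from a per-token view of the input. The following is a minimal sketch and not part of pyis: the helper name to_feature_triples and the per-token dict layout are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def to_feature_triples(
    token_features: List[Dict[int, float]],
) -> List[Tuple[int, int, float]]:
    """Flatten per-token {feature id: value} dicts into
    (token index, feature id, feature value) triples.

    Hypothetical helper, not provided by pyis.
    """
    triples: List[Tuple[int, int, float]] = []
    for token_index, feats in enumerate(token_features):
        # Sort by feature id so the output order is deterministic.
        for feature_id, value in sorted(feats.items()):
            triples.append((token_index, feature_id, float(value)))
    return triples


# "hello tom": token 0 triggers feature 0, token 1 triggers feature 1.
per_token = [{0: 1.0}, {1: 1.0}]
features = to_feature_triples(per_token)
token_count = len(per_token)
```

Here token_count and features correspond to the len and features arguments of predict.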
- static train(data_file: str, model_file: str, alg: str = 'l1sgd', max_iter: int = 150) -> None
Train a CRF model.
- Parameters
data_file (str) – Training data file.
model_file (str) – Output path for the generated model file.
alg (str) – The training algorithm. Either perceptron (Structured Perceptron) or l1sgd (Stochastic Gradient Descent Training for L1-regularized Log-linear Models). Defaults to l1sgd.
max_iter (int) – Maximum number of training iterations. Defaults to 150.
Example
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.
import os
from pyis.python import ops
from pyis.python.offline import SequenceTagging as SeqTag
tmp_dir = 'tmp/doc_linear_chain_crf/'
os.makedirs(tmp_dir, exist_ok=True)
# training
xs = [
    [ops.TextFeature(0, 1.0, 0, 0), ops.TextFeature(1, 1.0, 1, 1)],  # hello tom
    [ops.TextFeature(0, 1.0, 0, 0), ops.TextFeature(2, 1.0, 1, 1)],  # hello jerry
]
ys = [
    [0, 1],  # O NAME
    [0, 1],  # O NAME
]
data_file = os.path.join(tmp_dir, 'lccrf.data.txt')
SeqTag.text_features_to_lccrf(xs, ys, data_file)
model_file = os.path.join(tmp_dir, 'lccrf.model.bin')
ops.LinearChainCRF.train(data_file, model_file, 'l1sgd')
lccrf = ops.LinearChainCRF(model_file)
# inference
values = lccrf.predict(2, [(0, 0, 1.0), (1, 1, 1.0)]) # hello tom
print(values)
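predict returns integer label ids, so a caller usually maps them back to tag names. The sketch below assumes the id-to-tag mapping implied by the training comments above (0 for O, 1 for NAME). The decode_labels helper is hypothetical and not part of pyis, and example_ids is a stand-in for a predict result, not actual model output.

```python
from typing import Dict, List, Tuple


def decode_labels(
    tokens: List[str],
    label_ids: List[int],
    id_to_label: Dict[int, str],
) -> List[Tuple[str, str]]:
    """Pair each token with the tag name for its predicted label id.

    Hypothetical helper, not provided by pyis.
    """
    if len(tokens) != len(label_ids):
        raise ValueError("expected exactly one label id per token")
    return [(tok, id_to_label[i]) for tok, i in zip(tokens, label_ids)]


# Assumed mapping, taken from the "# O NAME" comments in the training data.
id_to_label = {0: "O", 1: "NAME"}

# Stand-in for the list returned by lccrf.predict(...); the real values
# depend on the trained model.
example_ids = [0, 1]
tagged = decode_labels(["hello", "tom"], example_ids, id_to_label)
```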