pe.llm.huggingface.huggingface module

class pe.llm.huggingface.huggingface.HuggingfaceLLM(model_name_or_path, batch_size=128, dry_run=False, **generation_args)[source]

Bases: LLM

A wrapper for Huggingface LLMs.

__init__(model_name_or_path, batch_size=128, dry_run=False, **generation_args)[source]

Constructor.

Parameters:
  • model_name_or_path (str) – The model name or path of the Huggingface model. Note that the FastChat library (https://github.com/lm-sys/FastChat) is used to manage the conversation template. If the conversation template of your desired model is not available in FastChat, please register it in the FastChat library first; see https://github.com/microsoft/DPSDA/blob/main/pe/llm/huggingface/register_fastchat/gpt2.py for an example, and the registration sketch under _get_conv_template() below.

  • batch_size (int, optional) – The batch size to use for generating the responses, defaults to 128

  • dry_run (bool, optional) – Whether to enable dry run. When dry run is enabled, the responses are fake and the LLMs are not called, defaults to False

  • **generation_args (str) – The generation arguments that will be passed to the Huggingface model's generation call (see the usage sketch below)
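
A minimal usage sketch. The import path and the "gpt2" checkpoint are assumptions for illustration; any Huggingface model whose conversation template is known to FastChat should work the same way:

    from pe.llm import HuggingfaceLLM  # import path assumed

    llm = HuggingfaceLLM(
        model_name_or_path="gpt2",  # Huggingface Hub name, or a local path
        batch_size=32,              # prompts generated per batch
        max_completion_tokens=64,   # a generation argument stored for later calls
    )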

_get_conv_template()[source]

Get the conversation template.

Returns:

The empty conversation template for this model from FastChat

Return type:

fastchat.conversation.Conversation
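
If FastChat does not know your model, a conversation template can be registered before constructing the LLM. A minimal sketch follows; the exact Conversation fields and defaults vary across FastChat versions, and the values here are illustrative assumptions (see the gpt2.py link above for the project's own example):

    from fastchat.conversation import Conversation, SeparatorStyle, register_conv_template

    # Register a simple colon-separated template under a custom name.
    # Field names and defaults differ between FastChat versions; adjust as needed.
    register_conv_template(
        Conversation(
            name="my-custom-model",
            system_message="A chat between a curious user and an assistant.",
            roles=("USER", "ASSISTANT"),
            sep_style=SeparatorStyle.ADD_COLON_SINGLE,
            sep="\n",
        )
    )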

_get_prompt(messages)[source]

Get the prompt from the messages.

Parameters:

messages (list[dict]) – The messages

Raises:

ValueError – If the role is invalid

Returns:

The prompt

Return type:

str
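
A sketch of the expected message format. The role names and dict keys below are assumptions based on the common OpenAI-style convention; per the docs above, an unrecognized role raises ValueError:

    # Hypothetical messages; `llm` is a HuggingfaceLLM instance.
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a one-line greeting."},
    ]
    # Renders the messages into a single prompt string using the model's
    # FastChat conversation template.
    prompt = llm._get_prompt(messages)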

_get_responses(prompt_list, generation_args)[source]

Get the responses from the LLM.

Parameters:
  • prompt_list (list[str]) – The prompts

  • generation_args (dict) – The generation arguments

Returns:

The responses

Return type:

list[str]
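
This is an internal helper normally reached through get_responses(). A sketch of a direct call, with illustrative argument values:

    # `llm` is a HuggingfaceLLM instance; argument values are assumptions.
    responses = llm._get_responses(
        prompt_list=["Hello, how are you?", "Name three colors."],
        generation_args={"max_new_tokens": 64, "temperature": 1.0},
    )
    # responses is a list[str] with one completion per prompt, generated
    # in batches of at most `batch_size`.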

property generation_arg_map

Get the mapping from the standardized generation argument names to the argument names used by this specific LLM.

Returns:

A mapping that renames max_completion_tokens to max_new_tokens

Return type:

dict
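
Assuming the mapping is a plain dict (an assumption based on the return type above), a caller-facing argument is renamed before being handed to the model, roughly as follows:

    # Hypothetical illustration of how the mapping is applied.
    generation_args = {"max_completion_tokens": 64}
    mapped = {
        llm.generation_arg_map.get(key, key): value
        for key, value in generation_args.items()
    }
    # mapped == {"max_new_tokens": 64}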

get_responses(requests, **generation_args)[source]

Get the responses from the LLM.

Parameters:
  • requests (list[pe.llm.Request]) – The requests

  • **generation_args (str) – The generation arguments. The priority of the generation arguments, from highest to lowest, is: the arguments set in the requests > the arguments passed to this function > the arguments passed to the constructor

Returns:

The responses

Return type:

list[str]
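
An end-to-end sketch, assuming pe.llm.Request carries messages and generation_args fields and that the import paths below are correct:

    from pe.llm import HuggingfaceLLM, Request  # import paths assumed

    llm = HuggingfaceLLM(model_name_or_path="gpt2", max_completion_tokens=64)
    requests = [
        Request(
            messages=[{"role": "user", "content": "Say hello."}],
            generation_args={"temperature": 0.9},  # highest priority
        ),
    ]
    # temperature=0.7 below is overridden by the per-request value above,
    # which in turn takes precedence over anything set in the constructor.
    responses = llm.get_responses(requests, temperature=0.7)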