pe.llm package
- class pe.llm.AzureOpenAILLM(progress_bar=True, dry_run=False, num_threads=1, **generation_args)[source]
Bases:
LLM
A wrapper for Azure OpenAI LLM APIs. The following environment variables are required:
AZURE_OPENAI_API_KEY
: Azure OpenAI API key. You can get it from https://portal.azure.com/. Multiple keys can be separated by commas. The key can also be "AZ_CLI", in which case the Azure CLI will be used to authenticate the requests, and the environment variable AZURE_OPENAI_API_SCOPE needs to be set. See the Azure OpenAI authentication documentation for more information: https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/switching-endpoints#microsoft-entra-id-authentication
AZURE_OPENAI_API_ENDPOINT
: Azure OpenAI endpoint. Multiple endpoints can be separated by commas. You can get it from https://portal.azure.com/.
AZURE_OPENAI_API_VERSION
: Azure OpenAI API version. You can get it from https://portal.azure.com/.
Suppose $x_1$ API keys and $x_2$ endpoints are provided, and $X$ API key + endpoint pairs are desired. Then each of $x_1$ and $x_2$ must equal either $X$ or 1, and $\max(x_1, x_2)$ must equal $X$. Any $x_i$ that equals 1 is reused across all $X$ pairs. For each request, the API key + endpoint pair with the lowest current workload is used.
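For example, a minimal setup could look like the following sketch. The key, endpoint, version, and deployment name are placeholders, and passing the deployment name via a model generation argument is an assumption about the configuration, not something stated above.

```python
import os

from pe.llm import AzureOpenAILLM

# Placeholder values -- replace with your own resource's settings.
os.environ["AZURE_OPENAI_API_KEY"] = "my-key-1,my-key-2"  # or "AZ_CLI"
os.environ["AZURE_OPENAI_API_ENDPOINT"] = "https://my-resource.openai.azure.com/"
os.environ["AZURE_OPENAI_API_VERSION"] = "2024-02-01"

# Extra keyword arguments are forwarded as generation arguments.
llm = AzureOpenAILLM(num_threads=4, model="my-gpt-4-deployment", temperature=0.9)
```

Here $x_1 = 2$ keys and $x_2 = 1$ endpoint give $X = 2$ key + endpoint pairs, with the single endpoint reused for both keys.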
- __init__(progress_bar=True, dry_run=False, num_threads=1, **generation_args)[source]
Constructor.
- Parameters:
progress_bar (bool, optional) – Whether to show the progress bar, defaults to True
dry_run (bool, optional) – Whether to enable dry run. When dry run is enabled, the responses are fake and the APIs are not called. Defaults to False
num_threads (int, optional) – The number of threads to use for making concurrent API calls, defaults to 1
**generation_args (str) – The generation arguments that will be passed to the Azure OpenAI API
- Raises:
ValueError – If the numbers of API keys and endpoints are incompatible (see the class description above)
- _get_environment_variable(name)[source]
Get the environment variable.
- Parameters:
name (str) – The name of the environment variable
- Raises:
ValueError – If the environment variable is not set
- Returns:
The value of the environment variable
- Return type:
str
- _get_response_for_one_request(messages, generation_args)[source]
Get the response for one request.
- Parameters:
messages (list[str]) – The messages
generation_args (dict) – The generation arguments
- Returns:
The response
- Return type:
str
- property generation_arg_map
Get the mapping from the generation arguments to arguments for this specific LLM.
- Returns:
The mapping that maps max_completion_tokens to max_tokens
- Return type:
dict
- get_responses(requests, **generation_args)[source]
Get the responses from the LLM.
- Parameters:
requests (list[pe.llm.Request]) – The requests
**generation_args (str) – The generation arguments. The priority of the generation arguments, from highest to lowest, is: the arguments set in the requests > the arguments passed to this function > the arguments passed to the constructor
- Returns:
The responses
- Return type:
list[str]
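A hedged usage sketch, assuming the chat-style role/content message dicts used by the OpenAI API (the deployment name is hypothetical):

```python
from pe.llm import AzureOpenAILLM, Request

llm = AzureOpenAILLM(model="my-gpt-4-deployment")
requests = [
    Request(
        messages=[{"role": "user", "content": "Write a one-sentence story."}],
        generation_args={"temperature": 1.2},  # per-request args take top priority
    ),
    Request(
        messages=[{"role": "user", "content": "Write a two-sentence story."}],
        generation_args={},
    ),
]
# Call-time arguments rank between per-request and constructor arguments.
responses = llm.get_responses(requests, max_completion_tokens=64)
```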
- class pe.llm.HuggingfaceLLM(model_name_or_path, batch_size=128, dry_run=False, **generation_args)[source]
Bases:
LLM
A wrapper for Huggingface LLMs.
- __init__(model_name_or_path, batch_size=128, dry_run=False, **generation_args)[source]
Constructor.
- Parameters:
model_name_or_path (str) – The model name or path of the Huggingface model. Note that we use the FastChat library (https://github.com/lm-sys/FastChat) to manage the conversation template. If the conversation template of your desired model is not available in FastChat, please register it in the FastChat library first; a hedged sketch is given under _get_conv_template below, and see the following link for a real example: https://github.com/microsoft/DPSDA/blob/main/pe/llm/huggingface/register_fastchat/gpt2.py
batch_size (int, optional) – The batch size to use for generating the responses, defaults to 128
dry_run (bool, optional) – Whether to enable dry run. When dry run is enabled, the responses are fake and the LLMs are not called. Defaults to False
**generation_args (str) – The generation arguments that will be passed to the Huggingface model
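A minimal construction sketch (gpt2 is the model whose template registration is linked above; the argument values are illustrative):

```python
from pe.llm import HuggingfaceLLM

llm = HuggingfaceLLM(
    model_name_or_path="gpt2",
    batch_size=64,
    # Mapped to max_new_tokens for this LLM; see generation_arg_map below.
    max_completion_tokens=128,
)
```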
- _get_conv_template()[source]
Get the conversation template.
- Returns:
The empty conversation template for this model from FastChat
- Return type:
fastchat.conversation.Conversation
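If a model's template is missing from FastChat, a registration along the following lines can run before constructing HuggingfaceLLM. This is a minimal sketch assuming a plain separator-only chat format; the template name and field values are illustrative, not taken from the linked gpt2.py.

```python
from fastchat.conversation import Conversation, SeparatorStyle, register_conv_template

# Register a bare-bones template under a name FastChat can match to the model.
register_conv_template(
    Conversation(
        name="my-model",  # hypothetical template name
        system_message="",
        roles=("user", "assistant"),
        sep_style=SeparatorStyle.NO_COLON_SINGLE,
        sep="\n",
    )
)
```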
- _get_prompt(messages)[source]
Get the prompt from the messages.
- Parameters:
messages (list[dict]) – The messages
- Raises:
ValueError – If the role is invalid
- Returns:
The prompt
- Return type:
str
- _get_responses(prompt_list, generation_args)[source]
Get the responses from the LLM.
- Parameters:
prompt_list (list[str]) – The prompts
generation_args (dict) – The generation arguments
- Returns:
The responses
- Return type:
list[str]
- property generation_arg_map
Get the mapping from the generation arguments to arguments for this specific LLM.
- Returns:
The mapping that maps max_completion_tokens to max_new_tokens
- Return type:
dict
- get_responses(requests, **generation_args)[source]
Get the responses from the LLM.
- Parameters:
requests (list[pe.llm.Request]) – The requests
**generation_args (str) – The generation arguments. The priority of the generation arguments, from highest to lowest, is: the arguments set in the requests > the arguments passed to this function > the arguments passed to the constructor
- Returns:
The responses
- Return type:
list[str]
- class pe.llm.LLM[source]
Bases:
ABC
The abstract class for large language models (LLMs).
- property generation_arg_map
Get the mapping from the generation arguments to arguments for this specific LLM.
- Returns:
The mapping from the generation arguments to the large language model arguments
- Return type:
dict
- get_generation_args(*args)[source]
Get the generation arguments from a list of dictionaries.
- Parameters:
*args (dict) – A list of generation arguments. The later ones will overwrite the earlier ones.
- Returns:
The generation arguments
- Return type:
dict
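For instance, later dictionaries win on conflicting keys. A sketch (it is not stated above whether the generation_arg_map renaming is also applied here, so the example sticks to unmapped keys):

```python
merged = llm.get_generation_args(
    {"temperature": 1.0, "top_p": 0.95},  # e.g., constructor arguments
    {"temperature": 0.7},                 # e.g., call-time arguments
)
# merged == {"temperature": 0.7, "top_p": 0.95}
```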
- abstract get_responses(requests, **generation_args)[source]
Get the responses from the LLM.
- Parameters:
requests (list[pe.llm.request.Request]) – The requests
**generation_args (str) – The generation arguments
- Returns:
The responses
- Return type:
list[str]
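A subclass only needs to implement get_responses. A toy echo-style sketch, assuming the role/content message dicts used by the wrappers above:

```python
from pe.llm import LLM


class EchoLLM(LLM):
    """A toy LLM that echoes the last user message; for illustration only."""

    def get_responses(self, requests, **generation_args):
        responses = []
        for request in requests:
            # Per-request arguments overwrite the call-time arguments.
            args = self.get_generation_args(generation_args, request.generation_args)
            last_user = next(
                m["content"] for m in reversed(request.messages) if m["role"] == "user"
            )
            responses.append(last_user[: args.get("max_completion_tokens", 100)])
        return responses
```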
- class pe.llm.OpenAILLM(progress_bar=True, dry_run=False, num_threads=1, **generation_args)[source]
Bases:
LLM
A wrapper for OpenAI LLM APIs. The following environment variables are required:
OPENAI_API_KEY
: OpenAI API key. You can get it from https://platform.openai.com/account/api-keys. Multiple keys can be separated by commas, and a key with the lowest current workload will be used for each request.
- __init__(progress_bar=True, dry_run=False, num_threads=1, **generation_args)[source]
Constructor.
- Parameters:
progress_bar (bool, optional) – Whether to show the progress bar, defaults to True
dry_run (bool, optional) – Whether to enable dry run. When dry run is enabled, the responses are fake and the APIs are not called. Defaults to False
num_threads (int, optional) – The number of threads to use for making concurrent API calls, defaults to 1
**generation_args (str) – The generation arguments that will be passed to the OpenAI API
- _get_environment_variable(name)[source]
Get the environment variable.
- Parameters:
name (str) – The name of the environment variable
- Raises:
ValueError – If the environment variable is not set
- Returns:
The value of the environment variable
- Return type:
str
- _get_response_for_one_request(messages, generation_args)[source]
Get the response for one request.
- Parameters:
messages (list[str]) – The messages
generation_args (dict) – The generation arguments
- Returns:
The response
- Return type:
str
- get_responses(requests, **generation_args)[source]
Get the responses from the LLM.
- Parameters:
requests (list[pe.llm.request.Request]) – The requests
**generation_args (str) – The generation arguments. The priority of the generation arguments, from highest to lowest, is: the arguments set in the requests > the arguments passed to this function > the arguments passed to the constructor
- Returns:
The responses
- Return type:
list[str]
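To make the precedence concrete (the model name and values are illustrative):

```python
from pe.llm import OpenAILLM, Request

llm = OpenAILLM(model="gpt-4o-mini", temperature=1.0)  # lowest priority
request = Request(
    messages=[{"role": "user", "content": "Say hi."}],
    generation_args={"temperature": 0.2},  # highest priority
)
# The call-time temperature=0.5 loses to the request's 0.2;
# both override the constructor's 1.0.
responses = llm.get_responses([request], temperature=0.5)
```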
- namedtuple pe.llm.Request(messages, generation_args)
Bases:
namedtuple()
The request to the LLM.
- Parameters:
messages (list[dict]) – The messages to the LLM
generation_args (dict) – The generation arguments to the LLM
Request(messages, generation_args)
- Fields:
messages – Alias for field number 0
generation_args – Alias for field number 1
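Since Request is a plain namedtuple, constructing and inspecting one is direct (field values illustrative):

```python
from pe.llm import Request

request = Request(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Name three colors."},
    ],
    generation_args={"max_completion_tokens": 32},
)
print(request.messages[1]["content"])  # access by field name, like any namedtuple
```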