LLMLingua Prompt Compression#

Introduction#

The LLMLingua Prompt Compression tool speeds up large language model inference and enhances the model's perception of key information by compressing the prompt with minimal performance loss.

Requirements#

PyPI package: llmlingua-promptflow.

Prerequisite#

Create a MaaS (model as a service) deployment for a large language model in the Azure model catalog. Taking the Llama model as an example, you can learn how to deploy and consume Meta Llama models with model as a service from the guidance for Azure AI Studio.

Inputs#

The tool accepts the following inputs:

| Name | Type | Description | Required |
| ------ | ---------------- | ------------------------------------------------------------------------ | -------- |
| prompt | string | The prompt that needs to be compressed. | Yes |
| myconn | CustomConnection | The created connection to a MaaS resource for calculating log probability. | Yes |
| rate | float | The maximum compression rate target to be achieved. Default value is 0.5. | No |
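The `rate` input is the target ratio of compressed prompt length to original prompt length. As a minimal sketch of what that ratio means (using a naive whitespace tokenizer purely for illustration; the actual tool selects tokens using log probabilities from the MaaS deployment, and `target_token_budget` is a hypothetical helper, not part of the package):

```python
def target_token_budget(prompt: str, rate: float = 0.5) -> int:
    """Approximate number of tokens a compressor keeps at the given rate.

    Illustrative only: whitespace tokenization stands in for the tool's
    real tokenizer, and the selection heuristic is not shown.
    """
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")
    tokens = prompt.split()
    # Keep at least one token so the compressed prompt is never empty.
    return max(1, round(len(tokens) * rate))


prompt = "Summarize the following meeting notes into three short bullet points"
print(target_token_budget(prompt, rate=0.5))  # 10 tokens -> budget of 5
```

At `rate=0.5`, a 10-token prompt is reduced to roughly a 5-token budget; lowering the rate compresses more aggressively at a greater risk of losing key information.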

Outputs#

| Return Type | Description |
| ----------- | -------------------------------- |
| string | The resulting compressed prompt. |

Sample Flows#

Find example flows using the llmlingua-promptflow package here.

Contact#

Please reach out to the LLMLingua Team (llmlingua@microsoft.com) with any issues.