Task Module¶
openaivec.task ¶
Pre-configured task library for OpenAI API structured outputs.
This module provides a comprehensive collection of pre-configured tasks designed for various business and academic use cases. Tasks are organized into domain-specific submodules, each containing ready-to-use PreparedTask instances that work seamlessly with openaivec's batch processing capabilities.
Available Task Domains¶
Natural Language Processing (nlp)¶
Core NLP tasks for text analysis and processing:
- Translation: Multi-language translation with 40+ language support
- Sentiment Analysis: Emotion detection and sentiment scoring
- Named Entity Recognition: Extract people, organizations, locations
- Morphological Analysis: Part-of-speech tagging and lemmatization
- Dependency Parsing: Syntactic structure analysis
- Keyword Extraction: Important term identification
Customer Support (customer_support)¶
Specialized tasks for customer service operations:
- Intent Analysis: Understand customer goals and requirements
- Sentiment Analysis: Customer satisfaction and emotional state
- Urgency Analysis: Priority assessment and response time recommendations
- Inquiry Classification: Automatic categorization and routing
- Inquiry Summary: Comprehensive issue summarization
- Response Suggestion: AI-powered response drafting
Usage Patterns¶
Quick Start with Default Tasks¶
```python
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task import nlp, customer_support

client = OpenAI()

# Use pre-configured tasks
sentiment_analyzer = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=nlp.SENTIMENT_ANALYSIS,
)

intent_analyzer = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=customer_support.INTENT_ANALYSIS,
)
```
Customized Task Configuration¶
```python
from openaivec.task.customer_support import urgency_analysis

# Create customized urgency analysis
custom_urgency = urgency_analysis(
    business_context="SaaS platform support",
    urgency_levels={
        "critical": "Service outages, security breaches",
        "high": "Login issues, payment failures",
        "medium": "Feature bugs, billing questions",
        "low": "Feature requests, general feedback",
    },
)

analyzer = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=custom_urgency,
)
```
Pandas Integration¶
```python
import pandas as pd
from openaivec import pandas_ext  # Required for the .ai accessor

df = pd.DataFrame({"text": ["I love this!", "This is terrible."]})

# Apply tasks directly to DataFrame columns
df["sentiment"] = df["text"].ai.task(nlp.SENTIMENT_ANALYSIS)
df["intent"] = df["text"].ai.task(customer_support.INTENT_ANALYSIS)

# Extract structured results
results_df = df.ai.extract("sentiment")
```
Spark Integration¶
```python
from openaivec.spark import ResponsesUDFBuilder

# Register a UDF for large-scale processing
spark.udf.register(
    "analyze_sentiment",
    ResponsesUDFBuilder.of_openai(
        api_key=api_key,
        model_name="gpt-4.1-mini",
    ).build_from_task(task=nlp.SENTIMENT_ANALYSIS),
)

# Use in Spark SQL
df = spark.sql("""
    SELECT text, analyze_sentiment(text) AS sentiment
    FROM customer_feedback
""")
```
Task Architecture¶
PreparedTask Structure¶
All tasks are built using the PreparedTask dataclass:

```python
@dataclass(frozen=True)
class PreparedTask:
    instructions: str                      # Detailed prompt for the LLM
    response_format: type[ResponseFormat]  # Pydantic model, or str for plain-text output
    temperature: float = 0.0               # Sampling temperature
    top_p: float = 1.0                     # Nucleus sampling parameter
```
Response Format Standards¶
- Literal Types: Categorical fields use typing.Literal for type safety
- Multilingual: Non-categorical fields respond in the input language
- Validation: Pydantic models ensure data integrity
- Spark Compatible: All types map correctly to Spark schemas
Design Principles¶
- Consistency: Uniform API across all task domains
- Configurability: Customizable parameters for different use cases
- Type Safety: Strong typing with Pydantic validation
- Scalability: Optimized for batch processing and large datasets
- Extensibility: Easy to add new domains and tasks
Adding New Task Domains¶
To add a new domain (e.g., finance, healthcare, legal):
- Create Domain Module:
src/openaivec/task/new_domain/
- Implement Tasks: Following existing patterns with Pydantic models
- Add Multilingual Support: Include language-aware instructions
- Export Functions: Both configurable functions and constants
- Update Documentation: Add to this module docstring
Example New Domain Structure¶
```
src/openaivec/task/finance/
├── __init__.py           # Export all functions and constants
├── risk_assessment.py    # Credit risk, market risk analysis
├── document_analysis.py  # Financial document processing
└── compliance_check.py   # Regulatory compliance verification
```
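A task module in such a domain typically pairs a Pydantic response model with a configurable factory function plus a module-level constant. The sketch below follows that pattern; the `PreparedTask` stand-in mirrors the dataclass shown earlier so the example is self-contained, and the `risk_assessment` function, its fields, and its prompt are hypothetical:

```python
from dataclasses import dataclass
from typing import Literal

from pydantic import BaseModel


# Local stand-in mirroring the PreparedTask structure described above
@dataclass(frozen=True)
class PreparedTask:
    instructions: str
    response_format: type
    temperature: float = 0.0
    top_p: float = 1.0


class RiskAssessment(BaseModel):
    # Categorical field stays in English; free text follows the input language
    risk_level: Literal["low", "medium", "high", "critical"]
    rationale: str


def risk_assessment(business_context: str = "general finance") -> PreparedTask:
    """Create a configurable risk assessment task."""
    instructions = (
        f"You are a financial risk analyst for {business_context}. "
        "Assess the risk level of the given text and explain your reasoning, "
        "responding in the language of the input."
    )
    return PreparedTask(instructions=instructions, response_format=RiskAssessment)


# Module-level constant exporting the default configuration
RISK_ASSESSMENT = risk_assessment()
```

Exporting both the factory function and the constant keeps the new domain consistent with the existing modules: quick prototyping uses the constant, production code calls the function with custom parameters.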
Performance Considerations¶
- Batch Processing: Use BatchResponses for multiple inputs
- Deduplication: Automatic duplicate removal reduces API costs
- Caching: Results are cached based on input content
- Async Support: AsyncBatchResponses for concurrent processing
- Token Optimization: Vectorized system messages for efficiency
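The cost effect of deduplication can be illustrated with a minimal sketch: unique inputs are processed once, and results are fanned back out to every original position. This mirrors the idea only, not openaivec's internal implementation, and `fake_llm` is a stand-in for a real batch API call:

```python
def batch_with_dedup(inputs, process_batch):
    """Process only unique inputs, then map results back to every position."""
    unique = list(dict.fromkeys(inputs))  # preserves first-seen order
    results = dict(zip(unique, process_batch(unique)))
    return [results[x] for x in inputs]


calls = []

def fake_llm(batch):
    """Illustrative stand-in for an API call; records the batch size."""
    calls.append(len(batch))
    return [s.upper() for s in batch]

out = batch_with_dedup(["hi", "bye", "hi", "hi"], fake_llm)
# Only 2 unique inputs reach the (fake) API, but all 4 rows get results
```

With highly repetitive columns (e.g. categorical survey answers), this pattern can cut API calls by an order of magnitude.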
Best Practices¶
- Choose Appropriate Models:
    - gpt-4.1-mini: Fast, cost-effective for most tasks
    - gpt-4o: Higher accuracy for complex analysis
- Customize When Needed:
    - Use default tasks for quick prototyping
    - Configure custom tasks for production use
- Handle Multilingual Input:
    - Tasks automatically detect and respond in the input language
    - Categorical fields remain in English for system compatibility
- Monitor Performance:
    - Use batch sizes appropriate for your use case
    - Monitor token usage for cost optimization
See individual task modules for detailed documentation and examples.
Modules¶
customer_support ¶
Modules¶
customer_sentiment ¶
Customer sentiment analysis task for support interactions.
This module provides a predefined task for analyzing customer sentiment specifically in support contexts, including satisfaction levels and emotional states that affect customer experience and support strategy.
Example
Basic usage with BatchResponses:
```python
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task import customer_support

client = OpenAI()
analyzer = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=customer_support.CUSTOMER_SENTIMENT,
)

inquiries = [
    "I'm really disappointed with your service. This is the third time I've had this issue.",
    "Thank you so much for your help! You've been incredibly patient.",
    "I need to cancel my subscription. It's not working for me.",
]

sentiments = analyzer.parse(inquiries)
for sentiment in sentiments:
    print(f"Sentiment: {sentiment.sentiment}")
    print(f"Satisfaction: {sentiment.satisfaction_level}")
    print(f"Churn Risk: {sentiment.churn_risk}")
    print(f"Emotional State: {sentiment.emotional_state}")
```
With pandas integration:
```python
import pandas as pd
from openaivec import pandas_ext  # Required for the .ai accessor
from openaivec.task import customer_support

df = pd.DataFrame({"inquiry": [
    "I'm really disappointed with your service. This is the third time I've had this issue.",
    "Thank you so much for your help! You've been incredibly patient.",
    "I need to cancel my subscription. It's not working for me.",
]})

df["sentiment"] = df["inquiry"].ai.task(customer_support.CUSTOMER_SENTIMENT)

# Extract sentiment components
extracted_df = df.ai.extract("sentiment")
print(extracted_df[[
    "inquiry", "sentiment_satisfaction_level",
    "sentiment_churn_risk", "sentiment_emotional_state",
]])
```
Attributes:
| Name | Type | Description |
|---|---|---|
| CUSTOMER_SENTIMENT | PreparedTask | A prepared task instance configured for customer sentiment analysis with temperature=0.0 and top_p=1.0 for deterministic output. |
```python
customer_sentiment(
    business_context: str = "general customer support",
    **api_kwargs,
) -> PreparedTask
```
Create a configurable customer sentiment analysis task.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| business_context | str | Business context for sentiment analysis. | 'general customer support' |
| **api_kwargs | | Additional OpenAI API parameters (temperature, top_p, etc.). | {} |
Returns:
| Type | Description |
|---|---|
| PreparedTask | PreparedTask configured for customer sentiment analysis. |
Source code in src/openaivec/task/customer_support/customer_sentiment.py
inquiry_classification ¶
Inquiry classification task for customer support.
This module provides a configurable task for classifying customer inquiries into different categories to help route them to the appropriate support team.
Example
Basic usage with default settings:
```python
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task import customer_support

client = OpenAI()
classifier = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=customer_support.inquiry_classification(),
)

inquiries = [
    "I can't log into my account",
    "When will my order arrive?",
    "I want to cancel my subscription",
]

classifications = classifier.parse(inquiries)
for classification in classifications:
    print(f"Category: {classification.category}")
    print(f"Subcategory: {classification.subcategory}")
    print(f"Confidence: {classification.confidence}")
    print(f"Routing: {classification.routing}")
```
Customized for e-commerce:
```python
from openaivec.task import customer_support

# E-commerce specific categories
ecommerce_categories = {
    "order_management": ["order_status", "order_cancellation", "order_modification", "returns"],
    "payment": ["payment_failed", "refund_request", "payment_methods", "billing_inquiry"],
    "product": ["product_info", "size_guide", "availability", "recommendations"],
    "shipping": ["delivery_status", "shipping_cost", "delivery_options", "tracking"],
    "account": ["login_issues", "account_settings", "profile_updates", "password_reset"],
    "general": ["complaints", "compliments", "feedback", "other"],
}

ecommerce_routing = {
    "order_management": "order_team",
    "payment": "billing_team",
    "product": "product_team",
    "shipping": "logistics_team",
    "account": "account_support",
    "general": "general_support",
}

task = customer_support.inquiry_classification(
    categories=ecommerce_categories,
    routing_rules=ecommerce_routing,
    business_context="e-commerce platform",
)

classifier = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=task,
)
```
With pandas integration:
```python
import pandas as pd
from openaivec import pandas_ext  # Required for the .ai accessor
from openaivec.task import customer_support

df = pd.DataFrame({"inquiry": [
    "I can't log into my account",
    "When will my order arrive?",
    "I want to cancel my subscription",
]})

df["classification"] = df["inquiry"].ai.task(customer_support.inquiry_classification())

# Extract classification components
extracted_df = df.ai.extract("classification")
print(extracted_df[[
    "inquiry", "classification_category",
    "classification_subcategory", "classification_confidence",
]])
```
```python
inquiry_classification(
    categories: Dict[str, list[str]] | None = None,
    routing_rules: Dict[str, str] | None = None,
    priority_rules: Dict[str, str] | None = None,
    business_context: str = "general customer support",
    custom_keywords: Dict[str, list[str]] | None = None,
    **api_kwargs,
) -> PreparedTask
```
Create a configurable inquiry classification task.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| categories | dict[str, list[str]] \| None | Dictionary mapping category names to lists of subcategories. Default provides standard support categories. | None |
| routing_rules | dict[str, str] \| None | Dictionary mapping categories to routing destinations. Default provides standard routing options. | None |
| priority_rules | dict[str, str] \| None | Dictionary mapping keywords/patterns to priority levels. Default uses standard priority indicators. | None |
| business_context | str | Description of the business context to help with classification. | 'general customer support' |
| custom_keywords | dict[str, list[str]] \| None | Dictionary mapping categories to relevant keywords. | None |
| **api_kwargs | | Additional keyword arguments to pass to the OpenAI API, such as temperature, top_p, etc. | {} |
Returns:
| Type | Description |
|---|---|
| PreparedTask | PreparedTask configured for inquiry classification. |
Source code in src/openaivec/task/customer_support/inquiry_classification.py
inquiry_summary ¶
Inquiry summary task for customer support interactions.
This module provides a predefined task for summarizing customer inquiries, extracting key information, and creating concise summaries for support agents and management reporting.
Example
Basic usage with BatchResponses:
```python
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task import customer_support

client = OpenAI()
summarizer = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=customer_support.INQUIRY_SUMMARY,
)

inquiries = [
    '''Hi there, I've been having trouble with my account for the past week.
    Every time I try to log in, it says my password is incorrect, but I'm sure
    it's right. I tried resetting it twice but the email never arrives.
    I'm getting really frustrated because I need to access my files for work tomorrow.''',
    '''I love your product! It's been incredibly helpful for my team.
    However, I was wondering if there's any way to get more storage space?
    We're running out and would like to upgrade our plan.''',
]

summaries = summarizer.parse(inquiries)
for summary in summaries:
    print(f"Summary: {summary.summary}")
    print(f"Issue: {summary.main_issue}")
    print(f"Actions Taken: {summary.actions_taken}")
    print(f"Resolution Status: {summary.resolution_status}")
```
With pandas integration:
```python
import pandas as pd
from openaivec import pandas_ext  # Required for the .ai accessor
from openaivec.task import customer_support

df = pd.DataFrame({"inquiry": [long_inquiry_text]})
df["summary"] = df["inquiry"].ai.task(customer_support.INQUIRY_SUMMARY)

# Extract summary components
extracted_df = df.ai.extract("summary")
print(extracted_df[["inquiry", "summary_main_issue", "summary_resolution_status"]])
```
Attributes:
| Name | Type | Description |
|---|---|---|
| INQUIRY_SUMMARY | PreparedTask | A prepared task instance configured for inquiry summarization with temperature=0.0 and top_p=1.0 for deterministic output. |
```python
inquiry_summary(
    summary_length: str = "concise",
    business_context: str = "general customer support",
    **api_kwargs,
) -> PreparedTask
```
Create a configurable inquiry summary task.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| summary_length | str | Length of summary (concise, detailed, bullet_points). | 'concise' |
| business_context | str | Business context for summary. | 'general customer support' |
| **api_kwargs | | Additional keyword arguments to pass to the OpenAI API, such as temperature, top_p, etc. | {} |
Returns:
| Type | Description |
|---|---|
| PreparedTask | PreparedTask configured for inquiry summarization. |
Source code in src/openaivec/task/customer_support/inquiry_summary.py
intent_analysis ¶
Intent analysis task for customer support interactions.
This module provides a predefined task for analyzing customer intent to understand what the customer is trying to achieve and how to best assist them.
Example
Basic usage with BatchResponses:
```python
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task import customer_support

client = OpenAI()
analyzer = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=customer_support.INTENT_ANALYSIS,
)

inquiries = [
    "I want to upgrade my plan to get more storage",
    "How do I delete my account? I'm not satisfied with the service",
    "Can you walk me through setting up the mobile app?",
]

intents = analyzer.parse(inquiries)
for intent in intents:
    print(f"Primary Intent: {intent.primary_intent}")
    print(f"Action Required: {intent.action_required}")
    print(f"Success Likelihood: {intent.success_likelihood}")
    print(f"Next Steps: {intent.next_steps}")
```
With pandas integration:
```python
import pandas as pd
from openaivec import pandas_ext  # Required for the .ai accessor
from openaivec.task import customer_support

df = pd.DataFrame({"inquiry": [
    "I want to upgrade my plan to get more storage",
    "How do I delete my account? I'm not satisfied with the service",
    "Can you walk me through setting up the mobile app?",
]})

df["intent"] = df["inquiry"].ai.task(customer_support.INTENT_ANALYSIS)

# Extract intent components
extracted_df = df.ai.extract("intent")
print(extracted_df[["inquiry", "intent_primary_intent", "intent_action_required", "intent_success_likelihood"]])
```
Attributes:
| Name | Type | Description |
|---|---|---|
| INTENT_ANALYSIS | PreparedTask | A prepared task instance configured for intent analysis with temperature=0.0 and top_p=1.0 for deterministic output. |
```python
intent_analysis(
    business_context: str = "general customer support",
    **api_kwargs,
) -> PreparedTask
```
Create a configurable intent analysis task.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| business_context | str | Business context for intent analysis. | 'general customer support' |
| **api_kwargs | | Additional keyword arguments to pass to the OpenAI API, such as temperature, top_p, etc. | {} |
Returns:
| Type | Description |
|---|---|
| PreparedTask | PreparedTask configured for intent analysis. |
Source code in src/openaivec/task/customer_support/intent_analysis.py
response_suggestion ¶
Response suggestion task for customer support interactions.
This module provides a predefined task for generating suggested responses to customer inquiries, helping support agents provide consistent, helpful, and professional communication.
Example
Basic usage with BatchResponses:
```python
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task import customer_support

client = OpenAI()
responder = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=customer_support.RESPONSE_SUGGESTION,
)

inquiries = [
    "I can't access my account. I've tried resetting my password but the email never arrives.",
    "I'm really disappointed with your service. This is the third time I've had issues.",
    "Thank you for your help yesterday! The problem is now resolved.",
]

responses = responder.parse(inquiries)
for response in responses:
    print(f"Suggested Response: {response.suggested_response}")
    print(f"Tone: {response.tone}")
    print(f"Priority: {response.priority}")
    print(f"Follow-up: {response.follow_up_required}")
```
With pandas integration:
```python
import pandas as pd
from openaivec import pandas_ext  # Required for the .ai accessor
from openaivec.task import customer_support

df = pd.DataFrame({"inquiry": [
    "I can't access my account. I've tried resetting my password but the email never arrives.",
    "I'm really disappointed with your service. This is the third time I've had issues.",
]})

df["response"] = df["inquiry"].ai.task(customer_support.RESPONSE_SUGGESTION)

# Extract response components
extracted_df = df.ai.extract("response")
print(extracted_df[["inquiry", "response_suggested_response", "response_tone", "response_priority"]])
```
Attributes:
| Name | Type | Description |
|---|---|---|
| RESPONSE_SUGGESTION | PreparedTask | A prepared task instance configured for response suggestion with temperature=0.0 and top_p=1.0 for deterministic output. |
```python
response_suggestion(
    response_style: str = "professional",
    company_name: str = "our company",
    business_context: str = "general customer support",
    **api_kwargs,
) -> PreparedTask
```
Create a configurable response suggestion task.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| response_style | str | Style of response (professional, friendly, empathetic, formal). | 'professional' |
| company_name | str | Name of the company for personalization. | 'our company' |
| business_context | str | Business context for responses. | 'general customer support' |
| **api_kwargs | | Additional keyword arguments to pass to the OpenAI API, such as temperature, top_p, etc. | {} |
Returns:
| Type | Description |
|---|---|
| PreparedTask | PreparedTask configured for response suggestions. |
Source code in src/openaivec/task/customer_support/response_suggestion.py
urgency_analysis ¶
Urgency analysis task for customer support.
This module provides a configurable task for analyzing the urgency level of customer inquiries to help prioritize support queue and response times.
Example
Basic usage with default settings:
```python
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task import customer_support

client = OpenAI()
analyzer = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=customer_support.urgency_analysis(),
)

inquiries = [
    "URGENT: My website is down and I'm losing customers!",
    "Can you help me understand how to use the new feature?",
    "I haven't received my order from last week",
]

analyses = analyzer.parse(inquiries)
for analysis in analyses:
    print(f"Urgency Level: {analysis.urgency_level}")
    print(f"Score: {analysis.urgency_score}")
    print(f"Response Time: {analysis.response_time}")
    print(f"Escalation: {analysis.escalation_required}")
```
Customized for SaaS platform with business hours:
```python
from openaivec.task import customer_support

# SaaS-specific urgency levels
saas_urgency_levels = {
    "critical": "Service outages, security breaches, data loss",
    "high": "Login issues, payment failures, API errors",
    "medium": "Feature bugs, performance issues, billing questions",
    "low": "Feature requests, documentation questions, general feedback",
}

# Custom response times based on SLA
saas_response_times = {
    "critical": "immediate",
    "high": "within_1_hour",
    "medium": "within_4_hours",
    "low": "within_24_hours",
}

# Enterprise customer tier gets priority
enterprise_customer_tiers = {
    "enterprise": "Priority support, dedicated account manager",
    "business": "Standard business support",
    "professional": "Professional plan support",
    "starter": "Basic support",
}

task = customer_support.urgency_analysis(
    urgency_levels=saas_urgency_levels,
    response_times=saas_response_times,
    customer_tiers=enterprise_customer_tiers,
    business_context="SaaS platform",
    business_hours="9 AM - 5 PM EST, Monday-Friday",
)

analyzer = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=task,
)
```
With pandas integration:
```python
import pandas as pd
from openaivec import pandas_ext  # Required for the .ai accessor
from openaivec.task import customer_support

df = pd.DataFrame({"inquiry": [
    "URGENT: My website is down and I'm losing customers!",
    "Can you help me understand how to use the new feature?",
    "I haven't received my order from last week",
]})

df["urgency"] = df["inquiry"].ai.task(customer_support.urgency_analysis())

# Extract urgency components
extracted_df = df.ai.extract("urgency")
print(extracted_df[["inquiry", "urgency_urgency_level", "urgency_urgency_score", "urgency_response_time"]])
```
```python
urgency_analysis(
    urgency_levels: Dict[str, str] | None = None,
    response_times: Dict[str, str] | None = None,
    customer_tiers: Dict[str, str] | None = None,
    escalation_rules: Dict[str, str] | None = None,
    urgency_keywords: Dict[str, list[str]] | None = None,
    business_context: str = "general customer support",
    business_hours: str = "24/7 support",
    sla_rules: Dict[str, str] | None = None,
    **api_kwargs,
) -> PreparedTask
```
Create a configurable urgency analysis task.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| urgency_levels | dict[str, str] \| None | Dictionary mapping urgency levels to descriptions. | None |
| response_times | dict[str, str] \| None | Dictionary mapping urgency levels to response times. | None |
| customer_tiers | dict[str, str] \| None | Dictionary mapping tier names to descriptions. | None |
| escalation_rules | dict[str, str] \| None | Dictionary mapping conditions to escalation actions. | None |
| urgency_keywords | dict[str, list[str]] \| None | Dictionary mapping urgency levels to indicator keywords. | None |
| business_context | str | Description of the business context. | 'general customer support' |
| business_hours | str | Description of business hours for response time calculation. | '24/7 support' |
| sla_rules | dict[str, str] \| None | Dictionary mapping customer tiers to SLA requirements. | None |
| **api_kwargs | | Additional keyword arguments to pass to the OpenAI API, such as temperature, top_p, etc. | {} |
Returns:
| Type | Description |
|---|---|
| PreparedTask | PreparedTask configured for urgency analysis. |
Source code in src/openaivec/task/customer_support/urgency_analysis.py
nlp ¶
Modules¶
dependency_parsing ¶
Dependency parsing task for OpenAI API.
This module provides a predefined task for dependency parsing that analyzes syntactic dependencies between words in sentences using OpenAI's language models.
Example
Basic usage with BatchResponses:
```python
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task import nlp

client = OpenAI()
analyzer = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=nlp.DEPENDENCY_PARSING,
)

texts = ["The cat sat on the mat.", "She quickly ran to the store."]
analyses = analyzer.parse(texts)
for analysis in analyses:
    print(f"Tokens: {analysis.tokens}")
    print(f"Dependencies: {analysis.dependencies}")
    print(f"Root: {analysis.root_word}")
```
With pandas integration:
```python
import pandas as pd
from openaivec import pandas_ext  # Required for the .ai accessor
from openaivec.task import nlp

df = pd.DataFrame({"text": ["The cat sat on the mat.", "She quickly ran to the store."]})
df["parsing"] = df["text"].ai.task(nlp.DEPENDENCY_PARSING)

# Extract parsing components
extracted_df = df.ai.extract("parsing")
print(extracted_df[["text", "parsing_tokens", "parsing_root_word", "parsing_syntactic_structure"]])
```
Attributes:
| Name | Type | Description |
|---|---|---|
| DEPENDENCY_PARSING | PreparedTask | A prepared task instance configured for dependency parsing with temperature=0.0 and top_p=1.0 for deterministic output. |
keyword_extraction ¶
Keyword extraction task for OpenAI API.
This module provides a predefined task for keyword extraction that identifies important keywords and phrases from text using OpenAI's language models.
Example
Basic usage with BatchResponses:
```python
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task import nlp

client = OpenAI()
analyzer = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=nlp.KEYWORD_EXTRACTION,
)

texts = [
    "Machine learning is transforming the technology industry.",
    "Climate change affects global weather patterns.",
]
analyses = analyzer.parse(texts)
for analysis in analyses:
    print(f"Keywords: {analysis.keywords}")
    print(f"Key phrases: {analysis.keyphrases}")
    print(f"Topics: {analysis.topics}")
```
With pandas integration:
```python
import pandas as pd
from openaivec import pandas_ext  # Required for the .ai accessor
from openaivec.task import nlp

df = pd.DataFrame({"text": [
    "Machine learning is transforming the technology industry.",
    "Climate change affects global weather patterns.",
]})
df["keywords"] = df["text"].ai.task(nlp.KEYWORD_EXTRACTION)

# Extract keyword components
extracted_df = df.ai.extract("keywords")
print(extracted_df[["text", "keywords_keywords", "keywords_topics", "keywords_summary"]])
```
Attributes:
| Name | Type | Description |
|---|---|---|
| KEYWORD_EXTRACTION | PreparedTask | A prepared task instance configured for keyword extraction with temperature=0.0 and top_p=1.0 for deterministic output. |
morphological_analysis ¶
Morphological analysis task for OpenAI API.
This module provides a predefined task for morphological analysis including tokenization, part-of-speech tagging, and lemmatization using OpenAI's language models.
Example
Basic usage with BatchResponses:
```python
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task import nlp

client = OpenAI()
analyzer = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=nlp.MORPHOLOGICAL_ANALYSIS,
)

texts = ["Running quickly", "The cats are sleeping"]
analyses = analyzer.parse(texts)
for analysis in analyses:
    print(f"Tokens: {analysis.tokens}")
    print(f"POS Tags: {analysis.pos_tags}")
    print(f"Lemmas: {analysis.lemmas}")
```
With pandas integration:
```python
import pandas as pd
from openaivec import pandas_ext  # Required for the .ai accessor
from openaivec.task import nlp

df = pd.DataFrame({"text": ["Running quickly", "The cats are sleeping"]})
df["analysis"] = df["text"].ai.task(nlp.MORPHOLOGICAL_ANALYSIS)

# Extract analysis components
extracted_df = df.ai.extract("analysis")
print(extracted_df[["text", "analysis_tokens", "analysis_pos_tags", "analysis_lemmas"]])
```
Attributes:
| Name | Type | Description |
|---|---|---|
| MORPHOLOGICAL_ANALYSIS | PreparedTask | A prepared task instance configured for morphological analysis with temperature=0.0 and top_p=1.0 for deterministic output. |
named_entity_recognition ¶
Named entity recognition task for OpenAI API.
This module provides a predefined task for named entity recognition that identifies and classifies named entities in text using OpenAI's language models.
Example
Basic usage with BatchResponses:
```python
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task import nlp

client = OpenAI()
analyzer = BatchResponses.of_task(
    client=client,
    model_name="gpt-4.1-mini",
    task=nlp.NAMED_ENTITY_RECOGNITION,
)

texts = ["John works at Microsoft in Seattle", "The meeting is on March 15th"]
analyses = analyzer.parse(texts)
for analysis in analyses:
    print(f"Persons: {analysis.persons}")
    print(f"Organizations: {analysis.organizations}")
    print(f"Locations: {analysis.locations}")
```
With pandas integration:
```python
import pandas as pd
from openaivec import pandas_ext  # Required for the .ai accessor
from openaivec.task import nlp

df = pd.DataFrame({"text": ["John works at Microsoft in Seattle", "The meeting is on March 15th"]})
df["entities"] = df["text"].ai.task(nlp.NAMED_ENTITY_RECOGNITION)

# Extract entity components
extracted_df = df.ai.extract("entities")
print(extracted_df[["text", "entities_persons", "entities_organizations", "entities_locations"]])
```
Attributes:
| Name | Type | Description |
|---|---|---|
| NAMED_ENTITY_RECOGNITION | PreparedTask | A prepared task instance configured for named entity recognition with temperature=0.0 and top_p=1.0 for deterministic output. |
sentiment_analysis ¶
Sentiment analysis task for OpenAI API.
This module provides a predefined task for sentiment analysis that analyzes sentiment and emotions in text using OpenAI's language models.
Example
Basic usage with BatchResponses:
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task import nlp
client = OpenAI()
analyzer = BatchResponses.of_task(
client=client,
model_name="gpt-4.1-mini",
task=nlp.SENTIMENT_ANALYSIS
)
texts = ["I love this product!", "This is terrible and disappointing."]
analyses = analyzer.parse(texts)
for analysis in analyses:
print(f"Sentiment: {analysis.sentiment}")
print(f"Confidence: {analysis.confidence}")
print(f"Emotions: {analysis.emotions}")
With pandas integration:
import pandas as pd
from openaivec import pandas_ext # Required for .ai accessor
from openaivec.task import nlp
df = pd.DataFrame({"text": ["I love this product!", "This is terrible and disappointing."]})
df["sentiment"] = df["text"].ai.task(nlp.SENTIMENT_ANALYSIS)
# Extract sentiment components
extracted_df = df.ai.extract("sentiment")
print(extracted_df[["text", "sentiment_sentiment", "sentiment_confidence", "sentiment_polarity"]])
Attributes:

| Name | Type | Description |
|---|---|---|
| SENTIMENT_ANALYSIS | PreparedTask | A prepared task instance configured for sentiment analysis with temperature=0.0 and top_p=1.0 for deterministic output. |
translation ¶
Multilingual translation task for OpenAI API.
This module provides a predefined task that translates text into multiple languages using OpenAI's language models. The translation covers a comprehensive set of languages including Germanic, Romance, Slavic, East Asian, South Asian, Southeast Asian, Middle Eastern, African, and other language families.
The task is designed to be used with the OpenAI API for batch processing and provides structured output with consistent language code naming.
Example
Basic usage with BatchResponses:
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task import nlp
client = OpenAI()
translator = BatchResponses.of_task(
client=client,
model_name="gpt-4.1-mini",
task=nlp.MULTILINGUAL_TRANSLATION
)
texts = ["Hello", "Good morning", "Thank you"]
translations = translator.parse(texts)
for translation in translations:
print(f"English: {translation.en}")
print(f"Japanese: {translation.ja}")
print(f"Spanish: {translation.es}")
With pandas integration:
import pandas as pd
from openaivec import pandas_ext # Required for .ai accessor
from openaivec.task import nlp
df = pd.DataFrame({"text": ["Hello", "Goodbye"]})
df["translations"] = df["text"].ai.task(nlp.MULTILINGUAL_TRANSLATION)
# Extract specific languages
extracted_df = df.ai.extract("translations")
print(extracted_df[["text", "translations_en", "translations_ja", "translations_fr"]])
Attributes:

| Name | Type | Description |
|---|---|---|
| MULTILINGUAL_TRANSLATION | PreparedTask | A prepared task instance configured for multilingual translation with temperature=0.0 and top_p=1.0 for deterministic output. |
Note
The translation covers 58 languages across major language families. All field names use ISO 639-1 language codes where possible, with some exceptions like 'zh_tw' for Traditional Chinese and 'is_' for Icelandic (to avoid Python keyword conflicts).
Languages included:
- Germanic: English, German, Dutch, Swedish, Danish, Norwegian, Icelandic
- Romance: Spanish, French, Italian, Portuguese, Romanian, Catalan
- Slavic: Russian, Polish, Czech, Slovak, Ukrainian, Bulgarian, Croatian, Serbian
- East Asian: Japanese, Korean, Chinese (Simplified/Traditional)
- South Asian: Hindi, Bengali, Telugu, Tamil, Urdu
- Southeast Asian: Thai, Vietnamese, Indonesian, Malay, Filipino
- Middle Eastern: Arabic, Hebrew, Persian, Turkish
- African: Swahili, Amharic
- Other European: Finnish, Hungarian, Estonian, Latvian, Lithuanian, Greek
- Celtic: Welsh, Irish
- Other: Basque, Maltese
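The field-naming convention noted above can be illustrated with a small sketch. The model below is illustrative only, not the library's actual response schema; it shows how the two documented exceptions ('zh_tw' and 'is_') coexist with plain ISO 639-1 field names:

```python
from dataclasses import dataclass

# Illustrative sketch (not the library's actual model): ISO 639-1 codes
# serve as field names, with the documented exceptions 'zh_tw'
# (Traditional Chinese has no single ISO 639-1 code) and 'is_'
# (Icelandic, since 'is' is a Python keyword).
@dataclass
class TranslationSketch:
    en: str
    ja: str
    zh_tw: str  # Traditional Chinese
    is_: str    # trailing underscore avoids the 'is' keyword clash

t = TranslationSketch(en="Hello", ja="こんにちは", zh_tw="你好", is_="Halló")
print(t.zh_tw, t.is_)
```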
table ¶
Classes¶
FillNaResponse ¶
Bases: BaseModel
Response model for missing value imputation results.
Contains the row index and the imputed value for a specific missing entry in the target column.
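Based on the description above and the field access shown in the usage examples in this module (`result.index`, `result.output`), the response pairs a row index with an imputed value. A plain-dataclass sketch of that shape (the actual class is a Pydantic BaseModel; this sketch is for illustration only):

```python
from dataclasses import dataclass
from typing import Any

# Sketch of the documented shape: a row index plus the imputed value
# for one missing entry in the target column. The real class derives
# from pydantic.BaseModel; this dataclass only mirrors the fields.
@dataclass
class FillNaResponseSketch:
    index: int   # row index of the missing entry
    output: Any  # imputed value for the target column

r = FillNaResponseSketch(index=2, output="Apple")
```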
Modules¶
fillna ¶
Missing value imputation task for DataFrame columns.
This module provides functionality to intelligently fill missing values in DataFrame columns using AI-powered analysis. The task analyzes existing data patterns to generate contextually appropriate values for missing entries.
Example
Basic usage with pandas DataFrame:
import pandas as pd
from openaivec import pandas_ext # Required for .ai accessor
from openaivec.task.table import fillna
# Create DataFrame with missing values
df = pd.DataFrame({
"name": ["Alice", "Bob", None, "David"],
"age": [25, 30, 35, None],
"city": ["New York", "London", "Tokyo", "Paris"],
"salary": [50000, 60000, 70000, None]
})
# Fill missing values in the 'salary' column
task = fillna(df, "salary")
filled_salaries = df[df["salary"].isna()].ai.task(task)
# Apply filled values back to DataFrame
for result in filled_salaries:
df.loc[result.index, "salary"] = result.output
With BatchResponses for more control:
import pandas as pd
from openai import OpenAI
from openaivec._responses import BatchResponses
from openaivec.task.table import fillna
client = OpenAI()
df = pd.DataFrame({...}) # Your DataFrame with missing values
# Create fillna task for target column
task = fillna(df, "target_column")
# Get rows with missing values in target column
missing_rows = df[df["target_column"].isna()]
# Process with BatchResponses
filler = BatchResponses.of_task(
client=client,
model_name="gpt-4.1-mini",
task=task
)
# Generate inputs for missing rows
inputs = []
for idx, row in missing_rows.iterrows():
inputs.append({
"index": idx,
"input": {k: v for k, v in row.items() if k != "target_column"}
})
filled_values = filler.parse(inputs)
FillNaResponse ¶
Bases: BaseModel
Response model for missing value imputation results.
Contains the row index and the imputed value for a specific missing entry in the target column.
fillna(
df: DataFrame,
target_column_name: str,
max_examples: int = 500,
**api_kwargs,
) -> PreparedTask
Create a prepared task for filling missing values in a DataFrame column.
Analyzes the provided DataFrame to understand data patterns and creates a configured task that can intelligently fill missing values in the specified target column. The task uses few-shot learning with examples extracted from non-null rows in the DataFrame.
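The few-shot selection described above can be sketched in plain pandas. This mirrors only the documented behavior (non-null rows in the target column, capped at max_examples), not the library's internal implementation:

```python
import pandas as pd

def select_examples(df: pd.DataFrame, target: str, max_examples: int = 500) -> pd.DataFrame:
    # Keep rows where the target column is non-null, capped at
    # max_examples, as described for the few-shot prompt.
    return df[df[target].notna()].head(max_examples)

df = pd.DataFrame({"brand": ["Apple", None, "Dell"], "price": [1200, 800, 1000]})
examples = select_examples(df, "brand", max_examples=2)
```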
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Source DataFrame containing the data with missing values. | required |
| target_column_name | str | Name of the column to fill missing values for. This column should exist in the DataFrame and contain some non-null values to serve as training examples. | required |
| max_examples | int | Maximum number of example rows to use for few-shot learning. Defaults to 500. Higher values provide more context but increase token usage and processing time. | 500 |
| **api_kwargs | | Additional keyword arguments to pass to the OpenAI API, such as temperature, top_p, etc. | {} |
Returns:

| Type | Description |
|---|---|
| PreparedTask | PreparedTask configured for missing value imputation. |
Raises:

| Type | Description |
|---|---|
| ValueError | If target_column_name doesn't exist in DataFrame, contains no non-null values for training examples, DataFrame is empty, or max_examples is not a positive integer. |
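The error conditions above can be checked up front before building a task. A defensive sketch in plain pandas, mirroring the documented ValueError cases (this is illustrative validation, not library code):

```python
import pandas as pd

def check_fillna_args(df: pd.DataFrame, target: str, max_examples: int = 500) -> None:
    # Pre-validate the documented ValueError conditions.
    if df.empty:
        raise ValueError("DataFrame is empty")
    if target not in df.columns:
        raise ValueError(f"column {target!r} not found in DataFrame")
    if df[target].notna().sum() == 0:
        raise ValueError(f"column {target!r} has no non-null values to use as examples")
    if not isinstance(max_examples, int) or max_examples <= 0:
        raise ValueError("max_examples must be a positive integer")

df = pd.DataFrame({"brand": ["Apple", None], "price": [1200, 800]})
check_fillna_args(df, "brand")  # passes: column exists with non-null values
```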
Example
import pandas as pd
from openaivec.task.table import fillna
df = pd.DataFrame({
"product": ["laptop", "phone", "tablet", "laptop"],
"brand": ["Apple", "Samsung", None, "Dell"],
"price": [1200, 800, 600, 1000]
})
# Create task to fill missing brand values
task = fillna(df, "brand")
# Use with pandas AI accessor
missing_brands = df[df["brand"].isna()].ai.task(task)