# Building AI Agents with Persistent Memory using Cognee

This notebook shows how to build AI agents with persistent memory using [**cognee**](https://www.cognee.ai/), an open-source AI memory layer that combines knowledge graphs, semantic search, and session management to create context-aware AI systems.

## 🎯 Learning Objectives

By the end of this tutorial, you'll understand how to:
- **Build Knowledge Graphs Backed by Embeddings**: Transform unstructured text into structured, queryable knowledge
- **Implement Session Memory**: Create multi-turn conversations with automatic context retention
- **Persist Conversations**: Optionally store important interactions in long-term memory for future reference
- **Query Using Natural Language**: Access and leverage historical context in new conversations
- **Visualize Memory**: Explore the relationships in your agent's knowledge graph

## 🏗️ What You'll Build

In this tutorial, we'll create a **Coding Assistant** with persistent memory that:

### 1. **Knowledge Base Construction**
   - Ingests developer profile and expertise information
   - Processes Python programming principles and best practices
   - Stores historical conversations between developers and AI assistants

### 2. **Session-Aware Conversations**
   - Maintains context across multiple questions in the same session
   - Automatically caches each question/answer pair for efficient retrieval
   - Provides coherent, contextual responses based on conversation history

### 3. **Long-term Memory**
   - Persists important conversations into long-term memory
   - Retrieves relevant memories from the knowledge base and past sessions to inform new interactions
   - Builds a growing knowledge base that improves over time

### 4. **Intelligent Memory Retrieval**
   - Uses graph-aware semantic search to find relevant information across all stored knowledge
   - Filters searches by data subgroups (developer info vs. principles)
   - Combines multiple data sources to provide comprehensive answers

## 📋 Prerequisites & Setup

### System Requirements

Before starting, ensure you have:

1. **Python Environment**
   - Python 3.9 or higher
   - Virtual environment (recommended)
   
2. **Redis Cache** (Required for Session Management)
   - Local Redis: `docker run -d -p 6379:6379 redis`
   - Or use a managed Redis service
   
3. **LLM API Access**
   - An OpenAI API key, or a key for another supported provider (see [documentation](https://docs.cognee.ai/setup-configuration/llm-providers))

4. **Database Configuration**
   - No configuration is required by default: Cognee uses file-based databases (LanceDB and Kuzu)
   - Optionally, you can set up Azure AI Search as a vector store (see [documentation](https://github.com/topoteretes/cognee-community/tree/main/packages/vector/azureaisearch))

### Environment Configuration

Create a `.env` file in your project directory with the following variables:

```ini
# LLM Configuration (Required)
LLM_API_KEY=your-openai-api-key-here

# Cache Configuration (Required for Sessions)
CACHING=true  # Must be enabled for session history

```
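
Before importing Cognee, it can help to confirm that the variables above are actually present. The sketch below is a small stand-alone check using only the standard library; `missing_settings` and `mask_secret` are hypothetical helper names chosen for illustration and are not part of Cognee.

```python
from typing import List, Mapping, Sequence

# Variable names taken from the .env example above
REQUIRED_VARS = ("LLM_API_KEY", "CACHING")


def missing_settings(env: Mapping[str, str], required: Sequence[str] = REQUIRED_VARS) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


def mask_secret(value: str, visible: int = 4) -> str:
    """Hide all but the last few characters of a secret for safe printing."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


# Example with a plain dict; pass os.environ to check the real environment
example_env = {"LLM_API_KEY": "sk-12345678", "CACHING": "true"}
print(missing_settings(example_env))             # []
print(mask_secret(example_env["LLM_API_KEY"]))   # *******5678
```

Printing a masked key rather than the raw value keeps the secret out of saved notebook output.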


## 🏛️ Understanding Cognee's Memory Architecture

### How Cognee Works

Cognee provides a sophisticated memory system that goes beyond simple key-value storage:

```
┌──────────────────────────┐
│      30+ data sources    │
└───────────┬──────────────┘
            │
            ▼
┌──────────────────────────────────────────┐
│  Dynamically evolving memory layers      │
│                                          │
│  ┌────────────────────────────────────┐  │
│  │ Knowledge Graph in Graph Database  │  │
│  └────────────────────────────────────┘  │
│  ┌────────────────────────────────────┐  │
│  │ Embeddings in Vector Store         │  │
│  │   (e.g., Azure AI Search)          │  │
│  └────────────────────────────────────┘  │
└───────────┬──────────────────────────────┘
            │                      ▲
            ▼                      │(optional)
┌────────────────┐           ┌────────────────┐
│     cognee     │(optional) │ Cognee Session │
│    retrievers  │──────────▶│     Cache      │
│                │           │    (Redis)     │
└───────┬────────┘           └────────────────┘
        ▲
        │
┌──────────────────────────┐
│          Agents          │
└──────────────────────────┘

```

### Key Components:

1. **Knowledge Graph**: Stores entities, relationships, and semantic connections
2. **Vector Embeddings**: Enables semantic search across all stored information
3. **Session Cache**: Maintains conversation context within each session, keyed by user and session ID
4. **NodeSets**: Organize data into logical categories for targeted retrieval

### Memory Types in This Tutorial:

- **Persistent Memory**: Long-term storage in the knowledge graph
- **Session Memory**: Temporary conversation context in Redis cache
- **Semantic Memory**: Vector-based similarity search across all data
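
To make the roles of embeddings and NodeSets concrete, here is a deliberately tiny, self-contained sketch. It is not how Cognee is implemented: `ToyMemory`, `embed`, and `cosine` are hypothetical names, and the "embedding" is just a bag-of-words count standing in for a real embedding model. It only illustrates the idea of similarity search that can be restricted to a labeled subset of the data.

```python
import math
from collections import Counter
from typing import List, Optional, Set, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyMemory:
    """Hypothetical in-memory store: text, NodeSet labels, and a toy embedding."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, Set[str], Counter]] = []

    def add(self, text: str, node_set: List[str]) -> None:
        self.records.append((text, set(node_set), embed(text)))

    def search(self, query: str, node_name: Optional[List[str]] = None) -> Optional[str]:
        """Return the most similar text, optionally restricted to the given NodeSets."""
        wanted = set(node_name) if node_name else None
        candidates = [r for r in self.records if wanted is None or r[1] & wanted]
        if not candidates:
            return None
        query_vec = embed(query)
        return max(candidates, key=lambda r: cosine(query_vec, r[2]))[0]


memory = ToyMemory()
memory.add("prefer descriptive names for variables", ["principles_data"])
memory.add("the developer builds FastAPI services", ["developer_data"])

print(memory.search("how should variables be named"))
# prefer descriptive names for variables
print(memory.search("variables", node_name=["developer_data"]))
# the developer builds FastAPI services
```

The second query shows the effect of filtering: the best overall match is excluded because it lives in a different NodeSet.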

## 📦 Install Required Packages

Install Cognee with Redis support for session management:

In [None]:
!pip install --quiet "cognee[redis]==0.4.0"

## 🔧 Initialize Environment and Load Libraries

Make sure:
1. Redis is running (e.g., via Docker: `docker run -d -p 6379:6379 redis`)
2. Environment variables are set before importing cache modules
3. If needed, restart the kernel and run cells in order
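
If you are unsure whether Redis is up, a quick TCP probe can save a confusing error later. The helper below is a minimal sketch using only the standard library; `port_is_open` is a hypothetical name, and the default host and port assume the local Docker command shown above. It only checks that something accepts connections on that port, not that the service is actually Redis.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 6379, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures
        return False


# Usage: print("Redis reachable:", port_is_open())
```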

The following cell will:
1. Load environment variables from `.env`
2. Configure Cognee with your LLM settings
3. Enable caching for session management
4. Print the Cognee version and the active settings as a quick sanity check

In [None]:
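import os
from pathlib import Path

from dotenv import load_dotenv

# Load environment variables from .env into os.environ
load_dotenv()

# cognee configuration: fail early with a clear message if the key is missing
# (assigning None to os.environ would raise a TypeError instead)
if not os.getenv("LLM_API_KEY"):
    raise RuntimeError("LLM_API_KEY is not set. Add it to your .env file.")
os.environ.setdefault("CACHING", "true")

# Import cognee only after the environment is configured
import cognee

print(f"Cognee version: {cognee.__version__}")
print(f"CACHING: {os.environ.get('CACHING')}")
# Do not print the key itself, so it never ends up in saved notebook output
print("LLM_API_KEY: set")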

## 📁 Configure Storage Directories

Cognee uses two separate directories for its operations:
- **Data Root**: Stores ingested documents and processed data
- **System Root**: Contains the knowledge graph database and system metadata

We'll create isolated directories for this tutorial as follows:

In [None]:
DATA_ROOT = Path('.data_storage').resolve()
SYSTEM_ROOT = Path('.cognee_system').resolve()

DATA_ROOT.mkdir(parents=True, exist_ok=True)
SYSTEM_ROOT.mkdir(parents=True, exist_ok=True)

cognee.config.data_root_directory(str(DATA_ROOT))
cognee.config.system_root_directory(str(SYSTEM_ROOT))

print(f"Data root: {DATA_ROOT}")
print(f"System root: {SYSTEM_ROOT}")

## 🧹 Reset Memory State

Before we begin building our memory system, let's ensure we're starting fresh.

> 💡 **Tip**: Skip this step on later runs if you want to keep the memories built in previous runs.

In [None]:
await cognee.prune.prune_data()
await cognee.prune.prune_system(metadata=True)
print('Cleared previous Cognee state.')

## 📚 Part 1: Building the Knowledge Base

### Data Sources for Our Developer Assistant

We'll ingest three types of data to create a comprehensive knowledge base:

1. **Developer Profile**: Personal expertise and technical background
2. **Python Best Practices**: The Zen of Python with practical guidelines
3. **Historical Conversations**: Past Q&A sessions between developers and AI assistants

This diverse data allows our agent to:
- Understand the user's technical context
- Apply best practices in recommendations
- Learn from previous successful interactions

In [None]:
developer_intro = (
  "Hi, I'm an AI/Backend engineer. "
  "I build FastAPI services with Pydantic, heavy asyncio/aiohttp pipelines, "
  "and production testing via pytest-asyncio. "
  "I've shipped low-latency APIs on AWS, Azure, and Google Cloud."
)

python_zen_principles = (
  """
    # The Zen of Python: Practical Guide

    ## Overview
    Use these principles as a checklist during design, coding, and reviews.

    ## Key Principles With Guidance

    ### 1. Beautiful is better than ugly
    Prefer descriptive names, clear structure, and consistent formatting.

    ### 2. Explicit is better than implicit
    Be clear about behavior, imports, and types.
    ```python
    from datetime import datetime, timedelta

    def get_future_date(days_ahead: int) -> datetime:
        return datetime.now() + timedelta(days=days_ahead)
    ```

    ### 3. Simple is better than complex
    Choose straightforward solutions first.

    ### 4. Complex is better than complicated
    When complexity is needed, organize it with clear abstractions.

    ### 5. Flat is better than nested
    Use early returns to reduce indentation.

    ## Modern Python Tie-ins
    - Type hints reinforce explicitness
    - Context managers enforce safe resource handling
    - Dataclasses improve readability for data containers

    ## Quick Review Checklist
    - Is it readable and explicit?
    - Is this the simplest working solution?
    - Are errors explicit and logged?
    - Are modules/namespaces used appropriately?
  """
)

human_agent_conversations = (
  """
  "conversations": [
      {
        "id": "conv_001",
        "timestamp": "2024-01-15T10:30:00Z",
        "topic": "async/await patterns",
        "user_query": "I'm building a web scraper that needs to handle thousands of URLs concurrently. What's the best way to structure this with asyncio?",
        "assistant_response": "Use asyncio with aiohttp, a semaphore to cap concurrency, TCPConnector for connection pooling, context managers for session lifecycle, and robust exception handling for failed requests.",
        "code_context": {
          "file": "scraper.py",
          "language": "python",
          "patterns_discussed": ["async/await", "context_managers", "semaphores", "aiohttp", "error_handling"]
        },
        "follow_up_questions": [
          "How do I add retry logic for failed requests?",
          "What's the best way to parse the scraped HTML content?"
        ]
      },
      {
        "id": "conv_002",
        "timestamp": "2024-01-16T14:20:00Z",
        "topic": "dataclass vs pydantic",
        "user_query": "When should I use dataclasses vs Pydantic models? I'm building an API and need to handle user input validation.",
        "assistant_response": "For API input/output, prefer Pydantic: it provides runtime validation, type coercion, JSON serialization, enums for roles, field constraints, and custom validators; integrates cleanly with FastAPI for automatic request validation and error reporting.",
        "code_context": {
          "file": "models.py",
          "language": "python",
          "patterns_discussed": ["pydantic", "dataclasses", "validation", "fastapi", "type_hints", "enums"]
        },
        "follow_up_questions": [
          "How do I handle nested validation with Pydantic?",
          "Can I use Pydantic with SQLAlchemy models?"
        ]
      },
      {
        "id": "conv_003",
        "timestamp": "2024-01-17T09:15:00Z",
        "topic": "testing patterns",
        "user_query": "I'm struggling with testing async code and database interactions. What's the best approach for pytest with async functions?",
        "assistant_response": "Recommended using pytest-asyncio, async fixtures, and an isolated test database or mocks to reliably test async functions and database interactions in FastAPI.",
        "code_context": {
          "file": "test_users.py",
          "language": "python",
          "patterns_discussed": ["pytest", "async_testing", "fixtures", "mocking", "database_testing", "fastapi_testing"]
        },
        "follow_up_questions": [
          "How do I test WebSocket connections?",
          "What's the best way to test database migrations?"
        ]
      },
      {
        "id": "conv_004",
        "timestamp": "2024-01-18T16:45:00Z",
        "topic": "performance optimization",
        "user_query": "My FastAPI app is getting slow with large datasets. How can I optimize database queries and response times?",
        "assistant_response": "Suggested optimizing database queries (indexes, pagination, selecting only needed columns), adding caching, streaming or chunked responses for large datasets, background tasks for heavy work, and monitoring to find bottlenecks.",
        "code_context": {
          "file": "optimizations.py",
          "language": "python",
          "patterns_discussed": ["performance_optimization", "caching", "database_optimization", "async_patterns", "monitoring"]
        },
        "follow_up_questions": [
          "How do I implement database connection pooling properly?",
          "What's the best way to handle memory usage with large datasets?"
        ]
      },
      {
        "id": "conv_005",
        "timestamp": "2024-01-19T11:30:00Z",
        "topic": "error handling and logging",
        "user_query": "I need to implement proper error handling and logging across my Python application. What's the best approach for production-ready error management?",
        "assistant_response": "Proposed centralized error handling with custom exceptions, structured logging, FastAPI middleware or decorators, and integration points for external monitoring/alerting tools.",
        "code_context": {
          "file": "error_handling.py",
          "language": "python",
          "patterns_discussed": ["error_handling", "logging", "exceptions", "middleware", "decorators", "fastapi"]
        },
        "follow_up_questions": [
          "How do I integrate this with external monitoring tools like Sentry?",
          "What's the best way to handle errors in background tasks?"
        ]
      }
    ],
    "metadata": {
      "total_conversations": 5,
      "date_range": "2024-01-15 to 2024-01-19",
      "topics_covered": [
        "async/await patterns",
        "dataclass vs pydantic",
        "testing patterns",
        "performance optimization",
        "error handling and logging"
      ],
      "code_patterns_discussed": [
        "asyncio", "aiohttp", "semaphores", "context_managers",
        "pydantic", "fastapi", "type_hints", "validation",
        "pytest", "async_testing", "fixtures", "mocking",
        "performance_optimization", "caching", "database_optimization",
        "error_handling", "logging", "exceptions", "middleware"
      ],
      "difficulty_levels": {
        "beginner": 1,
        "intermediate": 2,
        "advanced": 2
      }
    }
  """
)

## 🔄 Process Data into Knowledge Graph

Now we'll transform our raw text into a structured memory. This process:

1. **Adds data to NodeSets**: Organizes information into logical categories
   - `developer_data`: Developer profile and conversations
   - `principles_data`: Python best practices and guidelines

2. **Runs Cognify Pipeline**: Extracts entities, relationships, and creates embeddings
   - Identifies key concepts
   - Creates semantic connections between related information
   - Generates vector embeddings
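
One way to picture the output of entity and relationship extraction is as a list of subject-relation-object triples. The sketch below uses hand-written triples loosely based on the developer profile; they are illustrative only and are not actual Cognee output, and `build_adjacency` and `neighbors` are hypothetical helper names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hand-written examples of the kind of facts an extraction step might produce
triples: List[Triple] = [
    ("developer", "builds", "FastAPI services"),
    ("FastAPI services", "use", "Pydantic"),
    ("developer", "tests_with", "pytest-asyncio"),
]


def build_adjacency(items: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Group outgoing (relation, object) edges by subject node."""
    graph: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for subject, relation, obj in items:
        graph[subject].append((relation, obj))
    return dict(graph)


def neighbors(graph: Dict[str, List[Tuple[str, str]]], node: str) -> List[str]:
    """Return the nodes directly reachable from the given node."""
    return [obj for _, obj in graph.get(node, [])]


graph = build_adjacency(triples)
print(neighbors(graph, "developer"))  # ['FastAPI services', 'pytest-asyncio']
```

Graph-aware retrieval can follow edges like these to pull in related facts that a pure text-similarity lookup might miss.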

This may take a few moments as the LLM processes the text and builds the graph structure:

In [None]:
await cognee.add(developer_intro, node_set=["developer_data"])
await cognee.add(human_agent_conversations, node_set=["developer_data"])
await cognee.add(python_zen_principles, node_set=["principles_data"])

await cognee.cognify()

## 📊 Visualize the Knowledge Graph

Let's explore the structure of our knowledge graph. The visualization shows:
- **Nodes**: Entities extracted from the text (concepts, technologies, people)
- **Edges**: Relationships and connections between entities
- **Clusters**: Related concepts grouped by semantic similarity

Open the generated HTML file in your browser to interactively explore the graph:

In [None]:
from cognee import visualize_graph
await visualize_graph('./visualization_1.html')

## 🧠 Enrich Memory with Memify

The `memify()` function analyzes the knowledge graph and generates intelligent rules about the data. This process:
- Identifies patterns and best practices
- Creates actionable guidelines based on the content
- Establishes relationships between different knowledge areas

These rules help the agent make more informed decisions when answering questions. Capturing a second visualization helps you compare how the graph densifies once enriched.


In [None]:
await cognee.memify()

await visualize_graph('./visualization_2.html')

## 🔍 Part 2: Intelligent Memory Retrieval

### Demonstration 1: Cross-Document Knowledge Integration

Now that our knowledge graph is built, let's test how Cognee combines information from multiple sources to answer complex questions. 

The first query demonstrates:
- **Semantic understanding**: Finding relevant concepts even when not explicitly mentioned
- **Cross-referencing**: Combining developer profile with Python principles
- **Contextual reasoning**: Applying best practices to specific implementations

### Demonstration 2: Filtered Search with NodeSets

The second query shows how to target specific subsets of the knowledge graph:
- Uses the `node_type` and `node_name` parameters to search only within `principles_data`
- Provides focused answers from a specific knowledge domain
- Useful for when you need domain-specific information

In [None]:
from cognee.modules.engine.models.node_set import NodeSet
from cognee.modules.search.types import SearchType

# Demonstrate cross-document knowledge retrieval from multiple data sources
results = await cognee.search(
    query_text="How does my AsyncWebScraper implementation align with Python's design principles?",
    query_type=SearchType.GRAPH_COMPLETION,
)
print("Python Pattern Analysis:", results)

# Demonstrate filtered search: node_type and node_name restrict the query to one NodeSet
results = await cognee.search(
    query_text="How should variables be named?",
    query_type=SearchType.GRAPH_COMPLETION,
    node_type=NodeSet,
    node_name=["principles_data"],
)
print("Filtered search result:", results)

## 🔐 Part 3: Session Management Setup

### Enabling Conversation Memory

Session management is crucial for maintaining context across multiple interactions. Here we'll:

1. **Initialize User Context**: Create or retrieve a user profile for session tracking
2. **Configure Cache Engine**: Connect to Redis for storing conversation history
3. **Enable Session Variables**: Set up context variables that persist across queries

> ⚠️ **Important**: This requires Redis to be running and `CACHING=true` in your environment

In [None]:
from cognee.modules.users.methods import get_default_user
from cognee.context_global_variables import set_session_user_context_variable 
from cognee.infrastructure.databases.cache import get_cache_engine

user = await get_default_user()
await set_session_user_context_variable(user)
print(f"Using user id: {getattr(user, 'id', 'unknown')}")

cache_engine = get_cache_engine()
if cache_engine is None:
    raise RuntimeError('Cache engine is not available. Double-check your cache configuration.')
print('Session cache is ready.')


## 🛠️ Helper Function: View Session History

This utility function allows us to inspect the conversation history stored in Redis. It's useful for:
- Debugging session management
- Verifying that conversations are being cached
- Understanding what context is available to the agent
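
To build intuition for what a session cache holds, here is a minimal in-memory stand-in. It is a sketch under stated assumptions, not Cognee's Redis-backed implementation: `ToySessionCache` and `add_qa` are hypothetical, the `get_latest_qa` method only mirrors the name used by the helper in this notebook, and the entry shape (a dict with `question` and `answer` keys) is assumed from how that helper reads entries.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class ToySessionCache:
    """Hypothetical in-memory stand-in for a session cache."""

    def __init__(self) -> None:
        # Histories are keyed by (user_id, session_id), so sessions never mix
        self._store: Dict[Tuple[str, str], List[Dict[str, str]]] = defaultdict(list)

    def add_qa(self, user_id: str, session_id: str, question: str, answer: str) -> None:
        self._store[(user_id, session_id)].append({"question": question, "answer": answer})

    def get_latest_qa(self, user_id: str, session_id: str, last_n: int = 10) -> List[Dict[str, str]]:
        """Return up to last_n of the most recent entries, oldest first."""
        if last_n <= 0:
            return []
        return list(self._store.get((user_id, session_id), []))[-last_n:]


cache = ToySessionCache()
cache.add_qa("u1", "session-a", "How do I cap concurrency?", "Use a semaphore.")
cache.add_qa("u1", "session-a", "And retries?", "Wrap requests in a retry loop.")
cache.add_qa("u1", "session-b", "Logging tips?", "Prefer structured logs.")

print(len(cache.get_latest_qa("u1", "session-a")))  # 2
print(len(cache.get_latest_qa("u1", "session-b")))  # 1
```

Keying on both user and session is what keeps one conversation thread from seeing another thread's turns.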

In [None]:
async def show_history(session_id: str) -> None:
    """Print up to the last 10 cached Q&A entries for the given session."""
    # Query the cache directly rather than going through cognee.search
    cache_engine = get_cache_engine()
    if cache_engine is None:
        print("Cache engine is not available")
        return

    # `user` is the default user loaded in the session setup cell
    user_id = str(user.id) if hasattr(user, 'id') else None
    if user_id is None:
        print("No user_id available")
        return

    # Fall back to an empty list if the cache returns nothing for this session
    history_entries = await cache_engine.get_latest_qa(user_id, session_id, last_n=10) or []
    print(f"\nDirect cache query for user_id={user_id}, session_id={session_id}:")
    print(f"Found {len(history_entries)} entries")
    for i, entry in enumerate(history_entries, 1):
        print(f"\nEntry {i}:")
        # Show only the first 100 characters of each field
        print(f"  Question: {entry.get('question', 'N/A')[:100]}...")
        print(f"  Answer: {entry.get('answer', 'N/A')[:100]}...")


## Session 1: Async Support Lab - First Question

Kick off the `async-support-lab` session by asking for telemetry-friendly asyncio patterns for a massive web scraper. The graph already knows about asyncio, aiohttp, and monitoring practices, so the response should mirror prior conversations while tailoring the answer to the new query.


In [None]:
session_1 = "async-support-lab"

result = await cognee.search(
    query_type=SearchType.GRAPH_COMPLETION,
    query_text="I'm building a web scraper that hits thousands of URLs concurrently. What's a reliable asyncio pattern with telemetry?",
    session_id=session_1,
)
# Display the answer so it can be compared with the cached history
print(result)

## Inspect Session 1 Memory After the First Exchange

Running `show_history(session_1)` immediately after the initial question confirms that Cognee wrote both the prompt and completion into Redis. You should see one entry with the concurrency guidance.


In [None]:
await show_history(session_1)

## Session 1: Follow-up on Data Models

Next we ask, "When should I pick dataclasses versus Pydantic?" using the same session id. Cognee should stitch together the Python principles plus prior FastAPI conversations to provide nuanced advice, demonstrating that context carries over within a named session.
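
Conceptually, carrying context over means that earlier turns are made available when the next question is answered. The sketch below shows one simple way this could be done, by prepending recent turns to the new question. It is an assumption for illustration only: `build_prompt` is a hypothetical helper, and Cognee's actual prompt assembly may differ.

```python
from typing import Dict, List


def build_prompt(history: List[Dict[str, str]], question: str, max_turns: int = 5) -> str:
    """Prepend the most recent Q&A turns to a new question."""
    recent = history[-max_turns:] if max_turns > 0 else []
    lines: List[str] = []
    for turn in recent:
        lines.append(f"User: {turn['question']}")
        lines.append(f"Assistant: {turn['answer']}")
    lines.append(f"User: {question}")
    return "\n".join(lines)


history = [{"question": "How do I cap concurrency?", "answer": "Use a semaphore."}]
print(build_prompt(history, "When should I pick dataclasses versus Pydantic for this work?"))
```

With the earlier turn included, a phrase like "this work" in the follow-up has something to refer to.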


In [None]:
result = await cognee.search(
    query_type=SearchType.GRAPH_COMPLETION,
    query_text="When should I pick dataclasses versus Pydantic for this work?",
    session_id=session_1,
)
# Display the answer to the follow-up question
print(result)

## Confirm Session 1 History Contains Both Turns

Another `show_history(session_1)` call should reveal two Q&A entries, confirming that additional turns extend the same transcript.


In [None]:
await show_history(session_1)

## Session 2: Design Review Thread - Fresh Session

To show isolation between threads we spin up `design-review-session` and ask for logging guidance for incident reviews. Even though the underlying knowledge base is the same, the new session id keeps transcripts separate.


In [None]:
session_2 = "design-review-session"

result = await cognee.search(
    query_type=SearchType.GRAPH_COMPLETION,
    query_text="We're drafting logging guidance for incident reviews. Capture the key principles please.",
    session_id=session_2,
)
# Display the answer produced in the new session
print(result)

## Review Session 2 History

`show_history(session_2)` should only list the design-review prompt/response pair. Compare it with Session 1 to highlight how Cognee keeps independent transcripts while reusing the shared knowledge graph.


In [None]:
await show_history(session_2)

## Summary 

Congratulations! You've just given your coding assistant a real long-term memory layer powered by Cognee.

In this tutorial you took raw developer content (a developer profile, Python guidelines, and past conversations) and turned it into a graph + vector memory that your agent can search, reason over, and continuously improve.

### What You've Learned

1. **From raw text to AI memory**: How Cognee ingests unstructured data and turns it into intelligent, searchable memory using a combined vector + knowledge graph architecture.

2. **Graph enrichment with memify**: How to go beyond basic graph creation and use memify to add derived facts and richer relationships on top of your existing graph. 

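3. **Multiple search strategies**: How to query memory with graph-aware completion (`SearchType.GRAPH_COMPLETION`), both across the whole graph and filtered by NodeSet. Cognee offers further search types (RAG-style completion, insights, raw chunks, code search, etc.) for cases where your agent needs something different.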

4. **Visual exploration**: How to inspect and debug what Cognee built using graph visualizations and the Cognee UI, so you can actually see how knowledge is structured. 

5. **Session-aware memory**: How to combine per-session context with persistent semantic memory so that agents can remember across runs without leaking information between users. 

## Key Takeaways
1. Memory as a Knowledge Graph backed by Embeddings

    - **Structured understanding**: Cognee combines a vector store and a graph store so your data is both searchable by meaning and connected by relationships. Cognee uses file-based databases by default (LanceDB for the vector store, Kuzu for the graph database).

    - **Relationship-aware retrieval**: Answers can be grounded not only in "similar text," but also in how entities relate.

    - **Living memory**: The memory layer evolves, grows, and stays queryable as one connected graph. 

2. Search & Reasoning Modes
    - **Hybrid retrieval**: Search blends vector similarity, graph structure, and LLM reasoning, from raw chunk lookup to graph-aware question answering.

    - **Fit the mode to the job**: Use completion-style modes when you want natural language answers, and chunk/summary/graph modes when your agent needs raw context or to drive its own reasoning.

3. Personalized, Session-Aware Agents
    - **Session context + long-term memory**: Cognee keeps short-term "thread" context separate from long-lived, user- or org-level memory.

## Real-World Applications

1. **Vertical AI Agents**

    Use the pattern from this notebook to power domain-smart copilots that sit on top of Cognee as their retrieval and reasoning core:

    - **Developer copilots**: Code review, incident analysis, and architecture assistants that traverse code, APIs, design docs, and tickets as a single memory graph.

    - **Customer-facing copilots**: Support or success agents that pull from product docs, FAQs, CRM notes, and past tickets with graph-aware retrieval and cited answers.

    - **Internal expert copilots**: Policy, legal, or security assistants that reason over interconnected rules, guidelines, and historical decisions instead of isolated PDFs.

    Cognee is explicitly positioned as persistent, accurate memory for AI agents, providing a living knowledge graph that slots in behind your agent and replaces ad-hoc combinations of vector stores and custom graph code. 

2. **Unifying Data Silos into One Memory**

    The same approach also helps you build a unified memory layer across scattered sources:

    - **From silos to one graph**: Ingest structured (e.g., databases) and unstructured data (e.g., docs, chats) into a single graph backed by embeddings, rather than separate indices for each system.

    - **Cross-source reasoning with citations**: Run multi-step reasoning over everything ("join" logs, metrics, and docs via the graph) and still return grounded answers with provenance.

    - **Knowledge hubs**: For domains like banking or education, Cognee is already used to unify PDFs, internal systems, and app data into one knowledge graph with vectors so agents can answer questions with precise, cited context.

## Next Steps

You've implemented the core memory loop. Here are natural extensions you can try on your own (see [Cognee documentation](https://docs.cognee.ai/) for details):

1. **Experiment with temporal awareness**: Turn on temporal cognify to extract events and timestamps from text.

2. **Introduce ontology-driven reasoning**: Define an OWL ontology for your domain. Use Cognee's ontology support so extracted entities and relations are grounded in that schema, improving graph quality and domain-specific answers.

3. **Add a feedback loop**: Let Cognee adjust graph edge weights from real user feedback, so retrieval improves over time instead of staying static. 

4. **Tune for personalization & session behavior**: Use user IDs, tenants, and datasets to give each person or team their own view over the shared memory engine. 

5. **Scale out to more complex agents**: Plug Cognee into agent frameworks to build multi-agent systems that all share the same memory layer. *Microsoft Agent Framework x Cognee plugin is coming soon.*