Local AI Embeddings

WDG uses local embedding models for semantic code search. After the initial setup, indexing runs automatically whenever you commit code; no manual step is required.

How It Works

Traditional Search vs Semantic Search

Traditional (Keyword) Search:

Searches for exact text matches
Misses similar concepts with different wording
Can't understand code context

Semantic (AI) Search:

Understands meaning and context
Finds similar code patterns
Recognizes related concepts
Works across languages (PHP, JS, CSS)

How Local Embeddings Work

%%{init: {'theme':'neutral'}}%%
graph LR
    Code[Your Code] --> Model[Local AI Model]
    Model --> Vectors[384-dim Vectors]
    Vectors --> Qdrant[(Qdrant DB)]

    Query[Search Query] --> Model2[Same Model]
    Model2 --> QVector[Query Vector]
    QVector --> Qdrant
    Qdrant --> Results[Similar Code]

1. Code Indexing: Your code is processed by a local AI model
2. Vector Generation: Each code chunk becomes a 384-dimensional vector (768 with the larger models)
3. Storage: Vectors are stored in the Qdrant database
4. Search: Queries are converted to vectors by the same model and compared against the stored vectors
5. Results: The most similar code chunks are returned
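
The search and results steps can be sketched with plain Python. This is a minimal illustration, assuming cosine similarity as the comparison metric (the metric WDG configures in Qdrant is not stated here) and using toy 3-dimensional vectors in place of real embeddings. The function and variable names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vector, indexed, k=2):
    """Return the k (name, score) pairs most similar to the query."""
    scored = [(name, cosine_similarity(query_vector, vec))
              for name, vec in indexed.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy vectors standing in for real 384-dimensional embeddings
indexed = {
    "get_user_by_email": [0.9, 0.1, 0.0],
    "render_footer": [0.0, 0.2, 0.9],
    "fetch_user_by_id": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]
results = top_k(query, indexed)
```

In a real deployment the vector database performs this ranking, typically with an approximate index rather than a full scan.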

Available Models

Configure in .env:

bash

EMBEDDING_MODEL=all-MiniLM-L6-v2

Model Comparison

| Model | Speed | Quality | Size | Dimensions | Use Case |
| --- | --- | --- | --- | --- | --- |
| `all-MiniLM-L6-v2` | ⚡⚡⚡ | ★★★☆☆ | 80 MB | 384 | Default, fast indexing |
| `all-mpnet-base-v2` | ⚡⚡☆ | ★★★★☆ | 420 MB | 768 | Better accuracy |
| `all-MiniLM-L12-v2` | ⚡⚡⚡ | ★★★☆☆ | 120 MB | 384 | Balanced |
| `all-distilroberta-v1` | ⚡⚡☆ | ★★★★☆ | 290 MB | 768 | High quality |
| `multi-qa-MiniLM-L6-cos-v1` | ⚡⚡⚡ | ★★★★☆ | 80 MB | 384 | Q&A optimized |

Switching Models

⚠️ WARNING

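Changing models requires re-indexing all code: vectors produced by different models are not comparable with each other and may differ in dimension (384 vs 768).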

bash

# 1. Update .env
EMBEDDING_MODEL=all-mpnet-base-v2

# 2. Restart indexer service to load new model
docker compose restart indexer

# 3. Re-index Wikit framework
wdg index

# 4. Re-index your projects
wdg index my-project
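
To illustrate why a model switch forces a re-index, here is a minimal sketch that reads `EMBEDDING_MODEL` from `.env` text and compares it with the previously used model. The dimensions come from the model comparison table; the function names are hypothetical and are not part of the `wdg` CLI.

```python
# Dimensions taken from the model comparison table
MODEL_DIMENSIONS = {
    "all-MiniLM-L6-v2": 384,
    "all-mpnet-base-v2": 768,
    "all-MiniLM-L12-v2": 384,
    "all-distilroberta-v1": 768,
    "multi-qa-MiniLM-L6-cos-v1": 384,
}

def read_embedding_model(env_text, default="all-MiniLM-L6-v2"):
    """Extract EMBEDDING_MODEL from the contents of a .env file."""
    for line in env_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "EMBEDDING_MODEL":
            return value.strip().strip("\"'")
    return default

def needs_reindex(old_model, new_model):
    """Any model change invalidates stored vectors, even at equal dimension."""
    return old_model != new_model

def dimension_changed(old_model, new_model):
    """True when stored vectors would no longer fit the new vector size."""
    return MODEL_DIMENSIONS[old_model] != MODEL_DIMENSIONS[new_model]

new_model = read_embedding_model("# config\nEMBEDDING_MODEL=all-mpnet-base-v2\n")
```

Note that two models with the same dimension (for example the two MiniLM variants) still need a re-index, because their vectors live in different embedding spaces.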

What Gets Indexed

The indexer splits your code into chunks along semantic boundaries rather than fixed line counts:

PHP Files

Functions with full body
Classes with methods
WordPress hooks with context
DocBlock comments

Example chunk:

php

// Indexed as one unit:
function get_user_by_email($email) {
    global $wpdb;
    return $wpdb->get_row(
        $wpdb->prepare(
            "SELECT * FROM users WHERE email = %s",
            $email
        )
    );
}
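
One way to produce function-level chunks like the one above is to match braces from each `function` keyword. The sketch below is an assumption about the general technique, not the indexer's actual parser; it ignores braces inside strings and comments, which a real PHP parser would handle.

```python
import re

FUNCTION_START = re.compile(r"function\s+(\w+)\s*\(")

def extract_php_functions(source):
    """Yield (name, text) for each named function, matched by brace depth.

    Simplification: braces inside strings or comments are not handled.
    """
    for match in FUNCTION_START.finditer(source):
        open_brace = source.find("{", match.end())
        if open_brace == -1:
            continue
        depth = 0
        for pos in range(open_brace, len(source)):
            if source[pos] == "{":
                depth += 1
            elif source[pos] == "}":
                depth -= 1
                if depth == 0:
                    # Full function, from the keyword to the closing brace
                    yield match.group(1), source[match.start():pos + 1]
                    break

php = """<?php
function get_user_by_email($email) {
    if ($email) { return strtolower($email); }
    return null;
}
function noop() {}
"""
chunks = list(extract_php_functions(php))
```

Each yielded chunk would then be embedded as one unit, so the function name, signature, and body share a single vector.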

JavaScript Files

Functions (regular, arrow, async)
React components
Event handlers
Module exports

CSS/SCSS Files

Chunked by selectors
Media queries preserved
Variables and mixins

Wikit Blocks

block.json configurations
registerBlockType calls
Block metadata

Markdown/Documentation Files

Sections split by headers (H1-H3)
Code examples preserved with language tags
Technical documentation indexed semantically
README and wiki files

Example chunk:

markdown

## User Authentication

The authentication system uses JWT tokens...

```php
function authenticate_user($credentials) {
    // Indexed as code example
}
```
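
Header-based splitting can be sketched as follows. This is a simplified illustration, assuming sections break at H1-H3 headers and that `#` lines inside fenced code must not count as headers; the function name is hypothetical.

```python
import re

HEADER = re.compile(r"^#{1,3} ")
FENCE = "`" * 3  # code-fence marker, built this way to keep the sketch readable

def split_markdown(text):
    """Split text into sections at H1-H3 headers, ignoring '#' inside fences."""
    sections, current, in_fence = [], [], False
    for line in text.splitlines():
        if line.startswith(FENCE):
            in_fence = not in_fence
        if not in_fence and HEADER.match(line) and current:
            sections.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current))
    return sections

doc = "\n".join([
    "# Auth",
    "Intro text.",
    "## Tokens",
    FENCE + "bash",
    "# a shell comment, not a header",
    FENCE,
    "#### Too deep to split on",
])
sections = split_markdown(doc)
```

Keeping the fenced block inside its section is what lets a code example stay attached to the prose that explains it.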


Other Files

- JSON configuration files
- YAML files
- CSS chunked by selectors (50 lines per chunk)

Performance Optimization

First-Time Setup

bash

# Initial model download (one-time)
Downloading model: ~80MB
Time: 1-2 minutes

# Indexing Wikit framework
Files: ~5000
Time: 5-7 minutes
Vectors created: ~15,000
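
For a sense of scale, the raw size of those vectors can be estimated with simple arithmetic. This assumes 32-bit floats and ignores index structures and metadata, so actual Qdrant storage will be larger.

```python
def raw_vector_bytes(vector_count, dimensions, bytes_per_value=4):
    """Size of the raw vector data, assuming float32 values and no overhead."""
    return vector_count * dimensions * bytes_per_value

# ~15,000 vectors at 384 dimensions (default model)
size = raw_vector_bytes(15_000, 384)
size_mib = size / (1024 * 1024)  # about 22 MiB
```

A 768-dimension model doubles this figure for the same number of chunks.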

Automatic Indexing via Git Hooks

bash

# Model already cached in Docker
Loading time: <1 second

# Git post-commit hook triggers indexing
# Only changed files are indexed
# Happens automatically on: git commit, git merge
Time: seconds per file
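
The "only changed files" step could look roughly like the sketch below. It is an assumption about how such a hook might work, not WDG's actual hook: it asks git for the paths in the latest commit and keeps those with extensions the indexer handles. The extension list is hypothetical, inferred from the file types described in this document.

```python
import subprocess

# Hypothetical extension list; the real indexer's list is not given here
INDEXABLE = (".php", ".js", ".css", ".scss", ".json", ".yml", ".yaml", ".md")

def changed_files(repo_dir="."):
    """Paths touched by the most recent commit, as reported by git.

    --root makes the very first commit list its files too. Merge commits
    print nothing here by default and would need separate handling.
    """
    output = subprocess.run(
        ["git", "diff-tree", "--root", "--no-commit-id",
         "--name-only", "-r", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return output.splitlines()

def indexable(paths):
    """Keep only the paths whose extension the indexer would handle."""
    return [p for p in paths if p.lower().endswith(INDEXABLE)]
```

A post-commit hook would call `indexable(changed_files())` and hand the result to the indexer, which is why per-commit indexing takes seconds rather than minutes.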

Memory Usage

| Operation | RAM Usage | CPU Usage |
| --- | --- | --- |
| Idle | 50 MB | 0% |
| Model Loading | 200-500 MB | 20% |
| Indexing | 300-800 MB | 40-60% |
| Searching | 100-200 MB | 10% |

How Indexing Works

The indexing process happens automatically through git hooks:

1. Code Parsing: When you commit code, the indexer extracts semantic components
   - PHP: Functions, classes, WordPress hooks
   - JavaScript: Functions, React components, event handlers
   - Other files: Chunked by logical sections
2. Embedding Generation: Each code chunk is converted to a vector (384 dimensions with the default model) by the local Sentence Transformer model
3. Storage in Qdrant: Vectors are stored in the vector database with metadata:
   - File path and line number
   - Component type (function, class, hook)
   - Language and repository information
   - Project association
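
The metadata stored with each vector can be pictured as a small record. The field names below are hypothetical, chosen to mirror the list above; the actual payload schema is not documented here.

```python
from dataclasses import dataclass, asdict

@dataclass
class ChunkMetadata:
    """Illustrative metadata attached to one stored vector."""
    file_path: str
    line_number: int
    component_type: str  # e.g. "function", "class", "hook"
    language: str
    repository: str
    project: str

meta = ChunkMetadata(
    file_path="inc/users.php",
    line_number=42,
    component_type="function",
    language="php",
    repository="my-project",
    project="my-project",
)
payload = asdict(meta)  # plain dict, ready to store next to the vector
```

Storing the file path and line number is what lets a search hit point back to an exact location in the source.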

Search Examples

Finding Similar Functions

When you search for "get user by ID", the system finds:

getUserById()
fetch_user_by_identifier()
loadUserFromDatabase($id)
wp_get_user($user_id)

None of these names contains the exact phrase "get user by ID"; they match because their meaning is similar.

Cross-Language Search

Search: "validate email"

Finds across all languages:

PHP: is_valid_email($email)
JS: validateEmailAddress(email)
Regex: /^[^@]+@[^@]+\.[^@]+$/

Pattern Recognition

Search: "database query with prepare statement"

Finds prepared-statement patterns across database APIs:

php

$wpdb->prepare("SELECT * FROM...", $var)
$stmt = $pdo->prepare(...)
mysqli_prepare($conn, ...)

Privacy & Security

Local Processing

%%{init: {'theme':'neutral'}}%%
graph TB
    subgraph "Your Machine"
        Code[Your Code]
        Model[AI Model]
        Vectors[Vectors]
        DB[(Qdrant)]

        Code --> Model
        Model --> Vectors
        Vectors --> DB
    end

All processing occurs locally:

Source code
Embeddings/vectors
Search queries
Results
Model weights

The system does not require external API connections for embedding generation or vector search operations.

Advanced Configuration

Custom Model Path

bash

# Use custom model location
export SENTENCE_TRANSFORMERS_HOME=/path/to/models

Batch Processing

python

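from sentence_transformers import SentenceTransformer

# Load the configured model once and reuse it for every batch
model = SentenceTransformer("all-MiniLM-L6-v2")
code_chunks = ["function a() {}", "function b() {}"]  # example input

# Encode multiple chunks at once
embeddings = model.encode(
    code_chunks,
    batch_size=32,
    show_progress_bar=True
)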

GPU Acceleration (Advanced)

GPU acceleration requires Docker GPU passthrough configuration. This is an advanced setup not covered in standard installation.

If you have an NVIDIA GPU and want faster indexing, you'll need to:

1. Install nvidia-docker2
2. Modify the indexer service in docker-compose.yml to enable GPU access
3. Rebuild the indexer container with CUDA-enabled PyTorch

Troubleshooting

Model Download Issues

If the indexer fails to download the model on first run:

bash

# Check indexer logs
docker compose logs indexer

# Restart indexer to retry download
docker compose restart indexer

# Manually trigger download in container
docker exec wdg-indexer python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"

Indexing Performance

If indexing is slow:

bash

# Use faster model (update .env)
EMBEDDING_MODEL=all-MiniLM-L6-v2

# Restart indexer service
docker compose restart indexer

# Check resource usage
docker stats wdg-indexer

Memory Issues

If indexer runs out of memory:

bash

# Increase Docker memory limit in compose.yml
# Under indexer service, adjust:
deploy:
  resources:
    limits:
      memory: 3G  # Increase from 2G

# Or use a smaller model
EMBEDDING_MODEL=all-MiniLM-L6-v2

Best Practices

1. Choose the Right Model

Speed priority: all-MiniLM-L6-v2
Quality priority: all-mpnet-base-v2
Multilingual: paraphrase-multilingual-MiniLM-L12-v2

2. Leverage Automatic Indexing

bash

# Indexing happens automatically via git hooks
# Just commit your changes:
git commit -m "Add new feature"

# The post-commit hook will:
# - Detect changed files
# - Index them automatically
# - Update the vector database

# Manual indexing only needed for:
# - Initial project setup: wdg index my-site
# - After pulling Wikit updates: wdg index

3. Collection Management

bash

# Separate collections per project
wdg index project1  # Creates: project_project1
wdg index project2  # Creates: project_project2

# Clean old collections
wdg collections delete project_old_site
