Local AI Embeddings
WDG uses local embedding models for semantic code search. After initial setup, indexing runs automatically whenever you commit code, with no manual intervention required.
How It Works
Traditional Search vs Semantic Search
Traditional (Keyword) Search:
- Searches for exact text matches
- Misses similar concepts with different wording
- Can't understand code context
Semantic (AI) Search:
- Understands meaning and context
- Finds similar code patterns
- Recognizes related concepts
- Works across languages (PHP, JS, CSS)
How Local Embeddings Work
%%{init: {'theme':'neutral'}}%%
graph LR
Code[Your Code] --> Model[Local AI Model]
Model --> Vectors[384-dim Vectors]
Vectors --> Qdrant[(Qdrant DB)]
Query[Search Query] --> Model2[Same Model]
Model2 --> QVector[Query Vector]
QVector --> Qdrant
Qdrant --> Results[Similar Code]
- Code Indexing: Your code is processed by a local AI model
- Vector Generation: Each code chunk becomes a vector (384 dimensions with the default model)
- Storage: Vectors are stored in the Qdrant database
- Search: Queries are converted to vectors by the same model and compared against the stored vectors
- Results: The most similar code chunks are returned
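The search step comes down to a nearest-neighbour comparison between the query vector and the stored vectors. A minimal sketch, assuming cosine similarity as the metric (this page does not say which metric the Qdrant collection uses) and using tiny hand-written 3-dimensional vectors in place of real embeddings:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vector, index, top_k=2):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vector, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for stored code-chunk vectors (hypothetical values).
index = {
    "getUserById": [0.9, 0.1, 0.0],
    "renderFooter": [0.0, 0.2, 0.9],
    "fetch_user_by_identifier": [0.8, 0.3, 0.1],
}

results = search([1.0, 0.2, 0.0], index)
print([name for name, _ in results])
```

In the real system the vectors come from the embedding model and the comparison runs inside Qdrant; the ranking idea is the same.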
Available Models
Configure in .env:
EMBEDDING_MODEL=all-MiniLM-L6-v2
Model Comparison
| Model | Speed | Quality | Size | Dimensions | Use Case |
|---|---|---|---|---|---|
| all-MiniLM-L6-v2 | ⚡⚡⚡ | ★★★☆☆ | 80MB | 384 | Default - Fast indexing |
| all-mpnet-base-v2 | ⚡⚡☆ | ★★★★☆ | 420MB | 768 | Better accuracy |
| all-MiniLM-L12-v2 | ⚡⚡⚡ | ★★★☆☆ | 120MB | 384 | Balanced |
| all-distilroberta-v1 | ⚡⚡☆ | ★★★★☆ | 290MB | 768 | High quality |
| multi-qa-MiniLM-L6-cos-v1 | ⚡⚡⚡ | ★★★★☆ | 80MB | 384 | Q&A optimized |
Switching Models
⚠️ WARNING
Changing models requires re-indexing all code!
# 1. Update .env
EMBEDDING_MODEL=all-mpnet-base-v2
# 2. Restart indexer service to load new model
docker compose restart indexer
# 3. Re-index Wikit framework
wdg index
# 4. Re-index your projects
wdg index my-project
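Re-indexing is needed because vectors produced by different models are not comparable, and some models do not even share a vector size. A small sketch of a pre-flight check, using the dimensions listed in the model comparison table; `needs_reindex` is a hypothetical helper, not part of the WDG CLI:

```python
# Vector sizes taken from the model comparison table.
MODEL_DIMENSIONS = {
    "all-MiniLM-L6-v2": 384,
    "all-mpnet-base-v2": 768,
    "all-MiniLM-L12-v2": 384,
    "all-distilroberta-v1": 768,
    "multi-qa-MiniLM-L6-cos-v1": 384,
}


def needs_reindex(indexed_with, configured):
    """Return (must_reindex, reason) for a switch between two known models."""
    if indexed_with == configured:
        return False, "same model"
    old_dim = MODEL_DIMENSIONS[indexed_with]
    new_dim = MODEL_DIMENSIONS[configured]
    if old_dim != new_dim:
        return True, f"vector size changes from {old_dim} to {new_dim}"
    # Same size, but each model places code in its own vector space.
    return True, "same vector size, but embeddings are not comparable across models"


print(needs_reindex("all-MiniLM-L6-v2", "all-mpnet-base-v2"))
```

Note that even a switch between two 384-dimensional models calls for a re-index, since similarity scores only make sense between vectors from the same model.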
What Gets Indexed
The indexer splits your code into semantic chunks, with rules that depend on the file type:
PHP Files
- Functions with full body
- Classes with methods
- WordPress hooks with context
- DocBlock comments
Example chunk:
// Indexed as one unit:
function get_user_by_email($email) {
    global $wpdb;
    return $wpdb->get_row(
        $wpdb->prepare(
            "SELECT * FROM {$wpdb->users} WHERE user_email = %s",
            $email
        )
    );
}
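One way to picture function-level chunking is brace matching: find each `function` keyword, then scan forward until its opening brace is balanced. The sketch below is an illustration only, not the indexer's actual parser. `extract_php_functions` is a hypothetical helper, and it ignores braces inside strings and comments, which a real parser must handle.

```python
import re

FUNCTION_START = re.compile(r"function\s+(\w+)\s*\(")


def extract_php_functions(source):
    """Return a list of (name, code) pairs, one per named function with a body."""
    chunks = []
    for match in FUNCTION_START.finditer(source):
        open_brace = source.find("{", match.end())
        # Skip declarations without a body (e.g. abstract or interface methods).
        if open_brace == -1 or ";" in source[match.end():open_brace]:
            continue
        depth = 0
        for pos in range(open_brace, len(source)):
            if source[pos] == "{":
                depth += 1
            elif source[pos] == "}":
                depth -= 1
                if depth == 0:
                    chunks.append((match.group(1), source[match.start():pos + 1]))
                    break
    return chunks


php = """<?php
function add($a, $b) {
    if ($a > 0) {
        return $a + $b;
    }
    return $b;
}

function noop() {}
"""
chunks = extract_php_functions(php)
print([name for name, _ in chunks])
```

Each chunk keeps the full body, nested blocks included, so the embedding sees the function as a single unit.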
JavaScript Files
- Functions (regular, arrow, async)
- React components
- Event handlers
- Module exports
CSS/SCSS Files
- Chunked by selectors
- Media queries preserved
- Variables and mixins
Wikit Blocks
- block.json configurations
- registerBlockType calls
- Block metadata
Markdown/Documentation Files
- Sections split by headers (H1-H3)
- Code examples preserved with language tags
- Technical documentation indexed semantically
- README and wiki files
Example chunk:
## User Authentication
The authentication system uses JWT tokens...
```php
function authenticate_user($credentials) {
    // Indexed as code example
}
```
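Splitting on headers has one trap: a `#` at the start of a line inside a fenced code block is a comment, not a header. A minimal sketch of an H1-H3 splitter that tracks fence state; `split_markdown` is a hypothetical name, and the real indexer may chunk differently:

```python
import re

HEADER = re.compile(r"^#{1,3} ")  # H1-H3 only
FENCE = "`" * 3  # built indirectly so this example nests cleanly


def split_markdown(text):
    """Split text into sections, starting a new one at each H1-H3 header."""
    sections, current, in_fence = [], [], False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        if HEADER.match(line) and not in_fence and current:
            sections.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current))
    return sections


doc = "\n".join([
    "## User Authentication",
    "The authentication system uses JWT tokens...",
    FENCE + "bash",
    "# a shell comment, not a header",
    FENCE,
    "### Sessions",
    "Session notes.",
])
sections = split_markdown(doc)
print(len(sections))
```

The shell comment stays inside the first section together with its code block, so each code example is preserved next to the prose that explains it.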
Other Files
- JSON configuration files
- YAML files
- CSS chunked by selectors (50 lines per chunk)
Performance Optimization
First-Time Setup
```bash
# Initial model download (one-time)
Downloading model: ~80MB
Time: 1-2 minutes

# Indexing Wikit framework
Files: ~5000
Time: 5-7 minutes
Vectors created: ~15,000
```
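As a rough sanity check on those numbers, the raw vector data is small. The estimate below assumes 4-byte floats per dimension (an assumption about the storage format) and ignores metadata and index overhead:

```python
vectors = 15_000        # approximate count from the first-time setup figures
dimensions = 384        # all-MiniLM-L6-v2
bytes_per_float = 4     # assumed float32 storage

raw_bytes = vectors * dimensions * bytes_per_float
raw_megabytes = raw_bytes / 1_000_000
print(f"{raw_megabytes:.1f} MB")
```

A 768-dimensional model would double this figure for the same number of chunks.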
Automatic Indexing via Git Hooks
# Model already cached in Docker
Loading time: <1 second
# Git post-commit hook triggers indexing
# Only changed files are indexed
# Happens automatically on: git commit, git merge
Time: seconds per file
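A post-commit hook along these lines could collect the files touched by the latest commit. This is a sketch of the idea rather than WDG's actual hook: the extension list is inferred from the file types described on this page, and `changed_files` and `filter_indexable` are hypothetical names.

```python
import subprocess
from pathlib import PurePosixPath

# Assumed from the "What Gets Indexed" section; the real list may differ.
INDEXABLE = {".php", ".js", ".css", ".scss", ".json", ".yml", ".yaml", ".md"}


def changed_files(repo_dir="."):
    """List paths changed by HEAD, using plain `git diff-tree`."""
    out = subprocess.run(
        ["git", "diff-tree", "--no-commit-id", "--name-only", "-r", "--root", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def filter_indexable(paths):
    """Keep only paths whose extension the indexer would care about."""
    return [p for p in paths if PurePosixPath(p).suffix.lower() in INDEXABLE]


print(filter_indexable(["src/Auth.php", "logo.png", "blocks/hero/block.json"]))
```

A hook script would call `filter_indexable(changed_files())` and hand the result to the indexer. Merge commits usually need different handling, since `git diff-tree` does not list their changes without extra options such as `-m`.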
Memory Usage
| Operation | RAM Usage | CPU Usage |
|---|---|---|
| Idle | 50MB | 0% |
| Model Loading | 200-500MB | 20% |
| Indexing | 300-800MB | 40-60% |
| Searching | 100-200MB | 10% |
How Indexing Works
The indexing process happens automatically through git hooks:
1. Code Parsing: When you commit code, the indexer extracts semantic components
   - PHP: Functions, classes, WordPress hooks
   - JavaScript: Functions, React components, event handlers
   - Other files: Chunked by logical sections
2. Embedding Generation: Each code chunk is converted to a vector (384 dimensions with the default model) using the local Sentence Transformer model
3. Storage in Qdrant: Vectors are stored in the vector database with metadata:
   - File path and line number
   - Component type (function, class, hook)
   - Language and repository information
   - Project association
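To make the metadata concrete, here is one possible shape for a stored point. The field names are illustrative assumptions based on the list above, not the indexer's actual schema. The deterministic ID is a design choice of this sketch: it lets a re-indexed chunk overwrite its previous version instead of creating a duplicate.

```python
import uuid
from dataclasses import dataclass, asdict


@dataclass
class ChunkMetadata:
    file_path: str
    line_number: int
    component_type: str  # "function", "class", "hook", ...
    language: str
    repository: str
    project: str


def point_id(meta):
    """Stable ID: the same project, file, and line always map to the same UUID."""
    key = f"{meta.project}:{meta.file_path}:{meta.line_number}"
    return str(uuid.uuid5(uuid.NAMESPACE_URL, key))


def build_point(meta, vector):
    """Bundle ID, vector, and payload into one dict ready for an upsert."""
    return {"id": point_id(meta), "vector": vector, "payload": asdict(meta)}


meta = ChunkMetadata("inc/users.php", 12, "function", "php", "my-site", "my-project")
point = build_point(meta, [0.0] * 384)
print(point["payload"]["component_type"], len(point["vector"]))
```

Storing the component type and language in the payload is what would allow a search to be filtered, for example to PHP functions only.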
Search Examples
Finding Similar Functions
When you search for "get user by ID", the system finds:
- getUserById()
- fetch_user_by_identifier()
- loadUserFromDatabase($id)
- wp_get_user($user_id)
None of these contain the exact phrase "get user by ID", yet all of them match.
Cross-Language Search
Search: "validate email"
Finds across all languages:
- PHP: is_valid_email($email)
- JS: validateEmailAddress(email)
- Regex: /^[^@]+@[^@]+\.[^@]+$/
Pattern Recognition
Search: "database query with prepare statement"
Finds all secure database patterns:
$wpdb->prepare("SELECT * FROM...", $var)
$stmt = $pdo->prepare(...)
mysqli_prepare($conn, ...)
Privacy & Security
Local Processing
%%{init: {'theme':'neutral'}}%%
graph TB
subgraph "Your Machine"
Code[Your Code]
Model[AI Model]
Vectors[Vectors]
DB[(Qdrant)]
Code --> Model
Model --> Vectors
Vectors --> DB
end
All processing occurs locally:
- Source code
- Embeddings/vectors
- Search queries
- Results
- Model weights
The system does not require external API connections for embedding generation or vector search operations.
Advanced Configuration
Custom Model Path
# Use custom model location
export SENTENCE_TRANSFORMERS_HOME=/path/to/models
Batch Processing
# Index multiple files at once
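from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
code_chunks = ["function a() {}", "function b() {}"]  # placeholder chunks

embeddings = model.encode(
    code_chunks,
    batch_size=32,           # chunks encoded per forward pass
    show_progress_bar=True,
)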
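Conceptually, `batch_size=32` means the chunks are fed to the model in groups of 32 rather than one at a time or all at once. The helper below only illustrates that grouping. It is not how `encode` is implemented internally, and `batched` is a hypothetical name.

```python
def batched(items, batch_size=32):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


code_chunks = [f"chunk {i}" for i in range(100)]
sizes = [len(batch) for batch in batched(code_chunks)]
print(sizes)
```

A larger batch generally raises throughput at the cost of peak memory, so lowering it is one option when the indexer is memory constrained.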
GPU Acceleration (Advanced)
GPU acceleration requires Docker GPU passthrough configuration. This is an advanced setup not covered in standard installation.
If you have an NVIDIA GPU and want faster indexing, you'll need to:
- Install nvidia-docker2
- Modify the indexer service in docker-compose.yml to enable GPU access
- Rebuild the indexer container with CUDA-enabled PyTorch
Troubleshooting
Model Download Issues
If the indexer fails to download the model on first run:
# Check indexer logs
docker compose logs indexer
# Restart indexer to retry download
docker compose restart indexer
# Manually trigger download in container
docker exec wdg-indexer python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"
Indexing Performance
If indexing is slow:
# Use faster model (update .env)
EMBEDDING_MODEL=all-MiniLM-L6-v2
# Restart indexer service
docker compose restart indexer
# Check resource usage
docker stats wdg-indexer
Memory Issues
If the indexer runs out of memory:
# Increase Docker memory limit in compose.yml
# Under indexer service, adjust:
deploy:
  resources:
    limits:
      memory: 3G # Increase from 2G
# Or use a smaller model
EMBEDDING_MODEL=all-MiniLM-L6-v2
Best Practices
1. Choose the Right Model
- Speed priority: all-MiniLM-L6-v2
- Quality priority: all-mpnet-base-v2
- Multilingual: paraphrase-multilingual-MiniLM-L12-v2
2. Leverage Automatic Indexing
# Indexing happens automatically via git hooks
# Just commit your changes:
git commit -m "Add new feature"
# The post-commit hook will:
# - Detect changed files
# - Index them automatically
# - Update the vector database
# Manual indexing only needed for:
# - Initial project setup: wdg index my-site
# - After pulling Wikit updates: wdg index
3. Collection Management
# Separate collections per project
wdg index project1 # Creates: project_project1
wdg index project2 # Creates: project_project2
# Clean old collections
wdg collections delete project_old_site
The Future
Coming Soon
- Fine-tuned models for WordPress/PHP
- Code completion using local LLMs
- Semantic diff for code review
- Multi-model support (different models per project)
Research & Development
- Training custom models on Wikit patterns
- Multi-modal embeddings (code + comments + docs)
- Cross-project code similarity detection