Collection Management
Manage vector database collections for AI-powered code search across projects and repositories.
Overview
Collections are isolated vector databases that store code embeddings for semantic search:
- Framework Collection: wdg_framework - Wikit core code
- Project Collections: project_<name> - Project-specific code
- Platform Collection: platform - Platform infrastructure code
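The project naming pattern can be sketched as a small shell helper. The hyphen-to-underscore mapping is an assumption inferred from the sample output on this page (project my-site appears as project_my_site); the helper name is hypothetical and not part of wdg.

```bash
# Derive a project collection name from a project name.
# Assumption: names follow "project_<name>" with hyphens replaced by
# underscores, as the sample output suggests (my-site -> project_my_site).
collection_name_for_project() {
  printf 'project_%s\n' "$(printf '%s' "$1" | tr '-' '_')"
}
```

For example, `collection_name_for_project client-website` prints `project_client_website`.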
List Collections
View All Collections
bash
wdg collections list
Output:
Vector Database Collections:
wdg_framework
Description: Wikit Framework
Vectors: 15,234
Size: 48.2 MB
Dimensions: 768
Distance: Cosine
Last updated: 2024-10-14 09:15:00
Indexed files: ~5,000
project_my_site
Description: Project: my-site
Vectors: 1,450
Size: 4.7 MB
Dimensions: 768
Distance: Cosine
Last updated: 2024-10-14 10:30:00
Indexed files: ~500
project_client_website
Description: Project: client-website
Vectors: 3,892
Size: 12.1 MB
Dimensions: 768
Distance: Cosine
Last updated: 2024-10-13 16:45:00
Indexed files: ~1,200
platform
Description: Platform Infrastructure
Vectors: 892
Size: 2.8 MB
Dimensions: 768
Distance: Cosine
Last updated: 2024-10-12 14:20:00
Indexed files: ~300
Total: 4 collections, 21,468 vectors, 67.8 MB
List with Filters
bash
# Only project collections
wdg collections list --type=project
# Only active projects
wdg collections list --active
# Sort by size
wdg collections list --sort=size
# Sort by last updated
wdg collections list --sort=updated
Collection Details
Get Collection Info
bash
wdg collections info <collection-name>
Example:
bash
wdg collections info project_my_site
Output:
Collection: project_my_site
Type: Project
Status: Active
Statistics:
Vectors: 1,450
Storage: 4.7 MB
Dimensions: 768
Distance metric: Cosine
Optimization: Indexed
Metadata:
Project: my-site
Created: 2024-10-01 14:30:00
Last indexed: 2024-10-14 10:30:00
Last query: 2024-10-14 11:15:00
Indexed Content:
PHP files: 245 (890 vectors)
JavaScript files: 120 (380 vectors)
CSS files: 45 (110 vectors)
Other files: 90 (70 vectors)
Top Components:
Functions: 520
Classes: 85
Hooks: 145
Blocks: 12
Performance:
Average query time: 45ms
Cache hit rate: 38%
Optimization: ✓ Optimal
Create Collections
Manual Collection Creation
bash
wdg collections create <name> [--description="..."]
Example:
bash
wdg collections create project_demo \
  --description="Demo Project" \
  --dimensions=384 \
  --distance=Cosine
💡 TIP
Collections are automatically created when indexing new projects. Manual creation is rarely needed.
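For scripts that do need manual creation, a guard that creates the collection only when it is missing keeps the step safe to re-run. This is a minimal sketch: it assumes `wdg collections info` exits non-zero for an unknown collection, which this page does not confirm, and `ensure_collection` is a hypothetical helper name.

```bash
# Create a collection only when it does not already exist.
# Assumption: `wdg collections info` exits non-zero for an unknown collection.
ensure_collection() {
  name="$1"
  description="$2"
  if wdg collections info "$name" >/dev/null 2>&1; then
    echo "exists: $name"
  else
    wdg collections create "$name" --description="$description" &&
      echo "created: $name"
  fi
}
```

Usage: `ensure_collection project_demo "Demo Project"`.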
Create from Project
bash
wdg collections create-from-project <project-name>
Analyzes the project and creates an optimally configured collection.
Delete Collections
Delete Collection
bash
wdg collections delete <collection-name>
Example:
bash
wdg collections delete project_old_site
Confirmation prompt:
⚠️ WARNING: This will permanently delete the collection!
Collection: project_old_site
Vectors: 1,234
Size: 3.9 MB
This action cannot be undone.
Are you sure you want to delete? (y/N):
Force Delete (No Confirmation)
bash
wdg collections delete project_old_site --force
Delete Multiple Collections
bash
# Delete all inactive project collections
wdg collections delete --inactive
# Delete by pattern
wdg collections delete "project_old_*"
# Delete all project collections
wdg collections delete --type=project --force
Update Collections
Re-index Collection
bash
wdg collections reindex <collection-name>
Example:
bash
wdg collections reindex project_my_site
What it does:
- Clears existing vectors
- Re-parses all files
- Generates new embeddings
- Rebuilds collection
- Optimizes indexes
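Because re-indexing begins by clearing the existing vectors, it can be worth taking a backup first. The sketch below chains the documented `backup` and `reindex` commands. The helper name, the default `backups` directory, and the date-stamped file name are illustrative choices, and it assumes both commands signal failure through their exit status.

```bash
# Back up a collection, then re-index it; skip the re-index if the backup fails.
# The backup directory and file naming are illustrative, not wdg defaults.
safe_reindex() {
  name="$1"
  backup_dir="${2:-backups}"
  backup_file="$backup_dir/$name-$(date +%Y%m%d).qdrant"
  mkdir -p "$backup_dir" || return 1
  if wdg collections backup "$name" --output="$backup_file"; then
    wdg collections reindex "$name"
  else
    echo "backup failed; not re-indexing $name" >&2
    return 1
  fi
}
```

Usage: `safe_reindex project_my_site`.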
Optimize Collection
bash
wdg collections optimize <collection-name>
Optimizes collection for faster search:
- Rebuilds HNSW index
- Removes deleted vectors
- Compacts storage
- Updates statistics
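One way to decide when an optimize run is worthwhile is to read the average query time from `wdg collections stats`. The helpers below are a sketch that assumes the "Average query time: 45ms" line format shown in this page's sample output. Both function names are hypothetical, and the 100 ms default threshold is an arbitrary illustration.

```bash
# Print the average query time in milliseconds from stats output on stdin.
# Assumes a line of the form "Average query time: 45ms".
avg_query_ms() {
  awk -F': *' '/Average query time:/ { sub(/ms.*/, "", $2); print $2; exit }'
}

# Exit 0 when the average read from stdin exceeds the threshold (default 100 ms).
needs_optimize() {
  avg="$(avg_query_ms)"
  [ -n "$avg" ] && [ "$avg" -gt "${1:-100}" ]
}
```

Usage: `wdg collections stats project_my_site | needs_optimize 100 && wdg collections optimize project_my_site`.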
Update Metadata
bash
wdg collections update <collection-name> <key> <value>
Examples:
bash
# Update description
wdg collections update project_my_site description "Client Website"
# Add custom metadata
wdg collections update project_my_site client_name "Acme Corp"
Search Collections
Search Within Collection
bash
wdg collections search <collection-name> "<query>" [--limit=10]
Example:
bash
wdg collections search project_my_site "custom post type registration"
Output:
Search Results (3 found):
1. functions.php:45 (score: 0.89)
function register_portfolio_cpt() {
register_post_type('portfolio', [...]);
}
2. inc/post-types.php:12 (score: 0.84)
class CustomPostTypes {
public function register_types() {
// Register custom post types
}
}
3. lib/register.php:78 (score: 0.76)
add_action('init', 'register_all_post_types');
Cross-Collection Search
bash
# Search all collections
wdg collections search-all "<query>"
# Search multiple collections
wdg collections search-multi "project_my_site,project_client" "<query>"
Collection Analytics
Usage Statistics
bash
wdg collections stats [collection-name]
For single collection:
Statistics for project_my_site:
Storage:
Vectors: 1,450
Size on disk: 4.7 MB
Average vector size: 3.2 KB
Compression ratio: 85%
Performance:
Average query time: 45ms
Fastest query: 12ms
Slowest query: 230ms
Queries today: 156
Popular Searches:
1. "custom post type" (23 queries)
2. "email validation" (18 queries)
3. "user authentication" (15 queries)
Content Breakdown:
PHP: 890 vectors (61%)
JavaScript: 380 vectors (26%)
CSS: 110 vectors (8%)
Other: 70 vectors (5%)
For all collections:
Global Collection Statistics:
Total Collections: 4
Total Vectors: 21,468
Total Storage: 67.8 MB
Collection Sizes:
wdg_framework: 48.2 MB (71%)
project_client_website: 12.1 MB (18%)
project_my_site: 4.7 MB (7%)
platform: 2.8 MB (4%)
Query Performance:
Average: 52ms
P50: 45ms
P95: 180ms
P99: 350ms
Cache Performance:
Hit rate: 42%
Miss rate: 58%
Cache size: 256 MB
Growth Tracking
bash
wdg collections growth [collection-name] [--period=week|month]
Shows collection growth over time:
Collection Growth: project_my_site
Week of 2024-10-07:
Monday: +45 vectors
Tuesday: +23 vectors
Wednesday: +67 vectors
Thursday: +12 vectors
Friday: +89 vectors
Weekend: +8 vectors
Total growth: +244 vectors (20% increase)
Average per day: 35 vectors
Projection: 1,694 vectors by end of month
Backup and Restore
Backup Collection
bash
wdg collections backup <collection-name> [--output=<path>]
Example:
bash
wdg collections backup project_my_site \
  --output=backups/my-site-$(date +%Y%m%d).qdrant
What's included:
- All vectors
- Metadata
- Index structure
- Configuration
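Date-stamped backups accumulate over time, so a retention step is a common companion to the backup command. The sketch below keeps only the newest N files for one name prefix. It is not a wdg feature: `prune_backups` is a hypothetical helper, and it assumes files are named `<prefix>-YYYYMMDD.qdrant` as in the example above, so that sorting by name also sorts by date.

```bash
# Delete all but the newest N backups for one name prefix (default N = 7).
# Assumes <prefix>-YYYYMMDD.qdrant file names without newlines, and a prefix
# free of regular-expression metacharacters.
prune_backups() {
  dir="$1"; prefix="$2"; keep="${3:-7}"
  ls "$dir" | grep "^$prefix-[0-9]\{8\}\.qdrant\$" | sort -r |
    awk -v keep="$keep" 'NR > keep' |
    while IFS= read -r file; do
      rm -- "$dir/$file"
    done
}
```

Usage: `prune_backups backups my-site 7` keeps the seven most recent my-site backups and leaves other files in the directory untouched.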
Restore Collection
bash
wdg collections restore <backup-file>
Example:
bash
wdg collections restore backups/my-site-20241014.qdrant
Backup All Collections
bash
wdg collections backup-all [--output=<directory>]
Collection Maintenance
Cleanup Orphaned Vectors
bash
wdg collections cleanup <collection-name>
Removes vectors for files that no longer exist.
Verify Collection Integrity
bash
wdg collections verify <collection-name>
Output:
Verifying collection: project_my_site
Checking vectors...
✓ All vectors valid
✓ No duplicate vectors
✓ Metadata consistent
Checking indexes...
✓ HNSW index intact
✓ Payload index valid
✓ Full-text index ok
Checking files...
✓ All source files exist
⚠ 3 vectors for deleted files
→ Run cleanup to remove orphaned vectors
Status: Healthy (with warnings)
Compact Collection
bash
wdg collections compact <collection-name>
Reduces storage by:
- Removing deleted vectors
- Optimizing index structures
- Compressing metadata
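The verify, cleanup, and compact commands in this section can be chained into one maintenance pass. The sketch below runs cleanup only when verification reports orphaned vectors. It assumes verify prints the "vectors for deleted files" warning shown in the sample output and exits zero when it only has warnings; neither is confirmed here, and `maintain_collection` is a hypothetical helper name.

```bash
# Verify a collection, clean up orphaned vectors if any are reported,
# then compact storage.
# Assumption: verify prints "... vectors for deleted files" when orphans exist.
maintain_collection() {
  name="$1"
  report="$(wdg collections verify "$name")" || return 1
  if printf '%s\n' "$report" | grep -q 'vectors for deleted files'; then
    wdg collections cleanup "$name" || return 1
  fi
  wdg collections compact "$name"
}
```

Usage: `maintain_collection project_my_site`.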
Collection Migration
Export Collection
bash
wdg collections export <collection-name> --format=<json|csv|parquet>
Example:
bash
wdg collections export project_my_site --format=json \
  --output=exports/my-site-vectors.json
Import Collection
bash
wdg collections import <file> --name=<collection-name>
Example:
bash
wdg collections import exports/my-site-vectors.json \
  --name=project_my_site_restored
Merge Collections
bash
wdg collections merge <source1> <source2> --output=<new-collection>
Example:
bash
wdg collections merge project_old project_new \
  --output=project_combined
Advanced Operations
Collection Snapshots
bash
# Create snapshot
wdg collections snapshot <collection-name> [--name=<snapshot-name>]
# List snapshots
wdg collections snapshots <collection-name>
# Restore from snapshot
wdg collections restore-snapshot <collection-name> <snapshot-name>
# Delete snapshot
wdg collections delete-snapshot <collection-name> <snapshot-name>
Vector Operations
bash
# Count vectors matching filter
wdg collections count <collection-name> --filter='type="function"'
# Find duplicate vectors
wdg collections duplicates <collection-name>
# Update vector metadata
wdg collections update-vectors <collection-name> \
  --filter='file_type="php"' \
  --set language="php8"
Configuration
Collection Settings
bash
# View collection config
wdg collections config <collection-name>
# Update settings
wdg collections config <collection-name> set <key> <value>
Common settings:
bash
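# Optimize for search quality (a higher ef_construct builds a more accurate HNSW graph, at the cost of slower indexing)
wdg collections config project_my_site set \
  hnsw_ef_construct 200
# Optimize for memory (hnsw_m sets the number of graph links per vector; lower values use less memory)
wdg collections config project_my_site set \
  hnsw_m 16
# Enable compression
wdg collections config project_my_site set \
  compression true
Monitoring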
Real-Time Collection Monitoring
bash
wdg collections monitor [collection-name]
Output:
Monitoring: project_my_site (Press Ctrl+C to exit)
14:23:15 Query: "custom post type" (47ms, 5 results)
14:23:42 Indexed: functions.php (+12 vectors)
14:24:08 Query: "email validation" (38ms, 3 results)
14:24:35 Optimized indexes (saved 450KB)
14:25:12 Query: "user auth" (52ms, 8 results)
Current stats:
Vectors: 1,462 (+12 since start)
Queries: 3 (avg 45ms)
Size: 4.72 MB
Collection Alerts
bash
# Set up alerts
wdg collections alert <collection-name> \
  --threshold=size:100MB \
  --threshold=queries:1000 \
  --notify=email:admin@wdg.com
Troubleshooting
Collection Not Found
bash
# List available collections
wdg collections list
# Verify collection name
wdg collections info <collection-name>
# Recreate by re-indexing
wdg index <project-name>
Slow Queries
bash
# Optimize collection
wdg collections optimize <collection-name>
# Check stats
wdg collections stats <collection-name>
# Rebuild indexes
wdg collections reindex <collection-name>
Storage Issues
bash
# Check collection sizes
wdg collections list --sort=size
# Compact large collections
wdg collections compact <collection-name>
# Delete unused collections
wdg collections delete --inactive
See Also: