Documentation
Complete guide to using semantic-node-router in your applications
Overview
Semantic routing uses vector embeddings to make fast routing decisions based on semantic meaning, rather than relying on slower LLM calls or brittle keyword matching.
This enables you to quickly route user queries to the appropriate handler or function based on what the query means, not just what words it contains.
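As a rough illustration of the idea, the sketch below picks the closest route by comparing vectors. It assumes cosine similarity as the metric (a common choice for embeddings; the library's actual metric is not stated here), and the route vectors are hand-written, hypothetical stand-ins for real embeddings.

```typescript
type Vector = number[];

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: Vector, b: Vector): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // Zero vectors match nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface EmbeddedRoute {
  name: string;
  vector: Vector;
}

// Hypothetical 3-dimensional vectors; real embeddings have hundreds of dimensions.
const embeddedRoutes: EmbeddedRoute[] = [
  { name: 'billing', vector: [0.9, 0.1, 0.0] },
  { name: 'support', vector: [0.1, 0.9, 0.1] },
];

// Return the route whose vector is most similar to the query vector.
function nearestRoute(
  query: Vector,
  routes: EmbeddedRoute[]
): { route: string | null; score: number } {
  let bestRoute: string | null = null;
  let bestScore = -Infinity;
  for (const candidate of routes) {
    const score = cosineSimilarity(query, candidate.vector);
    if (score > bestScore) {
      bestRoute = candidate.name;
      bestScore = score;
    }
  }
  return { route: bestRoute, score: bestScore };
}

const nearest = nearestRoute([0.8, 0.2, 0.1], embeddedRoutes);
console.log(nearest.route, nearest.score.toFixed(2)); // billing 0.98
```

No LLM call is involved at query time: once the vectors exist, routing is a handful of arithmetic operations per route.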
Encoders
☁️ OpenAI Encoder
Use OpenAI's embedding API for high-accuracy semantic routing.
import { OpenAIEncoder } from 'semantic-node-router';
const encoder = new OpenAIEncoder({
  apiKey: process.env.OPENAI_API_KEY, // Or set OPENAI_API_KEY env var
  model: 'text-embedding-3-small', // Default
  scoreThreshold: 0.3, // Default
  dimensions: undefined, // Optional, for embedding-3 models
  maxRetries: 3 // Default
});
Supported Models:
- text-embedding-3-small (default) - Fast and efficient
- text-embedding-3-large - Higher accuracy
- text-embedding-ada-002 - Legacy model
💻 Transformers Encoder
Use local Hugging Face models for offline, free embeddings with no API key required.
import { TransformersEncoder } from 'semantic-node-router';
const encoder = new TransformersEncoder({
  modelName: 'Xenova/all-MiniLM-L6-v2', // Default
  quantized: true, // Default
  scoreThreshold: 0.5, // Default
  cacheDir: './models', // Optional
  device: 'cpu' // 'cpu' (default) or 'gpu'
});
// IMPORTANT: Must initialize before use
await encoder.initialize(); // Loads model (1-10s, one-time)
Benefits:
- No API key needed
- Completely offline
- Fast inference (10-50ms)
- Privacy-friendly
Trade-offs:
- Model download (~80-160MB)
- Memory usage (~200-400MB)
- Slightly lower accuracy
Routes
Routes define categories with example utterances that represent different intents or handlers.
import { Route } from 'semantic-node-router';
const route = new Route({
  name: 'route-name', // Required: unique identifier
  utterances: ['example 1', '...'], // Required: example phrases
  description: 'What this handles', // Optional
  scoreThreshold: 0.7, // Optional: override encoder default
  metadata: { custom: 'data' } // Optional: custom metadata
});
Router
The Router orchestrates the routing process, managing routes and performing semantic matching.
import { Router } from 'semantic-node-router';
const router = new Router({
  routes: [route1, route2], // Required: array of routes
  encoder: encoder, // Required: encoder instance
  aggregationMethod: 'max', // Optional: 'max' | 'mean' | 'sum'
  topK: 1, // Optional: default top matches
  enableCache: true, // Optional: enable LRU cache
  cacheSize: 1000 // Optional: cache size
});
// Initialize (encodes all utterances)
await router.initialize();
// Route a query
const result = await router.route('my query');
// { route: 'route-name' | null, score: 0.87 }
// Get top K matches
const matches = await router.routeTopK('my query', 3);
Configuration
Aggregation Methods
When a route has multiple utterances, how should similarities be combined?
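One plausible reading of the three options, sketched as a standalone function: 'max' keeps the best-matching utterance, 'mean' averages over all utterances, and 'sum' adds them up. The `aggregate` helper below is hypothetical and only illustrates the arithmetic; it is not the library's internal code.

```typescript
type AggregationMethod = 'max' | 'mean' | 'sum';

// Combine per-utterance similarities into a single score for the route.
function aggregate(similarities: number[], method: AggregationMethod): number {
  if (similarities.length === 0) return 0;
  const total = similarities.reduce((acc, s) => acc + s, 0);
  if (method === 'max') return Math.max(...similarities);
  if (method === 'mean') return total / similarities.length;
  return total; // 'sum'
}

// Similarities between one query and three utterances of the same route.
const utteranceScores = [0.75, 0.5, 0.25];
const byMax = aggregate(utteranceScores, 'max');   // 0.75
const byMean = aggregate(utteranceScores, 'mean'); // 0.5
const bySum = aggregate(utteranceScores, 'sum');   // 1.5
console.log(byMax, byMean, bySum);
```

Under this reading, 'max' rewards a single close utterance, 'mean' rewards routes whose utterances are all close to the query, and 'sum' grows with the number of utterances, so its scores are not directly comparable to a threshold in the 0 to 1 range.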
const router = new Router({
  routes,
  encoder,
  aggregationMethod: 'max' // or 'mean' or 'sum'
});
Score Thresholds
Control routing confidence with thresholds:
// Global threshold (applies to all routes)
const encoder = new OpenAIEncoder({
  scoreThreshold: 0.5 // Balanced matching
});
// Per-route threshold (overrides global)
const route = new Route({
  name: 'sensitive_action',
  utterances: ['delete my account'],
  scoreThreshold: 0.9 // Require very high confidence
});
Threshold Guidelines:
- 0.3 - Very loose matching, many false positives
- 0.5 - Balanced (good default)
- 0.7 - Stricter, fewer false positives
- 0.9 - Very strict, only near-exact semantic matches
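The override rule can be sketched as a small decision function. The `RouteLike` interface and `resolveMatch` helper are hypothetical names used for illustration, and the sketch assumes a score equal to the threshold counts as a match, which the library may handle differently.

```typescript
interface RouteLike {
  name: string;
  scoreThreshold?: number; // Per-route override, if any
}

// Return the route name when the score clears the effective threshold, else null.
function resolveMatch(
  route: RouteLike,
  score: number,
  encoderThreshold: number
): string | null {
  // A per-route threshold takes precedence over the encoder-level default.
  const threshold = route.scoreThreshold ?? encoderThreshold;
  return score >= threshold ? route.name : null;
}

const globalThreshold = 0.5;
const faqMatch = resolveMatch({ name: 'faq' }, 0.6, globalThreshold);
const sensitiveMatch = resolveMatch(
  { name: 'sensitive_action', scoreThreshold: 0.9 },
  0.6,
  globalThreshold
);
console.log(faqMatch, sensitiveMatch); // faq null
```

The same score of 0.6 is accepted for the ordinary route and rejected for the sensitive one, which is the point of setting a stricter threshold on destructive actions.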
Best Practices for Utterances
Include Diverse Variations
Provide multiple ways users might express the same intent:
new Route({
  name: 'check_balance',
  utterances: [
    // Formal
    'What is my account balance?',
    'Please show my current balance',
    // Informal
    'how much money do I have',
    'what\'s my balance',
    'check my balance',
    // Different phrasings
    'I want to see my balance',
    'Can you tell me my balance?',
    'balance inquiry'
  ]
})
✅ Do
- Include 8-15 diverse utterances per route
- Use synonyms and alternative terms
- Include formal and informal phrasings
- Consider common typos and fragments
- Make utterances semantically distinct across routes
❌ Don't
- Use fewer than 3 utterances per route
- Duplicate utterances across routes
- Make utterances too similar across routes
- Use only formal or only informal language
- Exceed 30 utterances (consider splitting)
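Some of these guidelines can be checked mechanically before routes are created. The sketch below is a hypothetical `lintRoutes` helper (not part of the library) that flags routes with fewer than 3 or more than 30 utterances and utterances duplicated across routes, comparing them case-insensitively.

```typescript
interface RouteDraft {
  name: string;
  utterances: string[];
}

// Return human-readable warnings for routes that break the count or duplication rules.
function lintRoutes(routes: RouteDraft[]): string[] {
  const warnings: string[] = [];
  const owner = new Map<string, string>(); // normalized utterance -> first route using it
  for (const route of routes) {
    const count = route.utterances.length;
    if (count < 3) {
      warnings.push(`${route.name}: only ${count} utterance(s), fewer than 3`);
    }
    if (count > 30) {
      warnings.push(`${route.name}: ${count} utterances, consider splitting`);
    }
    for (const utterance of route.utterances) {
      const key = utterance.trim().toLowerCase();
      const previous = owner.get(key);
      if (previous !== undefined && previous !== route.name) {
        warnings.push(`"${key}" appears in both ${previous} and ${route.name}`);
      } else {
        owner.set(key, route.name);
      }
    }
  }
  return warnings;
}

const lintWarnings = lintRoutes([
  {
    name: 'check_balance',
    utterances: ['check my balance', 'balance inquiry', 'how much money do I have'],
  },
  { name: 'transfer', utterances: ['send money', 'Check my balance'] },
]);
lintWarnings.forEach(warning => console.log(warning));
```

Exact-duplicate detection is the easy part. Catching utterances that are merely too similar across routes would need the embeddings themselves, for example by comparing utterance vectors between routes.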
Error Handling
The library provides specific error types for different failure scenarios:
import { Router, RateLimitError, AuthenticationError } from 'semantic-node-router';
try {
  await router.initialize();
} catch (error) {
  if (error instanceof RateLimitError) {
    console.error('Rate limit hit:', error.message);
    await new Promise(resolve => setTimeout(resolve, error.retryAfter * 1000));
    await router.initialize(); // Retry
  } else if (error instanceof AuthenticationError) {
    console.error('Authentication failed:', error.message);
    // Fix API key
  }
}
Available Error Types:
- SemanticRouterError - Base error class
- RouterConfigurationError - Invalid configuration
- RouterNotInitializedError - Used before initialization
- EncodingError - Encoding failure
- RateLimitError - API rate limit exceeded
- AuthenticationError - API authentication failed
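The example above retries once. For repeated retries, a bounded backoff loop is a common pattern; the sketch below is one way to write it, not something the library provides. `withRetry`, `backoffDelayMs`, and `RetryOptions` are hypothetical names, and the 500 ms base delay and 30 s cap are arbitrary assumptions.

```typescript
// Delay before the next attempt: honor a server-provided hint in seconds if present,
// otherwise use exponential backoff (base, 2x base, 4x base, ...), capped at capMs.
function backoffDelayMs(
  attempt: number,
  retryAfterSeconds?: number,
  baseMs: number = 500,
  capMs: number = 30000
): number {
  if (retryAfterSeconds !== undefined && retryAfterSeconds >= 0) {
    return Math.min(retryAfterSeconds * 1000, capMs);
  }
  return Math.min(baseMs * Math.pow(2, attempt), capMs);
}

interface RetryOptions {
  maxAttempts: number;
  isRetryable: (error: unknown) => boolean;
  retryAfterSeconds: (error: unknown) => number | undefined;
}

// Run the operation, retrying retryable failures until maxAttempts is reached.
async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const isLastAttempt = attempt + 1 >= options.maxAttempts;
      if (isLastAttempt || !options.isRetryable(error)) throw error;
      const delay = backoffDelayMs(attempt, options.retryAfterSeconds(error));
      await new Promise<void>(resolve => setTimeout(resolve, delay));
    }
  }
}
```

With the library, `operation` could be `() => router.initialize()`, `isRetryable` could test `error instanceof RateLimitError`, and `retryAfterSeconds` could read `error.retryAfter`. Authentication errors are left non-retryable, since waiting does not fix a bad API key.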
Advanced Usage
Dynamic Routes
Add, remove, or update routes at runtime:
// Add route dynamically
await router.addRoute(new Route({
  name: 'new-route',
  utterances: ['example']
}));
// Remove route
const removed = router.removeRoute('route-name');
// Get all routes
const routes = router.getRoutes();