Guardrails Middleware¶
Since: v7.42.0 | Status: Stable | Availability: SDK (CLI + SDK)
Overview¶
What it does: Guardrails middleware provides real-time content filtering and policy enforcement for AI model outputs, blocking profanity, PII, unsafe content, and custom-defined terms.
Why use it: Protect your application from generating harmful, inappropriate, or non-compliant content. Ensures AI responses meet safety standards and regulatory requirements.
Common use cases:
- Content moderation for user-facing applications
- PII (Personally Identifiable Information) redaction
- Profanity filtering for family-friendly apps
- Compliance with industry regulations (COPPA, GDPR, etc.)
- Brand safety and reputation management
Quick Start¶
Zero Configuration
Guardrails work out of the box with the security
preset. No custom configuration required for basic content filtering.
SDK Example with Security Preset¶
import { NeuroLink } from "@juspay/neurolink";
const neurolink = new NeuroLink({
middleware: {
preset: "security", // (1)!
},
});
const result = await neurolink.generate({
// (2)!
prompt: "Tell me about security best practices",
});
// Output is automatically filtered for bad words and unsafe content
console.log(result.content); // (3)!
- Enables guardrails middleware with default configuration
- All generate/stream calls automatically apply filtering
- Content is already filtered - safe to display to users
Custom Guardrails Configuration¶
import { NeuroLink } from "@juspay/neurolink";
const neurolink = new NeuroLink({
middleware: {
preset: "security",
middlewareConfig: {
guardrails: {
enabled: true, // (1)!
config: {
badWords: {
enabled: true, // (2)!
list: ["spam", "scam", "inappropriate-term"], // (3)!
},
modelFilter: {
enabled: true, // (4)!
filterModel: "gpt-4o-mini", // (5)!
},
},
},
},
},
});
- Master switch for guardrails middleware
- Enable keyword-based filtering (fast, regex-based)
- Custom terms to filter/redact from outputs
- Enable AI-powered content safety check (slower, more accurate)
- Use fast, cheap model for safety evaluation
CLI Usage¶
# Enable guardrails via environment variable
export NEUROLINK_MIDDLEWARE_PRESET="security"
npx @juspay/neurolink generate "Write a product description" --enable-analytics
# Guardrails are automatically applied to all generations
Configuration¶
Option | Type | Default | Required | Description |
---|---|---|---|---|
enabled |
boolean |
true |
No | Enable/disable guardrails middleware |
badWords.enabled |
boolean |
false |
No | Enable keyword-based filtering |
badWords.list |
string[] |
[] |
No | List of terms to filter/redact |
modelFilter.enabled |
boolean |
false |
No | Enable AI-based content safety check |
modelFilter.filterModel |
string |
- | No | Model to use for safety evaluation |
Environment Variables¶
# Enable guardrails preset
export NEUROLINK_MIDDLEWARE_PRESET="security"
# Or enable all middleware (includes guardrails + analytics)
export NEUROLINK_MIDDLEWARE_PRESET="all"
Config File¶
// .neurolink.config.ts
export default {
middleware: {
preset: "security",
middlewareConfig: {
guardrails: {
enabled: true,
config: {
badWords: {
enabled: true,
list: [
// Custom filtered terms
"confidential",
"internal-use-only",
// PII patterns
"ssn",
"credit-card",
],
},
modelFilter: {
enabled: true,
filterModel: "gpt-4o-mini", // Fast, cheap safety model
},
},
},
},
},
};
How It Works¶
Filtering Pipeline¶
- User prompt → Sent to AI model
- AI generates response → Initial content created
- Guardrails middleware intercepts:
- Bad word filtering: Regex-based term replacement
- Model-based filtering: AI evaluates content safety
- Filtered response → Delivered to user
Bad Word Filtering¶
Simple regex-based replacement:
// Input: "This contains spam and other spam words"
// Output: "This contains **** and other **** words"
- Case-insensitive matching
- Replaces with asterisks (
*
) of equal length - Works in both
generate
andstream
modes
Model-Based Filtering¶
PII Detection Accuracy
While guardrails filter common PII patterns, always review critical outputs manually. False negatives can occur with obfuscated data or uncommon PII formats. For high-stakes compliance, combine with dedicated PII detection services.
AI-powered safety check:
// Guardrails sends content to filter model:
// "Is the following text safe? Respond with only 'safe' or 'unsafe'."
// If unsafe:
// Output: "<REDACTED BY AI GUARDRAIL>"
- Uses separate, lightweight model (e.g.,
gpt-4o-mini
) - Binary safe/unsafe classification
- Full redaction on unsafe detection
Advanced Usage¶
Combining with Other Middleware¶
import { NeuroLink } from "@juspay/neurolink";
const neurolink = new NeuroLink({
middleware: {
preset: "all", // Enables guardrails + analytics + others
middlewareConfig: {
guardrails: {
enabled: true,
config: {
badWords: {
enabled: true,
list: ["profanity1", "profanity2"],
},
},
},
analytics: {
enabled: true,
},
},
},
});
Streaming with Guardrails¶
const stream = await neurolink.streamText({
prompt: "Write a long story",
});
// Chunks are filtered in real-time as they stream
for await (const chunk of stream) {
console.log(chunk.content); // Already filtered
}
Dynamic Guardrails¶
// Add/remove filtered terms dynamically
const customWords = await loadBlocklistFromDatabase();
const neurolink = new NeuroLink({
middleware: {
middlewareConfig: {
guardrails: {
config: {
badWords: {
enabled: true,
list: [...customWords, "static-term"],
},
},
},
},
},
});
API Reference¶
Middleware Configuration¶
preset: "security"
→ Enables guardrails with defaultspreset: "all"
→ Enables guardrails + all other middlewaremiddlewareConfig.guardrails
→ Custom guardrails configuration
See GUARDRAILS-AI-INTEGRATION.md for complete integration guide.
Troubleshooting¶
Problem: Guardrails not filtering content¶
Cause: Middleware not enabled or preset not configured Solution:
// Ensure preset is set or guardrails explicitly enabled
const neurolink = new NeuroLink({
middleware: {
preset: "security", // ← Must set this
},
});
Problem: Too many false positives (legitimate content filtered)¶
Cause: Overly aggressive bad word list Solution:
// Use more specific terms, avoid common words
config: {
badWords: {
list: [
"very-specific-bad-term", // Good
// "free", // Bad - too common
],
},
}
Problem: Model-based filter is slow¶
Cause: Using large/expensive model for filtering Solution:
// Switch to faster, cheaper model
config: {
modelFilter: {
enabled: true,
filterModel: "gpt-4o-mini", // ← Fast and cheap
// filterModel: "gpt-4", // ❌ Too slow/expensive
},
}
Problem: Guardrails not working in streaming mode¶
Cause: Streaming guardrails only support bad word filtering (not model-based) Solution:
// For streaming, rely on bad word filtering
// Model-based filtering works in generate() mode only
const result = await neurolink.generate({
// Use generate, not stream
prompt: "...",
});
Best Practices¶
Content Filtering Strategy¶
- Start with presets - Use
preset: "security"
as baseline - Layer protections - Combine bad words + model filtering
- Use lightweight filter models -
gpt-4o-mini
for speed - Test thoroughly - Verify filtering doesn't break legitimate content
- Monitor and iterate - Track false positives/negatives
Bad Word List Curation¶
✅ Do:
- Include specific harmful terms
- Use exact phrases, not single characters
- Regularly update based on user reports
- Consider context-specific terms for your domain
❌ Don't:
- Add common English words (high false positive rate)
- Include single letters or short words
- Rely solely on bad words (use model filter too)
Performance Optimization¶
// For high-throughput applications:
config: {
guardrails: {
badWords: {
enabled: true, // Fast regex filtering
list: [...criticalTerms],
},
modelFilter: {
enabled: false, // Disable for speed (or use sampling)
},
},
}
Compliance Use Cases¶
COPPA (Children's Online Privacy)¶
config: {
badWords: {
enabled: true,
list: ["email", "phone", "address", "age", "location"],
},
modelFilter: {
enabled: true, // Detect attempts to collect PII
},
}
GDPR Data Protection¶
config: {
badWords: {
enabled: true,
list: [
"credit-card", "ssn", "passport",
"bank-account", "medical-record",
],
},
}
Related Features¶
- HITL Workflows - User approval for risky actions
- Middleware Architecture - Custom middleware development
- Analytics Integration - Track filtered content metrics
Migration Notes¶
If upgrading from versions before v7.42.0:
- Guardrails are now enabled via middleware presets
- Old
guardrailsConfig
option deprecated - usemiddlewareConfig.guardrails
- No breaking changes - existing configs still work
- Recommended: Switch to
preset: "security"
for simplified setup
For complete technical documentation and advanced integration patterns, see GUARDRAILS-AI-INTEGRATION.md.