OpenAI Compatible Provider Guide¶
Connect to any OpenAI-compatible API: OpenRouter, vLLM, LocalAI, and more
Overview¶
The OpenAI Compatible provider enables NeuroLink to work with any service that implements the OpenAI API specification. This includes third-party aggregators like OpenRouter, self-hosted solutions like vLLM, and custom OpenAI-compatible endpoints.
Key Benefits¶
- 🌐 Universal Compatibility: Works with any OpenAI-compatible endpoint
- 🔄 Provider Aggregation: Access multiple providers through one endpoint (OpenRouter)
- 🏠 Self-Hosted: Run your own models with vLLM, LocalAI
- 💰 Cost Optimization: Compare pricing across providers
- 🔧 Custom Endpoints: Integrate proprietary AI services
- 📊 Auto-Discovery: Automatic model detection via the /v1/models endpoint (see the example below)
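The discovery endpoint can also be queried directly, as in this illustrative sketch (substitute your own base URL and API key):
# List models exposed by an OpenAI-compatible endpoint (OpenRouter shown as an example)
curl https://openrouter.ai/api/v1/models \
-H "Authorization: Bearer $OPENAI_COMPATIBLE_API_KEY"
# The response is JSON with a "data" array of model objects, each carrying an "id" field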
Supported Services¶
| Service | Description | Best For |
| --- | --- | --- |
| OpenRouter | AI provider aggregator (100+ models) | Multi-provider access |
| vLLM | High-performance inference server | Self-hosted models |
| LocalAI | Local OpenAI alternative | Privacy, offline usage |
| Text Generation WebUI | Community inference server | Local LLMs |
| Custom APIs | Your own OpenAI-compatible service | Proprietary models |
Quick Start¶
Option 1: OpenRouter (Recommended for Beginners)¶
OpenRouter provides access to 100+ models from multiple providers through a single API.
1. Get OpenRouter API Key¶
- Visit OpenRouter.ai
- Sign up for a free account
- Go to Keys
- Create a new key
- Add credits ($5 minimum)
2. Configure NeuroLink¶
# Add to .env
OPENAI_COMPATIBLE_BASE_URL=https://openrouter.ai/api/v1
OPENAI_COMPATIBLE_API_KEY=sk-or-v1-your-key-here
3. Test Setup¶
# Auto-discover available models
npx @juspay/neurolink models --provider openai-compatible
# Generate with specific model
npx @juspay/neurolink generate "Hello from OpenRouter!" \
--provider openai-compatible \
--model "anthropic/claude-3.5-sonnet"
Option 2: vLLM (Self-Hosted)¶
vLLM is a high-performance inference server for running models locally.
1. Install vLLM¶
# Install vLLM
pip install vllm
# Start server with a model
python -m vllm.entrypoints.openai.api_server \
--model mistralai/Mistral-7B-Instruct-v0.2 \
--port 8000
2. Configure NeuroLink¶
# Add to .env
OPENAI_COMPATIBLE_BASE_URL=http://localhost:8000/v1
OPENAI_COMPATIBLE_API_KEY=none # vLLM doesn't require a key
3. Test Setup¶
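The same commands as the OpenRouter test work here; a minimal sketch, assuming the Mistral model started above:
# Auto-discover the models served by vLLM
npx @juspay/neurolink models --provider openai-compatible
# Generate with the served model
npx @juspay/neurolink generate "Hello from vLLM!" \
--provider openai-compatible \
--model "mistralai/Mistral-7B-Instruct-v0.2"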
Option 3: LocalAI (Privacy-Focused)¶
LocalAI runs completely offline for maximum privacy.
1. Install LocalAI¶
# Using Docker
docker run -p 8080:8080 \
-v $PWD/models:/models \
localai/localai:latest
# Or install directly
curl https://localai.io/install.sh | sh
2. Configure NeuroLink¶
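Configuration mirrors the vLLM setup; a minimal sketch, assuming LocalAI's default port 8080 from the Docker command above:
# Add to .env
OPENAI_COMPATIBLE_BASE_URL=http://localhost:8080/v1
OPENAI_COMPATIBLE_API_KEY=none # LocalAI does not require a key by default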
Model Auto-Discovery¶
NeuroLink automatically discovers available models through the /v1/models endpoint.
Discover Available Models¶
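From the CLI:
# List every model exposed by the configured endpoint
npx @juspay/neurolink models --provider openai-compatible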
SDK Auto-Discovery¶
import { NeuroLink } from "@juspay/neurolink";
const ai = new NeuroLink();
// Discover models programmatically
const models = await ai.listModels("openai-compatible");
console.log("Available models:", models);
// Use discovered model
const result = await ai.generate({
input: { text: "Hello!" },
provider: "openai-compatible",
model: models[0].id, // Use first available model
});
OpenRouter Integration¶
OpenRouter aggregates 100+ models from multiple providers.
Available Models on OpenRouter¶
# List all OpenRouter models
npx @juspay/neurolink models --provider openai-compatible
# Popular models available:
# - anthropic/claude-3.5-sonnet
# - openai/gpt-4-turbo
# - google/gemini-pro-1.5
# - meta-llama/llama-3-70b-instruct
# - mistralai/mistral-large
Model Selection by Provider¶
// Use Claude through OpenRouter
const claude = await ai.generate({
input: { text: "Explain quantum computing" },
provider: "openai-compatible",
model: "anthropic/claude-3.5-sonnet",
});
// Use GPT-4 through OpenRouter
const gpt4 = await ai.generate({
input: { text: "Write a poem" },
provider: "openai-compatible",
model: "openai/gpt-4-turbo",
});
// Use Gemini through OpenRouter
const gemini = await ai.generate({
input: { text: "Analyze this data" },
provider: "openai-compatible",
model: "google/gemini-pro-1.5",
});
OpenRouter Features¶
// Cost tracking (OpenRouter provides in response)
const result = await ai.generate({
input: { text: "Your prompt" },
provider: "openai-compatible",
model: "anthropic/claude-3.5-sonnet",
enableAnalytics: true,
});
console.log("Tokens used:", result.usage.totalTokens);
console.log("Cost:", result.cost); // OpenRouter returns actual cost
// Provider selection preferences
const preferredResult = await ai.generate({
input: { text: "Your prompt" },
provider: "openai-compatible",
model: "openai/gpt-4",
headers: {
"X-Provider-Preferences": "order:cost", // Cheapest first
},
});
vLLM Integration¶
vLLM provides high-performance inference for self-hosted models.
Starting vLLM Server¶
# Basic setup
python -m vllm.entrypoints.openai.api_server \
--model mistralai/Mistral-7B-Instruct-v0.2 \
--port 8000
# With GPU optimization (tensor-parallel-size 2 splits the model across two GPUs)
python -m vllm.entrypoints.openai.api_server \
--model mistralai/Mistral-7B-Instruct-v0.2 \
--tensor-parallel-size 2 \
--gpu-memory-utilization 0.9 \
--port 8000
# With quantization for lower memory
python -m vllm.entrypoints.openai.api_server \
--model TheBloke/Mistral-7B-Instruct-v0.2-AWQ \
--quantization awq \
--port 8000
NeuroLink Configuration for vLLM¶
const ai = new NeuroLink({
providers: [
{
name: "openai-compatible",
config: {
baseUrl: "http://localhost:8000/v1",
apiKey: "none", // vLLM doesn't require authentication
defaultModel: "mistralai/Mistral-7B-Instruct-v0.2",
},
},
],
});
// Use vLLM-hosted model
const result = await ai.generate({
input: { text: "Explain Docker containers" },
provider: "openai-compatible",
});
Multiple vLLM Instances¶
// Load balance across multiple vLLM servers
const ai = new NeuroLink({
providers: [
{
name: "openai-compatible-1",
config: {
baseUrl: "http://server1:8000/v1",
apiKey: "none",
},
priority: 1,
},
{
name: "openai-compatible-2",
config: {
baseUrl: "http://server2:8000/v1",
apiKey: "none",
},
priority: 1,
},
],
loadBalancing: "round-robin",
});
SDK Integration¶
Basic Usage¶
import { NeuroLink } from "@juspay/neurolink";
const ai = new NeuroLink();
// Simple generation
const result = await ai.generate({
input: { text: "Hello from OpenAI Compatible!" },
provider: "openai-compatible",
});
console.log(result.content);
With Model Selection¶
// Specify exact model (OpenRouter format)
const result = await ai.generate({
input: { text: "Explain blockchain" },
provider: "openai-compatible",
model: "anthropic/claude-3.5-sonnet",
});
// Or use an auto-discovered model
const models = await ai.listModels("openai-compatible");
const discoveredResult = await ai.generate({
input: { text: "Your prompt" },
provider: "openai-compatible",
model: models[0].id,
});
Streaming¶
// Stream responses for better UX
for await (const chunk of ai.stream({
input: { text: "Write a long story" },
provider: "openai-compatible",
model: "anthropic/claude-3.5-sonnet",
})) {
process.stdout.write(chunk.content);
}
Custom Headers¶
// Pass custom headers (e.g., for OpenRouter)
const result = await ai.generate({
input: { text: "Your prompt" },
provider: "openai-compatible",
headers: {
"HTTP-Referer": "https://your-app.com",
"X-Title": "YourApp",
"X-Provider-Preferences": "order:cost",
},
});
Error Handling¶
try {
const result = await ai.generate({
input: { text: "Your prompt" },
provider: "openai-compatible",
model: "non-existent-model",
});
} catch (error) {
if (error.message.includes("model not found")) {
// List available models
const models = await ai.listModels("openai-compatible");
console.log(
"Available models:",
models.map((m) => m.id),
);
} else if (error.message.includes("connection")) {
console.error("Cannot connect to endpoint");
} else {
throw error;
}
}
CLI Usage¶
Basic Commands¶
# Generate with default model
npx @juspay/neurolink generate "Hello world" --provider openai-compatible
# Use specific model
npx @juspay/neurolink gen "Write code" \
--provider openai-compatible \
--model "anthropic/claude-3.5-sonnet"
# Stream response
npx @juspay/neurolink stream "Tell a story" \
--provider openai-compatible
# List available models
npx @juspay/neurolink models --provider openai-compatible
OpenRouter-Specific Commands¶
# Use cheap models for cost optimization
npx @juspay/neurolink gen "Customer support query" \
--provider openai-compatible \
--model "meta-llama/llama-3-8b-instruct" # Cheap
# Use premium models for complex tasks
npx @juspay/neurolink gen "Complex analysis task" \
--provider openai-compatible \
--model "anthropic/claude-3-opus" # Premium
Configuration Options¶
Environment Variables¶
# Required
OPENAI_COMPATIBLE_BASE_URL=https://openrouter.ai/api/v1
OPENAI_COMPATIBLE_API_KEY=sk-or-v1-your-key
# Optional
OPENAI_COMPATIBLE_MODEL=anthropic/claude-3.5-sonnet # Default model
OPENAI_COMPATIBLE_TIMEOUT=60000 # Timeout (ms)
OPENAI_COMPATIBLE_VERIFY_SSL=true # SSL verification
Programmatic Configuration¶
const ai = new NeuroLink({
providers: [
{
name: "openai-compatible",
config: {
baseUrl: process.env.OPENAI_COMPATIBLE_BASE_URL,
apiKey: process.env.OPENAI_COMPATIBLE_API_KEY,
defaultModel: "anthropic/claude-3.5-sonnet",
timeout: 60000,
headers: {
"HTTP-Referer": "https://yourapp.com",
"X-Title": "YourApp",
},
},
},
],
});
Use Cases¶
1. Multi-Provider Access via OpenRouter¶
// Access multiple providers through one endpoint
const providers = {
claude: "anthropic/claude-3.5-sonnet",
gpt4: "openai/gpt-4-turbo",
gemini: "google/gemini-pro-1.5",
llama: "meta-llama/llama-3-70b-instruct",
};
for (const [name, model] of Object.entries(providers)) {
const result = await ai.generate({
input: { text: "Explain quantum computing in one sentence" },
provider: "openai-compatible",
model,
});
console.log(`${name}: ${result.content}`);
}
2. Self-Hosted Private Models¶
// Complete privacy with local vLLM
const privateAI = new NeuroLink({
providers: [
{
name: "openai-compatible",
config: {
baseUrl: "http://localhost:8000/v1",
apiKey: "none",
},
},
],
});
// Process sensitive data locally
const result = await privateAI.generate({
input: { text: sensitiveData },
provider: "openai-compatible",
});
// Data never leaves your infrastructure
3. Cost Optimization¶
// Compare costs across providers via OpenRouter
async function generateCheapest(prompt: string) {
const models = [
{
name: "llama-3-8b",
model: "meta-llama/llama-3-8b-instruct",
costPer1M: 0.2,
},
{
name: "mistral-7b",
model: "mistralai/mistral-7b-instruct",
costPer1M: 0.15,
},
{ name: "gemma-7b", model: "google/gemma-7b-it", costPer1M: 0.1 },
];
// Sort by cost
models.sort((a, b) => a.costPer1M - b.costPer1M);
// Try cheapest first
for (const { model } of models) {
try {
return await ai.generate({
input: { text: prompt },
provider: "openai-compatible",
model,
});
} catch (error) {
continue; // Try next model
}
}
throw new Error("All candidate models failed");
}
Troubleshooting¶
Common Issues¶
1. "Connection refused"¶
Problem: Endpoint is not accessible.
Solution:
# Test endpoint manually (local development)
curl http://localhost:8000/v1/models
# Test endpoint manually (production - always use HTTPS)
curl https://your-production-endpoint.com/v1/models
# Check if server is running
ps aux | grep vllm
# Verify firewall allows connection
telnet localhost 8000
2. "Model not found"¶
Problem: Model ID is incorrect or not available.
Solution:
# List available models first
npx @juspay/neurolink models --provider openai-compatible
# Use exact model ID from list
npx @juspay/neurolink gen "test" \
--provider openai-compatible \
--model "exact-model-id-from-list"
3. "Invalid API key"¶
Problem: API key format is incorrect (OpenRouter).
Solution:
# OpenRouter keys start with sk-or-v1-
OPENAI_COMPATIBLE_API_KEY=sk-or-v1-your-key # ✅ Correct
# For local servers, use 'none' or empty string
OPENAI_COMPATIBLE_API_KEY=none # ✅ For vLLM
Best Practices¶
1. Model Discovery¶
// ✅ Good: Auto-discover models on startup
const models = await ai.listModels("openai-compatible");
console.log(
"Available models:",
models.map((m) => m.id),
);
// Cache model list
const modelCache = new Map();
modelCache.set("openai-compatible", models);
2. Endpoint Health Checks¶
// ✅ Good: Verify endpoint before use
async function healthCheck() {
try {
const models = await ai.listModels("openai-compatible");
return models.length > 0;
} catch (error) {
return false;
}
}
if (await healthCheck()) {
// Use provider
} else {
// Fall back to alternative
}
3. Cost Tracking¶
// ✅ Good: Track usage with OpenRouter
const result = await ai.generate({
input: { text: prompt },
provider: "openai-compatible",
enableAnalytics: true,
});
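// costTracker below is an illustrative placeholder for your own usage-tracking store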
await costTracker.record({
provider: "openrouter",
model: result.model,
tokens: result.usage.totalTokens,
cost: result.cost,
});
Related Documentation¶
- Provider Setup Guide - General provider configuration
- Cost Optimization - Reduce AI costs
- Enterprise Multi-Region - Self-hosted and vLLM deployment
Additional Resources¶
- OpenRouter - Multi-provider aggregator
- vLLM Documentation - Self-hosted inference
- LocalAI - Local OpenAI alternative
- OpenAI API Spec - API standard
Need Help? Join our GitHub Discussions or open an issue.