
POST /api/ai/chat (Streaming)

Real-time streaming chat endpoint for interactive AI applications.

Endpoint

POST https://regpilot.dev/api/ai/chat

Authentication

Required header: X-API-Key: sk_your_api_key. See Authentication for details.

Request

Headers

| Header | Required | Description |
|--------|----------|-------------|
| X-API-Key | Yes | Your project API key |
| Content-Type | Yes | Must be application/json |

Body Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| messages | Array | Yes | - | Array of message objects |
| quality | String | No | balanced | cheap, balanced, or frontier |
| temperature | Number | No | 0.7 | Randomness (0-2) |
| model | String | No | Auto | Specific model override |
| governorMetadata | Object | No | - | Compliance validation metadata |

Message Object

{
  role: 'user' | 'assistant' | 'system',
  content: string
}

Request Example

{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Explain quantum computing in simple terms."
    }
  ],
  "quality": "balanced",
  "temperature": 0.7
}

Response

Response Headers

| Header | Description |
|--------|-------------|
| Content-Type | text/event-stream |
| x-regpilot-model | Model used (e.g., openai/gpt-4o-mini) |
| x-credits-charged | Credits charged for the request |
| x-credits-remaining | Remaining credit balance |
| x-regpilot-prompt-tokens | Input tokens used |
| x-regpilot-completion-tokens | Output tokens generated |
| x-cache-status | HIT or MISS |

Governor Headers (if enabled)

| Header | Description |
|--------|-------------|
| x-governor-validated | true if validated |
| x-governor-approved | true if approved |
| x-governor-risk-level | low, medium, high, or critical |
| x-governor-risk-score | Risk score (0-100) |
| x-governor-violations | Number of violations found |
| x-governor-audit-id | Audit trail ID |

Response Body

Streaming text response via Server-Sent Events.

Code Examples

const response = await fetch('https://regpilot.dev/api/ai/chat', {
  method: 'POST',
  headers: {
    'X-API-Key': process.env.REGPILOT_API_KEY!,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    messages: [
      { role: 'user', content: 'Write a haiku about coding' }
    ],
    quality: 'balanced'
  })
});

// Stream the response
const reader = response.body!.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  const text = decoder.decode(value, { stream: true });
  process.stdout.write(text);
}

// Check response headers
console.log('Model:', response.headers.get('x-regpilot-model'));
console.log('Credits:', response.headers.get('x-credits-charged'));

Quality Tiers

RegPilot automatically routes your request to the optimal model based on quality:

| Quality | Model | Speed | Cost | Best For |
|---------|-------|-------|------|----------|
| cheap | Claude Haiku | ⚡ Fast | $ | High-volume, simple queries |
| balanced | GPT-4o Mini | ⚡ Fast | $$ | General-purpose applications |
| frontier | GPT-4o | 🔄 Moderate | $$$ | Complex reasoning, code generation |
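If you classify requests client-side, the tier names above can be chosen programmatically. The helper below is an illustrative sketch only (the task categories and heuristic are assumptions, not part of the RegPilot API):

```typescript
// Illustrative helper (not part of the RegPilot API): pick a quality tier
// from a rough task classification. Tier names match the table above.
type Quality = 'cheap' | 'balanced' | 'frontier';

function pickQuality(task: 'faq' | 'general' | 'reasoning'): Quality {
  switch (task) {
    case 'faq':
      return 'cheap';      // high-volume, simple queries
    case 'reasoning':
      return 'frontier';   // complex reasoning, code generation
    default:
      return 'balanced';   // general-purpose applications
  }
}
```

The returned string goes straight into the request body's quality field.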

Governor Integration

Add compliance validation to your requests:
const response = await fetch('https://regpilot.dev/api/ai/chat', {
  method: 'POST',
  headers: {
    'X-API-Key': process.env.REGPILOT_API_KEY!,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    messages: [
      { role: 'user', content: 'Can I sue my employer for discrimination?' }
    ],
    quality: 'balanced',
    governorMetadata: {
      actionType: 'legal_advice',
      recipientCountry: 'US',
      senderId: 'user_12345',
      senderRole: 'customer_support',
      department: 'legal'
    }
  })
});

// Check Governor validation results
const validated = response.headers.get('x-governor-validated');
const approved = response.headers.get('x-governor-approved');
const riskLevel = response.headers.get('x-governor-risk-level');
const riskScore = response.headers.get('x-governor-risk-score');

console.log(`Validated: ${validated}, Approved: ${approved}`);
console.log(`Risk: ${riskLevel} (${riskScore}/100)`);

Governor Metadata Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| actionType | String | Yes | Type of action (see below) |
| recipientCountry | String | No | Recipient's country code |
| recipientUserId | String | No | Recipient user ID |
| recipientEmail | String | No | Recipient email |
| senderId | String | Yes | Sender identifier |
| senderRole | String | No | Sender's role |
| department | String | No | Department name |
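In TypeScript projects, the fields above can be captured as an interface so the compiler enforces the required fields. This shape mirrors the table; the interface itself is an illustrative sketch, not a type shipped by RegPilot:

```typescript
// Illustrative shape for governorMetadata, mirroring the field table above.
// Only actionType and senderId are required; everything else is optional.
interface GovernorMetadata {
  actionType: string;
  senderId: string;
  recipientCountry?: string;
  recipientUserId?: string;
  recipientEmail?: string;
  senderRole?: string;
  department?: string;
}

// Example value matching the Governor Integration request above.
const meta: GovernorMetadata = {
  actionType: 'legal_advice',
  senderId: 'user_12345',
  department: 'legal'
};
```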

Action Types

  • customer_support - General customer service (Low risk)
  • legal_advice - Legal queries (Medium risk)
  • medical_advice - Health/medical queries (Medium risk)
  • hr_message - Human resources communications (Medium risk)
  • suspension - Account actions (High risk)
  • refund_denial - Payment decisions (High risk)
  • policy_warning - Policy enforcement (Medium risk)
  • other - General content (Low risk)
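The default risk level for each action type can be expressed as a lookup table. This map is illustrative only, transcribed from the list above; actual risk scoring happens server-side in the Governor:

```typescript
// Illustrative client-side map of action types to their documented
// default risk levels. The Governor's server-side scoring is authoritative.
const ACTION_RISK: Record<string, 'low' | 'medium' | 'high'> = {
  customer_support: 'low',
  legal_advice: 'medium',
  medical_advice: 'medium',
  hr_message: 'medium',
  suspension: 'high',
  refund_denial: 'high',
  policy_warning: 'medium',
  other: 'low'
};
```

A client could use this, for example, to require human review before sending high-risk actions.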

Multi-turn Conversations

Maintain conversation context by including message history:
const conversationHistory = [
  { role: 'user', content: 'What is machine learning?' },
  { role: 'assistant', content: 'Machine learning is a subset of AI...' },
  { role: 'user', content: 'Can you give me an example?' }
];

const response = await fetch('https://regpilot.dev/api/ai/chat', {
  method: 'POST',
  headers: {
    'X-API-Key': process.env.REGPILOT_API_KEY!,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    messages: conversationHistory,
    quality: 'balanced'
  })
});
Keep conversation history under 4000 tokens for optimal performance. RegPilot automatically handles context window management.

Model Override

Specify a particular model instead of using quality tiers:
const response = await fetch('https://regpilot.dev/api/ai/chat', {
  method: 'POST',
  headers: {
    'X-API-Key': process.env.REGPILOT_API_KEY!,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'Hello!' }],
    model: 'gpt-4o'  // Override quality tier
  })
});

Supported Models

  • OpenAI: gpt-4o, gpt-4o-mini, gpt-4-turbo
  • Anthropic: claude-3-opus-20240229, claude-3-sonnet-20240229, claude-3-haiku-20240307
  • Mistral: mistral-large-latest, mistral-medium-latest

Caching

RegPilot automatically caches responses for identical requests:
// First request - Cache MISS
const response1 = await fetch('https://regpilot.dev/api/ai/chat', {
  method: 'POST',
  headers: {
    'X-API-Key': process.env.REGPILOT_API_KEY!,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'What is 2+2?' }],
    quality: 'cheap'
  })
});
console.log('Cache:', response1.headers.get('x-cache-status')); // MISS

// Second identical request - Cache HIT
const response2 = await fetch('https://regpilot.dev/api/ai/chat', {
  method: 'POST',
  headers: {
    'X-API-Key': process.env.REGPILOT_API_KEY!,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'What is 2+2?' }],
    quality: 'cheap'
  })
});
console.log('Cache:', response2.headers.get('x-cache-status')); // HIT
console.log('Credits:', response2.headers.get('x-credits-charged')); // 0 (free!)
Cache hits are free! You’re only charged for cache misses.

Error Handling

try {
  const response = await fetch('https://regpilot.dev/api/ai/chat', {
    method: 'POST',
    headers: {
      'X-API-Key': process.env.REGPILOT_API_KEY!,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      messages: [{ role: 'user', content: 'Hello!' }],
      quality: 'balanced'
    })
  });

  if (!response.ok) {
    const error = await response.json();
    
    switch (response.status) {
      case 400:
        console.error('Bad request:', error.error);
        break;
      case 401:
        console.error('Invalid API key');
        break;
      case 429: {
        console.error('Rate limit exceeded');
        const retryAfter = response.headers.get('retry-after');
        console.log(`Retry after ${retryAfter} seconds`);
        break;
      }
      case 500:
        console.error('Server error:', error.message);
        break;
      default:
        console.error('Unexpected error:', error);
    }
    return;
  }

  // Process stream
  const reader = response.body!.getReader();
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    process.stdout.write(decoder.decode(value, { stream: true }));
  }

} catch (error) {
  console.error('Network error:', error);
}

Rate Limits

Rate limits vary by plan:

| Plan | Requests/Minute | Burst |
|------|-----------------|-------|
| Free | 10 | 20 |
| Startup | 100 | 200 |
| Growth | 500 | 1,000 |
| Enterprise | Unlimited | Unlimited |
See Rate Limits for details.
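To stay under these limits proactively (rather than reacting to 429s), requests can be spaced client-side. The throttle below is a minimal sketch assuming the Free tier's 10 requests/minute; the server's own rate limiter remains the source of truth:

```typescript
// Illustrative client-side throttle: spaces calls evenly so a steady
// stream of requests stays under the given per-minute limit.
// This is a local convenience only; server-side limits still apply.
function createThrottle(requestsPerMinute: number) {
  const intervalMs = 60_000 / requestsPerMinute;
  let nextSlot = 0;

  return async function waitForSlot(): Promise<void> {
    const now = Date.now();
    const wait = Math.max(0, nextSlot - now);
    nextSlot = Math.max(now, nextSlot) + intervalMs;
    if (wait > 0) await new Promise(resolve => setTimeout(resolve, wait));
  };
}

// Usage: call await waitForSlot() before each fetch to /api/ai/chat.
const waitForSlot = createThrottle(10); // Free tier: 10 requests/minute
```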

Best Practices

1. Stream Processing

Always process streams efficiently:
const reader = response.body!.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  
  buffer += decoder.decode(value, { stream: true });
  
  // Process complete chunks
  const lines = buffer.split('\n');
  buffer = lines.pop() || '';
  
  for (const line of lines) {
    if (line.trim()) {
      console.log(line);
    }
  }
}

2. Error Recovery

Implement retry logic for transient errors:
async function chatWithRetry(messages: Array<{ role: string; content: string }>, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      const response = await fetch('https://regpilot.dev/api/ai/chat', {
        method: 'POST',
        headers: {
          'X-API-Key': process.env.REGPILOT_API_KEY!,
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({ messages, quality: 'balanced' })
      });
      
      if (response.ok) return response;
      
      if (response.status === 429) {
        const retryAfter = parseInt(response.headers.get('retry-after') || '60', 10);
        await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
        continue;
      }
      
      throw new Error(`HTTP ${response.status}`);
    } catch (error) {
      if (i === maxRetries - 1) throw error;
      await new Promise(resolve => setTimeout(resolve, 1000 * Math.pow(2, i)));
    }
  }
}

3. Context Management

Trim old messages to stay within token limits:
function trimConversation(messages: Array<{ role: string; content: string }>, maxTokens = 4000) {
  // Rough estimate: 1 token ≈ 4 characters
  let totalChars = messages.reduce((sum, m) => sum + m.content.length, 0);
  
  while (totalChars > maxTokens * 4 && messages.length > 2) {
    messages.splice(1, 1); // Remove oldest non-system message
    totalChars = messages.reduce((sum, m) => sum + m.content.length, 0);
  }
  
  return messages;
}

Next: Complete (Non-Streaming) →