Large Language Models
Fast LLM Inference
Fast LLM Inference is a large language model capability available through Groq on Aweb. It provides ultra-low-latency LLM inference, optimized for speed on dedicated hardware. Access it through a single unified API with automatic failover and intelligent routing.
Best for
Highest quality: Groq (Premium tier)
Most affordable: Groq (Economy tier)
Providers (1)
Public discovery and orchestration
Inspect the live capability descriptor directly, then route orchestration through a capability filter. Generic public execute examples are intentionally withheld until the canonical public execute contract is normalized.
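The discovery step can also be sketched without the SDK. The following is a minimal illustration, assuming the descriptor endpoint returns JSON in which the provider list sits at `runtime.providers`, either at the top level or under `data` (the path the SDK example on this page reads). The helper names `extractProviders` and `fetchCapability` are hypothetical, and the element type of the provider list is not documented here, so it is kept as `unknown`.

```typescript
// Hypothetical helpers. The descriptor shape is an assumption based on the
// `capability.data.runtime.providers` path used in the SDK example.
type Json = Record<string, unknown>;

function isObject(value: unknown): value is Json {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Returns the provider list, or an empty array if the expected path is absent.
// Accepts both `{ data: { runtime } }` and a bare `{ runtime }` body, because
// the raw HTTP envelope is not documented on this page.
function extractProviders(body: unknown): unknown[] {
  if (!isObject(body)) return [];
  const root = isObject(body.data) ? body.data : body;
  const runtime = root.runtime;
  if (!isObject(runtime)) return [];
  return Array.isArray(runtime.providers) ? runtime.providers : [];
}

// Requests the public descriptor from the same URL as the cURL example.
async function fetchCapability(id: string): Promise<unknown> {
  const url = `https://aweblabs.ai/api/v2/capabilities/${encodeURIComponent(id)}`;
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Descriptor request failed: HTTP ${response.status}`);
  }
  return response.json();
}
```

If the response matches the assumed shape, `extractProviders(await fetchCapability('llm.fast-inference'))` would yield the provider entries; otherwise it falls back to an empty array rather than throwing.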
cURL
curl "https://aweblabs.ai/api/v2/capabilities/llm.fast-inference"
TypeScript
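import Aweb from '@aweb/sdk';
// Public discovery: no API key is passed in this example.
const client = new Aweb({
  baseUrl: 'https://aweblabs.ai/api/v2',
});
// Fetch the live capability descriptor and print its provider list.
const capability = await client.capabilities.get('llm.fast-inference');
console.log(capability.data.runtime.providers);
Orchestration pipeline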
import Aweb from '@aweb/sdk';
// This example passes an API key read from the environment.
const aweb = new Aweb({ apiKey: process.env.AWEB_API_KEY });
// The `capabilities` array acts as the capability filter for routing.
const result = await aweb.orchestrate.run({
  query: 'Use Fast LLM Inference to help with a hello-world task and summarize the output',
  capabilities: ['llm.fast-inference'],
  policy: 'balanced',
});
console.log(result.data.status);
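One way to combine discovery and orchestration is to confirm that the capability lists at least one provider before submitting a request. The sketch below is illustrative only: `runIfAvailable` is a hypothetical helper, the two functions it receives stand in for `client.capabilities.get` and `aweb.orchestrate.run` from the examples above, and the result shapes are assumed from the fields those examples read rather than taken from a published contract.

```typescript
// Shapes assumed from the fields the examples on this page read.
interface CapabilityResult {
  data: { runtime: { providers: unknown[] } };
}
interface OrchestrateRequest {
  query: string;
  capabilities: string[];
  policy: string;
}
interface OrchestrateResult {
  data: { status: string };
}

// Hypothetical helper: looks up the capability, then orchestrates only if at
// least one provider is listed. Both dependencies are injected so the caller
// can supply the SDK client, or a stub when testing.
async function runIfAvailable(
  getCapability: (id: string) => Promise<CapabilityResult>,
  orchestrate: (request: OrchestrateRequest) => Promise<OrchestrateResult>,
  capabilityId: string,
  query: string,
): Promise<string> {
  const capability = await getCapability(capabilityId);
  if (capability.data.runtime.providers.length === 0) {
    return 'unavailable'; // local sentinel, not a status defined by the API
  }
  const result = await orchestrate({
    query,
    capabilities: [capabilityId], // the capability filter
    policy: 'balanced', // the only policy value shown on this page
  });
  return result.data.status;
}
```

With the SDK, a call might look like `runIfAvailable((id) => client.capabilities.get(id), (req) => aweb.orchestrate.run(req), 'llm.fast-inference', 'Summarize this text')`, assuming the SDK's return types are compatible with the interfaces above.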