NeoGraph 0.10.0
A C++17 Graph Agent Engine Library — LangGraph for C++
Abstract base class for LLM providers.
#include <provider.h>
Public Member Functions | |
| virtual ChatCompletion | complete (const CompletionParams &params) |
| Perform a synchronous LLM completion. | |
| virtual asio::awaitable< ChatCompletion > | complete_async (const CompletionParams &params) |
| Perform an LLM completion as a coroutine. | |
| virtual ChatCompletion | complete_stream (const CompletionParams &params, const StreamCallback &on_chunk) |
| Perform a streaming LLM completion. | |
| virtual asio::awaitable< ChatCompletion > | complete_stream_async (const CompletionParams &params, const StreamCallback &on_chunk) |
| Async streaming completion. | |
| virtual std::string | get_name () const =0 |
| Get the provider name (e.g., "openai", "claude"). | |
| virtual asio::awaitable< ChatCompletion > | invoke (const CompletionParams &params, StreamCallback on_chunk=nullptr) |
| Single-dispatch async-streaming completion (v1.0 canonical). | |
Abstract base class for LLM providers.
Subclass this to integrate any LLM backend (OpenAI, Claude, Gemini, etc.). Both synchronous and streaming completion must be implemented.
Definition at line 127 of file provider.h.
|
virtual |
Perform a synchronous LLM completion.
Default implementation bridges to complete_async() via an internal io_context (see neograph::async::run_sync). Subclasses written against the sync path override this directly; async-native subclasses override complete_async() and inherit this.
| params | Completion parameters including model, messages, and tools. |
Deprecated: use invoke(params, on_chunk). New code: co_await provider->invoke(params, nullptr) (async) or neograph::async::run_sync(provider->invoke(params, nullptr)) (sync). The legacy complete() keeps working through the deprecation window and is removed in v1.0.0. See ROADMAP_v1.md Candidate 6.
Reimplemented in neograph::observability::OpenInferenceProvider.
|
virtual |
Perform an LLM completion as a coroutine.
Returns an asio::awaitable that resolves to the completion response. The awaitable does no I/O on the caller's thread — resume it on an io_context to run the request.
Default implementation delegates to the synchronous complete() (runs on whatever thread resumes the coroutine — caller's I/O loop will block on HTTP). Subclasses that perform async HTTP should override this to co_await non-blocking operations; when they do, complete() transparently bridges via run_sync().
Override contract: subclasses MUST override at least one of complete / complete_async. Each default delegates to the other, so overriding neither results in infinite mutual recursion when the method is called.
| params | Completion parameters. |
Deprecated: use invoke(params, nullptr). v1.0 removes this.
Reimplemented in neograph::llm::OpenAIProvider, neograph::llm::RateLimitedProvider, neograph::llm::SchemaProvider, and neograph::observability::OpenInferenceProvider.
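The mutual delegation between complete and complete_async can be illustrated with a small self-contained model. This is a sketch only, not the library's implementation: ProviderModel, SyncEcho, AsyncEcho, demo_pair and the Lazy alias are hypothetical names, the two structs are stand-ins for the real types in provider.h, and a lazy thunk stands in for asio::awaitable so the sketch builds without Asio.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Stand-in types for illustration; the real ones in provider.h carry more fields.
struct CompletionParams { std::string prompt; };
struct ChatCompletion { std::string content; };

// Assumption: a lazy thunk models asio::awaitable<T>. Nothing runs until it is called.
template <class T>
using Lazy = std::function<T()>;

// Simplified model of the complete / complete_async pair and its defaults.
class ProviderModel {
public:
    virtual ~ProviderModel() = default;

    // Default sync path: bridge to the async path and run it to completion.
    virtual ChatCompletion complete(const CompletionParams& params) {
        return complete_async(params)();
    }

    // Default async path: delegate to the sync path once the thunk is run.
    virtual Lazy<ChatCompletion> complete_async(const CompletionParams& params) {
        return [this, params] { return complete(params); };
    }
};

// Sync-native subclass: overrides complete(), inherits complete_async().
class SyncEcho : public ProviderModel {
public:
    ChatCompletion complete(const CompletionParams& params) override {
        return ChatCompletion{"sync: " + params.prompt};
    }
};

// Async-native subclass: overrides complete_async(), inherits complete().
class AsyncEcho : public ProviderModel {
public:
    Lazy<ChatCompletion> complete_async(const CompletionParams& params) override {
        return [params] { return ChatCompletion{"async: " + params.prompt}; };
    }
};

// Either entry point works on either subclass because one side is overridden.
inline void demo_pair() {
    CompletionParams p{"hi"};
    SyncEcho s;
    AsyncEcho a;
    assert(s.complete_async(p)().content == "sync: hi");
    assert(a.complete(p).content == "async: hi");
}
```

Calling either method on a bare ProviderModel in this model would bounce between the two defaults until the stack overflows, which is the failure the override contract guards against.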
|
virtual |
Perform a streaming LLM completion.
Calls on_chunk for each token as it arrives, then returns the full assembled completion when done.
Default implementation calls complete and forwards the full assembled message content as a single chunk — sufficient for mocks, unit-test fixtures, and non-streaming-native providers that just want to satisfy the streaming surface. Streaming-native subclasses (OpenAI, schema-driven, etc.) override this to emit tokens incrementally.
| params | Completion parameters. |
| on_chunk | Callback invoked per received token. |
Deprecated: use invoke(params, on_chunk) (or neograph::async::run_sync(invoke(params, on_chunk)) for a sync caller). v1.0 removes this.
Reimplemented in neograph::llm::OpenAIProvider, neograph::llm::RateLimitedProvider, neograph::llm::SchemaProvider, and neograph::observability::OpenInferenceProvider.
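The single-chunk fallback described for the default complete_stream might look roughly like the sketch below. It is a simplified model under stated assumptions: ProviderModel, MockProvider and collect_chunks are hypothetical names, the StreamCallback signature (one string per chunk) is assumed rather than taken from provider.h, and complete is made pure virtual here only to keep the model short.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Stand-in types for illustration; the real ones live in provider.h.
struct CompletionParams { std::string prompt; };
struct ChatCompletion { std::string content; };

// Assumption: the callback receives one chunk of text per call.
using StreamCallback = std::function<void(const std::string&)>;

class ProviderModel {
public:
    virtual ~ProviderModel() = default;

    virtual ChatCompletion complete(const CompletionParams& params) = 0;

    // Default streaming path: run the non-streaming completion, then
    // forward the whole assembled content as a single chunk.
    virtual ChatCompletion complete_stream(const CompletionParams& params,
                                           const StreamCallback& on_chunk) {
        ChatCompletion result = complete(params);
        if (on_chunk) {
            on_chunk(result.content);
        }
        return result;
    }
};

// A mock that only implements complete(); it inherits the streaming default.
class MockProvider : public ProviderModel {
public:
    ChatCompletion complete(const CompletionParams&) override {
        return ChatCompletion{"hello world"};
    }
};

// Helper that records every chunk the provider emits.
inline std::vector<std::string> collect_chunks(ProviderModel& provider,
                                               const CompletionParams& params) {
    std::vector<std::string> chunks;
    provider.complete_stream(params, [&chunks](const std::string& chunk) {
        chunks.push_back(chunk);
    });
    return chunks;
}
```

With this default a test fixture sees exactly one callback per request, so code that assumes token-sized chunks should be exercised against a streaming-native provider as well.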
|
virtual |
Async streaming completion.
Awaitable peer of complete_stream.
Default implementation (post-#4): spawns a dedicated worker thread that runs the synchronous complete_stream, dispatches each token onto the awaiting coroutine's executor (so the user's on_chunk runs single-threaded with the awaiter — no reentrancy), and resumes the coroutine via a one-shot steady_timer.cancel() posted on the awaiter's executor when streaming finishes. Subclasses with a fully async streaming transport (WebSocket Responses, native SSE coroutine, etc.) SHOULD override this to drop the worker thread and stream tokens straight onto the coroutine's executor.
Note: the earlier default ran co_return complete_stream(...) inline, which blocked the awaiting executor for the whole stream and, when complete_stream itself called run_sync(), nested two io_contexts on the same thread, racing on shared provider state. The current default avoids both: the executor stays responsive, and complete_stream runs on its own thread with no implicit io_context reentry. SchemaProvider overrides the WebSocket Responses branch to skip even the worker thread (it's already async-native).
Override contract (same as the complete / complete_async pair above): subclasses MUST override at least one of complete_stream / complete_stream_async. Subclasses whose native sync complete_stream itself drives a run_sync() on an internal io_context (the WebSocket Responses path in SchemaProvider is the canonical example) MUST override complete_stream_async directly to expose the async-native peer; relying on the default bridge is functional but spawns an extra worker thread per call. Subclasses whose complete_stream is purely synchronous (e.g. blocking httplib) can leave the default bridge in place: it routes the sync work onto a worker thread and dispatches tokens back onto the awaiter's executor.
asio::io_context.run() placement for the awaiter (issue #16): drive the outer asio::io_context.run() from your application's main thread or a long-lived worker thread. Nesting io.run() inside an HTTP server per-request callback (e.g. httplib's set_chunked_content_provider lambda) has been observed to SEGV in getaddrinfo under the per-request worker-thread spawn this default bridge issues, on some glibc/OpenSSL combinations. See docs/concepts.md §8 for tested-good shapes and the two recommended workarounds.
| params | Completion parameters. |
| on_chunk | Callback invoked per received token (runs on the awaiting coroutine's executor — never on the internal worker thread). |
Deprecated: use invoke(params, on_chunk). v1.0 removes this.
Reimplemented in neograph::llm::SchemaProvider and neograph::observability::OpenInferenceProvider.
|
pure virtual |
Get the provider name (e.g., "openai", "claude").
Opaque debug identifier, not a typed dispatch key. Different subclasses pick different conventions:
- OpenAIProvider always returns "openai".
- SchemaProvider returns whatever's in the schema's name field: could be "openai", "claude", "openai-responses", "gemini", or a user-defined schema id.
- RateLimitedProvider delegates to its inner provider.
Code branching on the exact string (e.g. if (get_name() == "openai-responses")) is brittle: a custom schema named "openai-responses-v2" slips through, or a future rename silently breaks the branch. Use it for logging, telemetry, or version-pinning diagnostics. For typed behaviour, add a typed ProviderKind accessor or branch on the schema's actual fields.
Implemented in neograph::llm::OpenAIProvider, neograph::llm::RateLimitedProvider, neograph::llm::SchemaProvider, and neograph::observability::OpenInferenceProvider.
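To make the brittleness of string branching concrete, here is a minimal sketch contrasting a name comparison with a typed accessor. Everything in it is hypothetical: provider.h does not define ProviderKind, and ProviderInfo and the two helper functions are illustrative stand-ins for whatever a typed accessor would actually look like.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum; not part of provider.h.
enum class ProviderKind { OpenAIChat, OpenAIResponses, Claude, Gemini, Custom };

// Stand-in for a provider that exposes both a debug name and a typed kind.
struct ProviderInfo {
    std::string name;   // what get_name() would return
    ProviderKind kind;  // what a hypothetical typed accessor would return
};

// Brittle: matches only the exact string.
inline bool uses_responses_api_by_name(const ProviderInfo& p) {
    return p.name == "openai-responses";
}

// Robust: independent of how the schema happens to be named.
inline bool uses_responses_api_by_kind(const ProviderInfo& p) {
    return p.kind == ProviderKind::OpenAIResponses;
}

// A custom schema that speaks the Responses API under a different name
// is missed by the string check but caught by the typed check.
inline void demo_kind() {
    ProviderInfo custom{"openai-responses-v2", ProviderKind::OpenAIResponses};
    assert(!uses_responses_api_by_name(custom));
    assert(uses_responses_api_by_kind(custom));
}
```

The name stays useful for logs and telemetry in this arrangement; only the dispatch decision moves to the typed field.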
|
virtual |
Single-dispatch async-streaming completion (v1.0 canonical).
The unified entry point that future v1.0+ Provider subclasses override. Replaces the current (sync/async) × (stream/non-stream) 4-virtual cross-product (complete / complete_async / complete_stream / complete_stream_async) with one async-streaming-superset method. ROADMAP_v1.md Candidate 6.
Semantics:
- on_chunk == nullptr → caller wants the full assembled completion only; provider may skip streaming framing.
- on_chunk != nullptr → caller wants tokens incrementally; provider invokes on_chunk on each chunk AND returns the full assembled completion when the stream ends.
The returned awaitable resolves on the caller's executor; the on_chunk callback runs there too (single-threaded with the awaiter, same invariant complete_stream_async already provides).
Deprecation strategy (Candidate 6 PR sequence):
1. invoke() lands as a new virtual; default forwards to the legacy 4-virtual chain (complete_stream_async when on_chunk is set, complete_async otherwise) so every existing Provider subclass keeps working unchanged.
2. The built-in providers (OpenAIProvider, SchemaProvider, RateLimitedProvider) override invoke() directly; their old 4 overrides become thin adapters.
3. The 4 legacy virtuals gain [[deprecated]] markers.
4. In v1.0 the legacy virtuals are removed and invoke() becomes the only Provider virtual besides get_name().
Override contract: subclasses overriding invoke() MUST NOT call any of the 4 legacy complete_* methods on themselves. Those default to forwarding to invoke() (in a future PR), which would recurse infinitely. New code: override invoke() and only invoke(). Old code: keep overriding the 4 legacy methods; the default invoke() here delegates to them.
| params | Completion parameters (model, messages, tools, ...). |
| on_chunk | Optional per-chunk callback. nullptr for non-streaming use. |
Reimplemented in neograph::llm::OpenAIProvider, neograph::llm::RateLimitedProvider, neograph::llm::SchemaProvider, and neograph::observability::OpenInferenceProvider.
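The default dispatch of invoke() onto the legacy methods can be sketched with a synchronous model. This is an illustration under assumptions, not the library's code: ProviderModel and LegacyProvider are hypothetical names, plain return values stand in for asio::awaitable, the StreamCallback signature is assumed, and the word-by-word streaming is invented purely to make the chunks observable.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Stand-in types for illustration; the real ones live in provider.h.
struct CompletionParams { std::string prompt; };
struct ChatCompletion { std::string content; };

// Assumption: the callback receives one chunk of text per call.
using StreamCallback = std::function<void(const std::string&)>;

class ProviderModel {
public:
    virtual ~ProviderModel() = default;

    // Legacy entry points (modelled synchronously to avoid an Asio dependency).
    virtual ChatCompletion complete_async(const CompletionParams& params) = 0;
    virtual ChatCompletion complete_stream_async(const CompletionParams& params,
                                                 const StreamCallback& on_chunk) = 0;

    // Default invoke(): pick the legacy method based on whether a callback is set.
    virtual ChatCompletion invoke(const CompletionParams& params,
                                  StreamCallback on_chunk = nullptr) {
        if (on_chunk) {
            return complete_stream_async(params, on_chunk);
        }
        return complete_async(params);
    }
};

// "Old code": overrides only the legacy methods and inherits invoke().
class LegacyProvider : public ProviderModel {
public:
    ChatCompletion complete_async(const CompletionParams& params) override {
        return ChatCompletion{params.prompt};
    }

    // Emits the prompt back one whitespace-separated word at a time.
    ChatCompletion complete_stream_async(const CompletionParams& params,
                                         const StreamCallback& on_chunk) override {
        std::istringstream words(params.prompt);
        std::string word;
        std::string assembled;
        while (words >> word) {
            if (!assembled.empty()) {
                assembled += ' ';
            }
            assembled += word;
            on_chunk(word);
        }
        return ChatCompletion{assembled};
    }
};
```

A caller that passes no callback gets only the assembled result, while a caller that passes one sees each chunk and still receives the full completion at the end, matching the two cases listed under Semantics.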