NeoGraph 0.10.0
A C++17 Graph Agent Engine Library — LangGraph for C++
Provider wrapper that emits OpenInference LLM spans.
#include <openinference.h>
Public Member Functions
| ChatCompletion | complete (const CompletionParams &params) override |
| Perform a synchronous LLM completion. | |
| asio::awaitable< ChatCompletion > | complete_async (const CompletionParams &params) override |
| Perform an LLM completion as a coroutine. | |
| ChatCompletion | complete_stream (const CompletionParams &params, const StreamCallback &on_chunk) override |
| Perform a streaming LLM completion. | |
| asio::awaitable< ChatCompletion > | complete_stream_async (const CompletionParams &params, const StreamCallback &on_chunk) override |
| Async streaming completion. | |
| std::string | get_name () const override |
| Get the provider name (e.g., "openai", "claude"). | |
| asio::awaitable< ChatCompletion > | invoke (const CompletionParams &params, StreamCallback on_chunk) override |
| v1.0 single-dispatch override (Candidate 6 PR6). | |
| OpenInferenceProvider (std::shared_ptr< Provider > inner, Tracer &tracer, std::function< Span *()> parent_lookup=nullptr, std::string span_name="llm.complete") | |
Provider wrapper that emits OpenInference LLM spans.
Pass-through to the inner Provider for all four virtual paths (complete / complete_async / complete_stream / complete_stream_async). Each call opens an llm.complete child span, attaches the OpenInference LLM-kind attribute set, runs the inner call, records the response and usage on the span, and ends it. The streaming overloads additionally append each token to output.value and emit an llm.token span event per chunk, so Phoenix's timeline view shows stream cadence.
Tracing failures are caught and swallowed — observability must never break the LLM call.
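The wrap-and-swallow behavior described above can be pictured with a minimal, self-contained sketch. MiniProvider, EchoProvider, MiniTracer, and TracedProvider are hypothetical stand-ins invented for this illustration, not NeoGraph types; the real wrapper records OpenInference attributes through Tracer and Span, and its internals are not shown on this page.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not NeoGraph types.
struct MiniProvider {
    virtual ~MiniProvider() = default;
    virtual std::string complete(const std::string& prompt) = 0;
};

struct EchoProvider : MiniProvider {
    std::string complete(const std::string& prompt) override { return "echo: " + prompt; }
};

// A tracer whose recording step can be made to fail on demand.
struct MiniTracer {
    bool fail = false;
    std::vector<std::string> spans;
    void record(const std::string& name, const std::string& output) {
        if (fail) throw std::runtime_error("exporter unavailable");
        spans.push_back(name + " -> " + output);
    }
};

class TracedProvider : public MiniProvider {
public:
    TracedProvider(std::shared_ptr<MiniProvider> inner, MiniTracer& tracer,
                   std::string span_name = "llm.complete")
        : inner_(std::move(inner)), tracer_(tracer), span_name_(std::move(span_name)) {}

    std::string complete(const std::string& prompt) override {
        // The inner call is not guarded: provider errors still reach the caller.
        std::string out = inner_->complete(prompt);
        try {
            tracer_.record(span_name_, out);
        } catch (...) {
            // A tracing failure is swallowed; the completion is returned unchanged.
        }
        return out;
    }

private:
    std::shared_ptr<MiniProvider> inner_;
    MiniTracer& tracer_;  // must outlive this wrapper
    std::string span_name_;
};
```

Only the tracing step sits inside the try block, so a failing exporter cannot change the result while genuine provider errors still propagate.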
Definition at line 147 of file openinference.h.
neograph::observability::OpenInferenceProvider::OpenInferenceProvider (
    std::shared_ptr< Provider > inner,
    Tracer & tracer,
    std::function< Span *()> parent_lookup = nullptr,
    std::string span_name = "llm.complete" )
| inner | Provider to delegate to. |
| tracer | Tracer for span emission. Must outlive *this. |
| parent_lookup | Optional callback returning the current node span the LLM call should nest under (e.g. bound to an OpenInferenceTracerSession::current_parent). May be null — null parent opens the LLM span as a root. |
| span_name | Span name for each LLM call (default "llm.complete"). |
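The parent_lookup parameter is a std::function defaulted to nullptr, so a wrapper has to check it before calling it. The sketch below shows one way that check might look; MiniSpan, resolve_parent, and describe_parent are hypothetical names for this illustration, and treating a null return value the same as a missing callback is an assumption rather than documented behavior.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for a span, not the NeoGraph Span type.
struct MiniSpan {
    std::string name;
};

// Resolve the parent for a new LLM span. A std::function built from nullptr is
// empty, and invoking an empty std::function throws std::bad_function_call,
// so the callback is tested before it is called.
inline MiniSpan* resolve_parent(const std::function<MiniSpan*()>& parent_lookup) {
    if (!parent_lookup) return nullptr;  // no callback: open the span as a root
    return parent_lookup();              // assumed: a null result also means root
}

inline std::string describe_parent(const std::function<MiniSpan*()>& parent_lookup) {
    MiniSpan* parent = resolve_parent(parent_lookup);
    return parent ? "child of " + parent->name : "root";
}
```

Because the lookup is evaluated per call rather than captured once, each LLM call can nest under whichever node span is current at that moment.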
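ChatCompletion neograph::observability::OpenInferenceProvider::complete (const CompletionParams &params)
override virtual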
Perform a synchronous LLM completion.
Default implementation bridges to complete_async() via an internal io_context (see neograph::async::run_sync). Subclasses written against the sync path override this directly; async-native subclasses override complete_async() and inherit this.
| params | Completion parameters including model, messages, and tools. |
Deprecated in favor of invoke(params, on_chunk). New code: co_await provider->invoke(params, nullptr) (async) or neograph::async::run_sync(provider->invoke(params, nullptr)) (sync). The legacy complete() keeps working through the deprecation window and is removed in v1.0.0. See ROADMAP_v1.md Candidate 6.
Reimplemented from neograph::Provider.
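asio::awaitable< ChatCompletion > neograph::observability::OpenInferenceProvider::complete_async (const CompletionParams &params)
override virtual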
Perform an LLM completion as a coroutine.
Returns an asio::awaitable that resolves to the completion response. The awaitable does no I/O on the caller's thread — resume it on an io_context to run the request.
Default implementation delegates to the synchronous complete() (runs on whatever thread resumes the coroutine — caller's I/O loop will block on HTTP). Subclasses that perform async HTTP should override this to co_await non-blocking operations; when they do, complete() transparently bridges via run_sync().
Subclasses MUST override at least one of complete / complete_async. Overriding neither results in infinite mutual recursion when either method is called.
| params | Completion parameters. |
Deprecated in favor of invoke(params, nullptr). v1.0 removes this.
Reimplemented from neograph::Provider.
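The mutual-default arrangement behind that warning can be shown in miniature. This is a sketch under simplifying assumptions: PairBase, SyncNative, and AsyncNative are hypothetical types, and both methods return plain strings instead of an asio::awaitable so the example stays self-contained.

```cpp
#include <string>

// Hypothetical miniature of a base class whose two virtuals default to each
// other; the coroutine machinery of the real Provider is left out.
struct PairBase {
    virtual ~PairBase() = default;
    // The default sync path forwards to the "async" path ...
    virtual std::string complete(const std::string& prompt) { return complete_async(prompt); }
    // ... and the default "async" path forwards back to the sync path.
    virtual std::string complete_async(const std::string& prompt) { return complete(prompt); }
};

// Overriding one side breaks the cycle; the inherited side bridges to it.
struct SyncNative : PairBase {
    std::string complete(const std::string& prompt) override { return "sync:" + prompt; }
};

struct AsyncNative : PairBase {
    std::string complete_async(const std::string& prompt) override { return "async:" + prompt; }
};

// struct Broken : PairBase {};  // overrides neither: any call recurses until the stack overflows
```

Calling complete_async on a SyncNative lands in the overridden complete, and calling complete on an AsyncNative lands in the overridden complete_async, which is why overriding either one is enough.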
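ChatCompletion neograph::observability::OpenInferenceProvider::complete_stream (const CompletionParams &params, const StreamCallback &on_chunk)
override virtual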
Perform a streaming LLM completion.
Calls on_chunk for each token as it arrives, then returns the full assembled completion when done.
Default implementation calls complete and forwards the full assembled message content as a single chunk — sufficient for mocks, unit-test fixtures, and non-streaming-native providers that just want to satisfy the streaming surface. Streaming-native subclasses (OpenAI, schema-driven, etc.) override this to emit tokens incrementally.
| params | Completion parameters. |
| on_chunk | Callback invoked per received token. |
Deprecated in favor of invoke(params, on_chunk) (or neograph::async::run_sync(invoke(params, on_chunk)) for a sync caller). v1.0 removes this.
Reimplemented from neograph::Provider.
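The single-chunk default described for complete_stream can be sketched as follows. StreamBase, CannedProvider, WordStreamer, and the ChunkCallback alias are hypothetical names for this illustration; prompts and results are plain strings rather than CompletionParams and ChatCompletion.

```cpp
#include <functional>
#include <initializer_list>
#include <string>

using ChunkCallback = std::function<void(const std::string&)>;  // hypothetical alias

struct StreamBase {
    virtual ~StreamBase() = default;
    virtual std::string complete(const std::string& prompt) = 0;

    // Default: run the non-streaming call and hand over the whole text as one chunk.
    virtual std::string complete_stream(const std::string& prompt, const ChunkCallback& on_chunk) {
        std::string full = complete(prompt);
        if (on_chunk) on_chunk(full);
        return full;
    }
};

// A mock that only implements the non-streaming call and inherits the default.
struct CannedProvider : StreamBase {
    std::string complete(const std::string&) override { return "hello world"; }
};

// A streaming-native override emits pieces incrementally and still returns
// the assembled text at the end.
struct WordStreamer : CannedProvider {
    std::string complete_stream(const std::string&, const ChunkCallback& on_chunk) override {
        std::string full;
        for (const char* piece : {"hello", " ", "world"}) {
            if (on_chunk) on_chunk(piece);
            full += piece;
        }
        return full;
    }
};
```

Both variants return the same assembled string; they differ only in how many times the callback fires (once for the default, three times for the hypothetical WordStreamer).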
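asio::awaitable< ChatCompletion > neograph::observability::OpenInferenceProvider::complete_stream_async (const CompletionParams &params, const StreamCallback &on_chunk)
override virtual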
Async streaming completion.
Awaitable peer of complete_stream.
Default implementation (post-#4): spawns a dedicated worker thread that runs the synchronous complete_stream, dispatches each token onto the awaiting coroutine's executor (so the user's on_chunk runs single-threaded with the awaiter — no reentrancy), and resumes the coroutine via a one-shot steady_timer.cancel() posted on the awaiter's executor when streaming finishes. Subclasses with a fully async streaming transport (WebSocket Responses, native SSE coroutine, etc.) SHOULD override this to drop the worker thread and stream tokens straight onto the coroutine's executor.
The pre-#4 default ran co_return complete_stream(...) inline, which blocked the awaiting executor for the whole stream and, when complete_stream itself called run_sync(), nested two io_contexts on the same thread, racing on shared provider state. The current default avoids both: the executor stays responsive, and complete_stream runs on its own thread with no implicit io_context reentry. SchemaProvider overrides the WebSocket Responses branch to skip even the worker thread (it's already async-native).
As with the complete / complete_async pair above, subclasses MUST override at least one of complete_stream / complete_stream_async. Subclasses whose native sync complete_stream itself drives a run_sync() on an internal io_context (the WebSocket Responses path in SchemaProvider is the canonical example) MUST override complete_stream_async directly to expose the async-native peer; relying on the default bridge is functional but spawns an extra worker thread per call. Subclasses whose complete_stream is purely synchronous (e.g. blocking httplib) can leave the default bridge in place: it routes the sync work onto a worker thread and dispatches tokens back onto the awaiter's executor.
asio::io_context.run() placement for the awaiter (issue #16): drive the outer asio::io_context.run() from your application's main thread or a long-lived worker thread. Nesting io.run() inside an HTTP server per-request callback (e.g. httplib's set_chunked_content_provider lambda) has been observed to SEGV in getaddrinfo under the per-request worker-thread spawn this default bridge issues, on some glibc/OpenSSL combinations. See docs/concepts.md §8 for tested-good shapes and the two recommended workarounds.
| params | Completion parameters. |
| on_chunk | Callback invoked per received token (runs on the awaiting coroutine's executor — never on the internal worker thread). |
Deprecated in favor of invoke(params, on_chunk). v1.0 removes this.
Reimplemented from neograph::Provider.
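The thread handoff in the default bridge (sync streaming on a worker, callbacks delivered on the awaiter's side) can be approximated without asio. The sketch below is not the library's implementation: blocking_stream and bridged_stream are hypothetical functions, a mutex-protected queue stands in for dispatching onto an executor, and the calling thread blocks on a condition variable where the real bridge suspends a coroutine and leaves the executor free. Exception handling is omitted.

```cpp
#include <condition_variable>
#include <functional>
#include <initializer_list>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

using ChunkCallback = std::function<void(const std::string&)>;  // hypothetical alias

// Hypothetical blocking streamer standing in for a synchronous complete_stream.
inline std::string blocking_stream(const ChunkCallback& on_chunk) {
    std::string full;
    for (const char* tok : {"a", "b", "c"}) {
        on_chunk(tok);
        full += tok;
    }
    return full;
}

// Runs blocking_stream on a worker thread and replays each token on the
// calling thread, so the user's callback never executes on the worker.
inline std::string bridged_stream(const ChunkCallback& on_chunk) {
    std::mutex m;
    std::condition_variable cv;
    std::queue<std::string> pending;
    bool done = false;
    std::string full;

    std::thread worker([&] {
        std::string result = blocking_stream([&](const std::string& tok) {
            { std::lock_guard<std::mutex> lock(m); pending.push(tok); }
            cv.notify_one();
        });
        { std::lock_guard<std::mutex> lock(m); full = std::move(result); done = true; }
        cv.notify_one();
    });

    for (;;) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return done || !pending.empty(); });
        if (pending.empty()) break;  // finished and fully drained
        std::string tok = std::move(pending.front());
        pending.pop();
        lock.unlock();
        on_chunk(tok);  // delivered on the caller's thread
    }
    worker.join();
    return full;
}
```

Tokens arrive in order because a single worker pushes them into a FIFO queue, and the final result is read only after the worker has been joined.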
std::string neograph::observability::OpenInferenceProvider::get_name () const
override virtual
Get the provider name (e.g., "openai", "claude").
Opaque debug identifier, not a typed dispatch key. Different subclasses pick different conventions:
OpenAIProvider always returns "openai".
SchemaProvider returns whatever's in the schema's name field: could be "openai", "claude", "openai-responses", "gemini", or a user-defined schema id.
RateLimitedProvider delegates to its inner provider.
Code branching on the exact string (e.g. if (get_name() == "openai-responses")) is brittle: a custom schema named "openai-responses-v2" slips through, or a future rename silently breaks the branch. Use it for logging, telemetry, or version-pinning diagnostics. For typed behaviour, add a typed ProviderKind accessor or branch on the schema's actual fields.
Implements neograph::Provider.
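The difference between branching on the name string and branching on a typed key can be illustrated with a small sketch. The ProviderKind enum, its enumerators, and the NamedProvider struct are hypothetical; the text above only suggests adding such an accessor and does not define one.

```cpp
#include <string>

// Hypothetical typed key; not an existing NeoGraph type.
enum class ProviderKind { OpenAIChat, OpenAIResponses, Other };

struct NamedProvider {
    std::string name;   // free-form debug label, e.g. a user-defined schema id
    ProviderKind kind;  // typed dispatch key
};

// Brittle: depends on the exact spelling of a debug string.
inline bool uses_responses_api_by_name(const NamedProvider& p) {
    return p.name == "openai-responses";
}

// Sturdier: depends on a typed field that survives renames.
inline bool uses_responses_api_by_kind(const NamedProvider& p) {
    return p.kind == ProviderKind::OpenAIResponses;
}
```

A provider labelled "openai-responses-v2" is missed by the string comparison but still matched by the typed check, which is the failure mode described above.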
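asio::awaitable< ChatCompletion > neograph::observability::OpenInferenceProvider::invoke (const CompletionParams &params, StreamCallback on_chunk)
override virtual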
v1.0 single-dispatch override (Candidate 6 PR6).
Anchors invoke() so engine provider->invoke(...) calls land here directly and emit one span, instead of the base default re-forwarding to the 4-virtual chain (which would still emit a span via the 4-virtual override but go through one extra vtable hop). v1.0 collapses the 4 overrides' span-recording bodies into invoke().
Reimplemented from neograph::Provider.