Accepted
QNTX's embedding engine (ONNX inference + HDBSCAN clustering) was compiled into the main binary via CGO/Rust FFI, gated behind the cgo && rustembeddings build tags. This added build complexity, prevented hot-reload, and locked QNTX to a single hardcoded 384-dimensional model.
The existing provider pattern (ADR-014 for LLM, ADR-015 for search) provides the template: a plugin declares embedding_provider = true during Initialize, and QNTX routes embedding calls to it via gRPC.
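The flag-based routing described above can be sketched roughly as follows. This is a minimal illustration, not QNTX's actual code: `InitializeResponse` is modeled as a plain struct rather than the generated protobuf type, and `EmbeddingBackend`, `ProviderRegistry`, `OnPluginInitialized`, and `fixedBackend` are hypothetical names introduced only for this example.

```go
package main

import (
	"errors"
	"sync"
)

// InitializeResponse is a hypothetical stand-in for the generated protobuf
// message; only the embedding_provider flag is modeled here.
type InitializeResponse struct {
	EmbeddingProvider bool
}

// EmbeddingBackend is a hypothetical handle to whatever serves embeddings.
type EmbeddingBackend interface {
	Embed(text string) ([]float32, error)
}

// ErrNoEmbeddingProvider is returned while no plugin has claimed the role.
var ErrNoEmbeddingProvider = errors.New("no embedding provider registered")

// ProviderRegistry (hypothetical) holds the currently active backend.
type ProviderRegistry struct {
	mu      sync.RWMutex
	backend EmbeddingBackend
}

// OnPluginInitialized inspects the plugin's Initialize response. If the
// plugin declared itself an embedding provider, connect is invoked and the
// result replaces the active backend. Calling it again after a plugin
// restart simply swaps in the fresh backend. It reports whether a backend
// was registered.
func (r *ProviderRegistry) OnPluginInitialized(resp InitializeResponse, connect func() EmbeddingBackend) bool {
	if !resp.EmbeddingProvider {
		return false
	}
	b := connect()
	r.mu.Lock()
	r.backend = b
	r.mu.Unlock()
	return true
}

// Embed routes the call to the active backend, if any.
func (r *ProviderRegistry) Embed(text string) ([]float32, error) {
	r.mu.RLock()
	b := r.backend
	r.mu.RUnlock()
	if b == nil {
		return nil, ErrNoEmbeddingProvider
	}
	return b.Embed(text)
}

// fixedBackend is a fake backend that returns zero vectors of a fixed size.
type fixedBackend struct{ dim int }

func (f fixedBackend) Embed(string) ([]float32, error) {
	return make([]float32, f.dim), nil
}
```

In this sketch the embedding dimension comes entirely from whichever backend is registered, so swapping plugins changes the model without rebuilding the host binary.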
Add embedding_provider to the plugin provider pattern. Any plugin that implements the EmbeddingService gRPC service (Embed, BatchEmbed, Cluster, ModelInfo) alongside DomainPluginService can serve as the embedding backend. QNTX creates a PluginEmbeddingService that satisfies the existing Service interface by making gRPC calls to the plugin instead of CGO/FFI calls.
- embedding.proto: Add Cluster and ModelInfo RPCs
- domain.proto: Add embedding_provider bool to InitializeResponse
- qntx-proto, qntx-grpc): Include embedding.proto
- EmbeddingService trait (Embed, BatchEmbed, Cluster, ModelInfo)
- EmbeddingServiceServer on the gRPC server alongside DomainPluginServiceServer
- embedding_provider: true in InitializeResponse
- PluginEmbeddingService (no build tags): gRPC client implementing Service interface
- embedding_provider flag, call SetupPluginEmbeddingService
- RunHDBSCANClustering accepts a ClusterFunc; the plugin provides its own via the Cluster RPC
- onEmbeddingProviderReady callback re-establishes the embedding backend when a plugin restarts

SerializeEmbedding, DeserializeEmbedding, and ComputeSimilarity are pure math: they are implemented directly in Go on PluginEmbeddingService, not routed through gRPC.
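To illustrate why these helpers need no round trip to the plugin, here is a minimal sketch of what such pure-math functions could look like. The wire format (little-endian float32) and the similarity metric (cosine) are assumptions made for this example; the document does not specify either. The functions are also written as free functions here rather than as methods on PluginEmbeddingService.

```go
package main

import (
	"encoding/binary"
	"errors"
	"math"
)

// SerializeEmbedding packs a vector as consecutive little-endian float32
// values (an assumed encoding, 4 bytes per dimension).
func SerializeEmbedding(v []float32) []byte {
	buf := make([]byte, 4*len(v))
	for i, f := range v {
		binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(f))
	}
	return buf
}

// DeserializeEmbedding reverses SerializeEmbedding and rejects inputs whose
// length is not a multiple of 4 bytes.
func DeserializeEmbedding(b []byte) ([]float32, error) {
	if len(b)%4 != 0 {
		return nil, errors.New("embedding byte length is not a multiple of 4")
	}
	v := make([]float32, len(b)/4)
	for i := range v {
		v[i] = math.Float32frombits(binary.LittleEndian.Uint32(b[4*i:]))
	}
	return v, nil
}

// ComputeSimilarity returns the cosine similarity of two equal-length
// vectors. A zero-norm input yields 0 rather than NaN.
func ComputeSimilarity(a, b []float32) (float32, error) {
	if len(a) != len(b) {
		return 0, errors.New("embedding dimensions differ")
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0, nil
	}
	return float32(dot / (math.Sqrt(normA) * math.Sqrt(normB))), nil
}
```

None of these functions depends on the model, so they behave identically regardless of which plugin is serving embeddings, provided both sides agree on the dimension.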
PluginEmbeddingService has no build tags: it is a pure Go gRPC client.