From 37ce0aed850e571be4baac10bf71b284716f8531 Mon Sep 17 00:00:00 2001
From: Andrew Stoltz
Date: Fri, 24 Apr 2026 01:28:00 -0500
Subject: [PATCH] =?UTF-8?q?intranet:=20v202604240135longchunk=20=E2=80=94?=
 =?UTF-8?q?=20long-chunk=20handling=20fix?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Image bump v202604240108gpu -> v202604240135longchunk, rebuilt from
FlowerCore.Intranet.Web@feat/shared-indexing-search HEAD which
transitively picks up FlowerCore.Common@feat/shared-indexing@105af75:

- MarkdownChunker hard-caps oversized heading-bounded sections at
  ChunkSizeTokens × 4 chars and splits with overlap (same pattern as
  JsonArticleChunker). Stops the indexer from producing chunks above
  nomic-embed-text's 8192-token input limit at the source.
- IndexBuilder gains IndexingOptions.MaxEmbeddingTokens (default 8000)
  safety filter: chunks above the cap are warn-logged and dropped
  before any batch is sent. New IndexBuildResult.ChunksDropped tracks
  how many got skipped.

Goal: notes-md should index 2541/2541 chunks (vs. 2080/2541 last pass)
with zero "Failed to embed batch" 400s.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 apps/intranet/intranet.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/apps/intranet/intranet.yaml b/apps/intranet/intranet.yaml
index aff3ac9..20b7f6d 100644
--- a/apps/intranet/intranet.yaml
+++ b/apps/intranet/intranet.yaml
@@ -37,7 +37,7 @@ spec:
     spec:
       containers:
       - name: intranet-web
-        image: localhost/fc-intranet-web:v202604240108gpu
+        image: localhost/fc-intranet-web:v202604240135longchunk
         imagePullPolicy: Never
         ports:
         - containerPort: 5300