v202604240140longchunk still hit 400 Bad Request from nomic-embed-text on several batches: the chars/4 token estimate was too optimistic for code-heavy and Unicode content.

Rebuilt from FlowerCore.Common@e1c28b4, which tightens the MarkdownChunker hard cap (ChunkSizeTokens × 2, clamped at 16000 chars) and adds a character-length check to IndexBuilder's safety filter alongside the estimated-tokens check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
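
For illustration only, the two guards described above could look roughly like the Python sketch below. This is not the FlowerCore.Common code: the function names are hypothetical, and it assumes the hard cap is a character count equal to ChunkSizeTokens × 2, never exceeding 16000.

```python
# Illustrative sketch only; names and structure are assumptions,
# not the actual MarkdownChunker / IndexBuilder implementation.

MAX_HARD_CAP_CHARS = 16_000  # clamp value quoted in the commit message


def hard_cap_chars(chunk_size_tokens: int) -> int:
    """Assumed reading of the cap: ChunkSizeTokens * 2 characters, at most 16000."""
    return min(chunk_size_tokens * 2, MAX_HARD_CAP_CHARS)


def estimate_tokens(text: str) -> int:
    """The chars/4 heuristic, which undercounts for code-heavy or Unicode text."""
    return len(text) // 4


def passes_safety_filter(text: str, max_tokens: int, max_chars: int) -> bool:
    """Keep a chunk only if BOTH the token estimate and the raw length are in bounds."""
    return estimate_tokens(text) <= max_tokens and len(text) <= max_chars


if __name__ == "__main__":
    limit_tokens = 2048  # hypothetical token budget for the example
    limit_chars = hard_cap_chars(limit_tokens)
    chunk = "x" * 6000
    # The estimate alone would accept this chunk; the length check rejects it.
    print(estimate_tokens(chunk) <= limit_tokens)
    print(passes_safety_filter(chunk, limit_tokens, limit_chars))
```

The point of the second check is that a chunk can look safe under the chars/4 estimate while still being long enough to exceed the embedding model's real token limit, so the raw character length acts as an independent backstop.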