__chefski__'s comments

__chefski__ · 2025-01-19T22:38:14

The token estimator in this package is quite inaccurate, since it appears to just count characters, excluding whitespace. Tokenizers typically produce far fewer tokens than there are characters, so the estimate is overly conservative: less content gets packed into the context window than would actually fit, which would degrade an LLM's performance. That said, it could be improved in subsequent versions by integrating with a model's tokenizer to get the true token count.

https://github.com/bodo-run/yek/blob/17fe37fbd461a8194ff612e...
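
To make the suggestion concrete, here is a rough Python sketch of the idea, not the package's actual code. All names (`estimate_tokens_by_chars`, `count_tokens`, the `tokenize` parameter) are made up for illustration. It reproduces the non-whitespace character count described above and lets a real tokenizer be plugged in when one is available.

```python
from typing import Callable, Optional, Sequence


def estimate_tokens_by_chars(text: str) -> int:
    """Heuristic described in the comment: count non-whitespace characters."""
    return sum(1 for ch in text if not ch.isspace())


def count_tokens(
    text: str,
    tokenize: Optional[Callable[[str], Sequence]] = None,
) -> int:
    """Use the supplied tokenizer if there is one, else fall back to the heuristic.

    `tokenize` is a hypothetical hook: any callable that maps a string to a
    sequence of tokens, for example a model tokenizer's encode function.
    """
    if tokenize is None:
        return estimate_tokens_by_chars(text)
    return len(tokenize(text))


sample = 'fn main() { println!("hello"); }'

# Fallback path: one "token" per non-whitespace character.
char_estimate = count_tokens(sample)

# str.split is only a stand-in so the sketch runs without dependencies;
# a real model tokenizer would return a different count.
stand_in_count = count_tokens(sample, tokenize=str.split)

print(char_estimate, stand_in_count)
```

With a model's own tokenizer passed as `tokenize`, the count would match what the model actually sees, so the packing budget would no longer be wasted on an inflated estimate.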