duckb's comments

Which LLM do you use for this? Self-hosted or cloud?


This uses the OpenAI API currently. I tried Gemini for embeddings for a little while, but it didn't seem materially better.


Looks beautiful. How difficult would it be to develop a plugin, e.g. mood detection, a chat helper (LLM-linked), ...?


Does this support table detection and extraction?


Yes, it's experimental at the moment: https://docs.vlm.run/guides/doc-ai/guide-visual-grounding

