What about URL-defined Ollama? Personally, I run open-webui on an outward-facing Pi on its own VLAN that connects to an internal machine running Ollama. That way there's a fallback to the OpenAI API if the internal machine is down.
Yeah, add something like this to your VS Code settings to point Cody at a different Ollama URL (here it's localhost:11434, but change apiEndpoint, model, and tokens to whatever you need).
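A rough sketch from memory (the setting and field names may differ slightly between Cody versions, and the model name/endpoint are just examples):

    // settings.json — field names from memory; adjust model/endpoint to your setup
    "cody.dev.models": [
      {
        "provider": "ollama",
        "model": "llama3.1",
        "apiEndpoint": "http://localhost:11434",
        "inputTokens": 8192,
        "outputTokens": 4096
      }
    ]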
We should add an easier way to just change the Ollama URL from localhost, so you can see all of your Ollama models listed just as you can when Ollama is available on localhost. Added to our TODO list!
When I tried Cody around half a year ago, it only used Ollama for tab completion while chat still used proprietary APIs (or the other way around). Has that changed by now, so you can prevent any API calls to third parties in the Cody config?
Yes, Cody can use Ollama for both chat and autocomplete. See https://sourcegraph.com/docs/cody/clients/install-vscode#sup.... This lets you use Cody fully offline, but it doesn't /prevent/ API calls to third parties; you are still able to select online models like Claude 3.5 Sonnet.
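For autocomplete specifically, the experimental Ollama provider looks roughly like this in settings.json (these are the experimental setting names as I remember them and may change between releases; the model is just an example):

    // experimental settings — names may change between Cody releases
    "cody.autocomplete.advanced.provider": "experimental-ollama",
    "cody.autocomplete.experimental.ollamaOptions": {
      "url": "http://localhost:11434",
      "model": "deepseek-coder:6.7b"
    }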
I have a WIP PR right now (like literally coding on it right now) making Cody support strict offline mode better (i.e., not even showing online models if you choose to be offline): https://github.com/sourcegraph/cody/pull/5221.