I have good results running Ollama locally with open models like Gemma 3, Qwen 3, etc. The major drawback is slower inference speed. Commercial APIs like Google Gemini are much faster.
Still, I find local models very much worth using after taking the time to set them up with Emacs, open-codex, etc.