Small local AI
for those with limited resources
If you’re interested in trying a local, privacy-friendly text generation AI that doesn’t share your data, but you lack the resources for the better, larger models, there is still a way you can try it out.
There is a fairly new model from IBM that is the best small model I have seen to date. I have been completely unimpressed with other small models, to the point of finding them entirely useless: they cannot give decent results and usually hallucinate and generate nonsense. This new model actually performs well and gives decent results. (It is not as good as extremely large models, but it performs extremely well for its size.)
You can run Ollama on any system: Windows, Mac, or Linux. For GPU usage you need a GPU with 2 GB or more of VRAM, but it can also be used without any GPU at all.
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models. - ollama/ollama
The IBM model is here, Granite 3.1 MoE:
The IBM Granite 1B and 3B models are long-context mixture of experts (MoE) Granite models from IBM designed for low-latency usage.
The 1B version averages 1.7 GB of RAM (or VRAM) usage, and the 3B model uses 2.7 GB on average.
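As a rough illustration of choosing between the two sizes, here is a minimal Python sketch that picks the largest Granite MoE variant whose average memory usage (the figures quoted above) fits in the memory you have free. The function name and the idea of passing in a free-memory number are my own assumptions for illustration; how you measure free RAM or VRAM is up to you.

```python
from typing import Optional

# Average memory usage in GB, taken from the figures quoted in the post.
# The tag without a suffix is the one the post gives for the 3B version.
GRANITE_MOE_MEMORY_GB = {
    "granite3.1-moe:1b": 1.7,
    "granite3.1-moe": 2.7,
}


def largest_model_that_fits(free_gb: float) -> Optional[str]:
    """Return the tag of the largest listed model that fits in free_gb, or None."""
    candidates = [
        (needed_gb, tag)
        for tag, needed_gb in GRANITE_MOE_MEMORY_GB.items()
        if needed_gb <= free_gb
    ]
    if not candidates:
        return None
    # max() compares the memory figure first, so the biggest fitting model wins.
    return max(candidates)[1]
```

For example, calling `largest_model_that_fits(2.0)` selects the 1B tag, since only the 1.7 GB figure fits under 2 GB.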
All you do is install Ollama, open PowerShell or cmd, and type
ollama
to make sure it is installed. Once you get output listing the available commands, which shows it is recognized, type
ollama pull granite3.1-moe:1b
(or granite3.1-moe for the 3B version)
Then type
ollama run granite3.1-moe:1b
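If you would rather script the steps above than type them, something like the following Python sketch should work. It only wraps the same three steps (check that `ollama` is found, pull, run) using the standard library. The helper names `ollama_commands` and `setup_and_chat` are made up for this example and are not part of Ollama.

```python
import shutil
import subprocess

MODEL = "granite3.1-moe:1b"  # model tag from the post


def ollama_commands(model: str) -> list:
    """Build the pull and run commands from the steps above as argument lists."""
    return [["ollama", "pull", model], ["ollama", "run", model]]


def setup_and_chat(model: str = MODEL) -> bool:
    """Pull the model and start a terminal chat; return False if ollama is missing."""
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("ollama") is None:
        print("ollama was not found on PATH; install it first.")
        return False
    for command in ollama_commands(model):
        # check=True raises CalledProcessError if a step fails.
        # The child process inherits the terminal, so the chat stays interactive.
        subprocess.run(command, check=True)
    return True
```

Calling `setup_and_chat()` from a terminal downloads the model on first use and then drops you into the chat prompt, the same as running the two commands by hand.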
And that's it, start chatting. There is also a wide variety of other models available; just make sure you have enough RAM/VRAM to run them (as a good rule of thumb, you need slightly more memory than the size of the model file).
This is one of the first models that is actually decent enough to be worth recommending, in my opinion, and it is completely offline and privacy friendly. It also seems not to have much censorship, from my tests. (Although it does seem to have a little bit of bias, so results may vary.)
You can install open-webui in Docker for a front-end web UI if you want one, but Ollama supports chats in the terminal.
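The rule of thumb can be written down as a tiny check. This is only a sketch: the 20% margin is my assumption for "slightly more than the file size", not a number from Ollama, and real usage varies with context length and settings.

```python
def estimated_memory_gb(file_size_gb: float, margin: float = 0.2) -> float:
    """Rule-of-thumb estimate: model file size plus a small margin (assumed 20%)."""
    return file_size_gb * (1.0 + margin)


def fits_in_memory(file_size_gb: float, available_gb: float, margin: float = 0.2) -> bool:
    """Return True if the estimated requirement is within the available RAM/VRAM."""
    return estimated_memory_gb(file_size_gb, margin) <= available_gb
```

With these defaults, a 2 GB model file is estimated at about 2.4 GB, so it would pass on a machine with 4 GB free, while a 4 GB file would not.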
User-friendly AI Interface (Supports Ollama, OpenAI API, ...) - open-webui/open-webui


















