Gemma 4 was released recently, so I tried installing a local LLM for the first time. I chose Ollama as the LLM management tool because it seems to work well with my coding tools.
Ollama Gemma 4 Page
I used the DMG file to install Ollama on my Mac, as that is the recommended method.
"The preferred method of installation is to mount the ollama.dmg and drag-and-drop the Ollama application to the system-wide Applications folder."
After installing Ollama, I checked the version in the terminal.
myuser@my-Mac-mini ~ % ollama -v
ollama version is 0.20.3
Ollama for Mac
Gemma4:26B is a workstation-class mixture-of-experts model with 4B active parameters. I was worried that 32GB of memory would be too small, but 26B seems to be the smallest option available among the Gemma4 workstation models.
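To make the memory worry concrete, here is a rough back-of-the-envelope sketch. It assumes the weights dominate memory use and that all 26B parameters have to be resident even though only about 4B are active per token. The bit widths are illustrative assumptions, not values taken from Ollama.

```python
# Rough weight-only estimate: parameters * bits per weight / 8.
# The bit widths below are illustrative assumptions, not Ollama settings.

def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB, treating 1 GB as 10**9 bytes."""
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit weights: ~{weight_gb(26, bits):.0f} GB")
```

Under these assumptions, 16-bit weights alone would not fit in 32GB, while a roughly 4-bit quantization leaves room to spare. Runtime buffers come on top of this figure.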
myuser@my-Mac-mini ~ % ollama run gemma4:26b
pulling manifest
pulling 7121486771cb: 9% ▕█ ▏ 1.6 GB/ 17 GB 63 MB/s 4m16s
The download failed over Wi-Fi with a 'read operation timed out' error. It succeeded after I switched to a wired connection.
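If switching networks is not an option, retrying the download is another thing to try. The following is a minimal sketch, not something I ran for this post: `pull_with_retry` and its parameters are hypothetical names, and it simply re-runs `ollama pull` when the command exits with a non-zero status. I have not verified whether a retry resumes the partial download or starts over.

```python
import subprocess
import time


def pull_with_retry(model, attempts=3, wait_seconds=5.0,
                    run=subprocess.run, sleep=time.sleep):
    """Run `ollama pull <model>` until it succeeds or attempts run out.

    Returns the number of the attempt that succeeded.
    `run` and `sleep` are injectable so the logic can be tested offline.
    """
    for attempt in range(1, attempts + 1):
        result = run(["ollama", "pull", model])
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            sleep(wait_seconds)  # give a flaky connection time to recover
    raise RuntimeError(f"ollama pull {model} failed after {attempts} attempts")


if __name__ == "__main__":
    used = pull_with_retry("gemma4:26b")
    print(f"pull succeeded on attempt {used}")
```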
Next, I wanted to check how much memory the model uses while running. First I ran a simple prompt to confirm that it works.
myuser@my-Mac-mini ~ % ollama run gemma4:26b "Hello"
Thinking...
The user said "Hello".
This is a standard greeting.
Acknowledge the greeting and offer assistance.
* "Hello! How can I help you today?"
* "Hi there! Is there anything I can assist you with?"
* "Hello! What's on your mind?"
"Hello! How can I help you today?" (Simple, polite, and open-ended).
...done thinking.
Hello! How can I help you today?
myuser@my-Mac-mini ~ % ollama ps
NAME          ID               SIZE     PROCESSOR    CONTEXT    UNTIL
gemma4:26b    18712148f3a52    20 GB    100% GPU     32768      4 minutes from now
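To track these numbers over time, the `ollama ps` output can be parsed with a small script. This is a sketch that assumes the six-column layout shown above. `parse_ps` and the regular expression are my own hypothetical helpers, not part of Ollama, and the sample text is copied from the output in this post.

```python
import re

# One row of `ollama ps`, assuming the columns shown in this post:
# NAME, ID, SIZE, PROCESSOR, CONTEXT, UNTIL.
PS_ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<id>\S+)\s+(?P<size>[\d.]+\s*[KMGT]?B)\s+"
    r"(?P<processor>.+?)\s+(?P<context>\d+)\s+(?P<until>.+)$"
)


def parse_ps(text):
    """Return one dict per model row, skipping the header line."""
    rows = []
    for line in text.strip().splitlines()[1:]:
        match = PS_ROW.match(line.strip())
        if match:
            row = match.groupdict()
            row["context"] = int(row["context"])
            rows.append(row)
    return rows


sample = """NAME ID SIZE PROCESSOR CONTEXT UNTIL
gemma4:26b 18712148f3a52 20 GB 100% GPU 32768 4 minutes from now"""

if __name__ == "__main__":
    for row in parse_ps(sample):
        print(row["name"], row["size"], row["processor"], row["context"])
```

In real use the text would come from running `ollama ps` (for example via `subprocess.run`) instead of the hard-coded sample.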
myuser@my-Mac-mini ~ % ollama show gemma4:26b
Model
  architecture        gemma4
  parameters          25.8B
  context length      262144
  embedding length    2816
  quantization        Q4_K_M
  requires            0.20.0

Capabilities
  completion
  vision
  tools
  thinking

Parameters
  temperature    1
  top_k          64
  top_p          0.95

License
  Apache License
  Version 2.0, January 2004
...
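The `requires 0.20.0` line is worth comparing with the installed version (0.20.3, from `ollama -v` earlier). Here is a small sketch of that comparison. It assumes plain `major.minor.patch` version strings, and `parse_version` and `meets_requirement` are hypothetical helper names.

```python
import re


def parse_version(text):
    """Extract (major, minor, patch) from text such as 'ollama version is 0.20.3'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_requirement(installed, required):
    """True if the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    print(meets_requirement("ollama version is 0.20.3", "0.20.0"))  # True
```

Comparing integer tuples avoids the trap of comparing version strings alphabetically, where "0.9.0" would sort after "0.20.0".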
The model uses about 20GB of memory, which is quite a lot. A machine with 32GB is therefore close to the minimum for running this model, and other applications may not run smoothly while it is loaded. I will see how it performs under different scenarios.
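For reference, here is the arithmetic behind that conclusion, using only numbers from the output above: 32GB of machine memory, 20 GB loaded, a 17 GB download, and 25.8B parameters. It is a rough sketch: the CLI figures are rounded, 1 GB is treated as 10**9 bytes, and the download may include more than the weights. `memory_summary` is a hypothetical helper.

```python
def memory_summary(total_gb, loaded_gb, download_gb, params_billion):
    """Summarize headroom and effective bits per parameter from rounded CLI figures."""
    return {
        "headroom_gb": total_gb - loaded_gb,           # left for macOS and other apps
        "share_of_memory": loaded_gb / total_gb,       # fraction taken by the model
        "runtime_extra_gb": loaded_gb - download_gb,   # loaded size minus download size
        "bits_per_param": download_gb * 8 / params_billion,
    }


if __name__ == "__main__":
    summary = memory_summary(total_gb=32, loaded_gb=20,
                             download_gb=17, params_billion=25.8)
    for key, value in summary.items():
        print(f"{key}: {value:.2f}")
```

With these inputs the model takes 62.5% of memory and leaves 12 GB for everything else. The 3 GB gap between the download and the loaded size is presumably runtime buffers such as the context cache, but that is my assumption, not something the output confirms.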