In recent years, large language models (LLMs) have become increasingly popular for their ability to generate human-like text and assist with a wide range of tasks. Not every model is open source, but most of the big tech companies now publish at least one version that users can download and run themselves.
Running these models locally on your Mac can be beneficial for privacy and cost reasons. In this article, we’ll explore how to install and run LLMs using Ollama, a powerful tool for developers and novice users, and discuss some GUI tools that can simplify the process for those who prefer not to use the macOS terminal.
This way you can access the latest AI text models from Google, Microsoft, Meta, and other AI companies like Mistral and DeepSeek.
Understanding LLMs: What Do Parameters Mean?
Before diving into the installation process, let’s clarify what numbers like “7B” or “14B” mean when referring to LLMs. These numbers represent the model’s size in terms of parameters (billions), which are essentially the “knobs and switches” that get fine-tuned during training.
A larger number of parameters allows the model to capture more complex patterns and relationships in language, potentially leading to better performance. However, it’s important to note that more parameters don’t always guarantee better results; the quality of the training data and computational resources available to the model also play significant roles.
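For a rough sense of scale (this is only a ballpark figure), memory use grows with parameter count: a 7B model stored at 4 bits per parameter works out to about 7 billion × 0.5 bytes ≈ 3.5GB of weights, while the same model at 16-bit precision needs roughly 14GB. That's why the largest variants quickly outgrow the RAM on a typical Mac.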
System Requirements
You’ll want to make sure your Mac has at least the following specs:
- macOS 10.15 or later (macOS 13 or higher recommended)
- At least 8GB of RAM (16GB or higher recommended)
- At least 10GB of free storage (enough for the smallest models; the largest models with the highest parameter counts can take up to around 700GB)
- A multi-core Intel or Apple Silicon processor (M2 or higher preferred)
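If you're not sure what your Mac has under the hood, you can check from Terminal using built-in macOS commands:

```
# Show the chip, number of cores, and installed memory
system_profiler SPHardwareDataType

# Show free space on the startup disk
df -h /
```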
Installing Ollama
Ollama is an open-source tool that allows you to run LLMs directly on your local machine. Here’s how to get started:
- Download Ollama: Visit the Ollama website and download the macOS version. You can also install it with Homebrew by running `brew install ollama` in your terminal.
- Install Ollama: If you downloaded the installer, double-click it and follow the setup wizard. If you used Homebrew, skip ahead to the Run Ollama step.
- The installer's next step is simply to click the Install button, which will ask for your administrator password.
- Run Ollama: Open a terminal and start Ollama as a service with `brew services start ollama`. This makes Ollama accessible at http://localhost:11434/.
- Download and Run a Model: Use `ollama pull <model-name>` to download a model, then `ollama run <model-name>` to run it. For example, to run the DeepSeek-R1 model, use `ollama pull deepseek-r1` followed by `ollama run deepseek-r1`.
- To download a larger-parameter version of a model, just add a colon followed by the size. For example, to download the 14B DeepSeek model, you would use the `ollama run deepseek-r1:14b` command. On the Ollama website, you can click on any of the models to see the commands for each version; a few example commands are also sketched just after this list.
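To make this concrete, here are a few commands you might run. The size tags shown (such as 14b) are just examples; available tags differ from model to model, so check a model's page on the Ollama site for the ones it actually offers.

```
# Download a model, then start an interactive session with it
ollama pull deepseek-r1
ollama run deepseek-r1

# Run a specific size by adding a colon and the tag
ollama run deepseek-r1:14b

# List the models already on disk, and remove one to reclaim storage
ollama list
ollama rm deepseek-r1:14b
```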
You can find the full list of models for Ollama here. You can also skip the pull step: if you use the run command and the model isn't installed yet, Ollama will download it first and then run it.
Once you see the three right angle brackets (>>>), you can start typing prompts to the model.
That's it! You can now interact with your LLM locally, without worrying about costs, usage limits, or anyone else reading what you type. This is great if you have sensitive or personal topics you'd like to discuss with an AI but don't want a big tech company reading your thoughts.
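If you want to confirm that the Ollama service itself is reachable, for example before connecting one of the GUI tools below, you can query the local API directly from Terminal. This is just a quick sanity check and assumes you've already pulled the deepseek-r1 model from the earlier examples.

```
# Ask the local Ollama API (listening on port 11434) for a short, non-streamed reply
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1",
  "prompt": "Say hello in one short sentence.",
  "stream": false
}'
```

If you get a JSON response back, the service is up and any frontend pointed at http://localhost:11434/ should work.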
Using GUI Tools with Ollama
While Ollama is powerful and efficient for developers, some users might prefer a graphical user interface (GUI) to interact with LLMs. Here are a few tools that can provide a GUI frontend for Ollama:
- Ollama GUI: This is a free and open-source app for macOS users built using SwiftUI. It offers a pretty interface and is an excellent option for those who want to access LLMs locally without using the terminal.
- Ollama UI: A simple HTML-based web UI that allows you to select and interact with models directly in your browser. It also includes a Chrome extension for easy access.
- Chatbox AI: This one is the easiest for beginners and the one I explain how to use below. Note that you do not need the paid Chatbox AI service, a subscription they offer that gives you access to hosted LLMs without having to install Ollama yourself; the free app works with your local Ollama models.
To use the nicer Chatbox AI interface, go to the download page and get the version for your Mac.
Next, open the installer and run the program. On the first screen, a popup will appear asking how you want to use Chatbox AI.
You’ll want to click the Use My Own API Key / Local Model button.
Next, click on Ollama API as the AI model provider.
Chatbox AI should automatically detect that Ollama is running and set the API host to the default value, which is your loopback IP address and port 11434. You don't have to change anything here. Under Model, you should see the models you installed earlier using Terminal. In my case, it's the Llama 3.2 model.
Back on the main page of the app, click on Just chat or New Chat and make sure the correct model is selected down at the bottom right. You can change the model anytime, but I would recommend creating a new chat for each model you use so you can see the differences easily.
Alternative Tools for Running LLMs Locally
If you’re looking for alternatives to Ollama or prefer a more user-friendly experience from the start, LM Studio is another excellent option. It provides an intuitive interface for exploring and using various AI models, allowing you to download and run models with ease. LM Studio is available on Linux, Mac, and Windows and offers features like model parameter customization and chat history. It's also free for now, which is why I'd recommend it over Chatbox AI's subscription service.
Conclusion
Running large language models locally on your Mac is a great option for anyone looking to gain more control over their AI applications and prioritize data privacy. With tools like Ollama and the GUI frontends mentioned above making it easy to integrate LLMs into your workflow, you can securely unlock new possibilities for productivity and creativity.
However, getting started with a local LLM does require some understanding of parameters and how model size affects performance on your specific hardware. The best way to figure this out is to experiment with different models and sizes and see which gives you the best results. Once you have a feel for that trade-off, you'll be able to make better decisions about which models to use and how to get the best performance for your own needs.