Run LLM Inference Using Apple Hardware | by Christopher Karg | Jan, 2024



Unlock Apple GPU power for LLM inference with MLX

Source: https://www.pexels.com/photo/train-railway-near-trees-552779/

We can now run inference on, and fine-tune, our own LLMs using Apple’s native hardware. This article covers the setup for creating your own experiments and running inference. A future article will cover how to fine-tune these LLMs (again using Apple hardware).

If you haven’t checked out my previous articles, I suggest doing so, as I make a case for why you should consider hosting (and fine-tuning) your own open-source LLM. I also cover strategies for optimising the process to reduce inference and training times. I will only brush over topics such as quantisation here, as they are covered in depth in those articles.

I will be using the mlx framework in combination with Meta’s Llama 2 model. In-depth information on how to access the models can be found in my previous article, but I’ll briefly explain how to do so in this article as well.

Let’s get started. To follow along, you will need:

  1. A machine with an M-series chip (M1/M2/M3)
  2. macOS >= 13.0
  3. Python between 3.8 and 3.11

For my personal hardware set-up, I am using a MacBook Pro with an M1 Max chip — 64GB RAM // 10-Core CPU // 32-Core GPU.

My OS is Sonoma 14.3 // Python is 3.11.6

As long as you meet the 3 requirements listed above, you should be able to follow along. If you have around 16GB RAM, I suggest sticking with the 7B models. Inference times etc. will of course vary depending on your hardware specs.
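To see why 16GB of RAM points towards the 7B models, here is a rough back-of-the-envelope sketch. It assumes memory is dominated by the weights (parameter count times bytes per parameter) and ignores the KV cache, activations and framework overhead, so treat the numbers as lower bounds. The function name weight_memory_gb and the nominal parameter counts are my own illustrative choices.

```python
# Rough, weight-only memory estimate for a model at a given precision.
# Assumption: memory ~= parameter count * bytes per parameter.
# KV cache, activations and framework overhead are NOT included.

BYTES_PER_GB = 1e9  # decimal gigabytes


def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory (in GB) needed just to hold the weights."""
    return n_params * bits_per_param / 8 / BYTES_PER_GB


if __name__ == "__main__":
    # Nominal parameter counts for the Llama 2 family (illustrative).
    models = [("7B", 7e9), ("13B", 13e9), ("70B", 70e9)]
    for name, n_params in models:
        for bits in (16, 8, 4):
            size = weight_memory_gb(n_params, bits)
            print(f"{name} @ {bits}-bit: ~{size:.1f} GB")
```

Under these assumptions a 7B model needs about 14 GB for its weights at 16-bit precision, which is already tight on a 16GB machine, and about 3.5 GB at 4-bit. This is one reason quantisation matters on smaller machines.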

Feel free to follow along and set up a directory where you will store all files relating to this article. It will make the process a lot easier if they’re all in one place. I’m calling mine mlx.

First, we need to ensure you are running a native ARM version of Python; otherwise we will be unable to install mlx. The following terminal command should print arm (an output of i386 means Python is running as an x86 build under Rosetta):

python -c "import platform; print(platform.processor())"
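If you would like to check all three requirements in one go, the following is a minimal sketch using only the Python standard library. The thresholds (macOS 13.0, Python 3.8 to 3.11) are taken from the requirements list in this article; the function name check_environment and the messages are my own, and this is not an official mlx check.

```python
import platform
import sys


def check_environment(processor, macos_version, python_version):
    """Return a list of problems found; an empty list means all checks passed.

    processor      -- value of platform.processor(), e.g. "arm" or "i386"
    macos_version  -- value of platform.mac_ver()[0], e.g. "14.3" ("" if not macOS)
    python_version -- (major, minor) tuple, e.g. (3, 11)
    """
    problems = []

    # 1. Native ARM Python reports "arm"; x86 Python under Rosetta reports "i386".
    if processor != "arm":
        problems.append(f"Python is not a native ARM build (processor={processor!r})")

    # 2. macOS >= 13.0 (threshold taken from this article's requirements).
    if not macos_version:
        problems.append("Could not detect a macOS version")
    else:
        major_minor = tuple(int(part) for part in macos_version.split(".")[:2])
        if major_minor < (13, 0):
            problems.append(f"macOS {macos_version} is older than 13.0")

    # 3. Python between 3.8 and 3.11 inclusive.
    if not ((3, 8) <= tuple(python_version) <= (3, 11)):
        problems.append(f"Python {python_version[0]}.{python_version[1]} is outside 3.8-3.11")

    return problems


if __name__ == "__main__":
    issues = check_environment(
        platform.processor(),
        platform.mac_ver()[0],
        sys.version_info[:2],
    )
    if issues:
        for issue in issues:
            print("Problem:", issue)
    else:
        print("All three requirements look fine.")
```

Save it in your mlx directory under any name you like and run it with the same Python interpreter you plan to install mlx into, since the processor check reports on that specific interpreter.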