LoRA Fine-Tuning On Your Apple Silicon MacBook | by Matthew Gunton | Nov, 2024


A Step-by-Step Guide to Fine-Tuning an LLM on Your MacBook

Image by Author — Flux.1

As models become smaller, more and more consumer computers are capable of running LLMs locally. This dramatically lowers the barrier for people to train their own models and makes it practical to experiment with a wider range of training techniques.

One consumer computer that runs LLMs locally quite well is the Apple Mac. Apple took advantage of its custom silicon to build an array-processing library called MLX. With MLX, Macs can run LLMs more efficiently than many other consumer machines.

In this blog post, I’ll explain at a high level how MLX works, then show you how to fine-tune your own LLM locally using MLX. Finally, we’ll speed up the fine-tuned model with quantization.

Let’s dive in!

What is MLX (and who can use it?)

MLX is an open-source library from Apple that lets Mac users efficiently run programs that operate on large tensors. Naturally, this library comes in handy when we want to train or fine-tune a model.
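To make this concrete, here is a minimal sketch of what MLX code looks like (this assumes MLX is installed via `pip install mlx`, and the matrix sizes are just illustrative). The API deliberately mirrors NumPy, and computation is lazy until you force it:

```python
import mlx.core as mx

# Create two random matrices; like NumPy, but they live in unified memory.
a = mx.random.normal((4096, 4096))
b = mx.random.normal((4096, 4096))

# Operations are lazy: this records a compute graph rather than running it.
c = a @ b + 1.0

# mx.eval forces the graph to actually execute.
mx.eval(c)
print(c.shape)  # (4096, 4096)
```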

The way MLX works is by being very efficient with memory. On Apple Silicon, the Central Processing Unit (CPU) and Graphics Processing Unit (GPU) share a single pool of unified memory, so MLX can run operations on either device without copying data back and forth between them.
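As a small illustration of this, MLX lets you choose which device executes an operation via a `stream` argument, and the same arrays are visible to both devices with no explicit transfer (again a sketch assuming `mlx` is installed; the sizes are arbitrary):

```python
import mlx.core as mx

a = mx.random.normal((2048, 2048))
b = mx.random.normal((2048, 2048))

# The same arrays can be consumed by the GPU or the CPU without a copy,
# because both share the Mac's unified memory.
c_gpu = mx.matmul(a, b, stream=mx.gpu)  # schedule on the GPU
c_cpu = mx.matmul(a, b, stream=mx.cpu)  # schedule on the CPU

# Force both computations to run.
mx.eval(c_gpu, c_cpu)
```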
