Deploying LLMs locally with Apple’s MLX framework | by Heiko Hotz

A technical deep dive into the new deep learning library MLX

In December 2023, Apple’s machine learning research team released MLX, an array framework for machine learning on Apple silicon. This tutorial explores the framework and demonstrates how to deploy the Mistral-7B model locally on a MacBook Pro (MBP). We’ll set up a local chat interface to interact with the deployed model and measure its inference performance in tokens generated per second. We’ll also delve into the MLX API to understand the available levers for altering the model’s behaviour and influencing the generated text.

As usual, the code is available in a public GitHub repository: https://github.com/marshmellow77/mlx-deep-dive

Apple’s new machine learning framework, MLX, builds on the unified memory architecture of Apple silicon, which gives it a notable advantage over other deep learning frameworks. Frameworks such as PyTorch and JAX typically assume separate CPU and GPU memory and copy data between the two, which is costly. MLX instead keeps arrays in shared memory that both the CPU and the GPU can access. This design eliminates the overhead of data transfers and can speed up execution, particularly with the large datasets common in machine learning. For complex ML tasks on Apple devices, MLX’s shared memory architecture could lead to significant speed-ups. This feature makes MLX highly relevant for developers looking to run models on-device, such as on iPhones.
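To make the shared-memory idea concrete, here is a minimal sketch along the lines of the unified memory example in the MLX documentation. It assumes that mlx is installed on an Apple silicon Mac; the function name add_on_both_devices and the array size are my own choices for illustration. The same two arrays are handed to a CPU operation and a GPU operation, and at no point is an array moved to a device.

```python
import importlib.util

# True when the mlx package can be found in the current environment
MLX_AVAILABLE = importlib.util.find_spec("mlx") is not None


def add_on_both_devices(n=1024):
    """Add the same two arrays on the CPU and on the GPU (illustrative helper)."""
    import mlx.core as mx  # imported lazily; assumes mlx is installed

    a = mx.random.uniform(shape=(n, n))
    b = mx.random.uniform(shape=(n, n))

    # The device is chosen per operation via the stream argument.
    # The arrays themselves are never copied to a device.
    on_cpu = mx.add(a, b, stream=mx.cpu)
    on_gpu = mx.add(a, b, stream=mx.gpu)

    # MLX evaluates lazily, so force both computations before comparing.
    mx.eval(on_cpu, on_gpu)
    return bool(mx.allclose(on_cpu, on_gpu).item())
```

On a machine with MLX installed, calling add_on_both_devices() should return True, since both devices read the same inputs from shared memory. In a framework with separate device memory, the equivalent code would first need an explicit transfer of a and b to the GPU.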

With Apple’s expertise in silicon design, MLX hints at the exciting capabilities that could be integrated into their chips for future on-device AI applications. The potential of MLX to accelerate and streamline ML tasks on Apple platforms makes it a framework developers should keep on their radar.

Before deploying the model, some setup is required. First, create and activate a virtual environment, then install the mlx-lm package inside it:

pip install mlx-lm
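As an optional sanity check after the installation, the short script below reports whether Python is running inside a virtual environment, whether it is a native Apple silicon interpreter, and whether mlx-lm can be imported. This is only a sketch using the standard library; the function name check_environment is my own, and the Apple silicon test assumes a native arm64 Python (an interpreter running under Rosetta would report x86_64 instead).

```python
import importlib.util
import platform
import sys


def check_environment():
    """Collect a few facts about the current Python environment."""
    return {
        # Inside a venv the interpreter prefix differs from the base installation
        "in_virtualenv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
        # macOS with a native arm64 interpreter indicates Apple silicon
        "apple_silicon": platform.system() == "Darwin" and platform.machine() == "arm64",
        # The mlx-lm package is imported under the name mlx_lm
        "mlx_lm_installed": importlib.util.find_spec("mlx_lm") is not None,
    }


for name, ok in check_environment().items():
    print(f"{name}: {'yes' if ok else 'no'}")
```

If any of the three lines prints "no", it is worth fixing that before moving on, since the later steps assume all three hold.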