
A video blogger has demonstrated a computing cluster built from new Apple Mac minis with M4 chips. In some cases it is a better option than a powerful graphics card.
Many people think that getting a more powerful computer means buying one expensive device. But there are other ways to perform a large number of calculations. A cluster combines many computers, or at least many computing units, that work together on tasks in parallel, which can significantly reduce processing time.
In a YouTube video, enthusiast Alex Ziskind demonstrates how to set up cluster computing using five M4 Mac minis. The cluster receives tasks that are distributed among all the machines. Small clusters typically rely on an Ethernet network for communication between nodes, but the YouTuber instead connected the machines over Thunderbolt using Thunderbolt Bridge. This significantly speeds up communication between nodes and also allows larger data packets to be sent.
Ethernet typically runs at 1 Gbps, or up to 10 Gbps on computers that support that speed. Thunderbolt Bridge, on the other hand, reaches up to 40 Gbps over Thunderbolt 4 ports, or 80 Gbps bidirectionally over Thunderbolt 5 on models with M4 Pro and M4 Max chips.
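To put those figures in perspective, here is a rough Python sketch of the best-case time needed to move a payload between two nodes at each nominal speed. The 10 GB payload is a hypothetical value chosen for illustration, and the calculation assumes the link runs at its full rated speed with no protocol overhead, which real connections will not achieve.

```python
# Nominal link speeds quoted in the article, in gigabits per second.
LINK_SPEEDS_GBPS = {
    "Gigabit Ethernet": 1,
    "10 Gigabit Ethernet": 10,
    "Thunderbolt 4": 40,
    "Thunderbolt 5": 80,
}


def transfer_seconds(payload_gigabytes: float, link_gbps: float) -> float:
    """Best-case transfer time in seconds, ignoring protocol overhead."""
    payload_gigabits = payload_gigabytes * 8  # 1 byte = 8 bits
    return payload_gigabits / link_gbps


if __name__ == "__main__":
    payload_gb = 10.0  # hypothetical payload size, for illustration only
    for name, speed in LINK_SPEEDS_GBPS.items():
        print(f"{name}: {transfer_seconds(payload_gb, speed):.1f} s")
```

Actual throughput depends on cables, drivers, and the software stack, so these numbers are upper bounds rather than measurements.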
Ziskind notes that using Apple Silicon for cluster computing can be more cost-effective than a PC with a powerful graphics card. GPU data processing depends on having a significant amount of video memory available. On a graphics card, this can be, for example, 8 GB, which is not much even for games. Unified memory on Apple Silicon is less restrictive: the GPU shares the system memory, so it has access to much more of it, especially on a Mac with 32 GB of RAM.
In addition, graphics cards consume a lot of power, and high consumption means higher ongoing operating costs. Mac mini computers turned out to consume very little: a cluster of five Mac minis draws less power than a single high-performance graphics card.
To run the cluster, Alex Ziskind uses MLX, an open-source Apple project described as an array framework designed for efficient and flexible machine learning research on Apple Silicon. For distributed operation, MLX relies on MPI, the standard message-passing interface for distributed computing. The project can run across multiple Macs of differing performance without significant hardware costs. Among other things, MLX is optimized for small clusters.
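The general pattern behind this kind of distributed computing can be sketched in plain Python: split the work among the nodes, let each node compute a partial result, then combine the partial results. This sketch does not use the MLX or MPI APIs. The function names are hypothetical, and the "nodes" are simulated one after another in a single process, whereas in a real cluster each chunk would be processed on a separate machine at the same time.

```python
def split_evenly(items, node_count):
    """Split items into node_count contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), node_count)
    chunks, start = [], 0
    for rank in range(node_count):
        # The first `extra` nodes take one additional item each.
        size = base + (1 if rank < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def node_work(chunk):
    """Work one node does independently; a sum of squares stands in for a real task."""
    return sum(x * x for x in chunk)


def cluster_sum_of_squares(items, node_count):
    """Scatter the items, compute partial sums per node, then reduce to one total."""
    partials = [node_work(chunk) for chunk in split_evenly(items, node_count)]
    return sum(partials)


if __name__ == "__main__":
    data = list(range(12))
    print([len(chunk) for chunk in split_evenly(data, 5)])
    print(cluster_sum_of_squares(data, 5))
```

The final reduction step is where the nodes have to exchange data, which is why the speed of the link between them matters.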
Effective, but not always
While combining the performance of multiple Mac minis into a cluster seems appealing, not every task will benefit. There is little to no benefit for typical Mac use, such as running an application or playing games. The technology is designed for processing large amounts of data or for compute-intensive tasks that benefit from parallel processing. This makes the cluster well suited to artificial intelligence work, including large language models (LLMs).
It is also not the easiest way for a typical Mac user to work. In his tests, Ziskind found that a single Mac with an M4 Pro delivered more LLM performance than two M4 machines in a cluster. A cluster becomes useful when you need more performance than a single powerful Mac can deliver. If a model is too large to run on a single Mac, for example due to memory limitations, a cluster can offer more.
The enthusiast argues that, at this stage, a high-end Mac with an M4 Max and a large amount of memory is more efficient than a cluster of less powerful machines. But if the requirements of a task go beyond the highest Mac configuration, a cluster can help.
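The memory argument can be illustrated with a back-of-the-envelope estimate in Python. The sketch below counts only the model weights and assumes they can be split evenly across nodes. The 0.75 usable-memory fraction, the 70-billion-parameter model, and the 4-bit weight size are hypothetical values chosen for illustration, not figures from Ziskind's tests.

```python
import math


def weights_gigabytes(params_billions: float, bits_per_param: int) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    return params_billions * bits_per_param / 8


def nodes_needed(model_gb: float, node_memory_gb: float,
                 usable_fraction: float = 0.75) -> int:
    """Minimum number of identical nodes whose usable memory can hold the weights.

    usable_fraction is an assumed share of unified memory left for the model
    after the operating system and other software take their part.
    """
    usable_gb = node_memory_gb * usable_fraction
    return math.ceil(model_gb / usable_gb)


if __name__ == "__main__":
    model_gb = weights_gigabytes(params_billions=70, bits_per_param=4)
    for memory_gb in (16, 32, 64):
        print(f"{memory_gb} GB per Mac: {nodes_needed(model_gb, memory_gb)} node(s)")
```

A real deployment also needs memory for activations and caches, so the true requirement would be higher than this estimate suggests.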