Three months wrong about why my 4-node AMD cluster was slow

Alex Ziskind
Published on 15.05.2026

Three months on a 4-node Minisforum MS-S1 Max Strix Halo cluster, and the one assumption that was quietly killing my inference speed.
Try out ChatLLM - http://chatllm.abacus.ai/ltf and Abacus AI DeepAgent - http://deepagent.abacus.ai/ltf

🛒 Gear Links 🛒
👀 2 400Gbps switch: https://bhpho.to/4r9vJi1
👀👀 4 400Gbps switch: https://bhpho.to/4qQqOlI
💻☕ Thunderbolt 5 external SSD: https://amzn.to/3XqetZO
💻☕ Favorite 15" display with magnet: https://amzn.to/3zD1DhQ
🎧⚡ Great 40Gbps T4 enclosure: https://amzn.to/3JNwBGW
🛠️🚀 My nvme ssd: https://amzn.to/3YLEySo
📦🎮 My gear: https://www.amazon.com/shop/alexziskind

🎥 Related Videos 🎥
🏆 Skip M3 Ultra & RTX 5090 for LLMs | NEW 96GB KING - https://youtu.be/bAao58hXo9w
💻 Smallest RTX Pro 6000 rig | OVERKILL - https://youtu.be/JbnBt_Aytd0
🔧 Cheap mini runs a 70B LLM 🤯 - https://youtu.be/xyKEQjUzfAk
🌙 RAM torture test on Mac - https://youtu.be/l3zIwPgan7M
🚀 FREE Local LLMs on Apple Silicon | FAST! - https://youtu.be/bp2eev21Qfo
🪞 REALITY vs Apple’s Memory Claims | vs RTX4090m - https://youtu.be/fdvzQAWXU7A
📦 Set up Conda - https://youtu.be/2Acht_5_HTo
🤖 INSANE Machine Learning on Neural Engine - https://youtu.be/Y2FOUg_jo7k

🛠️ Developer productivity Playlist - https://www.youtube.com/playlist?list=PLPwbI_iIX3aQCRdFGM7j4TY_7STfv2aXX
🔗 AI for Coding Playlist: 📚 - https://www.youtube.com/playlist?list=PLPwbI_iIX3aSlUmRtYPfbQHt4n0YaX0qw

- - - - - - - - -

❤️ SUBSCRIBE TO MY YOUTUBE CHANNEL 📺
Click here to subscribe: https://www.youtube.com/@AZisk?sub_confirmation=1

- - - - - - - - -

Join this channel to get access to perks:
https://www.youtube.com/channel/UCajiMK_CY9icRhLepS8_3ug/join

- - - - - - - - -

📱 ALEX ON X: https://twitter.com/digitalix
Donato's channel: https://www.youtube.com/@donatocapitella

#macmini #llm #nvidia

Runtime 00:22:26

software developer, programmer, software development, programming, developer, developer tests, m3 chip, machine learning, llm, m3max, m3 machine learning, m3 ai, webui, open webui, local ai, gmktec, nuc, beelink, mini pc, m4 pro, mac mini, apple, apple mini, mini, m4 mini, ryzen, AI Max+ 395, ollama, comfy ui, tiiny, tiiny pocket, tiiny pocket lab, pocket lab, macbook, macbook neo, turboquant, quantization, nvidia, mac nvidia, apple nvidia, strix halo, minisforum, ms-s1 max, framework
