d-Matrix has launched Corsair, an AI inference platform aimed at generative AI applications. The company says the processor generates tokens fast enough to make interactive AI tasks commercially viable, particularly when serving large models.
Corsair is rated at 60,000 tokens per second at one millisecond per token. That performance is backed by 2,400 TFLOPS of peak 8-bit compute. With 2GB of integrated performance memory and 256GB of off-chip memory capacity, Corsair is purpose-built for demanding AI inference workloads.
Corsair is built on d-Matrix’s Nighthawk and Jayhawk II tiles, which are manufactured on a 6nm process. Each Nighthawk tile combines four neural cores with a RISC-V CPU and is optimized for large-model inference. The design uses digital in-memory computation (DIMC) and supports several datatypes, including block floating point (BFP).
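For readers unfamiliar with block floating point, the idea is that a group of values shares one exponent while each value keeps its own small integer mantissa. The sketch below illustrates that general technique only. The article does not describe d-Matrix's BFP format, so the block size, the 8-bit mantissa width, the rounding rule, and the function names `bfp_quantize` and `bfp_dequantize` are all assumptions made for illustration.

```python
import math


def bfp_quantize(block, mantissa_bits=8):
    """Quantize a block of floats to (shared_exponent, integer mantissas).

    Generic illustration of block floating point; the mantissa width and
    rounding are assumptions, not d-Matrix's actual format.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp returns (m, e) with max_abs == m * 2**e and 0.5 <= m < 1.
    _, e = math.frexp(max_abs)
    # Pick the shared exponent so the largest value fits the signed mantissa.
    shared_exp = e - (mantissa_bits - 1)
    limit = 2 ** (mantissa_bits - 1) - 1
    scale = 2.0 ** shared_exp
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return shared_exp, mantissas


def bfp_dequantize(shared_exp, mantissas):
    """Reconstruct approximate floats from a shared exponent and mantissas."""
    scale = 2.0 ** shared_exp
    return [m * scale for m in mantissas]


if __name__ == "__main__":
    exp, mants = bfp_quantize([0.5, -0.25, 0.125, 1.0])
    print(exp, mants)                    # -6 [32, -16, 8, 64]
    print(bfp_dequantize(exp, mants))    # [0.5, -0.25, 0.125, 1.0]
```

Because the exponent is stored once per block rather than once per value, the per-value arithmetic reduces to integer operations, which is the usual motivation for the format in inference hardware.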
Corsair ships as a PCIe Gen5 full-height, full-length card. It integrates with DMX Bridge cards, which allows larger deployments to scale across multiple cards. The chiplet packaging places memory and compute close together to improve efficiency.
Corsair has the backing of Microsoft, which is supporting its rollout to early-access customers. Broader availability is expected in the second quarter of 2025.
Sid Sheth, co-founder and CEO of d-Matrix, emphasized the company’s vision for Corsair. He stated:
“Our vision for d-Matrix was to address the massive computing challenges of generative AI and transformers.”
Micron Technology is also collaborating with d-Matrix on Corsair’s development and expansion.
For larger-scale deployments, Corsair can generate 30,000 tokens per second at two milliseconds per token on a single rack, which makes it an option for organizations running generative AI at scale.
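The throughput and latency figures quoted in this article can look inconsistent at first glance, since one millisecond per token is only 1,000 tokens per second for a single stream. One plausible reading, which the article does not confirm, is that tokens per second is aggregate throughput across many concurrent streams while milliseconds per token is the latency each stream sees. Under that assumption, the short calculation below shows how many concurrent streams each figure would imply. The function name `implied_concurrent_streams` is hypothetical.

```python
def implied_concurrent_streams(aggregate_tokens_per_s, latency_ms_per_token):
    """Streams needed to reach the aggregate rate, assuming each stream
    emits exactly one token per latency interval (an assumed model)."""
    per_stream_tokens_per_s = 1000.0 / latency_ms_per_token
    return aggregate_tokens_per_s / per_stream_tokens_per_s


if __name__ == "__main__":
    # Figures as quoted in the article.
    print(implied_concurrent_streams(60_000, 1.0))  # 60.0
    print(implied_concurrent_streams(30_000, 2.0))  # 60.0
```

Under this reading, both quoted figures correspond to roughly 60 concurrent streams, with the rack-level figure trading per-stream speed (500 tokens per second instead of 1,000) for a different deployment configuration.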
Featured image courtesy of ITPro