
Apple Unveils Open AI Models That Rival Mistral and Outperform Hugging Face’s SmolLM

Merima Hadžić

July 24, 2024

Apple has released a new family of open-source AI models, named DCLM, on Hugging Face. The release includes two primary models, one with 7 billion parameters and one with 1.4 billion parameters, and both perform strongly on benchmarks such as MMLU.

Details of Apple DCLM Models:

The DCLM-7B model, featuring 7 billion parameters and a 2K context window (extendable to 8K), was trained on 2.5 trillion tokens and achieved a 5-shot accuracy of 63.7% on the MMLU benchmark. It is available under Apple’s Sample Code License. The DCLM-1.4B model, with 1.4 billion parameters and a 2K context window, was trained on 2.6 trillion tokens and scored 41.9% on the same benchmark. This smaller model is released under the Apache 2.0 license, allowing for commercial use and modification.

Apple’s new DCLM models were developed as part of the DataComp for Language Models project, a collaborative effort involving researchers from Apple, the University of Washington, Tel Aviv University, and Toyota Research Institute. The project aims to create high-quality datasets for training AI models, focusing on data curation strategies that improve model performance.

With its 63.7% 5-shot MMLU score, DCLM-7B outperforms previous state-of-the-art open-data language models such as MAP-Neo, and its performance is close to that of leading open models such as Mistral-7B-v0.3 and Llama 3 8B. The 8K context window was obtained through additional training on the same dataset.

The DCLM-1.4B model, trained jointly with Toyota Research Institute, scored 41.9% on the 5-shot MMLU test, ahead of other models in its size category, including Hugging Face’s SmolLM and Qwen-1.5B.
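
For readers unfamiliar with the metric, a 5-shot MMLU score is the share of multiple-choice questions a model answers correctly when each prompt first shows five solved examples. The sketch below illustrates the idea in Python. It is not the evaluation harness used for DCLM: the `Question` class, the helper names, the prompt layout, and the `predict` callable standing in for a model are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

LETTERS = "ABCD"


@dataclass
class Question:
    """Hypothetical container for one multiple-choice item."""
    text: str
    choices: List[str]  # four options, labelled A to D in order
    answer: str         # letter of the correct option


def format_question(q: Question, include_answer: bool) -> str:
    """Render a question, with or without its answer letter."""
    lines = [q.text]
    for letter, choice in zip(LETTERS, q.choices):
        lines.append(f"{letter}. {choice}")
    lines.append(f"Answer: {q.answer}" if include_answer else "Answer:")
    return "\n".join(lines)


def build_few_shot_prompt(shots: List[Question], target: Question, k: int = 5) -> str:
    """Prefix the target question with k solved examples."""
    blocks = [format_question(s, include_answer=True) for s in shots[:k]]
    blocks.append(format_question(target, include_answer=False))
    return "\n\n".join(blocks)


def few_shot_accuracy(
    predict: Callable[[str], str],  # assumed: maps a prompt to the model's reply
    shots: List[Question],
    test_set: List[Question],
    k: int = 5,
) -> float:
    """Fraction of test questions whose reply starts with the right letter."""
    correct = 0
    for q in test_set:
        reply = predict(build_few_shot_prompt(shots, q, k))
        if reply.strip().upper().startswith(q.answer):
            correct += 1
    return correct / len(test_set)
```

Scores are reported as percentages, so a return value of 0.637 from a function like `few_shot_accuracy` would be written as 63.7%.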

The two models carry different licenses: the larger is available under Apple’s Sample Code License, and the smaller under Apache 2.0, which allows commercial use and modification. An instruction-tuned version of the 7B model is also available on Hugging Face.

VCNN is your primary source for all venture capital news. We provide the latest breaking news and insider stories straight from the venture capital scene.

© 2016 – 2024 VCNewsnetwork