MLNews

Mistral AI makes its first large language model available to the public for free.

Although the most popular language models are accessible via API, open models — as far as that term can be taken seriously — are gaining traction. Mistral, a French AI firm that raised a large seed round in June, has recently unveiled its first model, which it claims outperforms others of its size – and it’s completely free to use.

The Mistral 7B model is now available for download in a variety of formats, including a 13.4-gigabyte torrent (with a few hundred seeders). For collaboration and troubleshooting, the company has also launched a GitHub repository and a Discord channel.

The model is distributed under the Apache 2.0 license, a highly permissive license that places no limits on use or copying beyond attribution. That is, the model can be used by anyone, whether a hobbyist, a multibillion-dollar enterprise, or the Pentagon, as long as they have a machine capable of running it locally or are willing to pay for the necessary cloud resources.

Mistral 7B is a further refinement of other “small” large language models such as Llama 2, offering comparable capabilities (according to some common benchmarks) at a significantly lower compute cost. Foundation models such as GPT-4 can accomplish much more, but they are far more expensive and complicated to run, which is why they are available only via APIs or remote access.

“Our ambition is to become the leading supporter of the open generative AI community, and to bring open models to state-of-the-art performance,” Mistral’s team stated in a blog post accompanying the model’s release. “The performance of the Mistral 7B demonstrates what small models can do with enough conviction. This is the culmination of three months of hard work in which we established the Mistral AI team, rebuilt a high-performance MLops stack from scratch, and designed the most advanced data processing pipeline.”

That list may seem like more than three months’ effort to some (maybe most), but the creators had a head start because they had worked on similar models at Meta and Google DeepMind. That doesn’t make it any easier, but they knew what they were doing.

Of course, while anybody may download and use it, it is not “open source” or any variation of that term, as we addressed last week at Disrupt. Though the license is highly permissive, the model was built privately, with private funds, and the training datasets remain private.

That looks to be Mistral’s business plan: the released model is free to use, but if you want to dig deeper, you’ll need their commercial offering.

“[Our commercial offering] will be distributed as white-box solutions, with access to both weights and code sources. We are actively developing hosted solutions and dedicated deployment for enterprises,” according to the blog post.

I’ve asked Mistral to clarify some of the openness questions and their plans for future releases, and I’ll update this page if I hear back.

Reference

TechCrunch.com

