Seven Free Open Source GPT Models Released
April 7, 2023
Silicon Valley AI company Cerebras has released seven open source GPT models to provide an alternative to the tightly controlled, proprietary systems available today.
The royalty-free open source GPT models, including the weights and training recipe, were released under the highly permissive Apache 2.0 license by Cerebras, a Silicon Valley-based company that builds AI infrastructure for AI applications.
To some extent, the seven GPT models are a proof of concept for the Cerebras Andromeda AI supercomputer.
The Cerebras infrastructure allows its customers, such as the AI copywriting tool Jasper, to quickly train their own custom language models.
A Cerebras blog post about the hardware technology noted:
“We trained all Cerebras-GPT models on a 16x CS-2 Cerebras Wafer-Scale Cluster called Andromeda.
The cluster enabled all experiments to be completed quickly, without the traditional distributed systems engineering and model parallel tuning needed on GPU clusters.
Most importantly, it enabled our researchers to focus on the design of the ML instead of the distributed system. We believe the ability to easily train large models is a key enabler for the broad community, so we have made the Cerebras Wafer-Scale Cluster available on the cloud through the Cerebras AI Model Studio.”
Cerebras GPT Models and Transparency
Cerebras cites the concentration of AI technology ownership in just a few companies as a reason for creating the seven open source GPT models.
OpenAI, Meta, and DeepMind keep a large amount of information about their systems private and tightly controlled, which limits innovation to whatever those three corporations decide others can do with their data.
Is a closed-source approach best for innovation in AI? Or is open source the future?
Cerebras writes:
“For LLMs to be an open and accessible technology, we believe it’s important to have access to state-of-the-art models that are open, reproducible, and royalty free for both research and commercial applications.
To that end, we have trained a family of transformer models using the latest techniques and open datasets that we call Cerebras-GPT.
These models are the first family of GPT models trained using the Chinchilla formula and released via the Apache 2.0 license.”
Accordingly, the seven models have been released on Hugging Face and GitHub to encourage more research through open access to AI technology.
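Because the weights are published under the Apache 2.0 license on Hugging Face, any of the checkpoints can be loaded with the standard transformers library. The snippet below is a minimal sketch that assumes the models are hosted under a “cerebras” organization with IDs such as cerebras/Cerebras-GPT-111M; verify the exact identifiers on the Hugging Face hub before running it.

```python
# Minimal sketch: load one of the released checkpoints from Hugging Face.
# The model ID below is an assumption based on the announced naming scheme;
# check the Cerebras organization page on the hub for the exact identifiers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cerebras/Cerebras-GPT-111M"  # smallest of the seven released sizes

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation to confirm the weights load and run locally.
inputs = tokenizer("Open source language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```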
The models were trained on Cerebras’ Andromeda AI supercomputer, a process that took only weeks to complete.
Cerebras-GPT is fully open and transparent, unlike the latest GPT models from OpenAI (GPT-4), DeepMind, and Meta (OPT).
OpenAI and DeepMind’s Chinchilla do not offer licenses to use the models. Meta’s OPT only offers a non-commercial license.
OpenAI’s GPT-4 has absolutely no transparency about its training data. Did they use Common Crawl data? Did they scrape the web and create their own dataset?
OpenAI is keeping this information (and more) secret, in contrast to the fully transparent Cerebras-GPT approach.
The following is all open and transparent:
- Model architecture
- Training data
- Model weights
- Checkpoints
- Compute-optimal training status (yes)
- License to use: Apache 2.0 License
The seven versions come in 111M, 256M, 590M, 1.3B, 2.7B, 6.7B, and 13B parameter sizes.
The announcement states:
“In a first among AI hardware companies, Cerebras researchers trained, on the Andromeda AI supercomputer, a series of seven GPT models with 111M, 256M, 590M, 1.3B, 2.7B, 6.7B, and 13B parameters.
Typically a multi-month undertaking, this work was completed in a few weeks thanks to the incredible speed of the Cerebras CS-2 systems that make up Andromeda, and the ability of Cerebras’ weight streaming architecture to eliminate the pain of distributed compute.
These results demonstrate that Cerebras’ systems can train the largest and most complex AI workloads today.
This is the first time a suite of GPT models, trained using state-of-the-art training efficiency techniques, has been made public.
These models are trained to the highest accuracy for a given compute budget (i.e. training efficient using the Chinchilla recipe) so they have lower training time, lower training cost, and use less energy than any existing public models.”
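The “Chinchilla recipe” referenced in the announcement is the compute-optimal scaling result from DeepMind’s Chinchilla paper, which found that for a fixed compute budget a model should be trained on roughly 20 tokens per parameter rather than simply being made as large as possible. The short sketch below applies that rule of thumb to the seven released sizes; the token counts are illustrative estimates from the heuristic, not figures taken from the Cerebras announcement.

```python
# Illustrative estimate of compute-optimal training data using the Chinchilla
# rule of thumb (~20 training tokens per parameter). These are heuristic
# numbers, not official figures from Cerebras.
TOKENS_PER_PARAM = 20

model_sizes = {
    "111M": 111e6,
    "256M": 256e6,
    "590M": 590e6,
    "1.3B": 1.3e9,
    "2.7B": 2.7e9,
    "6.7B": 6.7e9,
    "13B": 13e9,
}

for name, params in model_sizes.items():
    tokens_b = params * TOKENS_PER_PARAM / 1e9
    print(f"Cerebras-GPT-{name}: ~{tokens_b:.0f}B tokens for compute-optimal training")
```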
Open Source AI
The Mozilla Foundation, maker of the open source Firefox browser, has started a company called Mozilla.ai to build open source GPT and recommender systems that are trustworthy and respect privacy.
Databricks also recently released an open source GPT clone called Dolly, which aims to democratize “the magic of ChatGPT.”
In addition to the seven Cerebras GPT models, another company, Nomic AI, released GPT4All, an open source GPT that can run on a laptop.
Today we’re releasing GPT4All, an assistant-style chatbot distilled from 430k GPT-3.5-Turbo outputs that you can run on your laptop. pic.twitter.com/VzvRYPLfoY
— Nomic AI (@nomic_ai) March 28, 2023
The open source AI movement is at a nascent stage but is gaining momentum.
GPT technology is giving rise to massive changes across industries, and it is possible, perhaps inevitable, that open source contributions will change the face of the industries driving that change.
If the open source movement keeps advancing at this pace, we may be on the cusp of witnessing a shift in AI innovation that keeps it from concentrating in the hands of a few corporations.
Read the official announcement:
Cerebras Systems Releases Seven New GPT Models Trained on CS-2 Wafer-Scale Systems
Featured image by Shutterstock/Merkushev Vasiliy