How Meta’s (Facebook) challenge to GPT-3 will affect you [Storytime Saturdays]
There are many sides to consider.
To learn more about the newsletter, check our detailed About Page + FAQs
To help me understand you better, please fill out this anonymous, 2-min survey. If you liked this post, make sure you hit the heart icon in this email.
Greetings all,
As I’ve been hinting at for a while, there have been some massive partnerships underway for this newsletter. Several educational institutions will be sharing my content with their students! Among these is The Plenum School, the best international boarding school in India. Those involved with the organization (including prospective parents) will now be receiving this exceptional newsletter so that they can see how the Plenum School is preparing its students to win in the rapidly changing world order.
To the Plenum Parents, welcome aboard. You’re going to love what you see. To learn more about me, go through my LinkedIn and Medium blogs. You can also go through the FAQs, linked above. To the rest of my awesome readers, I have a special gift for you. To celebrate this occasion, today’s StoryTime Saturday will be open to all.
Today’s email will cover an extremely important development in the field of AI and Machine Learning. This development affects us all but hasn’t been given enough attention. Don’t worry, you won’t need any Deep Learning expertise to understand what we will cover, how it affects you, and what steps you can take to benefit from these developments.
Highlights
This post/email will cover the following ideas:
What happened: Microsoft and OpenAI have been making a lot of headlines with GPT-3. Many people were lining up to pay for access to the model. Meta released their own version of GPT-3 completely for free.
Why this matters: Obviously this is a challenge for Microsoft’s plan to make money from GPT-3. However, that is not all. Meta sharing everything about their model is a huge change from how big tech normally operates. This shakes up the status quo.
Why you should care: I’ve been talking about ‘The Big Tech war’ for a while. All the big tech companies have monopolized their segments, and thus are now trying to encroach on each other’s turf. They are throwing money to acquire both customers and experts (developers and otherwise). By keeping your eyes open and following the advice given here, you will be able to make a lot of money and gain permanent security.
We’ll look at the story and see how it affects different parties. Sound good? Let’s get right into it.
The Story
Meta AI recently released Open Pretrained Transformer (OPT-175B), “a language model with 175 billion parameters trained on publicly available data sets”. While this might seem like another big company joining the LLM wars, the way they did it was a shock to the Machine Learning community. In their post, Democratizing access to large-scale language models with OPT-175B, Meta had the following to say:
For the first time for a language technology system of this size, the release includes both the pretrained models and the code needed to train and use them.
This is quite exciting for a lot of reasons. For one, most people have no hope of understanding what working with problems of this scale entails. Therefore, from a purely educational perspective, this will be an exceptional learning resource for anybody who goes through it (more details later). However, this also has profound implications for the Deep Learning industry, ones that many people haven’t thought about.
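To get a feel for what “this scale” means, here is a rough back-of-the-envelope sketch in Python. It only estimates the memory needed to hold the model’s weights. The 175 billion figure comes from Meta’s announcement; the bytes-per-parameter values (2 for 16-bit numbers, 4 for 32-bit numbers) are my assumptions for illustration, not details taken from Meta’s post, and the function name is made up for this example.

```python
def weight_memory_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Return the memory needed just to store the weights, in GB (10**9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Parameter count quoted in Meta's announcement.
OPT_175B_PARAMETERS = 175_000_000_000

# Assumption: 2 bytes per parameter (16-bit) or 4 bytes per parameter (32-bit).
fp16_gb = weight_memory_gb(OPT_175B_PARAMETERS, bytes_per_parameter=2)
fp32_gb = weight_memory_gb(OPT_175B_PARAMETERS, bytes_per_parameter=4)

print(f"16-bit weights: {fp16_gb:.0f} GB")  # 350 GB
print(f"32-bit weights: {fp32_gb:.0f} GB")  # 700 GB
```

Even under the cheaper assumption, the weights alone need hundreds of gigabytes, far more than any single consumer graphics card holds. Training needs considerably more memory on top of that, which is why so few groups can work at this scale.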
The assumed monetization strategy for the complex LLMs was simple. They could seriously boost productivity, so they could have been sold as APIs either directly or embedded in another service. This article will cover how Meta releasing their model seriously changes the landscape of the industry. To fully understand the implications, let’s first understand the context around this.
Background- LLMs and Machine Learning
The introduction of LLMs (Large Language Models) has been a game-changer. Large language models, meaning natural language processing (NLP) systems with more than 100 billion parameters, have produced insane results in NLP and AI research over the last few years. I have covered some of them in my content. The video Machine Learning News you must know- April 2022 and my article, Google AI sparks a revolution in Machine Learning, are some recent examples.
Most popular among these is the legendary GPT-3 model by OpenAI. The trend-setter, GPT-3, really showed us the potential of using Transformers and large datasets to achieve strong performance on a variety of complex tasks. It caused shockwaves in the mainstream narrative when GitHub and OpenAI debuted GitHub Copilot, an AI code completion service powered by a descendant of GPT-3.
Since then, GPT-3 has gone on to add tons of new abilities, including editing text (including code) in particular styles and error-correcting. In early April, OpenAI made waves again when it unveiled DALL-E 2, a Deep Learning model that can generate images from text descriptions.
However, Meta AI’s decision to completely open up its models has hijacked the discourse recently. As promised, I will go into some of the interesting aspects of the resources they shared and why this is a game-changer.
Important Talking Points
This decision affects several stakeholders in different ways. Here are a few:
Researchers and other people looking to learn from this
Meta itself
OpenAI
The ML/Software Development industry
Educational/Research Impact
This is a huge win for researchers or anyone looking to learn about Machine Learning. Most notably, it is an antidote to the replication crisis in Machine Learning. I have covered AI’s replication crisis in this article. In a nutshell, much of Machine Learning research is impossible or impractical to reproduce and verify. When it comes to the big companies, like Facebook, Google, and Microsoft, much of this occurs because they are able to train models at a scale that no one else can replicate.
This becomes a problem since it makes it impossible for outsiders to break down their findings and find flaws in their methodology. It also severely limits the amount of meaningful discussion we can have surrounding a paper/finding when we can’t dig into the nuances of its setup.
However, that is not all that makes this a big win for Machine Learning Education. When Meta released their code, they also released a lot of other resources. These resources detail the various facets of their large-scale system. My personal recommendation is to read through their Chronicles of OPT-175B training. They detail a lot of the challenges they went through as they were training at this insane scale. Take a look at the following section:
It’s been really rough for the team since the November 17th update. Since then, we’ve had 40+ restarts in the 175B experiment for a variety of hardware, infrastructure, or experimental stability issues.
The vast majority of restarts have been due to hardware failures and the lack of ability to provision a sufficient number of “buffer” nodes to replace a bad node with once it goes down with ECC errors.
Taken from their log, Update on 175B Training Run: 27% through
This was an amazing decision taken by the Meta AI team. Reading through these has been interesting, and for anybody who wants to get into Large Scale Deep Learning, understanding their challenges is a must. From a research/education perspective, this publication is a huge win.
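To make the restarts from the log excerpt above more concrete, here is a toy simulation in Python. It is not Meta’s training code. It is a minimal sketch of one common way long-running jobs cope with failures: saving a checkpoint every so often and rolling back to the last one when something breaks. The function name, the failure rate, and the checkpoint interval are all hypothetical values I picked for illustration.

```python
import random


def simulate_training(total_steps, checkpoint_every, failure_rate, seed=0):
    """Simulate a run that rolls back to its last checkpoint on every failure.

    Returns (restarts, wasted_steps), where wasted_steps counts work that was
    completed but lost because it had not been checkpointed yet.
    failure_rate must be well below 1, otherwise the run never finishes.
    """
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    step = 0
    last_checkpoint = 0
    restarts = 0
    wasted_steps = 0

    while step < total_steps:
        if rng.random() < failure_rate:
            # A simulated node failure: lose everything since the last save.
            restarts += 1
            wasted_steps += step - last_checkpoint
            step = last_checkpoint
            continue
        step += 1  # one unit of training work succeeded
        if step % checkpoint_every == 0:
            last_checkpoint = step  # stands in for "save the model to disk"

    return restarts, wasted_steps


# Hypothetical numbers: 1,000 steps, a save every 50 steps, 1% failure chance.
restarts, wasted = simulate_training(
    total_steps=1_000, checkpoint_every=50, failure_rate=0.01
)
print(f"restarts: {restarts}, steps redone: {wasted}")
```

The takeaway is that every failure costs the work done since the last save. Saving more often wastes less work per failure, but each save takes time of its own, and that trade-off gets painful when a single run spans a huge number of machines.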
Impact on Meta
The impact of this on Meta is going to be harder to evaluate. Releasing the model in this way allowed them to gain a lot of positive publicity. And the model being released for free also means that people are now much less likely to use paid models from their competitors. This is an edge by itself.
This process also has two other notable advantages. Firstly, since the model is open, people can discover areas for improvement. This facet of open-source culture is responsible for much of the explosive growth of tech over the last 2 decades. It gives Meta access to potentially millions of hours of free debugging/testing done by the community. Meta also gets a lot of insight into which facets the ML community finds most important and engages with the most.
The second advantage is familiarity with Meta’s tech and tools. This is something that a lot of people overlook. Let’s take the example of TensorFlow, by Google. Most serious ML practitioners are proficient with it. This makes it easy for Google to hire ML engineers since most developers will be familiar with the tech. The amount of resources they need to spend training new engineers thus goes down drastically. Releasing OPT openly could give Meta a similar advantage with its own tools.
All of these are huge positives. However, they are offset by a huge problem. Training such a model was extremely costly. Giving the whole thing away for free will have a lot of financial implications. While it puts a damper on OpenAI and their monetization of GPT-3, it is also going to make it harder for Meta to monetize such a model in the future. However, Meta was wildly profitable last year, so perhaps the pros outweigh this.
Impact on OpenAI
This is a huge L for OpenAI. We have already covered how this will take away a large chunk of their potential customers. It seems like Meta AI has recently decided to pick a fight with OpenAI’s products. Between Make-A-Scene, their work modernizing CNNs to match Vision Transformers, and OPT, many of their recent releases compete directly with OpenAI products.
We developed OPT-175B with energy efficiency in mind by successfully training a model of this size using only 1/7th the carbon footprint as that of GPT-3
The industry
The Machine Learning industry is definitely licking its lips at this development. For the reasons already mentioned, this is a huge win for AI researchers and developers. This is indirectly a win for the industry.
There are two ways that this situation can play out:
Other tech companies join this trend and they start undercutting each other to gain an edge in the market. Economics tells us that this is amazing for consumers (us).
Business as usual in the industry. The other big companies don’t take the bait. For all the reasons mentioned earlier, this is already a huge win for consumers.
What is often lost when we cover important Machine Learning research is that most of the industry consists of small to medium-sized companies/groups solving very specific problems. While this puts pressure on the big tech companies, this is overwhelmingly a win for the smaller companies, since they get to learn from and use the insights generated from these massive companies without having to churn through the resources themselves. Therefore, this is a huge win for the industry as a whole.
This is also an indication of the increasing competition amongst all the big tech companies. They’re all fighting each other to gain new customers/avenues for monetization. This means that they will spend a lot to acquire domain experts and great developers to gain an edge. To those of you looking to develop your foundations in Tech, Computer Science, and Software Engineering, a premium subscription to this newsletter is one of the best ways to do so. You can get a 60-Day free trial, by using this link.
This will put upward pressure on the tech industry as a whole. We are likely to see some amazing products using all the technology that will be released. Keep your eyes open, because your quality of life is about to explode. I’m already excited about all the amazing apps and services we’re about to see.
Make sure you share your thoughts, questions, or any interesting developments in the comments or through my links. If you liked this post, make sure you tap that heart button and let the world know.
Reach out to me on:
Instagram: https://www.instagram.com/iseethings404/
Message me on Twitter: https://twitter.com/Machine01776819
My LinkedIn: https://www.linkedin.com/in/devansh-devansh-516004168/
My content:
Read my articles: https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd