Author's Note: This post is free because I think it's a super important topic that is overlooked. If you want to support this newsletter so I can put more time into creating better-quality articles on AI augmentation, please consider becoming a paid subscriber.
The rapid advancement of AI is about to change the world as we know it. The more research I do, the stronger that conviction becomes.
To navigate this transformative era, it's crucial to understand two fundamental aspects of AI:
Its incredible potential
Its accelerating timeline
Why This Understanding Of AI Matters
Your grasp of AI's potential and timeline directly shapes your strategies for the future. If AI were progressing slowly with limited impact, it might not warrant much attention. However, if it were evolving at breakneck speed with transformative power, it would demand a radical shift in your approach to life and work.
The AI Scaling Laws and Moore's Law
Understanding timelines and capabilities is why I've dedicated so much time to studying two things: the AI scaling laws and Moore's Law.
These principles shed light on the significance of the recent launch of ChatGPT o1-preview, OpenAI's most advanced model. This announcement is a bigger deal than most people realize.
ChatGPT o1-preview: A Game Changer
Traditional AI Scaling: The Old Way
Traditionally, AI got smarter by:
Using more computing power
Increasing model size (parameters)
Training on more data
This scaling happened during the training phase, before the AI was ready to use.
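As a rough sketch, the training-time relationship can be written as a power law in model size and data. The constants below follow the general shape of the Chinchilla fit, but treat them as illustrative assumptions rather than authoritative values:

```python
# Illustrative Chinchilla-style training scaling law: predicted loss
# falls as a power law in parameter count (N) and training tokens (D).
# Constants are rough values in the spirit of the published fit --
# illustrative only, not the exact fitted numbers.

def training_loss(n_params: float, n_tokens: float) -> float:
    E, A, B = 1.69, 406.4, 410.7   # irreducible loss and fit constants (assumed)
    alpha, beta = 0.34, 0.28       # power-law exponents (assumed)
    return E + A / n_params**alpha + B / n_tokens**beta

# Scaling up both parameters and data lowers the predicted loss:
small = training_loss(1e9, 2e10)     # ~1B params, ~20B tokens
large = training_loss(7e10, 1.4e12)  # ~70B params, ~1.4T tokens
print(small > large)  # the larger, longer-trained model has lower loss
```

The key point is the diminishing-returns shape: each constant-factor loss improvement requires multiplying compute, which is why this "old way" of scaling is so expensive.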
The o1-preview Breakthrough: Scaling During Use
ChatGPT o1-preview introduces a game-changing approach:
Scaling happens during the inference phase (when you're actually using the AI)
The AI "thinks" before answering, like a human taking time to ponder
How It Works
You ask a question
Instead of answering immediately, the AI spends time reasoning (10-100 seconds)
This extra "thinking time" leads to more insightful answers
The New Scaling Law: More Compute = Smarter AI
As computers get faster and more efficient:
AI can "think" longer for the same cost
Example: the amount of "thinking" that costs 10 minutes of compute today could, for the same cost, become the equivalent of 10 years of thinking in the future
Potential Impact
Tasks that take humans days could be done by AI in seconds
More compute → longer reasoning time → smarter, more capable AI
The path to AI agents that can go out into the world and act on our behalf becomes more feasible.
The Big Picture Implications
Compute is a critical resource for both scaling laws. This is why hundreds of billions of dollars are being spent on ever more advanced chips. According to Jensen Huang, CEO of NVIDIA, the largest chip company in the world (see video below), three compounding exponentials (Moore's Law, the training scaling law, and the inference scaling law) yield an effective compute improvement rate of 100,000x per 10-year period. To put the power of compute in context, Sam Altman, CEO of OpenAI, says: "What are the limitations of GPT? There are many questions as to whether a limit exists, but I will confidently say 'no.' We are confident that there are no limits to the GPT model and that, if sufficient computational resources are invested, it will not be difficult to build AGI that surpasses humans."
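Huang's 100,000x-per-decade figure is easier to feel when converted to an annual rate. A quick calculation shows the implied per-year multiplier:

```python
# 100,000x improvement per 10 years implies the annual growth factor
# is the 10th root of 100,000 -- roughly 3.16x every single year,
# far beyond classic Moore's Law doubling every ~2 years.
decade_factor = 100_000
annual_factor = decade_factor ** (1 / 10)
print(round(annual_factor, 2))  # → 3.16
```

In other words, if Huang's estimate holds, effective compute more than triples every year once all three exponentials are stacked.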
AGI may come faster than we think. As my conviction about AI's timeline and capabilities grows, I'm spending more and more of my time using and thinking about AI. And I'm throwing away my long-term strategies so I can think from AI-first principles and rapidly adapt in a "hyper-change" world.
Other AI Experts Add Context
UPenn professor Ethan Mollick gives a solid overview in Scaling: The State of Play in AI:
When the o1-preview and o1-mini models from OpenAI were revealed last week, they took a fundamentally different approach to scaling… o1-preview achieves really amazing performance in narrow areas by using a new form of scaling that happens AFTER a model is trained. It turns out that inference compute - the amount of computer power spent "thinking" about a problem - also has a scaling law all its own. This "thinking" process is essentially the model performing multiple internal reasoning steps before producing an output, which can lead to more accurate responses…
Unlike your computer, which can process in the background, LLMs can only "think" when they are producing words and tokens. We have long known that one of the most effective ways to improve the accuracy of a model is through having it follow a chain of thought (prompting it, for example: first, look up the data, then consider your options, then pick the best choice, finally write up the results) because it forces the AI to "think" in steps. What OpenAI did was get the o1 models to go through just this sort of "thinking" process, producing hidden thinking tokens before giving a final answer. In doing so they revealed another scaling law - the longer a model "thinks," the better its answer is.
Just like the scaling law for training, this seems to have no limit, but also like the scaling law for training, it is exponential, so to continue to improve outputs, you need to let the AI "think" for ever longer periods of time. It makes the fictional computer in The Hitchhiker's Guide to the Galaxy, which needed 7.5 million years to figure out the ultimate answer to the ultimate question, feel more prophetic than a science fiction joke. We are in the early days of the "thinking" scaling law, but it shows a lot of promise for the future.