YouTube video summary

Why The Next AI Breakthroughs Will Be In Reasoning, Not Scaling

Artificial intelligence · 14 Nov 2024 · 13 min summary · From Y Combinator (YouTube)

Intro 0s

  • A conversation about achieving Artificial General Intelligence (AGI) took place around a year ago, with one argument suggesting that AI will eventually become capable of designing better chips than humans, eliminating a bottleneck to greater intelligence 5s.
  • The idea is that this development will put us on a pathway to AGI in a way that wasn't possible before 15s.
  • In a previous episode, the topic of discussion was what to do with two more orders of magnitude, but since then, Sam has expressed a desire to go to four orders of magnitude 25s.
  • Currently, AI models are rapidly improving, with capabilities emerging that weren't possible a month ago 37s.
  • This rapid progress is considered a significant moment in history 43s.
  • The hosts of the podcast are Garry, Jared, Harj, and Diana, all affiliated with Y Combinator, which funds hundreds of companies every year and has funded companies worth over $600 billion 58s.
  • Recently, Sam Altman wrote an article that is relevant to the topic of discussion 1m14s.

The intelligence age 1m15s

  • A wild essay predicted that Artificial General Intelligence (AGI) and Artificial Superintelligence (ASI) are coming within thousands of days, with an estimated timeframe of 4 to 15 years 1m22s.
  • The essay's ideas are similar to those Sam Altman, co-founder of OpenAI, discussed in 2015, which at the time sounded like the ideas of a "crazy person" but now seem plausible 1m51s.
  • In 2015, Sam Altman believed that AGI would be better at doing science than humans and would accelerate the rate of scientific progress in every field 2m57s.
  • One of the motivations behind OpenAI was to create an AI that could accelerate scientific progress, and this idea is still connected to the current work on advanced reasoning capabilities 3m10s.
  • The development of advanced reasoning capabilities is crucial for AI to be able to do science and accelerate technological progress 3m32s.
  • The paper on o1, a model developed by OpenAI, highlights its capabilities and potential for the future, including its ability to perform well in chip design 3m49s.
  • The ability of AI to design chips better than humans could potentially eliminate one of its bottlenecks for getting greater intelligence 4m1s.
  • The current progress in AI development, including the capabilities of o1, suggests that we are on the pathway to achieving AGI 4m10s.

YC o1 hackathon 4m18s

  • Diode Computer is a company that builds AI designers for circuit design, and their previous product could handle PCB design, which involves four major steps: system design, component selection, schematic design, and layout and routing 4m37s.
  • The company's previous product could automate schematic design and, to some extent, routing, but it was not able to handle system design and component selection 5m34s.
  • The company has now demonstrated a new version built on o1 that can automate system design and component selection, reading data sheets and selecting the right components for a specific project 5m50s.
  • Given high-level constraints, such as building a wearable heart rate monitor, the o1-based system can pick the specific components needed, including a microcontroller, accelerometer, and heart rate sensor 6m7s.
  • The product can then output a system diagram and generate code in a language called atopile, which can be used to build a PCB 6m57s.
  • That output can be used to generate a board layout, which can then be fine-tuned into a fully working printed circuit board 7m12s.
  • The company's system can also call an auto-router on the specific board, allowing for the creation of a fully working PCB 7m41s.
  • The o1-based workflow goes beyond the traditional EDA (Electronic Design Automation) process, which involves design, simulation, and bug verification, by automating the entire process from system design to component selection and layout 7m58s.
  • The workflow used different models for different tasks, such as GPT-4o mini for PDF extraction and o1 for reasoning about which components to select before placing them on the board; mixing models this way is a common pattern in building interesting products with AI 8m26s.
  • Selecting components, such as servos, motors, and sensors, requires a lot of thinking and is a hard task for humans, making it a suitable application for AI models like o1 9m12s.
  • Diode tried to use GPT-4o for component selection and failed, then succeeded with o1 on the same task, demonstrating a step-function capability unlock 9m25s.
  • A hackathon organized by Diana featured actual YC-funded startups building features for their products using o1, showcasing how the model can unlock capabilities for real companies 9m52s.
  • Camfer uses o1 to create CAD designs from natural language input, allowing users to design complex parts like airfoils optimized for specific conditions without requiring extensive mechanical engineering knowledge 10m34s.
  • Camfer's system can run multiple simulations simultaneously and solve partial differential equations, making it a co-pilot for SolidWorks and allowing users to design complex systems with ease 11m13s.
  • The system can even write and solve equations, such as the Navier-Stokes equations, to solve airfoil design problems, demonstrating the capabilities of o1 in reasoning and problem-solving 11m39s.
  • Sam wants to scale up by four orders of magnitude, toward a trillion-dollar spend, which is considered a significant and ambitious goal 12m0s.
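
The multi-model pattern mentioned above (a cheaper model for extraction, a reasoning model for selection) can be sketched roughly as follows. This is a minimal illustration, not Diode's actual implementation: the `Step` class, `run_pipeline`, the prompt templates, and the two stand-in model functions are all hypothetical, and the "models" are local stubs rather than real API calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A "model" is any callable from prompt text to response text. In a real
# system these would wrap API clients; here they are local stand-ins.
Model = Callable[[str], str]


@dataclass
class Step:
    name: str             # label for this stage of the workflow
    model: str            # key into the model registry
    prompt_template: str  # must contain the placeholder {input}


def run_pipeline(steps: List[Step], models: Dict[str, Model],
                 initial_input: str) -> Dict[str, str]:
    """Run steps in order, feeding each step's output into the next one."""
    outputs: Dict[str, str] = {}
    current = initial_input
    for step in steps:
        prompt = step.prompt_template.format(input=current)
        current = models[step.model](prompt)
        outputs[step.name] = current
    return outputs


# Hypothetical stand-ins: a cheap extraction model and a reasoning model.
def fast_model(prompt: str) -> str:
    return f"[fast] {prompt}"


def reasoning_model(prompt: str) -> str:
    return f"[reasoning] {prompt}"


MODELS: Dict[str, Model] = {"fast": fast_model, "reasoning": reasoning_model}

STEPS = [
    Step("extract", "fast", "Extract specs from: {input}"),
    Step("select", "reasoning", "Choose components given: {input}"),
]

result = run_pipeline(STEPS, MODELS, "datasheet text")
print(result["select"])
```

Keeping the model choice as data on each step makes it easy to swap a step to a different model and compare results without touching the rest of the workflow.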

4 orders of magnitude 12m9s

  • Abstracting complex concepts, such as understanding the nature of physics, could be possible if scaling laws hold, making it plausible to tackle difficult engineering challenges like room temperature fusion, weather prediction, and complex physical phenomena 12m21s.
  • These complex physical phenomena are hard to solve and typically require PhDs, but advancements in AI, particularly with Chain of Thought and reasoning, could lead to breakthroughs in these areas 12m57s.
  • The concept of providing feedback not just on outputs, but on the steps to get there, is a key idea in teaching models how to think, allowing for fine-tuning of various steps to ensure the model thinks as desired 13m21s.
  • This approach is similar to the AGI conversations, which focus on teaching models to think better, rather than just producing correct answers, and is made possible by the scaling laws, which provide more surface area for throwing compute at the problem 13m47s.
  • The scaling laws enable iterative improvement of results by spending more money and time, similar to what might be expected from a human scientific organization, and potentially even more consistently 14m10s.
  • The architecture of the AI model is inspired by previous work dating back to the beginning of OpenAI, and has been developed over many years 14m34s.
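
The difference between feedback on outputs and feedback on the steps can be illustrated with a toy example. This is only a sketch under simplifying assumptions: the function names are hypothetical, the "reasoning steps" are simple arithmetic strings, and the checker is a hand-written rule rather than a learned reward model. It does not describe how o1 is actually trained.

```python
import operator
from typing import Callable, List, Optional

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def outcome_reward(final_answer: str, expected: str) -> float:
    """Feedback on the output only: 1.0 if the final answer matches."""
    return 1.0 if final_answer == expected else 0.0


def check_arithmetic(step: str) -> bool:
    """Toy step checker for strings of the form 'a op b = c'."""
    lhs, rhs = step.split("=")
    a, op, b = lhs.split()
    return OPS[op](int(a), int(b)) == int(rhs)


def process_rewards(steps: List[str],
                    checker: Callable[[str], bool]) -> List[float]:
    """Feedback on every intermediate step, not just the final answer."""
    return [1.0 if checker(s) else 0.0 for s in steps]


def first_bad_step(rewards: List[float]) -> Optional[int]:
    """Index of the first step that was scored as wrong, if any."""
    for i, r in enumerate(rewards):
        if r == 0.0:
            return i
    return None


# Goal: compute (2 + 3) * 4 - 1, which is 19. The second step is wrong.
chain = ["2 + 3 = 5", "5 * 4 = 21", "21 - 1 = 20"]
rewards = process_rewards(chain, check_arithmetic)

print(outcome_reward("20", "19"))  # only says the answer is wrong
print(rewards)                     # shows which steps were valid
print(first_bad_step(rewards))     # points at the step to correct
```

The outcome score alone says the answer is wrong; the per-step scores show that the error entered at the second step, which is the kind of signal that lets individual steps be corrected.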

The architecture of o1 14m42s

  • The AI model that won video game competitions, specifically Dota 2, was a breakthrough in the tech industry, showcasing the power of reinforcement learning techniques, which were also inspired by AlphaGo and AlphaZero 14m42s.
  • Dota 2 is a complex game that requires resources and planning, and the AI model's success was due to its ability to learn from playing against itself a million times, with Q-learning described as a fundamental algorithm behind reinforcement learning 14m42s.
  • The connection between DOTA and GPT-type models lies in incorporating reinforcement learning into the generative model, which requires a large amount of factually correct data and a reward function to reason about the output 15m57s.
  • The training process likely involved interesting techniques, including the use of secret data sources, such as math and science problems, to improve the model's performance 16m20s.
  • There are two research directions being explored in parallel: scaling up the underlying language model (LM) and using reinforcement learning to unlock the model's potential in the real world 17m16s.
  • The o1 model, which uses reinforcement learning, is a significant step forward; its full version is expected to be a huge improvement over o1-preview, with o2 and o3 models not far behind 17m35s.
  • The o1 model is still opaque, and its development required creating a new dataset to train the chain of thought, which was a costly endeavor 18m0s.
  • Large language models (LLMs) can be improved by breaking down tasks into steps and using evaluation sets, as Jake Heller discovered at Casetext; the approach also applies to other tasks 18m29s.
  • The prescription for improving LLMs involves two parts: breaking down tasks into steps and using evaluation sets, with the latter being crucial even if the model can break down tasks on its own 18m52s.
  • Some companies have achieved 100% success by following Jake Heller's recommendations, which include having a large evaluation set and carefully testing every step of the reasoning pipeline 19m28s.
  • The key to future AI breakthroughs may lie in reasoning, not scaling, with a focus on creating large evaluation sets and proprietary data that is not readily available online 19m42s.
  • A company's moat may ultimately lie in its ability to access and utilize proprietary data that is not publicly available, which can be obtained through enterprise sales and partnerships 20m25s.
  • Startups may benefit from targeting customers who are willing to pay for high accuracy and perfection, such as those in industries that require precise and specialized knowledge 21m41s.
  • Companies like Camfer may be a good example of this approach, as they focus on providing high-quality and specialized products that require precise and accurate information 21m51s.
  • The next AI breakthroughs may come from companies that are willing to do the hard work of collecting and utilizing proprietary data, rather than relying solely on publicly available information 21m36s.
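
The advice to break a task into steps and test every step against its own evaluation set could look something like the sketch below. It is a minimal illustration with hypothetical names (`accuracy`, `evaluate_pipeline`) and toy rule-based steps standing in for LLM calls; a real evaluation set would be far larger and drawn from proprietary examples.

```python
from typing import Callable, Dict, List, Tuple

# An evaluation set is a list of (input, expected_output) pairs.
EvalSet = List[Tuple[str, str]]
StepFn = Callable[[str], str]


def accuracy(step_fn: StepFn, eval_set: EvalSet) -> float:
    """Fraction of eval examples where the step's output matches exactly."""
    if not eval_set:
        raise ValueError("evaluation set must not be empty")
    correct = sum(1 for x, expected in eval_set if step_fn(x) == expected)
    return correct / len(eval_set)


def evaluate_pipeline(steps: Dict[str, StepFn],
                      eval_sets: Dict[str, EvalSet]) -> Dict[str, float]:
    """Score every step of the pipeline against its own evaluation set."""
    return {name: accuracy(fn, eval_sets[name]) for name, fn in steps.items()}


# Toy steps standing in for model calls.
def normalize(text: str) -> str:
    return text.strip().lower()


def classify(text: str) -> str:
    # Deliberately naive rule, so the eval set exposes a failure case.
    return "question" if text.endswith("?") else "statement"


steps = {"normalize": normalize, "classify": classify}
eval_sets = {
    "normalize": [("  Hello ", "hello"), ("WORLD", "world")],
    "classify": [
        ("is it done?", "question"),
        ("it is done", "statement"),
        ("tell me why", "question"),  # the naive rule gets this one wrong
        ("done.", "statement"),
    ],
}

scores = evaluate_pipeline(steps, eval_sets)
print(scores)
```

Scoring each step separately shows where the pipeline loses accuracy, so effort goes to the weakest step instead of to the prompt as a whole.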

Getting that final 10-15% of accuracy 21m52s

  • There is a growing interest in text-to-CAD design, particularly among hobbyists and those who want to quickly prototype and test their ideas, but also among professionals who require high accuracy and precision, such as those designing airplane parts 21m53s.
  • The strongest technical teams will have the option to go all the way and cater to customers who demand 100% accuracy and are willing to pay a premium for it 22m20s.
  • Rather than commoditizing technology and making strong technical teams less important, AI in design and prototyping may do the opposite: value will likely be captured by the strongest technical teams, who can build on top of existing technology and achieve the final 10% of accuracy 22m49s.
  • The key differentiators for companies using AI in design and prototyping will be the prompts, evaluations, UI layer, and integrations, as simply having good prompts is not enough for a company to adopt the technology 23m2s.
  • Distribution, branding, and difficulty in switching will also be important factors in the success of companies using AI in design and prototyping 23m25s.
  • The classic moats of software still apply, and companies that can establish a strong brand and make it difficult for customers to switch will have an advantage 23m45s.
  • Evaluations will still be crucial in the world of AI, as founders will need to figure out how to build the best product on top of the technology 23m56s.
  • GigaML, a company originally funded for a different idea, started out helping companies fine-tune open-source models to match the performance of OpenAI's models 25m17s.
  • Such businesses turned out not to be great, because model costs keep decreasing and open-source models keep improving, making fine-tuning less necessary 25m30s.
  • Companies like AO pivoted to finding vertical applications for their AI expertise, such as AI customer support, which is a competitive space but allows for squeezing out a comparative edge 26m1s.
  • AI customer support deals with many edge cases and squishy problems, making it challenging, but intensely technical teams can still find ways to gain an edge 26m34s.
  • Despite the potential, hardly any adoption of AI customer support has happened yet, and the space remains wide open 26m52s.
  • Rules-based systems work well for simple cases, and there's a lack of trust in AI's ability to solve complex problems, which contributes to the slow adoption 27m10s.
  • Companies like Zepto have started to adopt AI customer support, with Zepto automating 30,000 tickets per day and having over 1,000 people working on those tickets previously 27m50s.
  • The automation of customer support jobs, although potentially replacing human jobs, can also free people from rote and unfulfilling work, allowing them to pursue more meaningful careers 28m20s.
  • A previous implementation of a model had a 70% error rate, but after using a technique described by Jake Heller, the error rate was reduced to 5%, an improvement of more than an order of magnitude 29m29s.
  • This improvement is particularly notable for complex, time-consuming, and expensive problems that were previously unsolvable, with the model now achieving 85% accuracy, up from 0% 30m9s.
  • The model in question is o1-preview, built on a new technique that is still being developed and refined; OpenAI is protecting its advantage by hiding the model's actual chain of thought and showing a substitute one that gives the impression of breaking the problem down into steps 30m38s.
  • The next step for o1 is expected to be the addition of interpretability and directability, allowing users to see the steps and edit them, which would be a significant unlock for the model's capabilities 30m56s.
  • Currently, o1 can output a chain of thought, but it cannot be edited, and the ability to edit each step would take the model to the next level of fine-tuning 31m26s.
  • The current state of models like o1 is the worst they will ever be, with rapid improvements being made week to week, and new capabilities emerging that were not possible just a month ago 31m39s.
  • The improvement in models like o1 is expected to have a significant impact on various companies and ideas, with some benefiting greatly from the uplift, while others may not be as affected 31m56s.
  • The opposite kinds of ideas that may not benefit as much from o1 are not explicitly stated, but it is implied that companies and ideas that rely on complex, time-consuming, and expensive problems may be less affected by the model's improvements 32m5s.

The companies/ideas that should pivot because of o1 32m6s

  • Companies building AI coding agents or AI program engineers may need to reassess their strategies due to the advancements in o1, which performs strongly on programming problems 32m31s.
  • Teams that have already invested heavily in their own Chain of Thought infrastructure may not see o1 as a significant leap forward, since their own pipelines are already directable 32m45s.
  • The opaque nature of the Chain of Thought and the difficulty in altering its path once it starts are current challenges for users and systems 33m2s.
  • New model capabilities can unlock new startup ideas; for example, recent advancements in AI models have made phone-calling startups successful 33m29s.
  • The o1 series of models may enable new startup ideas that can improve the physical world, particularly in fields like mechanical engineering, electrical engineering, chemical engineering, and bioengineering 33m54s.
  • These advancements could lead to real-world abundance and improvements in people's lives, rather than just minor conveniences 34m29s.
  • There may be a sense of urgency to develop and apply these technologies, as there is a fear of AI in society, and it is up to the developers to create positive change 34m38s.

Outro 34m44s

  • The goal is for a technologist to help bring about an age of abundance sooner rather than later 34m46s.
  • Achieving this goal could lead to abundance winning out over fear 34m52s.
  • This concludes the current episode of "The Light Cone" with the next episode to follow 35m1s.