YouTube video summary

A conversation with Kevin Weil (OpenAI CPO), Mike Krieger (Anthropic CPO), Sarah Guo (Conviction)

Artificial intelligence

06 Nov 2024 · 16 min summary · From Lenny's Podcast

Kevin Weil's Role at OpenAI and Initial Reactions

Kevin Weil became Chief Product Officer at OpenAI, a role he found to be one of the most interesting and impactful available. Its challenges include building products on a constantly evolving technology base: roughly every two months, computers can do something they have never been able to do before in the history of the world 1m14s.
Kevin's friends and team generally reacted with excitement to his new role, and he has been having a blast working on the inside as AI gets developed 1m18s.

Mike Krieger's Move to Anthropic and His Experience

Mike Krieger, the founder of Instagram, joined Anthropic as Chief Product Officer, and people's reactions to the news varied, with some thinking it made sense, others wondering why he would take the job, and some being impressed that Anthropic could hire the founder of Instagram 2m21s.
Mike couldn't resist the opportunity to work on something new and was drawn to the company's research-driven approach, and he has been enjoying learning about enterprise and serving customers different from those he worked with at Instagram 2m37s.
Mike has been surprised by the childlike delight he feels in learning about new things, such as enterprise and research-driven organizations, and he appreciates the opportunity to have a different experience every year, as he had vowed to do when he was 18 3m31s.

The Nature of Enterprise Sales and Customer Feedback

Enterprise sales have a different pace compared to other sales, with a longer timeline that can take six months from the initial conversation to deployment, requiring adjustment to different timelines 4m5s.
The feedback and engagement from enterprise customers can be more rewarding, as they have a financial incentive to provide honest feedback on the product's performance 4m31s.
In enterprise sales, the focus is not just on the product, but also on the buyer's goals, and building a great product does not necessarily guarantee success 5m2s.
Enterprise customers may have specific requirements, such as advance notice of product launches, which can be challenging to accommodate 5m29s.

Product Development at OpenAI and the Role of Model Capabilities

At OpenAI, they have multiple products, including consumer, enterprise, and developer products, which can be managed simultaneously 5m44s.
Product instincts are helpful for only about half of the job, mostly when the product is in the final stages of development 5m53s.
The beginning of product development can be uncertain, with unknown capabilities and emergent properties of models, requiring a wait-and-see approach 6m31s.
The product that gets built depends on the model's capabilities, in particular on how reliably it performs a task, which can range from about 60% to 99% 7m1s.
The research team's progress is regularly checked to determine the model's capabilities and potential applications 7m18s.
Model training is a research process whose outcome is not always certain, which makes it both exciting and stochastic. It resembles working at Instagram during Apple's WWDC announcements, where a new feature could either be awesome or cause chaos, except that here the disruption comes from within the company 7m22s.
The cycle of discovering new capabilities and planning for the next set of features is challenging when the outcome is uncertain, but it's possible to plan by squinting at the advancements in intelligence and building products around expected capabilities 8m2s.

Approaches to Product Development with Evolving AI Models

There are three ways to approach product planning under this uncertainty: watching the advancements in intelligence, deciding on product capabilities and fine-tuning with research teams, and co-designing and co-researching with the actual research teams 8m29s.
Embedding designers early in the process is crucial, but it's essential to understand that the output of experimentation should be learning, not perfect products, and partnering with research should lead to demos or informative things that spark product ideas 9m4s.
Research is both product-oriented and academically focused, and sometimes, new capabilities are discovered by chance, leading to unexpected opportunities 9m32s.
When investing in new capabilities, it's essential to consider whether a model can be useful even if it's only 60% successful at a task, especially if the task is valuable and important 10m13s.
Evaluating progression on a task and deciding what to prioritize involves considering the importance and value of the task, even if the model is not 99% successful 10m30s.
The burden of product design is to make AI models work gracefully, even when they're not perfect, and to expect human involvement in the loop, especially when models are only 60% right, which can still be valuable for users 10m40s.
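
One way to see why a model that succeeds only about 60% of the time can still be valuable is a back-of-the-envelope expected-time calculation. The sketch below is illustrative only: the function name, the assumption that a human reviews every attempt, and all of the numbers are hypothetical and do not come from the conversation.

```python
def expected_minutes_with_model(success_rate: float,
                                review_minutes: float,
                                manual_minutes: float) -> float:
    """Expected time per task when a human reviews every model attempt.

    Assumption: on success the human only pays the review cost; on failure
    the human pays the review cost and then redoes the task by hand.
    """
    if not 0.0 <= success_rate <= 1.0:
        raise ValueError("success_rate must be between 0 and 1")
    return review_minutes + (1.0 - success_rate) * manual_minutes


# Hypothetical numbers: a 30-minute task, 5 minutes to review a model attempt.
manual = 30.0
with_model = expected_minutes_with_model(0.6, review_minutes=5.0, manual_minutes=manual)
saved = manual - with_model
print(f"expected minutes with model: {with_model:.1f}, saved: {saved:.1f}")
```

Under these assumed numbers the model still saves time on average, but the result flips if reviewing is expensive or failures are costly, which is why the design has to keep the human step cheap.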

The Importance of Imperfect AI Models and Real-World Testing

GitHub Copilot, an AI product that assists with coding, was launched with a model that wasn't perfect but still provided significant value by getting the code partially correct, allowing users to edit and complete it 10m59s.
Similar experiences will occur with the shift towards agents and longer-form tasks, where models may not be perfect but can still save users time and be valuable, especially if they can understand their limitations and ask for help 11m45s.
The 60% benchmark is not a fixed number, but rather a rough estimate, and AI models often perform well on some tasks and poorly on others, making it essential to design for these variations 12m9s.
Pilot programs with customers have shown that AI models can receive vastly different feedback, with some companies finding them highly effective and others finding them less useful, highlighting the importance of real-world testing 12m27s.
The effectiveness of AI models can be influenced by various factors, including custom data sets, internal use cases, and prompting styles, which can lead to unexpected results when deployed in the real world 12m58s.

Evaluating and Improving AI Models

Current AI models are not limited by their intelligence but rather by their evaluation methods, and they can be taught to perform better on a wider range of tasks with proper training and evaluation 13m13s.
The lack of evaluation methods has hindered the development of AI models, and it's essential to establish clear success metrics to improve their performance 13m47s.
The recurring problem is determining what success looks like for a task and then iteratively improving on it. Tools like Claude can automate evaluations and grading, but they require input on what success entails 13m51s.
At Anthropic, the interview process involves making candidates improve a prompt from a "crappy eval" to a good one, showcasing their thought process, as the company believes writing evaluations is a crucial skill for product managers (PMs) 14m23s.
There is a lack of talent with this skill, and the company is trying to teach people how to write evaluations, considering it a core skill for PMs 14m31s.
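
As a rough illustration of what "writing an evaluation" can mean in practice, here is a minimal sketch: each case pairs a prompt with an explicit check that states what success looks like, and the harness reports the pass rate. The names `run_eval` and `toy_model` are hypothetical, and the toy model is a stand-in for a real model call; this is not the process either company uses.

```python
from typing import Callable

# Each case pairs a prompt with a check that encodes what success looks like.
EvalCase = tuple[str, Callable[[str], bool]]


def run_eval(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose model output passes its check."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = sum(1 for prompt, check in cases if check(model(prompt)))
    return passed / len(cases)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    canned = {"2 + 2 = ?": "4", "Capital of France?": "Paris, France"}
    return canned.get(prompt, "I don't know")


cases: list[EvalCase] = [
    ("2 + 2 = ?", lambda out: out.strip() == "4"),
    ("Capital of France?", lambda out: "paris" in out.lower()),
    ("Capital of Peru?", lambda out: "lima" in out.lower()),
]

score = run_eval(toy_model, cases)
print(f"pass rate: {score:.2f}")
```

The hard part is not the harness but the checks: deciding, case by case, what counts as a good answer.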

The Changing Role of Product Managers in the Age of AI

The job of a PM in 2024-2025, building AI-powered features, is looking more like the role of research PMs, who work on model capabilities and development, rather than product surface PMs or API PMs 14m55s.
The quality of a feature is now gated on how well PMs have written its evaluations and prompts, which makes the PM role more merit-based 15m20s.
Anthropic set up a boot camp to teach PMs how to write evaluations and the difference between good and bad evaluations, but still needs to iterate and improve 15m29s.
To develop intuition for getting good at evaluations and iteration, one can use the models themselves, asking for sample evaluations and learning from the results 16m3s.
Looking at data and examining cases where models fail is also crucial, as it can reveal issues with the grader rather than the model itself 16m27s.
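
The point about failures sometimes being the grader's fault can be sketched with a small example. Below, a strict exact-match grader is compared with a more lenient one; failures that the lenient grader would accept are flagged as likely grader problems rather than model problems. Both graders and the sample data are hypothetical.

```python
def strict_grader(output: str, expected: str) -> bool:
    """Exact string match: brittle, fails on harmless formatting differences."""
    return output == expected


def normalized_grader(output: str, expected: str) -> bool:
    """Ignore case, surrounding whitespace, and a trailing period."""
    def norm(s: str) -> str:
        return s.strip().rstrip(".").strip().lower()
    return norm(output) == norm(expected)


# Hypothetical (model output, expected answer) pairs.
results = [("Paris.", "Paris"), ("paris", "Paris"), ("Lyon", "Paris")]

failures = [(o, e) for o, e in results if not strict_grader(o, e)]
# Failures that a more forgiving grader accepts point at the grader.
grader_issues = [(o, e) for o, e in failures if normalized_grader(o, e)]
# Failures that both graders reject are more likely real model errors.
model_errors = [(o, e) for o, e in failures if not normalized_grader(o, e)]

print("grader issues:", grader_issues)
print("model errors:", model_errors)
```

Reading the actual failing outputs is what surfaces this split; an aggregate score alone would report three failures and hide that two of them are formatting.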

Challenges and Evolution of AI Model Evaluation

Every model release has a model card, and some model evaluations have shown that even the golden evaluations can be improved 16m47s.
Evaluating the performance of AI models is challenging, and even grading them is difficult, so it's essential to look at the actual answers and evolve the evaluation methods as the models improve 16m53s.
As AI models move towards longer-form and more agentic tasks, evaluation will become more nuanced and personalized, requiring a softer grading approach 17m18s.
The concept of capabilities in AI models may evolve to resemble a career ladder, with evaluation resembling performance reviews that assess whether the model meets or exceeds expectations 18m8s.
The increasing ability of AI models to beat humans at certain tasks raises questions about the role of humans in writing evaluations and the potential need for new evaluation methods 18m36s.

Skills and Adaptation for Working with AI

To effectively work with AI models, product people should learn to write evaluations that assess the models' skills and abilities 18m52s.
Prototyping with AI models is an underused skill that can be useful for quickly testing and evaluating different ideas and approaches 19m1s.
The use of AI models will push product managers to go deeper into the tech stack and develop a deeper understanding of the technology 19m46s.
The skills required to work with AI models will continue to evolve over time, and product people should be prepared to adapt and learn new skills 19m51s.
Product managers (PMs) do not need to be researchers, but having an appreciation for and understanding of how AI works can be beneficial in building products that utilize AI 20m4s.

Building Products with Stochastic AI Systems and User Feedback

AI systems are stochastic and non-deterministic, making it challenging to design products where the outcome is not entirely predictable 20m22s.
To address this challenge, PMs need to establish feedback mechanisms to understand when the model is not working as intended and collect feedback rapidly 20m35s.
This requires a different set of skills, as the traditional bug report is no longer applicable, and PMs need to understand the output of the AI across multiple outputs and users 20m51s.
Adapting to non-deterministic user interfaces is a new challenge, and even tech-savvy individuals are still adjusting to this new paradigm 21m8s.
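
Because a single output says little about a non-deterministic system, one common way to reason about it is to sample the same input many times and look at the pass rate. The sketch below assumes a hypothetical `flaky_model` that is right about 60% of the time; a seeded random generator stands in for real model variability.

```python
import random
from typing import Callable


def pass_rate(sample: Callable[[], str],
              check: Callable[[str], bool],
              n: int = 200) -> float:
    """Call the sampler n times and return the fraction of passing outputs."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sum(1 for _ in range(n) if check(sample())) / n


rng = random.Random(0)  # seeded so the illustration is repeatable


def flaky_model() -> str:
    """Hypothetical stand-in: returns a correct answer about 60% of the time."""
    return "correct" if rng.random() < 0.6 else "wrong"


rate = pass_rate(flaky_model, lambda out: out == "correct")
print(f"observed pass rate over 200 samples: {rate:.2f}")
```

A traditional bug report captures one failing run; a pass rate over many runs is closer to what users experience in aggregate.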

User Research and Adapting to AI-Powered Products

Building products with AI requires considering the user's perspective and understanding how they will interact with the product, which can have both positive and negative consequences 21m44s.
Conducting user research is essential in understanding how users interact with AI-powered products, and it can be surprising to see how users react to new features and the model's output 22m11s.
PMs need to be prepared to let go of control and be flexible when working with AI, as the outcome is not always predictable 22m40s.
The development of AI products is happening rapidly, and PMs and technical people need to develop intuition for how to use them effectively 22m55s.

Rapid Technological Advancements and User Education

The rapid advancement of technology, such as ChatGPT, has led to a situation where people quickly adapt to new innovations, and what was once considered "magic" becomes outdated in a short period, with the current state of technology expected to be seen as inferior in 12 months 24m24s.
The speed of adaptation is also influenced by people's excitement and understanding that the world is moving in the direction of technological advancements, making it essential to make the best possible progress 24m43s.
To address the challenge of educating end-users at scale, efforts are being made to make products more educational, such as providing information about the product itself and its features, which was not done initially 24m55s.
User research has shown that people want to know how to use the product, and providing clear instructions and documentation can help solve UI problems and user confusion 25m14s.

Educating Users in Enterprise Settings and Empowering Power Users

The approach to educating users is different in an enterprise setting, where there is a status quo for how things are done, and organizational processes need to be considered when introducing productivity improvements or new technologies 25m47s.
In the enterprise context, power users are often early adopters who are familiar with technology, but there is also a long tail of users who may need more guidance and education on how to use new products and features effectively 26m8s.
Non-technical users are being exposed to chat-powered LLMs for the first time, and it's essential to learn from these experiences to teach the next 100 million people how to use these UIs effectively 26m16s.
Power users within organizations are creating custom GPTs to make AI more accessible and valuable for those who might not know how to use it otherwise, and these power users can act as evangelists 26m50s.
The organizations mentioned are made up of power users who are living in a "pocket of the future" and are finding innovative ways to utilize AI 27m16s.

Internal Use Cases of AI within Organizations

Internally, the organizations are using AI to automate tasks, such as ordering pizzas, and are exploring various use cases, including UI testing and data manipulation 27m28s.
AI is being used for UI testing, which is typically challenging and brittle, but early signs indicate that it works well for testing whether a UI functions as intended 28m12s.
The organizations are also exploring the use of AI for agentic tasks that involve data manipulation, such as automating repetitive tasks and filling out forms 28m38s.
The goal is to automate "drudgery" tasks, allowing humans to focus on more creative and high-value tasks 28m55s.

Workflows and Orchestration between Models

Many sophisticated customers and internal teams are experimenting with workflows and orchestration between models, utilizing each model for its strengths, such as reasoning, but also acknowledging limitations like time to think and multimodality 29m9s.
Reasoning in this context refers to the ability to form hypotheses, refute or affirm them, and continue reasoning, similar to how humans solve complex problems or make scientific breakthroughs 29m52s.

Scaling Intelligence and the Evolution of Reasoning in AI Models

The concept of scaling pre-training is well known: models like GPT-2, 3, 4, and 5 are trained on increasingly large datasets, resulting in smarter models, but with limitations, such as System 1 thinking, which provides immediate answers without much thought 29m59s.
In contrast, the new approach to scaling intelligence, as seen in models like o1, does the scaling at query time, allowing the model to pause, think, and reason before providing an answer, similar to human problem-solving 30m52s.
This new approach has the potential to revolutionize problem-solving, as models can think for extended periods, refining their answers, and can be used in various applications, including cybersecurity, where models can be fine-tuned to work together to achieve precise results 31m43s.
The use of models in concert with each other enables them to check each other's outputs, ensuring more accurate results, and can be applied to various tasks, such as finding and fine-tuning models to be good at specific tasks 32m32s.
This new approach to scaling intelligence is still in its early stages, comparable to the GPT-1 phase, but it has the potential to significantly impact various fields and applications 31m53s.
Models will be able to realize when something doesn't make sense and ask to try again, providing more value in specific use cases and orchestrations of models working together to accomplish complex tasks 32m40s.
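
The idea of models checking each other's outputs and retrying when something doesn't make sense can be sketched as a simple generate-then-verify loop. Everything here is hypothetical: `generate_with_check`, the toy generator, and the toy verifier are stand-ins for real model calls, and the retry limit is an arbitrary choice.

```python
from typing import Callable, Optional


def generate_with_check(generate: Callable[[str, int], str],
                        verify: Callable[[str, str], bool],
                        task: str,
                        max_attempts: int = 3) -> Optional[str]:
    """Ask one model for an answer and a second model to check it.

    Returns the first candidate the verifier accepts, or None if every
    attempt is rejected so the caller can escalate to a human.
    """
    for attempt in range(max_attempts):
        candidate = generate(task, attempt)
        if verify(task, candidate):
            return candidate
    return None


def toy_generator(task: str, attempt: int) -> str:
    """Hypothetical generator: the first attempt is malformed, later ones are fine."""
    return "42" if attempt >= 1 else "forty-two?"


def toy_verifier(task: str, candidate: str) -> bool:
    """Hypothetical verifier: accept only purely numeric answers."""
    return candidate.isdigit()


answer = generate_with_check(toy_generator, toy_verifier, "What is 6 * 7?")
print("accepted answer:", answer)
```

Returning None instead of a rejected answer keeps the failure visible, which matches the idea of a model recognizing its limits and asking for help.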

The Future of AI: Proactive, Asynchronous, and Interactive Models

The future of AI may involve models becoming more proactive, such as monitoring emails and spotting interesting trends to provide users with proactive recaps and research 33m47s.
Another aspect of the future of AI is being more asynchronous, allowing users to expand their time horizon and not expect immediate answers, enabling them to work on other tasks while the model is processing 34m20s.
This asynchronicity will allow users to ask more complex questions and tasks, such as fleshing out a mini project plan, fixing bugs, or adapting a product requirement document (PRD) for new market conditions 35m4s.
The models are expected to get smarter at an accelerating rate, which will contribute to the development of these capabilities 35m28s.
A key aspect of the future of AI is seeing models interact in various ways, enabling more complex and powerful applications 35m35s.

Advancements in Voice Mode and Natural Interactions with AI

Humans interact with AI systems mostly through typing, but advancements in voice mode are changing this, allowing for more natural interactions like speaking and seeing, with the potential to become commonplace fast 35m42s.
The launch of advanced voice mode has enabled users to have conversations with people who speak different languages, acting as a universal translator, and has the potential to increase people's willingness to travel to new places 35m54s.
The combination of voice mode and other AI capabilities is creating new experiences, such as young people using voice mode to pour their hearts out and interact with AI in ways that are becoming increasingly natural 37m0s.
The digitally native generation is growing up with the expectation that AI will be able to understand and interact with them in various ways, including voice conversations 37m17s.
Children as young as 5 and 7 are already interacting with AI systems like ChatGPT, asking it bizarre questions and having weird conversations, and are perfectly happy talking to an AI 37m35s.
Children are also using AI to create their own entertainment, such as telling stories and asking AI to create images in real-time, showcasing a new way of creating and interacting with content 38m14s.

Developing Empathy and Understanding Nuances in AI Interactions

The most surprising behavior seen in AI products recently is the development of a nuanced understanding of the AI model and its capabilities, with users forming a kind of two-way empathy and befriending the AI 38m38s.
Users are also noticing differences in the behavior of new AI models, such as feeling smarter but more distant, and are developing a relationship with the AI based on its nuances 39m0s.
Developing AI products requires empathy, as they involve shipping intelligence and empathy, which are key components of interpersonal relationships, making it essential to consider how users will adapt to and interact with these products 39m9s.

The Personality of AI Models and User Preferences

The personality of AI models is crucial, and there are interesting questions around how much they should be customizable versus having a single personality, with OpenAI's models and Claude, for example, having distinct personalities 39m40s.
Users may choose to use one AI model over another based on their personality, which is a human-like preference, as people tend to be friends with those they like and have an affinity for 40m1s.
A recent experiment where an AI model described users based on their past interactions went viral on Twitter, showcasing how people are starting to interact with AI models in a more personal and human-like way 40m13s.
This experiment demonstrated how AI models can be seen as entities that users can interact with, and their reactions to these interactions can be fascinating and provide valuable insights 40m39s.

Conclusion: Perspectives on the Future of AI

Kevin Weil and Mike Krieger shared their perspectives on the future of AI and its development, providing a glimpse into the potential of these technologies 40m46s.
