YouTube video summary

Tokens Or Humans? The New AI Cost Trade-Off Reshaping Corporate Budgets

Business

02 Jun 2026 · 15 min summary · From CNBC

The Cost Trade-Off Between AI and Human Workforce

Companies face a new choice between spending on tokens and spending on people: technology costs are now comparable to the cost of staff, and some annual AI budgets are being exhausted within weeks. The comparison itself is new, a conversation that has not been had before 10s.
The growing AI budget is coming at the expense of future headcount growth, and companies are now considering which one to prioritize, with the AI spend per employee being a key factor in this decision 1m15s.
A year ago, the term "tokens" was primarily discussed among engineers on social media, but now analysts are asking about it on earnings calls, and CFOs are paying close attention to it as a line item in their budgets 2m6s.
The AI supply chain is priced based on the assumption that demand will remain enormous and price-insensitive, but there are concerns that the spend is running ahead of the proof, and companies are starting to question whether they need the most expensive models for every workload 3m30s.

Rising AI Budgets and Their Impact on Headcount

Companies like Glean, an enterprise AI assistant sold to Fortune 500 companies, are seeing significant revenue growth: Arvind Jain's company reached $300 million in annual recurring revenue, up from $100 million just 15 months earlier 5m30s.
The cost conversation around AI is becoming increasingly important, with smart buyers routing easy queries to cheap models and hard ones to frontier models, using an orchestration layer to make the call automatically and hopefully save money along the way 7m15s.
Arvind Jain's company, Glean, reports using about 30% fewer tokens than off-the-shelf tools, with the gap widening on harder tasks, a key consideration for CFOs who are reckoning with the cost of AI adoption 10m30s.

AI Cost Management and Model Selection

Glean is a multi-model platform that works with both closed (proprietary) models, such as GPT and Gemini, and open models trained on the Nvidia platform. It manages AI costs by picking the best model for each task behind the scenes, so customers avoid burning tokens on very expensive models for simpler tasks 10s.
Companies are experiencing rising AI costs, with some exhausting their annual AI budgets in just one or two months, highlighting the need for a solution to control these costs and make AI spending more sustainable 2m6s.
The conversation around AI has evolved, with enterprises now prioritizing controlling their AI budgets and being more careful about how they use AI within their companies, as the growing AI budget often comes at the expense of future headcount growth 4m42s.
The cost of models has increased, with new model versions from the frontier labs costing almost twice as much per token as the previous versions, making it essential for businesses to develop a strategy for managing their AI spending 6m15s.
To address the rising costs, companies are expected to make increasing use of open models, which are significantly cheaper than closed models and could reduce costs by an order of magnitude 8m30s.
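
The routing idea described in this section (cheap models for easy queries, frontier models for hard ones) can be sketched roughly as follows. This is a minimal illustration, not how Glean or any vendor actually routes: the model labels, the prices, and the keyword heuristic in `estimate_difficulty` are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                      # hypothetical label, not a real product
    usd_per_million_tokens: float  # illustrative price, not a quoted rate


CHEAP = Model("small-open-model", 1.0)
FRONTIER = Model("frontier-model", 10.0)


def estimate_difficulty(query: str) -> float:
    """Toy heuristic: long queries and 'hard task' keywords score higher (0.0 to 1.0)."""
    keywords = ("prove", "refactor", "architecture", "debug", "multi-step")
    score = min(len(query.split()) / 200, 0.5)
    if any(k in query.lower() for k in keywords):
        score += 0.5
    return min(score, 1.0)


def route(query: str, threshold: float = 0.5) -> Model:
    """Send a query to the frontier model only when it looks hard enough."""
    return FRONTIER if estimate_difficulty(query) >= threshold else CHEAP
```

A production orchestration layer would presumably replace the keyword heuristic with a learned classifier or evaluation data; the point here is only the shape of the decision.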

AI Budget Prioritization and Strategic Decisions

Executives are now primarily concerned with keeping their AI costs under control, and while they are still looking for a return on investment, the rising model costs are making it challenging to justify the extra expense, prompting a re-evaluation of their AI strategies 10m50s.
Enterprises are currently in an investment phase, spending more money on AI, and although the value has yet to be fully realized, they are confident that it will come, and they believe investing in AI and their workforce is the right decision 10s.

AI Use Cases and ROI Challenges

The top two use cases for AI are software engineering and coding, and customer success and customer support, with significant successes being seen in these areas, particularly in terms of return on investment for customer support 2m6s.
While AI has increased efficiencies and resolved issues such as tickets, it has yet to translate into significant revenue growth, with companies still not seeing a substantial uplift in sales and product development 4m30s.
However, AI has shown success in making sales teams more efficient and effective, with customers seeing increased sales per account executive, and higher productivity for business development reps and account executives 6m20s.
The increase in AI budgets has created a gap, as executives are looking for measurable returns, such as a 10% increase in sales, which is not yet being fully attributed to AI, despite AI budgets increasing by 10, 20, 30, or 50% 8m40s.

The Tokens vs. People Dilemma

As AI costs rise, companies are having to make a choice between spending on AI or hiring new staff, with some companies reducing planned headcount to make room for increased AI spending, essentially choosing between "tokens or people" 10m50s.
Companies are currently deciding between allocating their budget to tokens or headcount, with many choosing to reduce future headcount to make room for increasing AI token spend, as no one wants to be the first to cut back on AI, and this decision is often forced due to locked financial plans 10s.
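
The "tokens or people" trade-off can be made concrete with back-of-the-envelope arithmetic. The sketch below assumes a locked budget in which any AI overspend has to come out of planned hiring. The function name and both inputs are hypothetical, and the summary gives no actual figures for either.

```python
import math


def hires_displaced(ai_overage_usd: float, loaded_cost_per_hire_usd: float) -> int:
    """Planned hires that must be dropped to absorb an AI budget overage.

    Assumes a fixed total budget and a uniform fully loaded cost per hire.
    """
    if ai_overage_usd < 0 or loaded_cost_per_hire_usd <= 0:
        raise ValueError("overage must be >= 0 and cost per hire must be > 0")
    return math.ceil(ai_overage_usd / loaded_cost_per_hire_usd)
```

For example, with an illustrative overage of 1,000,000 USD and a loaded cost of 200,000 USD per hire, `hires_displaced(1_000_000, 200_000)` returns 5.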

Cost Variations in AI Models and Pricing Trends

The cost of AI models, particularly those on the frontier, is rising, with the most advanced models from companies like Anthropic and OpenAI being more expensive, while open source models are cheaper but currently not a major factor in enterprise spend, accounting for only a small percentage of the bulk spend 2m6s.
Open source models are expected to become more prominent, with some of the best ones currently being Chinese, but development is also happening in the US, allowing companies to build task-specific models that are as good as frontier models but consume less money 4m6s.
Using open source models can be significantly cheaper, with a comparison showing that using a Chinese low-cost model is nine times less expensive than using a model like Claude, and companies can achieve significant savings by making smart choices, such as routing tasks to smaller model versions 6m6s.
Even among closed models, there is a wide range of price points, with the largest and most advanced models costing ten times as much as cheaper versions, so companies can manage their token costs through choices such as model routing, which can lead to significant savings 8m6s.
Glean's technology is also presented as promising here, as it lets customers reduce costs by more than 30% by picking the right models, and it enables a smarter approach to completing tasks than the current brute-force method, which is often why models are so expensive to run 10m6s.
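
The savings from routing follow from simple arithmetic. As a rough sketch, assume both models need the same number of tokens for a task (the summary does not claim this). Moving a fraction of tokens to a model that costs 1/ratio as much then saves that fraction times (1 - 1/ratio) of the bill. The function name and inputs are illustrative.

```python
def blended_savings(cheap_fraction: float, price_ratio: float) -> float:
    """Fraction of spend saved when `cheap_fraction` of tokens move to a model
    that costs 1/price_ratio as much per token.

    Assumes token counts per task are identical on both models.
    """
    if not 0.0 <= cheap_fraction <= 1.0:
        raise ValueError("cheap_fraction must be between 0 and 1")
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    return cheap_fraction * (1.0 - 1.0 / price_ratio)
```

With the ten-times price gap quoted above, routing a third of tokens to the cheaper model would cut spend by about 30% under these assumptions, and routing half of them would cut it by 45%.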

Optimizing AI Consumption and Cost Efficiency

AI models spend a significant amount of time assembling the right context and information from within an enterprise before starting the actual work, which is where they are weak and spend a lot of money. This is an area where Glean is seeing success: by assembling the right information and bringing it to the models in one shot, it lets them solve tasks faster and with less than half the tokens 10s.
There are techniques to optimize AI consumption, and companies are starting to look into these options as costs become a reality, with 95% of enterprise use currently on frontier models and the possibility of switching to other models like Sonnet or DeepSeek 2m6s.
Open source models, such as Chinese models, are capable but enterprises are still reluctant to use them in the US, although this may change as budgets get out of control and companies are forced to consider cheaper options, with the potential for US-based open source models to emerge 4m30s.
The current valuations of US labs, such as Anthropic, are enormous, but the pricing power and total addressable market may not hold as companies start to look at cheaper models, including open source and Chinese models, which could impact the upcoming batch of AI IPOs 6m40s.

Challenges in AI Investment and Market Stability

The AI industry is fast-moving, and the underlying foundation keeps changing quickly, making it a difficult investment for public market investors, and companies may also find it challenging to go public due to these rapid changes 10m20s.
Companies are currently facing a significant trade-off between using tokens, which represent AI technology, and humans in their operations, with the cost of AI being comparable to the cost of hiring people, a situation that has never been seen before in the history of technology 10s.
The market is not yet well-defined, and there is a lack of stability, which is typically demanded by public markets, making it challenging for companies to make decisions about their future 10s.
SpaceX, Anthropic, and OpenAI are moving towards their own paths, and it is hard to say whether this is the right approach, as the dominance of these companies may not last, and the landscape is constantly shifting 42s.
The way AI works today is not great, as it is very powerful but also very inefficient, and the cost of using AI is not at a sustainable level, making it essential for companies to find ways to optimize their resource allocation 2m6s.

Resource Allocation and AI Model Selection

Companies like Glean and Factory AI are working on deciding which model to use for specific tasks, and this is becoming a critical resource allocation problem, as companies need to choose between optimizing the number of employees, compute per employee, or AI spend per employee 2m6s.
The concept of companies being like mini AGIs, where humans are like parameters, and AI spend is like compute, is an interesting perspective, and companies like Block, led by Jack Dorsey, are exploring this idea, which has been followed by other companies, indicating a potential shift in how companies approach resource allocation 4m30s.
The conversation around tokens or humans is a really interesting discussion, and companies are excited to see what the future holds, with many expecting significant developments in the next couple of years 6m0s.

Workforce Anxiety and ROI Concerns

Companies are becoming ruthlessly efficient in delivering business value by choosing between spending their budget on hiring more employees or using AI tools to optimize their existing workforce, and this decision is expected to continue evolving over the next ten years 10s.
The use of AI tools allows companies to focus on optimizing their business outcomes, but it also creates anxiety among the current workforce, as they worry that AI might replace them, and executives like Uber's COO have expressed concerns about the return on investment (ROI) of AI 2m6s.
The pendulum could swing back in favor of hiring humans again if companies realize they are not getting enough value from their AI investments, and engineers are likely to play a key role in solving many problems that exist within society using software 4m6s.

Phases of AI Adoption and Cost Management

There are three phases in the adoption of AI, starting with CEOs being pressured to adopt AI, followed by a phase of using AI by any means necessary regardless of the cost, and the current phase is focused on finding ways to make AI more affordable and efficient, such as routing tasks to cheaper AI models 8m6s.
Companies like Factory AI are working on making AI more affordable by developing technologies that enable model routing, which allows companies to route tasks to the most suitable AI model, and this is creating demand for their services as companies look for ways to reduce their AI costs 10m6s.
The adoption of AI has led to a significant increase in corporate budgets, with companies now reassessing their spending and considering whether they need to use high-level intelligence for every task, prompting a shift towards more cost-effective solutions 10s.

Model Agnosticism and Cost-Effective AI Solutions

Factory, a company that has been around for three years, has been steadfast in its need to be model agnostic, recognizing that different models are good at different tasks and have different trade-offs between cost, quality, and speed 1m5s.
The company aims to help its customers dynamically adjust where they want to live in the spectrum between cost, quality, and speed, and notes that some tasks, such as software development and writing documentation, may not require high-level intelligence 1m30s.
For non-technical users building internal dashboards, models like Gemini Flash or open models can provide good quality output ten times faster and ten times cheaper than high-level models like Opus 2m6s.
The majority of enterprise work, around 95%, is currently being done with frontier models, making it challenging for companies to switch to more efficient models, but Factory has made it easy for people to seamlessly jump in by using synchronous workflows 3m20s.
However, for asynchronous tasks, having a model agnostic stance is crucial to avoid vendor lock-in, which can lead to significant price increases and reduced pricing power 4m30s.
The importance of benchmarks and leaderboards has decreased, with new models being released more frequently, and the focus has shifted to real-world usage, although Factory still prioritizes zero-day releases for new models to ensure customers have access to the latest technology 6m40s.
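
One way to picture "choosing where to live" on the cost, quality, and speed spectrum is a weighted score over candidate models. This is a sketch under stated assumptions rather than Factory's method: the profile numbers are invented (the small model is set to one-tenth the cost and latency, echoing the "ten times" figures above), and `pick` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str       # hypothetical model label
    quality: float  # 0..1, higher is better (illustrative)
    cost: float     # 0..1 normalized, lower is better
    latency: float  # 0..1 normalized, lower is better


CANDIDATES = [
    Profile("large-frontier", quality=0.95, cost=1.0, latency=1.0),
    Profile("small-fast", quality=0.80, cost=0.1, latency=0.1),
]


def pick(profiles, w_quality: float, w_cost: float, w_speed: float) -> Profile:
    """Return the profile with the best weighted trade-off for the given weights."""
    def score(p: Profile) -> float:
        return w_quality * p.quality - w_cost * p.cost - w_speed * p.latency
    return max(profiles, key=score)
```

With weights that barely penalize cost and latency, such as `pick(CANDIDATES, 1.0, 0.05, 0.05)`, the large model wins; raising both penalties to 0.5 flips the choice to the small one.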

Model Evaluation and Routing Systems

Engineers at large enterprises may struggle to keep up with the latest models, so they rely on routers to direct them to the best model for their specific tasks, and this process involves measuring and evaluating new models to determine their strengths and weaknesses 10s.
The differences between newer models, such as Opus 4.7 and Opus 4.8, can be subtle and may not be noticeable to a layperson, making it essential to have a routing system in place to understand the exact situations where the differences matter 2m6s.
The routing system can help determine when to use a more advanced model, such as a "professor" with 15 years of experience, and when a less advanced model, such as a "high school student," will suffice, allowing for more efficient use of resources 2m6s.
Even consumers may need time to figure out which model is giving them slightly better answers, and this can be a challenging task, especially when the differences between models are subtle 4m30s.
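
The "measure, then route" process described above can be sketched as a table built from evaluation results: for each task type, use the cheapest model that clears a quality bar. The scores, prices, task names, and the `build_routing_table` helper are all hypothetical; real routers would rely on their own evaluation data.

```python
# Hypothetical pass rates per task type, e.g. from an internal evaluation suite.
EVAL_SCORES = {
    "docs": {"small": 0.93, "mid": 0.95, "frontier": 0.96},
    "refactor": {"small": 0.61, "mid": 0.82, "frontier": 0.94},
}

# Illustrative relative price per token.
PRICE = {"small": 1.0, "mid": 3.0, "frontier": 10.0}


def build_routing_table(scores, price, min_score):
    """Map each task type to the cheapest model whose score meets `min_score`.

    If no model clears the bar, fall back to the highest-scoring model.
    """
    table = {}
    for task, by_model in scores.items():
        good_enough = [m for m, s in by_model.items() if s >= min_score]
        if good_enough:
            table[task] = min(good_enough, key=lambda m: price[m])
        else:
            table[task] = max(by_model, key=lambda m: by_model[m])
    return table
```

With a bar of 0.90, documentation work goes to the small model while refactoring goes to the frontier model. In terms of the analogy above, this is how a router would send the "high school student" the tasks that do not need the "professor".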

Model Preferences and Use Cases

OpenAI's GPT 5.5 is considered a good model for planning and is very meticulous, making it suitable for tasks that require attention to detail, while GLM 5.1 is used for execution, and Opus is more friendly but sometimes less reliable 6m40s.
The switch to GPT 5.5 for planning and GLM 5.1 for execution was made about two months ago, after OpenAI improved its models for code, and about 80% of tokens now go to open models like GLM 10m30s.

Competitive Landscape and Emerging AI Players

Other companies, such as Microsoft, are reportedly working on their own coding models, and Gemini is also mentioned as a potential player in this conversation, indicating a competitive landscape in the development of AI models 12m50s.
The current AI landscape appears to be a two-horse race, with OpenAI and Anthropic at the frontier of AI performance, but other companies like Google and xAI are expected to catch up by the end of the year 10s.
Open models from China are considered to be at the frontier in terms of token efficiency or cost for quality, and there is hope that open models will step up to compete with the major labs 1m5s.
For coding problems, Codex is the preferred choice, while GLM 5.1 from a Chinese lab is considered a cheap workhorse for high-volume tasks, and it can be hosted on domestic inference providers 4m30s.
There is a lack of understanding about the ability to host open-source models on one's own infrastructure, which may be holding back enterprise adoption 5m20s.
Opus is considered an overrated model, as it is often used for tasks that do not require its capabilities, such as asking about the weather, while OpenAI is considered underrated due to its actual performance 6m40s.
xAI is also expected to be a major player in the AI landscape by the end of the year, with its Colossus cluster and new models, and its acquisition of Cursor is seen as a smart move to gain more data 8m30s.

Consumer Access and Reliability of AI Models

The availability of AI models for consumer use is still limited, and while some companies like Perplexity offer versions of these models, their reliability and trustworthiness are still uncertain 11m20s.
The desire for the most advanced model, referred to as the "frontier model", is present, but the cost becomes a significant factor when spending large amounts, such as half a billion dollars a month, rather than a relatively small amount of $100 per month 10s.
There is a tendency for individuals to believe that their work requires the most expensive and advanced model, and they are often unwilling to admit that cheaper alternatives could be sufficient, which is a key aspect of changing behavior 42s.
Experimenting with cheaper models can help determine if they can produce satisfactory results, allowing for a reduction in costs and avoiding the depletion of budgets, which can be a significant incentive for making changes 1m26s.
The topic of cost trade-offs with AI is expected to continue to evolve and play out, with potential implications for corporate budgets and the way companies approach AI adoption and implementation 2m6s.
