YouTube video summary

How to measure the effectiveness of GitHub Copilot

Technology · 13 Nov 2024 · 9 min summary · From GitHub

Introduction 0s

  • This session is led by Kitty Trueu, a Senior Developer Architect, and Mickey, a Staff Developer Architect, both on the Customer Success Strategy team at GitHub, who work with customers daily to solve complex problems 12s.
  • The session promises three key takeaways: productivity insights from the Copilot platform, insights from the Copilot metrics API, and suggestions on how to survey developers without causing fatigue 53s.
  • The session is motivated by a common scenario: a developer requests approval for a GitHub Copilot subscription, and the boss wants to know what value it will bring to the business and which numbers justify the expense 1m33s.
  • The focus is on numbers and insights that support a business case for GitHub Copilot, rather than on the hype surrounding AI tools 2m49s.
  • The presenters want attendees to go beyond that hype and understand exactly what they are measuring 2m58s.
  • The presenters will give a demo and suggest ways to measure the effectiveness of GitHub Copilot, so that attendees can show the tool's value to their business and make a case for its adoption 3m10s.

Why we measure 3m17s

  • When modernizing engineering systems, it's essential to build a competitive edge for the business by continually building competency across people, process, and technology 3m33s.
  • The "people" aspect involves investing in the team to increase productivity, ensuring they have the right tools, skills, and communication, as well as adequate feedback and mental space to work effectively 3m57s.
  • The "process" aspect refers to the orchestration of activities to achieve a goal, and leaders should constantly challenge the status quo and remove red tape to improve efficiency 4m26s.
  • A quote from James Clear's "Atomic Habits" underlines that systems need to evolve for goals to be achieved, so leaders should focus on building systems that support their objectives 4m42s.
  • The "technology" aspect involves using tools to enable productivity, and investing in GitHub Copilot is not just about individual productivity but also about increasing system efficiency 5m9s.
  • The ultimate goal of investing in GitHub Copilot is to ship faster and reduce rework 5m27s.
  • To measure the effectiveness of GitHub Copilot, it's essential to establish a baseline of the engineering system and start measuring now, as you can't manage what you don't measure 6m25s.
  • GitHub users have access to metrics and statistics that can provide insights into their platform's productivity, and it's crucial to have a holistic set of signals from people, process, and technology to measure performance 6m50s.
  • Engineering systems may have performance signals from various dimensions, including high-quality code, which is a delight to both developers and the business 7m39s.
  • A developer's achievement is to write quality code, while a business's goal is to avoid the cost of delay and rework, and adapt to the market dynamics 7m45s.
  • High-quality code is characterized by being readable, reusable, maintainable, concise, resilient, and secure, which are the quality attributes that developers aim to write into their code 8m0s.
  • Developer happiness is about how satisfied developers are with their tasks, whether they can focus, feel challenged, and are engaged, as well as their mental energy when working from home 8m21s.
  • Velocity in engineering systems is not just about speed at a single point in time; it covers delivery throughput and pace, as well as friction, waste, delays, and manual validation 8m46s.
  • The goal is to build things right and to build the right things, so that modernization work enables and accelerates engineering system success and delivers a positive business impact and return on investment 9m23s.
  • When defining the purpose of using a tool like GitHub Copilot, the four dimensions to consider are quality code, developer happiness, velocity in engineering systems, and positive business impact 9m49s.

What we measure 9m52s

  • When implementing GitHub Copilot, organizations typically go through four phases: adoption, activity, satisfaction, and impact, with the adoption phase focusing on ensuring users are actually using the product 10m21s.
  • To measure adoption, an API can be used to track basic usage information, such as the number of people using Copilot every day 10m54s.
  • The activity phase involves analyzing how users are utilizing Copilot's features, such as chat suggestions and auto-completion, which can be pulled from the API 11m26s.
  • Acceptance rate is a key metric, calculated by dividing the number of accepted suggestions by the total number of suggestions made, with a 30% acceptance rate considered good 11m45s.
  • Low acceptance rates often indicate that users need training or guidance on how to effectively use Copilot 12m21s.
  • The length of these phases varies from weeks to months, and they cannot be rushed 13m2s.
  • The satisfaction phase assesses whether developers are happy and satisfied using the tool, and whether they believe it helps them solve problems and get into a flow state 13m18s.
  • Satisfaction is part of the developer experience, and no API exposes it, so surveys are the practical way to measure it 13m29s.
  • When creating surveys, it's essential to keep them short, as respondents are more likely to complete a four-question survey than a 20-question one 13m46s.
  • The ultimate goal is to determine the downstream impact of using GitHub Copilot, such as whether it improves software development speed and quality 14m0s.
  • Various metrics can be used to calculate the downstream impact, some of which can be obtained through the GitHub API, while others may require data from other systems 14m11s.
  • GitHub-specific metrics that can be used include average merge request frequency, which can help determine if GitHub Copilot is leading to shorter, smaller commits and faster development 14m41s.
  • When working with customers to adopt GitHub Copilot, it's essential to take a phased approach and not rush the process, as changing people's behavior takes time 15m0s.
  • It's crucial to understand that people's behavior, not just machine performance, is a critical factor in measuring the effectiveness of GitHub Copilot 15m19s.
  • A reasonable timeframe for measuring the downstream impact of GitHub Copilot is months, not days or hours 15m28s.
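
The acceptance-rate calculation and the 30% rule of thumb described above can be sketched in a few lines of Python. This is a minimal illustration, not GitHub's implementation: the record layout and field names (`day`, `suggestions`, `acceptances`, `active_users`) are hypothetical stand-ins for whatever the API returns, and the sample numbers are invented only to exercise the code.

```python
from dataclasses import dataclass

# Rule-of-thumb threshold quoted in the session; a guide, not a target.
GOOD_ACCEPTANCE_RATE = 0.30


@dataclass
class DailyUsage:
    """One day of Copilot usage. Field names are hypothetical."""
    day: str
    suggestions: int
    acceptances: int
    active_users: int


def acceptance_rate(suggestions: int, acceptances: int) -> float:
    """Accepted suggestions divided by suggestions made (0.0 if none were made)."""
    if suggestions == 0:
        return 0.0
    return acceptances / suggestions


def summarize(days: list[DailyUsage]) -> dict:
    """Aggregate a period: overall rate, average daily users, days under the threshold."""
    total_suggestions = sum(d.suggestions for d in days)
    total_acceptances = sum(d.acceptances for d in days)
    low_days = [
        d.day
        for d in days
        if acceptance_rate(d.suggestions, d.acceptances) < GOOD_ACCEPTANCE_RATE
    ]
    avg_users = sum(d.active_users for d in days) / len(days) if days else 0.0
    return {
        "acceptance_rate": acceptance_rate(total_suggestions, total_acceptances),
        "avg_daily_active_users": avg_users,
        "days_below_threshold": low_days,
    }


# Made-up sample data, only to show the calculation.
SAMPLE = [
    DailyUsage("2024-11-11", suggestions=1000, acceptances=320, active_users=40),
    DailyUsage("2024-11-12", suggestions=800, acceptances=200, active_users=38),
    DailyUsage("2024-11-13", suggestions=1200, acceptances=380, active_users=42),
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

The overall rate is computed from period totals rather than by averaging daily rates, so a quiet day with few suggestions does not skew the result. Days listed under `days_below_threshold` are candidates for the training or guidance follow-up mentioned above.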

How we measure 15m31s

  • To measure the effectiveness of GitHub Copilot, the first step is to pull information from the API, which can be done using the GitHub CLI, a tool that encapsulates authentication and allows for easy interaction with GitHub and API calls 15m33s.
  • The API call returns JSON covering a rolling window of the last 28 days, with metrics such as total suggestion count, total acceptance count, and active user count 16m15s.
  • The data can be used to calculate the acceptance rate, which is a key metric for measuring the effectiveness of GitHub Copilot, and can also be broken down by language and IDE 16m52s.
  • Managers and stakeholders often prefer data presented visually, so the data is worth visualizing before sharing insights and recommendations with them 17m24s.
  • A Power BI dashboard created by GitHub can be used to visualize the data and provide a nicer look at metrics such as engaged users, suggestions made, and lines accepted 17m42s.
  • The dashboard gives a better understanding of what's going on in the environment, including acceptance rate, acceptance by language and IDE, and engaged users 18m14s.
  • The data can also reveal trends and areas for improvement, such as a decline in acceptance rate over time, and can inform decisions about how to improve the use of GitHub Copilot 18m20s.
  • As GitHub adds more features to Copilot, more data will be added to the API, allowing for more detailed analysis and visualization 18m42s.
  • To measure the effectiveness of GitHub Copilot, surveys can be used to gather information from developers, and there are various options available, including third-party survey engines or a simple email to developers 19m2s.
  • A free, open-source project called the Copilot Survey Engine is available; it is set up with a database backend and uses a GitHub app to open an issue with four questions every time a pull request is closed 19m21s.
  • The four questions in the Copilot Survey Engine ask whether Copilot helped with the task, how the developer felt about using it, whether it saved time, and what was done with the time saved 19m52s.
  • Another example of a Copilot developer survey is available, with 10 more detailed questions; which survey to choose depends on the specific needs and understanding of the users 20m11s.
  • It is recommended to start with fewer questions, at least initially, to get developers on board and used to answering the questions seriously 20m32s.
  • Downstream metrics, such as deployment frequency and faster merge rates, can be used to measure the effectiveness of GitHub Copilot, and some of this data can be obtained from GitHub using the GitHub API 20m51s.
  • A shell script can be used to pull data from GitHub and calculate downstream metrics, such as deployment frequency, and this data can be used to determine if Copilot is helping to solve problems 21m15s.
  • The section closes by recapping the key metrics that matter in each phase of measuring the effectiveness of GitHub Copilot 22m29s.
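
As an illustration of the deployment-frequency metric mentioned above, here is a minimal Python sketch. The session used a shell script against the GitHub API; this sketch swaps in Python and starts from timestamps that have already been retrieved, so no API call is shown. The function names and the sample timestamps are hypothetical and invented for the example.

```python
from collections import Counter
from datetime import datetime


def deployments_per_week(timestamps: list[str]) -> dict[str, int]:
    """Count deployments per ISO week, keyed like '2024-W45'."""
    counts = Counter()
    for ts in timestamps:
        # Normalize a trailing 'Z' (UTC), which older Python versions
        # of datetime.fromisoformat do not accept.
        moment = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        iso = moment.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[f"{iso[0]}-W{iso[1]:02d}"] += 1
    return dict(sorted(counts.items()))


def average_per_week(weekly: dict[str, int]) -> float:
    """Mean deployments per week, over weeks that had at least one deployment."""
    if not weekly:
        return 0.0
    return sum(weekly.values()) / len(weekly)


# Made-up deployment timestamps, only to show the calculation.
SAMPLE_DEPLOYMENTS = [
    "2024-11-04T10:15:00Z",
    "2024-11-06T16:40:00Z",
    "2024-11-08T09:05:00Z",
    "2024-11-12T11:30:00Z",
    "2024-11-14T14:20:00Z",
]

if __name__ == "__main__":
    weekly = deployments_per_week(SAMPLE_DEPLOYMENTS)
    print(weekly)
    print(average_per_week(weekly))
```

Weeks with no deployments do not appear in the result, so the average is taken only over active weeks. Comparing these weekly counts from before and after the Copilot rollout, over months rather than days, is one way to look for the downstream impact the session describes.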

Key takeaways 22m51s

  • The key takeaways from the presentation include recalling why measuring the effectiveness of GitHub Copilot is important and going beyond the hype to understand the value it brings to the engineering system and the business as a whole 22m51s.
  • Measuring the effectiveness of GitHub Copilot is crucial to understand the return on investment and to build a competitive edge for the business, and it's essential to start measuring now if not already done 23m30s.
  • To measure effectiveness, define signals from your existing engineering excellence metrics or use the four dimensions presented, and integrate third-party system telemetry for a holistic view of the value of the AI investment 23m44s.
  • The presentation demonstrated how to measure from the first-party GitHub platform and the Metrics API, and stressed the importance of integrating third-party system telemetry to show the value of the AI investment 23m50s.
  • A dashboard was demoed, which will be open-sourced, and the Copilot-generated script and the dashboard link will be posted on the community site for further discussion and questions 24m20s.
  • Measuring value is a subjective matter, and the community's input and discussions are encouraged to help define the right metrics and numbers to measure value 25m0s.
  • The presentation concluded with a thank you note to the attendees and an invitation to provide feedback through a survey to improve future experiences 25m17s.