New MIT/Stanford study: 14% increase in productivity -- in real-world conditions!
Key finding: AI closes skill gaps between lower-skill and higher-skill workers
In the launch of this Substack, I cited two academic workplace productivity studies of generative AI from 2023 — the first I knew of in the ChatGPT era. Those studies found 56% and 37% increases in productivity for knowledge workers in coding and writing tasks, respectively.
A third study, by MIT and Stanford professors and entitled “Generative AI at Work,” dropped a few weeks ago. It shows a 14% increase in productivity under real-world conditions for customer service agents using a generative AI tool that suggested solutions they could offer customers. Productivity was measured in chats resolved per hour. (From here on out, I’m going to call this the “call center study.”)
Why 14% is a big deal
In case you’re asking: 14%? That’s way lower than 56% and 37%! Why is Taren still excited about this study?
The answer is: The first two studies were conducted in a research lab, with workers in a novel environment. So while the productivity increases were suggestive, you couldn’t necessarily expect their magnitude to transfer directly to an ordinary real-world workday, where workers might get distracted, get bored, etc.
This call center study is different: It measured real productivity on the job, during the natural experiment of a staggered introduction of the tool, when some agents had access to it and others didn’t yet. And it WORKED! A 14% productivity increase under real-world conditions is huge. Consider, after all, that if that gain were passed through to workers, it would mean an average raise of 14% — more than most workers get in years.
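To make the arithmetic concrete, here is a minimal Python sketch of what a 14% lift in chats resolved per hour could look like. The baseline of 2.0 chats per hour and the 8-hour shift are made-up numbers for illustration only, not figures from the study; the only number taken from the study is the 14% increase.

```python
# Illustrative only: the baseline rate and shift length are hypothetical,
# not figures reported in the call center study.

def percent_lift(baseline_rate: float, treated_rate: float) -> float:
    """Return the percentage increase of treated_rate over baseline_rate."""
    return (treated_rate - baseline_rate) / baseline_rate * 100


HYPOTHETICAL_BASELINE = 2.0   # chats resolved per hour without the AI tool (assumed)
STUDY_LIFT = 0.14             # the 14% increase reported in the study
SHIFT_HOURS = 8               # assumed shift length

# Rate with the tool, if the 14% lift applied to this hypothetical agent.
treated_rate = HYPOTHETICAL_BASELINE * (1 + STUDY_LIFT)

# Sanity check: recover the lift from the two rates.
lift = percent_lift(HYPOTHETICAL_BASELINE, treated_rate)

# Extra chats this hypothetical agent would resolve over one shift.
extra_chats_per_shift = (treated_rate - HYPOTHETICAL_BASELINE) * SHIFT_HOURS

print(f"Lift: {lift:.1f}%")
print(f"Extra chats per shift: {extra_chats_per_shift:.2f}")
```

Swap in your own team's baseline rate and shift length to get a rough sense of scale. Your actual results may differ from the study's 14%.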
A trifecta of trends: Quality, enjoyment, and leveling skills
The new call center study did more than just show an overall productivity increase. It also provides additional confirmation of three trends suggested by one of the earlier papers. That paper was called “Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence,” but I’m going to refer to it as the “writing task study” from here on out.
The writing task study included a wide range of knowledge workers for whom writing is a significant part of their job, from managers to marketing professionals. The authors asked a control group to do two separate job-relevant writing tasks with no special training in between. Meanwhile, they trained the treatment group on using ChatGPT between the two tasks.
Here are the three key findings from the writing task study that the call center study supported:
Generative AI improves quality of work!
As CBS reports on the call center study, “Assistance from the AI also improved customer sentiment, [and] reduced the volume of requests for managerial intervention.” Similarly, in the writing task study, the treatment group that was trained on generative AI tools between their first and second tasks improved their “grades” between tasks by a substantial margin.
Generative AI tools appear to close performance gaps between low-skill and high-skill workers
As Chris Farrell from Marketplace highlights about the call center study, “The most striking result to me and to the scholars is that AI disproportionately improved the work performance of less skilled and less experienced workers. AI also helped these workers raise their job skills fast.”
This aligns with findings from the writing task study. As you’d expect, there was a high correlation between the quality of writing on Task 1 and on Task 2 among the control group: Good writers remain good writers. But using ChatGPT helped weaker writers in the treatment group much more than it helped stronger writers.
Workers LIKE using generative AI tools.
In the earlier writing task study, the authors found that “Exposure to ChatGPT increases job satisfaction and self-efficacy.” However, I was concerned that this effect might be due to the novelty of using the tool, especially for the first time in a lab.
But this job satisfaction finding was supported in the real world in the new call center study, where having access to the chatbot tool improved employee retention. Call center study co-author Erik Brynjolfsson of Stanford told CBS: "There was less churn once they used this tool because it seemed workers were happier and enjoyed the job more…. We wondered if it would push them harder, but it seems to be something workers liked. Customers were happier and I'm guessing as a call center operator, it's more enjoyable to interact with happy customers."
What your organization should expect
When your organization starts adopting generative AI tools, the magnitude of the impact you see will vary both by the type of work and quality of the generative AI tool. But the evidence is becoming clearer that you should expect your team to:
produce higher-quality work (especially among lower-skill workers)…
faster…
while enjoying their jobs more.
As always, if you’d like help figuring out what generative AI tools your organization should be using and training your team on them, please contact me at AI Impact Lab. That’s what we’re all about!