In this paper, we study the adoption of a generative AI tool that provides conversational guidance
for customer support agents. This is, to our knowledge, the first study of the impact of generative
AI when deployed at scale in the workplace. We find that access to AI assistance increases the
productivity of agents by 14 percent, as measured by the number of customer issues they are able to
resolve per hour. In contrast to studies of prior waves of computerization, we find that these gains
accrue disproportionately to less-experienced and lower-skill workers. We argue that this occurs
because ML systems work by capturing and disseminating the patterns of behavior that characterize
the most productive agents. (View Highlight)
It monitors customer chats and provides agents with real-time suggestions for how
to respond. It is designed to augment agents, who remain responsible for the conversation and are
free to ignore its suggestions. (View Highlight)
AI assistance increases worker productivity, resulting in a 13.8 percent increase in the
number of chats that an agent is able to successfully resolve per hour. This increase reflects shifts in
three components of productivity: a decline in the time it takes an agent to handle an individual
chat, an increase in the number of chats that an agent is able to handle per hour (agents may handle
multiple calls at once), and a small increase in the share of chats that are successfully resolved. (View Highlight)
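As a rough sketch of how these components combine: if resolutions per hour are taken to equal chats per hour times the resolution rate (an identity assumed here from the definitions, not a formula quoted from the paper), then log changes in the two components add up to the log change in resolutions per hour. All numbers below are hypothetical and are not the paper's estimates.

```python
import math


def resolutions_per_hour(chats_per_hour: float, resolution_rate: float) -> float:
    """Assumed identity: resolved chats per hour = chats per hour * share resolved."""
    return chats_per_hour * resolution_rate


def pct_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


# Hypothetical before/after values for a single agent (illustration only).
cph_before, cph_after = 2.0, 2.2    # chats handled per hour
rr_before, rr_after = 0.80, 0.82    # share of chats resolved

rph_before = resolutions_per_hour(cph_before, rr_before)
rph_after = resolutions_per_hour(cph_after, rr_after)

# Log changes are additive across the multiplicative components.
log_change_rph = math.log(rph_after / rph_before)
log_change_cph = math.log(cph_after / cph_before)
log_change_rr = math.log(rr_after / rr_before)

print(f"RPH change: {pct_change(rph_before, rph_after):.2f}%")
print(f"log change, RPH: {log_change_rph:.4f}")
print(f"log change, CPH + RR: {log_change_cph + log_change_rr:.4f}")
```

Handle time enters through chats per hour: shorter chats, or more chats handled at once, raise the number of chats an agent can take in an hour.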
AI assistance disproportionately increases the performance of less skilled and less experienced
workers across all productivity measures we consider. In addition, we find that the AI tool
helps newer agents move more quickly down the experience curve: treated agents with two months
of tenure perform just as well as untreated agents with over six months of tenure (View Highlight)
We posit that
high-skill workers may have less to gain from AI assistance precisely because AI recommendations
capture the potentially tacit knowledge embodied in their own behaviors. Rather, low-skill workers
are more likely to improve by incorporating these behaviors by adhering to AI suggestions. Consistent
with this, we find few positive effects of AI access for the highest-skilled or most-experienced
workers. Instead, using textual analysis, we find suggestive evidence that AI assistance leads
lower-skill agents to communicate more like high-skill agents. (View Highlight)
We show that AI assistance markedly improves how customers treat agents, as measured
by the sentiments of their chat messages (View Highlight)
New highlights added October 19, 2023 at 3:22 PM
We study the staggered introduction of a generative AI-based conversational assistant using data from 5,179 customer support agents. Access to the tool increases productivity, as measured by issues resolved per hour, by 14 percent on average, with the greatest impact on novice and low-skilled workers, and minimal impact on experienced and highly skilled workers. We provide suggestive evidence that the AI model disseminates the potentially tacit knowledge of more able workers and helps newer workers move down the experience curve. In addition, we show that AI assistance improves customer sentiment, reduces requests for managerial intervention, and improves employee retention. (View Highlight)
At a technical level, customer support is well-suited for current generative AI tools. From an AI’s perspective, customer-agent conversations can be thought of as a series of pattern-matching problems in which one is looking for an optimal sequence of actions. When confronted with an issue such as “I can’t login,” an AI/agent must identify which types of underlying problems are most likely to lead a customer to be unable to log in and think about which solutions typically resolve these problems (“Can you check that caps lock is not on?”). At the same time, they must be attuned to a customer’s emotional response, making sure to use language that increases the likelihood that a customer will respond positively (“that wasn’t stupid of you at all! I always forget to check that too!”). Because customer service conversations are widely recorded and digitized, pre-trained LLMs can be fine-tuned for customer service using many examples of both successfully and unsuccessfully resolved conversations. (View Highlight)
“average handle time,” the average length of time an agent takes to finish a chat; “resolution rate,” the share of conversations that the agent can successfully resolve; and “net promoter score” (customer satisfaction), which is calculated by randomly surveying customers after a chat and taking the percentage of customers who would recommend an agent minus the percentage who would not. (View Highlight)
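The net promoter score as described here can be sketched in a few lines. This assumes each survey response is a simple yes/no on whether the customer would recommend the agent, which is a simplification made for this example; standard NPS surveys use a 0 to 10 scale and subtract the share of detractors from the share of promoters. The function name and sample data are hypothetical.

```python
from typing import List


def net_promoter_score(would_recommend: List[bool]) -> float:
    """Percentage who would recommend minus percentage who would not.

    Assumes binary survey responses (True = would recommend).
    """
    if not would_recommend:
        raise ValueError("need at least one survey response")
    n = len(would_recommend)
    yes = sum(would_recommend)   # True counts as 1
    no = n - yes
    return 100.0 * (yes - no) / n


# Hypothetical survey: three customers would recommend, one would not.
responses = [True, True, True, False]
print(net_promoter_score(responses))  # 75% minus 25% = 50.0
```

The score ranges from -100 (no one would recommend) to +100 (everyone would).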
: agents who are never given access to the AI tool during our sample period (“never treated”), pre-AI observations for those who are eventually given access (“treated, pre”), and post-AI observations (“treated, post”). In total, we observe the conversation text and outcomes associated with 3 million chats by 5,179 agents. Within this, we observe 1.2 million chats by 1,636 agents in the post-AI period. (View Highlight)
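A minimal sketch of how observations might be sorted into these three groups under staggered adoption. It assumes each agent has either no adoption date or a single date on which access began, and that a chat on the adoption date itself counts as post-AI; both are assumptions of this example, and the function name is hypothetical.

```python
from datetime import date
from typing import Optional


def label_observation(chat_date: date, adoption_date: Optional[date]) -> str:
    """Classify one chat by the agent's AI-access status at the time."""
    if adoption_date is None:
        return "never treated"      # agent never gets access in the sample
    if chat_date < adoption_date:
        return "treated, pre"       # eventual adopter, before access
    return "treated, post"          # on or after the adoption date (assumed)


# Hypothetical agent who gained access on 1 March 2021.
adopted = date(2021, 3, 1)
print(label_observation(date(2021, 2, 15), adopted))  # treated, pre
print(label_observation(date(2021, 3, 10), adopted))  # treated, post
print(label_observation(date(2021, 3, 10), None))     # never treated
```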
Our primary measure of productivity is resolutions per hour (RPH), the number of chats that a worker is able to successfully resolve per hour. (View Highlight)
We measure these individually as, respectively, average handle time (AHT), chats per hour (CPH), and resolution rate (RR). In addition, we also observe a measure of customer satisfaction through an agent’s net promoter score (NPS), (View Highlight)
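To make the four measures concrete, here is a sketch that computes them for one agent from a list of chat records. The `Chat` record, its fields, and the sample data are hypothetical; the paper's actual data layout is not described in these highlights.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Chat:
    duration_minutes: float   # time the agent spent on the chat
    resolved: bool            # whether the issue was successfully resolved


def agent_metrics(chats: List[Chat], hours_worked: float) -> Dict[str, float]:
    """Compute AHT, CPH, RR, and RPH for one agent over a work period."""
    if not chats or hours_worked <= 0:
        raise ValueError("need at least one chat and positive hours worked")
    n_resolved = sum(chat.resolved for chat in chats)
    return {
        "AHT": sum(chat.duration_minutes for chat in chats) / len(chats),
        "CPH": len(chats) / hours_worked,
        "RR": n_resolved / len(chats),
        "RPH": n_resolved / hours_worked,
    }


# Hypothetical shift: four chats over two hours, three of them resolved.
shift = [Chat(30, True), Chat(20, True), Chat(40, False), Chat(30, True)]
metrics = agent_metrics(shift, hours_worked=2.0)
print(metrics)
```

Because agents can handle several chats at once, chats per hour is computed from hours worked rather than derived from average handle time.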