TaxVox: Will New IRS Performance Metrics Measure Up?
Barry Johnson, Janet Holtzblatt

Last month, the IRS Chief Executive Officer (CEO), Frank Bisignano, announced that the agency will stop using its “level of service” (LOS) metric, the percentage of Accounts Management (AM) calls answered by a live customer service representative.

For over two decades, LOS has been one of the most visible measures of the IRS’s ability to assist taxpayers during the filing season, often cited in Commissioners’ testimony and press accounts. But LOS has long been criticized by the IRS’s watchdogs: the National Taxpayer Advocate (NTA), the Government Accountability Office, and the Treasury Inspector General for Tax Administration. The IRS will replace LOS with several new measures, which could provide new insights into the effects of staff cutbacks and the adoption of new technologies, but only if the agency shares how the new measures are computed and commits to providing timely updates.

IRS initially lowered its LOS goal due to reduced staffing

The IRS typically releases two LOS measures, one for the filing season, and another for the entire year—which tends to be lower as priorities shift from answering calls to other tasks after April 15. Last year, despite high turnover in leadership and staff, the IRS surpassed its LOS goal of 85 percent during the filing season—presumably because mission-critical staff could not take advantage of buyouts until the filing season ended. 

However, as of October, the IRS had lost 17 percent of its employees in key filing season activities. Over the summer, the IRS announced that it would need 3,500 new taxpayer assistance representatives to achieve an LOS of 85 percent in the 2026 filing season. But when it reached only two-thirds of that hiring goal, and training was slowed by the October government shutdown and departures of experienced staff, the IRS lowered its 2026 LOS target to 70 percent. Shortly afterward, the CEO announced that LOS would be replaced with other performance metrics.

LOS is an incomplete measure of taxpayer experience

In her 2026 Objectives Report to Congress, NTA Erin Collins noted that by focusing solely on AM customer service calls during the filing season, the IRS overlooked 11 million calls, or about 30 percent of the 2025 filing season total, including those related to overdue balances, collection activities, and identity theft. In her subsequent annual report, Collins found that of the nearly 104 million calls to the IRS made throughout fiscal year 2025, about 75 million were not answered by an employee. 
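The figures above can be turned into shares with simple arithmetic. The sketch below uses only the rounded numbers quoted in the text, so the results are approximate. The variable names are illustrative, and the implied filing season total is a back-of-the-envelope inference from "11 million" and "about 30 percent," not a number reported by the NTA.

```python
# Rounded figures quoted in the text, in millions of calls.
fy2025_total_calls = 104    # "nearly 104 million" calls in fiscal year 2025
fy2025_not_answered = 75    # "about 75 million" not answered by an employee

share_not_answered = fy2025_not_answered / fy2025_total_calls
share_answered = 1 - share_not_answered

# Filing season: 11 million overlooked calls were "about 30 percent" of the total.
# Dividing one rounded figure by another gives only a rough implied total.
overlooked_calls = 11
overlooked_share = 0.30
implied_filing_season_total = overlooked_calls / overlooked_share

print(f"Not answered by an employee: {share_not_answered:.0%}")
print(f"Answered by an employee:     {share_answered:.0%}")
print(f"Implied filing season total: roughly {implied_filing_season_total:.0f} million calls")
```

On these rounded inputs, roughly 72 percent of fiscal year 2025 calls were not answered by an employee. That share says nothing about how many of those callers were served by an automated response and how many simply gave up.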

Moreover, LOS included only calls routed to a representative, ignoring those directed to automated responses and many hang-ups. 

In January, Bisignano announced that LOS would be replaced with enterprise metrics that reflect new technologies and service channels: the average speed of answer, call abandonment rate, and time spent on the line.  

The new metrics may be an improvement

Much more information is needed to determine whether the new metrics will better measure taxpayers’ experience, including their efforts to get answers to questions using the full range of tools available to them.

In the private sector, “speed of answer” refers to the amount of time it takes for the caller to speak to a customer service representative, and “abandonment rate” is the share of callers who hang up before they reach that point. When the IRS uses those measures, how will it count callers who are directed to an automated response before reaching a representative, or callers who try to speak to a human after they listen to the automated response?
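To see why those counting choices matter, here is a minimal sketch of the two private-sector definitions described above. Everything in it is hypothetical: the `Call` record, the `count_automated_only` switch, and the four sample calls are invented for illustration and do not reflect how the IRS computes or will compute its metrics.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Call:
    wait_seconds: float             # time waiting before an answer or a hang-up
    reached_representative: bool    # True if a live representative answered
    heard_automated_response: bool  # True if the caller was sent to automation


def average_speed_of_answer(calls: list[Call]) -> Optional[float]:
    """Mean wait, in seconds, among calls that a representative answered."""
    waits = [c.wait_seconds for c in calls if c.reached_representative]
    return sum(waits) / len(waits) if waits else None


def abandonment_rate(calls: list[Call], count_automated_only: bool) -> float:
    """Share of calls that ended before reaching a representative.

    When count_automated_only is False, callers who heard an automated
    response and never reached a representative are dropped from both the
    numerator and the denominator (treated as served by automation).
    """
    if not count_automated_only:
        calls = [c for c in calls
                 if c.reached_representative or not c.heard_automated_response]
    if not calls:
        return 0.0
    abandoned = sum(1 for c in calls if not c.reached_representative)
    return abandoned / len(calls)


# Hypothetical sample: not IRS data.
calls = [
    Call(120, reached_representative=True, heard_automated_response=False),
    Call(300, reached_representative=True, heard_automated_response=True),
    Call(600, reached_representative=False, heard_automated_response=False),
    Call(45, reached_representative=False, heard_automated_response=True),
]

print(average_speed_of_answer(calls))                      # mean wait for answered calls
print(abandonment_rate(calls, count_automated_only=True))  # automation-only calls count as abandoned
print(abandonment_rate(calls, count_automated_only=False)) # automation-only calls excluded
```

In this made-up sample, the abandonment rate is one-half under the first counting rule and one-third under the second, even though the underlying calls are identical. That gap is why the agency's definitions need to be public.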

Less clear is the meaning of “time on line.” Is it the total amount of time on a phone line, including the wait, listening to an automated response, and talking to a representative? Or would this measure the use of relatively new technologies serving taxpayers, such as the amount of time online interacting with a chatbot?
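The two readings of “time on line” would produce very different numbers for the same taxpayer. The sketch below assumes a hypothetical `Contact` record that splits one interaction into segments. The field names and sample values are invented for illustration, and the IRS has not said which definition, if either, it will use.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    wait_seconds: float = 0.0            # time on hold
    automated_seconds: float = 0.0       # time listening to an automated response
    representative_seconds: float = 0.0  # time talking to a representative
    chatbot_seconds: float = 0.0         # time interacting with an online chatbot


def phone_time_on_line(c: Contact) -> float:
    """Reading 1: total time on the phone line, from dialing to hanging up."""
    return c.wait_seconds + c.automated_seconds + c.representative_seconds


def online_time(c: Contact) -> float:
    """Reading 2: time spent online with a chatbot or similar tool."""
    return c.chatbot_seconds


# Hypothetical taxpayer who only used the phone: not IRS data.
contact = Contact(wait_seconds=480, automated_seconds=90, representative_seconds=420)

print(phone_time_on_line(contact) / 60)  # minutes under reading 1
print(online_time(contact) / 60)         # minutes under reading 2
```

Under the first reading this caller spent 16.5 minutes on the line. Under the second, the same caller would register as zero.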

Bisignano also declared that the new measures will be “enterprise metrics.” Does that mean that they will include all calls to the IRS, including taxpayers’ queries about overdue balances, collections, and identity theft? And will the focus then shift from performance during the short filing season to the entire year?

Performance metrics need to include quality, trade-offs, and historical data

Quality of call matters, too. In the past, one of the IRS’s performance measures has been the accuracy of representatives’ responses, based on a sample of calls. Bisignano has not revealed whether the IRS will retain that measure.

But even that quality measure is inadequate. New metrics could be strengthened with information about the number of IRS employees that the taxpayer spoke to, the thoroughness and accuracy of the information, and whether the taxpayer’s question was resolved. 

User experience studies would shed more light on the IRS’s performance. However, the IRS disbanded its user experience office last spring. 

Trade-offs between different types of services also matter. Customer service representatives not only answer the phones, but they also input data from paper returns into the IRS master file (at least until the IRS achieves its goals of digitizing paper returns) and correspond with taxpayers, many of whom are responding to notices about refunds frozen by the IRS. If phone service gets high scores but refunds are delayed, we doubt taxpayers will be satisfied with their interactions with the tax agency.

Finally, when it adds new metrics, the IRS should also include comparable estimates of its past performance. Only then can people gauge any improvements in service.

With the 2026 filing season underway, transparency is key

The IRS is counting on new technology, including AI-based tools, to offset significant staff losses over the past year and improve service. There is also room for improvement in legacy performance measures, and the agency should report a wide range of metrics. That includes separate measures for calls (distinguishing between automated responses and conversations with representatives) and for chatbot usage, historical data for comparisons, and identification of the trade-offs among the IRS’s different tasks.

The public’s ability to judge how well the tax system is performing depends in part on how transparent the IRS is.
