Benchmarks

The quantification of everyday life

The English poet W.H. Auden first published The Unknown Citizen in the January 6, 1940 edition of The New Yorker. Auden’s poem depicts the epitaph of a ‘Modern Man’, one whose life choices, political views, consumption patterns, and conduct are favourably assessed against the prevailing normative standards of the day by government agencies, his employer, and his trade union. The poem concludes its satirical critique of the creeping bureaucratization and standardization of the era by dismissing the possibility that there are any important dimensions of human life that cannot be measured statistically by the state: ‘Was he free? Was he happy? The question is absurd: Had anything been wrong, we should certainly have heard’.

The Unknown Citizen - W.H. Auden

He was found by the Bureau of Statistics to be
One against whom there was no official complaint,
And all the reports on his conduct agree
That, in the modern sense of an old-fashioned word, he was a saint,
For in everything he did he served the Greater Community.
Except for the War till the day he retired
He worked in a factory and never got fired,
But satisfied his employers, Fudge Motors Inc.
Yet he wasn’t a scab or odd in his views,
For his Union reports that he paid his dues,
(Our report on his Union shows it was sound)
And our Social Psychology workers found
That he was popular with his mates and liked a drink.
The Press are convinced that he bought a paper every day
And that his reactions to advertisements were normal in every way.
Policies taken out in his name prove that he was fully insured,
And his Health-card shows he was once in hospital but left it cured.
Both Producers Research and High-Grade Living declare
He was fully sensible to the advantages of the Instalment Plan
And had everything necessary to the Modern Man,
A phonograph, a radio, a car and a frigidaire.
Our researchers into Public Opinion are content
That he held the proper opinions for the time of year;
When there was peace, he was for peace: when there was war, he went.
He was married and added five children to the population,
Which our Eugenist says was the right number for a parent of his generation.
And our teachers report that he never interfered with their education.
Was he free? Was he happy? The question is absurd:
Had anything been wrong, we should certainly have heard.

Nearly 80 years later, the nascent tendency that Auden critiqued, in which the state reduces economic and social life to statistical standards, has developed into a leviathan of numbers, with even the relative happiness of different countries subject to numerical indicators and comparative rankings. The incessant measurement of social and economic behaviour and performance pervades everyday life in the 21st century. This quantification of everyday life translates the complicated and messy subjective realities of individuals and groups into neat objective categories of performance, making them easier to compare, to grade, to govern, and to influence. Numbers are used to measure a diverse and growing range of aspects of people’s lives and activities around the world, from national economic performance and global poverty to political freedom, sexual and gender-based violence, slavery, education, sport, and many other areas. They also shape how governments prioritize and decide on major policy choices, and influence how markets and individuals respond to those choices.

Ranking freedom. Source: Freedom House

Benchmarking the world

Global benchmarking is a transnational practice of comparative assessment based on the quality of conduct, the quality of policy and institutional design, or the quality of economic and social outcomes for a set of countries or other political units or organisations. The practice of global benchmarking begins with particular types of behaviour, rules, or outcomes that are of interest to a specific actor. These complex phenomena are then simplified and compared using a common metric in order to create performance indicators that can be aggregated into country scores or ratings, which are often presented as a hierarchical ranking of national performance from top to bottom. Global benchmarks have become a prominent means for states, international organisations, civil society organisations, and other actors to capture political attention, shape public debates, set policy agendas, and identify targets for interventions that aim at ‘improving’ performance in desired ways.
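The steps described above (simplify, compare on a common metric, aggregate, rank) can be sketched in a few lines of Python. This is an illustrative sketch only, not the method of any actual benchmark: the countries, indicator names, and values are invented, min-max rescaling is just one of several possible ways to create a common metric, and the equal-weight average is an assumption.

```python
# Hypothetical data: invented indicator values for three fictional countries.
RAW = {
    "A": {"school_years": 12.0, "income_usd": 40000.0},
    "B": {"school_years": 9.0, "income_usd": 15000.0},
    "C": {"school_years": 6.0, "income_usd": 20000.0},
}


def min_max(values):
    """Rescale one indicator to the 0-1 range so that units become comparable."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # No variation across countries: every country gets the same score.
        return {country: 0.0 for country in values}
    return {country: (v - lo) / (hi - lo) for country, v in values.items()}


def benchmark(raw):
    """Return (scores, ranking) using an equal-weight average of rescaled indicators."""
    indicators = sorted(next(iter(raw.values())))
    normalized = {
        ind: min_max({country: raw[country][ind] for country in raw})
        for ind in indicators
    }
    scores = {
        country: sum(normalized[ind][country] for ind in indicators) / len(indicators)
        for country in raw
    }
    # Hierarchical ranking from 'best' to 'worst' performer.
    ranking = sorted(scores, key=scores.get, reverse=True)
    return scores, ranking


SCORES, RANKING = benchmark(RAW)
print(RANKING)  # ['A', 'B', 'C']
```

Even in this toy case, the ranking depends on choices that the final league table does not display: which indicators are included, how they are rescaled, and how they are averaged.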

Benchmarking has quickly proliferated as a key tool of transnational governance in recent years. The Global Benchmarking Project has catalogued 275 benchmarks that have been created to rate and rank comparative performance in world politics. Over 200 of these benchmarks have been established since 2000, and they serve to (re)produce social hierarchies in the global political economy between good and bad performers. Countries ranked as the ‘best’ in global benchmarks are typically wealthy developed economies, with European, North American, and Australasian countries dominating the ‘top ten’ of many global league tables.

The proliferation of global benchmarks. Source: Global Benchmarking Database

Dodgy data and distorting discourse

Powerful numbers can be subject to political manipulation to create the appearance of progress or to exaggerate problems through fiddling the figures, or to disguise weaknesses in data collection, coverage, and validity. For instance, the Trump administration in the United States explored the possibility of tweaking official US trade statistics to make the country’s trade deficit with other countries appear larger, and thereby strengthen political arguments against new multilateral trade deals. Evidence also suggests that some developing countries may under-report gross national income (GNI) per capita data to the World Bank in order to maintain access to concessional lending through the International Development Association. Heated arguments over the quality of some countries’ official statistics point to the political stakes involved in the question of how the outcomes of economic development and poverty reduction initiatives are measured and benchmarked.

One major problem with using benchmarks as governance tools is the failure of numerical measures to capture the complexity and multidimensionality of economic and social phenomena. This is referred to in the social sciences as a construct validity problem, and occurs when the design of the performance indicators that are used to develop a benchmark fails to effectively and holistically measure what the benchmark claims to measure. For example, national measurements of gross domestic product (GDP) fail to capture the economic costs of environmental degradation, ignore the value created through unpaid household labour, and provide an extremely narrow conceptualization of prosperity that obscures fundamental variations in economic outcomes across different social groups, regions, industries, and income brackets in a particular country. The use of benchmarks by civil society organisations that advocate for political change and policy reform often suffers from similar methodological limitations.

US senator Robert Kennedy critiques GDP in 1968

Another significant problem is the capacity of benchmarks to distort discourse in a particular issue area, either in elite policymaking arenas or in public debate via the media. For example, a prominent global benchmark such as the World Bank–International Finance Corporation’s Ease of Doing Business (EDB) ranking can shape how political actors talk and think about business regulation by presenting a particular image of the world as a universal standard for all governments to conform to. In the case of the EDB ranking, the image of the world promoted through the benchmark equates ‘good’ regulation with a liberal market economy that imposes fewer and less intrusive regulations on business activities. This has led to criticism from states as well as from official reviews that have questioned the efficacy and validity of the exercise, but such contestation has not prevented a number of countries, such as Russia, from seeking to improve their EDB ranking as an official goal of government policy.

Even when governments push back and challenge the legitimacy and the quality of the expertise behind a benchmark, as China did in June 2017 after being downgraded to ‘tier 3’ (the lowest ranking) in the US State Department’s Trafficking in Persons report, the reaction itself indicates the power of benchmarks to shape political conversations, embarrass governments, and disrupt the news cycle.

Living with benchmarks

Benchmarks now reify many of the economic and social dimensions of everyday life, and have become increasingly intrusive as management strategies in individuals’ professional lives, including within the modern university. Observers from radical as well as mainstream perspectives have questioned the turn to quantification as an end in itself and have noted that benchmarking amplifies institutional incentives for ‘teaching to the test’, whereby the targets of a benchmark learn to adjust their actions or policies just enough to improve their results in performance measurements. At the same time, the growing popularity of composite benchmarks, which combine a range of indicators sourced from other benchmarks, compounds the distortions inherent in this peculiar form of data aggregation. Each composite must resolve myriad difficulties in concept definition, in the selection and weighting of different indicators, and in how indicators with different measurement units are made commensurable and translated into numbers.
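The sensitivity of composite benchmarks to weighting can be illustrated with a small Python sketch. Everything here is hypothetical: the two countries, the indicator names, the already-rescaled scores, and both weighting schemes are invented for illustration and do not reproduce any real index.

```python
# Hypothetical indicator scores, already rescaled to the 0-1 range.
NORMALIZED = {
    "A": {"regulation": 0.9, "labour_rights": 0.2},
    "B": {"regulation": 0.4, "labour_rights": 0.8},
}


def composite_ranking(normalized, weights):
    """Rank countries by a weighted average of their indicator scores."""
    total = sum(weights.values())
    scores = {
        country: sum(weights[ind] * values[ind] for ind in weights) / total
        for country, values in normalized.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Two equally arbitrary weighting schemes applied to the same underlying data.
ranking_a = composite_ranking(NORMALIZED, {"regulation": 0.7, "labour_rights": 0.3})
ranking_b = composite_ranking(NORMALIZED, {"regulation": 0.3, "labour_rights": 0.7})

print(ranking_a)  # ['A', 'B']
print(ranking_b)  # ['B', 'A']
```

The underlying data are identical in both runs; only the weights differ, yet the ‘best’ performer changes. The choice of weights is therefore a normative decision about what matters, not a neutral technical step.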

These troubling dynamics are found at every level of political economy that is subject to benchmarking today, from the local to the global, although they do not apply equally across different types of institutions and actors. The ties that bind benchmarking of performance to changes in future actions are likely to be tighter in cases where direct consequences result from ‘good’ or ‘bad’ performance assessments, and when it is harder for the targets of benchmarking to disguise under-performance or to game the system through cosmetic compliance with the normative agenda underlying a particular benchmark. Substantial differences exist between the respective influence of comparative performance measurements on the future actions of individuals (such as the use of personal credit scores for loan criteria), professions (such as the UK’s research excellence framework for academic researchers), manufacturing businesses (such as supply chain audits), financial institutions (such as ratings of debt securities), countries (such as governance indicators), or international organisations (such as comparative assessments of the transparency and accountability of global governance processes). Put bluntly, benchmarking often constitutes bad science.

The practice of global benchmarking raises ethical concerns about whether it is appropriate to comparatively assess the national performance of dissimilar countries with very different structural circumstances, historical legacies, and governing arrangements with a common metric. Like Auden’s Unknown Citizen, countries are today subjected to benchmarking based on a series of particular normative standards about how the world ought to be ordered. While benchmarks can be a powerful tool for engendering economic and social change, the legitimacy of benchmarks and the question of who benefits from this process and whose interests are being advanced should not be treated as marginal issues that are a secondary concern to the promotion of a ‘good cause’. Rather, the interrogation of who and what is measured, how, and who benefits from benchmarks is fundamental to understanding the International Political Economy of everyday life in the contemporary era.

Benchmarks Resources

Books

Andreas, P. and Greenhill, K.M. Eds. (2010) Sex, Drugs, and Body Counts: The Politics of Numbers in Global Crime and Conflict. Ithaca: Cornell University Press

Cooley, A. and Snyder, J. Eds. (2015) Ranking the World: Grading States as a Tool of Global Governance. Cambridge: Cambridge University Press

Kelley, J. (2017) Scorecard Diplomacy: Grading States to Influence Their Reputation and Behavior. Cambridge: Cambridge University Press

Merry, S.E., Davis, K.E., and Kingsbury, B. Eds. (2015) The Quiet Power of Indicators: Measuring Governance, Corruption, and Rule of Law. Cambridge: Cambridge University Press

Articles

Broome, A., Homolar, A., and Kranke, M. (2017). Bad Science: International Organizations and the Indirect Power of Global Benchmarking. European Journal of International Relations.

Broome, A. and Quirk, J. (2015). Governing the World at a Distance: The Practice of Global Benchmarking. Review of International Studies, 41(5), pp. 819–841.

LeBaron, G. and Lister, J. (2015). Benchmarking Global Supply Chains: The Power of the ‘Ethical Audit’ Regime. Review of International Studies, 41(5), pp. 905–924.

Seabrooke, L. and Wigan, D. (2015). How Activists Use Benchmarks: Reformist and Revolutionary Benchmarks for Global Economic Justice. Review of International Studies, 41(5), pp. 887–904.

Websites

Fickle Formulas

Global Benchmarking Database

Scorecard Diplomacy