If the object of developing and developed world leaders is to uplift their
peoples continually, then it is essential to measure approximations of actual
service deliveries (what we ought to mean by “governance”), not to rate
nations impressionistically according to the perceived quality of their
operations, their perceived impartiality (as per Rothstein), the extent of
their bureaucratic autonomy (as per Fukuyama and others), or their capacity
to coax or coerce citizens. Only in that positive manner can we distinguish
the governments that are producing abundant political goods (i.e.,
good governance) from those that no longer are, or never did.
Would it not be enormously useful if we could establish an agreed-upon
definition of “governance”? Doing so would enable researchers to write
with comparative precision about critical and important phenomena.
Policy and opinion makers could reflect meaningfully upon clearly
enunciated and measurable characteristics. Governments could be examined
using a common typology. Donors and others could employ a well-considered
scale to compare countries for investment and assistance.
Alas, governance is a concept of many proprietors and many varieties
of definition and explanation. This note responds especially to the
approach recently articulated in Governance by Francis Fukuyama.
He emphasizes bureaucratic capabilities as a key aspect of governance
(Fukuyama 2013a). For a decade, in print and otherwise, I have been
attempting to introduce and popularize a different sort of definition
that equates governance with the performance of governments, which
sets out the specific criteria by which that performance should be
assessed, and that indicates the kinds of data that should be gathered to
do the necessary measurement (Rotberg 2004, 2007, Rotberg and West
In my view, and as I have argued elsewhere, performance of governments
means the delivery of the five bundles (divided into 57 underlying
subcategories) of political goods that citizens within any kind of political jurisdiction demand. I suggest that measuring performance, contrary to
Fukuyama, can best be done by using publicly available objective (not
subjective) data, and by examining outputs (results), not inputs. Proxy
results (e.g., life expectancy for the performance of a government in
providing good health outcomes, homicide numbers for safety, etc.) demonstrate
the effective delivery of governmental services—governance.
Governance, in other words, is tangible. It acts. It is not something stylistic
or artistic.
Such a scheme makes epistemological and parsimonious sense. It is
tidy and transparent. And it works.
Inputs or Outputs? Quality or Capacity?

Most of the other work on measuring governance has found it easier and
more satisfying to examine inputs rather than outputs, and to
do so subjectively by estimating bureaucratic and other “capacity” or by
assessing budgetary procedures, styles of financial management, or the
ineffable “quality” of governments. Rothstein, for example, seeks normatively
and procedurally to measure a state’s “impartiality”—a proxy for
the overall quality of a government. “Just political procedures are those
that by and large can be seen as impartial by groups with very different
conceptions of ‘the good.’ ” He and his Quality of Government Institute at
the University of Gothenburg use surveys of the opinions of experts (1,000
in 126 countries) to estimate “impartiality” and to rank countries according
to quality—how good their governments are (Rothstein 2011, 12–23;
Rothstein and Teorell 2013). But impartiality might or might not carry
with it the ability to “deliver,” that is, to “perform.” And, despite
Rothstein’s protestations, “impartiality” is rooted too deeply in
philosophical liberalism.
“Quality” studies are also carried out by the Varieties of Democracy
Project, and indexes such as the Bertelsmann Stiftung’s Transformation
Index (Rotberg, Bhushan, and Gisselquist 2014). Freedom House’s
Freedom in the World annually declares almost every global polity
either free, partly free, or not free based on the opinions of experts. But
its scoring is inherently subjective, with abundant opportunities for
selection bias (Freedom House 2013). UNDP calls governance a “system
of values, policies, and institutions by which a society manages its
economic and social affairs. . . . It is the way a society organizes itself to
make and implement decisions . . .” (UNDP 2000). But that is very
general, and measuring good governance in that manner becomes an
exercise in subjective speculation.
Widely used is the World Bank’s Worldwide Governance Indicators
(WGI); it measures quality of national governance by aggregating other
indexes of governmental effectiveness, regulatory quality, stability, and
control of corruption—all attributes capable of being estimated by crowd
sourcing (surveys of experts again) but more difficult to calibrate using
nationally generated statistics. The Indicators are largely normative,
encompassing policy preferences rather than (as in an output-oriented
index such as I propose) measuring the satisfaction of citizen-request
priorities. Rothstein labels the Indicators’ definition of governance as too
broad, especially its normative emphasis on “sound policies.” The Bank’s
emphasis on the input side of the governance equation “makes it impossible,”
he writes, to provide true results for governmental performance
(Rothstein 2011, 8; Rothstein and Teorell 2013, 3; see also Kraay,
Kaufmann, and Mastruzzi 2010).
As Fukuyama indicates, expert surveys are inherently weak. Unless the
experts have a common notion of “governance” or “regulatory effectiveness,”
each may answer the questions posed honestly, but from vastly
different perspectives. One person’s corruption, in other words, may be
another’s reciprocal “gift-giving.” Rule of law, he says, may even mean
one thing in one region and something very different in another. Some
may translate rule of law as “property rights,” others as constitutional
and other constraints on the executive (Fukuyama 2013b).
Arndt and Oman concur with Fukuyama and other critics of basing
index results largely, if not entirely, on the perceptions of experts. Such
views are “inherently subjective” and “non-replicable.” Moreover,
perception-based indicators often reflect the views of businessmen and
“users tend to rely on the same indicators which they see their peers
using.” So there has been a “bubble effect,” and “herd behaviour” (Arndt
and Oman 2006). Arndt and Oman’s report and work by Thomas include
lengthy critiques of the methodology employed by the WGI (Thomas
Fukuyama himself prefers to define governance as a government’s
ability “to make and enforce rules, and to deliver services,” whether
within a democratic framework or not. For him, “governance is . . . execution.”
How a regime administers itself is critical. He builds on Weber’s
criteria for successful bureaucracies, that is, technocrats selected and motivated
by merit, remunerated fairly, and subject to discipline and control
(Fukuyama 2013b, 3–4).
These are inputs, preferred as measures by Fukuyama and many
others who have produced existing scholarship on governance, and
assuredly also by the majority of the makers of existing governance
indexes. Obviously, “good procedures and strong capacity are not ends in themselves.” Fukuyama understands that measuring outputs “could” provide “some idea as to how governments are performing” (Fukuyama 2013b, 11). But he asserts that there are decisive drawbacks to the use of outputs: Improvements, say, in the educational or health areas are not necessarily the consequences of governmental action. Those results could (as many economists believe) flow from the contextual situation, because of an existing resource base, or as a result of historical circumstances.

Andrews’ reservation (with which I mostly disagree) is the caution of
many: Indicators of governance “really reflect a nation’s level of development,” and not its governance (Andrews 2013, 5). Fukuyama also worries that there are too many methodological problems with the measurement of many kinds of outcomes, and that outcomes can be influenced too much by procedural inputs—how a regime delivers could influence the type of result.
Fukuyama proposes that outputs should be considered dependent
variables explained by state quality rather than being measures of capacity
themselves. He further suggests that the quality of government (or governance)
is to be found at the intersection between what he calls capacity
and what Huntington described as “bureaucratic autonomy” (Huntington
2006). The latter is the ability of the bureaucrats to carry out the policies of
the state with little micro-management and according to broad guidelines.
Capacity, imperfectly defined, includes the ability of a state to perform
essential functions such as, but not exclusively, the ability to extract taxes
and obtain census information.
Andrews leans in his definition of governance more in the general
direction proposed by this article: The core theoretical understanding of
“governance” should be “the exercise of authority by governments on
behalf of citizens.” Governance indicators, he writes, should therefore
focus (as the Index of African Governance does) on “specific fields of engagement”
in which governments perform on behalf of citizens. “Indicators
should emphasize outcomes . . . the true indicators of governance”
(Andrews 2013, 6; Gisselquist 2012; Rotberg and Gisselquist 2009).
Bratton, another recent commentator, suggests that “governance is the
act or process of imparting direction and coordination to governmental
organizations in an environment.” He is close to Fukuyama since his
version of “governance” is “administrative and economic” as well as
political. Bratton (one of the founders of the Afrobarometer) believes in
the utility for measuring purposes of large-scale social surveys. They can
indeed indicate citizen or consumer satisfaction. An example he cites from
the administrative sphere, as discerned across countries by responses to
Afrobarometer questions, is the perception of a national president’s observance
of the rule of law (Bratton 2013a, 2013b). But such subjective polls,
no matter how broad or how carefully representative, hardly tell us objectively
how a government performs. They merely tell us, and usefully, what
citizens think or what they perceive—depending always on how well and
how precisely the survey questions are posed.

Proxy and Other Outcome Measures
Fukuyama pays too little attention to the second half of his original definition.
If governance is indeed “performance” (the delivery of services, as
we posited at the outset and as he seems to agree), then arguably the most
important measures of that delivery must be both the quality and the
quantity of those services. I have offered from the start a definition of
governance that tries hard not to be prescriptive or normative. It proceeds
from a summing of the needs, desires, and expectations of inhabitants of
jurisdictions, usually citizens. What is it, I ask, that citizens expect or
demand of their governments? What is it since the 17th century that
citizens have asked of their monarchs and later their states and nations?
Mine is a bottom-up method of defining governance that emphasizes
If we agree that citizens (originally taxpayers) expect their governments
to perform in such a manner that citizens will be secure (free from being
invaded or free from civil war and intrastate tumult) and safe (free from
crime and personal endangerment); if we agree that citizens desire something
akin to the predictability and backing of a robust rule of law that
delivers sanctity of contract as well as a fair and nonviolent adjudication of
disputes between persons; if we agree that most inhabitants of most states
prefer not to be cheated by corrupt practice; if we agree that citizens prefer
to participate in rule setting and thus in governing themselves, or at least
prefer to have a voice in agenda setting; if we agree that individuals prefer
to prosper—to eat more and better food, to be housed adequately, to be paid
fairly for their labor, and to believe that they are free to use their own skills
to better themselves; and finally, if we agree that citizens generally look to
states to provide educational opportunities, better rather than deficient
health care, clean water, a minimally polluted environment, and so on, then
it makes perfect sense to compare better and poorer national ways in which
these needs are realized—thus better and poorer ways by which states
perform for their taxpayers and inhabitants, or a composite of political
goods delivery that may conveniently be labeled “governance.”
The way to make those necessary comparisons—to calculate the
manner in which the state uses its capacity and a lesser or greater sense of
bureaucratic autonomy to satisfy its citizens—is to measure such variables
as, say, participation or educational opportunity. To do so, it is
essential to examine results. There is no better way of estimating how
successfully a state has met its obligation to serve, to perform, than by
carefully calculating outcomes. It is possible, to be sure, to ask citizens if
they are “satisfied” with a government’s performance. That happens periodically
through elections, and also through consumer surveys that are
helpful but hardly definitive. More exacting, and more useful when one
tries to compare disparate polities or attempts to diagnose how a nation-state
could do more for its citizens—how it could perform better and more
completely—are examinations of actual quantifiable results.

Since it is difficult to measure governance performance across the five
categories of political goods directly, I use proxies (as explained previously
in the text). Fukuyama suggests that services are hard to estimate,
in part because some of the potential tests are unsatisfactory (Fukuyama
2013b, 9). He cites as bad choices using examination scores to measure
educational outcomes or the rate of case clearances to measure the
quality of justice or the rule of law. But there are better proxies for those
political goods. Across the 57 variables that compose my preferred
output-measuring data set, a number are very robust; a few are unhappily
qualitative and subjective, but in long use; and all are demonstrably
helpful in estimating the performance of governments. That is, they do a
very strong (not a perfect) job of measuring what we want to measure.
They offer the kinds of hard data that truly enable us to compare countries
to countries, provinces to provinces, municipalities to municipalities,
and so on.
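To make the idea of a proxy-based composite concrete, the following is a minimal sketch, not the scoring procedure of any published index. It assumes min-max scaling of each proxy to a 0-100 range and equal weighting within and across categories; both choices are assumptions of this illustration. The indicator names, category labels, jurisdictions "A", "B", and "C", and all numbers are invented placeholders, not real data.

```python
from statistics import mean

# Hypothetical proxies: name -> (category, higher_is_better).
INDICATORS = {
    "life_expectancy": ("human_development", True),
    "secondary_persistence_pct": ("human_development", True),
    "homicide_rate": ("safety", False),  # fewer homicides is better
}

# Invented placeholder values for three fictional jurisdictions.
RAW = {
    "A": {"life_expectancy": 60.0, "secondary_persistence_pct": 50.0, "homicide_rate": 20.0},
    "B": {"life_expectancy": 70.0, "secondary_persistence_pct": 80.0, "homicide_rate": 5.0},
    "C": {"life_expectancy": 65.0, "secondary_persistence_pct": 65.0, "homicide_rate": 10.0},
}


def normalize(values, higher_is_better):
    """Min-max scale one proxy to 0-100 so that 100 is always the best result."""
    lo, hi = min(values.values()), max(values.values())
    scaled = {}
    for place, value in values.items():
        share = 0.5 if hi == lo else (value - lo) / (hi - lo)
        scaled[place] = 100 * (share if higher_is_better else 1 - share)
    return scaled


def composite_scores(raw, indicators):
    """Average scaled proxies within each category, then average the categories."""
    per_place = {}
    for name, (category, higher_is_better) in indicators.items():
        column = {place: vals[name] for place, vals in raw.items()}
        for place, score in normalize(column, higher_is_better).items():
            per_place.setdefault(place, {}).setdefault(category, []).append(score)
    return {
        place: mean(mean(scores) for scores in categories.values())
        for place, categories in per_place.items()
    }


scores = composite_scores(RAW, INDICATORS)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because every input is a published outcome figure rather than an opinion, anyone holding the same data can recompute the same ranking. The scaling rule and the weights remain choices that a real index would have to justify.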
The other (largely subjective) method, as used by Freedom House,
Bertelsmann, the Legatum Institute’s Prosperity Index, the Fraser Institute
Economic Freedom Index, Save the Children’s Mothers’ Index, the International
Food Policy Research Institute’s Hunger Index, and even the Happy Planet
Index, produces helpful approximations of the realities of at least some
aspects of governance among and between nations. But their scores are
based on opinions, feelings, codings, anecdotal understandings, and the
like. For example, the widely respected World Economic Forum’s Competitiveness
Index depends on a survey of the views of a limited number
of business executives within a country. Those kinds of indexes have great
difficulty measuring anything more than what observers and opinion makers
believe is happening, as opposed to unchallengeable reality. Experts may say, for
example, that country A has better educated citizens than country B when
an examination of real results (persistence levels, numbers of students
who go on from secondary to tertiary education, and so on) might reveal
that country B is in fact producing better educational outcomes than
country A. In Africa, it is counterintuitive that poor Malawi should be
better governed than prosperous and bustling Kenya. But that is what the
Index of African Governance scores show, year after year, probably because
Malawians demonstrate positive outcomes despite poverty and Kenyans
fall short because of greater ethnic conflict, greater corruption, poorer
educational attainments, and so on.

Absent a results-based method of weighing the performance of governments,
drawing such distinctions between countries would be imprecise,
even potentially inaccurate. Using a vague notion of “quality,” a
proxy for “capacity,” or a proxy for “bureaucratic autonomy” would leave
differences between countries and across regions very hard to substantiate,
even impressionistically (Heller 2013 proposes employing a very complicated
array of “second-generation” data to do the job of measurement
when much simpler, more direct ways of assessing good governance are
easily available, as specified in this note).

As this note argues, precise measurements of governance as a whole
and of governance separated into its component parts permit researchers
and policymakers to separate good performers from bad performers.
Measuring governance by the outcome method painstakingly shows
whether regimes are delivering necessary and desirable governmentally
provided performance results to their citizens. This concept of governance
diagnostically also enables an existing government, a civil society, or
donors to appreciate which parts of an overall system are working well
and which poorly. Critical decisions may thus be made that, in the best of
circumstances, can improve good governance and therefore the living
conditions of those who reside in the developing world.
