How do I cite the Indices of Social Development?

The Indices of Social Development are compiled from various sources.

If you quote the Indices of Social Development, please cite them as follows: Source: Indices of Social Development, URL:

Our resources page and this Excel sheet list the indicators together with the data provider for each indicator.

If you use the underlying indicators rather than the indices, you should also cite the data provider as the source.

To cite these, please reference them as follows, for example: Adult Female Literacy Rate: Source: World Development Indicators, accessed at

How reliable are the scores?

The indices currently include data up to 2020. More data, for 2025 and beyond, will become available every five years. When more data become available for the latest year, we will revise the indices by adding the new data. Each time we add new data, the standard errors of the indices will decrease, and the indices will hence become more reliable.

Why are there no standard errors (s.e.) for the Clubs and Associations measure?

For some countries, the only available source for the Clubs and Associations measure is the World Values Survey. For the other indices, no estimate is produced when there is only one source, but for this measure the range of data is so limited that these countries are included anyway.

Why do you present data for five year-averages?

Not all data (e.g. household survey data) are available for every single year. Taking five-year averages therefore reduces sampling errors and the influence of outliers caused by extreme scores at a single point in time.
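
As a rough illustration of averaging over a period when some years are missing, the sketch below takes the mean of whatever observations fall inside a five-year window. The function name period_average, the example values, and the choice of a simple unweighted mean are assumptions made for illustration; the exact averaging rule used for the indices is not described here.

```python
from statistics import mean
from typing import Dict, Optional


def period_average(values_by_year: Dict[int, float],
                   start: int, end: int) -> Optional[float]:
    """Mean of the observations available between start and end (inclusive).

    Hypothetical helper: returns None when no year in the window has data.
    """
    observed = [v for year, v in values_by_year.items() if start <= year <= end]
    return mean(observed) if observed else None


# Made-up survey indicator observed in only two of the five years.
survey = {2016: 0.25, 2019: 0.75}
print(period_average(survey, 2016, 2020))
```

A single extreme year has less influence on the period score than it would on a single-year score, and countries with gaps in their series can still receive a value.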

Why use so many variables rather than selecting a few ‘key’ indicators?

First, there are few data sources that cover a fully representative range of the world’s countries, and thus without combining indicators, it would be impossible to gain scores for more than a small sub-sample of nations.

Second, we assume that every indicator has some amount of error. Error can have several causes. Observational error may exist because of unreliability in the instrument used to record a particular phenomenon: surveys, for example, may be subject to reporting biases or sampling error, while official statistics may have been compiled using different methodologies.

There is also error that is attributable to the use of indicators with low concept validity, that is, when the selected indicator, however reliably gathered, only imperfectly corresponds to the latent variable under consideration. One way to reduce error is to employ greater scrutiny in the selection and consideration of indicators. Yet this presumes a high degree of knowledge on the part of the analyst: it can be difficult to assess the reliability of any given measure in isolation, especially in the absence of familiarity with the method used to generate those values. Validity is easier to determine, though here again we often have to rely on complex assumptions regarding the causal relationship between what we are measuring and what we seek to measure.

For example, it may be open to contention whether civic capacity is best measured by features of the institutional environment (the number of media organizations, freedom of information), features of citizen behavior (engagement in local civic groups, participation in voting, petitions and demonstrations), or some other feature of that society (e.g. the number of international NGOs).

Moreover, with social phenomena we often face a trade-off between reliability, validity, and representativeness: a given indicator, such as the income ratio between different ethnic groups, may be a valid and reliable measure of social exclusion, but available for very few countries; a survey item on attitudes toward other ethnic groups is certainly valid and may be widely available, but subject to survey response bias. There is, in short, rarely a single indicator that adequately measures the concept we are trying to quantify.

Combining multiple indicators, on the other hand, is another means to reduce aggregate error. If one assumes that errors are uncorrelated between data sources and that the size of the error is constant across items, then the combination of multiple sources will progressively reduce error as the number of indicators increases. We supplement these estimates with the calculated margin of error for each country, which is based on how many sources there were and the extent to which these sources agreed.
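
The effect of adding sources can be sketched with the standard result for the mean of independent measurements: under the two assumptions stated above (uncorrelated errors of equal size), the standard error of the average of n indicators is the single-indicator error divided by the square root of n. The function name and the value of sigma below are illustrative assumptions, and this is not the margin-of-error calculation used for the indices, which also takes into account how far the sources agree.

```python
import math


def standard_error_of_mean(sigma: float, n_sources: int) -> float:
    """Standard error of the average of n_sources indicators.

    Assumes every indicator has the same error standard deviation (sigma)
    and that errors are uncorrelated across indicators.
    """
    if n_sources < 1:
        raise ValueError("at least one source is required")
    return sigma / math.sqrt(n_sources)


# Illustrative only: error shrinks as more sources are combined.
for n in (1, 4, 16):
    print(n, standard_error_of_mean(1.0, n))
```

Under these assumptions, quadrupling the number of sources halves the error, so the gains from each additional source become smaller as sources accumulate.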

How reliable are perceptions-based indicators?

The institutions which drive social development are by nature difficult to detect, given that they rest upon tacit norms, beliefs, and practices which lack explicit formalization. Previous quantitative studies of social institutions have therefore largely relied upon using either proxies based upon causes or consequences (such as using daily newspaper circulation as a proxy for the extent to which citizens take an active interest in local politics, or linguistic fractionalization as a proxy for cohesion or otherwise between social groups), or survey responses to questions regarding social attitudes.

Not all survey data are perceptions-based, however; survey items can often be behavioral, as when respondents are asked whether they have been the victim of crime, whether they have signed a petition, or whether they have contacted a local representative. Both proxy variables and survey items are used in these indices, and the two correlate to an exceptionally large extent.

For example, a country’s reported level of social trust is strongly predicted by its homicide rate, while the correlation between the proportion of managers who say men have greater priority than women for a job and the ratio of male to female labor force participation is likewise high. To some extent, this reflects the fact that perceptions and attitudes are not simply the result of social institutions, but are, to a substantial degree, the institution itself.

What are the advantages of using matching percentiles?

The matching percentiles method brings with it several advantages for creating a set of indices of this nature.

First, the matching percentiles method overcomes the problem of sampling bias. This problem is pervasive when a new data source covers only a limited and unrepresentative sample of countries, as country scores on the new indicator will reflect not only a difference in scaling (β) but also a difference in the constant (α).

A further advantage of the matching percentiles technique is that it allows us to keep adding successive waves of indicators, even with very small samples, that can be used to continually ‘refine’ the country scores simply by using information on relative rankings. Whereas regression-based techniques of aggregation struggle to incorporate small-sample sources, because α and β are difficult to estimate when the sample size is very low, no such difficulties affect the matching percentiles technique. This is critically important for a set of indices of this nature, where the present data remain incomplete, such that it will be necessary to keep adding new indicators in future years as successive data sources become available, even where such sources cover relatively few countries.

What does ‘Matching Percentiles’ mean?

The matching percentiles method, used by the Indices of Social Development, was first deployed by Lambsdorff et al. (1999) to construct the Corruption Perceptions Index. In the matching percentiles process, values are matched across indicators based on country rankings. The ranks of successive indicators included in the index are used to assign equivalent values to countries based on their position on each additional measure. Variables are iteratively added to produce the index.
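
A single matching step can be sketched as follows. This is a minimal illustration of the rank-matching idea described above, not the actual implementation behind the indices: the function name match_percentiles, the country codes, and the values are hypothetical, ties are not handled, and the way matched values are subsequently combined into the final index is not shown.

```python
from typing import Dict


def match_percentiles(master: Dict[str, float],
                      new_indicator: Dict[str, float]) -> Dict[str, float]:
    """Re-express a new indicator on the master scale using ranks only.

    Only countries present in both sources are matched. Ties are ignored
    in this sketch (hypothetical simplification).
    """
    shared = [c for c in new_indicator if c in master]
    # Countries ordered by their position on the new indicator, lowest first.
    by_new_rank = sorted(shared, key=lambda c: new_indicator[c])
    # Master values of the same countries, sorted lowest first.
    master_values = sorted(master[c] for c in shared)
    # The country ranked k-th on the new indicator receives the k-th master value.
    return dict(zip(by_new_rank, master_values))


# Made-up data: the new source uses a different scale and covers three countries.
master = {"A": 0.2, "B": 0.5, "C": 0.9, "D": 0.7}
new_source = {"A": 30.0, "C": 10.0, "D": 20.0}
print(match_percentiles(master, new_source))
```

Because only the ordering of countries on the new source is used, its original scale and constant play no role, which is why small or unrepresentative samples can still be incorporated.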