Gabriel Borges, OLAC
Abstract
Population and housing censuses are designed to cover the entire population of a country, yet under-enumeration is a recurrent problem in most censuses. The problem is not random: it disproportionately affects specific groups, such as children. Somewhat surprisingly, the undercount of children has occurred in all sorts of population censuses, regardless of census design and of the cultural and socioeconomic characteristics of the country. Despite wide acknowledgement of this issue, little is known about its causes, and no effective alternatives have been proposed to address it. Furthermore, there is a lack of comprehensive studies that measure the magnitude of the problem for a large number of countries and across time. This study seeks to fill these gaps by estimating the undercount of children in all Latin American censuses since the 1950 round, discussing the possible reasons for the phenomenon, and proposing changes to training and questionnaire design for the next round of censuses in the region.
A subenumeração de crianças nos censos latino-americanos
Resumo
Censos populacionais são desenhados para cobrir a totalidade da população de um país. Contudo, subenumeração é um problema recorrente nos censos, e afeta mais alguns grupos populacionais específicos, como crianças. A subenumeração tem sido, de certa forma surpreendentemente, um problema que ocorre em diversos censos, independentemente do desenho do censo e de características culturais e socioeconômicas do país onde ele é realizado. Apesar do amplo conhecimento deste problema, pouco se sabe sobre suas causas e nenhuma medida efetiva tem sido adotada para solucioná-lo. Além disso, faltam estudos abrangentes que mensurem sua magnitude para um grande número de países e ao longo do tempo. Este trabalho se propõe a preencher estas lacunas ao estimar a subenumeração de crianças em todos os censos latino-americanos desde a rodada de 1950, discutindo as possíveis causas deste fenômeno e propondo mudanças no treinamento e no desenho do questionário para a próxima rodada de censos na região.
La subenumeración de niños en los censos latino-americanos
Resumen
Los censos de población son diseñados para cubrir a toda la población de un país. Sin embargo, la subenumeración es un problema recurrente en los censos, que afecta más a algunos grupos de población específicos, por ejemplo el de los niños. La subenumeración ha sido, de cierta forma sorprendentemente, un problema que ocurre en diversos censos, independientemente del diseño del censo y de características culturales y socioeconómicas del país donde se lo realiza. A pesar del amplio conocimiento de este problema, poco se sabe sobre sus causas y ninguna medida efectiva ha sido adoptada para solucionarlo. Además, faltan estudios exhaustivos que miden su magnitud para un gran número de países y un largo periodo de tiempo. Este trabajo se propone llenar estos vacíos al estimar la subenumeración de niños en todos los censos latinoamericanos desde la ronda de 1950, discutiendo las posibles causas de este fenómeno y proponiendo cambios en la capacitación y en el diseño del cuestionario para la próxima ronda de censos en la región.
Introduction
Coverage problems such as census undercount are not random and affect particular groups, for instance, children. More than 20% of the total population omitted in Latin American censuses since 1950 were children under age 5, a share disproportionately higher than this group's share of the population, which is around 12%.
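As a rough worked check, the two shares quoted above already imply how much higher the omission rate of young children must be. The notation is introduced here for illustration only: s is the share of children under 5 among all omitted people (taken as 0.20), p is their share of the population (taken as 0.12), and u_c and u_r are the omission rates of children under 5 and of the rest of the population.

```latex
\[
\frac{u_c}{u_r}
= \frac{s/p}{(1-s)/(1-p)}
= \frac{0.20/0.12}{0.80/0.88}
\approx 1.8
\]
```

Since the text gives s as "more than 20%" and p as "around 12%", this ratio should be read as an approximate lower bound rather than an exact figure.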
Census under-enumeration is an important issue, closely related to the main uses of censuses. Errors in census data can affect the planning and provision of public policies, the redistribution of funds, business decisions and the calculation of all sorts of indicators, since censuses are the natural source of information for these purposes. More specifically, the undercount of children can influence health and educational policies, such as immunization campaigns and the assessment of demand for child care and education, and can bias social indicators calculated from census data.
Somewhat surprisingly, this phenomenon has occurred in all sorts of population censuses, regardless of census design and of the cultural or socioeconomic characteristics of the country; it has been documented in several Asian countries, the United States, the United Kingdom and South Africa. Furthermore, it has not necessarily improved over time, despite improvements in overall census quality.
There are several factors that may be related to this differential omission, but there is no consensus in the literature about the main causes that lead to it, hampering actions to resolve the problem. The leading explanations can be divided into two broad categories: i) children living in households effectively counted in the census are omitted by the respondents; ii) children are disproportionately present in households more likely to be omitted.
Among the first set of explanations is the idea that children might not be viewed as persons who should be counted, so that respondents may not consider their existence (Chackiel 2009). The undercount of children can also be related to specific cultural characteristics of a country. In China, for instance, fertility policies provide a powerful incentive for both parents and officials to underreport out-of-quota births as well as children (Goodkind 2011). Respondents could also think that children not registered in the civil registration system should not be included in the census (Mortara 1941), which would be a more serious issue in countries with less developed vital registration systems.
Another potential explanation is that several household characteristics are likely to be associated with a larger number of children, especially because fertility differs according to these characteristics. If fertility is higher in regions where enumeration is more difficult, or in other “hard-to-count” population groups, children tend to be disproportionately omitted (West and Robinson 1999; O’Hare 2015). Evidence from earlier censuses in the United States shows that the undercount of infants is closely tied to the undercount at other ages, since a large part of the infants who were not reported in the census were the children of adults who were also not listed. This suggests the hypothesis that the under-enumeration of persons aged 15-19 and of children aged 0-4 might be caused by the omission of newly formed small households or families (Ewbank 1981).
Undercount of children in Latin America has been reported for many countries, but there seems to be no comprehensive work that has estimated it for a large number of countries and years, allowing an evaluation of how it differs from country to country and how it has changed over time. There have not been many proposals to address this issue either.
Method and Results
Table 1 shows the estimated census undercount of children aged 0-4 for all Latin American countries and census rounds since 1950, contrasted with the undercount of people aged 5 or over. The coverage measure is calculated by comparing census figures with the population projections published by the United Nations (2015).
Table 1 – Census undercount (%) by country, census round and age group according to indirect evaluation comparing census figures to UN population projections – 1950-2010
Source: United Nations (2015); Latin American Mortality Database (LAMBdA); IPUMS-International, National Population and Housing Censuses.
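The coverage measure described above can be sketched in a few lines. This is a minimal illustration, not the exact procedure behind Table 1: it assumes the undercount is expressed as a percentage of the reference (UN) population, and the function name and the figures are hypothetical.

```python
def undercount_pct(census_count: float, reference_population: float) -> float:
    """Percent of the reference population that is missing from the census.

    Assumption: the reference population (e.g. a UN estimate for the
    census date) is used as the denominator. A negative result means the
    census counted more people than the reference figure.
    """
    if reference_population <= 0:
        raise ValueError("reference population must be positive")
    return 100.0 * (reference_population - census_count) / reference_population


# Hypothetical figures, for illustration only (not taken from Table 1).
children_rate = undercount_pct(census_count=850_000, reference_population=1_000_000)
rest_rate = undercount_pct(census_count=7_600_000, reference_population=8_000_000)

print(f"children 0-4: {children_rate:.1f}%")
print(f"ages 5+:      {rest_rate:.1f}%")
```

Computing the rate separately for ages 0-4 and ages 5 or over, as here, is what allows the two groups to be contrasted within the same census.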
The summary measures for the region illustrate overall improvements in census coverage among both children and the overall population, even though the undercount of children has remained significantly higher than the overall omission of the population. The weighted average shows that the undercount of young children in Latin American censuses has been around twice as high as the undercount among the population aged 5 or over. The 1950 census round shows extremely high omission rates for both groups. The median undercount indicates that half of the censuses carried out in this round missed at least 19.9% of their children and 9.9% of the general population. In the most recent census rounds, almost 10% of young children are still not being counted, compared with an omission of 4.7% of the remaining age groups.
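The two regional summary measures mentioned above can be illustrated as follows. This is a sketch under stated assumptions: the text does not say which weights are used, so the example assumes each census is weighted by its reference population, and all names and numbers are hypothetical.

```python
from statistics import median


def weighted_average(rates, weights):
    """Average of undercount rates, weighting each census by `weights`.

    Assumption: weights are the reference populations of each country,
    so larger countries contribute more to the regional figure.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(r * w for r, w in zip(rates, weights)) / total


# Hypothetical child undercount rates (%) for three censuses in one round.
rates = [20.0, 10.0, 5.0]
# Hypothetical reference populations of children aged 0-4.
populations = [1_000_000, 3_000_000, 1_000_000]

print("weighted average:", weighted_average(rates, populations))
print("median:", median(rates))
```

The median treats every census equally, while the weighted average reflects the experience of the average child in the region; reporting both shows whether a few large countries drive the regional figure.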
Chile, Costa Rica, Cuba and Nicaragua are examples of countries with similar omission rates between these two age groups, although at different undercount rate levels. In Costa Rica, except for the 1950 census round, children undercount has been very similar to the overall undercount.
Conversely, in countries such as Haiti, Mexico, Uruguay, Peru and the Dominican Republic, children tend to present much higher undercount rates than the other age groups combined. The average undercount rate of children in Mexico during the period under analysis (14.9%) is almost three times the average undercount for the population aged 5 or over (5.2%). In Uruguay, a country with consistently low census undercount rates (less than 3%), the average undercount of children is 5.7%.
Discussion and Recommendations
Given the relevance of census data for several purposes, especially for the planning and provision of public policies, and the evidence of a persistently high undercount of children, who represented more than 20% of the total census omission, addressing this issue should be one of the priorities for the 2020 round of censuses in Latin America.
This will only be possible if the problem is recognized and its causes are better understood. This post has provided enough evidence that this is an important issue, but the existing data do not allow a conclusive assessment of its causes. Some results presented here are nonetheless striking and can shed some light on this question. The fact that the three countries with the smallest differences between the undercount of children and that of the overall population (Chile, Costa Rica and Cuba) are also countries with longstanding, reliable administrative systems supports the theory that reporting to the census could be associated with the culture of registering a child. These countries are also known as paradigms of social consciousness stemming from the interpretation of life and the value ascribed to the health and nutritional status of mothers and children (Horwitz 1987), which could also be related to the value of childhood in these societies, leading to a greater likelihood of reporting children to the census. Institutional factors may also play an important role. In Cuba, for example, the census and the National Statistical Office are part of a strong network of public institutions, which can make enumeration a highly valued activity.
A more country-specific analysis should be carried out in order to further explore the most likely causes for children omission in each situation, given cultural, demographic and socioeconomic characteristics, in addition to census-taking features.
Addressing the problem of children undercount when this occurs along with the omission of the entire household might require attention to the same recommendations used to achieve an overall satisfactory enumeration, as extensively discussed in the literature on this field, such as in the well-known manuals for census-taking (UNSD 2008).
In cases where the undercount of children typically occurs because respondents omit them in properly counted households, additional control measures can be applied. Two recommendations are proposed here as an attempt to minimize this problem in the next census round.
The first is to include warnings highlighting this issue in enumerator and supervisor training and manuals. It should be emphasized that the undercount of children is a common feature of censuses, so that enumerators and supervisors are aware that people might not report all children living in the household.
The second, and perhaps the most important recommendation, is related to a minor change in the questionnaire design. Censuses normally ask questions like: “How many people were living in this household on mm/dd/yyyy?” or “list all the residents living in this household on mm/dd/yyyy?”. Aware of the differential undercount of specific groups, questionnaires normally have some reminders such as: “please remember to include all children, even newborn babies”. In fact, only three Latin American countries (Costa Rica, Panama and Venezuela) did not include any reminder note about children in their 2010 census round questionnaire**.
However, these reminders in the questionnaire, despite being important, do not seem as effective as we would expect. One of the possible explanations is that these questions and reminders are not mandatory and enumerators are not required (though recommended) to read them.
One way to address this problem is to have the enumerator ask explicitly for the number of children living in the household. This survey technique has been applied, for instance, when asking for children ever born by sex, separating those living in the household from those living outside it. It is also the strategy used to improve income reporting, where respondents are asked about income from different sources separately. Censuses in Latin America are mostly based on face-to-face interviews, so the answers given by respondents are closely related to what the enumerator actually asks.
Thus, there would be the following questions:
i) “How many people aged 5 or over were living in this household on mm/dd/yyyy?” _ _
ii) “How many children under age 5 were living in this household on mm/dd/yyyy? Remember to include newborn babies” _ _
iii) Probing question: “So, there were X people living in this household on mm/dd/yyyy” Yes / No (return to the list of residents to correct information).
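The flow of the three questions above could be sketched as follows for an electronic questionnaire. This is only an illustration: it assumes that X in the probing question is the sum of the two previous answers, and the function names are hypothetical rather than part of any census software.

```python
def household_total(aged_5_or_over: int, under_5: int) -> int:
    """Total residents implied by questions i) and ii).

    Assumption: X in the probing question is the sum of both answers.
    """
    if aged_5_or_over < 0 or under_5 < 0:
        raise ValueError("counts cannot be negative")
    return aged_5_or_over + under_5


def probing_question(total: int, reference_date: str = "mm/dd/yyyy") -> str:
    """Text of question iii), to be read aloud by the enumerator."""
    return (f"So, there were {total} people living in this household "
            f"on {reference_date}")


def next_step(respondent_confirms: bool) -> str:
    """A 'No' sends the enumerator back to the list of residents."""
    return "continue" if respondent_confirms else "return to list of residents"


total = household_total(aged_5_or_over=3, under_5=2)
print(probing_question(total))
print(next_step(respondent_confirms=False))
```

Asking for the two counts separately and then confirming the total gives the respondent two distinct opportunities to remember young children before the list of residents is closed.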
Finally, it is worth mentioning that these questions should, of course, be adapted to the specific context in which they will be used. Furthermore, as always recommended when the questionnaire is changed, field tests should be performed before the final implementation.
* Based on the paper submitted to the 28th International Population Conference of the International Union for the Scientific Study of Population (IUSSP) in Cape Town, South Africa
** A reminder about old people is also common in census questionnaires, based on the supposition that this group is disproportionately undercounted as well. However, results from direct and indirect evaluations have often contradicted this premise. On the contrary, old people often appear to be over-counted due to age misstatement.
Hi Gabriel, I’d like to discuss a couple of methodological points from your main argument. As you mention in your extended paper, there are several well-known procedures to tackle fertility and age reporting issues, and undercounts. I guess you are aware of the complexity of the demographers’ task (and Gerland’s paper you cite is a good illustration on this matter): to come up with the «true» (?) number, or rather, with our best guess. In Latin America there are a bunch of data (surveys, vital registration) «competing» with census data for the best guess on population and fertility estimates, and then, there are different methods and demographic techniques to fill in the blanks (reverse survival, own children, relational Gompertz model, etc). So, my question is: What makes you think that the UN «projections» (rather estimates) are the «true» numbers, the legitimate reference, to assess census results? How does the UN come up with those «magic» numbers? I’m afraid you took an easy shortcut here.
I also wonder if the census data you used are not already adjusted. I’m saying this because published census results are sometimes already adjusted, in some ways, for under enumeration.
Another question I have refers to the interpolations/extrapolations to the census dates. Have you used the 5-year or the annual UN estimates? Interpolation adds another layer of potential inaccuracy to the estimates.
A special case to consider is the latest UN projection/estimate. What do you think the ones you used for 2010 are based on?
Just a note to mention two updated UN publications: The latest revision of World Population Prospects (2017): https://esa.un.org/unpd/wpp/ and the new revision of the Principles and Recommendations (2017) from UNSD: You can google ST/ESA/STAT/SER.M/67/Rev.3
Kind regards
Hi Lina, thank you for the very interesting comments. Yes, I’m aware of the complexity and limitations of the methods used to produce population estimates. I still think these are the best data to assess census coverage in a comprehensive way, as I’m proposing, but this definitely needs to be better discussed in the paper. The first reason is that, even though post-enumeration surveys (PES) are important for several purposes (and I report some of their results in the paper), not all countries conduct a PES and publish the results, and their performance, methodology and technical rigor vary significantly over time and across countries.
I could also estimate the population myself by using vital registration, census and survey data together with demographic techniques, and then compare the estimates with the census results. But, as you mention, this is precisely what the UN (and CELADE, which provides the main inputs to the UN estimates) do. CELADE also works with the NSOs of different countries, and I think that this iterative process of producing population estimates by combining different data and methods with expert knowledge about the reliability of each data source and the demographic characteristics of the countries is the best we could do for Latin America so far.
I have used the annual UN estimates, which I know are an interpolation of the 5-year estimates, and I agree that this might add some inaccuracy. Since CELADE calculates population estimates by single year of age and calendar year, this would perhaps be a more precise estimate for this purpose. And I will try to update the results with the latest revisions.
As far as I know, the census results I’m using are not adjusted for under-enumeration. Some censuses include imputation of unknown age and sex. More recently, in cases like Brazil, Mexico and Uruguay, there was also “count imputation”, which is done when a household is known to be occupied but for some reason the interview could not be conducted. But this doesn’t seem to have an important effect on the results.
Regards,
Gabriel