Recent and Proposed Changes in How the Census Bureau Collects and Processes Race and Ethnicity Data

In 2020, the Census Bureau made some important changes to how they collect and process race and ethnicity data. These changes were intended to better capture the increasingly diverse racial and ethnic identities of the U.S. population, while still following the 1997 Federal Office of Management and Budget (OMB) Standards for the Classification of Federal Data on Race and Ethnicity. However, these changes also raise some questions about the validity of the Census Bureau’s “Two or More Races” classification in 2020, particularly for Hispanic and Latino individuals. They also complicate comparisons with race data from before 2020, making it difficult to assess change over time.

The 1997 OMB standards have governed how federal agencies collect and report on people’s racial and ethnic identities for over 25 years. However, these standards may be about to change substantially. A federal interagency working group is currently developing a proposal for revisions to the standards, a preliminary version of which was published in a Federal Register notice in January 2023. The OMB plans to finalize any revisions to the standards no later than summer 2024.

In this blog post, we discuss the consequences of the Census Bureau’s recent changes to how they collect and process race and ethnicity data, as well as the proposed revisions to the OMB standards for the classification of federal data on race and ethnicity.

Throughout, we use the term racial “classification” rather than “identity” to refer to the categories used by the Census Bureau to describe people’s racial identities. We do this purposefully, since the way that the Census Bureau classifies people’s race does not always match how those people would self-identify.

 

Key Takeaways

  • In 2020, the Census Bureau changed how they collect and process data on race and ethnicity in a way that resulted in a large increase in the number of people classified as “Two or More Races.”

  • The increase in the likelihood of being classified as “Two or More Races” was especially large for people of Hispanic or Latino ethnicity. This seems to be due in large part to many Hispanic and Latino people selecting “White” and writing in a Hispanic or Latino origin on the race question, a response that the Census Bureau then codes as both “White” and “Some Other Race.” This pattern of responding reflects a misalignment between how Hispanic and Latino people view their own identities and the race categories used by the federal government. It is likely that many of these individuals do not self-identify as multiracial.

  • Comparisons with race data collected before 2020 must be conducted with caution since we do not know how much of the change is due to differences in how the data were collected and processed. In contrast, comparisons with ethnicity data (that is, whether or not someone identifies as Hispanic or Latino) collected before 2020 can be conducted with relative confidence.

  • Some of the challenges with interpreting race data from the 2020 Census could be addressed for future Censuses by the proposed changes to the OMB’s standards for data on race and ethnicity. These proposed changes include using a combined race/ethnicity question with “Hispanic or Latino” as a category, and adding a category for people who identify as Middle Eastern or North African (MENA). These changes would likely greatly reduce the number of individuals who select “Some Other Race” because none of the offered race categories match their identity, or who are coded as “Two or More Races” by the Census Bureau because they write in an origin that does not match the race checkbox they selected.

 

Changes Made to the Census Bureau’s Race and Ethnicity Data Collection and Processing in 2020

The Census Bureau made subtle but important changes to how they collect data on race and ethnicity in both the 2020 Decennial Census and the American Community Survey compared to prior years. The full ethnicity (Hispanic or Latino origin) and race questions as they appeared on the 2020 Decennial Census are provided to the right.

In 2020, for the first time, respondents who selected the “White” or “Black or African American” checkboxes on the race question were instructed to print their origin (e.g., German, Jamaican, etc.) in a dedicated write-in area below the checkbox. Previously, respondents who selected the “White” or “Black or African American” checkboxes were not asked to provide any details about their origin, although detailed origin responses were encouraged for people who selected the “American Indian or Alaska Native,” “Asian,” or “Some Other Race” checkboxes.

In addition to asking people to provide more detail about their origins, the Census Bureau also changed how they processed people’s responses to capture more details about multiple origins. Specifically:

  • They increased the number of characters recorded for each write-in response field from 30 to 200 characters. This allowed them to capture a greater number of origins for people who wrote in multiple origins within a single response field.

  • They assigned up to six race group codes for each write-in response field, whereas in 2010 they only assigned up to two codes for each write-in response field.

  • They coded the first six responses in each write-in response field from left to right without prioritizing certain kinds of responses over others. In comparison, when processing data from the 2010 race question, the Census Bureau prioritized coding write-in origins that fit under one of OMB’s “race” categories over any origins associated with Hispanic or Latino ethnicity.
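
The three processing changes above can be illustrated with a minimal sketch. It is not the Census Bureau's actual coding software: the lookup table, the delimiter rules, and the function names are all hypothetical, and the 2010 prioritization is approximated by simply moving origins that code to “Some Other Race” to the back of the list.

```python
import re

# Hypothetical, tiny lookup table; the Census Bureau's real code list is far larger.
ORIGIN_TO_CATEGORY = {
    "german": "White",
    "irish": "White",
    "jamaican": "Black or African American",
    "cuban": "Some Other Race",    # Hispanic or Latino origins code to "Some Other Race"
    "mexican": "Some Other Race",
}


def split_origins(text):
    """Split a write-in field into origins (assumed delimiters: comma, slash, semicolon, 'and')."""
    parts = re.split(r"[,/;]|\band\b", text.lower())
    return [part.strip() for part in parts if part.strip()]


def code_write_in(text, max_chars, max_codes, prioritize_omb_race):
    """Return up to max_codes (origin, category) pairs from one write-in field."""
    origins = split_origins(text[:max_chars])  # characters past the limit are not recorded
    coded = [(o, ORIGIN_TO_CATEGORY[o]) for o in origins if o in ORIGIN_TO_CATEGORY]
    if prioritize_omb_race:
        # Stable sort: origins in an OMB race category come first, in written order.
        coded.sort(key=lambda pair: pair[1] == "Some Other Race")
    return coded[:max_codes]


def code_2010(text):
    # Assumed 2010-style settings: 30 characters, two codes, OMB race origins prioritized.
    return code_write_in(text, max_chars=30, max_codes=2, prioritize_omb_race=True)


def code_2020(text):
    # Assumed 2020-style settings: 200 characters, six codes, strict left-to-right order.
    return code_write_in(text, max_chars=200, max_codes=6, prioritize_omb_race=False)


response = "Cuban, German, Irish, Jamaican, Mexican"
print(code_2010(response))
print(code_2020(response))
```

Under these assumptions, the 2010-style settings keep only “German” and “Irish” (both coded as “White”), while the 2020-style settings keep all five origins in the order written, including the two that code to “Some Other Race.”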

Importantly, when a Hispanic or Latino origin was reported in a write-in response field on the race question, the Census Bureau coded this response as “Some Other Race” because Hispanic or Latino is not considered a race category in the 1997 OMB standards. For example, if a respondent selected the “White” checkbox and wrote “Cuban” in the response field under this box, the Census Bureau would code this individual as both “White” and “Some Other Race.” This individual would then be counted in the “Two or More Races” category, even though they did not select two or more race checkboxes on the Census form.
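
The “Cuban” example can be expressed as a short sketch. The function name and the inputs are hypothetical; the only rule taken from the text is that every category coded for a person, whether from a checkbox or from a write-in, counts toward their classification.

```python
def classify_race(checkboxes, write_in_categories):
    """Combine checkbox selections with categories coded from write-ins.

    Illustrative only: the real procedure handles many more cases
    (missing data, imputation, detailed groups) than this sketch does.
    """
    categories = set(checkboxes) | set(write_in_categories)
    if not categories:
        return "No race reported"
    if len(categories) == 1:
        return f"{next(iter(categories))} alone"
    return "Two or More Races"


# "White" checkbox plus a "Cuban" write-in, which is coded as "Some Other Race".
print(classify_race({"White"}, ["Some Other Race"]))  # Two or More Races

# "White" checkbox plus a "German" write-in, which is coded as "White".
print(classify_race({"White"}, ["White"]))            # White alone
```

The second call shows why a write-in that matches the selected checkbox leaves the classification unchanged, while a mismatched write-in adds a second category.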

This rule for coding Hispanic or Latino write-in responses to the race question as “Some Other Race” was also in use for the 2000 and 2010 Decennial Censuses. What changed in 2020 is how often the rule applied: the new write-in fields under the “White” and “Black or African American” checkboxes, combined with processing changes that increased the number of write-in responses the Census Bureau codes, would have resulted in more Hispanic or Latino individuals being classified as “Two or More Races” because they wrote in an origin that, under the Census Bureau’s racial classification system, did not match the race checkbox they selected.

 

What Does This Mean for How We Interpret the Census Bureau’s Hispanic or Latino Ethnicity Data for Connecticut?

The data collection and processing changes implemented in 2020 should not have had a meaningful impact on the number of people who are classified as being of Hispanic or Latino ethnicity. This is because the Census Bureau does not generally recode people’s responses to the Hispanic or Latino ethnicity question based on what they report as their origin on the race question. Additionally, since in 2020 the Census Bureau could only assign one detailed Hispanic or Latino origin to each individual (e.g., “Mexican”), the data processing changes increasing the number of ethnicity write-in responses that are coded for internal research purposes would not result in a greater number of detailed Hispanic or Latino origins being publicly reported for each individual (see the drop-down section below for details on both points).

  • According to personal communications from the Census Bureau’s Racial Statistics Branch, if a person provides a valid response to the ethnicity question, their response is not recoded depending on their response to the race question. For example, if a person indicates on the ethnicity question that they are not Hispanic or Latino and they write in a Hispanic or Latino origin on the race question, they are still classified as not Hispanic or Latino based on their self-reported ethnic identity. The exception is that, in cases where the person did not provide a response to the ethnicity question, the Census Bureau codes an ethnicity based on their response to the race question. A Census Bureau research report found that, in 2010, less than 0.8% of those classified as Hispanic or Latino were assigned that classification based on their response to the race question. Therefore, write-in responses to the race question are not likely to be a meaningful contributor to the number classified as Hispanic or Latino in 2020.

  • Respondents to the Decennial Census are not told that they can select more than one Hispanic or Latino origin, although they are not told that they can’t do this, either. Although the Census Bureau codes up to six different Hispanic or Latino origin responses for internal research purposes, the 1997 OMB standards only permit one Hispanic or Latino origin to be assigned to each person for public reports.

    According to personal communications from the Census Bureau’s Racial Statistics Branch, if a respondent reports more than one Hispanic or Latino origin on the ethnicity question, the Census Bureau assigns them a single Hispanic or Latino origin for public reporting as follows:

    (1) They first use the write-in response to the ethnicity question, if any was provided, regardless of any checkboxes that may have been selected. If more than one write-in response was provided, they use the one listed first.

    (2) If multiple checkboxes were selected but no write-in response was provided, they randomly assign one of the selected origins.
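
    The rules described in this section can be sketched as follows. The function names and inputs are hypothetical, and two details are assumptions not stated above: a single selected checkbox is returned as-is, and a person who left the ethnicity question blank is treated as not Hispanic or Latino unless their race write-in is a Hispanic or Latino origin.

```python
import random


def classify_ethnicity(ethnicity_answer, race_write_in_is_hispanic):
    """ethnicity_answer is True or False if the question was answered, None if blank."""
    if ethnicity_answer is not None:
        return ethnicity_answer           # a valid answer is never recoded
    return race_write_in_is_hispanic      # fallback only when the question was skipped


def assign_reported_origin(write_ins, checkboxes, rng=random):
    """Pick the single detailed Hispanic or Latino origin used in public reports."""
    if write_ins:
        return write_ins[0]               # rule (1): first write-in wins over checkboxes
    if checkboxes:
        # Rule (2): random choice among selected boxes (sorted for reproducibility).
        return rng.choice(sorted(checkboxes))
    return None


# Answered "not Hispanic or Latino" but wrote "Cuban" on the race question.
print(classify_ethnicity(False, True))

# Write-ins take priority, and the first one listed is used.
print(assign_reported_origin(["Dominican", "Cuban"], {"Mexican"}))

# No write-in: one of the selected checkboxes is drawn at random.
print(assign_reported_origin([], {"Cuban", "Mexican"}, rng=random.Random(0)))
```

    The first call returns False, matching the example in the text where a self-reported ethnicity is kept even when the race write-in suggests otherwise.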

In Connecticut, the proportion of the population identifying as Hispanic or Latino increased from 13.4% in 2010 to 17.3% in 2020. As explained above, this percentage is probably not meaningfully impacted by the Census Bureau’s changes to how they collect and process race and ethnicity data, so we can conclude that the proportion of Connecticut’s residents who identify as Hispanic or Latino increased between 2010 and 2020.

 

What Does This Mean for How We Interpret the Census Bureau’s Race Data for Connecticut?

The changes to the Census Bureau’s race data collection and processing procedures in 2020 increased both the number of origins people were encouraged to report and the proportion of these origins the Census Bureau captured and coded into race categories. As a consequence, the number of people classified as “Two or More Races” increased dramatically across the nation, from 2.9% of the total U.S. population in 2010 to 10.2% of the U.S. population in 2020.

In Connecticut, the percentage of the population classified as “Two or More Races” showed a similar increase from 2.6% in 2010 to 9.2% in 2020. Due to the changes in how the Census Bureau collected and processed race data, we cannot know what proportion of this increase reflects a true demographic change.

The percentage of Connecticut’s residents classified as “White alone” decreased substantially from 77.6% in 2010 to 66.4% in 2020. In contrast, the percentage of Connecticut’s residents classified as “Black or African American alone,” “Some Other Race alone,” “Asian alone,” or “American Indian or Alaska Native alone” showed moderate increases from 2010 to 2020. Since these increases in single-race categories cannot be easily attributed to the Census Bureau’s changes to how they collected and processed race data, they are likely to indicate a true increase in the proportion of Connecticut’s residents who identify with each of these racial groups alone (although the increase in these single-race groups may have appeared greater had the Census Bureau not changed their data collection and processing procedures to facilitate the reporting and coding of multiple races per person).

Importantly, the pattern of changes in racial classification from the 2010 to 2020 Decennial Census differed by Hispanic or Latino ethnicity. While both Hispanic and non-Hispanic individuals showed an increase in the likelihood of being classified as “Two or More Races,” this increase was much larger for people of Hispanic or Latino ethnicity.

The percentage of Connecticut’s Hispanic and Latino residents who were classified as “White alone” decreased from 47.2% in 2010 to just 18.6% in 2020, while the percentage classified as “Two or More Races” increased from just 6.9% in 2010 to 31.3% in 2020.

This pattern was also observed nationally and is likely due in large part to the Census Bureau coding an additional “Some Other Race” category for many Hispanic and Latino individuals who both checked the “White” checkbox and wrote in a Hispanic or Latino origin for the race question. Indeed, in 2020, 94% of Connecticut’s Hispanic or Latino residents who were classified as “Two or More Races” had “Some Other Race” as one of their race categories (up from 71% in 2010), and 89% had “White” as one of their race categories (up from 75% in 2010). Only 15% had “Black or African American” as one of their race categories, and 7% had “American Indian or Alaska Native.”

We would therefore caution against concluding from these statistics that the proportion of Connecticut’s Hispanic or Latino residents who self-identify as multiracial has dramatically increased since 2010. It is more likely that many Hispanic and Latino respondents were classified as “Two or More Races” because their write-in response did not match the race checkbox that they selected (according to the Census Bureau’s racial classification system), not because they see themselves as multiracial. Unfortunately, the Census Bureau has not yet published research or data that would allow us to determine how many individuals actually selected multiple race checkboxes on the 2020 Census form and how many were classified as “Two or More Races” because they wrote in an origin that did not match the selected race category according to the Census Bureau’s racial classification system.

It is also worth noting that “Some Other Race alone” was by far the largest single-race classification for Connecticut residents of Hispanic or Latino ethnicity in 2020, and that the proportion identifying as “Some Other Race alone” had increased by 4.7 percentage points since 2010. These “Some Other Race alone” classifications are likely to accurately reflect how people self-identified on the Census form rather than reflecting a Census Bureau recoding of their responses, since the Census Bureau would only have coded an individual as “Some Other Race alone” if they wrote in a response that did not match any OMB race category and they did not select any of the OMB race category checkboxes. This suggests that a large percentage of Connecticut’s Hispanic or Latino residents do not self-identify with one of the OMB’s race categories and that this percentage increased over the decade.

The percentage of Connecticut’s non-Hispanic residents who were classified as “Two or More Races” more than doubled from 1.9% in 2010 to 4.6% in 2020. While this increase is not as dramatic as that for Hispanic and Latino residents, who were more than four times more likely to be classified as “Two or More Races” in 2020 versus 2010, it still is a substantial increase. This trend was also observed nationally. This may reflect both an increase in the number of additional race categories coded by the Census Bureau based on write-in responses and an increase in the number of respondents who are selecting multiple race checkboxes on the Census form. Based on the information available now, we cannot know what proportion of the increase reflects a true change in the percentage of residents who self-identify as multiracial.
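
The “more than doubled” and “more than four times” comparisons can be checked directly from the percentages quoted in this post. The helper name below is hypothetical; the sketch only separates the percentage-point change from the ratio of the two shares.

```python
def change_summary(share_2010, share_2020):
    """Return (percentage-point change, ratio of the 2020 share to the 2010 share)."""
    return share_2020 - share_2010, share_2020 / share_2010


# Percent of Connecticut residents classified as "Two or More Races" (figures from this post).
two_or_more = {
    "Hispanic or Latino": (6.9, 31.3),
    "Not Hispanic or Latino": (1.9, 4.6),
}

for group, (share_2010, share_2020) in two_or_more.items():
    points, ratio = change_summary(share_2010, share_2020)
    print(f"{group}: +{points:.1f} percentage points, {ratio:.1f}x the 2010 share")
```

This gives an increase of 24.4 percentage points (about 4.5 times the 2010 share) for Hispanic or Latino residents and 2.7 percentage points (about 2.4 times) for non-Hispanic residents.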

Of Connecticut’s non-Hispanic residents who were classified as “Two or More Races” in 2020, 90% had “White” as one of their race categories, 36% had “Black or African American,” 34% had “Some Other Race,” 23% had “American Indian or Alaska Native,” and 21% had “Asian” as one of their race categories.

The percentage of Connecticut’s non-Hispanic residents classified as “White alone” decreased from 82.3% in 2010 to 76.4% in 2020. Again, due to the changes in the Census Bureau’s data collection and processing procedures, it is difficult to know how much of this change is due to true population demographic changes. However, we can calculate that 46% of the decrease in the proportion classified as “White alone” is accounted for by the increase in the proportion classified as “Two or More Races,” 24% by the increase in “Asian alone,” 22% by the increase in “Black or African American alone,” and 9% by the increase in “Some Other Race alone.” Since the increases in the percentage classified as “Black or African American alone,” “Asian alone,” and “Some Other Race alone” cannot be readily attributed to changes in the Census Bureau’s data collection and processing procedures, it is likely that these changes reflect true demographic changes in Connecticut’s resident population (accounting for a combined 54% of the decrease in the proportion of the population classified as “White alone”).
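
The 46% figure can be reproduced from the percentages given in this post. The sketch below uses a hypothetical helper name and only the “White alone” and “Two or More Races” figures for non-Hispanic residents, since the underlying single-race percentages are not listed here.

```python
def share_of_decrease(decrease_points, increase_points):
    """Percent of a decrease (in percentage points) matched by a given increase."""
    return 100 * increase_points / decrease_points


# Non-Hispanic Connecticut residents, 2010 vs. 2020 (figures from this post).
white_alone_decrease = 82.3 - 76.4   # about 5.9 percentage points
two_or_more_increase = 4.6 - 1.9     # about 2.7 percentage points

multi_share = share_of_decrease(white_alone_decrease, two_or_more_increase)
single_race_share = 100 - multi_share

print(f"Two or More Races: {multi_share:.0f}% of the decrease")
print(f"All single-race increases combined: {single_race_share:.0f}% of the decrease")
```

This yields 46% and 54%. The individually rounded single-race shares quoted above (24%, 22%, and 9%) sum to 55% rather than 54%, which is consistent with rounding.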

 

Proposed Changes to the Federal Standards for Classification of Data on Race & Ethnicity

Some of the challenges with interpreting race data from the 2020 Census could be addressed for future Censuses by proposed changes to the OMB’s standards for the classification of federal data on race and ethnicity that are currently under review.

The preliminary proposed changes include:

  1. Using a combined race and ethnicity question rather than two separate questions for race and Hispanic/Latino ethnicity. This combined question would include “Hispanic or Latino” as a racial/ethnic category and would continue to encourage respondents to select all categories that apply. Respondents would also be able to select or write in one or more detailed Hispanic/Latino origins (whereas, currently, only one detailed Hispanic or Latino origin can be assigned per person).

  2. Adding “Middle Eastern or North African” (MENA) as a new minimum category and removing MENA groups from the “White” category.

These proposed changes have been under consideration by the OMB since before the 1997 OMB standards were set, and were raised again in 2016, but they were not adopted either time. A Federal Interagency Working Group is currently reviewing public comments on the proposed changes, and the OMB has stated that they expect to implement changes to the standards by summer 2024.

These changes were proposed because many people do not self-identify with one of the 1997 OMB race categories. In the 2020 Census, “Some Other Race alone or in combination” was the second-largest racial group alone or in combination, up from the third-largest in 2010 and 2000. As we discussed above, this problem is particularly notable for the Hispanic and Latino population. A Census Bureau report on Hispanic race reporting on the 2010 Census found that 43.5% of self-reported Hispanic and Latino individuals did not identify with one of the OMB race categories: 30.5% were only classified as “Some Other Race” and 13% did not answer the race question (compared to a 4% non-response rate for the total U.S. population).

The Census Bureau conducted research testing a combined race and ethnicity question format in the 2010 Alternative Questionnaire Experiment (AQE) and the 2015 National Content Test (NCT). Both studies found that the combined race and ethnicity question increased the quality of data on individuals of Hispanic or Latino ethnicity by reducing the number who did not respond to the race question, the number who chose the “Some Other Race” category, and the number who selected the “White” category despite feeling that this did not accurately reflect their self-identity. Additionally, the 2010 AQE showed that the combined question format increased the reliability of Hispanic and Latino individuals’ race reporting (that is, they were more likely to report the same race if asked again at a later time). Importantly, the combined question format did not decrease the proportion of the total population that identified as Hispanic or Latino (the AQE found no difference while the NCT found a slight increase with the combined question format).

Both Census Bureau studies found that the combined question format resulted in a much smaller proportion of Hispanic or Latino individuals identifying as White (a reduction of about 40 percentage points), suggesting that many Hispanic or Latino individuals who select “White” on the separate race question are doing so only because they see it as the closest match among the offered choices, not because they identify as White. This was also confirmed by focus group and interview research conducted by the Census Bureau as part of the AQE.

Census Bureau research has also revealed that many individuals of Middle Eastern or North African origin were confused about how to respond to the race question, felt excluded, and/or felt that the inclusion of the terms “Lebanese” and “Egyptian” as examples under the White racial category was wrong. For example, they found that 11.5% of people of Middle Eastern or North African origin reported their identity as “Some Other Race” even though the OMB standards include MENA origins in the White classification.

Thus, these proposed changes are likely to increase the quality of federal data on race and ethnicity by reducing the number of individuals who select “Some Other Race,” who skip the race question entirely, or who are coded as “Two or More Races” by the Census Bureau because they write in an origin that does not match the Census Bureau’s definition of the race category that they selected. Moreover, these changes would likely allow more people to report their race or ethnicity in a way that aligns with their self-identity. However, these changes could introduce further complications when comparing racial and ethnic data over time.

 

For More Information 

You can visit the Census Bureau’s website to learn more about the recent changes to race and ethnicity data collection and processing and their research to improve data on race and ethnicity.

View the 2023 Federal Register Notice to learn more about the proposed changes to the OMB’s standards for the classification of federal data on race and ethnicity.

To learn more about the Census and resources provided by CTData, head to our Census Data portal. Explore other data sets and analysis at data by topic and data projects. You can stay up-to-date on the latest data and tools by subscribing to our newsletter and following CTData on Facebook, Twitter, and LinkedIn.