The Construction of Race
In 1950, South Africa’s apartheid government created the Race Classification Board to assign legal racial categories to people whose status was ambiguous. The examiners used what they called the pencil test: if a pencil inserted into someone’s hair stayed in place without falling, the person might be classified as “Coloured” rather than “White.” They listened to accents, examined fingernails for pigmentation, and interviewed neighbors. Families were split: siblings were classified into different categories because they had inherited different combinations of features, giving them different legal rights and permitted occupations, and requiring them to live in different neighborhoods.
The pencil test is an extreme example of something that operates wherever racial classification exists: a social and political determination dressed as a natural fact. Race is not a biological category. This is not a political opinion—it is the settled position of geneticists, who have found more variation within conventionally defined racial groups than between them. Race has been defined and redefined by specific people for specific political reasons. Understanding how these categories are built, and by whom, is necessary for understanding how algorithmic systems operate in societies organized around race and caste. A system cannot be race-neutral or caste-neutral if it is trained on data generated by institutions that were not.
Take the US Census Bureau’s changing list of racial categories as an example. In 1930, Mexicans were for the first time classified as a separate racial category rather than white. A decade later they were reclassified as white again following diplomatic pressure from the Mexican government. The category “Hispanic” does not appear in census data before 1970; it was created by the Nixon administration to aggregate Spanish-speaking populations for federal programs.
The sociologists Michael Omi and Howard Winant called this process racial formation. Irish immigrants in the 1840s appeared in popular cartoons as racially distinct from Anglo-Saxons, while Italian immigrants in the early twentieth century were subject to legal discrimination and extrajudicial violence partly justified on racial grounds. Jewish people were categorized as a distinct race for purposes of immigration restriction under the Johnson-Reed Act of 1924. Each of these groups eventually became “white” through political processes like labor organizing, military service, and suburban home ownership policies that included them while excluding Black Americans.
A structurally identical process operated in colonial India, where British census-taking hardened a complex system of jati—locally variable occupational and kinship communities—into a rigid administrative scheme derived from the four-varna hierarchy of classical Sanskrit texts. Before the British began conducting censuses in the 1870s, caste identity in most of India was locally negotiated, overlapping, and context-dependent. A family might claim a higher jati status in one region than in another. The colonial census required people to report a single affiliation from a fixed list, ranked in a hierarchy that was presented as traditional but was in practice imposed or interpolated by officials unfamiliar with local practice.
The jurist and social reformer B.R. Ambedkar, who was himself from a Scheduled Caste community and who chaired the committee that drafted independent India’s constitution, argued throughout his career that caste is not a natural feature of Hindu society. He saw the colonial census not as a neutral enumeration of existing facts but as a hardening of fluid social positions into legal permanence. His arguments remain politically contested in India today, which is itself evidence that caste is a political artifact.
The Mandal Commission of 1979 recommended reserving twenty-seven percent of central government jobs for Other Backward Classes. There were widespread student protests when its recommendations were partially implemented in 1990. The opposition was not to the existence of caste hierarchy as such; it was to the explicit acknowledgment of that hierarchy as the basis for redistribution. Naming a constructed category in order to redress it is deeply uncomfortable for the people advantaged by the original construction. The same pattern has appeared in every country where affirmative action policies have been implemented: defenders of the status quo claim that they are being discriminated against.
Isabel Wilkerson’s Caste draws an explicit parallel between the racial hierarchy of the United States, the caste system of India, and Nazi Germany’s racial laws. The three systems share structural features: birth-based status that cannot be changed by individual achievement, enforcement of endogamy (i.e., pressure or prohibition against intermarriage across boundaries), and claims that contact with the lower caste or race will pollute those with higher status. These systems aren’t identical, but understanding their similarities makes it easier to see features that are easy to miss if each is treated as unique.
Which brings us to the algorithmic systems the tech industry is building. They operate in a world where race and caste have been encoded into property records, school funding formulas, arrest databases, and credit histories through decades of explicitly discriminatory policy. A hiring model trained on historical promotion decisions learns to prefer candidates who resemble the people who were previously promoted, in organizations that excluded people by race and class. A recidivism prediction tool trained on arrest data learns patterns from policing decisions that were themselves racially disparate, and a content moderation system trained on human moderators’ judgments inherits their biases about whose speech is threatening and whose is merely passionate.
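The hiring example can be made concrete with a toy sketch. Everything in it is an assumption made for illustration: the two groups, the two neighborhoods, the counts, and the "model" itself, which is nothing more than a table of historical promotion rates. It is not a description of any real hiring system. The point it tries to show is narrow: the model never sees group membership, only neighborhood and qualification, yet it still treats equally qualified people differently because neighborhood stands in for group.

```python
from collections import defaultdict

# Synthetic, hypothetical records: (group, neighborhood, qualified, promoted).
# In this invented history, every qualified person in group A was promoted,
# while most qualified people in group B were passed over.
HISTORY = (
    [("A", "north", True, True)] * 40
    + [("A", "north", False, False)] * 10
    + [("A", "south", True, True)] * 5
    + [("B", "north", True, True)] * 5
    + [("B", "south", True, True)] * 10
    + [("B", "south", True, False)] * 30
    + [("B", "south", False, False)] * 10
)


def train(history):
    """Learn the past promotion rate for each (neighborhood, qualified) pair.

    The group column is deliberately ignored, so the model is "blind" to it.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [promoted, total]
    for _group, neighborhood, qualified, promoted in history:
        key = (neighborhood, qualified)
        counts[key][0] += int(promoted)
        counts[key][1] += 1
    return {key: p / t for key, (p, t) in counts.items()}


def predict(model, neighborhood, qualified):
    """Recommend promotion when the learned historical rate is at least 50%."""
    return model.get((neighborhood, qualified), 0.0) >= 0.5


def selection_rate_among_qualified(model, applicants, group):
    """Share of qualified applicants in a group that the model recommends."""
    pool = [a for a in applicants if a[0] == group and a[2]]
    chosen = sum(predict(model, a[1], a[2]) for a in pool)
    return chosen / len(pool)


model = train(HISTORY)
for g in ("A", "B"):
    rate = selection_rate_among_qualified(model, HISTORY, g)
    print(f"group {g}: {rate:.2f} of qualified applicants recommended")
```

With this invented history, the model recommends 40 of the 45 qualified people in group A and 5 of the 45 qualified people in group B. Removing the group column did not remove the pattern; the model recovered it from where people live.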
India’s Aadhaar biometric identification system links fingerprints and iris scans to a twelve-digit identifier for every resident. It was presented as a race- and caste-blind system that would reduce discrimination by making identity verification universal and objective. In practice, manual laborers whose fingerprints have worn smooth from physical work cannot authenticate, and rural residents without internet access cannot use online verification portals. A system described as category-neutral still produces unequal outcomes when the world it operates in is not.
- Kendi2016
- Ibram X. Kendi: Stamped from the Beginning: The Definitive History of Racist Ideas in America. Nation Books, 2016, 9781568584638.
- Khera2019
- Reetika Khera (ed.): Dissent on Aadhaar: Big Data Meets Big Brother. Orient BlackSwan, 2019, 9789352875429.
- OmiWinant2015
- Michael Omi and Howard Winant: Racial Formation in the United States (3rd ed.). Routledge, 2015, 9780415520317.
- Wilkerson2020
- Isabel Wilkerson: Caste: The Origins of Our Discontents. Random House, 2020, 9780593230251.