Ethnicity Estimate

Ethnicity of a name

Ethnicity Estimate helps you find comprehensive information about the ethnicity of a name with two features:

  • Ethnicity or diaspora estimation based on a taxonomy of 136 ethnicities and diasporas: Afghan, African American, Albanian, Algerian, Angolan, Armenian, Asian American, Austrian, Azerbaijani, Bahrani, Bangladeshi, etc.
  • U.S. race ethnicity estimation based on the US Census Taxonomy: African American or Black name, White name, Latino or Hispanic name, Asian name, American Indian or Alaska Native name and Native Hawaiian or Other Pacific Islander name.

Name diaspora: Identify the diaspora from a name

The name diaspora ethnicity feature analyzes a name to identify:

  • the most likely ethnicity;
  • the top 10 most likely ethnicities, ordered from most likely to least likely;
  • the character set used for analysis;
  • a calibrated probability for each ethnicity returned.

Our artificial intelligence uses the morphology of names to determine their ethnicities. This is why we can identify the ethnicity of a full name, a first name, a last name and even a nickname or an invented name.
The feature supports 136 ethnicities and 22 alphabets.

For more precision, enter a first name, a last name and a country of residence. If you do not know the country of residence, you can estimate it with our name country feature.

Example of an ethnic or diaspora identity identified from a first and last name
Enter a first name, a last name, or both for more precision:

First name or given name or nickname.

Last name or family name or surname.

Country of residence, in ISO 3166-1 alpha-2 format.
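
The three inputs above can be checked before they are submitted. The following is a minimal sketch, not part of any official client: `NameQuery` and `validate_query` are hypothetical names, and the country check only verifies the shape of an ISO 3166-1 alpha-2 code (two ASCII letters), not that the code is actually assigned.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Two ASCII letters: the *shape* of an ISO 3166-1 alpha-2 code only.
_ALPHA2 = re.compile(r"[A-Za-z]{2}")


@dataclass(frozen=True)
class NameQuery:
    """Hypothetical container for the three form inputs."""
    first_name: Optional[str] = None
    last_name: Optional[str] = None
    country_iso2: Optional[str] = None


def validate_query(first_name=None, last_name=None, country_iso2=None) -> NameQuery:
    """Normalize the inputs and reject queries that carry no name at all."""
    first = (first_name or "").strip() or None
    last = (last_name or "").strip() or None
    if first is None and last is None:
        raise ValueError("provide a first name, a last name, or both")

    country = (country_iso2 or "").strip() or None
    if country is not None:
        if not _ALPHA2.fullmatch(country):
            raise ValueError(f"not an alpha-2 country code: {country!r}")
        country = country.upper()

    return NameQuery(first, last, country)
```

Blank strings are treated as missing, so a query with only a last name and no country is still accepted.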

About name ethnicity and diaspora estimation

What is the ethnicity of a name

An ethnicity or ethnic group is a group of people who identify with each other on the basis of shared attributes that distinguish them from other groups such as a common set of traditions, ancestry, language, history, culture, origins, religion or social treatment.

A diaspora is a population that lives outside the area in which it, or its ancestors, lived for a long time. The origins of this group of people differ from their country of residence.

For example, for the name Subrahmanyan Chandrasekhar in the United States, the most likely diaspora is Indian and the second most likely is Pakistani.

In addition to the diaspora or ethnicity estimate from a name, we determine a list of the 10 most likely ethnicities or diasporas, sorted from most likely to least likely. Because the ethnicity of a name is not absolute, each ethnicity or diaspora found is accompanied by a normalized probability.

How to find the ethnicity of a name

The ethnicity of a name is determined by our artificial intelligence, which is:

  • enriched by one of the most complete databases in the world;
  • refined through numerous partnerships and research projects with universities (Harvard, Berkeley, HEC, ...), scientific groups (The Lancet, Elsevier, ...), governmental and international institutions (IOM, UN, European Commission, ...) and linguists, anthropologists and historians;
  • reinforced by more than 7 billion names processed.

Our artificial intelligence not only compares names against international databases but also uses the morphology of names to determine their ethnicities.
This technology is what lets us estimate the ethnicity of a first name, a last name or surname, a nickname and a full name.
We can identify the most likely ethnicity or diaspora from a name in more than 22 different alphabets*.

About name diaspora feature

  • Tag: name diaspora.
  • Determine: Diaspora from first name (optional), last name (optional), country code (optional).
  • Cost: 20 credits per name.
  • API: name diaspora documentation.
  • Method used: Comparison of international, national and regional databases and artificial intelligence based on the morphology of names.
  • Returned values: Ethnic or cultural origins of the person's ancestors.
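
As an illustration of how such a request might be assembled when all three inputs are known, here is a sketch that builds the request URL and headers without sending anything. The base URL, the `diaspora/{country}/{first}/{last}` path layout and the `X-API-KEY` header are assumptions that this page does not state; confirm them in the name diaspora documentation before relying on them. `diaspora_url` and `auth_headers` are hypothetical helper names.

```python
from urllib.parse import quote

# ASSUMPTION: base URL and path layout are not stated on this page;
# confirm them in the name diaspora API documentation before use.
BASE_URL = "https://v2.namsor.com/NamSorAPIv2/api2/json"


def diaspora_url(country_iso2: str, first_name: str, last_name: str) -> str:
    """Build a GET URL with each path segment percent-encoded."""
    segments = [country_iso2, first_name, last_name]
    # safe="" also encodes "/" so a name can never add a path segment.
    encoded = "/".join(quote(segment, safe="") for segment in segments)
    return f"{BASE_URL}/diaspora/{encoded}"


def auth_headers(api_key: str) -> dict:
    """Headers for the request; the header name is an ASSUMPTION."""
    return {"X-API-KEY": api_key, "Accept": "application/json"}
```

Percent-encoding each segment separately matters for names containing spaces, slashes or non-Latin characters, which this feature is meant to accept.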

name diaspora parameters

Values submitted:

  • First name: First name or given name or nickname.
  • Last name: Last name or family name or surname.

Returned values:

  • Script: Character set used for analysis (see Alphabets enumerators).
  • Ethnicity or diaspora: Most likely ethnicity (see Ethnicities or diasporas enumerators).
  • Alternative ethnicity: Second most likely ethnicity (see Ethnicities or diasporas enumerators).
  • Top ethnicities: Top 10 most likely ethnicities, ordered from most likely to least likely (see Ethnicities or diasporas enumerators).
  • Calibrated probability: The calibrated probability that ethnicity was guessed correctly; -1 means still calibrating.
  • Alternative calibrated probability: The calibrated probability that ethnicity OR ethnicityAlt was guessed correctly; -1 means still calibrating.
  • Score: Higher implies a more reliable result; the score is not normalized.
  • Lifted: Indicates whether the output ethnicity is based on machine learning only, or further lifted as a known fact by a country-specific rule.
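
The returned values above need a little care when consumed, mainly because -1 is a sentinel rather than a probability. Here is a minimal sketch of one way to handle it, assuming the result arrives as a dictionary. Only `ethnicity` and `ethnicityAlt` appear on this page; the keys `probabilityCalibrated` and `lifted` are assumed names to check against the API documentation, and `summarize` is a hypothetical helper.

```python
from typing import Any, Dict, Optional

CALIBRATING = -1  # sentinel documented above: "still calibrating"


def calibrated_or_none(value: float) -> Optional[float]:
    """Map the -1 sentinel to None so it is never mistaken for a probability."""
    return None if value == CALIBRATING else value


def summarize(result: Dict[str, Any]) -> str:
    """Render one result as a short human-readable line."""
    # ASSUMPTION: key names other than `ethnicity` and `ethnicityAlt`
    # are illustrative; check them against the API documentation.
    prob = calibrated_or_none(result.get("probabilityCalibrated", CALIBRATING))
    basis = "country-specific rule" if result.get("lifted") else "machine learning only"
    shown = "still calibrating" if prob is None else f"{prob:.1%}"
    return (f"{result['ethnicity']} (alt: {result['ethnicityAlt']}), "
            f"probability: {shown}, basis: {basis}")
```

Mapping -1 to `None` keeps the sentinel out of later arithmetic such as averaging or thresholding.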

Name US race: U.S. race ethnicity from a name

The name U.S. race ethnicity feature analyzes a name to identify:

  • the most likely race ethnicity;
  • a list of the most likely race ethnicities, sorted from most likely to least likely;
  • the character set used for analysis;
  • a calibrated probability for each race ethnicity returned.

U.S. race ethnicity is a categorization based on the US Census Taxonomy, which guides the Census Bureau in classifying written responses to the race question. This categorization includes a taxonomy in 6 classes: White, Black or African American, American Indian or Alaska Native, Asian, Hispanic or Latino and Native Hawaiian or Other Pacific Islander.

This feature, based on similar physical and biological attributes, identifies whether a name is more likely a White name, a Black name, a Hispanic or Latino name, an Asian name, an American Indian or Alaska Native name or a Native Hawaiian or Other Pacific Islander name. Note that the name diaspora feature may be better suited for a taxonomy based on cultural expression and place of origin.

For more precision, enter a first name, a last name and a country of residence. If you do not know the country of residence, you can estimate it with our name country feature.

Example of racial ethnicity determined from a first and last name
Enter a first name, a last name, or both for more precision:

First name or given name or nickname.

Last name or family name or surname.

Most likely country of residence, in ISO 3166-1 alpha-2 format.

About name race ethnicity estimation

What is the race ethnicity of a name

U.S. race ethnicity is a categorization based on the US Census Taxonomy. The U.S. Census Bureau must adhere to the 1997 Office of Management and Budget (OMB) standards on race and ethnicity which guide the Census Bureau in classifying written responses to the race question. This categorization includes a taxonomy in 6 classes:

White
A person having origins in any of the original peoples of Europe, the Middle East, or North Africa.
Black or African American
A person having origins in any of the Black racial groups of Africa.
Hispanic or Latino
A person having origins in or descent from Mexico, Puerto Rico, Cuba, Central and South America, and other Spanish-speaking countries.
American Indian or Alaska Native
A person having origins in any of the original peoples of North and South America (including Central America) and who maintains tribal affiliation or community attachment.
Asian
A person having origins in any of the original peoples of the Far East, Southeast Asia, or the Indian subcontinent including, for example, Cambodia, China, India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand, and Vietnam.
Native Hawaiian or Other Pacific Islander
A person having origins in any of the original peoples of Hawaii, Guam, Samoa, or other Pacific Islands.

For example, for the name Keith Haring, the most likely race ethnicity is White, with a probability of 82.84%, and the second most likely race ethnicity is Black.

In addition to the race ethnicity from a name, we estimate a list of the most likely race ethnicities, sorted from most likely to least likely. Because the race ethnicity of a name is not absolute, each race ethnicity found is accompanied by a normalized probability.

About name US race feature

  • Tag: name US race.
  • Determine: US race ethnicity from first name (optional), last name (optional), country code (optional).
  • Cost: 10 credits per name.
  • API: name US race documentation.
  • Method used: Comparison of international, national and regional databases and artificial intelligence based on the morphology of names.
  • Returned values: Race ethnicity based on the US Census Taxonomy.
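
The per-name costs quoted on this page (20 credits for name diaspora, 10 credits for name US race) make budgeting a simple multiplication. A small sketch, with `credits_needed` as a hypothetical helper name:

```python
# Credits per name, as listed on this page for each feature tag.
CREDITS_PER_NAME = {
    "name diaspora": 20,
    "name US race": 10,
}


def credits_needed(feature: str, name_count: int) -> int:
    """Return the credits consumed by processing `name_count` names."""
    if name_count < 0:
        raise ValueError("name_count must be non-negative")
    if feature not in CREDITS_PER_NAME:
        raise KeyError(f"unknown feature tag: {feature!r}")
    return CREDITS_PER_NAME[feature] * name_count
```

For instance, running 1,000 names through both features would consume 20,000 + 10,000 = 30,000 credits.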

name US race parameters

Values submitted:

  • First name: First name or given name or nickname.
  • Last name: Last name or family name or surname.

Returned values:

  • Script: Character set used for analysis (see Alphabets enumerators).
  • U.S. race ethnicity: Most likely ethnicity (U.S. race and ethnicity categorization from US Census Taxonomy; see U.S. race ethnicities enumerators).
  • Alternative U.S. race ethnicity: Second most likely ethnicity (U.S. race and ethnicity categorization from US Census Taxonomy; see U.S. race ethnicities enumerators).
  • Top U.S. race ethnicities: Most likely ethnicities, sorted from most likely to least likely (U.S. race and ethnicity categorization from US Census Taxonomy; see U.S. race ethnicities enumerators).
  • Calibrated probability: The calibrated probability that raceEthnicity was guessed correctly; -1 means still calibrating.
  • Alternative calibrated probability: The calibrated probability that raceEthnicity OR raceEthnicityAlt was guessed correctly; -1 means still calibrating.
  • Score: Higher implies a more reliable result; the score is not normalized.
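
Because the alternative calibrated probability covers raceEthnicity OR raceEthnicityAlt, the share attributable to the alternative alone can be recovered by subtraction, provided the two classes are treated as mutually exclusive (an assumption of this sketch, not a statement from the documentation). The function name below is hypothetical.

```python
from typing import Optional


def alternative_only_probability(prob: float, prob_alt: float) -> Optional[float]:
    """P(alternative alone) = P(top OR alternative) - P(top).

    Valid only if the two classes are mutually exclusive (ASSUMPTION).
    Returns None when either input is the -1 "still calibrating" sentinel.
    """
    if prob == -1 or prob_alt == -1:
        return None
    if not 0.0 <= prob <= prob_alt <= 1.0:
        raise ValueError("expected 0 <= prob <= prob_alt <= 1")
    return prob_alt - prob
```

With the 82.84% reported above for Keith Haring and a purely hypothetical combined value of 95%, the alternative alone would carry about 12.16%.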