German federal election: Is there a trend toward more candidates from ethnic minority backgrounds? A deep-learning approach.

elections minority representation candidates Germany

📢 Germany is a diverse society, but is this diversity reflected in the candidates who stand for election? Using a deep-learning approach, I have classified the ethnic background of the more than 22,000 candidates who stood in the past five federal elections. Four key findings:

1️⃣ The share of candidates from ethnic minority backgrounds has doubled since 2005 and stood at 9.2% in 2021. Yet this trend is not present across all parties. In fact, the differences between parties have grown over time.

2️⃣ There is a clear divide between parties on the left and on the right when it comes to including candidates of non-European origin. In the most recent federal election in 2021, fewer than 2% of the candidates of the populist radical-right AfD and the conservative CDU/CSU had origins in Muslim, Asian, or Sub-Saharan African countries.

3️⃣ Are these minority candidates placed in positions where they are likely to win office? The Greens (Grüne) in 2013/2017 and the Left (Die Linke) in 2021 are the only parties for which the share of ‘visible’ minority candidates among the most viable list positions substantially exceeds the overall share of minority candidates running on the party’s list.

4️⃣ Do those with a minority background simply have less interest in standing for election? In every election from 2005 to 2021, the share of minority candidates in small parties, which together cover the full ideological spectrum, is greater than the share in established parties. This suggests that large parties act as gatekeepers.
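The comparison behind finding 3️⃣ can be illustrated with a minimal sketch. It assumes that a ‘viable’ position means one of the top positions on a party list up to some cutoff; the cutoff, the field names `position` and `minority`, and the example list are all made up for illustration and are not taken from the analysis itself.

```python
def minority_share(candidates: list[dict]) -> float:
    """Share of candidates flagged as minority, as a fraction of the group."""
    if not candidates:
        return 0.0
    return sum(c["minority"] for c in candidates) / len(candidates)


def viable_vs_overall(party_list: list[dict], viable_cutoff: int) -> tuple[float, float]:
    """Return (share among viable positions, share on the whole list).

    Assumption: a position counts as viable if it is at or above the cutoff.
    """
    viable = [c for c in party_list if c["position"] <= viable_cutoff]
    return minority_share(viable), minority_share(party_list)


# Made-up party list with ten positions; not real data.
example_list = [{"position": p, "minority": p in (2, 9)} for p in range(1, 11)]
viable_share, overall_share = viable_vs_overall(example_list, viable_cutoff=4)
```

In this toy list, one of the four top positions and two of the ten positions overall are held by minority candidates, so the share among viable positions is higher than the overall share.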

I also discuss these findings in a blog post for LSE EUROPP and in an article for Mediendienst Integration (in German). They have also been covered by the Associated Press, ABC News, the Wall Street Journal, and the Handelsblatt, among others.

➡️ How can we automatically classify the minority status of more than 22,000 candidates? Measuring a candidate’s ‘migration background’ is notoriously difficult. Part of it can be measured directly by geo-coding candidates’ places of birth, which are given in the list of candidates published by the Federal Returning Officer. But the task becomes more complex once we try to go beyond birthplaces.

To understand the potential discrimination candidates face from party gatekeepers and voters, it is useful to consider whether others can identify their background from their names. A candidate’s name is directly available to voters when they see campaign posters, read the ballot paper, or see a candidate ultimately end up in office. Yet a complete analysis of the surnames of all candidates is an extremely challenging task and cannot feasibly be done manually.

Therefore, this analysis is based on an automatic classification of all candidates’ last names. I achieve this with a recurrent neural network (RNN) architecture, namely a ‘long short-term memory’ (LSTM) network. LSTMs are frequently used in deep learning to classify, process, and predict sequential data such as time series. I exploit this capability for name classification by treating each name as a sequence of individual letters. The model is trained on a total of N=74,592 web-scraped surnames, which I broadly classify into twenty categories inspired by the related categories available on Wikipedia.

Below you can see an illustration of how the model attempts to classify names into these categories. The figure shows the relationship between the model’s first guess when classifying a name and its second guess. For instance, when the first guess is that a name is of West Slavic origin, the second guess tends to be East Slavic. Similarly, when the first guess is that a name is Scandinavian, the second guess tends to be German.
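To make the letter-by-letter framing more concrete, here is a minimal sketch in plain Python. It is not the model used in the analysis: it only shows how a surname could be turned into the sequence of per-letter indices that a recurrent network such as an LSTM takes as input, and how first and second guesses could be read off a set of class scores. The alphabet, the function names, and the scores are hypothetical and chosen purely for illustration.

```python
from collections import Counter

# Hypothetical alphabet: lowercase letters plus a few characters common in
# German surnames. Index 0 is reserved for characters outside the alphabet.
ALPHABET = "abcdefghijklmnopqrstuvwxyzäöüß-' "
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def encode_name(name: str) -> list[int]:
    """Turn a surname into a sequence of integer indices, one per letter."""
    return [CHAR_TO_INDEX.get(ch, 0) for ch in name.lower()]


def top_two(scores: dict[str, float]) -> tuple[str, str]:
    """Return the categories with the highest and second-highest score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[0], ranked[1]


def guess_pairs(all_scores: list[dict[str, float]]) -> Counter:
    """Count how often each (first guess, second guess) pair occurs."""
    return Counter(top_two(scores) for scores in all_scores)


# Made-up scores for three names, standing in for the output of a trained
# classifier; they are not results from the analysis.
example_scores = [
    {"West Slavic": 0.6, "East Slavic": 0.3, "German": 0.1},
    {"West Slavic": 0.5, "East Slavic": 0.4, "German": 0.1},
    {"Scandinavian": 0.7, "German": 0.2, "East Slavic": 0.1},
]
pairs = guess_pairs(example_scores)
```

Counting the (first guess, second guess) pairs over many names gives the kind of table that a figure relating first to second guesses could be drawn from.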

Alongside this classification, I performed a manual analysis of candidates’ first names. Where it was available, I also took biographical information into account. The resulting classification closely corresponds to other available estimates of the number of minority candidates in the 2013 German federal election. To the best of my knowledge, my analysis of all 22,711 candidates who stood in the past five German federal elections represents the most extensive attempt to automatically classify electoral candidates in this way.