which aimed to use an ML algorithm. A team of data scientists trained an ML algorithm on resume data to predict applicants' job performance, hoping to streamline the selection of candidates to interview. The algorithm was trained on the resumes of current employees, with gender and names removed in the hope of preventing discrimination, in line with human decision-making practices.
Including gender significantly decreases discrimination, by a factor of 2.8. Without access to gender, the ML algorithm over-predicts default for women relative to their true default rate, while its predictions for men are accurate. Adding gender to the ML algorithm corrects for this, and the gap in prediction accuracy between men and women who default diminishes. Additionally, the use of gender in the ML algorithm increases profitability by 8% on average.
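One way to make the over-prediction described above concrete is to compare, for each group, the mean predicted default probability with the observed default rate. The following is a minimal sketch of that comparison, not the study's method: the function name `calibration_gap_by_group`, the record layout, and the toy numbers are all assumptions made up for illustration.

```python
from collections import defaultdict


def calibration_gap_by_group(records):
    """Return {group: mean predicted default prob - observed default rate}.

    `records` is an iterable of (group, predicted_prob, defaulted) tuples,
    where `defaulted` is 1 if the borrower actually defaulted, else 0.
    A positive gap means the model over-predicts default for that group.
    """
    # Per group: [sum of predicted probabilities, number of defaults, row count]
    sums = defaultdict(lambda: [0.0, 0, 0])
    for group, predicted_prob, defaulted in records:
        entry = sums[group]
        entry[0] += predicted_prob
        entry[1] += defaulted
        entry[2] += 1
    return {g: s[0] / s[2] - s[1] / s[2] for g, s in sums.items()}


# Toy, made-up data purely for illustration (NOT from the study).
toy = [
    ("women", 0.30, 0), ("women", 0.30, 0), ("women", 0.40, 0), ("women", 0.40, 1),
    ("men", 0.25, 0), ("men", 0.25, 0), ("men", 0.25, 0), ("men", 0.25, 1),
]
gaps = calibration_gap_by_group(toy)
print(gaps)
```

In this toy data the gap for women is positive (predicted default exceeds the observed rate) while the gap for men is zero, which mirrors the pattern the text describes. Real analyses would compute this on held-out data.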
We find that proxies can predict gender with 91% accuracy in our data, so although gender is removed, the algorithm estimates much of the gender information through proxies. But these proxies favor men. Without access to the real gender data, the ML algorithm cannot recover as much information for women as for men, and the predictions for women suffer, resulting in discrimination.
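The idea that remaining features act as proxies for a removed attribute can be sketched with a very simple check: guess gender from a single categorical feature and compare the accuracy with a majority-class baseline. This is only an illustration under stated assumptions. The source does not describe how its 91% figure was computed, and the helper names `proxy_accuracy` and `majority_baseline`, the "occupation" feature, and the toy data below are hypothetical.

```python
from collections import Counter, defaultdict


def proxy_accuracy(proxy_values, genders):
    """In-sample accuracy of guessing gender from one categorical proxy.

    For each proxy value, guess the gender most common among rows with
    that value; return the share of rows guessed correctly.
    """
    if len(proxy_values) != len(genders) or not genders:
        raise ValueError("inputs must be non-empty and the same length")
    counts = defaultdict(Counter)
    for value, gender in zip(proxy_values, genders):
        counts[value][gender] += 1
    correct = sum(max(c.values()) for c in counts.values())
    return correct / len(genders)


def majority_baseline(genders):
    """Accuracy of always guessing the most common gender."""
    return max(Counter(genders).values()) / len(genders)


# Hypothetical toy data: a made-up "occupation" proxy (NOT from the study).
occupation = ["nurse", "nurse", "nurse", "engineer",
              "engineer", "engineer", "teacher", "teacher"]
gender = ["F", "F", "M", "M", "M", "F", "F", "M"]

acc = proxy_accuracy(occupation, gender)
base = majority_baseline(gender)
print(f"proxy accuracy: {acc:.3f}, baseline: {base:.3f}")
```

The further the proxy accuracy sits above the baseline, the more gender information the feature leaks. Computing the same accuracy separately for women and for men would be one way to probe the claim that the proxies recover less information for one group, though this sketch does not do so.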
While I’m not looking for a job, I am a Multiracial American, meaning I have parents who belong to two different racial groups, as defined by OMBPress. And, if you don’t know, Multiracial (aka Mixed-race) Americans are the fastest-growing racial group per uscensusbureau.…