When computer science was in its infancy, programmers quickly realized that though computers are astonishingly powerful tools, the results they produce are only as good as the data fed into them. (The principle was soon formalized as GIGO: “Garbage In, Garbage Out.”) What was true in the era of the UNIVAC remains true in the era of machine learning: among other well-publicized AI fiascos, chatbots that interacted with bigoted users have learned to spew racist invective, while facial-recognition software trained solely on images of white people has sometimes failed to recognize people of color as human.
In this episode, we meet Prof. Catherine D’Ignazio of MIT’s Department of Urban Studies and Planning (DUSP), along with Prof. Jacob Andreas and Harini Suresh of the Department of Electrical Engineering and Computer Science. In 2021, D’Ignazio, Andreas, and Suresh collaborated on a project under the Social and Ethical Responsibilities of Computing initiative of the MIT Schwarzman College of Computing to teach students in 6.864 (Natural Language Processing) to recognize how deep-learning systems can replicate and magnify the biases inherent in the data sets used to train them.