Selected Projects
Max-Min Diversification and Monotone Nonnegative Submodular Functions under Fairness Constraints.
Z. Moumoulidou
Report

A Study on Fairness and Diversity in Gender Classification.
M. T. Islam, Z. Moumoulidou
Report

In this project we empirically evaluate how a fair and diverse dataset affects the behavior of a gender classification model built on a simple CNN architecture. To this end, we train the same model on two different datasets: one biased and one de-biased. We show that the overall performance of a classifier trained on a fair and diverse dataset is better, even after applying data-level techniques such as random over-sampling of the minority class or SMOTE to address the bias. As a second step, we experiment with ensemble models and check whether building an independent model for an under-represented group on which the classifier underperforms helps boost accuracy. We conclude that ensemble models alone cannot mitigate the problem. Introducing synthetic or artificial data, even if generated in unconventional ways, can support the minority group as long as the generated data is similar to the real data.
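As a minimal sketch of the random over-sampling baseline mentioned above (not the project's actual pipeline): duplicate randomly chosen minority-class examples until the class counts are balanced. The function name and the toy data are illustrative.

```python
import random

def random_oversample(X, y, minority_label, seed=0):
    """Duplicate randomly chosen minority-class examples until the
    minority class matches the majority class in size."""
    rng = random.Random(seed)
    minority = [(x, lbl) for x, lbl in zip(X, y) if lbl == minority_label]
    majority = [(x, lbl) for x, lbl in zip(X, y) if lbl != minority_label]
    # Sample (with replacement) enough extra minority examples to balance.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    resampled = majority + minority + extra
    rng.shuffle(resampled)
    X_res = [x for x, _ in resampled]
    y_res = [lbl for _, lbl in resampled]
    return X_res, y_res

# Toy imbalanced dataset: four majority (0) examples, one minority (1).
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_res, y_res = random_oversample(X, y, minority_label=1)
```

SMOTE differs in that it interpolates between minority neighbors to create new points rather than duplicating existing ones.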
Towards Profiling Fair Classification Approaches.
M. T. Islam, Z. Moumoulidou
Report

Classification is a fundamental supervised-learning method that is frequently used in high-stakes decision making. Nonetheless, classification systems have been observed to discriminate against historically underrepresented groups. As a result, in recent years the machine learning community has designed various mechanisms that aim to provide fairness guarantees. In this project we select six fair classification approaches and profile them in terms of correctness, fairness, efficiency, and scalability metrics. In our evaluation we use two real-world datasets and contrast the fair approaches with four standard fairness-unaware classification models: Logistic Regression, Support Vector Machines, Decision Trees, and Neural Networks. To the best of our knowledge, the novelty of this study is that it offers a thorough comparison of linear and non-linear models across a diverse set of fair classification mechanisms and correctness and performance metrics. Our findings show that fair approaches generally trade off some correctness for fairness. All fair approaches improve fairness according to the fairness metrics we evaluate, but no single approach performs best across all correctness and fairness metrics. Further, our findings show that the approaches differ in efficiency, and we specifically identify those that scale best with an increasing number of data points and attributes.
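The abstract does not name the specific fairness metrics used; one common choice is the demographic parity difference, i.e. the gap in positive-prediction rates between demographic groups. A minimal illustrative sketch (function name and data are hypothetical, not from the study):

```python
def demographic_parity_difference(y_pred, groups):
    """Gap between the highest and lowest positive-prediction rates
    across groups; 0 means equal rates (demographic parity)."""
    rates = {}
    for g in set(groups):
        preds = [p for p, grp in zip(y_pred, groups) if grp == g]
        rates[g] = sum(preds) / len(preds)
    vals = sorted(rates.values())
    return vals[-1] - vals[0]

# Toy predictions for two groups of four individuals each.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
dpd = demographic_parity_difference(y_pred, groups)
```

Here group "a" receives positive predictions at rate 0.75 and group "b" at rate 0.25, so the demographic parity difference is 0.5; a fairness-aware classifier would aim to shrink this gap.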