Embeddings of textual data containing location names (e.g., social media posts) have essential applications in contexts such as marketing and disaster management. In these downstream applications, social biases encoded in location names can propagate through embeddings and produce unfair results; for example, emergency text messages that differ only in the location name mentioned might receive different rescue responses. Hence, it is critical to identify the social biases encoded in location names and to mitigate them. Prior work on bias in embeddings mainly focuses on individual attributes such as gender or ethnicity. In contrast, the large number of social attributes associated with location names (e.g., income level and population density) makes it challenging to trace the source of bias, so existing mitigation methods based on finding attribute subspaces cannot simply be applied. Moreover, bias mitigation tends to simultaneously remove necessary semantics from embeddings, making it difficult to balance mitigation performance against semantics retention. In this article, we first employ the concept of counterfactual fairness to investigate the social biases encoded in training data. Then, we quantify the biases in contextual embeddings (BERT and ELMo) and report a high correlation between biases in the training data and in the embeddings. Next, we introduce a novel bias mitigation algorithm that customizes bias representations for arbitrary location names, yielding location name vectors debiased with respect to multiple social attributes simultaneously. The proposed algorithm achieves better mitigation performance across the full set of attributes than a prevalent postprocessing method, while maintaining correctness by retaining semantic information.
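As a hedged illustration of the counterfactual-fairness idea sketched in the abstract (swap a location name in otherwise identical text and compare the resulting embeddings), the snippet below measures a cosine-distance gap between a sentence and its counterfactual. The `embed` function here is a self-contained toy stand-in for a real contextual encoder such as BERT or ELMo, and `counterfactual_gap` is a hypothetical helper, not the paper's actual bias measure.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def embed(sentence, dim=8):
    # Toy stand-in for a contextual encoder (e.g., BERT/ELMo):
    # deterministically folds character codes into a normalized vector
    # so the example runs without any model download.
    vec = [0.0] * dim
    for i, ch in enumerate(sentence):
        vec[i % dim] += float(ord(ch))
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def counterfactual_gap(template, name_a, name_b):
    """Embedding distance between a sentence and its counterfactual
    in which one location name is swapped for another. Under
    counterfactual fairness, a task-irrelevant swap should leave the
    embedding (and any downstream decision) essentially unchanged,
    so a large gap suggests location-linked bias."""
    u = embed(template.format(name_a))
    v = embed(template.format(name_b))
    return 1.0 - cosine(u, v)

gap = counterfactual_gap("Urgent: flooding reported near {}", "Springfield", "Riverdale")
```

In the paper's setting, `embed` would be replaced by the studied contextual models, and the gap would be aggregated over many templates and location-name pairs grouped by social attributes such as income level or population density.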
All Science Journal Classification (ASJC) codes
- Modeling and Simulation
- Social Sciences (miscellaneous)
- Human-Computer Interaction
Keywords
- Contextual word embeddings
- Social attributes