AP4L Project: Transition Guardians

By Dr Shalini, University of Surrey

Major life events and transitions significantly shape a person's life, often requiring them to rebuild or redefine aspects of their identity. In times of identity change, individuals may turn to social media for support and guidance. When these transitions involve life-changing experiences, such as relationship breakups or coming out as LGBTQ+, people frequently share their stories on platforms like Reddit to connect with others and navigate the challenges they face. While such self-disclosures can be rewarding social interactions within a community of shared interests, they also carry privacy risks and the threat of online harm. Personal details shared on social media can inadvertently expose users to harm when others piece together contextual clues scattered across their posts. Our research investigates techniques for detecting these subtle privacy leaks in online narratives.

Our research makes the following major contributions:


  1. Research into identifying and retrieving risky self-disclosures of personal information identifiers (PIIs) is hampered by the lack of open-source labeled datasets. Curating datasets of self-disclosive text from platforms such as Reddit, especially text written by vulnerable populations, carries high annotation costs and privacy risks. To foster reproducible research into detecting PII-revealing text, we develop a taxonomy of PII-revealing categories for vulnerable populations and introduce a synthetic, PII-labeled, multi-text span dataset. The dataset is generated by three text-generation large language models prompted to produce text resembling the original Reddit posts.

  2. The project is developing a browser extension to help users reflect on their digital traces and manage their online presence. This tool can highlight potential privacy risks by showing how small pieces of personal information, when combined, can lead to unintended exposure. Users can customize the extension to track PII leaks on specific platforms, such as Facebook, while excluding others like LinkedIn, giving them greater control over their online privacy.

  3. Further, we aim to enhance this extension by giving users statistical insights into the PIIs they reveal through their online interactions. For example, the plugin could inform users if, say, 10% of people in location X identify as transgender, highlighting how sharing this information may associate them with a minority or vulnerable group and potentially make them more identifiable. To protect the people included in the aggregated data, these statistics will be calculated using privacy-preserving techniques, so the insights remain useful without compromising any individual's privacy.
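
One way to picture the privacy-preserving aggregate statistics in the third contribution is the Laplace mechanism from differential privacy. The post does not say which technique the project uses, so the sketch below is only an illustration under that assumption. The names `laplace_noise` and `private_share`, the `epsilon` parameter, and the example counts are all hypothetical, and the population total is assumed to be public.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_share(group_count: int, total_count: int,
                  epsilon: float, rng: random.Random) -> float:
    """Return a noisy estimate of group_count / total_count.

    Assumption: the project's actual technique is not stated in the post;
    this uses the Laplace mechanism purely as an example. A counting query
    changes by at most 1 when one person is added or removed, so noise with
    scale 1 / epsilon is added to the count. total_count is treated as public.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    noisy_count = group_count + laplace_noise(1.0 / epsilon, rng)
    # Clamping is post-processing, so it does not weaken the guarantee.
    noisy_count = min(max(noisy_count, 0.0), float(total_count))
    return noisy_count / total_count


if __name__ == "__main__":
    rng = random.Random()
    # Hypothetical counts: 100 of 1,000 people in location X.
    share = private_share(100, 1000, epsilon=1.0, rng=rng)
    print(f"Estimated share: {share:.1%}")
```

A smaller `epsilon` adds more noise and gives stronger protection to the individuals behind the count, at the cost of a less precise figure shown to the user.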
