This paper addresses the pseudonymization of user profile data from online social networking platforms, particularly Facebook. The goal is to preserve privacy while maintaining data utility for downstream applications such as machine learning. To strike this balance, we propose replacing personal data with context-aware pseudonyms that minimize the loss of data quality. Our methodology applies lightweight encryption techniques to nonsemantic entities, such as user IDs and usernames, and identifies semantic entities, such as names of people and geographic information, for replacement with context-coherent pseudonyms. We generate these pseudonyms through a prompt engineering process using Llama 3, which protects personal data while preserving contextual integrity. While the initial results are promising, we plan to conduct further evaluations of the proposed framework to assess its robustness and scalability.
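
The two-track treatment of nonsemantic and semantic entities can be illustrated with a minimal sketch; it is not the implementation evaluated in this work. Several pieces are assumptions made only for illustration: the field names, the prompt wording, and the use of a keyed hash (HMAC-SHA256) as a stand-in for the lightweight encryption step, which the text does not specify. The language model is abstracted as a caller-supplied `generate` function, so no particular Llama 3 interface is implied.

```python
import hashlib
import hmac
from typing import Callable, Dict, Optional, Tuple

# Assumed field names; a real profile schema may differ.
NONSEMANTIC_FIELDS = {"user_id", "username"}
SEMANTIC_FIELDS = {"name", "hometown"}


def pseudonymize_id(value: str, key: bytes, length: int = 16) -> str:
    """Map a nonsemantic value to a deterministic keyed token.

    HMAC-SHA256 is a stand-in here, not the paper's encryption scheme.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def build_prompt(field: str, value: str) -> str:
    """Hypothetical prompt wording for a context-coherent replacement."""
    return (
        f"Replace the {field} '{value}' with a fictional {field} that fits "
        "the same context (language, region, style). "
        "Return only the replacement."
    )


def pseudonymize_profile(
    profile: Dict[str, str],
    key: bytes,
    generate: Callable[[str], str],
    cache: Optional[Dict[Tuple[str, str], str]] = None,
) -> Dict[str, str]:
    """Return a copy of `profile` with identifying fields replaced."""
    if cache is None:
        cache = {}
    result = {}
    for field, value in profile.items():
        if field in NONSEMANTIC_FIELDS:
            result[field] = pseudonymize_id(str(value), key)
        elif field in SEMANTIC_FIELDS:
            # Reuse earlier replacements so the same entity always
            # receives the same pseudonym when the cache is shared.
            cache_key = (field, str(value))
            if cache_key not in cache:
                cache[cache_key] = generate(build_prompt(field, str(value)))
            result[field] = cache[cache_key]
        else:
            # Fields outside both sets are passed through unchanged.
            result[field] = value
    return result
```

Passing the same `cache` dictionary across profiles keeps pseudonyms consistent for repeated entities, and `generate` could wrap any text-generation backend, for example a locally hosted Llama 3 model.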