2 months ago

BN-AuthProf: Benchmarking Machine Learning for Bangla Author Profiling on Social Media Texts

Tasnim, Raisa ; Chowdhury, Mehanaz ; Rahman, Md Ataur
Abstract

Author profiling, the analysis of texts to uncover attributes such as gender and age of the author, has become essential with the widespread use of social media platforms. This paper focuses on author profiling in the Bangla language, aiming to extract valuable insights about anonymous authors based on their writing style on social media. The primary objective is to introduce and benchmark the performance of machine learning approaches on a newly created Bangla Author Profiling dataset, BN-AuthProf. The dataset comprises 30,131 social media posts from 300 authors, labeled by their age and gender. Authors' identities and sensitive information were anonymized to ensure privacy. Various classical machine learning and deep learning techniques were employed to evaluate the dataset. For gender classification, the best accuracy achieved was 80% using Support Vector Machine (SVM), while a Multinomial Naive Bayes (MNB) classifier achieved the best F1 score of 0.756. For age classification, MNB attained a maximum accuracy score of 91% with an F1 score of 0.905. This research highlights the effectiveness of machine learning in gender and age classification for Bangla author profiling, with practical implications spanning marketing, security, forensic linguistics, education, and criminal investigations, considering privacy and biases.
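The abstract reports the best gender accuracy and the best gender F1 score for two different classifiers. The following minimal Python sketch illustrates how that can happen when classes are imbalanced. It is not the authors' evaluation code: the labels are made-up toy data (not drawn from BN-AuthProf), the names `model_a` and `model_b` are hypothetical, and macro-averaged F1 is an assumption, since the abstract does not state which F1 averaging was used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """F1 score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy, imbalanced gold labels: 8 "male" authors, 2 "female" authors.
y_true = ["male"] * 8 + ["female"] * 2

# Hypothetical model A always predicts the majority class.
model_a = ["male"] * 10

# Hypothetical model B finds both "female" authors but mislabels 3 "male" ones.
model_b = ["male"] * 5 + ["female"] * 3 + ["female"] * 2

for name, preds in [("A", model_a), ("B", model_b)]:
    print(name, round(accuracy(y_true, preds), 3), round(macro_f1(y_true, preds), 3))
```

In this toy setting model A has the higher accuracy while model B has the higher macro F1, which is why benchmarks on unevenly distributed labels commonly report both metrics.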
