This paper describes the participation of the UoB-NLP team in the ProfNER-ST shared subtask 7a. The task aimed at detecting mentions of professions in social media text. Our team experimented with two methods of improving the performance of pre-trained models: data augmentation through translation and the merging of multiple language inputs. While the best-performing model on the test data was mBERT fine-tuned on data augmented using back-translation, the improvement is minor, possibly because multilingual pre-trained models such as mBERT already have access to the kind of information provided through back-translation and bilingual data.
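The two methods named in the abstract can be illustrated with a minimal sketch. It is not the team's implementation: it assumes each example carries a single text-level label, it reads "merging of multiple language inputs" as plain concatenation of a text with its translation, and every name in it (`back_translate`, `augment`, `merge_languages`, the toy translators) is hypothetical. The toy translators are lookup tables standing in for a real machine translation system.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, int]          # (text, label); a text-level label is an assumption
Translator = Callable[[str], str]  # any function mapping text to translated text


def back_translate(text: str, to_pivot: Translator, from_pivot: Translator) -> str:
    """Translate into a pivot language and back to obtain a paraphrase."""
    return from_pivot(to_pivot(text))


def augment(examples: List[Example], to_pivot: Translator,
            from_pivot: Translator) -> List[Example]:
    """Return the original examples plus any new back-translated paraphrases.

    Each paraphrase inherits the label of the example it came from.
    Paraphrases identical to a text already present are skipped.
    """
    augmented = list(examples)
    seen = {text for text, _ in examples}
    for text, label in examples:
        paraphrase = back_translate(text, to_pivot, from_pivot)
        if paraphrase not in seen:
            seen.add(paraphrase)
            augmented.append((paraphrase, label))
    return augmented


def merge_languages(text: str, translation: str, sep: str = " [SEP] ") -> str:
    """Join a text and its translation into one input string (assumed scheme)."""
    return f"{text}{sep}{translation}"


# Toy stand-ins for a machine translation system (hypothetical data).
def toy_es_to_en(text: str) -> str:
    return {"soy médica": "i am a doctor"}.get(text, text)


def toy_en_to_es(text: str) -> str:
    return {"i am a doctor": "soy doctora"}.get(text, text)


train = [("soy médica", 1), ("hola", 0)]
augmented_train = augment(train, toy_es_to_en, toy_en_to_es)
merged_input = merge_languages("soy médica", toy_es_to_en("soy médica"))
```

In this toy run only the first example yields a new paraphrase, so the augmented set gains one item. The resulting texts would then be passed to whatever tokenizer and fine-tuning loop is in use, which this sketch leaves out.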
|Title of host publication||Proceedings of the Sixth Social Media Mining for Health (SMM4H) Workshop and Shared Task|
|Place of Publication||Mexico City, Mexico|
|Publisher||Association for Computational Linguistics|
|Number of pages||3|
|Publication status||Published - 1 Jun 2021|