This paper presents our submission to Task 2 of the Workshop on Noisy User-generated Text. We explore whether a pre-trained transformer-based language model, fine-tuned for text classification, can be improved by an ensemble that also uses corpus-level information and a handcrafted feature. We test how well these additional features help the model cope with a noisy data set centred on a specific subject outside the remit of the pre-training data. We show that including the additional features can improve classification results, and our system achieves a score within 2 points of the top-performing team.
|Title of host publication||Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)|
|Place of Publication||Online|
|Publisher||Association for Computational Linguistics|
|Number of pages||7|
|Publication status||Published - 1 Nov 2020|