Abstract
While the problem of obtaining word-level embeddings is very well studied, in this paper we propose a novel method for obtaining sentence-level embeddings. The embeddings are obtained in the context of solving the paraphrase generation task. When a sequential encoder-decoder model is used to generate a paraphrase, the generated paraphrase should be semantically close to the original sentence. One way to ensure this is to add constraints that keep the embeddings of true paraphrases close together and the embeddings of unrelated candidate sentences far apart. These constraints are enforced by a sequential pair-wise discriminator that shares weights with the encoder and is trained with a suitable loss function. Our loss function penalizes distances between paraphrase sentence embeddings that are too large. This loss is used in combination with a sequential encoder-decoder network. We also validate our method by evaluating the obtained embeddings on a sentiment analysis task. The proposed method yields semantic embeddings and provides competitive results on the paraphrase generation and sentiment analysis tasks on standard datasets. These results are also shown to be statistically significant.
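The abstract does not state the loss function explicitly, so the following is only a minimal sketch of the general idea: a generic margin-based pairwise loss that is small when a true paraphrase lies closer to the original sentence than an unrelated candidate does. The function names, the Euclidean distance, and the `margin` parameter are assumptions made for illustration and are not taken from the paper; the shared-weight encoder that would produce the embeddings is also omitted.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_margin_loss(anchor, paraphrase, unrelated, margin=1.0):
    """Generic hinge-style pairwise loss (illustrative, not the paper's loss).

    The loss is zero once the unrelated sentence embedding is at least
    `margin` farther from the anchor than the true paraphrase embedding is.
    """
    d_pos = euclidean(anchor, paraphrase)  # distance to the true paraphrase
    d_neg = euclidean(anchor, unrelated)   # distance to the unrelated sentence
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings standing in for encoder outputs (hypothetical values).
original = [0.0, 0.0]
near = [0.0, 1.0]   # distance 1 from the original
far = [3.0, 4.0]    # distance 5 from the original

well_separated = pairwise_margin_loss(original, near, far)
badly_separated = pairwise_margin_loss(original, far, near)
print(well_separated, badly_separated)
```

In the first call the paraphrase is already much closer than the unrelated sentence, so nothing is penalized; in the second the roles are swapped and the loss grows with the size of the violation. In a trained model the gradient of such a loss would flow back into the encoder that produced the embeddings.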
Original language | English |
---|---|
Pages (from-to) | 149-161 |
Number of pages | 13 |
Journal | Neurocomputing |
Volume | 420 |
Early online date | 1 Sept 2020 |
Publication status | Published - 8 Jan 2021 |
Bibliographical note
Funding Information: We acknowledge the help provided by our DelTA Lab members and our family who have supported us in our research activity.
Publisher Copyright:
© 2020 Elsevier B.V.
Keywords
- Adversarial learning
- Discriminator
- GAN
- LSTM
- Pairwise
- Paraphrase
- Question Generation
- Sentiment Analysis
- VQA
ASJC Scopus subject areas
- Computer Science Applications
- Cognitive Neuroscience
- Artificial Intelligence