Learning semantic sentence embeddings using pair-wise discriminator

Badri N. Patro, Vinod K. Kurmi, Sandeep Kumar, Vinay P. Namboodiri

Research output: Chapter or section in a book/report/conference proceedingChapter in a published conference proceeding

19 Citations (SciVal)

Abstract

In this paper, we propose a method for obtaining sentence-level embeddings. While the problem of securing word-level embeddings is very well studied, we propose a novel method for obtaining sentence-level embeddings. This is obtained by a simple method in the context of solving the paraphrase generation task. If we use a sequential encoder-decoder model for generating paraphrase, we would like the generated paraphrase to be semantically close to the original sentence. One way to ensure this is by adding constraints for true paraphrase embeddings to be close and unrelated paraphrase candidate sentence embeddings to be far. This is ensured by using a sequential pair-wise discriminator that shares weights with the encoder that is trained with a suitable loss function. Our loss function penalizes paraphrase sentence embedding distances from being too large. This loss is used in combination with a sequential encoder-decoder network. We also validated our method by evaluating the obtained embeddings for a sentiment analysis task. The proposed method results in semantic embeddings and outperforms the state-of-the-art on the paraphrase generation and sentiment analysis task on standard datasets. These results are also shown to be statistically significant.

Original languageEnglish
Title of host publicationCOLING 2018 - 27th International Conference on Computational Linguistics, Proceedings
EditorsEmily M. Bender, Leon Derczynski, Pierre Isabelle
PublisherAssociation for Computational Linguistics (ACL)
Pages2715-2729
Number of pages15
ISBN (Electronic)9781948087506
Publication statusPublished - 26 Aug 2018
Event27th International Conference on Computational Linguistics, COLING 2018 - Santa Fe, USA United States
Duration: 20 Aug 201826 Aug 2018

Publication series

NameCOLING 2018 - 27th International Conference on Computational Linguistics, Proceedings

Conference

Conference27th International Conference on Computational Linguistics, COLING 2018
Country/TerritoryUSA United States
CitySanta Fe
Period20/08/1826/08/18

ASJC Scopus subject areas

  • Language and Linguistics
  • Computational Theory and Mathematics
  • Linguistics and Language

Fingerprint

Dive into the research topics of 'Learning semantic sentence embeddings using pair-wise discriminator'. Together they form a unique fingerprint.

Cite this