On Compositionality in Data Embedding

Zhaozhen Xu, Zhijin Guo, Nello Cristianini

Research output: Chapter or section in a book/report/conference proceeding › Chapter in a published conference proceeding

Abstract

Representing data items as vectors in a space is a common practice in machine learning, where it often goes under the name of “data embedding”. This representation is typically learnt from known relations that exist in the original data, such as co-occurrence of words, or connections in graphs. A property of these embeddings is known as compositionality, whereby the vector representation of an item can be decomposed into different parts, which can be understood separately. This property, first observed in the case of word embeddings, could help with various challenges of modern AI: detection of unwanted bias in the representation, explainability of AI decisions based on these representations, and the possibility of performing analogical reasoning or counterfactual question answering. One important direction of research is to understand the origins, properties and limitations of compositional data embeddings, with the idea of going beyond word embeddings. In this paper, we propose two methods to test for this property, demonstrating their use in the case of sentence embedding and knowledge graph embedding.
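
The notion of compositionality described in the abstract can be illustrated with a minimal sketch. This is not one of the two tests proposed in the paper, which the abstract does not describe. It only shows the familiar additive case from word embeddings: if an item's vector is the sum of attribute vectors, then swapping one attribute by vector arithmetic lands on the item with the swapped attribute. All vectors, attribute names and the `analogy` helper below are hypothetical toy values chosen for illustration.

```python
import math

# Hypothetical attribute directions (toy values, not taken from the paper).
ROYAL = [1.0, 0.0, 0.0]
MALE = [0.0, 1.0, 0.0]
FEMALE = [0.0, 0.0, 1.0]


def add(u, v):
    """Element-wise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]


def sub(u, v):
    """Element-wise difference of two vectors."""
    return [a - b for a, b in zip(u, v)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Each item is built as a sum of its attribute vectors, so the
# embedding is compositional by construction (an assumption of this sketch).
EMBEDDINGS = {
    "king": add(ROYAL, MALE),
    "queen": add(ROYAL, FEMALE),
    "man": MALE,
    "woman": FEMALE,
    "child": [0.0, 0.5, 0.5],
}


def analogy(a, b, c):
    """Return the item closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = add(sub(EMBEDDINGS[a], EMBEDDINGS[b]), EMBEDDINGS[c])
    candidates = [w for w in EMBEDDINGS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(EMBEDDINGS[w], target))


print(analogy("king", "man", "woman"))
```

In learnt embeddings the decomposition is only approximate, so a practical check would compare the similarity of the composed vector against a baseline rather than expect an exact match.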

Original language: English
Title of host publication: Advances in Intelligent Data Analysis XXI - 21st International Symposium on Intelligent Data Analysis, IDA 2023, Proceedings
Editors: Bruno Crémilleux, Sibylle Hess, Siegfried Nijssen
Place of Publication: Cham, Switzerland
Publisher: Springer
Pages: 484-496
Number of pages: 13
ISBN (Print): 9783031300462
Publication status: Published - 1 Apr 2023
Event: 21st International Symposium on Intelligent Data Analysis, IDA 2023 - Louvain-la-Neuve, Belgium
Duration: 12 Apr 2023 - 14 Apr 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13876 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st International Symposium on Intelligent Data Analysis, IDA 2023
Country/Territory: Belgium
City: Louvain-la-Neuve
Period: 12/04/23 - 14/04/23

Keywords

  • Embedding Compositionality
  • Knowledge Graph
  • Sentence Embedding

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
