TY - CHAP
T1 - On Compositionality in Data Embedding
AU - Xu, Zhaozhen
AU - Guo, Zhijin
AU - Cristianini, Nello
PY - 2023/4/1
Y1 - 2023/4/1
AB - Representing data items as vectors in a space is a common practice in machine learning, where it often goes under the name of “data embedding”. This representation is typically learnt from known relations that exist in the original data, such as co-occurrence of words, or connections in graphs. A property of these embeddings is known as compositionality, whereby the vector representation of an item can be decomposed into different parts, which can be understood separately. This property, first observed in the case of word embeddings, could help with various challenges of modern AI: detection of unwanted bias in the representation, explainability of AI decisions based on these representations, and the possibility of performing analogical reasoning or counterfactual question answering. One important direction of research is to understand the origins, properties and limitations of compositional data embeddings, with the idea of going beyond word embeddings. In this paper, we propose two methods to test for this property, demonstrating their use in the case of sentence embedding and knowledge graph embedding.
KW - Embedding Compositionality
KW - Knowledge Graph
KW - Sentence Embedding
UR - http://www.scopus.com/inward/record.url?scp=85152568244&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-30047-9_38
DO - 10.1007/978-3-031-30047-9_38
M3 - Chapter in a published conference proceeding
AN - SCOPUS:85152568244
SN - 9783031300462
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 484
EP - 496
BT - Advances in Intelligent Data Analysis XXI - 21st International Symposium on Intelligent Data Analysis, IDA 2023, Proceedings
A2 - Crémilleux, Bruno
A2 - Hess, Sibylle
A2 - Nijssen, Siegfried
PB - Springer
CY - Cham, Switzerland
T2 - 21st International Symposium on Intelligent Data Analysis, IDA 2023
Y2 - 12 April 2023 through 14 April 2023
ER -