Abstract
We introduce a principled method for the signed clustering problem, where the goal is to partition a weighted undirected graph whose edge weights take both positive and negative values, such that edges within the same cluster are mostly positive, while edges spanning across clusters are mostly negative. Our method relies on a graph-based diffuse interface model formulation utilizing the Ginzburg-Landau functional, based on an adaptation of the classic numerical Merriman-Bence-Osher (MBO) scheme for minimizing such graph-based functionals. The proposed objective function aims to minimize the total weight of inter-cluster positively-weighted edges, while maximizing the total weight of the inter-cluster negatively-weighted edges. Our method scales to large sparse networks, and can be easily adjusted to incorporate labelled data information, as is often the case in the context of semisupervised learning. We tested our method on a number of both synthetic stochastic block models and real-world data sets (including financial correlation matrices), and obtained promising results that compare favourably against a number of state-of-the-art approaches from the recent literature.
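As a small illustration of the two quantities named in the objective above, the following sketch computes, for a given partition, the total weight of inter-cluster positive edges (to be minimized) and the total magnitude of inter-cluster negative edges (to be maximized). It is not the MBO scheme itself; the function name `signed_cut_weights` and the toy weight matrix are assumptions made purely for illustration.

```python
import numpy as np

def signed_cut_weights(W, labels):
    """Return (positive inter-cluster weight, magnitude of negative inter-cluster weight).

    W      : symmetric signed weight matrix with zero diagonal (illustrative input)
    labels : cluster label for each node
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    # Boolean mask of node pairs that lie in different clusters.
    across = labels[:, None] != labels[None, :]
    # Split the signed weights into their positive and negative parts.
    W_pos = np.maximum(W, 0.0)
    W_neg = np.maximum(-W, 0.0)
    # Each undirected edge appears twice in a symmetric matrix, hence the 1/2.
    pos_cut = 0.5 * W_pos[across].sum()
    neg_cut = 0.5 * W_neg[across].sum()
    return pos_cut, neg_cut

# Toy graph: nodes {0, 1} and {2, 3} are positively linked internally,
# while all pairs across the two groups carry negative weights.
W = np.array([
    [ 0.0,  1.0, -1.0, -0.5],
    [ 1.0,  0.0, -0.5, -1.0],
    [-1.0, -0.5,  0.0,  1.0],
    [-0.5, -1.0,  1.0,  0.0],
])

good = signed_cut_weights(W, [0, 0, 1, 1])  # partition matching the sign structure
bad = signed_cut_weights(W, [0, 1, 0, 1])   # partition cutting the positive edges
print(good, bad)
```

On this toy graph, the partition that follows the sign structure cuts no positive weight and separates all of the negative weight, whereas the alternative partition cuts both positive edges.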
Original language | English |
---|---|
Pages (from-to) | 73-109 |
Number of pages | 37 |
Journal | Communications in Mathematical Sciences |
Volume | 19 |
Issue number | 1 |
Early online date | 24 Mar 2021 |
DOIs | |
Publication status | Published - 24 Mar 2021 |
Bibliographical note
Funding Information: MC and AP acknowledge support from The Alan Turing Institute EPSRC grant EP/N510129/1 and seed funding project SF029 “Predictive graph analytics and propagation of information in networks”. AP also acknowledges support from the National Group of Mathematical Physics (GNFM-INdAM), by Imperial College together with the Data Science Institute and Thomson-Reuters Grant No. 4500902397-3408 and EPSRC grant EP/P002625/1. YvG did a substantial part of the work which has contributed to this paper at the University of Nottingham. YvG acknowledges that this project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 777826.
Publisher Copyright:
© 2021 International Press. All Rights Reserved.
Copyright:
Copyright 2021 Elsevier B.V., All rights reserved.
Keywords
- clustering
- graph Laplacians
- Merriman-Bence-Osher scheme
- signed networks
- spectral methods
- threshold dynamics
- time series
ASJC Scopus subject areas
- General Mathematics
- Applied Mathematics