
Open Access | Peer-reviewed | Research Article

Accurate graph classification via two-staged contrastive curriculum learning

Sooyeon Shim, Junghun Kim, Kahyun Park, U Kang

Affiliation: Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea

* E-mail: [email protected]

Published: January 3, 2024
https://doi.org/10.1371/journal.pone.0296171

Abstract

Given a graph dataset, how can we generate meaningful graph representations that maximize classification accuracy? Learning representative graph embeddings is important for solving various real-world graph-based tasks. Graph contrastive learning aims to learn representations of graphs by capturing the relationship between the original graph and the augmented graph. However, previous contrastive learning methods neither capture semantic information within graphs nor consider both nodes and graphs while learning graph embeddings. We propose TAG (Two-staged contrAstive curriculum learning for Graphs), a two-staged contrastive learning method for graph classification. TAG learns graph representations at two levels, the node level and the graph level, by exploiting six degree-based model-agnostic augmentation algorithms. Experiments show that TAG outperforms both unsupervised and supervised methods in classification accuracy, achieving on average up to 4.08% points and 4.76% points higher accuracy than the second-best unsupervised and supervised methods, respectively.

Citation: Shim S, Kim J, Park K, Kang U (2024) Accurate graph classification via two-staged contrastive curriculum learning. PLoS ONE 19(1): e0296171. https://doi.org/10.1371/journal.pone.0296171

Editor: Jin Liu, Shanghai Maritime University, CHINA

Received: April 16, 2023; Accepted: November 28, 2023; Published: January 3, 2024

Copyright: © 2024 Shim et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All data and codes are available at the following link: https://github.com/snudatalab/TAG .

Funding: This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT): Flexible and Efficient Model Compression Method for Various Applications and Environments (2020-0-00894), Artificial Intelligence Graduate School Program (Seoul National University) (2021-0-01343), and Artificial Intelligence Innovation Hub (Artificial Intelligence Institute, Seoul National University) (2021-0-02068). The Institute of Engineering Research and the Institute of Computer Technology at Seoul National University provided research facilities for this work. The funders had no role in the methodology of the study.

Competing interests: The authors have declared that no competing interests exist.

Introduction

How can we generate graph representations for accurate graph classification? Graph neural networks (GNNs) have drawn the attention of researchers since they are applicable to real-world graph-structured data including social networks, molecular graphs, etc. Various GNNs have been proposed to solve graph classification [1–7].

A main challenge in accurate graph classification is to learn graph embeddings that reflect the crucial information within graphs. Contrastive learning has been widely used to address this issue and has achieved superior performance on the graph classification task. Graph contrastive learning produces the representations of graphs based on the similarity between graphs. The learning algorithm can be used in both unsupervised [8–14] and supervised [15, 16] settings.

Recent graph contrastive learning methods utilize data augmentation to ensure the similarity between the original graph and the newly generated graph. Random-based augmentations are used to generate graphs in [9, 10, 13], but information loss is inevitable in those methods. Graph contrastive learning methods with carefully designed augmentations [8, 11, 12, 14] preserve more graph semantics than those with random-based ones; however, they increase the complexity of the models. Furthermore, none of the previous approaches optimize node embeddings, which are the basis of graph embeddings.

In this paper, we propose TAG ( Two-staged contrAstive curriculum learning for Graphs ), an accurate graph contrastive learning approach that can be applied to both supervised and unsupervised graph classification. We design six model-agnostic augmentation algorithms that preserve the semantic information of graphs. Three algorithms change the features of nodes, and the other three modify the structure of graphs based on degree centrality. We then conduct graph contrastive learning in two levels: node-level and graph-level. Node-level contrastive learning learns node embeddings based on the relationship between nodes. Graph-level contrastive learning learns the embeddings of graphs based on node embeddings. The embeddings of all nodes within a graph are aggregated to generate a graph embedding. Thus, the relationships of both nodes and graphs are reflected in the graph representations. Furthermore, TAG exploits a curriculum learning strategy to enhance performance. Fig 1 shows the overall performance of TAG; note that TAG outperforms the competitors in both unsupervised and supervised settings.

Fig 1. (a-d) show the performance in the unsupervised setting, and (e-h) show that in the supervised setting. Note that TAG shows the highest classification accuracy with the shortest running time in both settings. (https://doi.org/10.1371/journal.pone.0296171.g001)

Our main contributions are summarized as follows:

  • Data augmentation. We propose six model-agnostic augmentation algorithms for graphs. Every augmentation method considers node centrality to preserve semantic information of original graphs.
  • Method. We propose TAG, a two-staged contrastive curriculum learning method for accurate graph classification. The two-staged approach embeds the relational information of both nodes and graphs into the graph representations.
  • Experiments. We perform experiments on seven benchmark datasets in supervised and unsupervised settings, achieving the best performance.

Table 1 describes the symbols used in this paper. The code is available at https://github.com/snudatalab/TAG .

Table 1. Symbol descriptions. (https://doi.org/10.1371/journal.pone.0296171.t001)

  G_i       i-th graph in a training set of graphs
  G_{f,i}   feature-modified graph originating from G_i
  G_{s,i}   structure-modified graph originating from G_i
  G_{s,i'}  structure-modified graph originating from G_{i'} for i ≠ i'
  v_j       j-th node in a graph G_i
  u_j       j-th node in a graph G_{f,i}

Related works

Node-level graph contrastive learning.

Node-level graph contrastive learning methods are designed to handle the node classification task by capturing the relationship between nodes. DGI [17] is the first work that applies the concept of contrastive learning to the graph domain. JGCL [18] combines supervised, semi-supervised, and unsupervised settings to learn the optimal node representations. GMI [19] defines the concept of graph mutual information (GMI) and aims to maximize the mutual information in terms of node features and graph topology. GCC [20] learns transferable structural representations across various networks to guide the pre-training of graph neural networks. GRACE [21] jointly considers both topology and node attribute levels for corruption to generate graph views and maximizes the agreement between the views at the node level. Zhu et al. [22] propose GCA, which removes unimportant edges by giving them large removal probabilities on the topology level and adds more noise to unimportant feature dimensions on the node attribute level for adaptive augmentation. BGRL [23] is a scalable method with two encoders that learns by predicting alternative augmentations of the input. Graph Barlow Twins (G-BT) [24] replaces negative samples with a cross-correlation-based loss function and does not introduce asymmetry in the network. However, those previous approaches for node-level graph contrastive learning address only the node classification problem, making them unsuitable for the graph classification problem.

Graph-level graph contrastive learning

Graph-level graph contrastive learning aims to obtain graph representations to solve the graph classification task. Previous graph-level contrastive learning methods are divided into two types: model-specific and model-agnostic ones. Model-agnostic approaches use augmentation algorithms which do not engage in the training process. GraphCL [10] brings the contrastive learning method for images to the graph domain. CuCo [13] extends GraphCL by applying curriculum learning to properly learn from the negative samples. MVGRL [9] learns graph-level representations by contrasting encodings from first-order neighbors and graph diffusion. These methods use random-based graph augmentations that cannot preserve the core information of graphs well. We propose a graph contrastive learning method with degree-based augmentations to address this issue.

Model-specific augmentation approaches directly participate in the training process. InfoGraph [8] learns graph representations by contrasting them with patch-level representations obtained from the training process. You et al. [11] propose JOAO, which makes the simple augmentations learnable. AD-GCL [12] adopts the structure of an adversarial attack to obtain graph representations. AutoGCL [14] generates new graphs by changing the softmax function into the Gumbel-Softmax function. However, those approaches for graph-level graph contrastive learning are more complex than model-agnostic methods, significantly increasing the training time. Therefore, we propose a contrastive learning method with simple augmentations for computational efficiency.

Graph augmentation

Data augmentation has garnered significant attention recently, due to its successful application to many domains including image classification [ 25 ], natural language processing (NLP) [ 26 ], human activity recognition (HAR) [ 27 , 28 ], and cognitive engagement classification [ 29 ]. Among them, graph augmentation methods are actively studied for improving the performance of graph contrastive learning.

Graph augmentation algorithms are divided into two types: model-specific and model-agnostic augmentation. Model-specific augmentation algorithms are restricted to a certain model. Thus, those augmentation methods are not easily applied to graph contrastive learning.

Model-agnostic graph augmentations are applicable to any graph neural network. You et al. [10] suggest DropNode and ChangeAttr for graph contrastive learning. DropNode discards randomly selected nodes with their connections, and ChangeAttr converts the features of randomly selected nodes into random values. DropEdge [30] changes graph topology by removing a certain ratio of edges. GraphCrop [31] selects a subgraph from a graph through a random walk. Wang et al. [32] introduce NodeAug, which contains three different augmentations: ReplaceAttr, RemoveEdge, and AddEdge. ReplaceAttr substitutes the feature of a chosen node with the average of its neighboring nodes’ features. RemoveEdge discards edges based on the importance score of edges. AddEdge attaches new edges to a central node which is designated based on the importance score for nodes. Motif-similarity [33] adds and deletes edges from motifs that are frequent in a particular graph. Yoo et al. [34] propose NodeSam and SubMix. NodeSam performs split and merge operations on nodes. SubMix replaces a subgraph of a graph with another subgraph cut off from another graph. SFA [35] proposes a spectral feature augmentation for contrastive learning on graphs.

However, previous model-agnostic augmentation algorithms [10, 31–34] change nodes or edges that are randomly selected, which easily overlooks the semantic information of the original graphs. Another limitation is that previous approaches change only node attributes [35] or only graph structures [30, 33], restricting the diversity of augmented examples. On the other hand, TAG changes both node attributes and graph structures based on the degree centrality to preserve crucial information of graphs.

Preliminary on graph contrastive learning

In this section, we describe the preliminary of our work. Contrastive learning aims to learn embeddings by capturing the relational information between instances. For each instance, positive and negative samples need to be defined to maximize the similarity between a given instance and a positive sample compared to negative samples. Graph contrastive learning operates on graph-structured data. Recent works utilize data augmentation to generate positive samples. Previous graph contrastive learning methods are divided into two categories: node-level and graph-level contrastive learning.

Node-level graph contrastive learning methods obtain node embeddings of a graph. Given a graph, previous approaches augment the given graph and contrast nodes of the given graph and the augmented graph. A pair of nodes from two graphs at the same position is defined as positive samples and all other nodes except for positive samples are defined as negative samples. The model then learns the similarity of a positive pair against a negative pair. Graph-level graph contrastive learning methods learn graph embeddings by contrasting the graphs. Previous approaches set two augmented graphs with the same origin as positive samples and all other graphs in the training set except for the original graph as negative samples. Graph-level contrastive learning models then capture the similarity between a positive pair of graphs compared to a negative pair.

Despite the decent performance of graph contrastive learning, there is still room for improvement. First, the relationship between node and graph embeddings has not been studied. Even though graph embeddings are obtained based on node embeddings, previous graph contrastive learning methods do not consider node embeddings. Second, most augmentation algorithms for contrastive learning randomly select nodes or edges to be modified. Since node features and graph topology are the most essential components of graph-structured data, augmenting graphs while preserving crucial information within these pivotal components is important. However, previous methods rely on random-based augmentation algorithms which inevitably involve information loss. Finally, the influence of both positive and negative samples has not been studied; previous methods focus on either positive or negative samples. To improve the performance of graph contrastive learning, it is important to define both positive and negative samples well. In this work, we propose TAG which addresses these three issues.

Proposed method

We propose TAG, a two-staged contrastive curriculum learning framework for graphs. The main challenges and our approaches are as follows:

  • How can we generate graph representations in both unsupervised and supervised settings? We propose a two-staged graph contrastive curriculum learning method that is applied to both settings through two types of loss functions.
  • How can we design augmentations for contrastive learning to preserve the semantics well? We propose six data augmentation algorithms for graph contrastive learning. The augmentation algorithms consider degree centrality to minimize information loss.
  • How can we determine the order of feeding the negative examples in contrastive learning? We exploit curriculum learning to determine the order of negative samples and maximize the performance of the model.

The overall process of TAG is illustrated in Figs 2 and 3 . Fig 2 explains how the proposed method learns a training set. Fig 3 illustrates the details of performing augmentation and contrastive learning. Given a graph dataset, we first augment graphs, and then perform contrastive curriculum learning in two levels: nodes and graphs.

Fig 2. Overall procedure of how TAG learns a training set. (https://doi.org/10.1371/journal.pone.0296171.g002)

Fig 3. TAG performs node-level and graph-level contrastive learning on the feature-augmented graph G_{f,i} and the structure-augmented graph G_{s,i} obtained from the original graph G_i. In the contrastive learning steps, nodes and graphs colored blue are positive samples, and those colored red are negative ones. (https://doi.org/10.1371/journal.pone.0296171.g003)

Data augmentation

Our goal is to design data augmentation algorithms that minimize the information loss of graphs. Data augmentation is used to ensure the similarity between samples in contrastive learning. The most important challenge of augmentation is preserving the semantics, i.e., keeping the information that is crucial for determining graph labels. If the semantics are not preserved well during augmentation, the original graph and the augmented graph would have different labels, resulting in increased dissimilarity. Therefore, we propose six model-agnostic graph augmentation algorithms based on degree centrality to minimize information loss. Our idea is to change low-degree nodes so that the loss of semantics is minimized.

We categorize the six augmentation methods into two types: feature and structure modification. Feature modification algorithms generate new graphs by changing only the node feature. On the other hand, structure modification algorithms change the graph structure. We propose three algorithms for each type. The three algorithms designed for feature augmentation are listed as follows:

  • Edit feature. Randomly change the features of n nodes with the lowest degrees.
  • Mix feature. Mix the features of two selected nodes and then substitute the mixed features for the features of nodes with lower degrees. Repeat this process n times.
  • Add noise. Add noise to the features of selected nodes. n nodes with the lowest degrees are selected to be modified.

The algorithms for structure augmentation are as follows:

  • Delete node. Discard n nodes with the lowest degrees along with their connections.
  • Delete edge. Select m edges from nodes with the lowest degrees. Remove the selected edges.
  • Cut subgraph. Select a subgraph with high-degree nodes.

n and m denote the number of nodes and edges to be modified, respectively. n and m are decided according to the augmentation ratio which is given as a hyperparameter. All algorithms consider degree centrality to keep semantic information.
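
To make the degree-based selection concrete, below is a minimal sketch of two of the six algorithms ("edit feature" and "delete node"), assuming each graph is a NetworkX graph with a "feature" vector stored on every node; the function names and the NetworkX representation are illustrative assumptions, not taken from the TAG codebase.

```python
import random
import numpy as np
import networkx as nx

def lowest_degree_nodes(graph: nx.Graph, ratio: float):
    """Return the ratio*|V| nodes with the lowest degrees."""
    n = max(1, int(ratio * graph.number_of_nodes()))
    return sorted(graph.nodes, key=lambda v: graph.degree[v])[:n]

def edit_feature(graph: nx.Graph, ratio: float = 0.4) -> nx.Graph:
    """Feature modification: randomly change the features of the lowest-degree nodes."""
    aug = graph.copy()
    for v in lowest_degree_nodes(aug, ratio):
        dim = len(aug.nodes[v]["feature"])
        aug.nodes[v]["feature"] = np.random.rand(dim)  # replace with random values
    return aug

def delete_node(graph: nx.Graph, ratio: float = 0.4) -> nx.Graph:
    """Structure modification: discard the lowest-degree nodes with their connections."""
    aug = graph.copy()
    aug.remove_nodes_from(lowest_degree_nodes(aug, ratio))
    return aug

def augment(G: nx.Graph):
    """Produce one feature view and one structure view per graph, each picked at random
    from the available algorithms (only one algorithm of each type is sketched here)."""
    G_f = random.choice([edit_feature])(G)   # feature-augmented view G_{f,i}
    G_s = random.choice([delete_node])(G)    # structure-augmented view G_{s,i}
    return G_f, G_s
```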

Algorithm 1 TAG (Two-staged Contrastive Curriculum Learning for Graphs)

Output: the trained graph neural network f

6:  for t ← 1 to T do
7:    for i ← 1 to N do
8:      l_n(i) ← ContrastNode(G_i, G_{f,i}, f)    ⊳ Algorithm 2
10:   end for
13: end for
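
The visible lines of Algorithm 1 can be summarized with the following sketch of the training loop, assuming PyTorch-style losses and optimizer; the encoder f, the augmentation routine, and the per-graph loss routines correspond to those described in this section, while the way the node-level and graph-level losses are combined (a plain sum here) is an assumption, since the corresponding steps are rendered as images in the source.

```python
def train_tag(graphs, f, optimizer, epochs, augment, contrast_node, contrast_graph):
    """Sketch of the TAG training loop (Algorithm 1): two-staged contrastive learning."""
    for epoch in range(epochs):                  # line 6: for t <- 1 to T
        total_loss = 0.0
        for G in graphs:                         # line 7: for i <- 1 to N
            G_f, G_s = augment(G)                # feature- and structure-augmented views
            l_n = contrast_node(G, G_f, f)       # line 8: node-level loss (Algorithm 2)
            l_g = contrast_graph(G_f, G_s, f)    # graph-level loss (Algorithm 3)
            total_loss = total_loss + l_n + l_g  # combination of the two losses (assumed sum)
        optimizer.zero_grad()
        total_loss.backward()                    # update the encoder parameters of f
        optimizer.step()
    return f
```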

Two-staged contrastive learning

We propose a graph contrastive learning model for accurate graph classification utilizing all the proposed augmentation algorithms. Graph contrastive learning is a self-supervised approach that allows a model to learn the representations of graphs without labels by teaching the model which graph instances are similar or different. We use the data augmentation algorithms proposed in the Data augmentation section to generate similar graphs. Considering the fact that graph embeddings are obtained based on node embeddings, learning representative embeddings for both nodes and graphs is important. We propose TAG which conducts graph contrastive learning in two stages: node-level and graph-level.


In the following, we first explain the two-staged approach of TAG in detail. Then, we describe how to apply TAG for supervised graph classification and how to exploit curriculum learning for determining the order of negative samples.

Algorithm 2 ContrastNode in TAG

Output: node-level contrastive loss l_n(i) for a graph G_i

4:    if k ≠ j then
6:      x_j, x_k, x_{f,j}, x_{f,k} ← get feature vectors of nodes v_j, v_k, u_j, u_k
7:      v_j, v_k, u_j, u_k ← f(x_j, θ), f(x_k, θ), f(x_{f,j}, θ), f(x_{f,k}, θ)
10:   end if
11:  end for
14: Compute l_n(i)    ⊳ Eq 1

Node-level contrastive learning.

The objective of the node-level contrastive learning in TAG is to learn meaningful node representations by embedding the nodes into a latent space where positive pairs of nodes are located more closely than negative ones. A positive pair (v_j, u_j) of nodes is obtained by selecting a node v_j from an original graph G_i and the node u_j at the same position in the feature-augmented graph G_{f,i}. We utilize all of the proposed augmentations by randomly selecting an augmentation algorithm for each graph from the proposed algorithms.

There are two types of negative node pairs: 1) pairs (v_j, v_k) of nodes both sampled from the original graph G_i, and 2) pairs (v_j, u_k) of nodes sampled from G_i and G_{f,i}, respectively. All nodes in G_i which are not selected for the positive pairs are used to generate the negative samples v_k. Similarly, every node u_k from G_{f,i} except for the selected positive node u_j is treated as a negative sample. The process of sampling positive and negative pairs of nodes for the node-level contrastive learning is illustrated in Fig 4.

Fig 4. Nodes v_j and v_k are selected from the original graph G_i, while nodes u_j and u_k are sampled from the feature-augmented graph G_{f,i} at the same positions. (https://doi.org/10.1371/journal.pone.0296171.g004)

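The exact form of Eq 1 is rendered as an image in the source. A standard NT-Xent-style node-level objective that is consistent with the positive and negative pairs described above would look as follows, where sim(·, ·) denotes cosine similarity and τ is a temperature; this is a hedged sketch, not the verbatim equation of the paper.

```latex
\ell_n(i) = -\sum_{j} \log
\frac{\exp\big(\mathrm{sim}(\mathbf{v}_j, \mathbf{u}_j)/\tau\big)}
     {\exp\big(\mathrm{sim}(\mathbf{v}_j, \mathbf{u}_j)/\tau\big)
      + \sum_{k \neq j} \exp\big(\mathrm{sim}(\mathbf{v}_j, \mathbf{v}_k)/\tau\big)
      + \sum_{k \neq j} \exp\big(\mathrm{sim}(\mathbf{v}_j, \mathbf{u}_k)/\tau\big)}
```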

Algorithm 3 ContrastGraph in TAG

Output: graph-level contrastive loss l_g(i) for a graph G_i

1: (G_{f,i}, G_{s,i}) ← select a positive pair of graphs
2: for i′ ← 1 to N do
3:    if i′ ≠ i then
4:      G_{s,i′} ← select a negative graph
5:      z_{f,i}, z_{s,i′} ← average node embeddings within G_{f,i}, G_{s,i′}
7:    end if
8: end for
10: Compute l_g(i)    ⊳ Eq 2

Graph-level contrastive learning.

Graph-level contrastive learning in TAG aims to obtain representative graph embeddings. Graph embeddings are learned by aggregating all node embeddings within a graph with the average function. As with node-level contrastive learning, positive and negative samples are defined using augmentation in graph-level contrastive learning.

A positive pair (G_{f,i}, G_{s,i}) of graphs contains a feature-modified graph G_{f,i} and a structure-modified graph G_{s,i} of a graph G_i. The feature modification and structure modification algorithms are randomly chosen from the proposed augmentation algorithms. Negative pairs are (G_{f,i}, G_{s,i'}) where G_{i'} is a different graph from G_i. Fig 5 explains the positive and negative samples designed for graph-level contrastive learning.

Fig 5. (G_{f,i}, G_{s,i}) is a positive pair originating from a graph G_i, and (G_{f,i}, G_{s,i'}) for i ≠ i' are negative pairs. (https://doi.org/10.1371/journal.pone.0296171.g005)

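Eq 2 is likewise rendered as an image in the source. A graph-level NT-Xent-style objective consistent with the positive pair (G_{f,i}, G_{s,i}) and the negative pairs (G_{f,i}, G_{s,i'}) would be the following, with z_{f,i} and z_{s,i} the mean-pooled embeddings of G_{f,i} and G_{s,i}; again, this is a hedged sketch rather than the verbatim formulation.

```latex
\ell_g(i) = -\log
\frac{\exp\big(\mathrm{sim}(\mathbf{z}_{f,i}, \mathbf{z}_{s,i})/\tau\big)}
     {\sum_{i'=1}^{N} \exp\big(\mathrm{sim}(\mathbf{z}_{f,i}, \mathbf{z}_{s,i'})/\tau\big)}
```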

Supervised contrastive learning.

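The equations of this subsection are rendered as images in the source. One common way to extend the graph-level objective to the supervised setting, consistent with TAG using a second type of loss function when labels are available, is a supervised-contrastive (SupCon-style) loss that treats all graphs sharing the label of G_i as additional positives; the formula below is a sketch under that assumption, not the paper's verbatim loss.

```latex
\ell_{\mathrm{sup}}(i) = \frac{-1}{|P(i)|} \sum_{p \in P(i)} \log
\frac{\exp\big(\mathrm{sim}(\mathbf{z}_{f,i}, \mathbf{z}_{s,p})/\tau\big)}
     {\sum_{i'=1}^{N} \exp\big(\mathrm{sim}(\mathbf{z}_{f,i}, \mathbf{z}_{s,i'})/\tau\big)},
\qquad P(i) = \{\, p \mid y_p = y_i \,\}
```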

Curriculum learning

Curriculum learning imitates the learning process of humans, who start from easier samples and then learn from harder ones. To further improve the performance of TAG, we reorder the training samples by exploiting the curriculum learning strategy. A naive approach would define negative samples that are misclassified with high probability as hard samples. However, this is not directly applicable to contrastive learning methods, including TAG, since labels may not be given.

To determine the difficulty of samples with respect to the two-staged contrastive loss, we utilize the similarity between positive and negative samples. A sample with a large loss is hard to learn because minimizing its loss is difficult. However, the loss itself is hard to use as a difficulty measure since the reordering must be done before the loss is computed. Thus, we define the cosine similarity of the samples in a negative pair, which affects the size of the loss, as the difficulty score. If a negative sample is similar to the positive sample, the model struggles to find the difference between them, causing a large loss. We feed negative samples with lower similarity first, and then move on to harder negative samples as training continues, to facilitate effective training. Both node-level and graph-level contrastive learning train on negative samples gradually from easy to hard ones.
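
As a concrete illustration of the ordering step, the snippet below sorts negative samples by their cosine similarity to the positive sample so that dissimilar (easy) negatives are presented first; the names are illustrative and the embeddings are assumed to be PyTorch tensors.

```python
import torch
import torch.nn.functional as F

def order_negatives_easy_to_hard(positive: torch.Tensor, negatives: torch.Tensor) -> torch.Tensor:
    """Sort negative embeddings (num_neg x dim) by ascending cosine similarity to the
    positive embedding (dim,): low similarity = easy negatives, high similarity = hard ones."""
    sims = F.cosine_similarity(negatives, positive.unsqueeze(0), dim=1)  # shape (num_neg,)
    return negatives[torch.argsort(sims)]  # easy negatives first, hard ones last
```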

Experiments

We perform experiments to answer the following questions:

  • Q1. Performance on Unsupervised Classification. How fast and accurate is TAG compared to previous methods for unsupervised graph classification?
  • Q2. Performance on Supervised Classification. Does TAG show superior performance to other baselines in the supervised graph classification task?
  • Q3. Effectiveness of Proposed Augmentations. Do the proposed augmentation algorithms improve the performance of TAG?
  • Q4. Ablation Study. Does each step of TAG contribute to the performance of the unsupervised graph classification task?

Experimental settings

We introduce our experimental settings including datasets, competitors, and hyperparameters. All of our experiments are conducted on a single GPU machine with GeForce GTX 1080 Ti.

Datasets. We use seven benchmark datasets for the graph classification task in our experiments, which are summarized in Table 2. MUTAG, PROTEINS, NCI1, NCI109, DD, and PTC-MR [36] are molecular datasets where the nodes stand for atoms and are labeled by atom type, while edges are bonds between the atoms. DBLP [37] is a citation network dataset in the computer science field whose nodes represent scientific publications.

Table 2. Summary of the datasets. (https://doi.org/10.1371/journal.pone.0296171.t002)

Competitors. We compare TAG in supervised and unsupervised settings. For the unsupervised setting, we compare TAG with ten previous approaches for unsupervised graph classification, including those for contrastive learning.

  • DGK [ 38 ] learns latent representations of graphs by adopting the concept of the skip-gram model.
  • sub2vec [ 39 ] is an unsupervised learning algorithm that captures two properties of subgraphs: neighborhood and structure.
  • graph2vec [ 40 ] extends neural networks for document embedding to the graph domain, by viewing the graphs as documents.
  • InfoGraph [ 8 ] generates graph representations by maximizing mutual information between graph-level and patch-level representations.
  • MVGRL [ 9 ] learns graph representations by contrasting two diffusion matrices transformed from the adjacency matrix.
  • GraphCL [ 10 ] brings image contrastive learning to graphs.
  • JOAO [ 11 ] jointly optimizes augmentation selection together with the contrastive objectives.
  • AD-GCL [ 12 ] uses an adversarial training strategy for edge-dropping augmentation of graphs.
  • CuCo [ 13 ] adopts curriculum learning to graph contrastive learning for performance improvement.
  • AutoGCL [ 14 ] uses node representations to predict the probability of selecting a certain augment operation.

We use support vector machine (SVM) and multi-layer perceptron (MLP) as base classifiers to evaluate the competitors and TAG in an unsupervised setting. We select an SVM classifier among various machine learning classifiers for a fair comparison since the competitors use SVM to evaluate their methods. To evaluate methods in deep learning as well as in machine learning, we exploit an MLP classifier.
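
Concretely, unsupervised evaluation of this kind freezes the learned graph embeddings and fits the downstream classifiers with cross-validation; the sketch below uses scikit-learn and assumes `embeddings` and `labels` are NumPy arrays produced by the trained encoder (the classifier settings are illustrative).

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

def evaluate_embeddings(embeddings: np.ndarray, labels: np.ndarray) -> dict:
    """10-fold cross-validated accuracy of SVM and MLP classifiers on frozen graph embeddings."""
    svm_acc = cross_val_score(SVC(), embeddings, labels, cv=10, scoring="accuracy").mean()
    mlp_acc = cross_val_score(MLPClassifier(max_iter=500), embeddings, labels,
                              cv=10, scoring="accuracy").mean()
    return {"SVM": svm_acc, "MLP": mlp_acc}
```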

In the supervised setting, we compare the accuracy of TAG with 4 baselines:

  • GCN+GMP [ 41 ] uses the graph convolutional network (GCN) to learn the node representations, and the global mean pooling (GMP) is applied to obtain the graph representation.
  • GIN [ 5 ] uses multi-layer perceptrons (MLP) to update node representations, and sums them up to generate the graph representation.
  • ASAP [ 6 ] alternatively clusters nodes in a graph and gathers the representations of clusters to obtain graph representations.
  • GMT [7] designs a graph pooling layer based on multi-head attention.

We run 10-fold cross-validation to evaluate the competitors and TAG.

Hyperparameters. We use GCN [41] to learn node embeddings and apply the global mean pooling algorithm to generate a graph embedding. We set the augmentation ratio, which decides the amount of data to be changed, to 0.4. The ratio is the only hyperparameter for data augmentation of TAG; thus, TAG does not suffer from hyperparameter optimization problems. We train each model using the Adam optimizer with a learning rate of 0.0001, and set the number of epochs to 5.
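
The stated configuration (a GCN encoder with global mean pooling, Adam with learning rate 1e-4, 5 epochs, augmentation ratio 0.4) could be set up as follows with PyTorch Geometric; the number of layers and the hidden dimension are illustrative assumptions, not the authors' exact architecture.

```python
import torch
from torch_geometric.nn import GCNConv, global_mean_pool

class GCNEncoder(torch.nn.Module):
    """GCN encoder followed by global mean pooling to obtain a graph embedding."""
    def __init__(self, in_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)

    def forward(self, x, edge_index, batch):
        h = torch.relu(self.conv1(x, edge_index))  # node embeddings after layer 1
        h = self.conv2(h, edge_index)              # node embeddings after layer 2
        return h, global_mean_pool(h, batch)       # node-level and graph-level embeddings

AUGMENTATION_RATIO = 0.4                            # fraction of nodes/edges to modify
EPOCHS = 5
model = GCNEncoder(in_dim=7)                        # in_dim depends on the dataset's node features
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```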

Performance on unsupervised classification

We evaluate unsupervised graph classification accuracy and running time of TAG. The graph classification accuracy of TAG and previous unsupervised methods are described in Table 3 . We adopt support vector machine (SVM) and multi-layer perceptron (MLP) as base classifiers for TAG and the baselines. Note that TAG achieves the best accuracy, giving 4.08% points and 2.14% points higher accuracy than the second-best competitors on average in SVM and MLP classifiers, respectively.

Table 3. Bold and underlined text denote the best and the second-best accuracy, respectively. OOM and Avg. denote the out-of-memory error and the average accuracy, respectively. Note that TAG shows the best classification accuracy. (https://doi.org/10.1371/journal.pone.0296171.t003)

The overall performance in the unsupervised setting of TAG with two classifiers including the running time is summarized in Figs 6 and 7 . Fig 6 shows the results of TAG and previous approaches with an SVM classifier. Note that TAG shows the highest classification accuracy in most cases with the shortest running time. This shows that TAG effectively and efficiently finds the graph representations for unsupervised graph classification from large graphs. Fig 7 shows the accuracy and running time of TAG and the competitors measured with an MLP classifier. TAG outperforms the competitors for most datasets.

Fig 6. Note that TAG shows the highest classification accuracy with the shortest running time in most cases. (https://doi.org/10.1371/journal.pone.0296171.g006)

Fig 7. (a-g) show the accuracy and running time for each dataset. TAG outperforms the competitors in most cases. (https://doi.org/10.1371/journal.pone.0296171.g007)

Performance on supervised classification

TAG also operates in the supervised graph classification task in addition to the unsupervised one. We compare TAG with four baselines for supervised graph classification in Table 4 . We use classification accuracy and running time as the evaluation metrics. Note that TAG gives the highest accuracy, with 4.76% points higher average accuracy than the second-best method. Specifically, TAG in the supervised setting achieves 4.50% points and 13.26% points higher average accuracy than that in the unsupervised setting in SVM and MLP classifiers, respectively.

Table 4. Bold and underlined text denote the best and the second-best accuracy, respectively. Avg. denotes the average accuracy. Note that TAG shows the best accuracy. (https://doi.org/10.1371/journal.pone.0296171.t004)

Fig 8 shows the classification accuracy and the running time of TAG and the baselines in the supervised setting. Note that TAG gives the shortest running time with the highest accuracy in most cases. This shows that TAG efficiently learns meaningful graph representations not only for unsupervised graph classification but also for the supervised one.

Fig 8. (a-g) show the performance on each dataset. Note that TAG shows the highest classification accuracy with the shortest running time for most datasets. (https://doi.org/10.1371/journal.pone.0296171.g008)

Effectiveness of proposed augmentations

We compare the proposed augmentations of TAG with eight previous model-agnostic augmentation algorithms for graphs. ChangeAttr modifies features and the other methods change the structure of graphs. Recall that TAG performs graph contrastive learning in two levels: node-level and graph-level. For node-level, TAG needs feature-augmented graphs. For graph-level, TAG needs feature and structure augmentations. Thus, both augmentation algorithms are necessary for TAG. MVGRL [ 9 ], GraphCL [ 10 ], and CuCo [ 13 ] are previous methods that adopt model-agnostic graph augmentations. However, MVGRL causes out-of-memory errors for large-scale graph datasets. CuCo is more elaborate than GraphCL since it additionally performs curriculum learning. Therefore, we compare TAG with previous augmentation algorithms by applying them to CuCo.

Table 5 shows the classification results using different augmentations. The accuracy is measured with an SVM classifier. TAG outperforms the baselines in most cases. Specifically, TAG achieves 5.05% points higher average accuracy than the strongest baseline, SubMix. Note that the random-based augmentations DropNode, DropEdge, GraphCrop, and ChangeAttr degrade the performance of CuCo for all datasets. This shows that random-based augmentation methods have difficulty preserving the semantics. In contrast, TAG with the proposed augmentations helps enhance the performance.

Table 5. We report the best and the second-best accuracy in bold and underlined text, respectively. Avg. denotes the average accuracy. Note that TAG presents the best accuracy among the models. (https://doi.org/10.1371/journal.pone.0296171.t005)

We also show the effectiveness of the degree-based node and edge selection of TAG for graph augmentation. We compare TAG with two different selection methods: TAG-random and TAG-reverse. TAG-random randomly selects the nodes or edges to be changed. TAG-reverse selects the nodes or edges from high to low degrees. Table 6 reports the classification accuracy of TAG and the baselines. We use SVM and MLP classifiers to measure the accuracy. Note that TAG outperforms the baselines in all datasets. Specifically, TAG achieves up to 4.36% points and 4.19% points higher average accuracy than the second-best baselines in SVM and MLP classifiers, respectively. This shows that the proposed augmentations of TAG, which consider the degree centrality, effectively improve the graph classification accuracy.

Table 6. TAG-random runs TAG by randomly selecting nodes or edges to be modified. TAG-reverse augments nodes or edges with high degrees. Bold, underlined, and Avg. denote the best accuracy, the second-best accuracy, and the average accuracy, respectively. (https://doi.org/10.1371/journal.pone.0296171.t006)

Ablation study

We perform an ablation study for TAG and report the results in Table 7. The methods w/o curriculum and w/o node-level are TAG without the curriculum learning and without the two-staged structure (performing only graph-level contrastive learning), respectively. We also run TAG while fixing the proposed augmentations. Since TAG needs both feature and structure augmentation algorithms to conduct two-staged contrastive learning, we evaluate the performance of pairs of algorithms. For example, 'Edit feature + Delete node' runs TAG using the 'edit feature' and 'delete node' algorithms for feature and structure modification, respectively.

Table 7. We report the accuracy of graph classification using SVM and MLP classifiers. Bold, underlined, and Avg. denote the best accuracy, the second-best accuracy, and the average accuracy, respectively. The methods w/o curriculum and w/o node-level refer to TAG without the curriculum learning and without the node-level contrastive learning, respectively. The fixed augmentation methods (Edit feature + Delete node, Edit feature + Delete edge, etc.) run TAG using the same feature and structure augmentations for all graphs, while TAG randomly selects an augmentation for each graph. Note that TAG shows the best performance in all cases. (https://doi.org/10.1371/journal.pone.0296171.t007)

TAG with the curriculum learning improves the classification performance with SVM and MLP by 6.20% and 3.35% points on average, respectively, compared to TAG without the curriculum learning. Using both node-level and graph-level contrastive learning in TAG achieves 6.55% and 4.16% points higher average accuracy than using only graph-level contrastive learning, in SVM and MLP classifiers, respectively. The experiments with fixed augmentations show higher accuracies than the methods w/o curriculum and w/o node-level. These results indicate that the proposed augmentation algorithms preserve the semantics well, since the accuracies of the fixed augmentation methods are comparable to TAG. Furthermore, TAG achieves the best performance when it utilizes all the proposed augmentation algorithms. Overall, the results show that the proposed ideas, i.e., the two-staged framework, the exploitation of curriculum learning, and the proposed augmentation algorithms for contrastive learning, improve the accuracy of graph classification.

Conclusion

We propose TAG, a two-staged contrastive curriculum learning model for graphs. We introduce two types of data augmentations for graphs and propose six model-agnostic augmentation algorithms that minimize information loss. TAG conducts contrastive curriculum learning in two stages. In the first stage, TAG gathers the relational information between nodes from an original graph and a feature-modified graph. In the second stage, the proposed method utilizes both feature-modified and structure-modified graphs to learn the similarity between them. We exploit curriculum learning to effectively train the model via a carefully selected ordering of negative samples. We evaluate TAG by measuring the graph classification accuracy and running time. TAG shows the fastest running time and the best accuracy, achieving up to 4.08% points and 4.76% points higher average accuracy than the second-best competitors in unsupervised and supervised settings, respectively. Future work includes designing an accurate graph classification method for hypergraphs.

References

  • 1. Zhang M, Cui Z, Neumann M, Chen Y. An End-to-End Deep Learning Architecture for Graph Classification. In: AAAI. AAAI Press; 2018. p. 4438–4445.
  • 2. Lee JB, Rossi RA, Kong X. Graph Classification using Structural Attention. In: KDD. ACM; 2018. p. 1666–1674.
  • 3. Kashima H, Inokuchi A. Kernels for graph classification. In: ICDM workshop on active mining. vol. 2002; 2002.
  • 5. Xu K, Hu W, Leskovec J, Jegelka S. How Powerful are Graph Neural Networks? In: ICLR; 2019.
  • 6. Ranjan E, Sanyal S, Talukdar PP. ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations. In: AAAI; 2020.
  • 7. Baek J, Kang M, Hwang SJ. Accurate Learning of Graph Representations with Graph Multiset Pooling. In: ICLR; 2021.
  • 8. Sun F, Hoffmann J, Verma V, Tang J. InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization. In: ICLR; 2020.
  • 9. Hassani K, Ahmadi AHK. Contrastive Multi-View Representation Learning on Graphs. In: ICML; 2020.
  • 10. You Y, Chen T, Sui Y, Chen T, Wang Z, Shen Y. Graph Contrastive Learning with Augmentations. In: NeurIPS; 2020.
  • 11. You Y, Chen T, Shen Y, Wang Z. Graph Contrastive Learning Automated. In: ICML; 2021.
  • 12. Suresh S, Li P, Hao C, Neville J. Adversarial Graph Augmentation to Improve Graph Contrastive Learning. In: NeurIPS; 2021.
  • 13. Chu G, Wang X, Shi C, Jiang X. CuCo: Graph Representation with Curriculum Contrastive Learning. In: IJCAI; 2021.
  • 14. Yin Y, Wang Q, Huang S, Xiong H, Zhang X. AutoGCL: Automated Graph Contrastive Learning via Learnable View Generators. In: AAAI; 2022.
  • 15. Tan Z, Ding K, Guo R, Liu H. Supervised Graph Contrastive Learning for Few-shot Node Classification; 2022. Available from: https://arxiv.org/abs/2203.15936.
  • 16. Jia H, Ji J, Lei M. Supervised Contrastive Learning with Structure Inference for Graph Classification; 2022. Available from: https://arxiv.org/abs/2203.07691.
  • 17. Velickovic P, Fedus W, Hamilton WL, Liò P, Bengio Y, Hjelm RD. Deep Graph Infomax. In: ICLR; 2019.
  • 18. Akkas S, Azad A. JGCL: Joint Self-Supervised and Supervised Graph Contrastive Learning. In: WWW (Companion Volume); 2022.
  • 19. Peng Z, Huang W, Luo M, Zheng Q, Rong Y, Xu T, et al. Graph Representation Learning via Graphical Mutual Information Maximization. In: WWW; 2020.
  • 20. Qiu J, Chen Q, Dong Y, Zhang J, Yang H, Ding M, et al. GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training. In: SIGKDD; 2020.
  • 21. Zhu Y, Xu Y, Yu F, Liu Q, Wu S, Wang L. Deep Graph Contrastive Representation Learning. CoRR; 2020.
  • 22. Zhu Y, Xu Y, Yu F, Liu Q, Wu S, Wang L. Graph Contrastive Learning with Adaptive Augmentation. In: WWW; 2021.
  • 23. Thakoor S, Tallec C, Azar MG, Azabou M, Dyer EL, Munos R, et al. Large-Scale Representation Learning on Graphs via Bootstrapping. In: ICLR; 2022.
  • 24. Bielak P, Kajdanowicz T, Chawla NV. Graph Barlow Twins: A self-supervised representation learning framework for graphs. Knowl Based Syst; 2022.
  • 26. Dai H, Liu Z, Liao W, Huang X, Wu Z, Zhao L, et al. ChatAug: Leveraging ChatGPT for text data augmentation. arXiv preprint arXiv:2302.13007; 2023.
  • 30. Rong Y, Huang W, Xu T, Huang J. DropEdge: Towards Deep Graph Convolutional Networks on Node Classification. In: ICLR; 2020.
  • 31. Wang Y, Wang W, Liang Y, Cai Y, Hooi B. GraphCrop: Subgraph Cropping for Graph Classification. CoRR; 2020.
  • 32. Wang Y, Wang W, Liang Y, Cai Y, Liu J, Hooi B. NodeAug: Semi-Supervised Node Classification with Data Augmentation. In: KDD; 2020.
  • 33. Zhou J, Shen J, Xuan Q. Data Augmentation for Graph Classification. In: CIKM; 2020.
  • 34. Yoo J, Shim S, Kang U. Model-Agnostic Augmentation for Accurate Graph Classification. In: WWW; 2022.
  • 35. Zhang Y, Zhu H, Song Z, Koniusz P, King I. Spectral Feature Augmentation for Graph Contrastive Learning and Beyond. In: AAAI. AAAI Press; 2023. p. 11289–11297.
  • 36. Morris C, Kriege NM, Bause F, Kersting K, Mutzel P, Neumann M. TUDataset: A collection of benchmark datasets for learning with graphs. CoRR; 2020.
  • 37. Pan S, Zhu X, Zhang C, Yu PS. Graph stream classification using labeled and unlabeled graphs. In: ICDE; 2013.
  • 38. Yanardag P, Vishwanathan SVN. Deep Graph Kernels. In: SIGKDD; 2015.
  • 39. Adhikari B, Zhang Y, Ramakrishnan N, Prakash BA. Sub2Vec: Feature Learning for Subgraphs. In: PAKDD; 2018.
  • 40. Narayanan A, Chandramohan M, Venkatesan R, Chen L, Liu Y, Jaiswal S. graph2vec: Learning Distributed Representations of Graphs. CoRR; 2017.
  • 41. Kipf TN, Welling M. Semi-Supervised Classification with Graph Convolutional Networks. In: ICLR; 2017.
  • DOI: 10.24963/ijcai.2021/317
  • Corpus ID: 235671564

CuCo: Graph Representation with Curriculum Contrastive Learning

  • Guanyi Chu , Xiao Wang , +1 author Xunqiang Jiang
  • Published in International Joint… 1 August 2021
  • Computer Science

Figures and Tables from this paper

figure 1

70 Citations

Accurate graph classification via two-staged contrastive curriculum learning.

  • Highly Influenced

Graph Self-Contrast Representation Learning

Select the best: enhancing graph representation with adaptive negative sample selection, multi-scale subgraph contrastive learning, spectral augmentations for graph contrastive learning, generative subgraph contrast for self-supervised graph representation learning, curriculum graph machine learning: a survey, self-supervised graph-level representation learning with adversarial contrastive learning, localgcl: local-aware contrastive learning for graphs, graphcoco: graph complementary contrastive learning, 28 references, graph contrastive learning with augmentations.

  • Highly Influential

Hierarchical Graph Representation Learning with Differentiable Pooling

Graph2vec: learning distributed representations of graphs, strategies for pre-training graph neural networks, infograph: unsupervised and semi-supervised graph-level representation learning via mutual information maximization, hard negative mixing for contrastive learning, deep graph infomax, how powerful are graph neural networks, inductive representation learning on large graphs, curriculumnet: weakly supervised learning from large-scale web images, related papers.

Showing 1 through 3 of 0 Related Papers

Navigation Menu

Search code, repositories, users, issues, pull requests..., provide feedback.

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly.

To see all available qualifiers, see our documentation .

  • Notifications You must be signed in to change notification settings

source code of IJCAI 2021 paper "Graph Representation with Curriculum Contrastive Learning"

BUPT-GAMMA/CuCo

Folders and files.

NameName
6 Commits

Repository files navigation

Environment settings.

  • Pytorch 1.4.0
  • torch-cluster==1.5.2
  • torch-scatter==1.3.2
  • torch-sparse==0.6.0
  • Python 100.0%

U.S. flag

An official website of the United States government

The .gov means it’s official. Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

The site is secure. The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

  • Publications
  • Account settings

Preview improvements coming to the PMC website in October 2024. Learn More or Try it out now .

  • Advanced Search
  • Journal List
  • PMC10763937

Logo of plosone

Accurate graph classification via two-staged contrastive curriculum learning

Sooyeon shim.

Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea

Junghun Kim

Kahyun park, associated data.

All data and codes are available at the following link: https://github.com/snudatalab/TAG .

Given a graph dataset, how can we generate meaningful graph representations that maximize classification accuracy? Learning representative graph embeddings is important for solving various real-world graph-based tasks. Graph contrastive learning aims to learn representations of graphs by capturing the relationship between the original graph and the augmented graph. However, previous contrastive learning methods neither capture semantic information within graphs nor consider both nodes and graphs while learning graph embeddings. We propose TAG ( Two-staged contrAstive curriculum learning for Graphs ), a two-staged contrastive learning method for graph classification. TAG learns graph representations in two levels: node-level and graph level, by exploiting six degree-based model-agnostic augmentation algorithms. Experiments show that TAG outperforms both unsupervised and supervised methods in classification accuracy, achieving up to 4.08% points and 4.76% points higher than the second-best unsupervised and supervised methods on average, respectively.

Introduction

How can we generate graph representations for accurate graph classification? Graph neural network (GNN) has drawn the attention of researchers since it is applicable to real-world graph-structured data including social networks, molecular graphs, etc. Various GNNs have been proposed to solve graph classification [ 1 – 7 ].

A main challenge of accurate graph classification is to learn graph embeddings that reflect the crucial information within graphs. Contrastive learning has been widely used to address the issue and achieved superior performance on the graph classification task. Graph contrastive learning produces the representations of graphs based on the similarity between graphs. The learning algorithm can be used in both settings: unsupervised [ 8 – 14 ] and supervised settings [ 15 , 16 ].

Recent graph contrastive learning methods utilize data augmentation to ensure the similarity of the original graph and the newly generated graph. Random-based augmentations are used to generate graphs in [ 9 , 10 , 13 ], but information loss is inevitable in those methods. Graph contrastive learning methods with carefully designed augmentations [ 8 , 11 , 12 , 14 ] preserve more graph semantics compared to those with random-based ones; however, these methods increase the complexity of their models. Furthermore, none of the previous approaches optimize node embeddings which are the basis of graph embeddings.

In this paper, we propose TAG ( Two-staged contrAstive curriculum learning for Graphs ), an accurate graph contrastive learning approach that can be applied to both supervised and unsupervised graph classification. We design six model-agnostic augmentation algorithms that preserve the semantic information of graphs. Three algorithms change the features of nodes, and the other three modify the structure of graphs based on degree centrality. We then conduct graph contrastive learning in two levels: node-level and graph-level. Node-level contrastive learning learns node embeddings based on the relationship between nodes. Graph-level contrastive learning learns the embeddings of graphs based on node embeddings. The embeddings of all nodes within a graph are aggregated to generate a graph embedding. Thus, the relationships of both nodes and graphs are reflected in the graph representations. Furthermore, TAG exploits a curriculum learning strategy to enhance performance. Fig 1 shows the overall performance of TAG; note that TAG outperforms the competitors in both unsupervised and supervised settings.

An external file that holds a picture, illustration, etc.
Object name is pone.0296171.g001.jpg

(a-d) are the performance in unsupervised setting, and (e-h) are that in supervised one. Note that TAG shows the highest classification accuracy with the shortest running time in both settings.

Our main contributions are summarized as follows:

  • Data augmentation. We propose six model-agnostic augmentation algorithms for graphs. Every augmentation method considers node centrality to preserve semantic information of original graphs.
  • Method. We propose TAG, a two-staged contrastive curriculum learning method for accurate graph classification. The two-staged approach embeds the relational information of both nodes and graphs into the graph representations.
  • Experiments. We perform experiments on seven benchmark datasets in supervised and unsupervised settings, achieving the best performance.

Table 1 describes the symbols used in this paper. The code is available at https://github.com/snudatalab/TAG .

SymbolDescription
a set of graphs for training
-th graph in a set
, feature-modified graph originated from
, structure-modified graph originated from
, ′ structure-modified graph originated from ′ for ≠ ′
-th node in a graph
-th node in a graph ,

Related works

Node-level graph contrastive learning.

Node-level graph contrastive learning methods are designed to handle node classification task by capturing the relationship between nodes. DGI [ 17 ] is the first work that applies the concept of contrastive learning to the graph domain. JGCL [ 18 ] combines supervised setting, semi-supervised setting, and unsupervised setting to learn the optimal node representations. GMI [ 19 ] defines the concept of graph mutual information (GMI) and aims to maximize the mutual information in terms of node features and topology of graphs. GCC [ 20 ] learns transferable structural representation across various networks to guide the pre-training of graph neural networks. GRACE [ 21 ] jointly considers both topology and node attribute levels for corruption to generate graph views and maximizes the agreement in the views at the node level. Zhu et al. [ 22 ] propose GCA which removes unimportant edges by giving them large removal probabilities on the topology level and adds more noise to unimportant feature dimensions on the node attribute level for adaptive augmentation. BGRL [ 23 ] is a scalable method with two encoders that learns by predicting alternative augmentations of the input. Graph Barlow Twins (G-BT) [ 24 ] is a model that replaces negative samples with a cross-correlation-based loss function and does not introduce asymmetry in the network. black However, those previous approaches for node-level graph contrastive learning address only the node classification problem, making them unsuitable for graph classification problem.

Graph-level graph contrastive learning

Graph-level graph contrastive learning aims to obtain graph representations to solve graph classification task. Previous graph-level contrastive learning methods are divided into two types: model-specific and model-agnostic ones. Model-agnostic approaches use augmentation algorithms which do not engage in the training process. GraphCL [ 10 ] brings the contrastive learning method for images to the graph domain. CuCo [ 13 ] extends GraphCL by applying curriculum learning to properly learn from the negative samples. MVGRL [ 9 ] learns graph-level representations by contrasting encodings from first-order neighbors and graph diffusion. These methods use random-based graph augmentations that cannot preserve the core information of graphs well. We propose a graph contrastive learning method along with degree-based augmentations to address the issue.

Model-specific augmentation approaches directly participate in the training process. InfoGraph [ 8 ] learns graph representations by contrasting them with patch-level representations obtained from the training process. You et al. [ 11 ] propose JOAO which changes the simple augmentations to be learnable. AD-GCL [ 12 ] adopts the structure of an adversarial attack to obtain graph representations. AutoGCL [ 14 ] generates new graphs by changing the softmax function into the Gumbel-Softmax function. black However, those approaches for graph-level graph contrastive learning are more complex than model-agnostic methods, significantly increasing the training time. Therefore, we propose a contrastive learning method with simple augmentations for computational efficiency.

Graph augmentation

Data augmentation has garnered significant attention recently, due to its successful application to many domains including image classification [ 25 ], natural language processing (NLP) [ 26 ], human activity recognition (HAR) [ 27 , 28 ], and cognitive engagement classification [ 29 ]. Among them, graph augmentation methods are actively studied for improving the performance of graph contrastive learning.

Graph augmentation algorithms are divided into two types: model-specific and model-agnostic augmentation. Model-specific augmentation algorithms are restricted to a certain model. black Thus, those augmentation methods are not easy to be directly used in graph contrastive learning.

Model-agnostic graph augmentations are applied to any graph neural network. You et al. [ 10 ] suggest DropNode and ChangeAttr for graph contrastive learning. DropNode discards randomly selected nodes with their connections and ChangeAttr converts features of randomly selected nodes into random values. DropEdge [ 30 ] changes graph topology by removing a certain ratio of edges. GraphCrop [ 31 ] selects a subgraph from a graph through a random walk. Wang et al. [ 32 ] introduce NodeAug which contains three different augmentations: ReplaceAttr, RemoveEdge, and AddEdge. ReplaceAttr substitutes the feature of a chosen node with the average of its neighboring nodes’ features. RemoveEdge discards edges based on the importance score of edges. AddEdge attaches new edges to a central node which is designated based on the importance score for nodes. Motif-similarity [ 33 ] adds and deletes edges from motifs that are frequent in a particular graph. Yoo et al. [ 34 ] proposes NodeSam and SubMix. NodeSam performs split and merge operations on nodes. SubMix replaces a subgraph of a graph with another subgraph cut off from another graph. black SFA [ 35 ] proposes a spectral feature argumentation for contrastive learning on graphs.

However, previous model-agnostic augmentation algorithms [10, 31–34] change nodes or edges that are randomly selected, which easily overlooks the semantic information of the original graphs. Another limitation is that previous approaches change only node attributes [35] or only graph structures [30, 33], restricting the diversity of augmented examples. In contrast, TAG changes both node attributes and graph structures based on the degree centrality to preserve crucial information of graphs.

Preliminary on graph contrastive learning

In this section, we describe the preliminaries of our work. Contrastive learning aims to learn embeddings by capturing the relational information between instances. For each instance, positive and negative samples are defined so that the similarity between the instance and its positive sample is maximized relative to the negative samples. Graph contrastive learning operates on graph-structured data, and recent works utilize data augmentation to generate positive samples. Previous graph contrastive learning methods are divided into two categories: node-level and graph-level contrastive learning.

Node-level graph contrastive learning methods obtain node embeddings of a graph. Given a graph, previous approaches augment it and contrast the nodes of the given graph with those of the augmented graph. A pair of nodes at the same position in the two graphs is defined as a positive pair, and all other nodes are defined as negative samples. The model then learns the similarity of a positive pair against negative pairs. Graph-level graph contrastive learning methods learn graph embeddings by contrasting graphs. Previous approaches set two augmented graphs with the same origin as positive samples, and all other graphs in the training set except for the original graph as negative samples. Graph-level contrastive learning models then capture the similarity between a positive pair of graphs compared to negative pairs.

Despite the decent performance of graph contrastive learning, there is still room for improvement. First, the relationship between node and graph embeddings has not been studied. Even though graph embeddings are obtained based on node embeddings, previous graph contrastive learning methods do not consider node embeddings. Second, most augmentation algorithms for contrastive learning randomly select the nodes or edges to be modified. Since node features and graph topology are the most essential components of graph-structured data, augmenting graphs while preserving crucial information within these pivotal components is important. However, previous methods rely on random-based augmentation algorithms which inevitably involve information loss. Finally, the influence of both positive and negative samples has not been studied; previous methods focus on either positive or negative samples. To improve the performance of graph contrastive learning, defining both positive and negative samples well is important. In this work, we propose TAG, which addresses these three issues.

Proposed method

We propose TAG, a two-staged contrastive curriculum learning framework for graphs. The main challenges and our approaches are as follows:

  • How can we generate graph representations in both unsupervised and supervised settings? We propose a two-staged graph contrastive curriculum learning method that is applied to both settings through two types of loss functions.
  • How can we design augmentations for contrastive learning to preserve the semantics well? We propose six data augmentation algorithms for graph contrastive learning. The augmentation algorithms consider degree centrality to minimize information loss.
  • How can we determine the order of feeding the negative examples in contrastive learning? We exploit curriculum learning to determine the order of negative samples and maximize the performance of the model.

The overall process of TAG is illustrated in Figs 2 and 3. Fig 2 explains how the proposed method learns a training set. Fig 3 illustrates the details of performing augmentation and contrastive learning. Given a graph dataset, we first augment graphs, and then perform contrastive curriculum learning in two levels: nodes and graphs.

Fig 2. TAG first augments all graphs in a training set D, and then performs node-level and graph-level contrastive curriculum learning. For contrastive learning, TAG defines positive and negative samples, and computes the similarity between them. The proposed method learns negative samples from easy to hard ones, in an order determined by the similarity.

Fig 3. TAG performs node-level and graph-level contrastive learning on the feature-augmented graph G_{f,i} and the structure-augmented graph G_{s,i} obtained from the original graph G_i. In the contrastive learning steps, nodes and graphs colored blue are positive samples, and those colored red are negative ones.

Data augmentation

Our goal is to design data augmentation algorithms that minimize the information loss of graphs. Data augmentation is used to ensure the similarity between samples in contrastive learning. The most important challenge of augmentation is preserving the semantics, or keeping crucial information in determining graph labels. If the semantics are not preserved well in the process of augmentation, the original graph and the augmented graph would have different labels, resulting in increased dissimilarity. Therefore, we propose six model-agnostic graph augmentation algorithms based on degree centrality to minimize information loss. Our idea is to change low-degree nodes to minimize the loss of semantics.

We categorize the six augmentation methods into two types: feature and structure modification. Feature modification algorithms generate new graphs by changing only the node feature. On the other hand, structure modification algorithms change the graph structure. We propose three algorithms for each type. The three algorithms designed for feature augmentation are listed as follows:

  • Edit feature. Randomly change the features of n nodes with the lowest degrees.
  • Mix feature. Mix the features of two selected nodes and then substitute the mixed features for the features of nodes with lower degrees. Repeat this process n times.
  • Add noise. Add noise to the features of selected nodes. n nodes with the lowest degrees are selected to be modified.

The algorithms for structure augmentation are as follows:

  • Delete node. Discard n nodes with the lowest degrees along with their connections.
  • Delete edge. Select m edges from nodes with the lowest degrees. Remove the selected edges.
  • Cut subgraph. Select a subgraph with high-degree nodes.

n and m denote the number of nodes and edges to be modified, respectively. n and m are decided according to the augmentation ratio which is given as a hyperparameter. All algorithms consider degree centrality to keep semantic information.
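To make the augmentation step concrete, the following is a minimal Python sketch (not the authors' released code) of two of the algorithms above, 'Edit feature' and 'Delete node'. It assumes a graph is given as a node-feature matrix and an undirected edge list; the function names and defaults are illustrative, and the augmentation ratio plays the same role as described above, controlling how many of the lowest-degree nodes are touched.

import numpy as np

def node_degrees(num_nodes, edges):
    # count how many edges touch each node
    deg = np.zeros(num_nodes, dtype=int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def edit_feature(x, edges, ratio=0.4, rng=None):
    # 'Edit feature': randomly change the features of the n lowest-degree nodes
    rng = rng if rng is not None else np.random.default_rng()
    n = max(1, int(ratio * len(x)))
    low_degree = np.argsort(node_degrees(len(x), edges))[:n]
    x_aug = x.copy()
    x_aug[low_degree] = rng.standard_normal((n, x.shape[1]))
    return x_aug, list(edges)

def delete_node(x, edges, ratio=0.4):
    # 'Delete node': discard the n lowest-degree nodes along with their connections
    n = max(1, int(ratio * len(x)))
    drop = set(np.argsort(node_degrees(len(x), edges))[:n].tolist())
    keep = [i for i in range(len(x)) if i not in drop]
    remap = {old: new for new, old in enumerate(keep)}
    new_edges = [(remap[u], remap[v]) for u, v in edges
                 if u not in drop and v not in drop]
    return x[keep], new_edges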

Algorithm 1 TAG (Two-staged Contrastive Curriculum Learning for Graphs)

Input: training set D = {G_i}_{i=1}^N of graphs, graph neural network f with parameters θ, and number T of training epochs
Output: the trained graph neural network f

1: for G_i ∈ D do
2:   A_f ← select a feature modification algorithm at random
3:   A_s ← select a structure modification algorithm at random
4:   G_{f,i}, G_{s,i} ← augment the graph G_i with A_f, A_s
5: end for
6: for t ← 1 to T do
7:   for i ← 1 to N do
8:     l_n(i) ← ContrastNode(G_i, G_{f,i}, f)    ⊳ Algorithm 2
9:     l_g(i) ← ContrastGraph(G_{f,i}, {G_{s,i}}_{i=1}^N, f)    ⊳ Algorithm 3
10:   end for
11:   L ← -(1/N) Σ_{i∈D} (l_n(i) + l_g(i))    ⊳ Eq 3
12:   θ ← update the parameters to minimize L
13: end for

Two-staged contrastive learning

We propose a graph contrastive learning model for accurate graph classification utilizing all the proposed augmentation algorithms. Graph contrastive learning is a self-supervised approach that allows a model to learn the representations of graphs without labels by teaching the model which graph instances are similar or different. We use the data augmentation algorithms proposed in the Data augmentation section to generate similar graphs. Since graph embeddings are obtained from node embeddings, learning representative embeddings of both nodes and graphs is important. We propose TAG, which conducts graph contrastive learning in two stages: node-level and graph-level.

Algorithm 1 shows the overall training process of TAG. Given a training set D of graphs, we first augment the graphs in D before training, and then perform two-staged contrastive curriculum learning. Node-level contrastive curriculum learning captures the relational information between nodes in a graph G_i and a feature-modified graph G_{f,i} (line 8 in Algorithm 1). Graph-level contrastive learning extracts representative graph embeddings by maximizing the similarity between graphs G_{f,i} and G_{s,i} with the same origin (line 9 in Algorithm 1). A graph neural network is trained by minimizing the proposed two-staged contrastive loss (line 12 in Algorithm 1).

In the following, we first explain the two-staged approach of TAG in detail. Then, we describe how to apply TAG for supervised graph classification and how to exploit curriculum learning for determining the order of negative samples.

Algorithm 2 ContrastNode in TAG

Input: original graph G_i = (V_i, E_i, X_i), feature-augmented graph G_{f,i} = (V_{f,i}, E_{f,i}, X_{f,i}), and graph neural network f with parameters θ
Output: node-level contrastive loss l_n(i) for a graph G_i

1: for j ∈ V_i do
2:   (v_j, u_j) ← select a positive pair of nodes from V_i, V_{f,i}
3:   for k ∈ V_i do
4:     if k ≠ j then
5:       v_k, u_k ← select negative nodes from V_i, V_{f,i}
6:       x_j, x_k, x_{f,j}, x_{f,k} ← get feature vectors of nodes v_j, v_k, u_j, u_k
7:       v_j, v_k, u_j, u_k ← f(x_j, θ), f(x_k, θ), f(x_{f,j}, θ), f(x_{f,k}, θ)
8:       S(v_j, v_k) ← |v_j · v_k| / (|v_j| |v_k|)
9:       S(v_j, u_k) ← |v_j · u_k| / (|v_j| |u_k|)
10:     end if
11:   end for
12:   Sort negative nodes according to S in ascending order
13: end for
14: Compute l_n(i)    ⊳ Eq 1

Node-level contrastive learning

The objective of the node-level contrastive learning in TAG is to learn meaningful node representations by embedding the nodes into a latent space where positive pairs of nodes are located more closely than negative ones. Positive pairs (v_j, u_j) of nodes are obtained by selecting a node v_j from an original graph G_i and a node u_j from a feature-augmented graph G_{f,i} at the same position. We utilize all of the proposed augmentations by randomly selecting an augmentation algorithm for each graph.

There are two types of negative node pairs: 1) pairs (v_j, v_k) of nodes both sampled from the original graph G_i, and 2) pairs (v_j, u_k) of nodes sampled from G_i and G_{f,i}, respectively. All nodes in G_i that are not selected for the positive pairs are used to generate the negative samples v_k. Similarly, every node u_k from G_{f,i} except for the selected positive node u_j is treated as a negative sample. The process of sampling positive and negative pairs of nodes for node-level contrastive learning is illustrated in Fig 4.

Fig 4. Nodes v_j and v_k are selected from the original graph G_i, while nodes u_j and u_k are sampled from the feature-augmented graph G_{f,i} at the same positions.

The node-level contrastive loss l_n is defined in Eq 1, where sim(⋅) denotes the cosine similarity function, τ is the temperature parameter, and K is the number of nodes in a graph. Vectors v_j and u_j are the hidden representations of nodes v_j and u_j, respectively. Algorithm 2 shows the process of calculating the node-level contrastive loss. We exploit curriculum learning and compute the loss with reordered negative samples, whose ordering is determined in line 12 of Algorithm 2. We feed negative samples from easy to hard ones, where the difficulty of a negative sample is defined as the cosine similarity between the sample and its paired positive sample.
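Since Eq 1 itself is not reproduced in this text, the following PyTorch sketch shows one NT-Xent-style formulation that is consistent with the description above (positives at the same position, in-graph and cross-graph negatives, cosine similarity with temperature τ). It is an illustrative assumption rather than the authors' exact definition, and it omits the curriculum reordering for brevity.

import torch
import torch.nn.functional as F

def node_contrastive_loss(v, u, tau=0.5):
    # v: (K, d) node embeddings of the original graph G_i
    # u: (K, d) node embeddings of the feature-augmented graph G_{f,i}
    v = F.normalize(v, dim=1)
    u = F.normalize(u, dim=1)
    sim_cross = v @ u.t() / tau          # similarities between nodes of G_i and G_{f,i}
    sim_intra = v @ v.t() / tau          # similarities among nodes of G_i
    K = v.size(0)
    pos = torch.diag(sim_cross)          # positive pairs (v_j, u_j) at the same position j
    off_diag = ~torch.eye(K, dtype=torch.bool, device=v.device)
    negatives = torch.cat([sim_intra[off_diag].view(K, K - 1),   # (v_j, v_k) pairs
                           sim_cross[off_diag].view(K, K - 1)],  # (v_j, u_k) pairs
                          dim=1)
    denom = torch.logsumexp(torch.cat([pos.unsqueeze(1), negatives], dim=1), dim=1)
    return -(pos - denom).mean()         # averaged over the K nodes of the graph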

Algorithm 3 ContrastGraph in TAG

Input: feature-augmented graph G_{f,i}, structure-augmented graphs {G_{s,i}}_{i=1}^N, and graph neural network f with parameters θ
Output: graph-level contrastive loss l_g(i) for a graph G_i

1: (G_{f,i}, G_{s,i}) ← select a positive pair of graphs
2: for i′ ← 1 to N do
3:   if i′ ≠ i then
4:     G_{s,i′} ← select a negative graph
5:     z_{f,i}, z_{s,i′} ← average node embeddings within G_{f,i}, G_{s,i′}
6:     S(G_{f,i}, G_{s,i′}) ← |z_{f,i} · z_{s,i′}| / (|z_{f,i}| |z_{s,i′}|)
7:   end if
8: end for
9: Sort negative graphs according to S in ascending order
10: Compute l_g(i)    ⊳ Eq 2

Graph-level contrastive learning

Graph-level contrastive learning in TAG aims to obtain representative graph embeddings. Graph embeddings are learned by collecting all node embeddings within a graph with the average function. As with node-level contrastive learning, positive and negative samples are defined using augmentation in graph-level contrastive learning.

A positive pair (G_{f,i}, G_{s,i}) of graphs contains a feature-modified graph G_{f,i} and a structure-modified graph G_{s,i} of a graph G_i. The feature modification and structure modification algorithms are randomly chosen from the proposed augmentation algorithms. Negative pairs are (G_{f,i}, G_{s,i′}) where G_{i′} is a different graph from G_i. Fig 5 explains the positive and negative samples designed for graph-level contrastive learning.

Fig 5. (G_{f,i}, G_{s,i}) is a positive pair originating from a graph G_i, and (G_{f,i}, G_{s,i′}) for i ≠ i′ are negative pairs.

The graph-level contrastive loss l_g is defined in Eq 2, where z_{⋅,i} is a representation of the graph G_{⋅,i} and N is the number of graphs for training. Algorithm 3 describes the process of calculating the graph-level contrastive loss, where graph representations are obtained from node representations in line 5. We reorder the negative samples in line 9 of Algorithm 3 to maximize the performance of TAG by exploiting curriculum learning. TAG trains on negative samples gradually from easy to hard ones, where a negative pair of graphs with low similarity is regarded as an easy sample.
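As above, Eq 2 is not reproduced in this text; the sketch below illustrates the graph-level step under the same assumed NT-Xent-style form. Graph embeddings come from averaging node embeddings, the positive pair is (G_{f,i}, G_{s,i}), and the structure-augmented views of the other graphs serve as negatives; names and the temperature default are illustrative.

import torch
import torch.nn.functional as F

def graph_embedding(node_emb):
    # average the node embeddings within a graph to obtain its representation z
    return node_emb.mean(dim=0)

def graph_contrastive_loss(z_f, z_s, tau=0.5):
    # z_f, z_s: (N, d) embeddings of the feature- and structure-augmented graphs;
    # row i of both matrices originates from the same graph G_i
    z_f = F.normalize(z_f, dim=1)
    z_s = F.normalize(z_s, dim=1)
    sim = z_f @ z_s.t() / tau            # pairwise similarities S(G_{f,i}, G_{s,i'})
    pos = torch.diag(sim)                # positive pairs (G_{f,i}, G_{s,i})
    denom = torch.logsumexp(sim, dim=1)  # positive plus negatives (G_{f,i}, G_{s,i'})
    return -(pos - denom).mean()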

The final loss function L for TAG jointly uses the node-level and graph-level contrastive losses. Given a set D of graphs for training,

L = -(1/N) Σ_{G_i ∈ D} ( l_n(i) + l_g(i) ),    (Eq 3)

where l_n(i) and l_g(i) are the node- and graph-level losses for a graph G_i, respectively, and N is the number of graphs in D.

Supervised contrastive learning

To further improve the performance of TAG, we design the proposed method to operate in the supervised setting as well. In supervised graph classification, the labels of graphs are available during training. To exploit the information of the given labels, we use the typical cross-entropy loss l_ce(⋅). Specifically, the loss l_ce(y_i, ŷ_i) between the one-hot encoded label y_i and the prediction probability ŷ_i of a graph G_i is computed as

l_ce(y_i, ŷ_i) = -Σ_{c=1}^{C} y_i(c) log ŷ_i(c),

where C is the number of classes, y_i(c) is the c-th element of y_i, and ŷ_i(c) is the prediction probability of a graph G_i for class c. Node and graph representation vectors in Eqs 1 and 2 are learned using a graph neural network. For supervised graph classification, we attach a fully-connected layer to the final layer of the graph neural network to construct TAG as an end-to-end model. The probability vector ŷ_i is obtained through the softmax function after the fully-connected layer.

To fully exploit both the result of the two-staged contrastive learning and the information of the given labels while training, we minimize the supervised loss l_ce(y_i, ŷ_i) and the two-staged contrastive loss L simultaneously. Thus, the loss L_sup for supervised learning is computed by adding the cross-entropy loss to the loss in Eq (3), where l_n(i) and l_g(i) are the node- and graph-level losses for a graph G_i, respectively, and N denotes the size of the set D.
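A short sketch of this supervised variant follows: a fully-connected layer is attached to the graph embedding, and the cross-entropy term is simply added to the two-staged contrastive loss. The class name and the unit weighting of the two terms are illustrative assumptions, not taken from the released code.

import torch.nn as nn
import torch.nn.functional as F

class SupervisedHead(nn.Module):
    # fully-connected layer attached to the final layer of the graph neural network
    def __init__(self, emb_dim, num_classes):
        super().__init__()
        self.fc = nn.Linear(emb_dim, num_classes)

    def forward(self, graph_emb):
        return self.fc(graph_emb)        # softmax is folded into cross_entropy below

def supervised_loss(graph_emb, labels, head, contrastive_loss):
    # L_sup: cross-entropy on the predictions plus the two-staged contrastive loss
    return F.cross_entropy(head(graph_emb), labels) + contrastive_loss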

Curriculum learning

Curriculum learning imitates the learning process of humans who start learning from easier samples, and then learn more from harder samples. To further improve the performance of TAG, we reorder the samples for training by exploiting the curriculum learning strategy. A naive approach would define negative samples that are misclassified with high probability as hard samples. However, this is not directly applicable to the contrastive learning methods including TAG since the labels may not be given.

To determine the difficulty of samples with respect to the two-staged contrastive loss, we utilize the similarity between positive and negative samples. A sample with a large loss is hard to learn because minimizing its loss is difficult. However, the loss itself cannot serve as a difficulty measure, since the reordering must be done before the loss is computed. Thus, we define the difficulty score of a negative sample as its cosine similarity to the paired positive sample, which determines the size of the loss. If a negative sample is similar to the positive sample, the model struggles to distinguish them, causing a large loss. We feed negative samples with lower similarity first, and then move on to harder negative samples as training continues to facilitate effective training. Both node-level and graph-level contrastive learning train on negative samples gradually from easy to hard ones.
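The reordering itself can be sketched as follows: the difficulty of each negative is its cosine similarity to the paired positive, and negatives are consumed from the least to the most similar. Variable names are illustrative.

import torch
import torch.nn.functional as F

def order_negatives_easy_to_hard(positive, negatives):
    # positive: (d,) embedding of the positive/anchor sample
    # negatives: (M, d) embeddings of the negative samples
    difficulty = F.cosine_similarity(positive.unsqueeze(0), negatives, dim=1)
    order = torch.argsort(difficulty)    # ascending similarity: easy first, hard last
    return negatives[order], difficulty[order]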

Experiments

We perform experiments to answer the following questions:

  • Q1. Performance on Unsupervised Classification. How fast and accurate is TAG compared to previous methods for unsupervised graph classification?
  • Q2. Performance on Supervised Classification. Does TAG outperform the other baselines in the supervised graph classification task?
  • Q3. Effectiveness of Proposed Augmentations. Do the proposed augmentation algorithms improve the performance of TAG?
  • Q4. Ablation Study. Does each step of TAG contribute to the performance of the unsupervised graph classification task?

Experimental settings

We introduce our experimental settings including datasets, competitors, and hyperparameters. All of our experiments are conducted on a single GPU machine with GeForce GTX 1080 Ti.

Datasets. We use seven benchmark datasets for the graph classification task in our experiments, which are summarized in Table 2. MUTAG, PROTEINS, NCI1, NCI109, DD, and PTC-MR [36] are molecular datasets where nodes stand for atoms and are labeled by atom type, while edges are bonds between atoms. DBLP [37] is a citation network dataset in the computer science field whose nodes represent scientific publications.

Dataset | Graphs | Nodes | Edges | Features | Classes
MUTAG | 188 | 3,371 | 3,721 | 7 | 2
PROTEINS | 1,113 | 43,471 | 81,044 | 3 | 2
NCI1 | 4,110 | 122,747 | 132,753 | 37 | 2
NCI109 | 4,127 | 122,494 | 132,604 | 38 | 2
DD | 1,178 | 334,925 | 843,046 | 89 | 2
PTC-MR | 344 | 4,915 | 5,054 | 18 | 2
DBLP | 19,456 | 203,954 | 764,512 | 41,325 | 2

1 https://chrsmrrs.github.io/datasets/

Competitors. We compare TAG in supervised and unsupervised settings. For the unsupervised setting, we compare TAG with ten previous approaches for unsupervised graph classification, including those for contrastive learning.

  • DGK [ 38 ] learns latent representations of graphs by adopting the concept of the skip-gram model.
  • sub2vec [ 39 ] is an unsupervised learning algorithm that captures two properties of subgraphs: neighborhood and structure.
  • graph2vec [ 40 ] extends neural networks for document embedding to the graph domain, by viewing the graphs as documents.
  • InfoGraph [ 8 ] generates graph representations by maximizing mutual information between graph-level and patch-level representations.
  • MVGRL [ 9 ] learns graph representations by contrasting two diffusion matrices transformed from the adjacency matrix.
  • GraphCL [ 10 ] brings image contrastive learning to graphs.
  • JOAO [ 11 ] jointly optimizes augmentation selection together with the contrastive objectives.
  • AD-GCL [ 12 ] uses an adversarial training strategy for edge-dropping augmentation of graphs.
  • CuCo [ 13 ] adopts curriculum learning to graph contrastive learning for performance improvement.
  • AutoGCL [ 14 ] uses node representations to predict the probability of selecting a certain augment operation.

We use support vector machine (SVM) and multi-layer perceptron (MLP) as base classifiers to evaluate the competitors and TAG in an unsupervised setting. We select an SVM classifier among various machine learning classifiers for a fair comparison since the competitors use SVM to evaluate their methods. To evaluate methods in deep learning as well as in machine learning, we exploit an MLP classifier.

In the supervised setting, we compare the accuracy of TAG with 4 baselines:

  • GCN+GMP [ 41 ] uses the graph convolutional network (GCN) to learn the node representations, and the global mean pooling (GMP) is applied to obtain the graph representation.
  • GIN [ 5 ] uses multi-layer perceptrons (MLP) to update node representations, and sums them up to generate the graph representation.
  • ASAP [ 6 ] alternatively clusters nodes in a graph and gathers the representations of clusters to obtain graph representations.
  • GMT [ 7 ] designs graph pooling layer based on multi-head attention.

We run 10-fold cross-validation to evaluate the competitors and TAG.
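The unsupervised evaluation protocol can be summarized with the following sketch: graph embeddings from the frozen encoder are classified with an SVM or an MLP under 10-fold cross-validation. The scikit-learn calls are illustrative of the protocol, not taken from the released code, and the MLP settings are assumptions.

from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

def evaluate_embeddings(embeddings, labels):
    # embeddings: (N, d) graph embeddings from the frozen encoder, labels: (N,) classes
    svm_acc = cross_val_score(SVC(), embeddings, labels, cv=10).mean()
    mlp_acc = cross_val_score(MLPClassifier(max_iter=500), embeddings, labels, cv=10).mean()
    return svm_acc, mlp_acc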

Hyperparameters. We use GCN [41] to learn node embeddings and apply the global mean pooling algorithm to generate a graph embedding. We set the augmentation ratio, which decides the amount of data to be changed, to 0.4. The ratio is the only hyperparameter for data augmentation of TAG; thus, TAG does not suffer from hyperparameter optimization problems. We train each model using the Adam optimizer with a learning rate of 0.0001, and set the number of epochs to 5.
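A minimal sketch of this setup, assuming a PyTorch Geometric implementation (the class name, layer sizes, and two-layer depth are illustrative assumptions), is:

import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv, global_mean_pool

class GCNEncoder(nn.Module):
    def __init__(self, in_dim, hidden_dim=64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)

    def forward(self, x, edge_index, batch):
        h = torch.relu(self.conv1(x, edge_index))
        h = self.conv2(h, edge_index)                # node embeddings
        return h, global_mean_pool(h, batch)         # graph embeddings via mean pooling

model = GCNEncoder(in_dim=37)                        # e.g., NCI1 has 37 node features
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
AUGMENTATION_RATIO, NUM_EPOCHS = 0.4, 5              # as stated above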

Performance on unsupervised classification

We evaluate the unsupervised graph classification accuracy and running time of TAG. The graph classification accuracies of TAG and previous unsupervised methods are reported in Table 3. We adopt support vector machine (SVM) and multi-layer perceptron (MLP) classifiers as base classifiers for TAG and the baselines. Note that TAG achieves the best accuracy, giving 4.08% points and 2.14% points higher accuracy than the second-best competitors on average with SVM and MLP classifiers, respectively.

Bold and underlined text denote the best and the second-best accuracy, respectively. OOM and Avg. denote the out of memory error and average accuracy, respectively. Note that TAG shows the best classification accuracy.

MethodUnsupervised Setting (SVM)Unsupervised Setting (MLP)
MUT.PROT.NCI1N109DDPTCDBLPAvg.MUT.PROT.NCI1N109DDPTCDBLPAvg.
DGK [ ]85.6766.6765.4765.8673.3561.0379.4971.0866.4667.4054.1853.0262.3149.1075.2561.10
sub2vec [ ]74.4271.8857.4057.2668.2553.8062.7163.6755.2653.9151.7551.5647.9551.2060.3753.15
graph2vec [ ]68.5759.5754.0452.4157.9860.5055.1258.3164.9456.7954.3652.1755.3455.8454.6956.30
InfoGraph [ ]85.12 75.65 72.52 78.2766.9280.4876.0971.7869.2860.27 73.0757.6176.4966.96
MVGRL [ ]83.33OOMOOMOOMOOM64.71OOM21.15 89.47 OOMOOMOOMOOM58.82OOM21.19
GraphCL [ ]86.6774.4866.8967.2678.8661.9376.3373.2075.0669.3760.1059.4971.7360.5075.2867.36
JOAO [ ]88.2574.7666.8467.1779.9662.5076.3473.6975.5369.2860.1759.71 73.18 63.1975.2168.04
AD-GCL [ ]88.8275.0371.8571.4676.3957.6181.6574.6981.7662.1458.0959.5661.0855.9777.5165.16
CuCo [ ]87.3173.5064.3363.8376.66 72.78 76.3473.5471.2366.5960.4658.89 57.2573.4665.91
AutoGCL [ ] 89.42 74.9371.43 81.69 67.50 82.82 77.45 84.21 70.54 59.98 72.03 68.03 79.06 70.63
73.47 72.38 60.53 58.6372.52

The overall performance of TAG in the unsupervised setting with the two classifiers, including the running time, is summarized in Figs 6 and 7. Fig 6 shows the results of TAG and previous approaches with an SVM classifier. Note that TAG shows the highest classification accuracy in most cases with the shortest running time. This shows that TAG effectively and efficiently finds graph representations for unsupervised graph classification from large graphs. Fig 7 shows the accuracy and running time of TAG and the competitors measured with an MLP classifier. TAG outperforms the competitors on most datasets.

Fig 6. Accuracy and running time with an SVM classifier. Note that TAG shows the highest classification accuracy with the shortest running time in most cases.

Fig 7. Accuracy and running time with an MLP classifier. (a-g) show the accuracy and running time for each dataset. TAG outperforms the competitors in most cases.

Performance on supervised classification

TAG also operates in the supervised graph classification task in addition to the unsupervised one. We compare TAG with four baselines for supervised graph classification in Table 4 . We use classification accuracy and running time as the evaluation metrics. Note that TAG gives the highest accuracy, with 4.76% points higher average accuracy than the second-best method. Specifically, TAG in the supervised setting achieves 4.50% points and 13.26% points higher average accuracy than that in the unsupervised setting in SVM and MLP classifiers, respectively.

Bold and underlined text denote the best and the second-best accuracy, respectively. Avg. denotes the average accuracy. Note that TAG shows the best accuracy.

Supervised Setting
Method | MUT. | PROT. | NCI1 | N109 | DD | PTC | DBLP | Avg.
GCN+GMP [41] | 82.35 | 73.56 | 63.21 | 63.19 | 62.00 | 75.35 | 79.88 | 71.36
GIN [5] | 90.49 | 75.77 | 73.94 | 70.00 | 77.78 | 73.77 | 77.31 | 77.01
ASAP [6] | 86.52 | 77.56 | 74.19 | 74.51 | 82.66 | 71.83 | 90.98 | 79.75
GMT [7] | 89.07 | 79.73 | 73.67 | 73.93 | 84.95 | 75.74 | 91.82 | 81.27

Fig 8 shows the classification accuracy and the running time of TAG and baselines in a supervised setting. Note that TAG gives the shortest running time with the highest accuracy in most of the cases. This shows that TAG efficiently learns meaningful graph representations not only for unsupervised graph classification, but also supervised one.

Fig 8. Accuracy and running time in the supervised setting. (a-g) show the performance on each dataset. Note that TAG shows the highest classification accuracy with the shortest running time for most datasets.

Effectiveness of proposed augmentations

We compare the proposed augmentations of TAG with eight previous model-agnostic augmentation algorithms for graphs. ChangeAttr modifies features and the other methods change the structure of graphs. Recall that TAG performs graph contrastive learning in two levels: node-level and graph-level. For node-level, TAG needs feature-augmented graphs. For graph-level, TAG needs feature and structure augmentations. Thus, both augmentation algorithms are necessary for TAG. MVGRL [ 9 ], GraphCL [ 10 ], and CuCo [ 13 ] are previous methods that adopt model-agnostic graph augmentations. However, MVGRL causes out-of-memory errors for large-scale graph datasets. CuCo is more elaborate than GraphCL since it additionally performs curriculum learning. Therefore, we compare TAG with previous augmentation algorithms by applying them to CuCo.

Table 5 shows the classification results using different augmentations. The accuracy is measured with an SVM classifier. TAG outperforms the baselines in most cases. Specifically, TAG achieves 5.05% points higher average accuracy than the strongest baseline, SubMix. Note that the random-based augmentations DropNode, DropEdge, GraphCrop, and ChangeAttr degrade the performance of CuCo on all datasets. This shows that random-based augmentation methods have difficulty preserving the semantics. In contrast, TAG with the proposed augmentations helps enhance the performance.

We report the best and the second-best accuracy with bold and underlined texts, respectively. Avg. denotes the average accuracy. Note that TAG presents the best accuracy among the models.

MethodMUT.PROT.NCI1N109DDPTCDBLPAvg.
CuCo + DropNode [ ]87.3173.5064.3663.8076.7560.5276.3371.80
CuCo + DropEdge [ ]88.8672.6163.7263.4177.5064.8176.3372.46
CuCo + GraphCrop [ ]88.2872.9663.2463.3277.1763.3771.5971.42
CuCo + ChangeAttr [ ]86.2373.6863.6063.7576.6661.0569.8770.69
CuCo + NodeAug [ ]82.4674.2464.4863.9379.7078.3378.7774.56
CuCo + Motif-Similarity [ ] 90.00 70.68 66.64 63.9278.7977.5077.1974.96
CuCo + NodeSam [ ]89.1176.7464.2364.69 82.05 78.98 76.42
CuCo + SubMix [ ]89.04 76.97 64.48 67.99 78.94 78.77 76.48
78.36

We also show the effectiveness of the degree-based node and edge selection of TAG for graph augmentation. We compare TAG with two different selection methods: TAG-random and TAG-reverse. TAG-random randomly selects the nodes or edges to be changed. TAG-reverse selects the nodes or edges from high to low degrees. Table 6 reports the classification accuracy of TAG and the baselines. We use SVM and MLP classifiers to measure the accuracy. Note that TAG outperforms the baselines on all datasets. Specifically, TAG achieves up to 4.36% points and 4.19% points higher average accuracy than the second-best baselines with SVM and MLP classifiers, respectively. This shows that the proposed augmentations of TAG, which consider the degree centrality, effectively improve the graph classification accuracy.

TAG-random runs TAG by randomly selecting nodes or edges to be modified. TAG-reverse augments nodes or edges relevant to high degrees. Bold, underlined, and Avg. texts denote the best, the second-best, and the average accuracy, respectively.

Unsupervised Setting (SVM)
Method | MUT. | PROT. | NCI1 | N109 | DD | PTC | DBLP | Avg.
TAG-random | 80.55 | 81.08 | 64.98 | 65.31 | 81.95 | 72.63 | 78.30 | 74.97
TAG-reverse | 89.47 | 79.28 | 67.64 | 68.93 | 83.05 | 73.53 | 78.31 | 77.17

Unsupervised Setting (MLP)
Method | MUT. | PROT. | NCI1 | N109 | DD | PTC | DBLP | Avg.
TAG-random | 88.89 | 68.75 | 56.45 | 56.26 | 69.33 | 72.02 | 68.34 | 68.58
TAG-reverse | 83.33 | 69.37 | 55.72 | 56.42 | 71.19 | 74.29 | 63.07 | 67.62

Ablation study

We perform an ablation study for TAG and report the results in Table 7. The methods w/o curriculum and w/o node-level are TAG without the curriculum learning and without the two-staged structure (performing only graph-level contrastive learning), respectively. We also run TAG while fixing the proposed augmentations. Since TAG needs both feature and structure augmentation algorithms to conduct two-staged contrastive learning, we evaluate the performance of pairs of algorithms. For example, 'Edit feature + Delete node' runs TAG using the 'edit feature' and 'delete node' algorithms for feature and structure modification, respectively.

We report accuracies of graph classification using SVM and MLP classifiers. Bold, underlined, and Avg. texts denote the best, the second-best, and the average accuracy, respectively. The methods w/o curriculum and w/o node-level refer to TAG without the curriculum learning and the node-level contrastive learning, respectively. The fixed augmentation methods (Edit feature + Delete node, Edit feature + Delete edge, etc.) run TAG by using the same feature and structure augmentations for all graphs, while TAG randomly selects an augmentation for each graph. Note that TAG shows the best performance for all cases.

MethodUnsupervised Setting (SVM)Unsupervised Setting (MLP)
MUT.PROT.NCI1N109DDPTCDBLPAvg.MUT.PROT.NCI1N109DDPTCDBLPAvg.
w/o curriculum90.0778.1465.9964.6680.3670.0878.0675.3391.1867.5854.8855.1569.7370.7776.6869.42
w/o node-level90.4077.0765.2065.2580.1469.3277.4874.9886.0966.8054.3655.0670.3470.9476.7168.61
Edit feature + Delete node90.7080.7567.2565.0681.09 80.5477.8192.7069.9058.0856.91 75.1077.3971.76
Edit feature + Delete edge90.5481.1166.6164.4180.25 79.04 83.1477.8787.5569.9659.55 58.78 68.6076.5876.9971.14
Edit feature + Cut subgraph90.8280.8566.1363.5480.0866.3078.5875.1992.47 55.1656.5867.6374.9477.3370.72
Mix feature + Delete node90.7679.93 72.91 86.32 71.9382.7179.83 92.94 70.35 65.9874.4772.9670.89
Mix feature + Delete edge90.65 81.98 73.05 73.91 85.5277.77 83.56 80.85 88.3070.0159.3257.3170.7170.2673.1769.87
Mix feature + Cut subgraph90.7778.95 74.55 82.5478.5683.0380.45 69.4560.1056.1969.40 78.00 71.89
Add noise + Delete node 90.83 80.6367.0664.3582.1171.5978.4576.4391.3768.3656.9458.1068.7072.8976.9170.47
Add noise + Delete edge90.3681.0565.4464.6380.7571.9083.3076.7891.6269.4958.8655.9270.76 76.81 77.3071.54
Add noise + Cut subgraph90.7581.3066.6265.8481.1171.2079.0476.5590.0170.7358.0158.1369.2475.5877.6771.34
73.4772.38 78.36 91.7270.65 60.53 58.63 72.52 76.06

TAG with the curriculum learning improves the classification performance of SVM and MLP by 6.20% and 3.35% points on average, respectively, compared to that without the curriculum learning. Using both node-level and graph-level contrastive learning on TAG achieves 6.55% and 4.16% points higher average accuracy than using only graph-level contrastive learning on TAG in SVM and MLP classifiers, respectively. Experimental results of fixing the proposed augmentations show higher accuracies than the methods w/o curriculum and w/o node-level. The results prove that the proposed augmentation algorithms preserve the semantics well since the accuracies of the fixed augmentation methods are comparable to TAG. Furthermore, TAG achieves the best performance when it utilizes all the proposed augmentation algorithms. The results show that the proposed ideas, i.e., the two-staged framework, exploitation of curriculum learning, and the proposed augmentation algorithms for contrastive learning improve the accuracy of graph classification.

Conclusion

We propose TAG, a two-staged contrastive curriculum learning model for graphs. We introduce two types of data augmentations for graphs and propose six model-agnostic augmentation algorithms that minimize information loss. TAG conducts contrastive curriculum learning in two stages. In the first stage, TAG gathers the relational information between nodes from an original graph and a feature-modified graph. In the second stage, the proposed method utilizes both feature-modified and structure-modified graphs to learn the similarity between them. We exploit curriculum learning to effectively train the model via a carefully selected ordering of negative samples. We evaluate TAG by measuring the graph classification accuracy and running time. TAG shows the fastest running time and the best accuracy, achieving up to 4.08% points and 4.76% points higher average accuracy than the second-best competitors in unsupervised and supervised settings, respectively. Future work includes designing an accurate graph classification method for hypergraphs.


Data Availability

Subscribe to the PwC Newsletter

Join the community, edit social preview.

graph representation with curriculum contrastive learning

Add a new code entry for this paper

Remove a code repository from this paper, mark the official implementation from paper authors, add a new evaluation result row.

TASK DATASET MODEL METRIC NAME METRIC VALUE GLOBAL RANK REMOVE
  • CONTRASTIVE LEARNING
  • GRAPH REPRESENTATION LEARNING
  • REPRESENTATION LEARNING

Remove a task

graph representation with curriculum contrastive learning

Add a method

Remove a method.

  • CONTRASTIVE LEARNING -

Edit Datasets

Adversarial curriculum graph contrastive learning with pair-wise augmentation.

16 Feb 2024  ·  Xinjian Zhao , Liang Zhang , Yang Liu , Ruocheng Guo , Xiangyu Zhao · Edit social preview

Graph contrastive learning (GCL) has emerged as a pivotal technique in the domain of graph representation learning. A crucial aspect of effective GCL is the caliber of generated positive and negative samples, which is intrinsically dictated by their resemblance to the original data. Nevertheless, precise control over similarity during sample generation presents a formidable challenge, often impeding the effective discovery of representative graph patterns. To address this challenge, we propose an innovative framework: Adversarial Curriculum Graph Contrastive Learning (ACGCL), which capitalizes on the merits of pair-wise augmentation to engender graph-level positive and negative samples with controllable similarity, alongside subgraph contrastive learning to discern effective graph patterns therein. Within the ACGCL framework, we have devised a novel adversarial curriculum training methodology that facilitates progressive learning by sequentially increasing the difficulty of distinguishing the generated samples. Notably, this approach transcends the prevalent sparsity issue inherent in conventional curriculum learning strategies by adaptively concentrating on more challenging training data. Finally, a comprehensive assessment of ACGCL is conducted through extensive experiments on six well-known benchmark datasets, wherein ACGCL conspicuously surpasses a set of state-of-the-art baselines.

Code Edit Add Remove Mark official

Tasks edit add remove, datasets edit, results from the paper edit add remove, methods edit add remove.

Graph Contrastive Representation Learning with Input-Aware and Cluster-Aware Regularization

  • Conference paper
  • First Online: 17 September 2023
  • Cite this conference paper

graph representation with curriculum contrastive learning

  • Jin Li   ORCID: orcid.org/0000-0003-3332-7790 12 ,
  • Bingshi Li   ORCID: orcid.org/0009-0008-0111-0064 12 ,
  • Qirong Zhang   ORCID: orcid.org/0000-0002-8204-9208 12 ,
  • Xinlong Chen   ORCID: orcid.org/0009-0002-1763-3122 12 ,
  • Xinyang Huang   ORCID: orcid.org/0009-0004-0814-3599 12 ,
  • Longkun Guo   ORCID: orcid.org/0000-0003-2891-4253 13 &
  • Yang-Geng Fu   ORCID: orcid.org/0000-0002-8507-9189 12  

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 14170))

Included in the following conference series:

  • Joint European Conference on Machine Learning and Knowledge Discovery in Databases

1169 Accesses

With broad applications in network analysis and mining, Graph Contrastive Learning (GCL) is attracting growing research interest. Despite its successful usage in extracting concise but useful information through contrasting different augmented graph views as an outstanding self-supervised technique, GCL is facing a major challenge in how to make the semantic information extracted well-organized in structure and consequently easily understood by a downstream classifier. In this paper, we propose a novel cluster-based GCL framework to obtain a semantically well-formed structure of node embeddings via maximizing mutual information between input graph and output embeddings, which also provides a more clear decision boundary through accomplishing a cluster-level global-local contrastive task. We further argue in theory that the proposed method can correctly maximize the mutual information between an input graph and output embeddings. Moreover, we further improve the proposed method for better practical performance by incorporating additional refined gadgets, e.g. , measuring uncertainty of clustering and additional structural information extraction via local-local node-level contrasting module enhanced by Graph Cut. Lastly, extensive experiments are carried out to demonstrate the practical performance gain of our method in six real-world datasets over the most prevalent existing state-of-the-art models.

This research was supported by the University-Industry Cooperation Project of Fujian Province, China (2023H6008) and the National Natural Science Foundation of China (12271098). Paper with appendix can be found at https://drive.google.com/file/d/1FVziwZpsq4v5oLvPz9qFr77ozkQwhvFw/view?usp=sharing .

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save.

  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
  • Available as EPUB and PDF
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

graph representation with curriculum contrastive learning

Dual-Branch Contrastive Learning for Network Representation Learning

graph representation with curriculum contrastive learning

Enhancing Heterogeneous Graph Contrastive Learning with Strongly Correlated Subgraphs

graph representation with curriculum contrastive learning

MPGCL: Multi-perspective Graph Contrastive Learning

E.g. , spectral embeddings used in Spectral Clustering [ 33 ].

Because we use GLMIMax in cluster’s level instead of the whole graph.

Chen, J., Ma, T., Xiao, C.: FastGCN: fast learning with graph convolutional networks via importance sampling. In: 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, 30 April–3 May 2018, Conference Track Proceedings. OpenReview.net (2018)

Google Scholar  

Ding, K., Xu, Z., Tong, H., Liu, H.: Data augmentation for deep graph learning: a survey. CoRR abs/2202.08235 (2022)

Ericsson, L., Gouk, H., Loy, C.C., Hospedales, T.M.: Self-supervised representation learning: introduction, advances, and challenges. IEEE Signal Process. Mag. 39 (3), 42–62 (2022). https://doi.org/10.1109/msp.2021.3134634

Article   Google Scholar  

Errica, F., Podda, M., Bacciu, D., Micheli, A.: A fair comparison of graph neural networks for graph classification. In: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, 26–30 April 2020. OpenReview.net (2020)

Grill, J.B., et al.: Bootstrap your own latent-a new approach to self-supervised learning. Adv. Neural. Inf. Process. Syst. 33 , 21271–21284 (2020)

Hamilton, W.L., Ying, Z., Leskovec, J.: Inductive representation learning on large graphs. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4–9 December 2017, Long Beach, CA, USA, pp. 1024–1034. Curran Associates Inc. (2017)

Hassani, K., Ahmadi, A.H.K.: Contrastive multi-view representation learning on graphs. CoRR abs/2006.05582 (2020)

Hjelm, R.D., et al.: Learning deep representations by mutual information estimation and maximization. In: 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, 6–9 May 2019. OpenReview.net (2019)

Jaiswal, A., Babu, A.R., Zadeh, M.Z., Banerjee, D., Makedon, F.: A survey on contrastive self-supervised learning. Technologies 9 (1), 2 (2021). https://doi.org/10.3390/technologies9010002

Jin, M., Zheng, Y., Li, Y.F., Gong, C., Zhou, C., Pan, S.: Multi-scale contrastive siamese networks for self-supervised graph representation learning. In: International Joint Conference on Artificial Intelligence 2021, Paolo, Brazil, pp. 1477–1483. Association for the Advancement of Artificial Intelligence (AAAI), CEUR-WS.org (2021)

Karypis, G., Kumar, V.: A software package for partitioning unstructured graphs, partitioning meshes, and computing fill-reducing orderings of sparse matrices. University of Minnesota, Department of Computer Science and Engineering, Army HPC Research Center, Minneapolis, MN 38 (1998)

Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Bengio, Y., LeCun, Y. (eds.) 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015, Conference Track Proceedings. Conference Track Proceedings (2015)

Kipf, T.N., Welling, M.: Variational graph auto-encoders. Stat 1050 , 21 (2016)

Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, 24–26 April 2017, Conference Track Proceedings. OpenReview.net (2017)

Lin, Z.: Some software packages for partial SVD computation. CoRR abs/1108.1548 (2011)

Mavromatis, C., Karypis, G.: Graph infoclust: leveraging cluster-level node information for unsupervised graph representation learning. CoRR abs/2009.06946 (2020)

Mernyei, P., Cangea, C.: Wiki-CS: a Wikipedia-based benchmark for graph neural networks. CoRR abs/2007.02901 (2020)

Olatunji, I.E., Funke, T., Khosla, M.: Releasing graph neural networks with differential privacy guarantees. CoRR abs/2109.08907 (2021)

Pan, L., Shi, C., Dokmanic, I.: Neural link prediction with walk pooling. CoRR abs/2110.04375 (2021)

Pan, S., Hu, R., Long, G., Jiang, J., Yao, L., Zhang, C.: Adversarially regularized graph autoencoder for graph embedding. In: Lang, J. (ed.) Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, 13–19 July 2018, Stockholm, Sweden, pp. 2609–2615. ijcai.org (2018)

Pan, S., Hu, R., Long, G., Jiang, J., Yao, L., Zhang, C.: Adversarially regularized graph autoencoder for graph embedding. In: Lang, J. (ed.) Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13–19, 2018, Stockholm, Sweden. pp. 2609–2615. ijcai.org (2018)

Park, J., Lee, M., Chang, H.J., Lee, K., Choi, J.Y.: Symmetric graph convolutional autoencoder for unsupervised graph representation learning. In: 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), 27 October–2 November 2019, pp. 6518–6527. IEEE (2019)

Paszke, A., et al.: Pytorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems, vol. 32 (2019)

Peng, Z., et al.: Graph representation learning via graphical mutual information maximization. In: Huang, Y., King, I., Liu, T., van Steen, M. (eds.) WWW 2020: The Web Conference 2020, Taipei, Taiwan, 20–24 April 2020, pp. 259–270. ACM/IW3C2 (2020)

Perozzi, B., Al-Rfou, R., Skiena, S.: Deepwalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2014, pp. 701–710. Association for Computing Machinery, New York (2014). https://doi.org/10.1145/2623330.2623732

Sato, R.: A survey on the expressive power of graph neural networks. CoRR abs/2003.04078 (2020)

Sen, P., Namata, G., Bilgic, M., Getoor, L., Galligher, B., Eliassi-Rad, T.: Collective classification in network data. AI Mag. 29 (3), 93–93 (2008)

Shchur, O., Mumme, M., Bojchevski, A., Günnemann, S.: Pitfalls of graph neural network evaluation. CoRR abs/1811.05868 (2018)

Thakoor, S., Tallec, C., Azar, M.G., Munos, R., Velickovic, P., Valko, M.: Bootstrapped representation learning on graphs. CoRR abs/2102.06514 (2021)

Velickovic, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., Bengio, Y.: Graph attention networks. Stat 1050 , 4 (2018)

Velickovic, P., Fedus, W., Hamilton, W.L., Liò, P., Bengio, Y., Hjelm, R.D.: Deep graph infomax. In: ICLR (Poster), vol. 2, no. 3, p. 4 (2019)

Vishwanathan, S.V.N., Schraudolph, N.N., Kondor, R., Borgwardt, K.M.: Graph kernels. J. Mach. Learn. Res. 11 , 1201–1242 (2010)

MathSciNet   MATH   Google Scholar  

Von Luxburg, U.: A tutorial on spectral clustering. Stat. Comput. 17 (4), 395–416 (2007)

Article   MathSciNet   Google Scholar  

Wu, F., Souza, A., Zhang, T., Fifty, C., Yu, T., Weinberger, K.: Simplifying graph convolutional networks. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, Long Beach, California, USA, vol. 97, pp. 6861–6871. PMLR (2019)

Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., Yu, P.S.: A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32 (1), 4–24 (2021). https://doi.org/10.1109/tnnls.2020.2978386

Xu, K., Hu, W., Leskovec, J., Jegelka, S.: How powerful are graph neural networks? In: 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, 6–9 May 2019. OpenReview.net (2019)

Yang, Z., Cohen, W.W., Salakhutdinov, R.: Revisiting semi-supervised learning with graph embeddings. In: Balcan, M., Weinberger, K.Q. (eds.) Proceedings of the 33nd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, 19–24 June 2016. JMLR Workshop and Conference Proceedings, vol. 48, pp. 40–48. JMLR.org (2016)

Zhao, T., Liu, Y., Neves, L., Woodford, O., Jiang, M., Shah, N.: Data augmentation for graph neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 12, pp. 11015–11023 (2021)

Zheng, S., Zhu, Z., Zhang, X., Liu, Z., Cheng, J., Zhao, Y.: Distribution-induced bidirectional generative adversarial network for graph representation learning. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, 13–19 June 2020, pp. 7222–7231. Computer Vision Foundation/IEEE (2020)

Zhu, Y., Xu, Y., Yu, F., Liu, Q., Wu, S., Wang, L.: Deep graph contrastive representation learning. CoRR abs/2006.04131 (2020)

Zhu, Y., Xu, Y., Yu, F., Liu, Q., Wu, S., Wang, L.: Graph contrastive learning with adaptive augmentation. In: Leskovec, J., Grobelnik, M., Najork, M., Tang, J., Zia, L. (eds.) WWW 2021: The Web Conference 2021, Virtual Event/Ljubljana, Slovenia, 19–23 April 2021, pp. 2069–2080. ACM/IW3C2 (2021)

Download references

Author information

Authors and affiliations.

College of Computer and Data Science, Fuzhou University, Fuzhou, China

Jin Li, Bingshi Li, Qirong Zhang, Xinlong Chen, Xinyang Huang & Yang-Geng Fu

School of Mathematics and Statistics, Fuzhou University, Fuzhou, China

Longkun Guo

You can also search for this author in PubMed   Google Scholar

Corresponding author

Correspondence to Yang-Geng Fu .

Editor information

Editors and affiliations.

University of Michigan, Ann Arbor, MI, USA

Danai Koutra

University of Vienna, Vienna, Austria

Claudia Plant

Max Planck Institute for Software Systems, Kaiserslautern, Germany

Manuel Gomez Rodriguez

Politecnico di Torino, Turin, Italy

Elena Baralis

CENTAI, Turin, Italy

Francesco Bonchi

Ethics declarations

Ethics statement.

We believe in using machine learning responsibly and ethically and in minimizing any potential harm associated with its use. We will strive to ensure the accuracy and reliability of our models. We will always respect applicable laws, regulations, and best practices and will make sure our models are used ethically and responsibly.

Rights and permissions

Reprints and permissions

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper.

Li, J. et al. (2023). Graph Contrastive Representation Learning with Input-Aware and Cluster-Aware Regularization. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science(), vol 14170. Springer, Cham. https://doi.org/10.1007/978-3-031-43415-0_39

Download citation

DOI : https://doi.org/10.1007/978-3-031-43415-0_39

Published : 17 September 2023

Publisher Name : Springer, Cham

Print ISBN : 978-3-031-43414-3

Online ISBN : 978-3-031-43415-0

eBook Packages : Computer Science Computer Science (R0)

Share this paper

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

  • Publish with us

Policies and ethics

Societies and partnerships

the ECML PKDD community

  • Find a journal
  • Track your research

IEEE Account

  • Change Username/Password
  • Update Address

Purchase Details

  • Payment Options
  • Order History
  • View Purchased Documents

Profile Information

  • Communications Preferences
  • Profession and Education
  • Technical Interests
  • US & Canada: +1 800 678 4333
  • Worldwide: +1 732 981 0060
  • Contact & Support
  • About IEEE Xplore
  • Accessibility
  • Terms of Use
  • Nondiscrimination Policy
  • Privacy & Opting Out of Cookies

A not-for-profit organization, IEEE is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity. © Copyright 2024 IEEE - All rights reserved. Use of this web site signifies your agreement to the terms and conditions.

Biomedical temporal knowledge graph reasoning via contrastive adversarial learning

New citation alert added.

This alert has been successfully added and will be sent to:

You will be notified whenever a record that you have chosen has been cited.

To manage your alert preferences, click on the button below.

New Citation Alert!

Please log in to your account

Information & Contributors

Bibliometrics & citations, recommendations, adversarial graph contrastive learning with information regularization.

Contrastive learning is an effective unsupervised method in graph representation learning. Recently, the data augmentation based contrastive learning method has been extended from images to graphs. However, most prior works are directly adapted from the ...

Knowledge Graph Entity Typing with Contrastive Learning

Knowledge graph entity typing is an important way to complete knowledge graphs (KGs), aims at predicting the associating types of certain given entities. However, previous methods suppose that many (entity, entity type) pairs can be obtained for each ...

ArieL : Adversarial Graph Contrastive Learning

Contrastive learning is an effective unsupervised method in graph representation learning. The key component of contrastive learning lies in the construction of positive and negative samples. Previous methods usually utilize the proximity of nodes in the ...

Information

Published in.

cover image ACM Other conferences

Association for Computing Machinery

New York, NY, United States

Publication History

Permissions, check for updates, author tags.

  • Adversarial training
  • Contrastive learning
  • Natural language processing
  • Temporal knowledge graph reasoning
  • Research-article
  • Refereed limited

Acceptance Rates

Contributors, other metrics, bibliometrics, article metrics.

  • 0 Total Citations
  • 3 Total Downloads
  • Downloads (Last 12 months) 3
  • Downloads (Last 6 weeks) 3

View Options

Login options.

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

View options.

View or Download as a PDF file.

View online with eReader .

HTML Format

View this article in HTML Format.

Share this Publication link

Copying failed.

Share on social media

Affiliations, export citations.

  • Please download or close your previous search result export first before starting a new bulk export. Preview is not available. By clicking download, a status dialog will open to start the export process. The process may take a few minutes but once it finishes a file will be downloadable from your browser. You may continue to browse the DL while the export process is in progress. Download
  • Download citation
  • Copy citation

We are preparing your search results for download ...

We will inform you here when the file is ready.

Your file of search results citations is now ready.

Your search export query has expired. Please try again.

