CogDL: A Comprehensive Library for Graph Deep Learning

Graph neural networks (GNNs) have attracted tremendous attention from the graph learning community in recent years. They have been widely adopted in various real-world applications from diverse domains, such as social networks and biological graphs. The research and applications of graph deep learning present new challenges, including the sparse nature of graph data, the complicated training of GNNs, and the non-standard evaluation of graph tasks. To tackle these issues, we present CogDL, a comprehensive library for graph deep learning that allows researchers and practitioners to conduct experiments, compare methods, and build applications with ease and efficiency. In CogDL, we propose a unified design for the training and evaluation of GNN models on various graph tasks, making it unique among existing graph learning libraries. By utilizing this unified trainer, CogDL can optimize the GNN training loop with several training techniques, such as mixed precision training. Moreover, we develop efficient sparse operators for CogDL, enabling it to become the most competitive graph library in terms of efficiency. Another important feature of CogDL is its focus on ease of use, with the aim of facilitating open and reproducible research on graph learning. We leverage CogDL to report and maintain benchmark results on fundamental graph tasks, which can be reproduced and directly used by the community.


INTRODUCTION
Graph-structured data have been widely utilized in many real-world scenarios. Inspired by recent trends of representation learning in computer vision (CV) and natural language processing (NLP), graph neural networks (GNNs) [34, 71] have been proposed to apply neural architectures that perform message passing over graph structures. For example, graph convolutional networks (GCNs) [48] simplify spectral graph convolutions with a first-order approximation and utilize a layer-wise propagation rule of graph convolution. GraphSAGE [37] adopts a sampling-based method to generate node embeddings for large graphs, which can be applied in the inductive setting. Recent efforts in GNNs have focused on theoretical understanding [98], generalization [26], self-supervised learning [43, 86], and the capacity to handle web-scale applications [104]. To date, GNNs have achieved impressive performance on various graph machine learning tasks in diverse domains [42].
With the prosperity of GNNs, the evaluation and fair comparison of GNN models remains a concerning issue. Although there is an enormous body of research on graph machine learning, we observe that different papers may use different evaluation settings for the same graph dataset, making results hard to compare. For example, on the widely used Cora dataset [73], some papers use the "standard" semi-supervised splits following Planetoid [102], while others adopt random splits or fully-supervised splits. Reported results of a GNN model on the same dataset may thus differ across papers, making it challenging to compare the performance of different GNN methods [22, 24, 42, 74].
Moreover, graph-structured data is fundamentally different from natural language or images, which can be easily formatted as tensors and thus straightforwardly processed by GPUs. The non-Euclidean and sparse nature of graph data requires a better storage format for efficient computation, especially on GPUs. However, current deep learning frameworks do not provide good support for sparse computation on graph-structured data. For example, the PyTorch API for sparse tensors is limited and inefficient for graph operations. To bridge the gap, several open-source efforts have been made to develop dedicated libraries for efficient graph representation learning research. For example, PyTorch Geometric (PyG) [27] is a graph learning library built upon PyTorch [62] to easily write and train GNNs for various applications. Deep Graph Library (DGL) [91] is an efficient and scalable package for deep learning on graphs, which provides several APIs allowing arbitrary message-passing computation over large-scale graphs. However, these popular graph libraries mainly focus on implementing basic GNN operators and do not take the whole training and evaluation pipeline of GNNs into consideration. The end-to-end performance of training GNNs could still be improved.
Present work: CogDL. In this paper, we introduce CogDL, a comprehensive library for graph deep learning that allows researchers, practitioners, and developers to train and evaluate GNN models on different graph tasks with ease and efficiency. The advantage of CogDL lies in its design of a unified GNN trainer, standard evaluation, and modular wrappers. Specifically, CogDL unifies the training and evaluation of graph neural network (GNN) models, which distinguishes it from other graph learning libraries. In addition, it is equipped with decoupled modules that can be flexibly plugged in for training GNNs, including ModelWrapper and DataWrapper. CogDL provides built-in GNN-specific wrappers that are not offered by other libraries.
The goal of CogDL is to accelerate research and applications of graph deep learning. Based on the design of the unified trainer and decoupled modules, CogDL users can instantiate a GNN trainer by trainer = Trainer(...) and then call trainer.run(...) to conduct experiments easily, which is not supported by existing graph libraries. We also provide a high-level API (i.e., experiment) on top of Trainer for one-line usage:

Listing 1: One-line code to conduct experiments.
from cogdl import experiment
experiment(dataset=["cora", "citeseer"], model=["gcn", "gat", "gcnii"])

Using the unified trainer not only saves users' time in writing additional code but also gives users the opportunity to enable many built-in training techniques, such as mixed precision training. For example, users only need to set Trainer(fp16=True) to enable mixed precision training without any other modification. Furthermore, CogDL provides as many tasks with standard evaluation as possible (as shown in Table 1) and maintains experimental results on these fundamental graph tasks with recorded hyperparameter configurations. We summarize the characteristics of CogDL as ease of use, end-to-end performance, and reproducibility.

Relevance to Web. GNNs have been widely used in web research and applications, such as social influence prediction [64, 68], social recommendation [25], and web advertising [103]. However, there still remain several challenges in applying powerful GNN models to real web applications. Practitioners need to not only choose an appropriate GNN but also train the GNNs on large-scale graphs. To mitigate these issues, CogDL provides unique advantages for web research and applications. First, the well-optimized sparse operators can speed up new research and applications. Besides, the unified trainer with built-in wrappers can be directly utilized to conduct experiments with ease and efficiency. In addition, CogDL provides benchmarking results of GNN methods for all kinds of graph tasks, which could help web-related users choose appropriate methods. Finally, we have successfully applied CogDL to a real-world academic system, AMiner. More details about the application can be found in Appendix B.

THE COGDL FRAMEWORK
In this section, we present the unified design of the CogDL library, which is built on PyTorch [62], a well-known deep learning library. Based on PyTorch, CogDL provides efficient implementations of many models and our designed sparse operators. To date, it supports 10 important graph tasks, such as node classification, and its components could be used for other scenarios. Our unified design also supports precomputation-based GNN models, such as SIGN [28] and SAGN [75].
Based on the design of the unified trainer and decoupled modules, we can arbitrarily combine models, model wrappers, and data wrappers. For example, if we want to apply DGI, a representative graph self-supervised learning method, to large-scale datasets, all we need to do is substitute the full-graph data wrapper with the neighbor-sampling or clustering data wrapper, without additional modifications. If we design a new GNN architecture for a specific task, we only need to write essential PyTorch-style code for the model; the rest can be handled automatically by CogDL by specifying the model wrapper and the data wrapper. We can quickly conduct experiments for the model using the trainer via trainer = Trainer(epochs, ...) and trainer.run(...). Moreover, based on the unified trainer, CogDL provides native support for many useful features, including hyperparameter optimization, efficient training techniques, and experiment management, without any modification to the model implementation.
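To make the decoupled design concrete, the following is a minimal conceptual sketch (not CogDL's actual internals) of a trainer that depends only on two interchangeable hooks; swapping the data wrapper, e.g. from full-graph to sampled batches, requires no change to the model wrapper. All class and method names here are illustrative.

```python
# Conceptual sketch of a decoupled trainer: the training loop only talks
# to a ModelWrapper and a DataWrapper, so either can be swapped freely.

class DataWrapper:
    """Yields training batches; subclasses decide full-graph vs. sampled."""
    def __init__(self, graph):
        self.graph = graph

    def train_batches(self):
        yield self.graph  # full-graph training: one "batch" per epoch

class ModelWrapper:
    """Wraps a model with its task-specific training step."""
    def __init__(self, model):
        self.model = model

    def train_step(self, batch):
        return self.model(batch)  # placeholder for a real loss computation

class Trainer:
    def __init__(self, epochs=2):
        self.epochs = epochs

    def run(self, model_wrapper, data_wrapper):
        losses = []
        for _ in range(self.epochs):
            for batch in data_wrapper.train_batches():
                losses.append(model_wrapper.train_step(batch))
        return losses

# Any model/data-wrapper combination plugs into the same loop:
toy_graph = {"num_nodes": 3}
mw = ModelWrapper(lambda g: g["num_nodes"] * 0.1)
losses = Trainer(epochs=2).run(mw, DataWrapper(toy_graph))
print(len(losses))  # 2
```

Replacing DataWrapper with a subclass that yields sampled subgraphs would change the training regime without touching Trainer or ModelWrapper, which is the point of the decoupling.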
We introduce efficient GNN training techniques that are enabled by the unified trainer of CogDL. These features can be turned on via fp16=True or actnn=True in the Trainer API.

Mixed Precision Training. CogDL supports mixed precision training, a popular technique to reduce GPU memory usage and speed up training. PyTorch provides a convenient interface for mixed precision in torch.cuda.amp, its native successor to NVIDIA Apex. We conduct experiments to test the performance of mixed precision training.
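A back-of-envelope calculation shows where the memory savings come from. The sketch below estimates the activation memory of a single 2048-unit hidden layer on a Reddit-scale graph; the node count (~233K) is an approximation, not an exact figure from this paper.

```python
# Rough activation memory for one 2048-unit hidden layer on a
# Reddit-scale graph (~233K nodes, approximate).
num_nodes, hidden = 233_000, 2048

def activation_mb(bytes_per_value):
    return num_nodes * hidden * bytes_per_value / 2**20

fp32 = activation_mb(4)  # float32: 4 bytes per value
fp16 = activation_mb(2)  # float16: 2 bytes per value
print(round(fp32), round(fp16))  # 1820 910
```

Halving each stored activation is the ideal case; the observed 27% end-to-end saving (Table 2) is smaller because weights, optimizer state, and some tensors remain in fp32.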

Activation Compression Training. With the emergence of deep GNNs, activation footprints occupy more and more GPU memory: the activation output of each layer must be stored in the forward pass to compute gradients in the backward pass, which costs much GPU memory during training. We extend the activation compression training technique [16, 54] to GNNs. Its main advantage is that the GPU memory required for training can decrease dramatically.
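The core idea can be sketched in a few lines: quantize saved activations to low-bit integers after the forward pass and dequantize them when the backward pass needs them. This is only a minimal illustration of the principle, not the actual algorithm of [16, 54].

```python
# Minimal sketch of activation compression: store activations as low-bit
# integer codes plus (offset, scale), restore them for the backward pass.

def quantize(values, bits=8):
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (2**bits - 1) or 1.0  # guard against all-equal input
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    return [lo + c * scale for c in codes]

acts = [0.0, 0.5, 1.0, 1.5, 2.0]           # activations to stash
codes, lo, scale = quantize(acts, bits=8)   # 1 byte per value vs. 4 for fp32
restored = dequantize(codes, lo, scale)
print(max(abs(a - r) for a, r in zip(acts, restored)) < 0.01)  # True
```

With 8-bit codes the stored footprint drops by roughly 4x versus fp32 at the cost of a small reconstruction error; real systems push further with even fewer bits and per-group statistics.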
Based on the provided trainer, CogDL also supports the following features for conducting experiments with ease.
Hyperparameter Optimization. Hyperparameter optimization (HPO) is an important feature for GNN libraries, since current GNN models use more hyperparameters than before. We integrate a popular library, Optuna [4], into CogDL to enable HPO. Optuna is a fast AutoML library based on Bayesian optimization. CogDL implements hyperparameter search, and even model search, for graph learning. The key step of HPO is to define a search space, which CogDL then uses to search automatically and output the best results. The usage of HPO can be found in Appendix A.2.
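The flow CogDL automates with Optuna can be sketched with plain random search: define a search space, sample configurations, and keep the best validation score. The `train_and_eval` function below is a toy stand-in for a real training run, and the search-space values are illustrative.

```python
import random

# Random-search sketch of the HPO flow (Optuna replaces the sampling
# strategy with Bayesian optimization in the real pipeline).
random.seed(0)
search_space = {"lr": [0.01, 0.005, 0.001], "hidden_size": [64, 128, 256]}

def train_and_eval(cfg):
    # Toy stand-in for validation accuracy; a real run trains a GNN here.
    return 0.8 + 0.1 * (cfg["lr"] == 0.01) + 0.05 * (cfg["hidden_size"] == 128)

best_cfg, best_score = None, -1.0
for _ in range(10):  # 10 trials
    cfg = {k: random.choice(v) for k, v in search_space.items()}
    score = train_and_eval(cfg)
    if score > best_score:
        best_cfg, best_score = cfg, score
print(best_cfg, round(best_score, 2))
```

Bayesian optimization improves on this loop by using past trials to propose promising configurations instead of sampling uniformly.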
Experiment Management. Experiment management is crucial for training deep learning models and can be utilized by researchers and developers for debugging and designing new blocks. CogDL integrates TensorBoard and WandB [6] for experiment management. TensorBoard is a common visualization toolkit for tracking metrics like loss and accuracy. WandB is a central dashboard for keeping track of experimental status. Experiment management can be enabled easily by assigning the logger argument.

Efficient Sparse Operators
We introduce the well-optimized sparse operators of CogDL, including SpMM-like operators, multi-head SpMM, and edge-wise softmax, which are developed and implemented in CogDL for GNN models. We first introduce the abstraction of sparse operators in CogDL and its corresponding optimization. The General Sparse Matrix-Matrix multiplication (GSpMM) operator is widely used in most GNNs, because many GNNs apply a general aggregation operation for each node over its neighbors:

h_v^(l+1) = ⊕_{(u,e,v)∈E} ( m_uv ⊗ h_u^(l) ),    (1)

where (u, e, v) denotes an edge e from node u to node v with a message m_uv, and h_v^(l) is the hidden representation of node v at layer l. When the reduce operation ⊕ is a summation and the compute operation ⊗ is a multiplication, this aggregation can be described as an SpMM operator H^(l+1) ← A H^(l), where the sparse matrix A is the adjacency matrix of the input graph G, and each row of H^(l) is the representation vector of a node (e.g., h_u^(l) and h_v^(l)). We observe that in existing graph neural network libraries such as PyG [27] and DGL [91], the backend support is not tailored to GNNs. In PyG, the message-passing flow is built on torch-scatter, which borrows the design principles of traditional graph processing. In DGL, the GSpMM function leverages APIs from cuSPARSE [59], a vendor library that mainly targets scientific computing on sparse matrices. Another observation is that many previous GPU-based SpMM works are not aware of the adaptability of workload balance in graphs. For example, GE-SpMM [44] and Sputnik [30] are row-split SpMM methods that cannot efficiently support graphs with a power-law degree distribution. CogDL is fully aware of the workload balance problem, in contrast to PyG/DGL. We observe that the neighbor-group algorithm [45, 93] can better slice high-degree nodes in the graph and thus has better efficiency on power-law graphs. However, some graphs naturally conform to node load balancing, such as a k-sampled graph [37], so it is better to use the row-split backend directly.
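The computation that the GSpMM kernel fuses can be written out in a few lines of pure Python: a sum-aggregation layer is exactly H' = A H with A stored in compressed sparse row (CSR) form. This is a reference sketch of what the GPU kernel computes, not the kernel itself.

```python
# H' = A @ H with A in CSR form (row pointers + column indices + values):
# reduce = sum, compute = multiply, per Equation 1.

def spmm_csr(indptr, indices, values, H):
    n_rows, dim = len(indptr) - 1, len(H[0])
    out = [[0.0] * dim for _ in range(n_rows)]
    for row in range(n_rows):
        for k in range(indptr[row], indptr[row + 1]):
            col, w = indices[k], values[k]    # neighbor index and edge weight
            for d in range(dim):
                out[row][d] += w * H[col][d]  # accumulate weighted features
    return out

# 3-node graph: node 0 aggregates from nodes 1 and 2; node 1 from node 2.
indptr, indices, values = [0, 2, 3, 3], [1, 2, 2], [1.0, 1.0, 1.0]
H = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
print(spmm_csr(indptr, indices, values, H))  # [[2.0, 3.0], [2.0, 2.0], [0.0, 0.0]]
```

The inner loop over a row's nonzeros is what row-split and neighbor-group strategies partition differently across GPU warps.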
To tackle these problems, we propose a new balance abstraction that not only exploits the data of the graph itself but also effectively accelerates our backend. As shown in Listing 2, our workload balance handler is generated by a data-aware balancer, which indicates whether to use the row-split or neighbor-group algorithm. Empirically, when the degrees of the majority (90%) of the nodes are close to the average degree, we choose row-split as our backend, since it suits graphs that are naturally node-balanced. Note that our workload balancer is a preprocessing step, since it generates the auxiliary arrays needed by the GSpMM backend. Listing 2 shows that the inputs of our GSpMM include a workload balance handler, graph data, and input features. Users can choose the reduce and compute operators, as illustrated in Equation 1. In our GSpMM design, we implement a warp-based thread mapping algorithm that binds different neighbor groups or split rows to different warps. This allows us to assign multiple adjacent rows or neighbor groups to a single CUDA block. Since the same processor shares an L1 cache, adjacent rows or neighbor groups reduce the data loaded into L1 and improve locality. Multi-head SpMM is used to compute the aggregation of GNN models with multi-head attention mechanisms (such as GAT), where each edge has a different weight per attention head, so multiple SpMM operations are needed. PyG and DGL implement it as multiple SpMMs with broadcasting. In our multi-head SpMM kernel, we apply the memory coalescing technique by allocating consecutive warps along the head dimension. As different heads share the same locations of non-zero elements, this data is cached by the streaming multiprocessors and reused across heads, which also saves bandwidth.
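The balancer's decision rule can be sketched directly from the heuristic above: if the bulk of the nodes (90%) have degrees close to the average, the graph is naturally balanced and row-split suffices; otherwise a neighbor-group split handles the heavy-tailed rows. The 2x tolerance band below is an illustrative choice, not CogDL's exact rule.

```python
# Data-aware backend selection sketch based on the degree distribution.

def choose_backend(degrees, frac=0.9, tol=2.0):
    avg = sum(degrees) / len(degrees)
    close = sum(1 for d in degrees if d <= tol * avg)  # "near-average" nodes
    return "row-split" if close / len(degrees) >= frac else "neighbor-group"

regular = [4, 5, 4, 6, 5, 4, 5, 6]      # near-uniform degrees (e.g. k-sampled)
power_law = [1, 1, 1, 1, 2, 2, 3, 500]  # one hub dominates the work
print(choose_backend(regular), choose_backend(power_law))  # row-split neighbor-group
```

Because the decision depends only on degrees, it can run once as preprocessing alongside the construction of the auxiliary arrays for the GSpMM backend.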
When considering a GAT model, we also need a sampled dense-dense matrix multiplication (SDDMM) operator C = A ⊙ (X Yᵀ), where C and A are sparse matrices, X and Y are dense matrices, and ⊙ is element-wise matrix multiplication. The SDDMM operator is used for back-propagating gradients to the sparse adjacency matrix, since the adjacency matrix of the GAT model is computed by the attention mechanism. CogDL further speeds up multi-head graph attention by optimizing edge-wise softmax, defined as α'_ij = exp(α_ij) / Σ_{k∈N(i)} exp(α_ik), where α_ij is the attention score between nodes i and j.
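The defining property of SDDMM is that the dense product X Yᵀ is only evaluated at A's nonzero positions, which is exactly what back-propagating into a sparse attention matrix requires. A minimal reference sketch, with A given as a COO list of (row, col, value) triples:

```python
# SDDMM sketch: C = A ⊙ (X · Yᵀ), computed only where A is nonzero.

def sddmm(A_coo, X, Y):
    out = []
    for i, j, a in A_coo:
        dot = sum(x * y for x, y in zip(X[i], Y[j]))  # (X Y^T)[i][j]
        out.append((i, j, a * dot))                    # mask by A's value
    return out

A_coo = [(0, 1, 1.0), (1, 0, 2.0)]  # two nonzero positions
X = [[1.0, 2.0], [3.0, 4.0]]
Y = [[1.0, 0.0], [0.0, 1.0]]
print(sddmm(A_coo, X, Y))  # [(0, 1, 2.0), (1, 0, 6.0)]
```

For a graph with |E| edges and d-dimensional features this costs O(|E| d) rather than the O(|V|² d) of a full dense product.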
In our design of the edge-wise softmax kernel, we use warp-level intrinsics provided by NVIDIA, a common method for accelerating scan and reduce operations. To prevent overflow of the exponent, we first apply a scan to find the maximum edge weight and subtract this maximum from each value. We then apply the exponential function to each value and reduce all values within the warp to obtain the sum of the exponentials. As far as we know, CogDL is the first framework that provides a GPU-based fused edge-wise softmax kernel; therefore, we do not report comparison experiments against PyG and DGL for this kernel.
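The max-subtraction trick described above can be written out in pure Python: normalize the attention scores over one node's incoming edges without overflowing the exponent.

```python
import math

# Numerically stable softmax over one node's edge scores:
# scan for the max, shift, exponentiate, reduce, normalize.

def edge_softmax(scores):
    m = max(scores)                            # scan: find the max
    exps = [math.exp(s - m) for s in scores]   # shift, then exponentiate
    total = sum(exps)                          # reduce: sum the exponentials
    return [e / total for e in exps]

scores = [1000.0, 1001.0, 1002.0]  # naive exp() would overflow here
probs = edge_softmax(scores)
print([round(p, 4) for p in probs])  # [0.09, 0.2447, 0.6652]
```

On the GPU, the scan and reduce steps map onto warp-level intrinsics, which is what makes the fused kernel fast.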

EVALUATION
In this section, we conduct extensive experiments to evaluate the framework design and characteristics of CogDL.

Training Techniques
For mixed precision training, we run a 2-layer GCN model with 2048 hidden units on the Reddit dataset. As shown in Table 2, mixed precision training brings 27% memory savings and 1.44× ∼ 2.02× speedup on NVIDIA 2080 Ti and 3090 GPUs with almost no performance drop. For activation compression training, we conduct experiments with a 2-layer GCN model with 256 hidden units.

Table 4: End-to-end inference time in seconds of 2-layer GCN and GAT models with hidden size 128. The GAT model uses 4 attention heads. OOM means out of memory. PyG is more likely to run out of memory because its torch-scatter approach stores temporary edge embeddings in GPU memory.

End-to-End Performance
We compare the end-to-end inference time of GCN and GAT models on several datasets against other popular GNN frameworks: CogDL v0.5.2, PyTorch Geometric (PyG) v2.0.2, and Deep Graph Library (DGL) v0.7.2 with the PyTorch backend.

BENCHMARKING
In this section, we use CogDL to introduce several downstream tasks for evaluating the implemented methods. The statistics of the datasets are shown in Table 9 and Table 10 in the Appendix. Based on the experiment API, we build a reliable leaderboard for each task, which maintains state-of-the-art results.

Network Embedding
Setup. The unsupervised node classification task aims to learn a mapping function f : V → R^d that projects each node into a d-dimensional space (d ≪ |V|) in an unsupervised manner. The mapping function should capture the structural properties of the network. We collect the most popular datasets used for this task, including BlogCatalog [108], Wikipedia [67], PPI [10], DBLP [81], and Flickr [82]. We implement and compare the following methods: SpectralClustering [83], DeepWalk [63], LINE [80], node2vec [36], GraRep [11], HOPE [61], NetMF [67], ProNE [110], and NetSMF [66]. Skip-gram network embedding treats the vertex paths traversed by random walks over the network as sentences and leverages Skip-gram to learn latent vertex representations. Matrix factorization based methods first compute a proximity matrix and then factorize it to obtain the embeddings. In fact, NetMF [67] has shown that the aforementioned Skip-gram models with negative sampling can be unified into a matrix factorization framework with closed forms. We run all network embedding methods on five real-world datasets and report the Micro-F1 results (%) of logistic regression with L2 normalization.
Results and Analysis. Table 5 shows the results, from which we make several observations. DeepWalk and node2vec consider vertex paths traversed by random walks to reach high-order neighbors. NetMF and NetSMF factorize a diffusion matrix Σ_{r=0}^{T} A^r rather than the adjacency matrix A. ProNE and LINE are essentially first-order methods, but ProNE further propagates the embeddings to enlarge the receptive field. Incorporating global information can improve performance but may hurt efficiency. The propagation in ProNE, which is similar to graph convolution, shows that incorporating global information as a post-processing operation is effective. The results demonstrate that matrix factorization (MF) methods like NetMF and ProNE remain very powerful, outperforming Skip-gram based methods on almost all datasets. ProNE and NetSMF are also highly efficient and scalable, able to embed super-large graphs in feasible time on a single machine.
Node Classification
Results and Analysis. Table 6 and Table 7 summarize the evaluation results of all compared models on semi-supervised and fully-supervised datasets, respectively, under the node classification setting. We observe that incorporating high-order information plays an important role in improving model performance, especially on citation datasets (Cora, CiteSeer, and PubMed) that are of relatively small scale. Most high-order models, such as GRAND, APPNP, and GDC, use a graph diffusion matrix Ā = Σ_{k=0}^{K} θ_k Â^k to collect information from distant neighbors. GCNII uses residual connections with identity mapping to resolve over-smoothing in GNNs and achieves remarkable results. GRAND [26] leverages unlabelled data in the semi-supervised setting and obtains state-of-the-art results. As for self-supervised methods, DGI and MVGRL maximize local-global mutual information. MVGRL performs better by replacing the adjacency matrix with a graph diffusion matrix, but is less scalable. In contrast, GRACE, which optimizes InfoNCE, does not perform well on citation datasets with the public split.
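A toy calculation illustrates why the diffusion matrix reaches distant neighbors: powers of the adjacency matrix connect k-hop pairs, and the weighted sum mixes them. The weights below and the use of the raw (unnormalized) adjacency matrix are illustrative simplifications of what APPNP/GDC-style models actually use.

```python
# Ā = θ_0 I + θ_1 A + θ_2 A² on a 3-node path graph 0 - 1 - 2:
# node 0 reaches node 2 only through the 2-hop term.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A2 = matmul(A, A)
theta = [0.5, 0.3, 0.2]  # illustrative diffusion coefficients

A_bar = [[theta[0] * I[i][j] + theta[1] * A[i][j] + theta[2] * A2[i][j]
          for j in range(3)] for i in range(3)]
print(A_bar[0][2])  # 0.2 -> the 2-hop neighbor now contributes
```

In A itself the (0, 2) entry is zero; the diffusion matrix gives it weight θ_2, which is exactly the "information from distant neighbors" that these high-order models exploit.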

Graph Classification
Setup. Graph classification assigns a label to each graph and aims to map graphs into vector spaces. Graph kernels are historically dominant: they employ a kernel function to measure the similarity between pairs of graphs and map graphs into vector spaces with deterministic mapping functions, but they suffer from computational bottlenecks. Recently, graph neural networks have attracted much attention and indeed show promising results on this task. In the context of graph classification, GNNs often employ a readout operation to obtain a compact representation of the graph and classify based on the readout representation. We collect eight popular benchmarks used in graph classification tasks, including bioinformatics datasets (PROTEINS, MUTAG, PTC, and NCI1) and social networks (IMDB-B, IMDB-M, COLLAB, REDDIT-B). We implement the following graph classification models and compare their results: GIN [98], DiffPool [105], SAGPool [51], SortPool [111], DGCNN [94], and Infograph [76] are based on GNNs; PATCHY_SAN [60] is inspired by convolutional neural networks; deep graph kernels (DGK) [100] and graph2vec [58] use graph kernels. As for evaluation, for supervised methods we adopt 10-fold cross-validation with a 90%/10% split and repeat 10 times; for unsupervised methods, we perform 10-fold cross-validation with LIBSVM [14]. We then report the classification accuracy of all methods on these datasets.
Results and Analysis. Table 8 reports the results of the aforementioned models on both unsupervised and supervised graph classification. The development of GNNs for graph classification proceeds mainly along two lines. One line (like GIN) aims to design more powerful convolution operations to improve expressiveness. The other line develops effective pooling methods to generate the graph representation. Neural network based methods show promising results on bioinformatics datasets (MUTAG, PTC, PROTEINS, and NCI1), where nodes come with given features. But on social networks (IMDB-B, IMDB-M, COLLAB, REDDIT-B) lacking node features, methods based on graph kernels achieve good performance and even surpass neural networks. Graph kernels are more capable of capturing structural information to discriminate non-isomorphic graphs, while GNNs are better encoders of features. Global pooling, as used in GIN and SortPool, collects node features and applies a readout function. Hierarchical pooling, such as DiffPool and SAGPool, is proposed to capture structural information at different graph levels, including nodes and subgraphs.
The experimental results indicate that, though hierarchical pooling seems more complex and intuitively would capture more information, it does not show significant advantages over global pooling.
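The two pooling styles can be contrasted in miniature: global readout aggregates all node features at once, while a (heavily simplified) hierarchical variant first pools node clusters and then reads out the cluster representations. The cluster assignment below is assumed, not learned.

```python
# Global vs. hierarchical pooling with a plain sum readout.

def readout(features):
    dim = len(features[0])
    return [sum(f[d] for f in features) for d in range(dim)]

feats = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [1.0, 1.0]]
global_repr = readout(feats)                      # one shot over all nodes

clusters = [[0, 1], [2, 3]]                       # assumed cluster assignment
cluster_reprs = [readout([feats[i] for i in c]) for c in clusters]
hier_repr = readout(cluster_reprs)                # pool clusters, then readout

print(global_repr, hier_repr)  # [4.0, 4.0] [4.0, 4.0]
```

With a plain sum and no learned per-cluster transform the two coincide, which hints at the empirical finding above: hierarchy alone does not guarantee gains; what matters is what is learned between the pooling levels.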

Other Graph Tasks
As shown in Table 1, CogDL also provides other important graph learning tasks. In this section, we briefly introduce the setting and evaluation of some of them, including heterogeneous node classification, link prediction, and multiplex link prediction.
Heterogeneous Node Classification. This task is built for heterogeneous GNNs applied to heterogeneous graphs such as academic graphs. For heterogeneous node classification, we use macro-F1 to evaluate the performance of all heterogeneous models under the setting of Graph Transformer Networks (GTN) [107].
Link Prediction. Many real applications, such as citation link prediction, can be viewed as link prediction tasks [47]. We formulate link prediction as follows: given a subset of the edges in the graph, after learning a representation for each vertex v ∈ V, the method needs to accurately distinguish the held-out edges from random node pairs. In practice, we remove 15 percent of the original edges in each dataset and adopt ROC AUC [38] as the evaluation metric.
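The evaluation protocol can be sketched end to end: score the held-out (positive) edges and random (negative) pairs, then compute ROC AUC as the probability that a positive outranks a negative (counting ties as half). The scores below are illustrative placeholders for an actual model's outputs.

```python
# Rank-based ROC AUC over positive (held-out edge) and negative
# (random pair) scores.

def roc_auc(pos_scores, neg_scores):
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

pos = [0.9, 0.8, 0.4]  # scores of true held-out edges
neg = [0.3, 0.5, 0.1]  # scores of random node pairs
print(roc_auc(pos, neg))  # 8/9
```

An AUC of 0.5 corresponds to random scoring and 1.0 to perfect separation, which is why it is a split-size-independent metric for this task.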
Multiplex Link Prediction. GATNE [13] formalizes the representation learning problem on multiplex networks and proposes a unified framework to solve it in both transductive and inductive settings. We follow the setting of GATNE [13] to build the multiplex heterogeneous link prediction task. In CogDL, for methods that only handle homogeneous networks, we separate the graphs based on edge types and train the method on each separate graph. We also adopt ROC AUC as the evaluation metric.
Knowledge Graph Completion. This task aims to predict missing links in a knowledge graph through knowledge graph embedding. In CogDL, we implement two families of knowledge graph embedding algorithms: triplet-based knowledge graph embedding and knowledge-based GNNs. The former includes TransE [9], DistMult [101], ComplEx [84], and RotatE [78]; the latter includes RGCN [72] and CompGCN [85]. We evaluate all implemented algorithms on standard benchmarks, including FB15k-237, WN18, and WN18RR, with the Mean Reciprocal Rank (MRR) metric.
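As a concrete instance of the triplet-based family, TransE models a triplet (h, r, t) as a translation h + r ≈ t and scores it by the negative distance ||h + r − t||. The 2-dimensional toy embeddings below are illustrative, not learned.

```python
# TransE scoring sketch: a plausible triplet scores higher (closer to 0)
# than an implausible one.

def transe_score(h, r, t):
    dist = sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5
    return -dist  # higher score = better fit

head, rel = [1.0, 0.0], [0.0, 1.0]
good_tail = [1.0, 1.0]  # exactly head + rel
bad_tail  = [3.0, 3.0]
print(transe_score(head, rel, good_tail) > transe_score(head, rel, bad_tail))  # True
```

MRR evaluation then ranks every candidate tail by this score for each test triplet and averages the reciprocal ranks of the true tails.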
RELATED WORK
PyG provides CUDA kernels for sparse operators with high data throughput. PyG 2.0 further utilizes GraphGym [106] to offer new features like reproducible configuration and hyperparameter optimization. DGL not only provides flexible and efficient message-passing APIs but also allows users to easily port and leverage existing components across multiple deep learning frameworks (e.g., PyTorch [62], TensorFlow [2]). In contrast, CogDL not only supports flexible implementations of GNN models but also provides a unified trainer for the end-to-end training of GNN models with efficient techniques.

CONCLUSIONS
This paper introduces CogDL, a comprehensive library for graph deep learning that focuses on the research and applications of graph representation learning. CogDL provides simple APIs such that users can train and evaluate graph learning methods with ease and efficiency. Besides, CogDL implements many state-of-the-art models with wrappers for various graph tasks. Finally, CogDL collects and maintains leaderboards with reproducible configurations on widely-used public datasets. In the future, we will continuously improve the efficiency of CogDL and support more features to satisfy the latest needs of both scientific and industrial applications. For example, the dynamic scenario of graph learning, where nodes and edges are continuously added and removed, needs to be considered. Besides, graph self-supervised learning and pre-training have become among the most exciting research directions in recent years, which presents a new challenge for graph libraries.

A COGDL PACKAGE
In this section, we introduce the details of the CogDL library.

A.1 Basic Components
The basic elements in CogDL include Graph, Dataset, and Model.
A.2 Hyperparameter Search
Producing results with one line of code provides a convenient way to run experiments. Furthermore, we integrate a popular library, Optuna [4], into CogDL to enable hyperparameter search. By feeding a search_space function that defines the search space of hyperparameters, CogDL starts the search over the validation metric and outputs the best results.

B APPLICATIONS
Our library has been successfully applied to a large-scale academic mining and search system, AMiner (https://www.aminer.cn/), for the analysis of papers and scholars, as illustrated in Figure 3. There are more than 320 million papers with more than 1 billion citation links in the AMiner database. Most applications (e.g., paper tagging) in AMiner are related to academic papers and scholars. Therefore, it is important to obtain accurate embeddings for papers and scholars for downstream tasks. Take paper tagging as an example. Each publication in AMiner has several tags extracted from its raw text (e.g., title and abstract). Publications with citation links tend to have similar tags, so we can utilize the citation network to improve the quality of tags. The paper tagging problem can be considered a multi-label node classification task, where each label represents a tag. We can utilize paper embeddings to improve the quality of paper tags, since we find many papers missing tags in the AMiner database. With the help of the AMiner team, we trained a downstream classifier for the paper tagging task on 4.8 million papers in computer science. The experimental results demonstrate that these embeddings help improve the recall by 12.8% compared with the previous method. In addition to the paper tagging task, the embeddings generated by CogDL also help the paper recommendation of AMiner. Specifically, the AMiner homepage recommends papers to users based on their historical behaviors, and the quality of the paper embeddings plays an important role in the recommendation. We hope that our library can help more users build better services, especially web-related applications, benefiting the graph learning community.
• Ease of use. We provide simple APIs in CogDL such that users only need to write one line of code to train and evaluate graph representation learning methods. CogDL develops a unified trainer with decoupled modules for users to train GNNs easily. Based on this unique design, CogDL can provide extensive features such as hyperparameter optimization, efficient training techniques, and experiment management.
• End-to-end performance. Efficiency is one of the most significant characteristics of CogDL. CogDL develops well-optimized sparse kernel operators to speed up the training of GNN models. For example, these efficient sparse operators enable CogDL to achieve about 2× speedups on the 2-layer GCN [48] and GAT [87] models compared with PyG and DGL.
• Reproducibility. Users can quickly conduct experiments to reproduce results with CogDL, as illustrated in Listing 1. CogDL implements 70+ models and collects 70+ datasets for 10 fundamental graph tasks, as shown in Table 1, facilitating open, robust, and reproducible deep learning research on graphs.

Table 1 :
All supported tasks, datasets, and models in CogDL. To date, CogDL integrates 10 important graph tasks with standard evaluation on 70+ datasets. Besides, CogDL implements 70+ graph representation methods, which can be directly used by users. Note that some datasets and methods can be used in several tasks.

Table 2 :
Performance of mixed precision training.
The statistics of the datasets can be found in Table 9. We conduct experiments using Python 3.7.10 and PyTorch v1.8.0 on servers with NVIDIA GeForce RTX 2080 Ti (11GB GPU memory) and 3090 (24GB GPU memory) GPUs. From Table 4, CogDL achieves up to 2× speedup on the 2-layer GCN model compared with PyG and DGL. For the 2-layer GAT model, CogDL achieves 1.32× ∼ 2.36× speedup compared with PyG and DGL. Furthermore, OOM occurs when running PyG's GAT model on the Reddit and Yelp datasets, even on the 3090 (24GB). The GAT model implemented by DGL also cannot run on the Yelp dataset with the 2080 Ti (11GB). These results demonstrate significant advantages of CogDL in inference time and memory savings compared to state-of-the-art GNN frameworks.

Table 5 :
Micro-F1 score (%) of network embedding methods reproduced by CogDL. 50% of nodes are labeled for training in PPI, BlogCatalog, and Wikipedia, and 5% in DBLP and Flickr. These datasets correspond to different downstream scenarios.

Table 6 :
Accuracy (%) reproduced by CogDL for semi-supervised and self-supervised node classification on citation datasets. ↓ and ↑ mean our results are lower or higher than the results in the original papers.

Table 7 :
Results of node classification on fully-supervised datasets, including full-batch and sampling-based methods. Flickr, Reddit, and ogbn-arxiv use the accuracy metric, whereas PPI uses the Micro-F1 metric.

Table 8 :
Accuracy results (%) of both unsupervised and supervised graph classification. ↓ and ↑ mean our results are lower or higher than the results in the original papers. ‡ means the experiment did not finish within 24 hours for one seed.
Graph. The Graph is the basic data structure of CogDL for storing graph data, with abundant graph operations. Graph supports both convenient graph modification, such as removing/adding edges, and efficient computation, such as SpMM. We also provide common graph manipulations, including adding self-loops, graph normalization, sampling neighbors, and obtaining induced subgraphs.

Dataset. The dataset component reads in data and processes it to produce tensors of appropriate types. Each dataset specifies its loss function and evaluator through the functions get_loss_fn and get_evaluator for training and evaluation. Our package provides many real-world datasets and easy-to-use interfaces for users to define customized datasets. CogDL provides two ways to construct a customized dataset. One way is to convert raw data files into the required format of CogDL and specify the corresponding path argument. The other way is to define a customized dataset class with the necessary functions to load and process the data.

Model. CogDL implements various graph neural networks as research baselines and for applications. A GNN layer such as GCNLayer is implemented on top of Graph and the efficient sparse operators. A model in the package consists of the model builder (i.e., __init__) and forward propagation (i.e., forward) functions in the PyTorch style. Listing 4 shows the APIs implemented in each model to provide a unified usage paradigm, where add_args defines model-specific hyperparameters for experiments.

Figure 2 illustrates the benefits of CogDL's unified trainer and modular wrappers over PyG and DGL. Users can directly use the Trainer to run experiments and do not need to write tedious code for the GNN training and evaluation loop. Besides, we provide another easy way to conduct experiments through the experiment API: we can feed a dataset, a model, and hyperparameters to the experiment API, which calls the Trainer API.