TGC: Transaction Graph Contrast Network for Ethereum Phishing Scam Detection

Phishing scams have become the most serious type of crime involved in Ethereum. However, existing methods ignore the natural camouflage and sparse distribution of phishing scams in Ethereum, leading to unsatisfactory performance, and they are also limited by data scale, so they cannot be applied to real-world dynamic scenarios. In this paper, we propose a Transaction Graph Contrast network (TGC) to enhance phishing scam detection performance on Ethereum. TGC takes subgraphs instead of the entire graph as training input, which eases the model's requirements for machine configuration and data connectivity. Motivated by the observation that phishing nodes are surrounded by normal nodes, we design a node-level contrast that helps phishing nodes learn the properties that distinguish them from their neighbors. Observing the small number and sparse distribution of phishing nodes, we narrow the distance between phishing nodes by contrasting node context-level structures, so as to learn universal transaction patterns. We further combine the obtained features with common statistics to identify phishing addresses. Evaluated on real-world Ethereum phishing scam datasets, TGC outperforms state-of-the-art methods in detecting phishing addresses and has clear advantages in large-scale and dynamic scenarios.


Baselines
Feature-based
- Features only: 219-dimensional statistical features from the node's 1-order and 2-order neighbors.

Analysis of Semantic-level Attention
High attention value of FCF meta-path
- Learning user browser parameters is more important

Conclusions
- TGC outperforms all the other compared methods by a significant margin, especially on large graphs
- TGC has better node representation capability than existing Ethereum phishing detection methods
- Network representation methods based on deep learning do not perform well

ACSAC 2023

Background

The Growth of Scams on Cryptocurrency
- Cryptocurrency-based crime hit a new all-time high in 2022, with illicit addresses receiving $20.6 billion over the course of the year.
- The rise of decentralized finance (DeFi) and the allure of blockchain's anonymity have given rise to a plethora of cybercrimes.
- Increased focus on cryptocurrency security issues

Phishing Scams on Ethereum
- High visibility and lots of potential victims
- Victims lost $645,000 within the first week of the phishing campaign, and the attacker's illegal profits exceeded $3,000,000 in just one month.
- Phishing scams are the most deceptive scams.
- Identifying phishing scams on Ethereum becomes a crucial research topic.

Phishers use high-reward propaganda to induce remittances (email/chat):
- Offering additional Ether coins as incentives
- Directing victims to visit fraudulent platforms or websites
- Promising high returns for purchasing digital assets
- No fixed platform pattern

Network Representation Learning

Task
- Ethereum address ----> node in the graph
- Transaction relationship ----> edge between nodes
- Learn efficient node representations through the transaction network and classify nodes

Limitation
- Using manually designed features as node embeddings
  Limitation: relies on professional knowledge to extract features; inefficient and non-automated
- Using graph neural networks to mine deep features
  Limitation: overlooks the unique challenges of Ethereum phishing scams; weak node representation

Challenges
Existing methods show unsatisfactory performance because of:
- The natural camouflage: 97.99% of neighbors are normal addresses
- Sparsity of distribution: low proportion (0.345%) and sparse distribution
- Large scale and dynamic nature: the Ethereum transaction network is both vast and dynamic

TGC
- Collect relevant addresses and transactions
- Construct the sample-centered r-ego networks and build two different views
- Use two designed contrastive modules to learn the discriminative representation of phishing addresses
- Combine the outputs and feed them into the classifier to get the result

Ego Network Construction & Subgraph Sampling
- Construct a local substructure of each sample node
- The "r-ego network" consists of the r-order neighbors of the central sample node and the connection relationships between them
- Random walk with restart (RWR) sampling strategy
- Each ego network is sampled twice, generating two local subgraphs
- Each sample gets a carefully designed pair of "local subgraph vs. local subgraph"

Node-level Contrast
Learn unique characteristics different from their neighbors
- Instance pair: "target node vs. node"
- Treat neighbors as the negative samples
  - Intra-view negative pair
  - Inter-view negative pair
- Positive samples are the representations of the same node in different views
- Contrastive loss

Context-level Contrast
Capture the transactional structural patterns behind phishing and normal addresses
- Instance pair: "target context vs. context"
- Treat subgraphs generated by different r-ego networks as a negative pair
- Treat subgraphs generated by the same r-ego network as a positive pair
- Contrastive loss

Phishing Addresses Detection
Final node representation: concatenate the three features
- Unique potential features, learned from the Node-level Contrast module
- Transaction pattern features, learned from the Context-level Contrast module
- Common statistical features

Dataset
- Labeling (Etherscan): as of March 2023, 5,639 phishing addresses; randomly select 25,000 active normal addresses
- Two-layer BFS with the labeled nodes as the central nodes: 9,237,535 Ethereum addresses and 219,927,673 transaction records
- Generate a large graph based on the transaction information crawled around all labeled phishing nodes, and select the largest connected component
- Sample with random walks to obtain subgraphs of different sizes
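
The ego-network construction and random walk with restart (RWR) sampling described above can be illustrated with a small sketch. This is not the authors' implementation: the adjacency-dict graph format, the function names (`r_ego_network`, `rwr_sample`, `two_views`), and the parameters `size`, `restart_prob`, and `max_steps` are all assumptions chosen for illustration. It only shows the general idea of sampling the same r-ego network twice to obtain two local subgraphs.

```python
import random
from collections import deque


def r_ego_network(adj, center, r):
    """Return the set of nodes within r hops of `center` (plain BFS)."""
    seen = {center}
    frontier = deque([(center, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == r:
            continue  # do not expand beyond r hops
        for nb in adj.get(node, ()):
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, depth + 1))
    return seen


def rwr_sample(adj, center, ego_nodes, size, restart_prob=0.5,
               max_steps=10_000, rng=None):
    """Random walk with restart inside the ego network.

    Hypothetical parameters: `size` caps the number of sampled nodes and
    `restart_prob` is the chance of jumping back to `center` at each step.
    """
    rng = rng or random.Random()
    sampled = {center}
    current = center
    for _ in range(max_steps):
        if len(sampled) >= size:
            break
        # Only walk along edges that stay inside the ego network.
        neighbors = [n for n in adj.get(current, ()) if n in ego_nodes]
        if not neighbors or rng.random() < restart_prob:
            current = center  # restart at the central sample node
            continue
        current = rng.choice(neighbors)
        sampled.add(current)
    return sampled


def two_views(adj, center, r=2, size=4, seed=0):
    """Sample the same r-ego network twice to get two local subgraphs."""
    ego = r_ego_network(adj, center, r)
    rng = random.Random(seed)
    return (rwr_sample(adj, center, ego, size, rng=rng),
            rwr_sample(adj, center, ego, size, rng=rng))


# Toy undirected transaction graph (addresses as single letters).
toy_adj = {
    "a": ["b", "c"], "b": ["a", "d"], "c": ["a", "e"],
    "d": ["b", "f"], "e": ["c"], "f": ["d"],
}
view_1, view_2 = two_views(toy_adj, "a")
```

Because each sample only needs its own r-ego network, a sketch like this never touches the full graph, which is the property the slides credit for TGC's low requirements on machine configuration and data connectivity.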

Evaluation
- The TGC subgraph sampling training method can remain lightweight in large-scale network scenarios
- TGC has better node representation capability and more stable performance than other methods on large graphs

Dynamic Data Comparison Results
- Our proposed TGC method is least affected by the reduction of data size

Observation Results
- All modules in TGC are important
- TGC is robust to hyperparameter perturbation

We propose a Transaction Graph Contrast Network (TGC) to enhance phishing scam detection performance on Ethereum.
- TGC inputs subgraphs instead of the entire graph for training, which eases the model's requirements for machine configuration and data connectivity, and can be well adapted to dynamic networks
- Motivated by the natural camouflage and sparse distribution of phishing addresses, we design node-level contrast and context-level contrast to learn the unique properties and universal transaction patterns of phishing addresses
- We hope that our work demonstrates the serious threat of phishing scams on Ethereum and calls for effective countermeasures deployed by the blockchain community.
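
The node-level contrast mentioned above (same node in two views as the positive pair, neighbors in either view as negatives) can be illustrated with an InfoNCE-style loss. The slides do not give the exact formula, so this is only a sketch under assumptions: cosine similarity, a temperature `tau`, and the function names `cosine` and `node_level_loss` are illustrative choices, not the authors' definition.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def node_level_loss(target_v1, target_v2, neighbors_v1, neighbors_v2, tau=0.5):
    """InfoNCE-style loss for one target node (assumed form).

    Positive pair: the target's embedding in view 1 vs. view 2.
    Intra-view negatives: the target (view 1) vs. its neighbors in view 1.
    Inter-view negatives: the target (view 1) vs. its neighbors in view 2.
    """
    pos = math.exp(cosine(target_v1, target_v2) / tau)
    intra = sum(math.exp(cosine(target_v1, n) / tau) for n in neighbors_v1)
    inter = sum(math.exp(cosine(target_v1, n) / tau) for n in neighbors_v2)
    return -math.log(pos / (pos + intra + inter))


# Toy embeddings: the two views of the target agree, the neighbors differ.
loss = node_level_loss(
    target_v1=[1.0, 0.0],
    target_v2=[0.9, 0.1],
    neighbors_v1=[[0.0, 1.0]],
    neighbors_v2=[[-1.0, 0.2]],
)
```

Minimizing a loss of this shape pulls the two views of a node together while pushing it away from its (mostly normal) neighbors, which matches the stated goal of learning characteristics that differ from the neighborhood. A context-level variant would apply the same form to subgraph-level embeddings, with subgraphs from different r-ego networks as negatives.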