This repository contains the code for the paper "Guardians of the Air: In-Device Detection of 5G Control-Plane Threats", which detects Layer-3 (L3) control-plane threats from cellular network traffic on the User Equipment (UE).
.
├── App/
│   └── App is hosted externally because of its size. Download: https://pennstateoffice365-my.sharepoint.com/:f:/g/personal/tvw5452_psu_edu/IgDNTg5SYdf9SZ6FrTGHONc-AXKxypkntgAZ3kURBOmlpl8?e=47SafX
├── Code/
│   ├── ConnSentinel/
│   │   ├── connsentinel.py
│   │   └── data_entropy/
│   └── ExFinder/
│       ├── encode_data.py
│       ├── encoder_mapper.json
│       ├── exfinder.py
│       └── extract_message.py
├── Dataset/
│   └── Dataset is hosted externally because of its size. Download: https://pennstateoffice365-my.sharepoint.com/:f:/g/personal/tvw5452_psu_edu/IgA-17pGa6QkRrVIFx_lzIedAQ-vOQTiSQM09dyPCZ-4TTY?e=sNfXkj
├── LICENSE
├── README.md
└── requirements.txt
This project implements the two modules of 5GShield: ConnSentinel, which pre-filters suspicious base stations by detecting anomalous broadcast messages, and ExFinder, which identifies abnormal protocol behavior caused by protocol-level attack exploits.
The current release includes both the ConnSentinel and ExFinder offline analysis pipelines. The real-time processing pipeline is implemented inside the app.
The 5GShield app requires root access on the Android device.
The app uses Cellular-Pro's API to access raw baseband data, which it then processes in real time. This API is required to run the app. To get the API, please contact alibaba1126@126.com.
We recommend using a Qualcomm-based device. Exynos and HiSilicon devices are not supported yet because of engineering issues.
extract_message.py expects raw .CSV files containing the following columns:
TIME, TYPE, SIGNAL, DETAILED
To collect compatible CSV traces for message-flow extraction, use the 5GShield app, which exports cellular traces in a format that the extraction script can process.
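Before running the extraction on your own traces, it can help to confirm that each file has the expected header. The following is a minimal sketch, not part of the repository: the helper name `missing_columns` and the sample rows are hypothetical, and only the four column names come from the list above.

```python
import csv
import io

# Column names taken from the list above.
REQUIRED_COLUMNS = ("TIME", "TYPE", "SIGNAL", "DETAILED")


def missing_columns(csv_text):
    """Return the required columns that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Hypothetical traces used only to exercise the check.
sample = "TIME,TYPE,SIGNAL,DETAILED\n12:00:01,NR,RRC,example\n"
print(missing_columns(sample))                      # []
print(missing_columns("TIME,TYPE\n12:00:01,NR\n"))  # ['SIGNAL', 'DETAILED']
```

To check a file on disk, read its text first and pass it to `missing_columns`.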
The current scripts treat label == 0 as benign traffic. Any label greater than 0 is treated as an attack or anomaly.
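The labeling rule above can be expressed as a one-line predicate. This is an illustrative sketch only: `is_attack` and the sample label list are hypothetical names and values, not code from the repository.

```python
def is_attack(label):
    """Binary view of a label: 0 is benign, anything greater than 0 is an attack or anomaly."""
    return int(label) > 0


labels = [0, 2, 0, 5, 1]  # hypothetical label column
binary = [int(is_attack(value)) for value in labels]
print(binary)  # [0, 1, 0, 1, 1]
```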
- Python 3.8+
- PyTorch
- PyTorch Geometric
- NumPy
- Pandas
- scikit-learn
- XGBoost
- SciPy
- tensorboardX
- prettytable
- matplotlib
- psutil
See requirements.txt for the pinned and unpinned package list.
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

PyTorch Geometric installation can depend on your Python, CUDA, and PyTorch versions. If the generic install does not work, follow the official PyTorch Geometric installation guide for your environment.
If you use the provided dataset and trained artifacts, you can run ExFinder directly with Code/ExFinder/exfinder.py. The extraction and encoding scripts are only needed if you want to process your own collected trace.
ConnSentinel processes raw CSV traces and compares SIB1/MIB broadcast messages against previous records or nearby cells.
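To build intuition for this kind of comparison, the sketch below scores how much one broadcast message differs from an earlier record, using per-field weights. It is an assumption-laden illustration and not the logic in connsentinel.py: the function `weighted_mismatch`, the field names, the values, and the weights are all hypothetical, and the format of the entropy weight files is not reproduced here.

```python
def weighted_mismatch(current, previous, weights):
    """Sum the weights of fields whose values differ, normalised by the total weight."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    differing = sum(
        weight
        for field, weight in weights.items()
        if current.get(field) != previous.get(field)
    )
    return differing / total


# Hypothetical field names, values, and weights.
weights = {"field_a": 3.0, "field_b": 1.0}
previous = {"field_a": 10, "field_b": 7}
current = {"field_a": 10, "field_b": 8}
print(weighted_mismatch(current, previous, weights))  # 0.25
```

Under this scheme, a change in a heavily weighted field raises the score more than a change in a lightly weighted one.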
Edit the configuration at the bottom of Code/ConnSentinel/connsentinel.py:
target_plmns = {"310260"} # replace with the PLMN you want to evaluate
process_folder_and_compare_sib1_messages(
"PATH_TO_YOUR_EVAL_OPERATOR_FOLDER",
target_plmns=target_plmns,
weight_file="data_entropy/field_max_std_entropy_tmobile_sib1.txt",
mib_weight_file="data_entropy/field_max_std_entropy_tmobile_mib.txt"
)

Use the entropy files in Code/ConnSentinel/data_entropy/ that match the operator being evaluated.
Then run:
# From the ConnSentinel folder
cd Code/ConnSentinel
python3 connsentinel.py

Edit the input and output folder constants in Code/ExFinder/extract_message.py:
input_folder_path = "PATH_TO_YOUR_INPUT_FOLDER"
output_folder_path = "PATH_TO_YOUR_OUTPUT_FOLDER"

Then run:
# From the repository root
python3 Code/ExFinder/extract_message.py

The script scans the input folder for .CSV files and writes processed files named:
<original_name>_processed.CSV
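The naming pattern above can be previewed without running the extraction. This is a hedged sketch: `processed_name` and `plan_outputs` are hypothetical helpers, and matching the `.CSV` suffix case-sensitively is an assumption about how the script selects files.

```python
from pathlib import Path


def processed_name(csv_path):
    """Map 'trace.CSV' to 'trace_processed.CSV', following the naming pattern above."""
    return f"{Path(csv_path).stem}_processed.CSV"


def plan_outputs(input_folder, output_folder):
    """Pair each .CSV file in input_folder with its planned output path."""
    out_dir = Path(output_folder)
    return {
        src: out_dir / processed_name(src)
        for src in sorted(Path(input_folder).iterdir())
        # Assumption: only files ending in upper-case ".CSV" are picked up.
        if src.is_file() and src.suffix == ".CSV"
    }


print(processed_name("traces/run1.CSV"))  # run1_processed.CSV
```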
Code/ExFinder/encode_data.py uses Code/ExFinder/encoder_mapper.json to one-hot encode categorical features.
Edit the folder constants in Code/ExFinder/encode_data.py:
FOLDER_PATH = "PATH_TO_UNENCODED_CSV_FOLDERS"
PROCESSED_FOLDER_PATH = "PATH_TO_PROCESSED_CSV_FOLDERS"

Then run:
# From the repository root
python3 Code/ExFinder/encode_data.py

The encoder writes CSV files with:
features,id,label
where features is an encoded vector, id is the message name, and label is copied from the processed input.
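As a rough illustration of mapper-driven one-hot encoding, the sketch below concatenates one-hot vectors for each categorical feature. The mapper layout shown (feature name mapped to an ordered list of categories), the feature names, and the helpers `one_hot` and `encode_row` are assumptions made for this example; the real encoder_mapper.json may be organised differently.

```python
import json

# Hypothetical mapper layout: feature name -> ordered list of known categories.
mapper = json.loads('{"msg_type": ["A", "B", "C"], "direction": ["UL", "DL"]}')


def one_hot(value, categories):
    """Return a one-hot list; values not in categories map to all zeros."""
    return [1 if value == category else 0 for category in categories]


def encode_row(row, mapper):
    """Concatenate the one-hot vectors of every mapped feature, in mapper order."""
    features = []
    for name, categories in mapper.items():
        features.extend(one_hot(row.get(name), categories))
    return features


row = {"msg_type": "B", "direction": "DL"}  # hypothetical processed record
print(encode_row(row, mapper))  # [0, 1, 0, 0, 1]
```

Keeping the category order in a shared mapper file means every trace is encoded into vectors of the same length and layout.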
Code/ExFinder/exfinder.py builds graph windows from encoded CSV files and evaluates anomaly detection.
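One simple way to picture graph windows is a sliding window over the message sequence, with each window turned into a chain of consecutive messages. The sketch below assumes that picture; the window size, the step, the chain-shaped edges, and both helper names are hypothetical and may not match how exfinder.py actually constructs its graphs.

```python
def sliding_windows(items, size, step=1):
    """Return consecutive windows of `size` items, advancing by `step`."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    return [items[i:i + size] for i in range(0, len(items) - size + 1, step)]


def chain_edges(window_length):
    """Edges linking each message to its successor inside one window."""
    return [(i, i + 1) for i in range(window_length - 1)]


messages = ["m0", "m1", "m2", "m3", "m4"]  # hypothetical message ids
windows = sliding_windows(messages, size=3, step=2)
print(windows)         # [['m0', 'm1', 'm2'], ['m2', 'm3', 'm4']]
print(chain_edges(3))  # [(0, 1), (1, 2)]
```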
Before running, update the hard-coded paths in main():
pretrained_model_path = "Dataset/eval/trained_models/autoencoder.pth"
test_path = "Dataset/eval/test/encoded"
knn_eval_path = "Dataset/eval/trained_models/knn.joblib"
model_filename = "Dataset/eval/trained_models/classifier.joblib"

If training from scratch, set:
pretrain = True
train_classifier = True

and update the training data path:
base_path = "Dataset/eval/train/encoded"

Then run:
# From the repository root
python3 Code/ExFinder/exfinder.py

The evaluation functions available in the script are:
- evaluate_unsupervised(...): k-NN distance based anomaly detection.
- evaluate_supervised(...): direct classifier evaluation on graph embeddings.
- evaluate_hybrid(...): unsupervised anomaly detection plus supervised attack correction and classification.
The current main() calls evaluate_hybrid(...) by default.
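The idea behind k-NN distance based anomaly detection can be sketched in a few lines of standard-library Python. This is a generic illustration, not the repository's implementation: `knn_score`, `is_anomalous`, the value of `k`, the threshold, and the two-dimensional points standing in for graph embeddings are all assumptions.

```python
import math


def knn_score(query, reference, k=3):
    """Mean Euclidean distance from query to its k nearest reference points."""
    distances = sorted(math.dist(query, point) for point in reference)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


def is_anomalous(query, reference, threshold, k=3):
    """Flag the query when it lies far from the benign reference set."""
    return knn_score(query, reference, k) > threshold


# Hypothetical benign embeddings and threshold.
benign = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
print(is_anomalous((0.05, 0.05), benign, threshold=0.5))  # False
print(is_anomalous((3.0, 3.0), benign, threshold=0.5))    # True
```

A point close to the benign set gets a small score and passes, while a distant point exceeds the threshold. The reference set must be non-empty for the score to be defined.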
This project is released under the CC0 1.0 Universal license. See LICENSE for details.