LLM Honeypot

Code for our paper "LLM Honeypot: Leveraging Large Language Models as Advanced Interactive Honeypot Systems", published at the 2024 IEEE Conference on Communications and Network Security (CNS).

You can download the paper via IEEE or the DOI: 10.1109/CNS62487.2024.10735607

Abstract

The rapid evolution of cyber threats necessitates innovative solutions for detecting and analyzing malicious activity. Honeypots, which are decoy systems designed to lure and interact with attackers, have emerged as a critical component in cybersecurity. In this paper, we present a novel approach to creating realistic and interactive honeypot systems using Large Language Models (LLMs). By fine-tuning a pre-trained open-source language model on a diverse dataset of attacker-generated commands and responses, we developed a honeypot capable of sophisticated engagement with attackers. Our methodology involved several key steps: data collection and processing, prompt engineering, model selection, and supervised fine-tuning to optimize the model’s performance. Evaluation through similarity metrics and live deployment demonstrated that our approach effectively generates accurate and informative responses. The results highlight the potential of LLMs to revolutionize honeypot technology, providing cybersecurity professionals with a powerful tool to detect and analyze malicious activity, thereby enhancing overall security infrastructure.
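
The abstract mentions evaluation through similarity metrics but does not name them here. As a rough illustration only, the sketch below scores a model-generated response against an expected terminal output using two common text-similarity measures: token-level cosine similarity and the sequence ratio from Python's `difflib`. The choice of metrics, the function names, and the sample shell outputs are assumptions made for this example and are not taken from the paper's evaluation code.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def cosine_similarity(expected: str, generated: str) -> float:
    """Cosine similarity between whitespace-token count vectors (0.0 to 1.0)."""
    a, b = Counter(expected.split()), Counter(generated.split())
    dot = sum(a[tok] * b[tok] for tok in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one side has no tokens
    return dot / (norm_a * norm_b)


def sequence_ratio(expected: str, generated: str) -> float:
    """Character-level similarity (0.0 to 1.0) from difflib.SequenceMatcher."""
    return SequenceMatcher(None, expected, generated).ratio()


def score_responses(pairs):
    """Average both metrics over a list of (expected, generated) pairs."""
    if not pairs:
        raise ValueError("need at least one (expected, generated) pair")
    cos = [cosine_similarity(e, g) for e, g in pairs]
    seq = [sequence_ratio(e, g) for e, g in pairs]
    return {"cosine": sum(cos) / len(cos), "sequence": sum(seq) / len(seq)}


if __name__ == "__main__":
    # Hypothetical samples: (real shell output, honeypot model output).
    samples = [
        ("uid=0(root) gid=0(root) groups=0(root)",
         "uid=0(root) gid=0(root) groups=0(root)"),
        ("bash: foo: command not found",
         "-bash: foo: command not found"),
    ]
    print(score_responses(samples))
```

Scores near 1.0 would suggest the generated output closely matches what a real system returns for the same command. Lower scores flag responses worth inspecting by hand.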

Citation

If this work is helpful, please cite as:

@INPROCEEDINGS{10735607,
  author={Otal, Hakan T. and Canbaz, M. Abdullah},
  booktitle={2024 IEEE Conference on Communications and Network Security (CNS)},
  title={LLM Honeypot: Leveraging Large Language Models as Advanced Interactive Honeypot Systems},
  year={2024},
  pages={1-6},
  doi={10.1109/CNS62487.2024.10735607}
}

Contact

hotal [AT] albany [DOT] edu

Model size: 7.5B params (Safetensors, tensor type F32)