Towards Automated Penetration Testing: Introducing LLM Benchmark, Analysis, and Improvements • Oct 25 • 1
Code Evaluation Collection • Collection of Papers on Code Evaluation (from code generation language models) • 45 items • Updated Oct 29 • 14
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Paper • 2403.09629 • Published Mar 14 • 75
GROVE: A Retrieval-augmented Complex Story Generation Framework with A Forest of Evidence Paper • 2310.05388 • Published Oct 9, 2023 • 4