Evaluating LLM-Generated Detection Rules in Cybersecurity
An open-source evaluation framework and three benchmark metrics for measuring LLM-generated cybersecurity detection rules.
Content tagged with "featured"
Foundational report co-authored with researchers from FHI, OpenAI, CSER, EFF, and CNAS on the misuse risks of advanced AI.
An RL agent that learns to make non-breaking modifications to malicious PE binaries to evade static ML-based malware classifiers.
An early natural-language interface for security analysts — a precursor to today's agent-based SOC tooling.