#llms

2026-05-15

A 30,000-example open-source benchmark for evaluating natural-language → DSL generation, with a public model leaderboard.

Evaluating LLM Generated Detection Rules in Cybersecurity

2025-09-20

Anna Bertiger, Bobby Filar, Aryan Luthra, Stefano Meschiari, Aiden Mitchell, Sam Scholten, Vivek Sharath

Conference on Applied Machine Learning in Information Security (CAMLIS) 2025

#LLMs #Evaluation #Cybersecurity #Featured

An open-source evaluation framework and three benchmark metrics for measuring LLM-generated cybersecurity detection rules.

Evaluating LLM-Generated Detection Rules

2025-09-20

#Evaluation #LLMs #Cybersecurity

A benchmark and three metrics for measuring LLM-generated cybersecurity detection rules (CAMLIS 2025).

BabbelPhish

2023-08-01

#LLMs #NLU

Accelerating Adoption of Domain-Specific Languages with Large Language Models.

BabbelPhish Dataset

2023-07-15

#Datasets #LLMs

Open-source natural language to domain-specific language dataset for email security.
