BabbelPhish Dataset
A ~3,000-example dataset pairing natural language descriptions with Message Query Language (MQL) queries, intended for fine-tuning and evaluating LLMs in the email detection-engineering setting.
Superseded by the MQL Benchmark (~30,000 examples, four difficulty tiers, public leaderboard). Kept here as historical context.
Sources used to construct it:
- Sublime Security documentation
- The Message Data Model schema
- The Sublime Rules repository
- Curated examples from the Sublime Community Slack
Each example was reviewed in a human-in-the-loop annotation pass.
Dataset: huggingface.co/datasets/sublime-security/babbelphish
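To make the pairing concrete, here is a minimal sketch of how one (description, MQL) pair might be turned into a prompt/completion record for fine-tuning. The function name, the `prompt` and `completion` keys, and the prompt wording are assumptions made for illustration and are not the dataset's documented schema. The sample pair was written for this sketch and is not taken from the dataset.

```python
import json


def to_finetune_record(description: str, mql: str) -> dict:
    """Build a prompt/completion record from one (description, MQL) pair.

    Hypothetical helper: the key names and prompt template are assumptions,
    not the dataset's actual column names or an official format.
    """
    prompt = (
        "Translate the description into a Message Query Language (MQL) query.\n"
        f"Description: {description.strip()}\n"
        "MQL:"
    )
    # A leading space separates the completion from the trailing "MQL:" label.
    return {"prompt": prompt, "completion": " " + mql.strip()}


# Illustrative pair written for this sketch, not a row from the dataset.
example = to_finetune_record(
    "Inbound messages sent from example.com",
    'type.inbound and sender.email.domain.root_domain == "example.com"',
)
print(json.dumps(example, indent=2))
```

The same helper could be mapped over the rows of the published dataset once its real column names are known.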