MQL Benchmark
A 30,000-example open-source benchmark for evaluating natural-language → DSL generation, with a public model leaderboard.