X

openai com

Bing Rank
Average Position of Bing Search Engine Ranking of related query such as 'Sales AI Agent', 'Coding AI Agent', etc.

Last Updated: 2025-02-09

We introduce MLE-bench, a benchmark for measuring how well AI agents perform at machine learning engineering. To this end, we curate 75 ML engineering-related competitions from Kaggle, creating a diverse set of challenging tasks that test real-world ML engineering skills such as training models, preparing datasets, and running experiments.

Statistic

Prompts

Reviews

Tags

Write Your Review

Detailed Ratings

ALL
Correctness
Helpfulness
Interesting
Upload Pictures and Videos