About the Role
- Mercor is partnering with a leading AI research lab to support a Frontier Code Agents project.
- Contributors help evaluate and improve frontier AI coding models through structured technical assessments.
- The work focuses on realistic machine learning engineering workflows and model evaluation.
- Spots are limited and are filled on a first-come, first-served basis.
What You'll Do
- Use frontier AI coding agents to complete and evaluate complex machine learning and AI engineering tasks.
- Review model-generated implementations involving model training, inference systems, MLOps, and LLM applications.
- Identify bugs, edge cases, performance issues, and failure modes.
- Compare outputs from multiple frontier models and assess their strengths and weaknesses.
- Apply professional engineering judgment to realistic ML engineering scenarios.
Time Commitment
- Sprint-based project that runs in 12–24 hour stretches, depending on client requirements.
Compensation
- $400 per accepted task.
- Typical tasks take approximately 2–3 hours after ramp-up.
- Compensation is tied to accepted work.
Who Should Apply
- 2+ years of professional machine learning engineering experience.
- Experience building production ML systems, model deployment infrastructure, LLM applications, or AI-powered products.
- Regular use of AI coding agents such as Cursor, Claude Code, Codex, Windsurf, Gemini CLI, or similar tools.
- Ability to evaluate model-generated machine learning implementations and technical tradeoffs.
- Experience deploying ML systems to production is preferred.
Apply directly on Mercor to get started.