
- Employment: Full-time
- Seniority: Staff
About the role
Member of Technical Staff
Remote in United States, Canada or Latin America • $200K-320K + Equity
Member of Technical Staff - AI Benchmark Curation & Validation
Role Summary
Own the quality of OB-1's benchmark suite. Execute tasks with the AI agent, analyze results, identify broken or gamed benchmarks, and curate hundreds of tasks for production. You need deep technical judgment to instantly recognize poor task design.
Core Responsibilities
Task Execution & Analysis (40%): Run OB-1 against tasks. Analyze results. Understand why it succeeds or fails.
Task Design Review (40%): Judge whether tasks are well-designed, solvable, and test real capability. Spot tasks that are trivial or can be gamed. Refine as needed.
Curation & Scaling (20%): Filter task batches for quality. Build a repeatable curation process as volume scales to 500+ tasks.
Required Expertise
Expert-level understanding of at least two of the following domains: ML systems, C++ performance optimization, Verilog/chip design
IOI/IMO-level competitive programming background (or similar)
5+ years building production systems
1+ years of professional experience with Python and either Rust or C++
Experience with TypeScript is a plus
High bar for quality with ability to articulate why tasks are good or bad
What we offer
• Competitive compensation: $200,000 - $320,000 base salary plus significant equity
• Opportunity to work on cutting-edge AI technology with real-world impact
• Collaborative environment with a world-class team of engineers and researchers
• Access to state-of-the-art computing resources and AI models
• The chance to shape the future of how software is built
Perks & benefits
- Equity Compensation