AI Quality Engineer

Rootly

Office: Hybrid
Posted: 4d ago
Employment: Full-time

About the role

  • Design and execute prompt-based test scenarios that cover happy paths, edge cases, and adversarial inputs across Rootly's agentic AI features
  • Evaluate AI outputs for accuracy, relevance, consistency, and alignment with expected workflow behaviour
  • Build and maintain an evaluation framework: structured test libraries, scoring rubrics, and regression suites to track AI performance over time
  • Identify failure modes, hallucinations, reasoning gaps, and unexpected agent behaviours; document findings and work with engineers to resolve them
  • Partner with Product and Engineering on new AI feature releases, contributing to acceptance criteria and quality gates before launch
  • Define and track quality metrics (accuracy rates, failure frequency, regression trends) and report findings to stakeholders
  • Stay current on LLM evaluation techniques, prompt engineering best practices, and agentic testing methodologies

Qualifications

  • 5+ years in QA, product operations, AI/ML evaluation, or a closely related role
  • Hands-on experience testing or evaluating LLM-powered or agentic AI products
  • Strong prompt engineering instincts -- you understand how wording, context, and structure affect model behaviour
  • Comfortable writing scripts or working with evaluation tools (Python is a plus; you do not need to be a full-stack engineer)
  • Sharp analytical thinking; you can spot a subtle reasoning failure and articulate exactly why it's a problem
  • Clear written communicator; able to translate AI behaviour findings for both technical and non-technical audiences
  • Familiarity with incident management, DevOps, or IT operations workflows is a strong asset
  • Experience with evaluation frameworks (e.g. LangSmith, PromptFlow, Braintrust, or similar)
  • Exposure to red-teaming or adversarial testing of AI systems
  • Comfortable writing E2E tests with Playwright
  • Background working at a B2B SaaS or developer-tools company
  • Familiar with mobile app testing (iOS/Android)


Why Rootly?

  • Competitive compensation and early equity in a fast-growing, venture-backed company.
  • Comprehensive medical, dental, and vision coverage.
  • 3 weeks of vacation, plus unlimited sick and mental health days, and a company-wide end-of-year shutdown to recharge.
  • $500 stipend for home office setup.
  • Unlimited token usage and access to AI tools.
  • A fast-moving, high-impact environment where your leadership and ideas directly shape the future of the company.

Perks & benefits

  • Vision Insurance
  • Home Office Budget
  • Equity Compensation
