LLM Red Team Intern (Evaluation Systems)

Elloe Ai

Kenya · Remote · 11mo ago

Employment: Internship
Seniority: Junior

About the role

Create prompts to trigger hallucinations, policy violations, or failure scenarios
Stress test Elloe-protected deployments using open and proprietary models
Document behavioral exploits across use cases (healthcare, compliance, government)

Build truthsets and scoring rubrics tied to factuality, policy, or ethical standards
Benchmark Elloe’s modules across model types (Claude, GPT-4, Gemini, open models)
Collaborate with product to refine and expand our eval harnesses

Identify blind spots in current detection logic
Recommend scoring methods or red-flag thresholds for deployment
Support internal model comparison reports or customer safety audits

Requirements

ML/AI researcher or engineer (undergrad, grad, or early career)
Experience working with LLMs, eval sets, and prompt design
Strong attention to detail, grounded in safety and adversarial thinking
Bonus: exposure to safety benchmarks like TruthfulQA, MMLU, or red teaming tools

What you'll gain

Exposure to high-stakes LLM safety deployments
Published frameworks or scoring methods used by enterprises
Mentorship from technical founders operating at the bleeding edge of AI safety

Start Date: Rolling
Duration: 12–16 weeks
Compensation: Research stipend
Location: Remote-first; flexible for global candidates
To Apply: Share a jailbreak or eval idea you’d love to run against GPT-4.
