DevOps & SRE Engineer

Manus AI
Singapore, On-site
Employment: Full-time

About the role

Key Responsibilities

Cluster Operations & Management

  • Manage and maintain container clusters (Kubernetes, Docker) and open-source component clusters (Kafka, Redis, Elasticsearch) across multiple business units

  • Ensure optimal performance, scalability, and reliability of distributed systems

Infrastructure Platform Development

  • Design, build, and enhance infrastructure operation platforms

  • Develop and maintain systems for infrastructure management, CI/CD pipelines, monitoring/alerting, and centralized logging

  • Drive platform standardization and automation initiatives

High Availability & Reliability

  • Ensure maximum uptime for production services through proactive monitoring and incident response

  • Continuously optimize service architecture, deployment strategies, and operational processes

  • Implement and maintain SLA/SLO frameworks and reliability engineering practices

Automation & Process Improvement

  • Lead the development of automated operations and maintenance systems

  • Create self-service tools and workflows to improve team productivity

  • Establish best practices for Infrastructure as Code (IaC) and configuration management

Required Qualifications

Experience & Education

  • 2+ years of hands-on experience in Systems Operations, DevOps, or Site Reliability Engineering (SRE)

  • Bachelor's degree in Computer Science, Engineering, or related technical field preferred

Cloud & Infrastructure

  • Experience with public cloud platforms (AWS, Azure, or GCP) is highly valued

  • Strong understanding of large-scale internet architecture and distributed systems

  • Proven experience with infrastructure monitoring, logging, and observability tools

Technical Skills

  • Proficiency in scripting and automation using Shell, Python, or similar languages

  • Strong knowledge of containerization technologies (Kubernetes, Docker)

  • Hands-on experience operating production-grade container clusters and managing CI/CD pipelines

  • Strong familiarity with common infrastructure components: Nginx, MySQL, Redis, Kafka, Elasticsearch

Advanced Networking (Preferred)

  • Experience with Service Mesh architectures, Cilium CNI, and eBPF technologies

  • Understanding of network security, load balancing, and traffic management

  • Knowledge of cloud-native networking patterns and best practices

About Manus AI

Manus is a general AI agent that bridges minds and actions: it doesn't just think, it delivers results. Manus excels at various tasks in work and life, getting everything done while you rest. At Manus AI, we offer a highly collaborative and innovative environment where experts across engineering, research, and business come together to push the boundaries of AI applications. If you're passionate about cutting-edge technology and making a real impact, we’d love to hear from you!

Contact us: recruiting@manus.im
