【SRE】Site Reliability Engineer

FunNow

Taipei City · On-site · Posted 10mo ago
Employment: Full-time

About the role

【Capsule】
At FunNow, we’re building joyful experiences, at the speed of now. As a Site Reliability Engineer, you’ll play a crucial role in ensuring our platform stays fast, resilient, and secure for millions of users booking spontaneous fun across Asia. But here’s the twist: we don’t just monitor uptime; we build with AI and automation. From Kubernetes tuning to auto-healing infrastructure, and from CI/CD pipelines to incident response, you’ll be hands-on in evolving our DevOps culture. If you love scalable systems, believe in developer efficiency, and treat infrastructure as code, welcome aboard.

【Typical Accountability】

  • Design robust architectures to comprehensively improve system availability, scalability, and service quality
  • Ensure stable service operation, monitor core service status, and quickly troubleshoot issues
  • Conduct in-depth analysis of system performance bottlenecks, then propose and implement improvements
  • Maintain and optimize Kubernetes clusters (EKS/GKE), effectively handling resource pressure, node anomalies, and other situations
  • Maintain and improve CI/CD pipelines and automated deployment systems (GitHub Actions / ArgoCD) to significantly enhance engineering team development efficiency
  • Establish and continuously optimize system monitoring and alerting mechanisms (Prometheus / Grafana / Alertmanager)
  • Assist with incident response and problem investigation
  • Regularly participate in system inspections and audits, proactively proposing and implementing improvements
  • Assist in maintaining and implementing fundamental security settings (e.g., IAM, resource permissions, encrypted storage)
  • Actively share your experience to collectively enhance the team's engineering culture

【Essential Competencies】

  • Familiarity with container technologies such as Docker or Kubernetes, and practical experience with Kubernetes operations (deployment, scheduling, resource management)
  • Familiarity with AWS services (e.g., ECS, EKS, S3, CloudFront, IAM, VPC), and practical experience maintaining AWS or GCP environments (we primarily use AWS)
  • Familiarity with at least one CI/CD tool (e.g., GitHub Actions, GitLab CI)
  • Proficiency in MySQL daily management and performance analysis
  • Familiarity with service-related log analysis and monitoring tools (e.g., CloudWatch, ELK/EFK, Grafana), and practical experience with Prometheus/Grafana
  • Experience maintaining Elasticsearch clusters
  • Familiarity with Git and basic Git flow operations
  • High degree of self-management, proactive and responsible work attitude, meticulousness, and excellent communication and teamwork skills

【Desirable Competencies】

  • Exposure to or familiarity with the Golang ecosystem
  • Familiarity with Infrastructure-as-Code tools such as CDK or Terraform
  • Experience with IPO advisory or ISO audit
  • Security awareness

【Who You Are】

  • You enjoy solving real-world problems, are proactive in investigation, and act quickly
  • You value stability and data accuracy, and possess a high sense of responsibility
  • You are passionate about learning new tools and enjoy sharing improvement methods
  • You maintain clear communication and good documentation habits in team collaboration
