【SRE】Site Reliability Engineer

FunNow

Taipei City, On-site, 10mo ago

Employment: Full-time

About the role

【Capsule】
At FunNow, we’re building joyful experiences, at the speed of now. As a Site Reliability Engineer, you’ll play a crucial role in ensuring our platform stays fast, resilient, and secure for millions of users booking spontaneous fun across Asia. But here’s the twist: we don’t just monitor uptime — we build with AI and automation. From Kubernetes tuning to auto-healing infrastructure, CI/CD pipelines to incident response, you'll be hands-on in evolving our DevOps culture. If you love scalable systems, believe in developer efficiency, and treat infrastructure as code, welcome aboard.

【Typical Accountability】

Design robust architectures to comprehensively improve system availability, scalability, and service quality
Ensure stable service operation, monitor core service status, and quickly troubleshoot issues
Analyze system performance bottlenecks in depth, then propose and implement improvements
Maintain and optimize Kubernetes clusters (EKS/GKE), effectively handling resource pressure, node anomalies, and other situations
Maintain and improve CI/CD pipelines and automated deployment systems (GitHub Actions / ArgoCD) to significantly enhance engineering team development efficiency
Establish and continuously optimize system monitoring and alerting mechanisms (Prometheus / Grafana / Alertmanager)
Assist with incident response and problem investigation
Regularly participate in system inspections and audits, proactively proposing and implementing improvements
Assist in maintaining and implementing fundamental security settings (e.g., IAM, resource permissions, encrypted storage)
Actively share your experience to collectively enhance the team's engineering culture

【Essential Competencies】

Familiarity with container technologies such as Docker or Kubernetes, and practical experience with Kubernetes operations (deployment, scheduling, resource management)
Familiarity with AWS services (e.g., ECS, EKS, S3, CloudFront, IAM, VPC), and practical experience maintaining AWS or GCP (we primarily use AWS)
Familiarity with at least one CI/CD tool (e.g., GitHub Actions, GitLab CI)
Proficiency in MySQL daily management and performance analysis
Familiarity with service-related log analysis and monitoring tools (e.g., CloudWatch, ELK/EFK, Grafana), and practical experience with Prometheus/Grafana
Experience maintaining Elasticsearch clusters
Familiarity with Git and basic Git flow operations
High degree of self-management, proactive and responsible work attitude, meticulousness, and excellent communication and teamwork skills

【Desirable Competencies】

Exposure to or familiarity with the Golang ecosystem
Familiarity with Infrastructure-as-Code tools such as CDK or Terraform
Experience with IPO advisory or ISO audits
Security awareness

【Who You Are】

You enjoy solving real-world problems, are proactive in investigation, and act quickly
You value stability and data accuracy, and possess a high sense of responsibility
You are passionate about learning new tools and enjoy sharing improvement methods
You maintain clear communication and good documentation habits in team collaboration
