Site Reliability Engineer

Felix

Worldwide · Remote · 20h ago
Employment
Full-time

About the role

  • Manage and optimize our infrastructure on Google Cloud Platform (GCP) and Google Kubernetes Engine (GKE).
  • Automate provisioning and configuration using Terraform, Helm, and scripting languages such as Go, Python, and Bash.
  • Build, maintain, and improve monitoring and alerting systems using OpenTelemetry standards.
  • Participate in on-call rotations, incident response, and post-mortem analyses, ensuring rapid recovery and continuous learning from failures.
  • Define and track SLOs/SLIs and error budgets to monitor service health and performance.
  • Implement cloud security best practices to protect sensitive data and maintain the integrity of our systems.
  • Collaborate across Engineering, Security, and Product teams to embed reliability and automation in every phase of development and deployment.
  • Contribute to GKE cost optimization and resource management strategies to enhance efficiency and control operational spend.

Requirements

  • 4+ years of experience as a Platform Engineer.
  • Strong hands-on experience with GCP and GKE.
  • Proficiency in Kubernetes (architecture, deployments, networking, and troubleshooting).
  • Solid programming or scripting skills in Go, Python, or Bash.
  • Proficiency with Docker and Linux.
  • Experience with Terraform.
  • Experience with Helm.
  • Experience with GitHub Actions.
  • Strong understanding of monitoring and observability using Prometheus, Grafana, and logging frameworks.
  • Familiarity with incident management, on-call operations, and post-mortem processes.
  • Knowledge of network fundamentals (TCP/IP, DNS, Load Balancing).
  • Experience with PostgreSQL or distributed databases.
  • Awareness of FinOps and cloud cost management principles.
  • Excellent problem-solving, communication, and collaboration skills, with a proactive mindset.

Nice to have

  • GCP certifications, such as Professional DevOps Engineer or Cloud Architect.
  • Certified Kubernetes Administrator (CKA).
  • Experience in FinOps, cloud security, or regulated industries.
  • Familiarity with PagerDuty or similar incident management tools.
  • Background implementing SLOs/SLIs and error budgets in production environments.

These are the applicable requirements, although equivalent competencies in any of the above will also be considered.

What we offer

  • Competitive salary
  • Initial stock options grant
  • Annual performance bonus
  • Health, dental, and vision plans
  • Remote work environment; we also have offices in Miami and Mexico City and welcome a hybrid arrangement if you prefer one
  • Continuous learning opportunities
  • Unlimited PTO
  • Paid parental leave
  • Empowering opportunities for growth in a dynamic entrepreneurial environment

Perks & benefits

  • Vision Insurance
  • Unlimited Vacation
  • Paid Time Off
  • Equity Compensation
