- Employment
- Full-time
About the role
How You'll Make an Impact
- Support key ITIL processes, including incident management, request management, problem management, and change management.
- Define and document runbooks and standard operating procedures.
- Field operational requests from our Application Support team and other internal stakeholders.
- Triage and resolve issues within defined SLAs to ensure an excellent customer experience and to unblock other development and support teams.
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
- Identify and troubleshoot problems, investigate root causes, and champion fixes across the organization.
- Work with infrastructure-as-code (IaC) with a focus on continuous improvement.
- Collaborate with cross-functional team members on features and implementation within an agile environment.
- Report on SLAs and performance metrics as part of the Operations function.
- Participate in on-call rotation.
- A modern AWS cloud infrastructure managed through infrastructure-as-code (Terraform), configuration-as-code (Ansible), and CI/CD (Jenkins)
- RDS MySQL, Redshift, Redshift Spectrum, MongoDB, and Elasticsearch
- Kinesis, SQS, and RabbitMQ
- DevOps tools written in Python
- Back-end applications written using Java, Dropwizard, Spring Boot, and Hibernate
- Front-end applications written using TypeScript, JavaScript, React (Context API and Hooks), and Redux
- Monitoring with Datadog and CloudWatch
We'd Love to See
- Bachelor’s degree in computer science, software engineering, or equivalent experience.
- 2+ years of experience in an IT Operational, DevOps, SRE, or Software Engineering role.
- Experience with cloud computing services (AWS and Azure) and developing-level knowledge of the setup and management of cloud infrastructure.
- You can write code in any language. You have implemented your work in a production environment and can back it up with examples.
- Experience with tools and platforms such as Ansible, build/release pipelines, Docker, GitHub, and Terraform.
- Developing-level knowledge of distributed systems in the cloud, using observability and telemetry for oversight of code deployments and service level objectives (SLOs).
- Developing experience with the operational aspects of software systems using telemetry, centralized logging, and alerting with tools such as: CloudWatch, Datadog, Prometheus, etc.