About the role
Company Description
We are SkillOnNet, a leader in iGaming entertainment. We provide our customers with the most entertaining and trustworthy experience possible while reinventing the gambling industry. We are home to more than 30 well-known brands, including PlayOJO, DruckGluck, BacanaPlay, Genting, and many more. We are committed to long-term development and sustainability, and we are working to revolutionize our industry for the benefit of our players, ourselves, and the entertainment industry as a whole.
Job Description
We are seeking a skilled and proactive DevOps Engineer to join our growing Tech team. This role is critical in ensuring the reliability, scalability, and performance of our infrastructure and development pipelines. You will work closely with the Heads of Development, System Administrators, and the Information Security team to maintain and evolve our systems and services.
Responsibilities include:
Kubernetes (K8s) Administration: Deploy, manage, scale, and troubleshoot Kubernetes clusters, ensuring high availability, performance, and reliability across production and non-production environments.
Observability & Monitoring: Design, maintain, and upgrade monitoring and alerting platforms using Grafana, Prometheus, and Alertmanager to improve system visibility, incident detection, and operational efficiency.
Containerization & Image Management: Build, optimize, secure, and maintain Docker images and containerized applications, following best practices for performance, scalability, and vulnerability management.
CI/CD Automation: Design and maintain GitHub Actions workflows to automate testing, integration, deployment, and release processes, reducing manual intervention and accelerating software delivery.
Infrastructure Operations: Manage Linux servers and infrastructure components, including performance tuning, capacity planning, backup strategies, disaster recovery, and system hardening.
Ceph Storage Administration: Deploy, maintain, and optimize Ceph distributed storage clusters, ensuring data durability, scalability, and high availability for critical workloads.
Cloudflare Platform Management: Configure and manage Cloudflare services, including DNS, WAF, CDN, SSL/TLS, caching policies, and Workers to enhance security, performance, and application availability.
Security & Reliability Engineering: Implement infrastructure security best practices, automate compliance controls, manage secrets, and improve system resilience through proactive monitoring and incident response.
Infrastructure as Code (IaC): Automate infrastructure provisioning and configuration management using tools such as Terraform and Ansible to ensure consistency, repeatability, and scalability.
Incident Management & Troubleshooting: Lead root-cause analysis, resolve production issues, and implement preventive measures to minimize downtime and improve platform stability.
What we are looking for:
Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field, or equivalent practical experience.
Proven experience in DevOps, Site Reliability Engineering (SRE), Platform Engineering, or Infrastructure Operations roles.
Strong hands-on expertise with Kubernetes and Docker, including deploying, scaling, troubleshooting, and maintaining containerized workloads in production environments.
Experience designing and managing CI/CD pipelines, preferably using GitHub Actions, with a focus on automation, reliability, and deployment best practices.
Solid knowledge of observability and monitoring platforms, including Grafana, Prometheus, and Alertmanager, with experience implementing effective monitoring and alerting strategies.
Experience administering distributed storage systems such as Ceph, including performance optimization, capacity planning, and high-availability configurations.
Strong understanding of Linux system administration, networking, security best practices, and infrastructure troubleshooting.
Experience managing servers, backups, disaster recovery processes, and overall infrastructure health and performance.
Hands-on experience with Cloudflare services, including DNS, WAF, CDN, SSL/TLS, caching, and security configurations.
Strong problem-solving, analytical, communication, and cross-functional collaboration skills.
Experience with Infrastructure as Code (IaC) tools such as Terraform or Ansible is highly desirable.
Familiarity with cloud platforms (AWS, Azure, or GCP) is considered an advantage.
Nice to Have
Experience with database administration, including installation, configuration, performance tuning, backup and recovery, replication, and high-availability setups for databases such as PostgreSQL, MySQL, MariaDB, or MongoDB.
Knowledge of security monitoring, vulnerability management, and compliance best practices.
Experience supporting highly available, mission-critical production environments.
What's in it for YOU!
Excellent work environment
Monetary vouchers on birthdays and other special occasions
Fully equipped kitchen and in-house entertaining space
Eligibility to enroll in the Company's medical insurance plan
Eligibility to enroll in the Company's pension plan
Exciting company activities including monthly lunches, corporate gatherings, an intercompany football team, competitions, and many other activities
Casual dress code
A chance to advance professionally inside one of the world's largest iGaming organizations
What Life at SkillOnNet is like!
SkillOnNet is a firm believer in putting people first, and our family-oriented multinational culture is what drives us. We care about our staff and make sure you have the most relevant and valuable tools, privileges, and amenities.