Sr. Backend Operations Engineer - Coverstar

A16Z Speedrun

Worldwide · Remote · 1y ago

About the role


Responsibilities:

  • Expand and enhance our Grafana/Prometheus monitoring solution.
  • Consolidate logs, metrics, and system health data for actionable insights and streamlined troubleshooting.
  • Configure automated alerts based on predefined thresholds and anomaly detection to ensure rapid incident response.
  • Diagnose and resolve infrastructure incidents between 9AM-9PM EDT, leveraging monitoring tools and system logs.
  • Implement corrective actions and preventive measures to avoid recurrence.
  • Analyze and optimize database queries, indexing, and partitioning strategies for enhanced performance and scalability.
  • Regularly inspect database tables, identifying areas for improvement and recommending necessary maintenance activities.
  • Monitor database usage trends to predict and proactively address scaling needs, preventing performance issues.
  • Improve platform-wide security monitoring with real-time analytics and automated anomaly detection to quickly identify and respond to threats.
  • Use security tools to simulate realistic attack scenarios and uncover vulnerabilities.
  • Conduct ongoing vulnerability assessments and automated penetration testing.
  • Strengthen and document incident response procedures, ensuring clear cross-team communication and swift incident remediation.
  • Develop and maintain robust CI/CD pipelines for efficient code integration, testing, and deployment.
  • Implement and integrate comprehensive testing frameworks, including unit, integration, and end-to-end tests, ensuring high-quality code delivery.
  • Collaborate with teams to enforce industry-standard security checks and continuous monitoring across the software delivery lifecycle.


Qualifications:

  • Extensive experience in backend infrastructure operations, including monitoring, incident management, database optimization, and security.
  • Strong proficiency with Grafana, Prometheus, PostgreSQL (Aurora), and CI/CD pipeline tools.
  • Proven ability to implement proactive security measures and conduct continuous assessments.
  • Excellent problem-solving and incident management skills.
  • Strong collaboration and communication skills, capable of cross-team coordination and documentation.
  • Availability during core operational hours (9AM-9PM EDT).
