Data Scientist

ardentprinciples

McLean · On-site · 3w ago
Employment
Full-time

About the role

Key Responsibilities

  • Building production data pipelines and ETL/ELT workflows at scale.
  • Using Apache Spark and PySpark for distributed data processing.
  • Advanced Python programming, including data manipulation libraries (Pandas, NumPy) and data engineering best practices.
  • Understanding data security, privacy, governance, and compliance principles.
  • Workflow orchestration tools such as Step Functions and Airflow.
  • Containerization with tools such as Docker or Podman, and deploying data applications in cloud environments.
  • AWS services (in particular S3, Lambda, and Step Functions).
  • PostgreSQL and MySQL in production environments, including performance tuning and schema design.
  • SQL and query optimization for complex analytical workloads.
  • Version control (Git) and CI/CD practices for data pipelines.
  • Working with stakeholders to understand data requirements, assess feasibility, and design appropriate solutions with minimal oversight.
  • Strong problem-solving and debugging skills for data quality issues, pipeline failures, and performance bottlenecks.

Highly Desired Qualifications

  • Data lakehouse architectures using Apache Iceberg.
  • Configuring, deploying, and integrating data platform components: Apache Ranger (access control and data governance); Trino (distributed SQL query engine); data catalogs (Unity Catalog OSS, Apache Polaris, etc.); and Apache Superset (data visualization and dashboarding).
  • Bash scripting for automation and data processing tasks.
  • Infrastructure as Code (Terraform or CloudFormation) for data infrastructure.
  • Tracking data lineage and associated tooling such as OpenLineage.
  • Using Java.
  • Data quality frameworks, testing methodologies, and validation strategies.
  • Background with large-scale data migrations or platform modernization efforts.
  • Integrating AI/ML services and models (translation, OCR, speech-to-text, NLP, language detection, topic modeling), LLMs, and RAG (retrieval-augmented generation) pipelines.
  • Geospatial data processing (H3, PostGIS, or similar).
  • Contributing to data engineering documentation, best practices, or design patterns.
  • NoSQL databases (DynamoDB, etc.).
  • Excellent written and verbal communication skills with both technical and non-technical audiences.
  • Linux operating systems.
  • Agile/Scrum development methodologies in a fast-paced, collaborative team environment.
  • Working effectively in high-performing, cross-functional teams with multiple concurrent projects.
  • Working directly with stakeholders to gather requirements, understand needs, and translate them into technical solutions with minimal oversight.
  • Self-directed work with a strong ownership mentality and commitment to code quality, testing, and documentation.
  • Context-switching between projects and systems as priorities demand.

What We Offer You

  • Highly Competitive Salary: Recognizing and rewarding your expertise and contributions.
  • Generous Paid Time Off: Providing ample time for rest, relaxation, and personal pursuits.
  • Dedicated Training Budget: Supporting continuous learning and professional development.
  • 100% Employer-Covered Family Vision, Dental, and Health Insurance: Ensuring comprehensive health coverage for you and your family.
  • 100% Employer-Covered Life and Disability Insurance: Offering financial security and peace of mind.
  • 401(k) Plan with a 6% Employer Match: Helping you plan and save for a secure retirement, with 100% vesting from day one.
  • 11 Paid Government Holidays: Observing national holidays to ensure time off with family and friends.
  • Spot Bonuses for Exceptional Performance: Rewarding outstanding contributions and achievements.

Perks & benefits

  • 401k
  • Medical Insurance
  • Paid Time Off
  • Pension Matching
  • Learning Budget
