Data Scientist
ardentprinciples
McLean · On-site · 3w ago
- Employment: Full-time
About the role
Key Responsibilities
- Building production data pipelines and ETL/ELT workflows at scale.
- Using Apache Spark and PySpark for distributed data processing.
- Advanced Python programming, including data manipulation libraries (Pandas, NumPy) and data engineering best practices.
- Understanding data security, privacy, governance, and compliance principles.
- Workflow orchestration tools such as Step Functions and Airflow.
- Containerization such as Docker or Podman, and deploying data applications in cloud environments.
- AWS services (in particular S3, Lambda, and Step Functions).
- PostgreSQL and MySQL in production environments, including performance tuning and schema design.
- SQL and query optimization for complex analytical workloads.
- Version control (Git) and CI/CD practices for data pipelines.
- Working with stakeholders to understand data requirements, assess feasibility, and design appropriate solutions with minimal oversight.
- Strong problem-solving and debugging skills for data quality issues, pipeline failures, and performance bottlenecks.
Highly Desired Qualifications
- Data lakehouse architectures using Apache Iceberg.
- Configuring, deploying, and integrating data platform components: Apache Ranger (access control and data governance), Trino (distributed SQL query engine), data catalogs (Unity Catalog OSS, Apache Polaris, etc.), and Apache Superset (data visualization and dashboarding).
- Bash scripting for automation and data processing tasks.
- Infrastructure as Code (Terraform or CloudFormation) for data infrastructure.
- Tracking data lineage and associated tooling such as OpenLineage.
- Using Java.
- Data quality frameworks, testing methodologies, and validation strategies.
- Background with large-scale data migrations or platform modernization efforts.
- Integrating AI/ML services and models (translation, OCR, speech-to-text, NLP, language detection, topic modeling), LLMs, and RAG (retrieval-augmented generation) pipelines.
- Geospatial data processing (H3, PostGIS, or similar).
- Contributing to data engineering documentation, best practices, or design patterns.
- NoSQL databases (DynamoDB, etc.).
- Excellent written and verbal communication skills with both technical and non-technical audiences.
- Linux operating systems.
- Agile/Scrum development methodologies in a fast-paced, collaborative team environment.
- Working effectively in high-performing, cross-functional teams with multiple concurrent projects.
- Working directly with stakeholders to gather requirements, understand needs, and translate them into technical solutions with minimal oversight.
- Self-directed work with a strong ownership mentality and commitment to code quality, testing, and documentation.
- Context-switching between projects and systems as priorities demand.
What We Offer You
- Highly Competitive Salary: Recognizing and rewarding your expertise and contributions.
- Generous Paid Time Off: Providing ample time for rest, relaxation, and personal pursuits.
- Dedicated Training Budget: Supporting continuous learning and professional development.
- 100% Employer-Covered Family Vision, Dental, and Health Insurance: Ensuring comprehensive health coverage for you and your family.
- 100% Employer-Covered Life and Disability Insurance: Offering financial security and peace of mind.
- 401(k) Plan with a 6% Employer Match: Helping you plan and save for a secure retirement, with 100% vesting from day one.
- 11 Paid Government Holidays: Observing national holidays to ensure time off with family and friends.
- Spot Bonuses for Exceptional Performance: Rewarding outstanding contributions and achievements.
Perks & benefits
- 401k
- Medical Insurance
- Paid Time Off
- Pension Matching
- Learning Budget