164. Data Engineer

sourcemeridian
Medellín · 1w ago

About the role

<h2 data-start="224" data-end="311"><strong>We’re looking for a Data Engineer to join Source Meridian.</strong></h2> <h2 data-start="318" data-end="347"><strong data-start="322" data-end="347">About Source Meridian</strong></h2> <p data-start="673" data-end="799">Source Meridian is a software development company that works to solve the healthcare industry’s most challenging problems. We are laser-focused on specific technologies in the healthcare and life sciences industries: healthcare technology, artificial intelligence, and healthcare interoperability.</p> <h2><strong>About the Role</strong></h2> <p>We're looking for a Data Engineer to help build and operate an AWS-native data platform that processes healthcare claims data and tokenized identifiers. You'll design and implement Spark-based pipelines that transform, intersect, and enrich tokenized datasets stored primarily as Parquet on S3 and queried via Athena and related AWS services. This environment intentionally avoids managed lakehouse platforms (e.g., no Databricks and no Snowflake), so you'll be doing "real" data engineering directly on AWS.</p> <h2 data-start="1253" data-end="1272">What You’ll Do</h2> <ul data-start="1273" data-end="2167"> <li data-start="1273" data-end="1358"> <p data-start="1275" data-end="1358">Build and maintain Spark pipelines to process large-scale Parquet datasets on S3.</p> </li> <li data-start="1359" data-end="1481"> <p data-start="1361" data-end="1481">Implement tokenization workflows, including transit token → real token conversion and dataset intersection/join logic.</p> </li> <li data-start="1482" data-end="1612"> <p data-start="1484" data-end="1612">Process and deliver healthcare claims datasets for matched individuals, ensuring accurate identity mapping and data integrity.</p> </li> <li data-start="1613" data-end="1713"> <p data-start="1615" data-end="1713">Orchestrate data pipelines using Airflow and/or AWS-native orchestration tools when appropriate.</p> </li> <li
data-start="1714" data-end="1828"> <p data-start="1716" data-end="1828">Develop reliable, testable, and observable ETL/ELT processes (retries, idempotency, monitoring, reprocessing).</p> </li> <li data-start="1829" data-end="1932"> <p data-start="1831" data-end="1932">Optimize performance and cost across Spark jobs, S3 partitioning/layout, and Athena query patterns.</p> </li> <li data-start="1933" data-end="2032"> <p data-start="1935" data-end="2032">Contribute to dbt models when applicable (transformations, documentation, data quality checks).</p> </li> <li data-start="2033" data-end="2167"> <p data-start="2035" data-end="2167">Collaborate with cross-functional stakeholders in a healthcare environment, with a strong focus on privacy and secure data handling.</p> </li> </ul> <h2 data-start="2174" data-end="2202">Required Qualifications</h2> <ul data-start="2203" data-end="3041"> <li data-start="2203" data-end="2263"> <p data-start="2205" data-end="2263">1-2 years of professional experience in data engineering.</p> </li> <li data-start="2264" data-end="2397"> <p data-start="2266" data-end="2397">Strong experience with&nbsp;<strong data-start="2289" data-end="2305">Apache Spark</strong>&nbsp;(PySpark or Scala), including joins, intersections, partitioning, and performance tuning.</p> </li> <li data-start="2398" data-end="2728"> <p data-start="2400" data-end="2468">Strong hands-on experience with the&nbsp;<strong data-start="2436" data-end="2454">AWS data stack</strong>, including:</p> <ul data-start="2471" data-end="2728"> <li data-start="2471" data-end="2553"> <p data-start="2473" data-end="2553">Amazon S3 (Parquet datasets, partition strategies, data layout best practices)</p> </li> <li data-start="2556" data-end="2624"> <p data-start="2558" data-end="2624">Amazon Athena (SQL, query optimization, managing large datasets)</p> </li> <li data-start="2627" data-end="2728"> <p data-start="2629" data-end="2728">Familiarity with AWS-native data lake patterns (Glue
Catalog, Lake Formation concepts are a plus)</p> </li> </ul> </li> <li data-start="2729" data-end="2839"> <p data-start="2731" data-end="2839">Experience building and operating pipelines using&nbsp;<strong data-start="2781" data-end="2792">Airflow</strong>&nbsp;(DAGs, scheduling, dependencies, backfills).</p> </li> <li data-start="2840" data-end="2906"> <p data-start="2842" data-end="2906">Excellent&nbsp;<strong data-start="2852" data-end="2859">SQL</strong>&nbsp;skills and solid data modeling fundamentals.</p> </li> <li data-start="2907" data-end="3041"> <p data-start="2909" data-end="3041">Advanced English level: able to lead technical discussions, write clear documentation, and work directly with US-based stakeholders.</p> </li> </ul> <h2 data-start="3048" data-end="3065">Nice to Have</h2> <ul data-start="3066" data-end="3498"> <li data-start="3066" data-end="3134"> <p data-start="3068" data-end="3134">Experience with&nbsp;<strong data-start="3084" data-end="3091">dbt</strong>&nbsp;(core, tests, documentation, exposures).</p> </li> <li data-start="3135" data-end="3222"> <p data-start="3137" data-end="3222">Familiarity with healthcare data (claims data, eligibility, member-level datasets).</p> </li> <li data-start="3223" data-end="3315"> <p data-start="3225" data-end="3315">Experience with tokenization, identity resolution, or privacy-preserving data workflows.</p> </li> <li data-start="3316" data-end="3414"> <p data-start="3318" data-end="3414">Knowledge of AWS security concepts such as&nbsp;<strong data-start="3361" data-end="3411">IAM, KMS, encryption, and secure data handling</strong>.</p> </li> <li data-start="3415" data-end="3498"> <p data-start="3417" data-end="3498">Experience running Spark on AWS (e.g., EMR) or Spark-on-containers architectures.</p> </li> </ul> <h2 data-start="3505" data-end="3520">Tech Stack</h2> <ul data-start="3521" data-end="3719"> <li data-start="3521" data-end="3548"> <p data-start="3523" data-end="3548">AWS-native architecture</p> 
</li> <li data-start="3549" data-end="3593"> <p data-start="3551" data-end="3593">Amazon S3 + Parquet (core storage layer)</p> </li> <li data-start="3594" data-end="3626"> <p data-start="3596" data-end="3626">Amazon Athena (query engine)</p> </li> <li data-start="3627" data-end="3659"> <p data-start="3629" data-end="3659">Apache Spark (no Databricks)</p> </li> <li data-start="3660" data-end="3687"> <p data-start="3662" data-end="3687">Airflow (orchestration)</p> </li> <li data-start="3688" data-end="3719"> <p data-start="3690" data-end="3719">dbt (optional, as applicable)</p> </li> </ul> <h2 data-start="2398" data-end="2418">Soft Skills</h2> <ul data-start="2419" data-end="2675"> <li data-start="2419" data-end="2456"> <p data-start="2421" data-end="2456">Strong and empathetic leadership.</p> </li> <li data-start="2457" data-end="2497"> <p data-start="2459" data-end="2497">Proven&nbsp;<strong data-start="2466" data-end="2494">client-facing experience</strong>.</p> </li> <li data-start="2498" data-end="2537"> <p data-start="2500" data-end="2537">Excellent&nbsp;<strong data-start="2510" data-end="2534">communication skills</strong>.</p> </li> <li data-start="2538" data-end="2586"> <p data-start="2540" data-end="2586">Strong&nbsp;<strong data-start="2547" data-end="2573">expectation management</strong>&nbsp;abilities.</p> </li> <li data-start="2587" data-end="2675"> <p data-start="2589" data-end="2675">Strategic mindset with a solution-oriented approach and strong decision-making skills.</p> </li> </ul> <h2 data-start="3216" data-end="3237"><strong data-start="3220" data-end="3237">What We Offer</strong></h2> <p data-start="3239" data-end="3414">✔ Permanent contract<br data-start="3259" data-end="3262">✔ Learning and continuous growth environment 🚀<br data-start="3309" data-end="3312">✔ Benefits package focused on health and well-being 🎉<br data-start="3366" data-end="3369">✔ Competitive salary based on experience 💰</p> <p data-start="3416" data-end="3470">&nbsp;</p> 
<p data-start="3416" data-end="3470">📍&nbsp;<strong data-start="3419" data-end="3470">Apply only if you reside in Colombia or Ecuador</strong></p> <p data-start="3416" data-end="3470">&nbsp;</p> <p data-start="3477" data-end="3605">At&nbsp;<strong data-start="3480" data-end="3499">Source Meridian</strong>, you’ll be part of a high-impact&nbsp;<strong data-start="3533" data-end="3548">tech-health</strong>&nbsp;company, building products that truly make a difference.</p> <p data-start="3607" data-end="3716">If you meet the profile — or know someone who might be interested —&nbsp;<strong data-start="3675" data-end="3689">apply now!</strong></p> <p data-start="3607" data-end="3716">We’d love to meet you 💬</p>
