About the role
<h2 class="iCIMS_InfoMsg iCIMS_InfoField_Job"><span style="font-size: large;">Overview</span></h2>
<p>Black Canyon Consulting <strong>(BCC)</strong> is searching for <strong>Data Engineer(s)</strong> to support our work for the National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM), an institute of the National Institutes of Health. <span class="c-message__edited_label">This is a full-time position at the NCBI in Bethesda, MD.</span></p>
<p>The NCBI is part of the National Library of Medicine (NLM) at the National Institutes of Health (NIH). <a href="https://www.ncbi.nlm.nih.gov/" target="_blank" data-saferedirecturl="https://www.google.com/url?q=https://www.ncbi.nlm.nih.gov/&source=gmail&ust=1652453937473000&usg=AOvVaw2LTP8wzuDBpFA8Iyje_vCS">NCBI</a> is the world's premier biomedical center, hosting over six million daily users seeking research, clinical, genetic, and other information that impacts biomedical research and public health. At NCBI, you can help accelerate cures for diseases! NCBI's wide range of applications, platforms (Node, Python, Django, C++, you name it), and environments (big data [petabytes], machine learning, multiple clouds) serve more users than almost any other U.S. government agency, according to <a href="https://analytics.usa.gov/" target="_blank" data-saferedirecturl="https://www.google.com/url?q=https://analytics.usa.gov/&source=gmail&ust=1652453937473000&usg=AOvVaw2IhfoazXAUq_GXdeUQll4e">https://analytics.usa.gov/</a>.</p>
<p>We attract the best people in the business with a competitive benefits package that includes medical, dental, and vision coverage, a 401k plan with employer contribution, paid holidays, vacation, and tuition reimbursement. If you enjoy being part of a high-performing, professional-services and technology-focused organization, please apply today!</p>
<div class="iCIMS_InfoMsg iCIMS_InfoMsg_Job">
<div class="iCIMS_Expandable_Container">
<div class="iCIMS_Expandable_Text">
<div class="iCIMS_InfoMsg iCIMS_InfoMsg_Job">
<div class="iCIMS_Expandable_Container">
<div class="iCIMS_Expandable_Text">
<h3><strong>Job Description </strong></h3>
<ul style="font-weight: 400;">
<li>There are multiple openings. You will work with a talented group of scientists and software developers to design, develop, test, and maintain programs for NCBI's premier biomedical data resources, such as:</li>
</ul>
<ul style="font-weight: 400;">
<ul>
<li><a href="https://pubmed.ncbi.nlm.nih.gov/" data-saferedirecturl="https://www.google.com/url?q=https://pubmed.ncbi.nlm.nih.gov/&source=gmail&ust=1669215701496000&usg=AOvVaw1uw6rawTcfC8InkkX2dXSh">PubMed</a> - with 33+ million biomedical literature citations and 5+ million daily users</li>
<li>GenBank - with over 12 trillion nucleotide bases. Part of the <a href="https://www.ncbi.nlm.nih.gov/genbank/collab" data-saferedirecturl="https://www.google.com/url?q=https://www.ncbi.nlm.nih.gov/genbank/collab&source=gmail&ust=1669215701496000&usg=AOvVaw2Gb6Wd_oJToYPNRQafe85j">International Nucleotide Sequence Database Collaboration</a> (INSDC), exchanging data daily with the DNA DataBank of Japan (DDBJ) and the European Nucleotide Archive (ENA).</li>
<li>SRA - The Sequence Read Archive, the largest publicly available repository of high-throughput sequencing data, available from multiple cloud providers and NCBI servers, and also part of the <a href="https://www.ncbi.nlm.nih.gov/genbank/collab" data-saferedirecturl="https://www.google.com/url?q=https://www.ncbi.nlm.nih.gov/genbank/collab&source=gmail&ust=1669215701496000&usg=AOvVaw2Gb6Wd_oJToYPNRQafe85j">International Nucleotide Sequence Database Collaboration</a> (INSDC).</li>
<li><a href="https://clinicaltrials.gov/" data-saferedirecturl="https://www.google.com/url?q=https://clinicaltrials.gov/&source=gmail&ust=1669215701496000&usg=AOvVaw0ibFaDmXKwLf9pAXIUAq1_">ClinicalTrials.gov</a> - providing access to both privately and publicly funded clinical trial studies around the world</li>
</ul>
<li>Specific tasks may include implementing efficient bioinformatic algorithms and facilitating the development of cloud-ready tools and pipelines to improve the performance and scalability of searching in and submitting to the more than ten terabytes of genetic sequence data at NCBI.</li>
</ul>
<h3><strong>Required Skills</strong></h3>
<ul style="font-weight: 400;">
<li>Proficiency in Python</li>
<li>Experience with MS SQL Server and relational database design and optimization</li>
<li>Programming experience in a Linux environment, including shell scripting (e.g., Bash)</li>
<li>Experience in handling large amounts of data</li>
<li>Ability to work with common structured documents (at least one of XML, JSON)</li>
<li>Experience with CI/CD pipelines, unit tests, integration, and regression testing</li>
</ul>
<h3><strong>Desired Skills</strong></h3>
<ul style="font-weight: 400;">
<li>Experience with cloud technologies:</li>
</ul>
<ul style="font-weight: 400;">
<ul>
<li>AWS: EC2, S3, Lambda</li>
<li>GCP: GKE, Cloud Storage, Cloud Functions</li>
</ul>
<li>5+ years of working with genetic and biological data</li>
<li>Familiarity with NGS computational tools and formats (BWA, GATK, Galaxy, etc.)</li>
<li>Demonstrated active involvement in open-source communities (GitHub, etc.)</li>
<li>Experience managing production workflows for online public databases</li>
<li>Experience with RESTful API design</li>
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
Perks & benefits
- 401k
- Vision Insurance
- Paid Time Off