Staff Scientist
BCC-NIH
Bethesda, MD
2mo ago
- Seniority: Staff
About the role
<h2 class="iCIMS_InfoMsg iCIMS_InfoField_Job"><span style="font-size: large;">Overview</span></h2>
<p>Black Canyon Consulting is seeking a <strong>Staff Scientist</strong> to work with a Principal Investigator at the National Library of Medicine, part of the National Institutes of Health, to support the development of high-fidelity artificial intelligence models designed to decode the functional landscape of the human and mouse genomes. This effort will leverage Telomere-to-Telomere (T2T) reference assemblies to advance understanding of gene regulation, particularly within complex and repetitive genomic regions.</p>
<p>This position requires a unique combination of computational genomics expertise, machine learning proficiency, and scalable software engineering capabilities to support large-scale data integration and model development.</p>
<h3><strong>Responsibilities</strong></h3>
<ul>
<li>Lead the design, development, and implementation of AI-driven models for gene regulation analysis</li>
<li>Architect and scale a TREDNet-based framework for cloud-native execution</li>
<li>Optimize models for distributed, multi-GPU training environments</li>
<li>Integrate and analyze large-scale genomic and epigenomic datasets, including:
<ul>
<li>ENCODE / modENCODE</li>
<li>NIH Roadmap Epigenomics</li>
<li>UCSC Genome Browser</li>
</ul>
</li>
<li>Apply AI methodologies to functionally annotate repetitive genomic regions, including centromeres and telomeres</li>
<li>Develop and maintain scalable, containerized pipelines using Docker and/or Singularity</li>
<li>Implement MLOps best practices, including experiment tracking, model versioning, and reproducibility</li>
<li>Deploy and manage workflows in cloud environments (AWS, GCP, or Azure)</li>
<li>Collaborate with interdisciplinary teams across computational and life sciences domains</li>
</ul>
<h3><strong>Required Qualifications</strong></h3>
<ul>
<li>PhD in Computer Science, Computational Biology, Bioinformatics, or a related field</li>
<li>Minimum of 5 years of experience developing and deploying machine learning or deep learning models</li>
<li>Strong experience with cloud platforms (AWS, GCP, or Azure)</li>
<li>Proficiency in deep learning frameworks (PyTorch preferred; TensorFlow or Hugging Face acceptable)</li>
<li>Deep understanding of neural network architectures (CNNs, transformers, sequence models)</li>
<li>Strong programming skills in Python and experience working in Linux-based environments</li>
<li>Experience with MLOps practices, including experiment tracking and model versioning</li>
<li>Experience building and deploying containerized workflows (Docker and/or Singularity)</li>
<li>Experience with distributed training across GPUs or multi-node environments</li>
<li>Strong knowledge of genomics, gene regulation, and epigenomics</li>
<li>Experience working with large-scale biological datasets (e.g., ENCODE, Roadmap Epigenomics, UCSC Genome Browser)</li>
<li>Familiarity with genomics data formats (FASTA, VCF, BAM/CRAM, BED)</li>
</ul>
<h3><strong>Preferred Qualifications</strong></h3>
<ul>
<li>Experience with Telomere-to-Telomere (T2T) genome assemblies</li>
<li>Experience analyzing repetitive genomic regions (e.g., centromeres, telomeres)</li>
<li>Background in regulatory, functional, or comparative genomics (e.g., human vs. mouse)</li>
<li>Experience with hyperparameter tuning and large-scale model optimization</li>
<li>Familiarity with genomic foundation models or sequence-based deep learning approaches</li>
<li>Experience running ML workloads on GPU-enabled cloud or HPC environments</li>
<li>Familiarity with workflow orchestration tools (e.g., Nextflow, Snakemake, Airflow)</li>
<li>Experience transitioning research models into production-grade pipelines</li>
<li>Familiarity with CI/CD and infrastructure-as-code tools (e.g., Terraform)</li>
<li>Experience working in interdisciplinary teams</li>
</ul>
<h3><strong>Deliverables</strong></h3>
<ul>
<li>Develop a containerized (Docker/Singularity) TREDNet pipeline capable of scaling across multiple GPU nodes in a cloud environment</li>
<li>Produce a comprehensive functional map of the T2T reference genome, identifying regulatory motifs in previously unresolved regions</li>
<li>Develop comparative models between human and mouse cell lines to identify conserved regulatory mechanisms</li>
</ul>
<h2><strong>Benefits and Salary</strong></h2>
<p>We attract the best people in the business with our competitive benefits package, including medical, dental, and vision coverage; a 401(k) plan with employer contribution; paid holidays and vacation; and tuition reimbursement.</p>
<p>We offer a competitive salary commensurate with experience and location. The targeted range for this position is $110,000 - $140,000.</p>
<p>If you enjoy being part of a high-performing, professional, technology-focused organization, please apply today!</p>
Perks & benefits
- 401k
- Vision Insurance
- Paid Time Off