
About the role
<div class="content-intro"><p>Innodata (Nasdaq: INOD) is a global data engineering company. We believe that data and Artificial Intelligence (AI) are inextricably linked. Our mission is to enable the responsible advancement of artificial intelligence by providing the data, evaluation frameworks, and human expertise required to build AI systems that can be trusted at scale. We provide a range of transferable solutions, platforms, and services for Generative AI / AI builders and adopters. In every relationship, we honor our 36+ year legacy of delivering the highest-quality data and outstanding outcomes for our customers.</p></div><p><strong>Scope of the Role:</strong></p>
<p>We are seeking a Data Engineer to design and build enterprise data warehouses, data lakes, and pipelines that power data-driven decision-making for data center supply chain and real estate operations. This role is responsible for creating scalable, secure, and optimized ETL infrastructure on GCP/AWS, while enabling advanced AI/ML use cases such as RAG, copilots, and agentic AI for predictive analytics and workflow automation.</p>
<p><strong>What You’ll Own:</strong></p>
<ul>
<li>Design and implement data-driven solutions on GCP including BigQuery, Cloud Storage, Dataflow, Pub/Sub, and Looker/BI.</li>
<li>Build ETL scripts using SQL and Python to extract, clean, and transform structured and unstructured data from ERP, procurement, logistics, and facility management systems.</li>
<li>Develop and optimize data pipelines for ingestion, transformation, and loading into enterprise data lakes and warehouses.</li>
<li>Build and extend end-to-end data and BI solutions, spanning extraction, storage, transformation, and visualization layers.</li>
<li>Partner with supply chain, real estate, and AI/ML teams to provide pipelines for AI solutions (e.g., RAG ingestion, Copilot integration, multi-agent workflows).</li>
<li>Ensure data governance, lineage, and compliance across supply chain datasets.</li>
<li>Continuously optimize query performance, ETL processes, and pipeline reliability.</li>
</ul>
<p><strong>You’ll Thrive in This Role If You Have:</strong></p>
<ul>
<li>Advanced proficiency in SQL (complex queries, optimization) and Python (data engineering, scripting, APIs).</li>
<li>Experience building ETL/ELT pipelines operating on structured and unstructured data sources.</li>
<li>Knowledge of enterprise data warehouse and data lake architectures.</li>
<li>Exposure to data pipelines for AI/ML (vector DB ingestion, embeddings, RAG pipelines, copilots, agents).</li>
<li>Familiarity with supply chain or data center operations data is a strong plus.</li>
<li>Bonus: experience with ML Engineering, data visualization tools (Looker, Tableau, Power BI) and MLOps practices.</li>
<li>Strong hands-on expertise with GCP services: BigQuery, Dataflow, Pub/Sub, Cloud Storage, Looker/BI (or similar, preferred).</li>
</ul>
<p><em>The expected salary range for this position is $100,000 – $120,000 USD per year, based on experience, skills, and qualifications.</em></p>
<div class="content-conclusion">
<p class="x_elementToProof"><em>Please be aware of recruitment scams involving individuals or organizations falsely claiming to represent employers. Innodata will never ask for payment, banking details, or sensitive personal information during the application process. To learn more about how to recognize job scams, please visit the Federal Trade Commission’s guide at </em><a href="https://consumer.ftc.gov/articles/job-scams" target="_blank">https://consumer.ftc.gov/articles/job-scams</a><em>.</em></p>
<p class="x_elementToProof"><em>If you believe you’ve been targeted by a recruitment scam, please report it to Innodata at </em><a href="mailto:verifyjoboffer@innodata.com" target="_blank">verifyjoboffer@innodata.com</a><em> and consider reporting it to the FTC at </em><a href="http://reportfraud.ftc.gov/" target="_blank">ReportFraud.ftc.gov</a><em>.</em></p>
</div>