Data Scientist
https://dbs.wd3.myworkdayjobs.com/dbs_careers
- Employment
- Full-time
About the role
Role Overview
We are seeking a versatile Data Scientist to lead the development of high-quality, audio-driven digital avatars. This role combines cutting-edge Generative AI with foundational Machine Learning to create responsive, identity-consistent virtual humans. You will bridge the gap between "brain" and "body" by integrating RAG-based agents with multimodal synthesis models (ViT/VLM) to build avatars that don't just look real—they interact intelligently.
Core Responsibilities
Multimodal Synthesis: Develop SOTA audio-to-video pipelines using Vision Transformers (ViT) and VLMs to drive lip-sync, micro-expressions, and head poses.
Intelligent Interaction: Architect RAG (Retrieval-Augmented Generation) systems using LangChain and AI Agents to provide avatars with a searchable knowledge base and autonomous reasoning capabilities.
Customized Avatar Generation: Build person-specific fine-tuning workflows (LoRA, Adapters) to ensure 1:1 identity preservation from minimal reference footage.
Hybrid Modeling: Apply a mix of Deep Learning (CNNs for texture, RNN/LSTM for temporal audio sequences) and Classical ML (XGBoost/Random Forest for metadata classification or signal gating).
End-to-End Optimization: Own the pipeline from raw audio/text input to real-time rendered output, ensuring low-latency performance on GPU clusters.
Required Technical Stack
Generative AI & Agents:
Frameworks: Mastery of LangChain or LlamaIndex for building RAG pipelines.
Agents: Experience deploying autonomous agents to handle multi-step reasoning tasks.
Computer Vision & Multimodal:
Architectures: Deep expertise in ViT (feature encoding) and VLM (CLIP/BLIP for alignment).
Deep Learning: Hands-on experience with CNNs (spatial features), RNNs/LSTMs (temporal audio-visual sync), and GANs/Diffusion.
Core Machine Learning:
Algorithms: Proficiency in Random Forest, XGBoost, and SVMs for auxiliary data tasks (e.g., emotion classification or quality gating).
Frameworks: PyTorch (primary), TensorFlow, and Scikit-learn.
Data & Infrastructure:
Vector DBs: Experience with Pinecone, Milvus, or Weaviate for RAG storage.
Tools: FFmpeg for video processing and NVIDIA DeepStream for deployment.
Qualifications
Experience: 5+ years in Data Science with a focus on Multimodal ML or Digital Humans.
Education: Master’s or PhD in CS, AI, or a related quantitative field.
Problem Solving: Proven ability to solve the "uncanny valley" through superior temporal consistency and identity-aware fine-tuning.
Location: Guangzhou (DTC)
Job: Analytics
Schedule: Regular
Employee Status: Full time