Data Scientist

https://dbs.wd3.myworkdayjobs.com/dbs_careers

Guangzhou / Guangzhou (DTC) · Hybrid · 4w ago

Employment: Full-time

About the role

Role Overview
We are seeking a versatile Data Scientist to lead the development of high-quality, audio-driven digital avatars. This role combines cutting-edge Generative AI with foundational Machine Learning to create responsive, identity-consistent virtual humans. You will bridge the gap between "brain" and "body" by integrating RAG-based agents with multimodal synthesis models (ViT/VLM) to build avatars that don't just look real—they interact intelligently.

Core Responsibilities

Multimodal Synthesis: Develop state-of-the-art (SOTA) audio-to-video pipelines using Vision Transformers (ViT) and vision-language models (VLMs) to drive lip-sync, micro-expressions, and head poses.
Intelligent Interaction: Architect RAG (Retrieval-Augmented Generation) systems using LangChain and AI Agents to provide avatars with a searchable knowledge base and autonomous reasoning capabilities.
Customized Avatar Generation: Build person-specific fine-tuning workflows (LoRA, Adapters) to ensure 1:1 identity preservation from minimal reference footage.
Hybrid Modeling: Apply a mix of Deep Learning (CNNs for texture, RNN/LSTM for temporal audio sequences) and Classical ML (XGBoost/Random Forest for metadata classification or signal gating).
End-to-End Optimization: Own the pipeline from raw audio/text input to real-time rendered output, ensuring low-latency performance on GPU clusters.

Required Technical Stack

Generative AI & Agents:
- Frameworks: Mastery of LangChain or LlamaIndex for building RAG pipelines.
- Agents: Experience deploying autonomous agents to handle multi-step reasoning tasks.
Computer Vision & Multimodal:
- Architectures: Deep expertise in ViT (feature encoding) and VLM (CLIP/BLIP for alignment).
- Deep Learning: Hands-on experience with CNNs (spatial features), RNNs/LSTMs (temporal audio-visual sync), and GANs/Diffusion.
Core Machine Learning:
- Algorithms: Proficiency in Random Forest, XGBoost, and SVMs for auxiliary data tasks (e.g., emotion classification or quality gating).
- Frameworks: PyTorch (primary), TensorFlow, and Scikit-learn.
Data & Infrastructure:
- Vector DBs: Experience with Pinecone, Milvus, or Weaviate for RAG storage.
- Tools: FFmpeg for video processing and NVIDIA DeepStream for deployment.

Qualifications

Experience: 5+ years in Data Science with a focus on Multimodal ML or Digital Humans.
Education: Master’s or PhD in CS, AI, or a related quantitative field.
Problem Solving: Proven ability to mitigate the "uncanny valley" effect through strong temporal consistency and identity-aware fine-tuning.

Location: Guangzhou (DTC)
Job: Analytics
Schedule: Regular
Employee Status: Full time
