Senior Data Scientist

Pune City · 1mo ago

Seniority: Senior

About the role

<h2>The Role</h2> <p>The <strong>Senior Data Scientist </strong>will lead the maturation of Securly's content classification system — building the ML infrastructure that determines, at scale, whether web content is appropriate for K-12 students, and establishing the rigorous evaluation framework that product and leadership teams depend on.</p> <p>This is applied ML with direct student safety impact — not research. You will lead a significant uplift of Securly's classification models: refactoring binary models to proper multiclass classification, building labeled evaluation datasets, and producing standardized model cards with per-category precision, recall, F1, and confusion matrix analysis.</p> <p>At L5, you are the technical leader of the data science function for content safety. You will define the evaluation methodology the team follows, set the standard for what a model card must contain before a model ships, mentor the team on applied ML rigor, and serve as the interface between data science and engineering on production integration constraints.</p> <div class="meta-item">Level: <strong>L5</strong></div> <div class="meta-item">Experience: <strong>8–15 Years</strong></div> <div class="meta-item">Location: <strong>Pune, India</strong></div> <div class="meta-item">Work Type: <strong>Hybrid (2 days onsite)</strong></div> <div class="meta-item">Reports To: <strong>Engineering Manager, Data Platform</strong></div> <h2>What It Means to Be L5 at Securly</h2> <p>L5 at Securly is a Staff Engineer. 
You are the technical owner, not just an implementer.</p> <ul> <li>Drive technical direction for your initiative end-to-end: from architecture to production, with minimal oversight from your engineering manager.</li> <li>Identify and resolve ambiguity in requirements, system boundaries, and design tradeoffs without waiting for a fully-formed spec.</li> <li>Mentor L3/L4 engineers on the team: code reviews, design feedback, pairing, and raising the bar for what production-quality work looks like.</li> <li>Partner with your L6 technical lead and the Distinguished Engineer on architectural decisions, surfacing tradeoffs clearly rather than deferring them upward.</li> <li>Contribute to cross-team engineering standards: you are expected to influence practices beyond your immediate squad.</li> <li>Translate technical context into clear written artifacts that non-engineers (PM, Support, Leadership) can act on.</li> <li>Participate in on-call rotation and own the full incident lifecycle for your system: detection, diagnosis, resolution, and retrospective.</li> </ul> <h2>What You'll Do</h2> <ul> <li>Define the evaluation methodology for content classification at Securly: establish what a model card must contain and hold every model release to that standard before it ships.</li> <li>Lead the multiclass refactor of Securly's content classification models: redesign binary models to handle multi-label, multi-class content categories (Adult Content, Violence, Self-Harm, Social Media, and others).</li> <li>Build and maintain labeled evaluation datasets with robust annotation workflows; address class imbalance and label noise systematically; document dataset curation decisions in a versioned data card.</li> <li>Connect offline evaluation to production monitoring — surface classification drift and error patterns before they become customer-facing issues.</li> <li>Investigate and resolve misclassification errors: false positives (over-blocking) and false negatives (under-blocking); 
produce written root cause analyses.</li> <li>Build and maintain training data pipelines: ingestion, cleaning, labeling, and versioning at scale.</li> <li>Mentor the existing AI team on evaluation methodology, model development practices, and data science communication rigor.</li> <li>Communicate precision/recall tradeoffs to product managers and engineers; produce executive-level summaries of classification quality for leadership.</li> <li>Collaborate with engineering to integrate model outputs into the production filtering stack with appropriate latency and reliability constraints.</li> <li>Research and prototype improvements: feature representations, model architectures, active learning for label efficiency, domain adaptation for emerging content categories.</li> </ul> <h2>Skills & Requirements</h2> <div class="skill-section must"> <h3>Must-Have</h3> <ul> <li>Machine learning — multi-label/multi-class classification, model evaluation methodology, handling class imbalance, feature engineering for text and URL data. 5+ years in applied ML roles.</li> <li>Python (ML stack) — production-quality code: scikit-learn, PyTorch or TensorFlow, pandas, numpy. Notebooks for exploration; production-grade pipelines for delivery.</li> <li>Text / NLP feature engineering — URL tokenization, domain analysis, HTML content features, TF-IDF or embedding-based representations for web content classification.</li> <li>ML evaluation rigor — precision/recall tradeoffs, confusion matrix analysis, offline vs. online evaluation, A/B testing, reproducible model cards. 
At L5, you define the evaluation standard.</li> <li>Data engineering for ML — training data pipelines, data versioning, handling noisy and partially labeled datasets, annotation workflow design.</li> <li>Technical communication and stakeholder influence — ability to present quantitative model quality findings to both engineering and non-technical leadership.</li> </ul> </div> <div class="skill-section pref"> <h3>Strongly Preferred</h3> <ul> <li>Large-scale classification in production — shipping models with latency and throughput constraints; understanding the gap between offline eval metrics and live production behavior.</li> <li>Active learning / annotation workflows — strategies for efficient label acquisition on large, imbalanced datasets.</li> <li>Cloud ML infrastructure — AWS SageMaker, GCP Vertex AI, or equivalent for training pipelines, experiment tracking, and model deployment.</li> </ul> </div> <div class="skill-section nice"> <h3>Nice to Have</h3> <ul> <li>Web content / URL classification domain — prior work on web categorization, safe browsing, or content policy systems.</li> <li>K-12 / CIPA compliance — understanding of regulated content categories and compliance requirements around false negative rates.</li> <li>LLM-based classification — zero-shot or few-shot content classification for emerging categories without labeled training data.</li> <li>Graph / network features — domain co-occurrence, DNS graph signals, or network-based features for domain classification at scale.</li> </ul> </div> <h2>Who You Are</h2> <ul> <li>You have shipped ML models to production and lived with the consequences — you know what model drift looks like and how to catch it before it becomes a customer issue.</li> <li>You treat evaluation as a first-class engineering artifact. A model without a model card is not finished — and you set and enforce that standard for the team.</li> <li>You define the methodology, not just apply it. 
You produce the evaluation framework that other data scientists use, and you hold them to it.</li> <li>You can communicate precision/recall tradeoffs to a product manager and to a senior engineer in the same conversation, calibrated to each audience.</li> <li>You are energized by problems with real stakes: a false negative in Self-Harm classification is not an acceptable error.</li> <li>You mentor by example and by expectation: your code, your analysis, and your documentation set the standard.</li> </ul> <h2>About Securly</h2> <div class="about"> <p>Securly processes over 1.1 billion requests and 54 TB of data per day, protecting more than 20 million students across 20,000+ schools globally. Since pioneering the first cloud-based web filter for K-12 in 2013, Securly has built one of the most trusted, high-scale platforms for student safety, wellness, and engagement. By turning data into meaningful, actionable intelligence, Securly enables schools to identify risk earlier, reduce harmful incidents, and strengthen student support.</p> <p>We are proud to be consistently recognized as a Top Place to Work, named a Top 40 Most Used EdTech platform, and included on the GSV 150 list as one of the most transformational growth companies in digital learning and workforce skills.</p> </div> <h2>Benefits</h2> <div class="benefits"> <ul> <li>Comprehensive Health Insurance (employee, parents, spouse, children)</li> <li>Accidental & Term Life Insurance</li> <li>Learning & Development reimbursement</li> <li>Paid Time Off</li> <li>Public Holidays (10+ per year)</li> <li>Retirement Benefits (EPF & gratuity)</li> <li>Parental Leave (as per statutory norms)</li> </ul> </div> <div class="eeo"><strong>Equal Opportunity Employer</strong><br>Securly is an Equal Opportunity Employer committed to inclusion, fairness, and respect. We welcome applicants from all backgrounds, identities, and experiences.</div>

