Software Development Engineer III (ML Platform)
The Role
At Everseen, we are scaling the ML Platform that powers our global machine learning lifecycle. Partnering closely with AI Engineering, Research, and Operations, we accelerate engineering iteration, guarantee experiment reproducibility, and bring transparency to workloads, resource allocation, and cloud costs.
In this role, you will help shape scalable services at the intersection of infrastructure and AI, contributing to a core sub-system or a significant slice of our MLOps platform.
As a senior team member, you will guide the design and hands-on delivery of elegant, strategic code for complex system features. Working closely across teams, you will help navigate technically complex initiatives with high business impact and architectural ambiguity. Success in this role depends on mutual mentorship that fosters team growth, strong familiarity with key focus areas (such as model serving, training orchestration, or data lineage), and a shared architectural understanding of the MLOps stack.
What you’ll do
- Owns a core sub-system, a significant slice of a technology pillar, or complex business-critical features.
- Leads the design of complex features, services, or new sub-systems (e.g., component contracts, interface/API designs, SDKs, or high-throughput data pipelines).
- Directly drives progress on highly complex tasks that involve multiple cross-functional dependencies, high business impact, or ambiguous requirements.
- Supports vulnerability scanning and management, penetration testing, and incident response.
- Influences team-level architecture, implementation approaches, and continuous technical improvements.
- Contributes significantly to strategic technical decisions within your specific product, platform, or service area.
- Builds technical consensus across team members and represents your team's sub-system in cross-pillar/cross-team discussions.
- Contributes hands-on to the most technically challenging parts of the team's workload.
- Designs and implements highly reusable, efficient, and elegant code built for complex requirements and long-term system strategy.
- Improves team-level software quality by establishing stronger test coverage, advanced validation patterns, and proactive defect prevention.
- Drives improvements in team CI/CD processes and release quality through automated build, test, and deployment practices.
- Leads the diagnosis and troubleshooting of complex production issues within team systems, improving system resilience and observability.
- Utilizes production diagnostics, logs, stack traces, and system metrics to identify trends, isolate root causes, and propose engineering improvement opportunities.
- Researches, evaluates, and proposes third-party software solutions to optimize system performance and expand capabilities.
- Creates, reviews, and maintains high-quality technical documentation and durable design knowledge so that systems are easily understood and accessible.
- Mentors engineers (SDE I/II) to support their technical growth, and reviews peers' designs.
- Proactively shares skills, knowledge, and technical expertise with members of the engineering team to raise the overall quality bar.
- Actively fosters a culture of collaboration, open communication, and continuous learning within the team.
Collaborating With
- AI Engineering: Our primary consumers — you build the training orchestration, model serving, and registry they use to ship product models, and the dataset and lineage plumbing underneath.
- Research: You give ML researchers fast, reproducible experimentation — experiment tracking, GPU access, and the evaluation harness that turns ideas into measured results.
- Operations: You partner with the teams running models at the edge — collecting data and helping them measure model performance.
- Annotation: You support the annotators who label our data with the annotation tooling they work in every day.
Profile and Skills
- 6+ years of relevant software engineering experience (or 3+ years of highly accelerated experience in a specialized global SaaS or high-growth technology environment).
- Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience).
Technical Skills & Competencies
- Programming Proficiency: In-depth knowledge of at least one relevant language (e.g., Python, JavaScript/TypeScript, C++), with strong programming skills to build and operate robust services.
- System Architecture: Strong understanding of distributed systems, microservices, and modern software engineering architectures.
- Linux & Systems Engineering: Excellent troubleshooting skills within Linux environments (including log investigations, performance tests, and connectivity analysis).
- Cloud & Infrastructure: Strong understanding of major cloud platforms, including container orchestration, storage services, and cloud security and scalability principles. Excellent understanding of cloud computing service models, including Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) architecture.
- Best Practices: Solid understanding of engineering best practices, CI/CD pipelines, observability setups, and operational readiness.
Soft Skills & Leadership Attributes
- Problem Solving: Exceptional practical problem-solving and analytical skills, with a proven ability to navigate ambiguity and construct efficient solutions.
- Communication: Strong written and verbal technical communication skills, including the ability to lead collaborative design discussions and constructive reviews.
- Results-Oriented: Self-learning capability with strong attention to detail, a drive to achieve objectives efficiently, and a focus on delivering high customer satisfaction.
About Everseen
Our Culture
