Diabolocom - AI Research Engineer

AI Research Engineer

Paris

AI – Engineering / Full-time / Remote

Submit your application

Resume/CV ✱
Full name✱
Email✱
Phone ✱
Current location
Current company

Salary expectations

What would be your salary expectations? (EUR gross per year)✱

Research Experience

In which domains have you contributed to SOTA or production-grade research?✱
- NER / Information Extraction (Zero-shot/Few-shot focus)
- RAG (Hybrid Search, Re-ranking, Dense Retrieval)
- Agentic Reasoning (Chain-of-Thought, ReAct, Formal Verification)
- Generative Modeling / Synthetic Data (GANs, VAEs, LLM-distillation)
- ASR / Audio Foundations (CTC, Whisper-variants, Diarization)
- Efficient ML (Quantization, PEFT, Knowledge Distillation)
- No prior experience in research-to-production pipelines
Explain the difference between Encoder-only, Decoder-only, and Encoder-Decoder architectures. Also explain in what research scenario you would choose an Encoder-only model over a Decoder-only model for a classification task, and why.✱
Synthetic Data Generation: Describe a time you generated synthetic data for a project. How did you ensure diversity (preventing the model from repeating patterns)? How did you validate the quality of the generated data without manually checking every row? If you have no experience with this, please specify.✱
Agentic System Architecture: You are building an agent that must call a set of external APIs. How do you ensure the LLM outputs valid JSON for each API? How do you handle "State"? (e.g., the user provides the Order ID in message 1, but confirms the cancellation in message 3).✱
Latency vs. Quality in Real-Time Systems: We need a chatbot that processes voice transcripts in real time. The goal is low latency (Time-To-First-Token) without sacrificing accuracy. How would you architect this? How does your design change if the input text is noisy (ASR errors like "uhm", "ah", or misspelled entities)?✱
Evaluation & Benchmarking: Have you trained an LLM for a specific task (e.g., extracting ticket details)? How did you go about training the model? What was the data? How did you evaluate the model beyond just loss curves? Were standard benchmarks (like MMLU) relevant there? If not, what specific metrics would you design for this task? If you have no experience with this, please specify.✱
Multilingual Adaptation: An LLM is trained on English and French. You need to adapt it to Japanese. What specific challenges arise regarding tokenization for Japanese compared to European languages? How would you efficiently adapt the model without full re-training?✱
