Back from the 2026 Databricks Data + AI Summit
A recap of the Databricks Data+AI Summit 2026 in San Francisco, covering the keynote announcements from Genie ONE and the AI Gateway to Reyden and...
Read more →
Explore all articles about data science, machine learning, and AI
A practical walkthrough of Python packaging: project layout, pyproject.toml, publishing to PyPI, testing with pytest, documentation with Sphinx, and CI/CD with GitHub Actions illustrated through...
Read more →
A practical walkthrough of text embeddings from fundamentals to production, covering model selection, vector storage, similarity metrics, and validation through a real cold-start content...
Read more →
I spent a weekend rebuilding my Jekyll blog using Claude Code and vibe coding. The result? A functional website in days instead of months. But...
Read more →
Debunking the AI failure myth by analyzing the MIT report, discussing ROI measurement, integration challenges, and what really makes data systems succeed.
Read more →
After years of building in-house ML platforms, we migrated to Databricks in December 2024. This handbook shares practical tips and tricks for working with the...
Read more →
Building a French TV show dataset using LLMs for HTML parsing, comparing zero-shot and few-shot prompting, and exploring fine-tuning for data extraction.
Read more →
Reflections on a decade in data science and AI, covering technology trends, organizational changes, project management, and lessons learned across industries.
Read more →
Key highlights from RecSys 2024 including advances in matrix factorization, LLM integration, and the latest research from Netflix, Spotify, and GroupLens.
Read more →
A comprehensive overview of matrix factorization techniques for recommender systems, from SVD to deep learning approaches, with practical implementation tips.
Read more →
Building a machine learning playground for the Suika Game using physics simulation, creating baseline agents, and setting up an experiment framework.
Read more →
A comprehensive guide to designing recommender systems in 2024, covering core features, design principles, and practical implementation strategies.
Read more →
It has been one year since I got involved in organizing two meetup groups in MTL: PyData MTL and MLOps Community...
Read more →
Experimenting with OpenAI's Whisper to transcribe French podcasts, comparing different deployment strategies, and benchmarking costs and performance.
Read more →
Exploring the roles of data scientists and machine learning engineers, their differences, and how they complement each other in modern ML projects.
Read more →
Key takeaways from the Apply(ops) 23 conference, featuring insights from Uber, Lidl, Hello Fresh, and Pinterest on MLOps platforms, multi-cloud strategies, and production ML at scale....
Read more →
Comprehensive recap of RecSys 2023 covering industry practices, reproducibility research, new datasets, transformers in recommendations, and practical insights from Netflix, BBC, and Pinterest.
Read more →
Explore fallback strategies and serving rules in recommender systems. Learn how these underestimated pillars ensure reliable predictions in production environments.
Read more →
Five years of MLOps journey at Ubisoft building ML platforms for video games. Insights on challenges, tools, workflows, and lessons learned bringing ML to production....
Read more →
Tackle the cold start problem in recommender systems using transformers. Build a Marvel Snap deck recommendation system handling new cards with embedding techniques.
Read more →
Explore bringing machine learning from R&D to production at Ubisoft. Learn about creating an ML platform to support data scientists building production-ready pipelines.
Read more →
The following article will focus on my first experiences as an ML practitioner in Unity, a popular game engine. First, we'll start by introducing game...
Read more →
Hello, it is autumn, and autumn means RecSys time; this year Seattle was the place to be. I remotely attended the...
Read more →
I recently decided to experiment with Docker containers to build standalone applications to optimize the operation flow of my different data/scraper pipelines. I have limited...
Read more →
For a few months, I have wanted to test DVC, a versioning toolkit for ML projects built by Iterative. I tried it a bit at...
Read more →
Explore Surprise, a Python scikit package for building recommender systems on explicit ratings. Learn collaborative filtering and two-stage recommender implementations.
Read more →
Recently I heard about a package developed by Facebook Research (Meta Research!?) called KATS, released by Facebook's Infrastructure Data Science team at the end of last...
Read more →
Hello, in this article, I will give you a quick tour of a project that I recently resurrected from the dead to collect the French...
Read more →
I have wanted for a long time to participate thoughtfully in a Kaggle competition (I think I ran some tests a few years ago but...
Read more →
Once again this year, I attended RecSys 2021 in Amsterdam (virtually) with some of my colleagues. In this article, I will recap exciting papers...
Read more →
I recently started to prototype an image classifier at work, and this led me to the fastai package that I had in my backlog...
Read more →
I have focused on this topic for the past three years at Ubisoft, but I never found suitable datasets to use for experiments on my blog...
Read more →
For a few weeks, I have wanted to write about the ML/DS libraries on my backlog of things to try. One article per library...
Read more →
Hello readers, I have long wanted to write an article on an AWS service that I use in my daily job, called EMR....
Read more →
This article describes the RecSys conference that took place virtually in September 2020 (thanks to Ubisoft for offering me the...
Read more →
This article gives an overview of the AWS SageMaker service. The idea is to see, from my DS perspective, how...
Read more →
This article is part of my annual dive into R; the idea is to use two R libraries for time-series forecasting and causal...
Read more →
This article introduces the Neo4j graph database and shows how to leverage the technology for analytics and recommendation purposes.
Read more →
In this article, I am going to present some findings from my exploration of TensorFlow; the idea is to use TensorFlow to build...
Read more →
In this article, I am going to illustrate some of the work on music information that I have been doing for the past few weeks, applied to...
Read more →
In this article, I am going to describe my hands-on experience with a new library recently open-sourced by Netflix to operate and version machine...
Read more →
In this article, I am going to present a pipeline that I built a few weeks ago to collect data (text and pictures) from the...
Read more →
Hello, in this article, I am going to detail a dataset that I built a few weeks ago on the game Hearthstone.
Read more →
The version used for this article is mlflow 1.4.0
Read more →
Hello, the Open season is starting again this year (once again!?), so I am writing this article to:
Read more →
For this article, I am going to start the analysis of the data extracted with the pipeline explained in this article. The goal of this...
Read more →
Since I published the article on the London smart meter and the possible analysis of the data, I have regularly received messages from people that...
Read more →
I started this project in response to the Kaggle competition related to PUBG, where the goal was to predict the player rank in the match,...
Read more →
Learn how to build a web scraping system to collect and analyze Crossfit Open data, including athlete profiles, gym information, and performance metrics from the...
Read more →
Learn how to build an interactive dashboard using Dash (Plotly) to visualize personal fitness and health data from Nokia devices, Strava, and Crossfit sessions.
Read more →
Hello, the goal of this article is to offer a clear description of the dataset that I uploaded in November 2017 on Kaggle followed by...
Read more →
Hello readers, for this article I am going to explain my approach to creating a forecasting system for French (metropolitan) energy consumption. This kind...
Read more →
Hello reader, in this article I will explain my approach to deploy a chatbot in Python on the Messenger platform.
Read more →
Welcome to my blog where I share projects and insights from my work as a data scientist at EDF Energy in England, with a focus...
Read more →