Dr. Kilian Fatras

Machine Learning research scientist at Dreamfold


Greetings and welcome to my personal website!

I am a machine learning research scientist at Dreamfold. My work focuses on protein design, with the ultimate goal of developing new medicines. To achieve this goal, we leverage recent generative AI frameworks.

Before that, I was a postdoctoral fellow at Mila and McGill University in Montréal, working under the supervision of Prof. Adam Oberman and Prof. Ioannis Mitliagkas. My research focused on generative models, protein design and optimal transport. I am the co-creator and a core maintainer of the TorchCFM library, which open-sources our work on novel Flow Matching generative models!
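
For curious readers, here is a minimal sketch of a Flow Matching training step with TorchCFM. The class and method names follow the library's README and may evolve between versions; v_net is a hypothetical neural vector field.

  import torch
  from torchcfm.conditional_flow_matching import ConditionalFlowMatcher

  # Source (noise) and target (data) minibatches: 64 points in 2D.
  x0 = torch.randn(64, 2)
  x1 = torch.randn(64, 2) + 4.0

  fm = ConditionalFlowMatcher(sigma=0.1)
  # Sample times t, interpolated points x_t and conditional target velocities u_t.
  t, xt, ut = fm.sample_location_and_conditional_flow(x0, x1)

  # A neural vector field v_net(t, x) is then regressed onto u_t, e.g.:
  # loss = ((v_net(t, xt) - ut) ** 2).mean()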

I completed my PhD under the supervision of Prof. Nicolas Courty and Prof. Rémi Flamary in the IRISA-INRIA Panama and Obelix teams. My research focused on the interplay between optimal transport and deep learning, with applications to domain adaptation, noisy labels and generative modelling. The recording of my thesis defense can be found on YouTube, the slides here and the manuscript here.

I hold a master's degree in applied mathematics and machine learning from both Ecole Polytechnique and ENSTA ParisTech. In 2017, I was an exchange student at UC Berkeley, where I discovered the world of machine learning.

If you are interested in learning more about my professional experience and qualifications, please refer to my resume, which can be found here.

Thank you for visiting my website and feel free to contact me with any questions or inquiries.


News!


  1. I have joined Dreamfold as a machine learning research scientist.
  2. Our paper on Flow Matching for protein backbone generation has been accepted at ICLR 2024 as a spotlight presentation! See you in Vienna!
  3. Our papers Flow Matching with XGBoost for tabular data generation and Simulation-free Schrödinger bridges via score and flow matching have been accepted to AISTATS! See you in Valencia!
  4. I successfully defended my PhD thesis "Optimal Transport and Deep Learning: Learning from one another"! You can watch the defense here.

Research interests


My work focuses on AI for Science, generative modeling, distribution shifts (domain adaptation, out-of-distribution samples, ...) and optimal transport. Recently, I have developed a strong interest in biological applications, with a focus on protein generation.


Talks


27/06/23 - Genentech : Optimal transport to learn robust representations and trajectory inferences
11/04/23 - Montréal FAIR : Evaluation of deep partial Domain Adaptation methods
28/02/23 - Montréal Microsoft AI lab seminar: Evaluation of deep partial Domain Adaptation methods
04/12/22 - Canadian Mathematical Society winter meeting 2022: Minibatch Optimal Transport distances in Deep Learning
25/11/22 - LITIS seminar: Evaluation of deep partial Domain Adaptation methods
08/11/22 - Huawei (Noah's Ark Lab) seminar: Evaluation of deep partial Domain Adaptation methods
07/04/22 - DS4DM Coffee Talks Polytechnique Montréal: Unbalanced minibatch Optimal Transport
01/09/21 - CMAP Ecole Polytechnique: Unbalanced minibatch Optimal Transport
28/04/21 - Montréal Machine Learning and Optimization (MTL MLOpt): Unbalanced minibatch Optimal Transport; applications to Domain Adaptation
09/07/19 - GDR-ISIS: Optimal transport in statistical learning and signal processing

Papers


SE(3)-Stochastic Flow Matching for Protein Backbone Generation
Avishek Joey Bose, Tara Akhound-Sadegh, Kilian Fatras, Guillaume Huguet, Jarrid Rector-Brooks, Cheng-Hao Liu, Andrei Cristian Nica, Maksym Korablyov, Michael Bronstein, Alexander Tong
(to appear) Spotlight at ICLR 2024
Keywords: Flow matching, protein backbone design, SE(3) manifold, Equilibrium conformation generation

ArXiv Code Bibtex
@misc{bose2023se3stochastic,
  title={SE(3)-Stochastic Flow Matching for Protein Backbone Generation}, 
  author={Avishek Joey Bose and Tara Akhound-Sadegh and Kilian Fatras and Guillaume Huguet and Jarrid Rector-Brooks and Cheng-Hao Liu and Andrei Cristian Nica and Maksym Korablyov and Michael Bronstein and Alexander Tong},
  year={2023},
  eprint={2310.02391},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
            

Generating and Imputing Tabular Data via Diffusion and Flow-based Gradient-Boosted Trees
Alexia Jolicoeur-Martineau, Kilian Fatras, Tal Kachman
(to appear) AISTATS 2024
Keywords: Flow matching, diffusion models, XGBoost, tabular data

ArXiv Code Bibtex
@misc{jolicoeurmartineau2023generating,
  title={Generating and Imputing Tabular Data via Diffusion and Flow-based Gradient-Boosted Trees}, 
  author={Alexia Jolicoeur-Martineau and Kilian Fatras and Tal Kachman},
  year={2023},
  eprint={2309.09968},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
            

No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths
Charles Guille-Escuret, Hiroki Naganuma, Kilian Fatras, Ioannis Mitliagkas
Preprint, 2023
Keywords: Optimization, Neural network landscape

ArXiv Code Bibtex
@misc{guilleescuret2023wrong,
  title={No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths}, 
  author={Charles Guille-Escuret and Hiroki Naganuma and Kilian Fatras and Ioannis Mitliagkas},
  year={2023},
  eprint={2306.11922},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
            

Simulation-free Schrödinger bridges via score and flow matching
Alexander Tong*, Nikolay Malkin*, Kilian Fatras*, Lazar Atanackovic, Yanlei Zhang, Guillaume Huguet, Guy Wolf, Yoshua Bengio
* Equal contribution
(to appear) AISTATS 2024 and ICML Workshop on New Frontiers in Learning, Control, and Dynamical Systems, 2023
Keywords: generative models, normalizing flows, optimal transport, single-cell dynamics

ArXiv Code Bibtex
@misc{tong2023simulationfree,
  title={Simulation-free Schr\"odinger bridges via score and flow matching}, 
  author={Alexander Tong and Nikolay Malkin and Kilian Fatras and Lazar Atanackovic and Yanlei Zhang and Guillaume Huguet and Guy Wolf and Yoshua Bengio},
  year={2023},
  eprint={2307.03672},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
            

Unbalanced Optimal Transport meets Sliced-Wasserstein
Thibault Séjourné, Clément Bonet, Kilian Fatras, Kimia Nadjahi and Nicolas Courty
ICML Workshop on New Frontiers in Learning, Control, and Dynamical Systems, 2023
Keywords: optimal transport, climate change, Wasserstein barycenter

ArXiv Bibtex
@misc{sejourne2023unbalanced,
  title={Unbalanced Optimal Transport meets Sliced-Wasserstein}, 
  author={Thibault Séjourné and Clément Bonet and Kilian Fatras and Kimia Nadjahi and Nicolas Courty},
  year={2023},
  eprint={2306.07176},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
           

PopulAtion Parameter Averaging (PAPA)
Alexia Jolicoeur-Martineau, Emy Gervais, Kilian Fatras, Yan Zhang and Simon Lacoste-Julien
Preprint, 2023
Keywords: merging models, computer vision, remote sensing

ArXiv Code Bibtex
@misc{jolicoeurmartineau2023population,
  title={PopulAtion Parameter Averaging (PAPA)}, 
  author={Alexia Jolicoeur-Martineau and Emy Gervais and Kilian Fatras and Yan Zhang and Simon Lacoste-Julien},
  year={2023},
  eprint={2304.03094},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
           

Improving and Generalizing Flow-Based Generative Models with Minibatch Optimal Transport
Alexander Tong*, Kilian Fatras*, Nikolay Malkin*, Guillaume Huguet, Yanlei Zhang, Jarrid Rector-Brooks, Guy Wolf, Yoshua Bengio
* Equal contribution
TMLR, 2024
Keywords: generative models, normalizing flows, optimal transport, single-cell dynamics

ArXiv TMLR Code Bibtex
@article{tong2024improving,
  title={Improving and generalizing flow-based generative models with minibatch optimal transport},
  author={Alexander Tong and Kilian Fatras and Nikolay Malkin and Guillaume Huguet and Yanlei Zhang and Jarrid Rector-Brooks and Guy Wolf and Yoshua Bengio},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2024},
  url={https://openreview.net/forum?id=CD9Snc73AW},
  note={Expert Certification}
}
           

A Reproducible and Realistic Evaluation of Partial Domain Adaptation Methods
Tiago Salvador*, Kilian Fatras*, Ioannis Mitliagkas, Adam Oberman
* Equal contribution
Transactions on Machine Learning Research (TMLR), 2023
Keywords: Partial domain adaptation, reproducibility, benchmark

ArXiv Code Bibtex
@article{salvador2023a,
  title={A Reproducible and Realistic Evaluation of Partial Domain Adaptation Methods},
  author={Tiago Salvador and Kilian Fatras and Ioannis Mitliagkas and Adam M Oberman},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2023},
  url={https://openreview.net/forum?id=XcVzIBXeRn}
}

          

On making optimal transport robust to all outliers
Kilian Fatras
Preprint, 2022
Keywords: Optimal Transport, Noisy labels, Generative models

ArXiv Bibtex
@misc{fatras2022robust,
  doi = {10.48550/ARXIV.2206.11988},
  url = {https://arxiv.org/abs/2206.11988},
  author = {Fatras, Kilian},
  keywords = {Machine Learning (stat.ML), Machine Learning (cs.LG), Probability (math.PR)},
  title = {On making optimal transport robust to all outliers},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
  

Optimal transport meets noisy label robust loss and MixUp for domain adaptation
Kilian Fatras, Hiroki Naganuma, Ioannis Mitliagkas
Conference on Lifelong Learning Agents (CoLLAs), 2022
Keywords: Optimal Transport, Noisy labels, MixUp, Domain Adaptation

Paper ArXiv Bibtex

    @InProceedings{fatras22aMixOT,
      title = 	 {Optimal Transport meets Noisy Label Robust Loss and MixUp Regularization for Domain Adaptation},
      author =       {Fatras, Kilian and Naganuma, Hiroki and Mitliagkas, Ioannis},
      booktitle = 	 {Proceedings of The 1st Conference on Lifelong Learning Agents},
      pages = 	 {966--981},
      year = 	 {2022},
      editor = 	 {Chandar, Sarath and Pascanu, Razvan and Precup, Doina},
      volume = 	 {199},
      series = 	 {Proceedings of Machine Learning Research},
      month = 	 {22--24 Aug},
      publisher =    {PMLR},
      pdf = 	 {https://proceedings.mlr.press/v199/fatras22a/fatras22a.pdf},
      url = 	 {https://proceedings.mlr.press/v199/fatras22a.html},
      abstract = 	 {It is common in computer vision to be confronted with domain shift: images which have the same class but different acquisition conditions. In domain adaptation (DA), one wants to classify unlabeled target images using source labeled images. Unfortunately, deep neural networks trained on a source training set perform poorly on target images which do not belong to the training domain. One strategy to improve these performances is to align the source and target image distributions in an embedded space using optimal transport (OT). To compute OT, most methods use the minibatch optimal transport approximation which causes negative transfer, i.e. aligning samples with different labels, and leads to overfitting. In this work, we mitigate negative alignment by explaining it as a noisy label assignment to target images. We then mitigate its effect by appropriate regularization. We propose to couple the MixUp regularization with a loss that is robust to noisy labels in order to improve domain adaptation performance. We show in an extensive ablation study that a combination of the two techniques is critical to achieve improved performance. Finally, we evaluate our method, called mixunbot, on several benchmarks and real-world DA problems.}
    }
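
To give an idea of the MixUp regularization this paper couples with a noisy-label robust loss, here is a minimal sketch; the mixup helper below is illustrative, not the paper's exact implementation.

  import torch

  def mixup(x1, y1, x2, y2, alpha=0.2):
      """Train on convex combinations of sample pairs; y1, y2 are soft/one-hot labels."""
      lam = torch.distributions.Beta(alpha, alpha).sample()
      x = lam * x1 + (1 - lam) * x2
      y = lam * y1 + (1 - lam) * y2
      return x, y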
  

POT: Python Optimal Transport
Rémi Flamary, Nicolas Courty, Alexandre Gramfort, Mokhtar Z. Alaya, Aurélie Boisbunon, Stanislas Chambon, Laetitia Chapel, Adrien Corenflos, Kilian Fatras, Nemo Fournier, Léo Gautheron, Nathalie T.H. Gayraud, Hicham Janati, Alain Rakotomamonjy, Ievgen Redko, Antoine Rolet, Antony Schutz, Vivien Seguy, Danica J. Sutherland, Romain Tavenard, Alexander Tong and Titouan Vayer
Journal of Machine Learning Research (JMLR) - Open Source Software, 2021

Paper Website Code Bibtex
  @article{JMLR:v22:20-451,
    author  = {R\'emi Flamary and Nicolas Courty and Alexandre Gramfort and Mokhtar Z. Alaya and Aur\'elie Boisbunon and Stanislas Chambon and Laetitia Chapel and Adrien Corenflos and Kilian Fatras and Nemo Fournier and L\'eo Gautheron and Nathalie T.H. Gayraud and Hicham Janati and Alain Rakotomamonjy and Ievgen Redko and Antoine Rolet and Antony Schutz and Vivien Seguy and Danica J. Sutherland and Romain Tavenard and Alexander Tong and Titouan Vayer},
    title   = {POT: Python Optimal Transport},
    journal = {Journal of Machine Learning Research},
    year    = {2021},
    volume  = {22},
    number  = {78},
    pages   = {1-8},
    url     = {http://jmlr.org/papers/v22/20-451.html}
  }
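
As an illustration, computing optimal transport distances between two point clouds takes a few lines with POT; a minimal sketch using the public ot.dist, ot.emd2 and ot.sinkhorn2 functions (the sample data is made up):

  import numpy as np
  import ot  # pip install pot

  rng = np.random.default_rng(0)
  xs = rng.normal(size=(100, 2))            # 100 source samples
  xt = rng.normal(loc=3.0, size=(120, 2))   # 120 target samples

  a = np.full(100, 1 / 100)  # uniform source weights
  b = np.full(120, 1 / 120)  # uniform target weights
  M = ot.dist(xs, xt)        # squared Euclidean cost matrix by default

  exact_cost = ot.emd2(a, b, M)                    # exact OT cost
  entropic_cost = ot.sinkhorn2(a, b, M, reg=1.0)   # entropic approximation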

Unbalanced minibatch Optimal Transport; applications to Domain Adaptation
Kilian Fatras, Thibault Séjourné, Nicolas Courty and Rémi Flamary
International Conference on Machine Learning (ICML), 2021
Keywords: Unbalanced Optimal Transport, Minibatch, Concentration Bounds, (Partial) Domain Adaptation

Paper ArXiv Code Bibtex
@InProceedings{pmlr-v139-fatras21a,
title = 	 {Unbalanced minibatch Optimal Transport; applications to Domain Adaptation},
author =       {Fatras, Kilian and Sejourne, Thibault and Flamary, R{\'e}mi and Courty, Nicolas},
booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
pages = 	 {3186--3197},
year = 	 {2021},
editor = 	 {Meila, Marina and Zhang, Tong},
volume = 	 {139},
series = 	 {Proceedings of Machine Learning Research},
month = 	 {18--24 Jul},
publisher =    {PMLR},
pdf = 	 {http://proceedings.mlr.press/v139/fatras21a/fatras21a.pdf},
url = 	 {http://proceedings.mlr.press/v139/fatras21a.html},
abstract = 	 {Optimal transport distances have found many applications in machine learning for their capacity to compare non-parametric probability distributions. Yet their algorithmic complexity generally prevents their direct use on large scale datasets. Among the possible strategies to alleviate this issue, practitioners can rely on computing estimates of these distances over subsets of data, i.e. minibatches. While computationally appealing, we highlight in this paper some limits of this strategy, arguing it can lead to undesirable smoothing effects. As an alternative, we suggest that the same minibatch strategy coupled with unbalanced optimal transport can yield more robust behaviors. We discuss the associated theoretical properties, such as unbiased estimators, existence of gradients and concentration bounds. Our experimental study shows that in challenging problems associated to domain adaptation, the use of unbalanced optimal transport leads to significantly better results, competing with or surpassing recent baselines.}
}
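
A minimal sketch of the idea behind this paper, using POT's ot.sinkhorn_unbalanced2 (hyper-parameter values are illustrative): relaxing the marginal constraints lets mass be created or destroyed, which mitigates the spurious matchings that plain minibatch OT induces between samples of different classes.

  import numpy as np
  import ot

  rng = np.random.default_rng(0)
  xs = rng.normal(size=(32, 2))             # one source minibatch
  xt = rng.normal(loc=2.0, size=(32, 2))    # one target minibatch
  a = np.full(32, 1 / 32)
  M = ot.dist(xs, xt)

  # reg: entropic regularization; reg_m: marginal relaxation strength.
  # Small reg_m allows mass destruction; large reg_m recovers balanced OT.
  uot_cost = ot.sinkhorn_unbalanced2(a, a, M, reg=1.0, reg_m=1.0)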

Minibatch Optimal Transport distances; analysis and applications
Kilian Fatras, Younes Zine, Szymon Majewski, Rémi Flamary, Rémi Gribonval and Nicolas Courty
Preprint, 2021
Keywords: Optimal Transport, Minibatch, Concentration Bounds, GANs, Sub-Gaussian data

ArXiv Code Bibtex
  @misc{fatras2021minibatch,
    title={Minibatch optimal transport distances; analysis and applications},
    author={Kilian Fatras and Younes Zine and Szymon Majewski and Rémi Flamary and Rémi Gribonval and Nicolas Courty},
    year={2021},
    eprint={2101.01792},
    archivePrefix={arXiv},
    primaryClass={stat.ML}
}

Generating natural adversarial Remote Sensing Images
Jean-Christophe Burnel, Kilian Fatras, Rémi Flamary and Nicolas Courty
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2021
Keywords: Optimal Transport, GANs, Adversarial Examples, Remote Sensing

Paper ArXiv/HAL Code Bibtex
@ARTICLE{burnel2021,
  author={Burnel, Jean-Christophe and Fatras, Kilian and Flamary, R{\'e}mi and Courty, Nicolas},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  title={Generating natural adversarial Remote Sensing Images},
  year={2021}
}

Learning with minibatch Wasserstein: asymptotic and gradient properties
Kilian Fatras, Younes Zine, Rémi Flamary, Rémi Gribonval and Nicolas Courty
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
Keywords: Optimal Transport, Minibatch, Concentration Bounds, Large Scale Color Transfer

Paper ArXiv Code Slides Poster Blog Bibtex
@InProceedings{pmlr-v108-fatras20a,
  title = {Learning with minibatch Wasserstein: asymptotic and gradient properties},
  author = 	 {Fatras, Kilian and Zine, Younes and Flamary, R\'emi and Gribonval, Remi and Courty, Nicolas},
  booktitle = 	 {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages = 	 {2131--2141},
  year = 	 {2020},
  editor = 	 {Chiappa, Silvia and Calandra, Roberto},
  volume = 	 {108},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Online},
  month = 	 {26--28 Aug},
  publisher = 	 {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v108/fatras20a/fatras20a.pdf},
  url = 	 {http://proceedings.mlr.press/v108/fatras20a.html},
  abstract = 	 {Optimal transport distances are powerful tools to compare probability distributions and have found many applications in machine learning. Yet their algorithmic complexity prevents their direct use on large scale datasets. To overcome this challenge, practitioners compute these distances on minibatches i.e., they average the outcome of several smaller optimal transport problems. We propose in this paper an analysis of this practice, which effects are not well understood so far. We notably argue that it is equivalent to an implicit regularization of the original problem, with appealing properties such as unbiased estimators, gradients and a concentration bound around the expectation, but also with defects such as loss of distance property. Along with this theoretical analysis, we also conduct empirical experiments on gradient flows, GANs or color transfer that highlight the practical interest of this strategy.}
}
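
The minibatch strategy analyzed in this paper can be summarized in a few lines. This is a sketch of the estimator, not the paper's code: average exact OT costs over randomly drawn pairs of minibatches (m must not exceed the dataset sizes).

  import numpy as np
  import ot

  def minibatch_w(xs, xt, m=64, k=10, seed=0):
      """Average exact OT cost over k random minibatch pairs of size m."""
      rng = np.random.default_rng(seed)
      w = np.full(m, 1 / m)
      costs = []
      for _ in range(k):
          i = rng.choice(len(xs), size=m, replace=False)
          j = rng.choice(len(xt), size=m, replace=False)
          costs.append(ot.emd2(w, w, ot.dist(xs[i], xt[j])))
      return float(np.mean(costs))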

Wasserstein Adversarial Regularization (WAR) on label noise
Kilian Fatras*, Bharath Damodaran*, Sylvain Lobry, Rémi Flamary, Devis Tuia and Nicolas Courty
* Equal contribution
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Keywords: Optimal Transport, Adversarial Training, label noise, Remote Sensing

Paper ArXiv Code Bibtex
@ARTICLE{Fatras2021WAR,
author={Fatras, Kilian and Damodaran, Bharath Bhushan and Lobry, Sylvain and Flamary, Remi and Tuia, Devis and Courty, Nicolas},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={Wasserstein Adversarial Regularization for learning with label noise},
year={2021},
doi={10.1109/TPAMI.2021.3094662}}

Proximal Splitting Meets Variance Reduction
Fabian Pedregosa, Kilian Fatras and Mattia Casotto
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), 2019
Keywords: Proximal Splitting, Variance Reduction, Sparse Update

Paper ArXiv Code Bibtex
@InProceedings{pmlr-v89-pedregosa19a,
title = 	 {Proximal Splitting Meets Variance Reduction},
author =       {Pedregosa, Fabian and Fatras, Kilian and Casotto, Mattia},
booktitle = 	 {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics},
pages = 	 {1--10},
year = 	 {2019},
editor = 	 {Chaudhuri, Kamalika and Sugiyama, Masashi},
volume = 	 {89},
series = 	 {Proceedings of Machine Learning Research},
month = 	 {16--18 Apr},
publisher =    {PMLR},
pdf = 	 {http://proceedings.mlr.press/v89/pedregosa19a/pedregosa19a.pdf},
url = 	 {http://proceedings.mlr.press/v89/pedregosa19a.html},
abstract = 	 {Despite the raise to fame of stochastic variance reduced methods like SAGA and ProxSVRG, their use in non-smooth optimization is still limited to a few simple cases. Existing methods require to compute the proximal operator of the non-smooth term at each iteration, which, for complex penalties like the total variation, overlapping group lasso or trend filtering, is an iterative process that becomes unfeasible for moderately large problems. In this work we propose and analyze VRTOS, a variance-reduced method to solve problems with an arbitrary number of non-smooth terms. Like other variance reduced methods, it only requires to evaluate one gradient per iteration and converges with a constant step size, and so is ideally suited for large scale applications. Unlike existing variance reduced methods, it admits multiple non-smooth terms whose proximal operator only needs to be evaluated once per iteration. We provide a convergence rate analysis for the proposed methods that achieves the same asymptotic rate as their full gradient variants and illustrate its computational advantage on 4 different large scale datasets.}
}
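
For background, the proximal operator that such methods must evaluate at every iteration has a closed form for separable penalties like the l1 norm (soft-thresholding), but becomes an inner iterative solve for complex penalties like total variation. A sketch of the simple case, under the stated assumptions:

  import numpy as np

  def prox_l1(x, step):
      """Proximal operator of step * ||.||_1: soft-thresholding."""
      return np.sign(x) * np.maximum(np.abs(x) - step, 0.0)

  # One proximal-gradient step on f(w) + lam * ||w||_1 (illustration only;
  # variance-reduced methods like VRTOS replace grad_f by a cheaper
  # stochastic estimate while keeping the same asymptotic rate).
  def prox_grad_step(w, grad_f, lam, lr):
      return prox_l1(w - lr * grad_f(w), lr * lam)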


Workshops


A Reproducible and Realistic Evaluation of Partial Domain Adaptation Methods
Tiago Salvador*, Kilian Fatras*, Ioannis Mitliagkas, Adam Oberman
* Equal contribution
Workshop on Distribution Shifts, 36th Conference on Neural Information Processing Systems (NeurIPS 2022)
Keywords: Partial domain adaptation, reproducibility, benchmark

ArXiv Code Bibtex
@misc{salvador2022reproducible,
  doi = {10.48550/ARXIV.2210.01210},
  url = {https://arxiv.org/abs/2210.01210},
  author = {Salvador, Tiago and Fatras, Kilian and Mitliagkas, Ioannis and Oberman, Adam},
  keywords = {Computer Vision and Pattern Recognition (cs.CV), Machine Learning (cs.LG), Machine Learning (stat.ML)},
  title = {A Reproducible and Realistic Evaluation of Partial Domain Adaptation Methods},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}

                


Lectures


04/22 - Introduction to Optimal Transport - Université de Montréal and McGill University

Projects and Volunteering


Here is a list of my volunteering activities and the projects I contribute to:

  1. 18/11/21 - Organized the GDR ISIS & MIA in-person workshop on optimal transport and statistical learning!
  2. Python Optimal Transport (POT) is an open-source library for optimal transport in Python.
  3. Reviewer for JMLR, ICML, ECML, JOTA, IEEE TGRS, ICLR, AISTATS, NeurIPS.
  4. Best reviewer award, ICLR 2022.

Contacts


I am present on several networks, so do not hesitate to reach out!