Paper comment: "MELLODDY: cross-pharma federated learning at unprecedented scale unlocks benefits in QSAR without compromising proprietary information"
Intro

I finally got around to carefully reading the papers by the MELLODDY consortium, led by my ex-supervisor (and fantastic visionary) Hugo Ceulemans. I participated in some very early discussions and saw how painful the project's initiation was. What Hugo did is colossal, and he helped to set a PRECEDENT! Honestly, the project got far too little PR (compared to the PR of some other ML-for-chemistry companies).

The project united ten big pharma companies to build a federated multi-task model for activity assays (activity on proteins, PK/PD, toxicity): 40,000+ assays on 21+ million compounds with 2.6+ billion end-points. MELLODDY used federated learning to improve the individual models of the consortium partners by confidentially sharing the "model" without sharing the private data itself. It is an exciting approach that comes with challenges of every size, related to fairness, data security, MLOps, ML engineering, ML algorithms, and legal questions.
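To make the "share the model, not the data" idea concrete, here is a minimal sketch of federated averaging in plain Python. It is a toy illustration and not MELLODDY's actual protocol: the one-parameter linear model, the made-up partner datasets, and the function names `local_update`, `federated_average`, and `train` are all my assumptions for this example. Real systems also add protections such as secure aggregation, which are left out here.

```python
from typing import List, Tuple

# One partner's private data: (x, y) pairs that never leave the partner.
Dataset = List[Tuple[float, float]]


def local_update(w: float, data: Dataset, lr: float = 0.05, steps: int = 10) -> float:
    """Run a few gradient steps on y ~ w * x using only this partner's data."""
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(weights: List[float], sizes: List[int]) -> float:
    """Combine partner models, weighting each by its number of data points."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(weights, sizes)) / total


def train(partners: List[Dataset], rounds: int = 20) -> float:
    """Alternate local training and averaging; only weights are exchanged."""
    w = 0.0
    for _ in range(rounds):
        local_weights = [local_update(w, data) for data in partners]
        w = federated_average(local_weights, [len(data) for data in partners])
    return w


# Hypothetical data from two partners, both generated from y = 3 * x.
partners = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5), (3.0, 9.0)],
]
w_global = train(partners)
print(f"shared weight after training: {w_global:.4f}")
```

The point of the sketch is that `train` only ever sees the weights returned by `local_update`, never the `(x, y)` pairs themselves, yet the shared weight still benefits from every partner's data.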