The football analytics industry has a problem: every data provider uses its own format. This creates unnecessary complexity for researchers, analysts, and clubs trying to work with multiple data sources.

Our new research paper introduces a Common Data Format (CDF) designed to standardize how football data is structured. This collaborative effort involved leading experts from academia and industry, including Jesse Davis (KU Leuven), Sam Robertson (Institute for Sports Tech Standards), Joshua Wyatt Smith (Wyatt Inc.), Matthias Kempe (University of Vienna), Pascal Bauer (DFB), Jan Van Haaren (Club Brugge), Gabriel Anzer (RB Leipzig), Nicolas Evans (FIFA), Kilian Arnsmeyer (DFB), and Ulf Brefeld (Leuphana University).

The CDF is designed as an evolving industry standard that will adapt as new needs emerge. Our goal is to establish a format that benefits the entire football analytics ecosystem, from data providers to researchers to clubs.

The research is ongoing, and we’re committed to continuously improving the format based on community feedback and industry developments.

To support further integration, I’ve developed a Python validation package for data providers who want to implement the format. The tool checks that a provider's outputs comply with the schemas defined in our research, making adoption straightforward and reliable.
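To give a rough feel for what this kind of schema check involves, here is a minimal, standard-library-only sketch. It is not the package's actual API: the `validate_record` helper, the `EXAMPLE_SCHEMA` dictionary, and all field names are hypothetical and are not taken from the CDF specification.

```python
from typing import Any

# Hypothetical, heavily simplified schema: field name -> expected Python type.
# These names are illustrative only and do NOT come from the CDF specification.
EXAMPLE_SCHEMA: dict[str, type] = {
    "frame_id": int,
    "timestamp": str,
    "period": str,
    "ball": dict,
}


def validate_record(record: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the record passed."""
    errors: list[str] = []
    for field, expected in schema.items():
        if field not in record:
            # Every field in the schema is treated as required in this sketch.
            errors.append(f"missing required field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    return errors


if __name__ == "__main__":
    record = {"frame_id": "1", "period": "first_half", "ball": {}}
    for problem in validate_record(record, EXAMPLE_SCHEMA):
        print(problem)
```

Returning a list of problems, rather than raising on the first one, lets a provider see every issue in a record in a single pass. A real validator would also need to handle nested structures, optional fields, and value ranges.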
