In our research, we develop efficient algorithms for data science tasks and use data science in settings of societal interest.

  • Efficient data science. We aim to optimize various aspects of data science pipelines. Our interests include: workload-aware model materialization, i.e., storing learned models so that they can be used as efficiently as possible for a given application; data summaries, which let machine learning algorithms run faster on smaller amounts of data; learned index structures, which make queries efficient for the specific data at hand; and end-to-end optimizations for data science pipelines.
  • Data science with societal applications. We work on network analysis, fair policy design, fair data summarization, and analyzing news content on the web and social media.

Selected publications

  • Yanhao Wang, Michael Mathioudakis, Yuchen Li, Kian-Lee Tan. "Minimum Coresets for Maxima Representation of Multidimensional Data". PODS 2021. (to appear)
  • Cigdem Aslay, Martino Ciaperoni, Aristides Gionis, and Michael Mathioudakis. "Workload-aware materialization for efficient variable elimination on Bayesian networks". ICDE 2021. (to appear; an extended version is available)
  • Yanhao Wang, Francesco Fabbri, and Michael Mathioudakis. "Fair and Representative Subset Selection from Data Streams". WWW 2021. (to appear / pdf)
  • Michael Mathioudakis, Carlos Castillo, Giorgio Barnabo, and Sergio Celis. "Affirmative action policies for top-k candidates selection: with an application to the design of policies for university admissions". Symposium on Applied Computing (SAC) 2020. DOI:
  • Riku Laine, Antti Hyttinen, and Michael Mathioudakis. "Evaluating Decision Makers over Selectively Labelled Data: A Causal Modelling Approach". Springer Discovery Science 2020. DOI:
    Best paper award.

Video: Algorithmic Data Science - Research