Mauriana Pesaresi

Seminar Series 2024/2025

In this series of seminars dedicated to the memory of Mauriana Pesaresi, a doctoral student of the Computer Science Department of the University of Pisa, first-year PhD students in Computer Science will present an open research problem related to their field of study. After each seminar, a panel discussion will follow.

Browse past editions: 2022 Edition, 2023 Edition, 2024 Edition

For any further information, you can reach out to us via email.

🚀 Upcoming

Marco Alessio

Products are NOT all you need

18th

April

15:00-16:00

Recommender systems (RecSys) are intelligent software components designed to suggest the most relevant items to users by leveraging statistical methods and machine learning algorithms. Due to their widespread use in practice and their demonstrated ability to create value for both consumers and businesses, recommender systems can be seen as one of the most visible success stories of artificial intelligence. In this talk, we will first provide a concise overview of the recommendation task, outlining the key paradigms that underpin modern recommender systems, including collaborative filtering, content-based filtering, and hybrid approaches. We will then shift our focus to recent advancements in the field, particularly the emergence of Conversational Recommender Systems (CRS), which aim to enhance user engagement by enabling interactive recommendations through natural language dialogue. Despite the promise of CRS, current research and evaluation efforts face several open challenges. Among them are the limitations of current offline evaluation methodologies, including the scarcity of realistic, comprehensive, high-quality datasets. These issues hinder progress both in system development and in achieving naturalistic interactions from the user's perspective. To address these gaps, we explore the novel paradigm of Conversational Search and Recommendation (CSR), which unifies search and recommendation within a single framework. We argue that this paradigm holds significant potential for more naturalistic and context-aware user experiences, and we outline open research questions and future directions in this evolving area.

Live Streaming Sala Riunioni Est

⌛️ Past Talks

24th

January

15:00-16:00

Calogero Turco

Trust Approaches in Self-Sovereign Identity

Slides

Digital Identity today is mainly centralized, with platforms like Google and Facebook controlling users' data. This lack of user control has led to significant privacy concerns, as illustrated by the Cambridge Analytica scandal. Self-Sovereign Identity (SSI) is a paradigm shift that decentralizes Digital Identity by giving control back to the individuals it refers to. In SSI, individuals are called Holders and can decide what parts of their identity to expose. Each Holder has Verifiable Credentials (e.g., a degree) in their digital wallet, issued by entities called Issuers (e.g., a university). Holders present those Verifiable Credentials to any party that is interested in information about them and acts as a Verifier (e.g., a recruiter). Unfortunately, no consolidated solutions exist to certify that an Issuer is trustworthy in issuing a specific credential, or that the Holder can trust the Verifier to manage their sensitive data. How do we ensure that Issuers are trustworthy and that Verifiers are competent to handle sensitive data responsibly? We will see some proposed methods for this and evaluate their suitability for different scenarios. Resolving these issues is essential to fully realizing the potential of SSI and creating a more secure, user-centric Digital Identity ecosystem.

7th

February

15:00-16:00

Michele Papucci

Toronto is the capital of Canada: Detecting and Preventing LLMs from Hallucinating

Slides

Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) applications, reaching new state-of-the-art performance across a wide range of tasks and domains, especially in content generation scenarios. However, these models are prone to generating hallucinations: factually incorrect, misleading, or nonsensical outputs that are presented in a convincingly accurate manner. This phenomenon poses significant risks, particularly in fields such as healthcare and legal services. This seminar will briefly present the nature of hallucinations in LLMs, their underlying causes, and the challenges in detecting them. By understanding and preventing hallucinations, LLM reliability can be improved, ensuring safer and more trustworthy AI applications.

14th

February

14:00-15:00

Jack Bell

Bridging the ARC: From Intrigue to Innovation

Slides

The ARC (Abstraction and Reasoning Corpus) Prize, emphasising skill acquisition over task-specific performance, redefines how intelligence, both human and artificial, should be measured. It builds on insights challenging traditional benchmarks and posits that true intelligence lies in an agent’s ability to rapidly learn and generalise from minimal exposure, as opposed to performing well only on predefined tasks with extensive training data. The ARC Benchmark presents abstract, open-ended problems designed to test human-like reasoning and adaptability, while the ARC Prize incentivises interdisciplinary approaches that blend rigorous theoretical analysis with practical data-driven methods. Together, they tackle challenges such as scalability, inherent randomness and non-determinism, catalysing a paradigm shift toward AI systems that embody generalisable intelligence.

21st

February

14:00-15:00

Elisa Salatti

Designing eHealth systems for people with disability: the trade-off between personalization and universality

Slides

Access to quality healthcare services is a recognized human right. Despite the use of technology, this right is not guaranteed for everyone, especially people with disabilities. Designing eHealth systems for people with disabilities requires an interdisciplinary approach and a balance between personalization and universality. We have technologies, methods and tools to develop systems that solve specific access problems in healthcare. However, to achieve universal products and reduce costs, the results of these efforts must be generalized. This seminar addresses the challenges and opportunities in developing inclusive eHealth systems and emphasizes the importance of a participatory approach to improve the accessibility and quality of healthcare for all.

28th

February

15:00-16:00

Ornela Danushi

On Carbon Efficient Software Systems

Slides

The ICT sector, responsible for 2% of global carbon emissions, is under scrutiny and requires methods and tools to design and develop software in an environmentally sustainable way. However, software engineering solutions for designing and developing carbon-efficient software are currently scattered across the literature, making it difficult to consult the body of knowledge on the topic. This seminar presents the results of a systematic literature review of state-of-the-art proposals for the design and development of carbon-efficient software. The analysis identifies 65 primary studies and classifies them through a taxonomy aimed at answering the 5W1H questions of carbon-efficient software design and development. Specifically, it provides a reasoned overview and discussion of the existing guidelines, reference models, measurement solutions, and techniques for measuring, reducing, or minimising the carbon footprint of software. Finally, open challenges and research gaps are listed to provide insights for future work in this area.

7th

March

15:00-16:00

Malio Li

Teaching Machines to Dream: An Introduction to World Models

Slides

Humans and other animals learn from an early age how their surrounding environment evolves. Through the accumulation of knowledge about the external world, they gain the ability to predict what is happening around them. Similarly, world models enable machines to simulate, predict, and interact intelligently with their environments. These models serve as internal representations of the external world, allowing systems to understand environmental dynamics, anticipate outcomes, and make informed decisions. In this talk, we will explore what world models are, how they are built, why they are integral to advancing AI capabilities, and some open challenges. We will delve into popular architectures such as DeepMind’s Dreamer, DIAMOND, and GAIA, highlighting how these frameworks model complex environments. Additionally, we will examine the transformative applications of world models across diverse domains, including autonomous driving, robotics, 3D world simulations, and gaming. By connecting theory to real-world implementations, this talk aims to provide a comprehensive introduction to world models and their pivotal role in shaping the future of intelligent systems.

14th

March

15:00-16:00

Claudia Nanula

We’re Not Speaking the Same Language: Approaches and Challenges in the Personalization of LLMs

Slides

Large Language Models (LLMs) have revolutionized human-computer interaction, enabling increasingly natural and adaptive conversations. However, the challenge of personalization, that is, tailoring these models to individual users while maintaining fairness, robustness, and inclusivity, remains a critical research topic. This seminar will explore the foundations of LLM personalization, ranging from a formal theoretical framework and a description of current techniques to evaluation metrics and real-world applications. At the same time, we will cover key ethical concerns, such as the risk of biases and stereotypes in personalized outputs, privacy issues, and the protection of sensitive categories. This session will analyze technical and societal perspectives to foster a critical discussion on balancing personalization with responsible AI development.

21st

March

15:00-16:00

Luca Miglior

Find your inner peace, minimize your Energy

Slides

Energy-Based Models (EBMs) provide a powerful and elegant framework for generative learning by modeling the compatibility between inputs and outputs through an energy function. Rather than explicitly modeling probability distributions, EBMs define a scalar energy for each possible configuration of variables, with learning aimed at assigning low energy to desirable outcomes. This approach enables flexible, expressive generative modeling, especially in high-dimensional or continuous domains. In this seminar, we introduce the principles of generative learning through energy functions, formalize the EBM framework, and explore how EBMs can be trained to model complex data distributions. We’ll discuss sampling techniques, training objectives, and practical examples that highlight the strengths of EBMs in generative tasks such as image synthesis and structured prediction. By the end, you'll gain a fresh perspective on generative modeling and a solid foundation for applying EBMs in your own work.
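To make the idea of sampling from an energy function concrete, here is a minimal sketch of unadjusted Langevin dynamics, one common sampling technique for EBMs. Everything in it is an illustrative assumption and is not taken from the talk: the hand-written double-well energy (a trained EBM would use a learned energy instead), the step size, and the number of chains. Configurations with low energy correspond to high density under p(x) ∝ exp(-E(x)), so the samples should gather around the two minima.

```python
import math
import random


def energy(x):
    """Hypothetical double-well energy with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def grad_energy(x):
    """Analytic derivative of the energy above."""
    return 4.0 * x * (x * x - 1.0)


def mean(values):
    return sum(values) / len(values)


def langevin_sample(n_chains=500, n_steps=1000, step=0.01, seed=0):
    """Run independent Langevin chains targeting p(x) proportional to exp(-E(x)).

    Each update moves downhill on the energy and adds Gaussian noise:
        x <- x - (step / 2) * dE/dx + sqrt(step) * noise
    Returns the initial points and the final points of all chains.
    """
    rng = random.Random(seed)
    start = [rng.uniform(-3.0, 3.0) for _ in range(n_chains)]
    xs = list(start)
    noise_scale = math.sqrt(step)
    for _ in range(n_steps):
        xs = [
            x - 0.5 * step * grad_energy(x) + noise_scale * rng.gauss(0.0, 1.0)
            for x in xs
        ]
    return start, xs


start, samples = langevin_sample()
print("mean energy before sampling:", mean([energy(x) for x in start]))
print("mean energy after sampling: ", mean([energy(x) for x in samples]))
print("mean |x| after sampling:    ", mean([abs(x) for x in samples]))
```

The chains start from uniformly random points and drift toward low-energy regions; the injected noise keeps them from collapsing onto the exact minima, which is what distinguishes sampling from plain gradient descent.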

28th

March

15:00-16:00

Leena Aizdi

From Slow to Scalable: Strategies for Large-scale Mixed Integer Linear Programming Optimisation

Slides

Mixed Integer Linear Programming (MILP) has been widely used as a decision support tool in many applications such as transportation, logistics, healthcare, and telecommunications. However, as these applications grow more complex, solving large-scale MILP models to obtain optimal or near-optimal solutions becomes very challenging and remains an active field of research. This seminar explores this issue and highlights some strategies used in current research to tackle it, including exact decomposition and heuristic approaches. Additionally, we review the emerging role of Artificial Intelligence (AI) and Machine Learning (ML) in the field of combinatorial optimisation, in particular how AI/ML-integrated optimisation is being used to improve the efficiency of solving MILP models. Finally, we briefly outline possible future research directions, followed by an open-ended discussion with participants.

4th

April

15:00-16:00

Matteo Ninniri

Node count problems in Graph Diffusion Models

Slides

In the last couple of years, we have witnessed a marked increase in interest in generative models, mainly sparked by the emergence of Denoising Diffusion Probabilistic Models. While image generation remains the most popular task for these models, graph generation has also gained significant traction because of its applications in drug discovery. One major challenge exclusive to graph diffusion models is that, unlike images, the size of the sample (that is, the number of nodes) often plays a significant role in its properties. However, this problem is overlooked by most of the literature, which leads to suboptimal results. In this presentation, we discuss some tasks where this oversight needs to be addressed, as well as some possible solutions.

11th

April

15:00-16:00

Francesco Gargiulo

Shrink Files, Win Cash: The Hutter Prize Explained

Slides

This presentation introduces the Hutter Prize, a unique competition launched in 2006 by Marcus Hutter that challenges participants to compress a 1GB Wikipedia corpus better than previous attempts. Rather than simply being a technical exercise in file size reduction, the prize examines Hutter's theory that the ability to compress data effectively is closely linked to acting intelligently. We explore the competition's criteria and significance as a benchmark at the intersection of information theory and artificial intelligence. As Hutter himself suggests, compression can serve as a meaningful metric for evaluating understanding, since effective compression requires identifying patterns and making accurate predictions. By reviewing the competition's history and the evolution of compression techniques, we will highlight how the Hutter Prize provides valuable insights into the connection between data compression and artificial intelligence.
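The link between prediction and compression mentioned above can be illustrated with a small sketch. It relies on a standard information-theoretic fact: an arithmetic coder can encode a symbol that the model predicted with probability p in roughly -log2(p) bits, so a better predictor yields a shorter file. The code below is only an illustration under stated assumptions and is not the Hutter Prize's evaluation procedure or any entrant's method: the function `ideal_code_length_bits`, the toy text, and the add-one smoothed adaptive character model (which assumes the alphabet is known in advance) are all hypothetical choices.

```python
import math
from collections import defaultdict


def ideal_code_length_bits(text, order):
    """Sum of -log2 p(symbol | previous `order` characters) over the text.

    The model is adaptive: it predicts each symbol from counts gathered so
    far, then updates the counts, so a decoder could mirror the same steps.
    Assumption: the alphabet (distinct symbols in `text`) is known up front.
    """
    alphabet_size = len(set(text))
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    bits = 0.0
    for i, symbol in enumerate(text):
        context = text[max(0, i - order):i]
        # Add-one (Laplace) smoothing keeps unseen symbols encodable.
        p = (counts[context][symbol] + 1) / (totals[context] + alphabet_size)
        bits += -math.log2(p)
        counts[context][symbol] += 1
        totals[context] += 1
    return bits


text = "the cat sat on the mat. " * 40  # toy corpus, purely illustrative
raw_bits = 8 * len(text)
bits_order0 = ideal_code_length_bits(text, order=0)  # ignores context
bits_order1 = ideal_code_length_bits(text, order=1)  # uses previous character

print("raw size (bits):        ", raw_bits)
print("order-0 model (bits):   ", round(bits_order0))
print("order-1 model (bits):   ", round(bits_order1))
```

On this repetitive text the model that conditions on the previous character predicts more accurately, and its ideal code length is correspondingly shorter than that of the context-free model.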

11th

April

16:00-17:00

Khadija Javed

Cross-Project Defect Prediction: Scalable and Interpretable Domain Adaptation Approaches

Slides

Cross-project defect prediction (CPDP) aims to predict software defects in a target project domain by leveraging information from different source project domains, allowing testers to identify defective modules quickly. However, CPDP models often underperform because data distributions differ between source and target domains, owing to variations in coding practices, architectures, and development environments, and because of class imbalances in both source and target projects. Additionally, standard features often fail to capture sufficient semantic and contextual information from the source project, leading to poor prediction performance in the target project. In this seminar, we will explore current domain adaptation techniques that have emerged as a solution to bridge this gap, enabling better generalization and transferability of CPDP models. Additionally, we will see how machine learning models are optimized to enhance the interpretability and prediction performance of CPDP models. Finally, we will discuss open challenges and research gaps to provide insights for future work in this area.