Speakers

Title: Cross-modal understanding and generation of multimodal content

Abstract: Video generation consists of generating a video sequence so that an object in a source image is animated according to some external information (a conditioning label, a driving video, a piece of text). In this talk I will present some of our recent achievements in generating videos without using any annotation or prior information about the specific object to animate. Once trained on a set of videos depicting objects of the same category (e.g. faces, human bodies), our method can be applied to any object of this class. Based on this, I will present a framework to train game-engine-like neural models solely from monocular annotated videos. The result, a Learnable Game Engine (LGE), maintains states of the scene and of the objects and agents in it, and enables rendering the environment from a controllable viewpoint. Similarly to a game engine, it models the logic of the game and the underlying rules of physics, making it possible for a user to play the game by specifying both high- and low-level action sequences. The LGE can also unlock the director's mode, where the game is played by plotting behind the scenes, specifying high-level actions and goals for the agents in the form of language and desired states. This requires learning “game AI”, encapsulated by our animation model, to navigate the scene using high-level constraints, play against an adversary, and devise a strategy to win a point.

Short Bio: Nicu Sebe is a professor at the University of Trento, Italy, where he leads research in the areas of multimedia information retrieval and human-computer interaction in computer vision applications. He received his PhD from the University of Leiden, The Netherlands, and was previously with the University of Amsterdam, The Netherlands, and the University of Illinois at Urbana-Champaign, USA. He has been involved in organizing major conferences and workshops addressing the computer vision and human-centered aspects of multimedia information retrieval, including as General Co-Chair of the IEEE Automatic Face and Gesture Recognition Conference (FG) 2008, the ACM International Conference on Multimedia Retrieval (ICMR) 2017, and ACM Multimedia 2013. He was a program chair of ACM Multimedia 2011 and 2007, ECCV 2016, ICCV 2017, and ICPR 2020, and a general chair of ACM Multimedia 2022. He is the Editor in Chief of Computer Vision and Image Understanding. He is a fellow of ELLIS and IAPR and a senior member of ACM and IEEE.

Title: Exploring Robust Representations with Deep Learning for Biometrics and Surveillance

Abstract: In this lecture, we will address the use of Deep Learning in applications for Biometrics and Surveillance. More specifically, we will present results from recent work in two research projects. From the first project, supported by CAPES – Public Security and Forensic Science, developed in partnership with UFMG and the Federal Police, we discuss the use of super-resolution in two problems: (i) through diffusion models for facial recognition in uncontrolled environments, and (ii) using modified convolutional networks for license plate recognition. From the second project, supported by unico idTech, a Brazilian unicorn company, we present approaches for facial recognition using three-dimensional reconstruction and synthetic data, as well as a study on passive face anti-spoofing and the proposal of a dataset for active face anti-spoofing.

Short Bio: Full Professor in the Department of Informatics (DInf) at the Federal University of Paraná (UFPR) since September 2024. Holds a degree in Computer Engineering (2000) and a Master’s in Applied Informatics (2003) from the Pontifical Catholic University of Paraná (PUCPR). Earned a Ph.D. in Informatique (2008) from Université Paris-Est (France) and a Ph.D. in Computer Science (2008) from the Federal University of Minas Gerais (UFMG). Currently a Level 1D Research Productivity Fellow with CNPq. Previously served as Associate Professor at DInf/UFPR from July 2015 to September 2024, and as Assistant Professor in the Department of Computing (DECOM) at the Federal University of Ouro Preto (UFOP) from August 2008 to June 2015. He was also a Permanent Member of the Graduate Program in Computer Science at UFOP from September 2009 to July 2019. He continues to serve as a Collaborating Professor in the Graduate Programs in Computer Science at UFOP, UFMG, and the University of Campinas (UNICAMP). Completed a postdoctoral fellowship during his sabbatical year (from June 2013 to June 2014) at the Institute of Computing at UNICAMP. Has experience in the field of Computer Science, with an emphasis on Pattern Recognition, Computer Vision, Image Processing, and Machine Learning.

Title: Computer-aided Esophageal Cancer Identification: From Handcrafted to Deeply Learnable Features

Abstract: This talk discusses advances in computer-aided gastroenterology, particularly the detection of Esophageal Cancer and Barrett's Esophagus using handcrafted and deeply learnable features. We will show how to combine generative learning and explainable artificial intelligence to enhance predictive results and how to apply those approaches in practice.

Short Bio: João Paulo is a full professor at the Department of Computer Science, School of Sciences, São Paulo State University. He is a Fellow of the International Association for Pattern Recognition, the Asia-Pacific Artificial Intelligence Association, the Alexander von Humboldt Foundation, and the Brazilian National Council for Scientific and Technological Development. He was a visiting scholar at MIT (2024-2025) and Harvard University (2014-2015) and a Brazilian delegate at the International Association for Pattern Recognition. He is also the Coordinator of the eScience Program at the São Paulo Research Foundation and a member of the Advisory Committee in Computer Science, CNPq. His research interests include machine learning, optimization, computer vision, and language models.

Title: Computer Vision at Scale for Precision Agriculture

Abstract: Agriculture plays a fundamental role in Brazil’s economy, being one of the country's most innovative and dynamic sectors. With advancements in artificial intelligence, computer vision has become an essential tool for optimizing processes, reducing costs, and increasing productivity in agribusiness. This lecture will explore automated applications already adopted by major companies, such as field boundary detection, identification of planting lines and gaps, and weed monitoring. Despite these advancements, significant challenges remain, including the high variability of agricultural imagery, the need for adaptation across different domains (Domain Adaptation), and the importance of active learning strategies to continuously improve models. Finally, we will discuss the future prospects of computer vision in agriculture, including the impact of foundational and multimodal models, as well as the potential of reinforcement learning for more efficient decision-making.

Short Bio: Dr. Wesley Nunes Gonçalves is a CNPq productivity fellow and an associate professor at the Federal University of Mato Grosso do Sul (UFMS). He earned his Ph.D. at the São Carlos Institute of Physics of the University of São Paulo (IFSC-USP) and has been actively contributing to the field of computer vision applied to agribusiness. As the founder of two startups leveraging computer vision technology in agriculture, he combines academic research with innovation. He has received distinctions such as the 2023 Sul-Mato-Grossense Researcher Award and third place in the CONFAP National Award for Science, Technology, and Innovation in the Innovative Researcher category. His academic contributions include over 100 peer-reviewed journal articles, 70 conference proceedings papers, and four registered software technologies.
