Antonis Argyros 

Professor, Computer Science Department, University of Crete

Affiliated Research Fellow, Institute of Computer Science, FORTH

users.ics.forth.gr/~argyros/

Short bio

Antonis Argyros is a Professor of Computer Science at the Computer Science Department (CSD), University of Crete (UoC) and an affiliated research fellow at the Institute of Computer Science (ICS), Foundation for Research and Technology–Hellas (FORTH) in Heraklion, Crete, Greece. He earned his B.Sc. (1989), M.Sc. (1992), and Ph.D. (1996) in Computer Science from the University of Crete, followed by a postdoctoral position at the Computational Vision and Active Perception Lab at KTH, Stockholm. His research interests are in computer vision and robotics. His work focuses on developing AI-driven computer vision methods that accurately perceive and interpret human presence, covering the estimation of human pose and shape, the perception of hand articulation, facial analysis, and the understanding of gestures, actions, activities and intentions. In these areas he has authored over 250 scientific publications in leading journals and conferences. His work has received international recognition through awards and several invited and keynote presentations at international venues. He has served in leadership and editorial roles within international conferences and professional organizations. He maintains strong collaborations with academic and industrial partners worldwide and is actively involved in numerous European and national research projects.

Talk Title: “Visual AI for Perceiving and Interpreting Human Presence”

Human-Centered Computer Vision (HCCV) focuses on endowing artificial systems with the ability to perceive, interpret, and reason about humans and their interactions with the surrounding world. This talk presents an overview of research advances from our Human-Centered Computer Vision group, covering work on visual perception of human presence. Starting from low-level observation tasks such as hand articulation, facial motion, and full-body 3D pose and shape estimation, the presentation progressively moves toward higher-level understanding, including action recognition, activity assessment, human–object interaction modeling, and anticipation of future actions and object state changes. The talk highlights methodological contributions that span model-based and learning-based approaches, emphasizing efficiency, real-time performance, and scalability to multi-person and unconstrained environments. Representative solutions illustrate how intermediate representations and unified multi-task architectures enable accurate, lightweight, and real-time human understanding from monocular RGB input. The presentation also discusses recent work on visual–linguistic reasoning, zero-shot object state recognition, and explainable activity similarity. Applications across robotics, healthcare, industry, culture, and everyday environments are showcased, demonstrating how human-centered vision enables intuitive human–robot collaboration, assistive technologies, skill learning, and intelligent sensing in real-world settings.