Learning representations of the visual world
Recent advances in machine learning have profoundly influenced our study of computer vision. Successes in this field have demonstrated the power of learning representations directly from visual imagery, both in terms of their practical utility and their unexpected expressive abilities. In this talk I will discuss several contributions that have helped improve our ability to learn representations of images. First, I will describe recent advances in constructing models that extract semantic information from images by leveraging transfer learning and meta-learning techniques. Such learned models outperform human-invented architectures and are readily scalable across a range of computational budgets. Second, I will highlight recent efforts focused on the converse problem of synthesizing images through the rich visual vocabulary of painting styles and visual textures. This work permits a unique exploration of visual space and offers a window onto the structure of learned representations of visual imagery. My hope is that these works will highlight common threads in machine and human vision and point toward opportunities for future research.
Bio:
Jonathon Shlens received his Ph.D. in computational neuroscience from UC San Diego in 2007, where his research focused on applying machine learning to understanding visual processing in real biological systems. He was previously a research fellow at the Howard Hughes Medical Institute, a research engineer at Pixar Animation Studios, and a Miller Fellow at UC Berkeley. He has been at Google Research since 2010 and is currently a research scientist focused on building scalable vision systems. During his time at Google, he has been a core contributor to deep learning systems, including the recently open-sourced TensorFlow. His research interests have spanned the development of state-of-the-art image recognition systems and training algorithms for deep networks.