Juan Carlos Niebles received an Engineering degree in Electronics from Universidad del Norte (Colombia) in 2002, an M.Sc. degree in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign in 2007, and a Ph.D. degree in Electrical Engineering from Princeton University in 2011. He has been a Senior Research Scientist at the Stanford AI Lab and Associate Director of Research at the Stanford-Toyota Center for AI Research since 2015, and an Assistant Professor of Electrical and Electronic Engineering at Universidad del Norte (Colombia) since 2011. His research interests are in computer vision and machine learning, with a focus on visual recognition and understanding of human actions and activities, objects, scenes, and events. He is a recipient of a Google Faculty Research Award (2015), the Microsoft Research Faculty Fellowship (2012), a Google Research Award (2011) and a Fulbright Fellowship (2005).
Humans are probably the most important subject in the many hours of video that are recorded and consumed every minute. Computer vision technology for automatic recognition of human activities and actions has the potential to enable many applications by understanding the semantics of events and activities depicted in such videos. In this talk, I’ll give an overview of our work towards the next generation of activity understanding algorithms that are capable of recognizing a large number of activities, localizing them within long video sequences, parsing and describing complex events and even anticipating and predicting actions before they occur.
Dr. Chris Rowen is the founder and CEO of Cognite Ventures, and a well-known Silicon Valley entrepreneur and technologist. He served as CTO of Cadence’s IP Group, where he and his team developed new processor and memory technologies for advanced applications in mobile, automotive, infrastructure, deep learning and IoT systems. The team became one of the leading innovators in automated neural network optimization and ultra-efficient neural network processing engines for embedded systems. Chris joined Cadence after its acquisition of Tensilica, the company he founded to develop extensible processors. As CEO, and later CTO, he led Tensilica to become one of the leading embedded processor architecture companies, with more than 225 chip and system company licensees who together ship more than 4 billion cores per year. Before founding Tensilica in 1997, he was VP and GM of the Design Reuse Group at Synopsys. Chris was also a pioneer in developing RISC architecture and helped found MIPS Computer Systems, where he was VP of Microprocessor Development. He holds an MSEE and a PhD in electrical engineering from Stanford and a BA in physics from Harvard, and holds more than 40 US and international patents. He was named an IEEE Fellow in 2015 for his work in the development of microprocessor technology. He started Cognite Ventures in 2016 to develop, advise and invest in new entrepreneurial ventures, especially around cognitive computing.
Speech Abstract: The dramatic proliferation of cameras doesn’t just change the quantity of image streams and applications; it drives fundamental qualitative shifts in the role of imaging in technology, business and society. This talk explores the most important changes as camera systems are designed and deployed just for computer vision, as the urban-area density of cameras skyrockets, as deep learning methods change the nature of the insights we can extract, and as vision-inspired statistical computing supplants traditional computing models. Deep learning, in particular, is driving a new wave of innovation and of startups built to extract much more insight from existing surveillance, monitoring, human-machine-interface, robotic and automotive streams. The distribution of these innovators across geographies and application segments tells us a lot about how vision will change most. As a byproduct of studying the targets for startups, we can even derive useful lessons about the startup success formula in the vision space.
Dr. Jon Peddie is a recognized pioneer in the graphics industry, President of Jon Peddie Research and named one of the most influential analysts in the world. He lectures at numerous conferences and universities on topics pertaining to graphics technology and emerging trends in digital media technology. A former President of the Siggraph Pioneers, he serves on the advisory boards of several conferences, organizations, and companies, and contributes articles to numerous publications. In 2015, he was given the Lifetime Achievement award from the CAAD society. Dr. Peddie has published hundreds of papers to date, and has authored and contributed to eleven books. His most recent book is “Augmented Reality: Where We Will All Live” (2017).
Giving vision to robots, cars, and public places requires more than just high-definition cameras. Trainable, augmented visual and semi-autonomous systems with high-speed communications have to be behind the camera for them to be useful, to see things a human might miss, and to see things in hundreds to thousands of places at the same time, something that would be humanly impossible. Such a system uses a camera, of course, backed up by a new galaxy of processors with a new type of processor at the center, a Visual Processing Unit, all of which feed, and in some cases drive, a CNN that has been trained by a deep-learning system. Yes, big brother is watching, and be thankful he is.
Silvio Savarese is an Associate Professor (with tenure) of Computer Science at Stanford University and director of the SAIL-Toyota Center for AI Research at Stanford. He earned his Ph.D. in Electrical Engineering from the California Institute of Technology in 2005 and was a Beckman Institute Fellow at the University of Illinois at Urbana-Champaign from 2005–2008. He joined Stanford in 2013 after serving as Assistant and then Associate Professor of Electrical and Computer Engineering at the University of Michigan, Ann Arbor, from 2008 to 2013. His research interests include computer vision, robotic perception and machine learning. He is the recipient of several awards, including a Best Student Paper Award at CVPR 2016, the James R. Croes Medal in 2013, a TRW Automotive Endowed Research Award in 2012, an NSF CAREER Award in 2011 and a Google Research Award in 2010. In 2002 he was awarded the Walker von Brimer Award for outstanding research initiative.
Computers can now recognize objects in images, classify simple human activities and reconstruct the 3D geometry of an environment. However, these achievements are far from the kind of coherent and integrated interpretation that humans are capable of from just a quick glance at the complex 3D world. When we look at an environment, we don’t just recognize the objects in isolation; we perceive a rich scenery of the 3D space, its objects, the people and all the relations among them. This allows us to effortlessly navigate through the environment, to interact with objects in the scene with amazing precision, or to predict what is about to happen next. In this talk I will give an overview of the research from my group and discuss our latest work on designing visual models that can process different sensing modalities and enable intelligent understanding of the sensing data. I will also demonstrate that our models are potentially transformative in application areas related to autonomous or assisted navigation, smart environments, social robotics, augmented reality, and large-scale information management.