Opening remarks – Reinforcement Learning


Topics: Reinforcement Learning, Application, Research, Microsoft, Navigate

Transcript Excerpt

Reinforcement learning has evolved from a fringe topic of AI research to a mainstream pursuit in the machine learning community over the past decade. Although still in its infancy, it has advanced rapidly in recent years, to the point where techniques have demonstrated a promising capacity to learn in contexts where flexibility and imagination are critical, and we are seeing intriguing real-world applications. As we continue to apply reinforcement learning to real-world problems, it has the potential to transform AI’s ability to navigate changing settings, make complicated decisions, and collaborate with and assist people.

With a unique approach to reinforcement learning, Microsoft Research has been at the forefront of the field. We have interesting opportunities to engage with product teams to understand how this technology can make a difference in real-world applications, and each lab focuses on distinct parts of reinforcement learning with the benefit of a global viewpoint and collaboration between labs. Microsoft developed some of the first real-world reinforcement learning applications. The topics discussed today will provide insight into how we can bridge the gap between present methodologies and our potential to innovate and unlock new reinforcement learning applications in the future.

One of the paths we have followed at Microsoft Research is to provide a solid theoretical and practical foundation for reinforcement learning approaches. This means we are focusing our efforts on how reinforcement learning may help AI support and cooperate with people in real-world tasks. The first half of today’s sessions will concentrate on human-AI collaboration via reinforcement learning, with short talks on learning from human preferences, safe reinforcement learning, and learning to navigate and play chess in more humanlike ways.
Following these sessions, we will have the opportunity to learn more about the exciting possibilities for constructing more humanlike and human-compatible AI entities. The real world not only serves as a foundation for our research; it also serves as a vision for the future of reinforcement learning. In this approach, we must keep in mind the eventual purpose of how reinforcement learning might assist AI in completing specific tasks.

Tie-Yan Liu, Assistant Managing Director and Lead of Machine Learning at Microsoft Research Asia, will present deep insights from his team on key challenges for reinforcement learning in applications like logistics and supply chain management, as well as how these challenges drive his team’s work on continuous offline reinforcement learning, a new framework that has helped to solve these problems.

Reinforcement learning agents are physically manifested in the real world through robotics. However, our hopes for applications highlight some of the issues we are tackling in reinforcement learning research. The team behind Project Dexter will present their research goals and insights on how powerful deep learning models can achieve successful learning, for example by employing offline reinforcement learning techniques. To put it another way, reinforcement learning agents must rely on less data in order to interact with and learn from humans.

Another set of sessions will focus on reinforcement learning, generalization, and representation learning, which can help us find ways to improve how AI learns from imperfect and noisy signals, whether in robots learning skills from humans to complete a real-world task, AI-controlled in-game characters that are better matched to human players to make a game more enjoyable, or industrial control scenarios where AI would interact with humans on a daily basis. To round off the day, a prominent panel of internal and external speakers will explore recent trends and future potential for reinforcement learning.
