2,359 words · about a 5-minute read

Photo credit: pexels.com/pixabay

Recently, the AWS re:MARS conference was held in Las Vegas, focused on how machine learning, automation, and robotics (including in space) will change the future. Robert Downey Jr. drew plenty of attention, but it was the simulation and reinforcement learning surfacing in almost every keynote that really stole the show:

Day 1: Through reinforcement learning, Boston Dynamics’ robots have mastered backflips, window jumps, and lifts. Disney’s Imagineering project has taken this to a new level, with humanoid robots performing death-defying stunts.

Day 2: Amazon trains the models behind its Go stores by simulating difficult scenarios. Robots in Amazon’s fulfillment centers also learn to sort packages through reinforcement learning. Alexa uses simulated interactions to learn conversation flows automatically. Amazon’s drone-delivery program uses simulated data to train the detection of people beneath drones. And companies like Insitro are starting to tackle biomedical problems by generating data about biological interactions.

Day 3: Andrew Ng made the case for meta-learning. Hundreds of different simulators are being used to build more general reinforcement learning agents, arguably AI’s “next big thing.” Autonomous-vehicle companies Zoox and Aurora are using RL and meta-learning to address the complexities of driving in urban environments. Dex-Net is building a huge database of 3D models through simulation to better solve the robotic grasping problem. Jeff Bezos agrees with Daphne Koller that RL-driven bioengineering will be big in 10 years.

To sum up:

If a domain can be accurately modeled, reinforcement learning will dramatically improve the state of the art in the coming years.

So what does physics have to do with it?

A four-year-old enters the “why” phase of her life, when her brain shifts from simply recognizing things to wanting to understand everything in the world. This is a typical interaction between adults and children:

Drawn using http://cmx.io

So what does this have to do with data science?

In his talk on deep learning at this year’s Google I/O conference, Jeff Dean noted that neural networks have been trained to produce results nearly 300,000 times faster than physics simulators, meaning it’s possible to screen on the order of 100 million molecules over a lunch break.
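As a sanity check on that claim, the arithmetic is easy to run. The 300,000x factor comes from the talk; the one-second-per-molecule cost of the full physics simulation is my own assumption for illustration:

```python
# Back-of-envelope check of the claimed speedup. The 300,000x factor is
# from the talk; the per-molecule simulator cost is an assumption made
# purely for the sake of the arithmetic.
PHYSICS_SECONDS_PER_MOLECULE = 1.0   # assumed cost of one full physics run
SPEEDUP = 300_000                    # neural surrogate vs. physics simulator
LUNCH_BREAK_SECONDS = 60 * 60        # one hour

surrogate_seconds = PHYSICS_SECONDS_PER_MOLECULE / SPEEDUP
molecules_per_lunch = LUNCH_BREAK_SECONDS / surrogate_seconds
print(f"{molecules_per_lunch:,.0f}")  # ≈ 1.08 billion at this assumed baseline
```

Even if the baseline simulator were ten times slower than assumed here, you would still get through over 100 million molecules in an hour.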

Photo credit: Jeff Dean speaking at Google I/O 2019

This is a huge step forward, because it lets us apply the remarkable reinforcement learning shown at re:MARS to entirely new kinds of problems. Before these advances, the cycle time of running a full physics simulator for every candidate outcome was too long for RL to reach a rewarding result. Now, RL can learn the physics of molecules and optimize for the properties chemical engineers actually want.

Photo credit: https://xkcd.com/435/

Given that everything ultimately reduces to physics, we can even imagine a world in which more and more designs are built up from first principles. Before the conference, many people thought research into simulating biology was out of reach, but companies like Insitro are already tackling these problems.

RL will then become available to “higher-level” sciences such as psychology, thanks to:

1. Raw computing power: Google has announced its proprietary TPU v3 Pods, delivering over 100 petaflops and built to run neural-network training workloads. With that kind of computing power, tasks like materials analysis become tractable to learn. In addition, Google has started using RL to design its own chips, which should compound these advances over time.

2. Better reusability: DeepMind uses multi-level network architectures in which an RL agent chooses the right downstream network for the task at hand. Such RL agents can be trained to simplify hard tasks by breaking them down, and to use transfer learning to solve multiple tasks.
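As a loose sketch of “RL choosing the right downstream network,” here is an epsilon-greedy multi-armed bandit routing between stand-in “expert” functions. This is a generic textbook technique used for illustration, not DeepMind’s actual architecture:

```python
import random

def epsilon_greedy_router(experts, reward_fn, steps=2000, epsilon=0.1, seed=0):
    """Learn which downstream 'expert' yields the highest reward.

    experts: list of callables (stand-ins for specialised networks).
    reward_fn: maps an expert's output to a scalar reward.
    """
    rng = random.Random(seed)
    counts = [0] * len(experts)
    values = [0.0] * len(experts)      # running mean reward per expert
    for _ in range(steps):
        if rng.random() < epsilon:     # explore: try a random expert
            i = rng.randrange(len(experts))
        else:                          # exploit: use the current best estimate
            i = max(range(len(experts)), key=lambda j: values[j])
        r = reward_fn(experts[i]())
        counts[i] += 1
        values[i] += (r - values[i]) / counts[i]
    return values

# Toy task: expert 2 happens to be the right network for this "mission".
experts = [lambda: 0.2, lambda: 0.5, lambda: 0.9]
values = epsilon_greedy_router(experts, reward_fn=lambda out: out)
print(max(range(3), key=lambda j: values[j]))  # 2
```

The same routing idea scales up when the “experts” are real trained networks and the reward is downstream task performance.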

3. Better generalization: The meta-learning techniques described above are improving RL agents’ ability to handle scenarios they have never encountered.

4. Better optimization: MIT’s lottery-ticket-hypothesis paper shows that neural networks can be compressed dramatically by finding “winning ticket” subnetworks and then training only those.
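The core pruning step behind the lottery-ticket idea can be shown in a few lines: rank weights by magnitude and keep only the top fraction. (The full procedure also rewinds the surviving weights to their initial values and retrains; that part is omitted here.)

```python
def magnitude_prune(weights, keep_fraction):
    """Return a 0/1 mask keeping the largest-magnitude weights.

    This is only the pruning step of the lottery-ticket procedure; the
    full method rewinds surviving weights to their initial values and
    retrains just the masked subnetwork.
    """
    k = max(1, int(len(weights) * keep_fraction))
    # k-th largest absolute value becomes the cutoff
    threshold = sorted((abs(w) for w in weights), reverse=True)[k - 1]
    return [1 if abs(w) >= threshold else 0 for w in weights]

weights = [0.01, -0.80, 0.05, 0.60, -0.02, 0.30]
mask = magnitude_prune(weights, keep_fraction=0.5)
print(mask)  # [0, 1, 0, 1, 0, 1]
```

In a real network the mask is applied per layer to the weight tensors; the list here stands in for one flattened layer.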

5. Better training-data generation: Interfaces like Autodesk’s generative design can help designers and engineers express the specifications an RL agent needs to work correctly. Self-driving car companies generate a new training scenario every time a human safety driver has to take over.

What should you do?

Photo source:

https://en.wikipedia.org/wiki/Reinforcement_learning#/media/File:Reinforcement_learning_diagram.svg

First, you need to learn about reinforcement learning. In brief: an RL agent observes the state of its environment, selects an action that affects that environment, observes the new state, and repeats. If an action leads to a good outcome, the agent receives a reward, and it becomes more likely to take the same sequence of actions in similar situations in the future.
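The loop just described can be sketched end-to-end with tabular Q-learning on a toy one-dimensional “corridor” world. The environment, rewards, and hyperparameters are invented for illustration, and Q-learning is just one classic way to implement the reinforcement step:

```python
import random

def q_learning_corridor(n_states=5, episodes=300, alpha=0.5, gamma=0.9,
                        epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor: start at cell 0, reward +1
    for reaching the rightmost cell. Actions: 0 = left, 1 = right."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]   # value of (state, action)
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # observe state, pick action (epsilon-greedy), act, get reward
            a = rng.randrange(2) if rng.random() < epsilon else int(q[s][1] >= q[s][0])
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # reinforce: nudge the (state, action) value toward the observed return
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = q_learning_corridor()
# After training, "right" beats "left" in every non-terminal state.
print(all(q[s][1] > q[s][0] for s in range(4)))  # True
```

The same observe/act/reward loop underlies everything from DeepRacer to robotic grasping; only the environment and the function approximator change.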

Repeat these steps many times, and the agent eventually gets very good at collecting rewards; that is what it means to train it. The best way to get hands-on experience is AWS DeepRacer, a scaled-down racing car that comes with a simulation environment, an RL training rig, and a physical car matching the simulation. You just need to adjust the reward function to train your own racing agent.
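To give a flavor of what “adjusting the reward function” means in practice, here is a minimal centerline-following reward in the DeepRacer style. The parameter names (`track_width`, `distance_from_center`, `all_wheels_on_track`) follow the DeepRacer input convention as I recall it from the docs; verify them against the current console documentation before relying on them:

```python
def reward_function(params):
    """DeepRacer-style reward: stay close to the track centerline.

    The parameter names used here follow AWS DeepRacer's documented
    input convention; check the current console docs before use.
    """
    if not params["all_wheels_on_track"]:
        return 1e-3                     # near-zero reward for leaving the track
    half_width = params["track_width"] / 2.0
    ratio = params["distance_from_center"] / half_width  # 0 = centred, 1 = at edge
    if ratio <= 0.1:
        return 1.0
    if ratio <= 0.5:
        return 0.5
    return 0.1

print(reward_function({"all_wheels_on_track": True,
                       "track_width": 1.0,
                       "distance_from_center": 0.02}))  # 1.0
```

Small changes to this function (rewarding speed, penalizing steering angle) produce visibly different driving styles after training, which is what makes DeepRacer such a good sandbox.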

Photo source:

https://www.semanticscholar.org/paper/OpenAI-Gym-Brockman-Cheung/2b10281297ee001a9f3f4ea1aa9bea6b638c27df/figure/0

Second, you need to actively look for ways to better simulate your business systems. Any existing simulator is a good starting point, but new simulators are where the big impact is likely to be. AWS offers a service called RoboMaker in this area, and there are many alternatives, most of them built on the open-source OpenAI Gym API.
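To make that concrete, here is a sketch of what “simulating a business system” might look like: a toy inventory process wrapped in the classic Gym-style `reset()`/`step()` interface, where `step()` returns (observation, reward, done, info). The class name, demand model, and costs are all invented for illustration:

```python
class InventoryEnv:
    """A toy business simulator exposed through the classic Gym-style
    reset()/step() interface. The demand model and cost numbers here
    are invented for illustration."""

    CAPACITY = 20
    HOLDING_COST = 0.1   # cost per unit left in stock each day
    PRICE = 1.0          # revenue per unit sold

    def __init__(self, horizon=30):
        self.horizon = horizon

    def reset(self):
        self.stock, self.day = 10, 0
        return self.stock                    # observation: current stock level

    def step(self, order_qty):
        """Action = units to order today."""
        self.stock = min(self.CAPACITY, self.stock + order_qty)
        demand = 3 + (self.day % 4)          # deterministic toy demand cycle
        sold = min(self.stock, demand)
        self.stock -= sold
        reward = sold * self.PRICE - self.stock * self.HOLDING_COST
        self.day += 1
        done = self.day >= self.horizon
        return self.stock, reward, done, {}

env = InventoryEnv()
obs, total, done = env.reset(), 0.0, False
while not done:                              # naive policy: reorder 4 every day
    obs, reward, done, _ = env.step(4)
    total += reward
print(f"total reward over {env.horizon} days: {total:.1f}")
```

Once a process is expressed in this interface, any Gym-compatible RL library can be pointed at it, which is exactly why so many simulators converge on this API.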

Finally, keep an eye on new companies riding this technology wave. We are likely to end up with a stack of open-source simulators that build on one another, with neural networks compressing the learnable information at each layer. Until then, there is plenty of room for proprietary solutions that beat the current state of the art. Over time, this technology should bring considerable benefits to science-based fields such as pharmaceuticals, materials science, medicine, oil and gas, and many others.