A Robot That Takes Walks and Plays Tennis

A robot can’t decide to go for a walk on its own, said Rodney Brooks, an artificial intelligence pioneer and founder of Rethink Robotics. “It doesn’t have the intent a dog has.” (Rethink makes factory robots that don’t need cages and can detect big changes in their work environment. “Is that scientifically hard? No. People in labs would have done that 20 years ago,” said Brooks. “But it’s gotta work 100 percent of the time.”)

Giving a machine intention is a difficult challenge. When a problem lives entirely in software, programmers can simulate it on computers, and progress doesn’t depend on physical movement; it depends only on how fast a computer can run the simulation.

Google’s DeepMind AI software played hundreds of thousands of rounds of the board game Go in a matter of months. It would take far longer to send robots on hundreds of thousands of walks in the woods.

To develop robots, you have two options. You can simulate the environment and the robot in software and hope the results are accurate enough to load into a machine and watch it walk. Or you can skip the simulation and tinker directly with a robot, learning from the real world, but that’s awfully slow.

Google faces this problem with its self-driving cars, and it tests them both ways. It has real cars drive a few thousand miles a week on real roads, and at the same time it simulates millions of miles a week driven by virtual cars on virtual roads. The idea is that the simulator can test out different scenarios to see how the cars react, and the real world can give Google data — and problems — that virtual cars don’t encounter. One time, a car confronted a man in a wheelchair chasing a turkey with a broom. This was not something Google had simulated.

The problem with robots is that they tend to be more complex than cars. Instead of wheels, you have legs, plus arms, necks, knee joints, and fingers. Simulating all of that accurately can be extremely difficult, but testing out all the different ways you can move the machine in flesh-and-blood reality takes years.

“Rosie the robot, you can’t have it knock over your furniture a hundred thousand times to learn,” said Gary Marcus, chief executive officer of Geometric Intelligence, an AI startup.

Sergey Levine recently worked on a project to tackle this problem at Google. The company programmed 14 robotic arms to spend 3,000 hours learning to pick up different items, teaching each other as they went. The project was a success, but it took months, and it used robot arms rather than an entire body.

“In order to make AI work in the real world and handle all the diversity and complexity of realistic environments, we will need to think about how to get robots to learn continuously and for a long time, perhaps in cooperation with other robots,” said Levine. That’s probably the only way to get robots that can handle the randomness of everyday tasks.

Source: Bloomberg.

What the article above describes is called parallel learning. A couple of weeks ago, Sergey Levine wrote in his post:

However, by linking learning with continuous feedback and control, we might begin to bridge that gap, and in so doing make it possible for robots to intelligently and reliably handle the complexities of the real world.

Consider for example this robot from KAIST, which won last year’s DARPA robotics challenge. The remarkably precise and deliberate motions are deeply impressive. But they are also quite… robotic. Why is that? What makes robot behavior so distinctly robotic compared to human behavior? At a high level, current robots typically follow a sense-plan-act paradigm, where the robot observes the world around it, formulates an internal model, constructs a plan of action, and then executes this plan. This approach is modular and often effective, but tends to break down in the kinds of cluttered natural environments that are typical of the real world. Here, perception is imprecise, all models are wrong in some way, and no plan survives first contact with reality.
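To make the sense-plan-act loop concrete, here is a toy sketch of my own in Python (the one-dimensional robot, the noise levels, and the helper functions are all made up for illustration, not from Levine's post). The robot senses once, plans a full trajectory, and then executes it blindly:

```python
import random

random.seed(0)

def sense(position):
    # Imprecise perception: the robot never sees its true position.
    return position + random.gauss(0, 0.05)

def execute(position, step):
    # Imperfect actuation: every step lands slightly off target.
    return position + step + random.gauss(0, 0.02)

def sense_plan_act(start, goal, steps=20):
    # Sense once, build a plan for the whole trajectory, execute blindly.
    position = start
    planned_step = (goal - sense(position)) / steps
    for _ in range(steps):
        position = execute(position, planned_step)
    return position

print(sense_plan_act(0.0, 1.0))  # ends near 1.0, but errors go uncorrected
```

The initial sensing error is baked into the plan, and actuation noise accumulates with every step; nothing along the way corrects either one.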

In contrast, humans and animals move quickly, reflexively, and often with remarkably little advance planning, by relying on highly developed and intelligent feedback mechanisms that use sensory cues to correct mistakes and compensate for perturbations. For example, when serving a tennis ball, the player continually observes the ball and the racket, adjusting the motion of his hand so that they meet in the air. This kind of feedback is fast, efficient, and, crucially, can correct for mistakes or unexpected perturbations. Can we train robots to reliably handle complex real-world situations by using similar feedback mechanisms to handle perturbations and correct mistakes?
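Continuing the toy sketch above, the feedback version is barely longer: it re-senses at every step and moves a fraction of the remaining error, so it homes in on the goal even though perception and actuation are exactly as noisy as before:

```python
def feedback_control(start, goal, steps=20, gain=0.5):
    # Closed loop: re-sense the world at every step and correct a
    # fraction of the remaining error, like tracking the ball mid-serve.
    position = start
    for _ in range(steps):
        error = goal - sense(position)
        position = execute(position, gain * error)
    return position

print(feedback_control(0.0, 1.0))  # converges to ~1.0 despite the noise
```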

While servoing and feedback control have been studied extensively in robotics, the question of how to define the right sensory cue remains exceptionally challenging, especially for rich modalities such as vision. So instead of choosing the cues by hand, we can program a robot to acquire them on its own from scratch, by learning from extensive experience in the real world. In our first experiments with real physical robots, we decided to tackle robotic grasping in clutter.
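The real project trains a deep convolutional network on raw camera images; as a loose sketch of the idea only (the feature vectors and function names are my invention, not the paper's architecture), here is a logistic model that learns a "grasp score" from what the robot sees plus a candidate motor command:

```python
import numpy as np

rng = np.random.default_rng(0)

def grasp_score(weights, image_feats, command):
    # Learned sensory cue: map (camera features, candidate motion)
    # to a probability that the grasp will succeed.
    x = np.concatenate([image_feats, command])
    return 1.0 / (1.0 + np.exp(-weights @ x))

def sgd_step(weights, image_feats, command, succeeded, lr=0.1):
    # One gradient step on the logistic loss for a single grasp attempt.
    x = np.concatenate([image_feats, command])
    p = grasp_score(weights, image_feats, command)
    return weights + lr * (succeeded - p) * x

def pick_command(weights, image_feats, candidates):
    # Act by search: score each candidate command, keep the best one.
    return max(candidates, key=lambda c: grasp_score(weights, image_feats, c))

weights = np.zeros(8)                       # 4 image features + 4 command dims
image_feats = rng.normal(size=4)
candidates = [rng.normal(size=4) for _ in range(5)]
command = pick_command(weights, image_feats, candidates)
weights = sgd_step(weights, image_feats, command, succeeded=1.0)
```

Every attempted grasp, success or failure alike, becomes a training example; the cue itself is never hand-designed.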

A human child is able to reliably grasp objects after one year, and takes around four years to acquire more sophisticated precision grasps. However, networked robots can instantaneously share their experience with one another, so if we dedicate 14 separate robots to the job of learning grasping in parallel, we can acquire the necessary experience much faster.
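This is where the sketch above doubles as a picture of parallel learning (again hypothetical, and wildly simplified): if every robot's attempts update one shared set of weights, the pooled experience grows fourteen times faster in wall-clock time than a single arm could manage:

```python
def parallel_learning(weights, num_robots=14, rounds=1000):
    # Each round, every robot attempts one grasp "simultaneously";
    # all outcomes flow into the same shared weights, so experience
    # accumulates num_robots times faster than with a single arm.
    for _ in range(rounds):
        for _ in range(num_robots):
            image_feats = rng.normal(size=4)
            candidates = [rng.normal(size=4) for _ in range(5)]
            command = pick_command(weights, image_feats, candidates)
            succeeded = float(rng.random() < 0.5)  # stand-in for a real trial
            weights = sgd_step(weights, image_feats, command, succeeded)
    return weights
```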

The vision modality, or sight intelligence, is a different breed from the kind of intelligence used for playing chess or Go, and much more difficult to design. Unlike humans, though, robots can share whatever they’ve learned, in full and instantaneously, which allows for exponential progress over time.

Maybe in my lifetime I’ll see two robots play a tennis match against each other. Wouldn’t that be cool? Or a team of robot basketball or football players. I would pay good money to see that too.
