Teaching robots to learn about the world through touch

Slowly but surely, the Baxter robot is learning. It starts with a series of random grasps — the big, red robot pokes and prods clumsily at objects on the table in front of it. The process is pretty excruciating to us humans, with around 50,000 grasps unfolding over the course of a month of eight-hour days. The robot is learning through tactile feedback and trial and error — or, as the Carnegie Mellon computer science team behind the project puts it, it’s learning about the world around it like a baby.

In a paper titled “The Curious Robot: Learning Visual Representations via Physical Interactions,” the team demonstrates how an artificial intelligence can be trained to learn about objects by repeatedly interacting with them. “For example,” the CMU students write, “babies push objects, poke them, put them in their mouth and throw them to learn representations. Towards this goal, we build one of the first systems on a Baxter platform that pushes, pokes, grasps and observes objects in a tabletop environment.”

By the time we arrive at the CMU campus, Baxter has, thankfully, already slogged through the process numerous times. Lab assistant Dhiraj Gandhi has set out a demo for us. The robot stands behind a table and Gandhi lays out a strange cross-section of objects. There’s a pencil case, an off-brand Power Ranger, some car toys and a stuffed meerkat, among others — dollar bin tchotchkes selected for their diverse and complex shapes.

The demo is a combination of familiar and unfamiliar objects, and the contrast is immediately apparent. When the robot recognizes an object, it grasps it firmly, with a smile on its tablet-based face, dropping it into the appropriate box. If the object is unfamiliar, the face contorts, turning red and confused — but it’s nothing that another 50,000 or so additional grasps can’t solve.

The research marks a change from more traditional forms of computer vision learning, in which systems are taught to recognize objects through a “supervised” process that involves inputting labels. CMU’s robot teaches itself all on its own. “Right now what happens in computer vision is you have passive data,” explains Gandhi. “There is no interaction between the image and how you get the label. What we want is that you get active data as you interact with the objects. Through that data, we want to learn features that will be useful for other vision tasks.”
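The distinction Gandhi draws can be illustrated with a minimal sketch. This is not the CMU team’s code — the object properties, success probabilities, and `simulate_grasp` function are all hypothetical — but it shows the core idea: in passive supervised learning a human types in the label, while in active learning the outcome of a physical interaction (did the grasp succeed?) becomes the training signal for free.

```python
# Hypothetical sketch: passive human-labeled data vs. active,
# interaction-derived labels. All names and numbers are illustrative.
import random

random.seed(0)

def simulate_grasp(obj):
    """Stand-in for a physical interaction: assume a grasp on a
    rigid object succeeds more often than one on a soft object.
    The outcome itself is the label -- no human annotation needed."""
    p_success = 0.8 if obj["rigid"] else 0.3
    return random.random() < p_success

# Passive, supervised data: (features, label typed in by a person).
passive = [({"rigid": True}, "pencil case")]

# Active data: the robot interacts and records each outcome itself.
objects = [{"rigid": True}, {"rigid": False}] * 50
active = [(obj, simulate_grasp(obj)) for obj in objects]

# The recorded outcomes can now supervise feature learning: here the
# robot discovers on its own that rigidity predicts grasp success.
success_rate_rigid = sum(ok for o, ok in active if o["rigid"]) / 50
success_rate_soft = sum(ok for o, ok in active if not o["rigid"]) / 50
```

The key design point is that the label generator is the world itself: collecting 50,000 grasps is slow, but every one arrives pre-labeled by its outcome.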