Personal assistants such as Alexa, Google Home, Siri, and Cortana let cell phones and computers understand and respond to voice commands. However, these assistants struggle with the nuances of human language: the semantic content of a sentence (what we are talking about, what we want to say) is closely tied to the context in which it is said, which makes the meaning harder for them to grasp.
The ComText System
A team from the Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory (CSAIL) has developed a system that gives human-machine commands context: ComText, short for "commands in context."
With ComText, the user can acquaint the robot with its environment and then issue commands that draw on information the system has previously recorded and analyzed. To do this, the user simply talks to the robot through the ComText personal assistant. For example, if you put a tool such as a hammer in a toolbox and then ask the robot to fetch it, ComText allows the robot to understand the request and respond to it: the robot identifies the object in question (the hammer to be retrieved), recognizes it in a complex environment (the toolbox), and picks it up.
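The hammer-in-the-toolbox scenario can be illustrated with a toy sketch. This is not ComText's actual algorithm or API (all names here are invented for illustration); it only shows the general idea of grounding a spoken command against objects the robot has previously observed.

```python
# Hypothetical illustration, not ComText's real implementation:
# the robot keeps a record of objects it has seen and where they are,
# then matches a spoken command against that record.

scene = {
    "hammer": {"location": "toolbox"},
    "screwdriver": {"location": "bench"},
}

def ground_command(command, scene):
    """Match the object mentioned in a command against known objects."""
    for obj, info in scene.items():
        if obj in command.lower():
            return obj, info["location"]
    return None, None

obj, where = ground_command("Pick up the hammer", scene)
print(f"Fetch the {obj} from the {where}")  # Fetch the hammer from the toolbox
```

A real system must of course handle far more than keyword matching (synonyms, pronouns, visual recognition), which is precisely where ComText's recorded context comes in.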
The system uses the Alexa intelligent personal assistant developed by Amazon. It was tested on Baxter, a robot from Rethink Robotics, which correctly executed 90% of the requested tasks.
According to Rohan Paul, a postdoctoral researcher at CSAIL, ComText will expand the semantic representations of personal assistants and make human-machine language exchanges more intelligent.
ComText Adds Episodic Memory to Robots
Remembering facts and events, such as birthdays, draws on declarative (or explicit) memory. There are two types of declarative memory: semantic memory, which stores general facts like "the sky is blue," and episodic memory, which stores personal experiences, like remembering what happened at a party.
Most robot learning systems rely solely on semantic memory (for example, an object's size, colour, and coordinates). ComText adds episodic memory to the robot's learning system, allowing linguistic and visual information to be recorded and analyzed (for example, who owns the hammer in the toolbox).
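The distinction between the two memory types can be sketched in code. This is a minimal, hypothetical model (the class and method names are invented, not taken from ComText): semantic memory holds static object properties, while episodic memory logs timestamped events that later commands can refer back to.

```python
from dataclasses import dataclass, field
import time

@dataclass
class RobotMemory:
    """Toy model of a robot's declarative memory (illustrative only)."""
    semantic: dict = field(default_factory=dict)   # object -> properties
    episodic: list = field(default_factory=list)   # chronological event log

    def observe(self, obj, **properties):
        # Semantic memory: general facts about an object (size, colour...).
        self.semantic.setdefault(obj, {}).update(properties)

    def record_event(self, actor, action, obj, location):
        # Episodic memory: a specific, timestamped thing that happened.
        self.episodic.append({"t": time.time(), "actor": actor,
                              "action": action, "object": obj,
                              "location": location})

    def resolve(self, obj):
        """Find where `obj` was last placed, scanning events newest-first."""
        for event in reversed(self.episodic):
            if event["object"] == obj and event["action"] == "put":
                return event["location"]
        return None

memory = RobotMemory()
memory.observe("hammer", colour="red", size="medium")
memory.record_event("user", "put", "hammer", "toolbox")

# "Pick up the hammer" -> the robot recalls where it last went.
print(memory.resolve("hammer"))  # toolbox
```

Combining both stores is what lets a command like "pick up the tool I just put down" be answered: semantic memory says what a hammer is, episodic memory says what happened to this one.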
The information gathered from these two types of memory expands ComText's semantic representations and consequently allows it to understand and respond to complex commands (declarative and exclamatory statements). The CSAIL team's research contribution is its success in creating the first mathematical representation that allows this type of learning to be programmed.
The team also plans to extend the system's learning to more complex information, such as multi-step commands and intentions, and to use the properties of objects in order to interact with them more spontaneously.
Luke Zettlemoyer, Associate Professor of Computer Science at the University of Washington (who was not involved in the research), pointed out that this work will help robots better understand the terms people use to name objects, interpret instructions given with those terms, and thus better perform the requested tasks.
ComText also promises to revolutionize home automation and autonomous systems like self-driving vehicles.
The technology was presented at the International Joint Conference on Artificial Intelligence (IJCAI) in Australia. The work was funded in part by the Toyota Research Institute, the National Science Foundation, the US Army Robotics Collaborative Technology Alliance, and the Air Force Research Laboratory.