Amazon’s AI uses meta-learning to accomplish related tasks
In a paper scheduled to be presented at the upcoming International Conference on Learning Representations, Amazon researchers propose an AI approach that greatly improves performance on certain meta-learning tasks (i.e., tasks that involve both accomplishing related goals and learning how to learn to perform them). They say it can be adapted to new tasks with only a handful of labeled training examples, meaning a large corporation could use it to, for example, extract charts and captions from scanned paperwork.
In conventional machine learning, a model trains on a set of labeled data (a support set) and learns to correlate features with labels. It’s then fed a separate set of test data (a query set) and evaluated on how well it predicts that set’s labels. During meta-learning, by contrast, a model learns to perform a range of tasks, each with its own support and query set, and it sees both. In this way, the model learns how particular ways of responding to the training data affect its performance on the test data.
During a second stage called meta-testing, the model is trained on tasks that are related but not identical to the tasks it saw during meta-learning. For each task, the model once again sees both training and test data, but it can access only the support set’s labels; the query set’s labels are unknown and must be predicted.
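The episodic support/query structure can be made concrete with a toy one-shot task. The sketch below is purely illustrative, not the researchers’ system: the “local model” here is a simple nearest-centroid classifier, and the feature vectors and class names are invented for the example.

```python
# Illustrative few-shot episode: support labels are visible to the model,
# query labels must be predicted. The classifier is a toy nearest-centroid
# rule in 2-D feature space -- a stand-in, not the paper's architecture.

def nearest_centroid_predict(support, query):
    """support: list of (feature_vector, label) pairs; query: list of feature vectors."""
    # Build one centroid per class from the labeled support examples.
    groups = {}
    for x, y in support:
        groups.setdefault(y, []).append(x)
    centroids = {y: tuple(sum(v) / len(xs) for v in zip(*xs)) for y, xs in groups.items()}

    # Assign each query point to the class with the closest centroid.
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda y: dist2(x, centroids[y])) for x in query]

# One-shot task: a single labeled example per class in the support set.
support = [((0.0, 0.0), "chart"), ((5.0, 5.0), "caption")]
query = [(0.5, 0.2), (4.8, 5.1)]
print(nearest_centroid_predict(support, query))  # ['chart', 'caption']
```

In a meta-learning loop, many such episodes are drawn from different tasks, so the system learns a strategy for adapting to a fresh support set rather than a single fixed classifier.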
The researchers’ technique doesn’t learn a single global model during meta-training. Instead, it trains an auxiliary model to generate a local model for each task, drawing on the corresponding support set. Also during meta-training, it prepares an auxiliary network to leverage the unlabeled data of the query sets; during meta-testing, it uses the query sets to fine-tune those local models.
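One simple way unlabeled query data can refine a per-task local model is a single self-training step: pseudo-label the query points with the current model, then re-estimate the model from support plus pseudo-labeled data. This hand-written rule is only a sketch of the general idea; the paper’s auxiliary network is learned during meta-training, and the data below is invented.

```python
# Illustrative self-training step: refine a task's local model (class
# centroids built from the labeled support set) using the unlabeled
# query set. Not the paper's learned auxiliary network.

def dist2(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def centroids_from(pairs):
    """Average the feature vectors of each class into one centroid."""
    groups = {}
    for x, y in pairs:
        groups.setdefault(y, []).append(x)
    return {y: tuple(sum(v) / len(xs) for v in zip(*xs)) for y, xs in groups.items()}

def refine_with_query(support, query):
    # Initial local model: centroids from the labeled support set.
    cents = centroids_from(support)
    # Pseudo-label the unlabeled query points with the current model...
    pseudo = [(x, min(cents, key=lambda y: dist2(x, cents[y]))) for x in query]
    # ...then rebuild the centroids from support + pseudo-labeled query.
    return centroids_from(support + pseudo)

support = [((0.0, 0.0), "chart"), ((4.0, 4.0), "caption")]
query = [(2.0, 0.0), (4.0, 2.0)]
refined = refine_with_query(support, query)
print(refined["chart"])  # (1.0, 0.0)
```

The refined centroids shift toward the query distribution, which is the intuition behind fine-tuning local models on the query sets at meta-test time.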
In experiments, the team reports that their system beat 16 baselines on the task of one-shot learning. Specifically, it improved performance on one-shot learning, or learning a new object classification task from only a single labeled example, by 11% to 16%, depending on the architecture of the underlying AI models.
That said, several baselines outperformed the model on five-shot learning, or learning with five examples per new task. But the researchers say those baselines are complementary to their approach, and they believe that combining approaches could yield lower error rates.
“In the past decade, deep-learning systems have proven remarkably successful at many artificial-intelligence tasks, but their applications tend to be narrow,” wrote Alexa Shopping applied scientist Pablo Garcia in a blog post explaining the work. “Meta-learning [can] turn machine learning systems into generalists … The idea is that it could then be adapted to new tasks with only a handful of labeled training examples, drastically reducing the need for labor-intensive data annotation.”
The paper’s publication follows that of a study by Google AI, the University of California, Berkeley, and the University of Toronto proposing a benchmark for training and evaluating large-scale, diverse, and “more realistic” meta-learning models. The so-called Meta-Dataset leverages data from 10 different corpora, which span a variety of visual concepts, natural and human-made, and vary in the specificity of their class definitions.