Transfer learning is a technique that’s risen to prominence in the AI and machine learning community over the past several decades. It refers to storing knowledge gained while solving one problem and applying it to a different, but related, problem. So far, transfer learning has been applied to cancer subtype discovery, video game playing, text classification, medical imaging, spam filtering, and more. Prominent computer scientist Andrew Ng said in 2016 that transfer learning would be one of the major drivers of machine learning’s commercial success.
Transfer learning has its benefits, chief among them allowing companies to repurpose machine learning models for new problems with less training data. But transfer learning is often simpler in theory than in execution. For example, models trained on one problem and applied to another can suffer from negative transfer, where the knowledge carried over from the original problem makes the model less accurate on the new one.
The origins of transfer learning lie in a study conducted by academics Stevo Bozinovski and Ante Fulgosi in 1976. In it, the coauthors proposed the use of transfer learning in neural networks during the model training process. Nearly a decade later, a report was given on the application of transfer learning in character recognition. But the technique isn’t thought to have entered the mainstream until around 1995, when it was presented at a workshop during the NIPS machine learning conference in Denver, Colorado.
Models are trained in two stages in transfer learning. First, there’s pretraining, where the model is trained on a benchmark dataset representing a range of categories. Next is fine-tuning, where the model is further trained on a target task of interest. The pretraining step helps the model to learn general features that can be reused on the target task, boosting its accuracy.
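The two stages can be sketched in a few lines of plain Python. This is a toy illustration rather than a real workflow: a small linear model stands in for a neural network, and the datasets, learning rate, and step counts are all made-up assumptions chosen to keep the example short. It pretrains on a larger source dataset, then compares a brief fine-tuning run against training from scratch on a tiny target dataset.

```python
# Toy sketch of pretraining followed by fine-tuning.
# Assumptions: a linear model stands in for a neural network, and all
# data, learning rates, and step counts below are illustrative only.

def predict(weights, features):
    return sum(w * x for w, x in zip(weights, features))

def mse(weights, data):
    """Mean squared error of the model over (features, label) pairs."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def train(weights, data, lr=0.1, steps=100):
    """Batch gradient descent on mean squared error; returns new weights."""
    weights = list(weights)
    n = len(data)
    for _ in range(steps):
        grads = [0.0] * len(weights)
        for x, y in data:
            err = predict(weights, x) - y
            for i, xi in enumerate(x):
                grads[i] += 2.0 * err * xi / n
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights

# Source task (plenty of data): y = 2a + 3b + 1. The last feature is a bias term.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
source_data = [([a, b, 1.0], 2 * a + 3 * b + 1.0) for a in grid for b in grid]

# Related target task (only four examples): y = 2a + 3b + 1.5.
corners = [-1.0, 1.0]
target_data = [([a, b, 1.0], 2 * a + 3 * b + 1.5) for a in corners for b in corners]

# Stage 1: pretraining on the source task.
pretrained = train([0.0, 0.0, 0.0], source_data, steps=200)

# Stage 2: fine-tuning on the target task, starting from the pretrained weights.
fine_tuned = train(pretrained, target_data, steps=5)

# Baseline: the same short training budget, but starting from zero.
from_scratch = train([0.0, 0.0, 0.0], target_data, steps=5)

print("fine-tuned target MSE:  ", mse(fine_tuned, target_data))
print("from-scratch target MSE:", mse(from_scratch, target_data))
```

Because the two tasks differ only in their offset, the pretrained weights start close to the target solution, so the fine-tuned model reaches a lower error than the from-scratch baseline within the same five steps.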
Transfer learning has a wealth of use cases, particularly in image and speech recognition as well as natural language processing (NLP). For instance, a model trained for an autonomous car can likely be leveraged, at least in part, for an autonomous truck. And a model that developed strategies while playing the Chinese board game Go, such as DeepMind’s AlphaZero, can likely be adapted to related games like chess.
Google and Amazon are using transfer learning in Google Translate and Alexa so that the insights gleaned through training on high-resource languages (e.g., French, German, and Spanish) can be applied to the translation of low-resource languages (e.g., Yoruba, Sindhi, and Hawaiian). Meanwhile, Yelp has used transfer learning to identify which photos uploaded by users to business listings are most likely to contain spam.
There are several different kinds of transfer learning, each with its own upsides: inductive, unsupervised, and transductive transfer learning. With inductive transfer learning, the source and target domains are the same, yet the source and target tasks are different. Unsupervised transfer learning involves different tasks in similar (but not identical) source and target domains, with no labeled data in either. As for transductive transfer learning, similarities exist between the source and target tasks, but the domains are different and labeled data is available only in the source domain.
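One way to keep the three settings straight is to write the distinctions as a small decision function. The sketch below simply encodes the simplified definitions from the paragraph above. The function name, its boolean arguments, and the "unclassified" fallback are hypothetical choices made for illustration, not a standard API.

```python
# Hypothetical helper that encodes the three settings described above.
# The name, arguments, and "unclassified" fallback are illustrative assumptions.

def transfer_setting(same_domain, same_task, source_labeled, target_labeled):
    """Map a source/target setup to the transfer learning type it matches."""
    # Unsupervised: different tasks, similar but not identical domains,
    # and no labeled data on either side.
    if not same_task and not same_domain and not source_labeled and not target_labeled:
        return "unsupervised"
    # Transductive: related tasks, different domains,
    # labels available only in the source domain.
    if same_task and not same_domain and source_labeled and not target_labeled:
        return "transductive"
    # Inductive: same domain, different tasks.
    if same_domain and not same_task:
        return "inductive"
    return "unclassified"

# Example: a sentiment model trained on labeled movie reviews, applied to
# unlabeled product reviews (same task, different domain).
print(transfer_setting(same_domain=False, same_task=True,
                       source_labeled=True, target_labeled=False))
```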
Transfer learning can be further categorized by the components of the model being transferred. Instance transfer, for example, reuses selected or reweighted data from the source domain when training on the target task, while parameter transfer works on the assumption that the models for related tasks share some parameters. Parameters are the values internal to a model (including weights) that are learned from the training data.
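Parameter transfer can be illustrated with plain dictionaries standing in for model weights. In this minimal sketch, the layer names, the numbers, and the `transfer_parameters` helper are all hypothetical. It copies the parameters that two models are assumed to share (here, anything under an "encoder." prefix) from the source model into the target model and leaves the task-specific "head" untouched. Marking the copied layers as frozen is one common choice, not a requirement.

```python
import copy

# Hypothetical helper: model parameters are represented as plain dicts
# mapping layer names to lists of weights. Names and values are made up.

def transfer_parameters(source_params, target_params, shared_prefixes):
    """Copy shared parameters from source to target.

    Returns a new parameter dict for the target model plus the set of
    parameter names that were copied (candidates for freezing).
    """
    new_params = copy.deepcopy(target_params)
    copied = set()
    prefixes = tuple(shared_prefixes)
    for name, value in source_params.items():
        if name in new_params and name.startswith(prefixes):
            new_params[name] = copy.deepcopy(value)
            copied.add(name)
    return new_params, copied

# Source model: trained on the original task (three output classes).
source_model = {
    "encoder.layer1": [0.12, -0.40],
    "encoder.layer2": [0.77, 0.05],
    "head": [1.3, -0.2, 0.9],
}

# Target model: same encoder shape, but a new two-class head.
target_model = {
    "encoder.layer1": [0.0, 0.0],
    "encoder.layer2": [0.0, 0.0],
    "head": [0.0, 0.0],
}

adapted, frozen = transfer_parameters(source_model, target_model, ["encoder."])
print("adapted parameters:", adapted)
print("frozen during fine-tuning:", sorted(frozen))
```

Only the head would then be trained on the target data, which is where much of the time savings of fine-tuning comes from.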
Transfer learning has plenty in the way of advantages, namely that it speeds up the process of training on a new task. Whereas models like OpenAI’s GPT-3 and DeepMind’s AlphaStar might need powerful hardware and countless hours to train, a “fine-tuned” model created through transfer learning typically requires a fraction of the time and effort.
As PJ Kirk, digital marketing executive at data analytics firm Analytics Engines, points out, transfer learning can enable more organizations to incorporate AI and machine learning into their core business strategies. “The reduced financial, time, and infrastructural costs have made AI and machine learning more accessible than ever before,” he wrote in a blog post. “Organizations no longer need to create dedicated deep learning models and can instead capitalize upon the expertise and models of others to provide the foundation upon which their solution is built.”
In good news on the explainability front, researchers at Google recently published a paper that shed light on transfer learning’s fundamentals. They found that features become more specialized the “denser” the model is, and that feature reuse is more prevalent in the parts of the model closer to the input data. Beyond this, they discovered that it’s possible to fine-tune pretrained models on a target task earlier than originally assumed, without sacrificing accuracy.
Work like Google’s illustrates that the challenges around transfer learning aren’t insurmountable. In any case, the benefits certainly appear to be worth it.
Kevin Dewalt, cofounder of AI consultancy Prolego, posits that transfer learning is in equal parts efficient and economical. “Suppose your CFO only approves enough budget to generate 1,000 pictures of meals labeled with calories — a mere 1% of what your data scientist requested. Before begging for more money, you [can generate] results through transfer learning,” he wrote in a Medium post. “Unless you’re Google or Facebook, getting labeled data can be prohibitively expensive. Transfer learning techniques provide two primary business benefits: Faster experiments [and] higher ROI, [because] transfer learning can reduce the cost of ongoing data managements and boost the ROI of any machine learning project.”
© 2021 LeackStat.com