Resource retrieval
Get the pre-trained net:
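The sketches in this section assume the repository title shown below; the exact title should be checked against the repository listing:

    name = "T5 Trained on Colossal Clean Crawled Corpus Data"; (* assumed resource title *)
    net = NetModel[name]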
NetModel parameters
This model consists of a family of individual nets, each identified by a specific architecture. Inspect the available parameters:
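"ParametersInformation" is a standard property of parametrized NetModel families; for example:

    NetModel[name, "ParametersInformation"]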
Pick a non-default net by specifying the parameters:
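For example, assuming the family exposes a "Size" parameter (the parameter name and its values are assumptions):

    NetModel[{name, "Size" -> "Large"}]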
NetModel tokenizers
Pick the original "UnigramTokenizer" used to train the nets. It finds a globally optimal segmentation of the input and yields the most accurate, model-consistent token IDs:
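A hedged sketch; whether the tokenizers are exposed through a NetModel parameter, and under what name, is an assumption:

    unigram = NetModel[{name, "Tokenizer" -> "Unigram"}]; (* parameter name assumed *)
    unigram["The quick brown fox jumps over the lazy dog."]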
Pick a faster tokenizer, which finds the longest matching token at each step. It can produce segmentations that differ from the "UnigramTokenizer":
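Under the same assumption about the parameter name:

    greedy = NetModel[{name, "Tokenizer" -> "Greedy"}]; (* parameter name assumed *)
    greedy["The quick brown fox jumps over the lazy dog."] (* the ID sequence can differ from the unigram one *)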
Evaluation function
Write a function that preprocesses a list of input sentences:
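A minimal sketch: tokenize each sentence, append the end-of-sequence token and pad to a common length. T5 uses ID 0 for padding and ID 1 for </s>; the tokenizer argument is any function mapping a string to a list of token IDs:

    preprocess[sentences_List, tokenizer_] :=
     Module[{ids, maxLen},
      ids = Map[Append[Normal@tokenizer[#], 1] &, sentences]; (* append </s> (ID 1) *)
      maxLen = Max[Length /@ ids];
      <|
       "input_ids" -> (PadRight[#, maxLen, 0] & /@ ids), (* pad with ID 0 *)
       "attention_mask" -> (PadRight[ConstantArray[1, Length[#]], maxLen, 0] & /@ ids),
       "token_type_ids" -> ConstantArray[0, {Length[ids], maxLen}] (* unused by T5 *)
      |>]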
Write an evaluation function to combine the encoder and decoder nets into a full generation pipeline:
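A minimal greedy-decoding sketch, not the published pipeline: the "Part" values, the decoder port names and the detokenize helper (a hypothetical inverse tokenizer, mapping IDs back to a string) are all assumptions:

    netevaluate[text_String, tokenizerName_String: "Unigram", maxTokens_: 60] :=
     Module[{enc, dec, tok, ids, encoded, out = {0}, logits, next},
      enc = NetModel[{name, "Part" -> "TextEncoder"}]; (* part names assumed *)
      dec = NetModel[{name, "Part" -> "TextDecoder"}];
      tok = NetModel[{name, "Tokenizer" -> tokenizerName}];
      ids = Append[Normal@tok[text], 1];      (* append </s> *)
      encoded = enc[ids];
      Do[
       logits = dec[<|"Tokens" -> out, "EncoderOutput" -> encoded|>]; (* port names assumed *)
       next = First[Ordering[Normal@Last[logits], -1]] - 1; (* greedy pick; IDs assumed 0-based *)
       If[next == 1, Break[]];                (* stop at </s> *)
       AppendTo[out, next],
       {maxTokens}];
      detokenize[Rest[out]] (* hypothetical helper: token IDs -> string *)
     ]

The decoder is seeded with the padding ID 0, which T5 uses as its decoder start token.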
Basic usage
Define a text to summarize:
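For example:

    text = "The tower is 324 metres tall, about the same height as an 81-storey building, and is the tallest structure in Paris. Its base is square, measuring 125 metres on each side. It was the first structure in the world to reach a height of 300 metres.";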
Evaluate a net:
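Using the netevaluate sketch above; T5 selects its task through a text prefix:

    netevaluate["summarize: " <> text]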
Evaluate a net using the "Greedy" tokenizer:
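The sketch takes the tokenizer choice as a second argument:

    netevaluate["summarize: " <> text, "Greedy"]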
Feature extraction
Get the sentences:
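For example, two informal topic clusters:

    sentences = {
      "I enjoy hiking in the mountains.",
      "We went for a walk on the beach.",
      "The stock market closed higher today.",
      "Investors reacted to the earnings report."
    };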
Encode the input sentences:
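Using the preprocess sketch above (tokenizer retrieval as assumed earlier):

    encoded = preprocess[sentences, NetModel[{name, "Tokenizer" -> "Unigram"}]]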
The function returns "input_ids", the numerical token representations of the text; "attention_mask", which marks real tokens versus padding; and "token_type_ids", which carry segment identifiers but are unused by T5:
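Inspect the result:

    Keys[encoded]
    Dimensions[encoded["input_ids"]]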
Get the features:
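One way to obtain sentence-level features is to mean-pool the encoder's token embeddings (part name assumed as before; this simple sketch also pools over padding positions):

    encoderNet = NetModel[{name, "Part" -> "TextEncoder"}];
    features = Map[Mean[Normal@encoderNet[#]] &, encoded["input_ids"]];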
Visualize the normalized aggregated embeddings in a feature space:
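A sketch using FeatureSpacePlot on the normalized vectors:

    FeatureSpacePlot[MapThread[Labeled, {Normalize /@ features, sentences}]]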
Downstream tasks
T5 is a general-purpose language model designed to handle many NLP tasks using a single unified format. Unlike traditional task-specific architectures (e.g. separate models for sentiment analysis, question answering and translation), T5 treats every NLP task as a text-to-text problem. Define task templates for different downstream tasks:
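A sketch of such templates, using the task prefixes from the original T5 training setup:

    templates = <|
      "SST-2"  -> ("sst2 sentence: " <> # &),
      "QNLI"   -> ("qnli question: " <> #1 <> " sentence: " <> #2 &),
      "SQuAD"  -> ("question: " <> #1 <> " context: " <> #2 &),
      "CNN"    -> ("summarize: " <> # &),
      "WMTe2g" -> ("translate English to German: " <> # &),
      "WMTe2f" -> ("translate English to French: " <> # &),
      "WMTe2r" -> ("translate English to Romanian: " <> # &)
    |>;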
Write an evaluation function that runs a net on an input prompt and task:
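A one-line sketch on top of netevaluate and the templates above:

    runTask[task_String, args__String] := netevaluate[templates[task][args]]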
The Stanford Sentiment Treebank (SST-2) template can be used to classify the sentiment of a single sentence as either positive or negative:
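For example (T5 answers with the literal strings "positive" or "negative"):

    runTask["SST-2", "This movie was absolutely wonderful!"]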
The Question-Answering Natural Language Inference (QNLI) template can be used to determine whether a given sentence contains the answer to a specific question:
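For example (T5 answers "entailment" or "not_entailment"):

    runTask["QNLI", "What is the capital of France?", "Paris is the capital and largest city of France."]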
The Stanford Question Answering Dataset (SQuAD) template can be used to answer factual questions by extracting the most relevant span of text directly from a provided context paragraph:
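For example, reusing the passage defined earlier:

    runTask["SQuAD", "How tall is the tower?", text]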
The CNN/Daily Mail (CNN) summarization template generates a short, human-like summary that captures the essential information from a longer news article:
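For example, again on the earlier passage:

    runTask["CNN", text]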
The WMTe2g template generates a translation from English to German:
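For example:

    runTask["WMTe2g", "The house is wonderful."]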
The WMTe2f template generates a translation from English to French:
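For example:

    runTask["WMTe2f", "The weather is nice today."]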
The WMTe2r template generates a translation from English to Romanian:
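For example:

    runTask["WMTe2r", "I would like a cup of coffee."]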