# Wolfram Neural Net Repository

Immediate Computable Access to Neural Net Models

Generate text in English

This language model is based on a simple stack of gated recurrent layers. It was trained by Wolfram Research in 2018 using teacher forcing on sequences of length 80. English-language models are often used to improve the performance of other systems, such as speech-to-text applications.

Number of layers: 8 | Parameter count: 4,144,930 | Trained size: 17 MB

- Internal Wolfram training set, consisting of 1.5 GB of text from old novels and news articles.

Retrieve the resource object:

In[1]:= |

Out[1]= |
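A sketch of this step in Wolfram Language; the resource name below is an assumption based on the model description above:

```
(* the resource name is an assumption; check the repository for the exact title *)
ro = ResourceObject["Wolfram English Character-Level Language Model V1"]
```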

Get the pre-trained network:

In[2]:= |

Out[2]= |
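The pre-trained net can be fetched with NetModel, which downloads and caches it locally. In the sketches that follow it is assumed to be bound to lm (the model name is an assumption):

```
(* bind the pre-trained net to a symbol for reuse below *)
lm = NetModel["Wolfram English Character-Level Language Model V1"]
```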

Predict the next character of a given sequence:

In[3]:= |

Out[3]= |
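Assuming the net lm obtained above, direct application returns the decoder's most probable next character:

```
(* the net's "Class" decoder picks the highest-probability character *)
lm["hello worl"]
```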

Get the top 15 probabilities:

In[4]:= |

Out[4]= |
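The "Class" decoder supports a "TopProbabilities" specification; a sketch, assuming the net lm from above:

```
(* returns a list of rules: character -> probability, sorted by probability *)
lm["hello worl", {"TopProbabilities", 15}]
```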

Plot the top 15 probabilities:

In[5]:= |

Out[5]= |
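One way to plot those probabilities (assuming lm as above; the styling is illustrative):

```
probs = lm["hello worl", {"TopProbabilities", 15}];
(* bar per candidate character, labeled with that character *)
BarChart[Values[probs], ChartLabels -> Keys[probs]]
```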

Generate text with temperature sampling. First split the net into two parts:

In[6]:= |

In[7]:= |
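One way to perform such a split; the layer indexing is an assumption about this net's internals:

```
(* everything up to, but excluding, the final SoftmaxLayer: outputs logits *)
logitNet = NetTake[lm, {1, -2}];
(* the remaining part; during sampling the logits are divided by the
   temperature before this softmax is applied *)
softmax = SoftmaxLayer[];
```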

Define a function for efficient generation using NetStateObject:

In[8]:= |
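A minimal sketch of such a function, assuming the net lm from above. Instead of an explicit logit/softmax split, it uses the "Class" decoder's built-in "RandomSample" specification with a Temperature option, which likewise scales the input to the final softmax:

```
generateSample[start_String, len_Integer, temp_ : 1] :=
 Module[{state = NetStateObject[lm], s = start, next},
  (* NetStateObject retains the recurrent state between calls, so each
     step only has to process the newly generated character *)
  next = state[start, {"RandomSample", "Temperature" -> temp}];
  Do[
   s = s <> next;
   next = state[next, {"RandomSample", "Temperature" -> temp}],
   {len - 1}];
  s <> next]
```

For example, generateSample["hello", 100] would append 100 sampled characters at the default temperature of 1.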

Generate for 100 steps using “hello” as an initial string:

In[9]:= |

Out[9]= |

The third optional argument is a “temperature” parameter that scales the input to the final softmax. A high temperature flattens the distribution from which characters are sampled, increasing the probability of extracting less likely characters:

In[10]:= |

Out[10]= |

Decreasing the temperature sharpens the peaks of the sampling distribution, further decreasing the probability of extracting less likely characters:

In[11]:= |

Out[11]= |

Very high temperature settings are equivalent to random sampling:

In[12]:= |

Out[12]= |

Very low temperature settings are equivalent to always picking the character with maximum probability. It is typical for sampling to “get stuck in a loop”:

In[13]:= |

Out[13]= |
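If the generation function defined earlier is named generateSample (the name is illustrative), the temperature regimes discussed above correspond to calls such as:

```
generateSample["hello", 100, 1.5]   (* high temperature: flatter distribution *)
generateSample["hello", 100, 0.5]   (* low temperature: sharper distribution *)
generateSample["hello", 100, 10.]   (* very high: nearly uniform random sampling *)
generateSample["hello", 100, 0.01]  (* very low: nearly deterministic; tends to loop *)
```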

Define a function that takes a string and guesses the next character as it reads, showing the predictions in a grid. The input string is shown on top, while the top 5 predictions are aligned below each character, starting from more likely guesses. For each prediction, the intensity of the color is proportional to the probability:

In[14]:= |

In[15]:= |

Out[15]= |

In[16]:= |

Out[16]= |

In[17]:= |

Out[17]= |
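One possible implementation of such a grid, assuming the net lm from above; the function name and styling choices are illustrative:

```
predictionGrid[s_String] := Module[{preds, rows},
  (* top-5 guesses for the character following each prefix of s *)
  preds = lm[StringTake[s, #], {"TopProbabilities", 5}] & /@
    Range[StringLength[s] - 1];
  rows = Table[
    Prepend[
     Map[Item[Keys[#][[j]],
        Background -> Lighter[Blue, 1 - Values[#][[j]]]] &, preds],
     ""],                    (* nothing is predicted under the first character *)
    {j, 5}];
  (* input string on top, predictions aligned below each character *)
  Grid[Prepend[rows, Style[#, Bold] & /@ Characters[s]], Frame -> All]]
```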

Define a function to complete a partial word by sampling with the model. Keep generating until a non-letter character is found:

In[18]:= |
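A sketch of such a function, assuming the net lm from above: it samples with a retained recurrent state until a non-letter character appears:

```
autocomplete[start_String] := Module[{state = NetStateObject[lm], s = start, c},
  (* prime the state with the partial word, then sample one character at a time *)
  c = state[start, "RandomSample"];
  While[LetterQ[c],
   s = s <> c;
   c = state[c, "RandomSample"]];
  s]
```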

Autocomplete a list of words:

In[19]:= |

Out[19]= |

Create “fantasy” words:

In[20]:= |

Out[20]= |

Inspect the sizes of all arrays in the net:

In[21]:= |

Out[21]= |

Obtain the total number of parameters:

In[22]:= |

Out[22]= |

Obtain the layer type counts:

In[23]:= |

Out[23]= |

Display the summary graphic:

In[24]:= |

Out[24]= |
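These introspection steps correspond to NetInformation properties; the property names below are assumptions about which of the available properties the original cells used:

```
NetInformation[lm, "ArraysElementCounts"]     (* size of every array in the net *)
NetInformation[lm, "ArraysTotalElementCount"] (* total parameter count *)
NetInformation[lm, "LayerTypeCounts"]         (* how many layers of each type *)
NetInformation[lm, "SummaryGraphic"]          (* graphical summary of the net *)
```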

Wolfram Language 11.3 (March 2018) or above

- Wolfram Research (2018)