Wolfram LaTeX Character-Level Language Model V1

Generate LaTeX code

This language model is based on a simple stack of gated recurrent layers. It was trained by Wolfram Research in 2018 using teacher forcing on sequences of length 100.

Number of layers: 7 | Parameter count: 7,896,330 | Trained size: 32 MB

Training Set Information

Internal Wolfram training set, consisting of over 5 GB of LaTeX code scraped from 150,000 arXiv articles.

Examples

Resource retrieval

Get the pre-trained network:

In[1]:=

NetModel["Wolfram LaTeX Character-Level Language Model V1"]

Out[1]=
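
As a quick sanity check after retrieval, the figures quoted at the top of this page (7 layers, 7,896,330 parameters) can be compared against what the network itself reports. This is a minimal sketch that assumes the `NetInformation` properties "LayersCount" and "ArraysTotalElementCount" are available in your version; in recent versions `Information` may be the preferred way to query the same properties.

```wolfram
(* Retrieve the network; the first call downloads it *)
net = NetModel["Wolfram LaTeX Character-Level Language Model V1"];

(* Number of top-level layers in the chain *)
NetInformation[net, "LayersCount"]

(* Total number of trainable parameters across all arrays *)
NetInformation[net, "ArraysTotalElementCount"]
```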

Basic usage

Predict the next character of a given sequence:

In[2]:=

NetModel["Wolfram LaTeX Character-Level Language Model V1"]["\\begin"]

Out[2]=

Get the top 15 probabilities:

In[3]:=

topProbs = NetModel["Wolfram LaTeX Character-Level Language Model V1"]["\\begin", {"TopProbabilities", 15}]

Out[3]=

Plot the top 15 probabilities:

In[4]:=

BarChart[
 Thread@Labeled[Values@topProbs, Keys[topProbs] /. {"\n" -> "\\n", "\t" -> "\\t"}],
 ScalingFunctions -> "Log"]

Out[4]=
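
The top-15 list is a truncated view of a full distribution over the character set. The sketch below assumes that the network's class decoder also supports the "Probabilities" property and that it returns an association from characters to probabilities; it checks that the values sum to approximately 1 and pulls out the three most likely characters.

```wolfram
net = NetModel["Wolfram LaTeX Character-Level Language Model V1"];

(* Full next-character distribution after reading "\begin" *)
probs = net["\\begin", "Probabilities"];

(* Should be close to 1, up to floating-point error *)
Total[probs]

(* The three most likely next characters *)
TakeLargest[probs, 3]
```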

Generation

Generate text efficiently with NetStateObject. A built-in option for temperature sampling is available in Wolfram Language 12.0 and later; in earlier versions it has to be implemented explicitly.

In[5]:=

generateSample[start_, len_, temp_ : 1] := Block[{net, score, sampler, obj},
  net = NetModel["Wolfram LaTeX Character-Level Language Model V1"];
  If[$VersionNumber < 12.0,
   (* Before 12.0: divide the input to the final layer by temp, then sample *)
   score = NetTake[net, 6];
   sampler = NetTake[net, -1];
   obj = NetStateObject[score];
   StringJoin@NestList[sampler[obj[#]/temp, "RandomSample"] &, start, len],
   (* 12.0 and later: use the built-in "Temperature" option *)
   obj = NetStateObject[net];
   StringJoin@NestList[obj[#, {"RandomSample", "Temperature" -> temp}] &, start, len]
   ]
  ]

Generate for 100 steps using “\begin” as an initial string:

In[6]:=

generateSample["\\begin", 100]

Out[6]=
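
Generation with NetStateObject is efficient because the object keeps the recurrent state between calls, so each step only has to process the newly sampled character rather than the whole string so far. The sketch below illustrates this by feeding "\begin" in two pieces and comparing against a stateless evaluation of the full string; the two results should agree up to numerical noise. The split point is arbitrary and chosen only for illustration.

```wolfram
net = NetModel["Wolfram LaTeX Character-Level Language Model V1"];
obj = NetStateObject[net];

(* Feed the first piece; the recurrent state now reflects "\beg" *)
obj["\\beg"];

(* Feed the rest; the prediction is conditioned on everything seen so far *)
stateful = obj["in", {"TopProbabilities", 3}]

(* Stateless evaluation of the whole string, for comparison *)
stateless = net["\\begin", {"TopProbabilities", 3}]
```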

The optional third argument is a “temperature” parameter that scales the input to the final softmax. A high temperature flattens the distribution from which characters are sampled, increasing the probability of sampling less likely characters:

In[7]:=

generateSample["\\begin", 100, 1.1]

Out[7]=

Decreasing the temperature sharpens the peaks of the sampling distribution, further decreasing the probability of extracting less likely characters:

In[8]:=

generateSample["\\begin", 100, 0.4]

Out[8]=

Very high temperature settings approach uniform random sampling over the character set:

In[9]:=

generateSample["\\begin", 100, 10]

Out[9]=

Very low temperature settings approach always picking the character with maximum probability. In this regime, generation typically “gets stuck in a loop”:

In[10]:=

generateSample["\\begin", 200, 0.01]

Out[10]=
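
The effect of temperature described above can be reproduced on a toy example without the network. The sketch below divides a made-up score vector by the temperature before applying a softmax, which mirrors the pre-12.0 branch of generateSample; the helper name `softmaxT` and the score values are illustrative assumptions, not part of the model.

```wolfram
(* Toy illustration only: these scores are made up, not taken from the model *)
softmaxT[scores_List, temp_] := With[
  {z = (scores - Max[scores])/temp},  (* subtracting the max improves numerical stability *)
  Exp[z]/Total[Exp[z]]]

scores = {2.0, 1.0, 0.0};

softmaxT[scores, 1.0]   (* baseline distribution *)
softmaxT[scores, 0.4]   (* sharper: more mass on the largest score *)
softmaxT[scores, 10.0]  (* flatter: closer to uniform *)

(* Draw one index from the low-temperature distribution *)
RandomChoice[softmaxT[scores, 0.4] -> Range[3]]
```

As the temperature goes to zero, the largest score takes essentially all of the probability mass, which is why low-temperature generation behaves like greedy decoding.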

Inspection of predictions

Define a function that takes a string and guesses the next character as it reads, showing the predictions in a grid. The input string is shown in the top row, and the top 5 predictions for each character are aligned below it, ordered from most to least likely. For each prediction, the intensity of the color is proportional to the probability:

In[11]:=

inspectPredictions[string_] := Block[
  {obj, chars, pred, predItems, charItems},
  obj = NetStateObject[NetModel["Wolfram LaTeX Character-Level Language Model V1"]];
  chars = Characters[string];
  (* Top 5 next-character predictions after reading each character *)
  pred = Map[obj[#, {"TopProbabilities", 5}] &, chars] /. {"\n" -> "\\n", "\t" -> "\\t"};
  (* Color intensity encodes the predicted probability *)
  predItems = Map[Item[First[#], Background -> Opacity[Last[#], Darker[Green]]] &, pred, {2}];
  (* Shift by one position so each column of guesses sits under the character it predicts *)
  predItems = Prepend[Most[predItems], Table[Item["", Background -> Gray], 5]];
  charItems = Item[#, Background -> LightBlue] & /@ (chars /. {"\n" -> "\\n", "\t" -> "\\t"});
  Grid[
   Prepend[Transpose[predItems], charItems],
   Spacings -> {0.6, 0.2}, Dividers -> All, FrameStyle -> Gray
   ]
  ]