Wolfram JavaScript Character-Level Language Model V1

Generate JavaScript code

This language model is based on a simple stack of gated recurrent layers. It was trained by Wolfram Research in 2018 using teacher forcing on sequences of length 100.

Number of layers: 7 | Parameter count: 2,570,530 | Trained size: 10 MB
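
As a rough illustration of the teacher-forcing setup mentioned above, the sketch below cuts a string into overlapping windows and pairs each input window with the same window shifted by one character, so that the target at every position is the true next character. The snippet, the window length and all variable names are assumptions made for illustration; this is not the pipeline that was used to train the model.

```wolfram
(* Illustrative only: hypothetical snippet and window length, not the actual training pipeline *)
snippet = "var x = 1;";
windowLength = 4;  (* the model was trained on length 100; 4 keeps the output readable *)

(* Overlapping windows of windowLength + 1 characters, advancing one character at a time *)
windows = Partition[Characters[snippet], windowLength + 1, 1];

(* Input = all but the last character, target = all but the first character *)
pairs = {StringJoin[Most[#]], StringJoin[Rest[#]]} & /@ windows;

First[pairs]  (* {"var ", "ar x"} *)
```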

Training Set Information

Internal Wolfram training set, consisting of over 40 MB of JavaScript code scraped from GitHub.

Examples

Resource retrieval

Get the pre-trained network:

In[1]:=

$NetModel["Wolfram JavaScript Character-Level Language Model V1"]$

Out[1]=

Basic usage

Predict the next character of a given sequence:

In[2]:=

$NetModel["Wolfram JavaScript Character-Level Language Model V1"]["alert"]$

Out[2]=

Get the top 15 probabilities:

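In[3]:=

$topProbs = NetModel["Wolfram JavaScript Character-Level Language Model V1"]["alert", {"TopProbabilities", 15}]$
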
Out[3]=

Plot the top 15 probabilities:

In[4]:=

$BarChart[Thread@ Labeled[Values@topProbs, Keys[topProbs] /. {"\n" -> "\\n", "\t" -> "\\t"}], ScalingFunctions -> "Log"]$

Out[4]=

Generation

Generate text efficiently with NetStateObject. A built-in option for temperature sampling is available as of Wolfram Language 12.0; in earlier versions it has to be implemented explicitly:

In[5]:=

$generateSample[start_, len_, temp_ : 1] := Block[{net, score, sampler, obj},
  net = NetModel["Wolfram JavaScript Character-Level Language Model V1"];
  If[$VersionNumber < 12.0,
    (* Before 12.0: divide the pre-softmax scores by temp, then sample from the final layer *)
    score = NetTake[net, 6];
    sampler = NetTake[net, -1];
    obj = NetStateObject[score];
    StringJoin@NestList[sampler[obj[#]/temp, "RandomSample"] &, start, len],
    (* 12.0 and later: use the built-in "Temperature" sampling option *)
    obj = NetStateObject[net];
    StringJoin@NestList[obj[#, {"RandomSample", "Temperature" -> temp}] &, start, len]
  ]
]$

Generate for 100 steps using “alert” as an initial string:

In[6]:=

$generateSample["alert", 100]$

Out[6]=

The third optional argument is a “temperature” parameter that scales the input to the final softmax. A high temperature flattens the distribution from which characters are sampled, increasing the probability of extracting less likely characters:

In[7]:=

Out[7]=

Decreasing the temperature sharpens the peaks of the sampling distribution, further decreasing the probability of extracting less likely characters:

In[8]:=

Out[8]=
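
The effect of the temperature on the sampling distribution can be checked on a toy example that does not involve the network at all. The sketch below divides a made-up vector of pre-softmax scores by the temperature before normalizing; the helper name `temperatureSoftmax` and the score values are hypothetical and are not part of the model or of the built-in sampler.

```wolfram
(* Hypothetical helper for illustration; not part of the model or of the built-in sampler. *)
(* Subtracting Max[scores] leaves the result unchanged and avoids overflow at small temp. *)
temperatureSoftmax[scores_List, temp_?Positive] :=
  With[{e = Exp[(scores - Max[scores])/temp]}, e/Total[e]]

scores = {2., 1., 0.};           (* made-up pre-softmax scores for three characters *)

temperatureSoftmax[scores, 1]    (* roughly {0.665, 0.245, 0.090} *)
temperatureSoftmax[scores, 100]  (* close to uniform, about 1/3 each *)
temperatureSoftmax[scores, 0.1]  (* almost all probability on the first character *)
```

The ranking of the characters is the same at every temperature; only how sharply the probability mass concentrates on the top candidates changes.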

Very high temperature settings approach uniform random sampling over the character set, regardless of the input:

In[9]:=

Out[9]=

Very low temperature settings are equivalent to always picking the character with maximum probability. It is typical for sampling to “get stuck in a loop”:

In[10]:=

Out[10]=
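
The zero-temperature limit can also be sketched without any sampling. The hypothetical helper below, `greedySample`, assumes that applying the network with its default decoder returns the single most probable character, as in the next-character prediction under Basic usage. It is an illustration, not a replacement for the function defined above.

```wolfram
(* Hypothetical helper; assumes the default decoder returns the most probable character,
   as in the next-character prediction example under Basic usage. *)
greedySample[start_String, len_Integer] := Block[{obj},
  obj = NetStateObject[
    NetModel["Wolfram JavaScript Character-Level Language Model V1"]];
  (* Each step feeds the previously chosen character back into the stateful network *)
  StringJoin@NestList[obj[#] &, start, len]
]

greedySample["alert", 100]
```

Because no randomness is involved and each call starts from a fresh state, repeated calls with the same arguments should return the same string, which makes repeating loops easy to spot.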

Inspection of predictions

Define a function that takes a string and guesses the next character as it reads, showing the predictions in a grid. The input string is shown in the top row, and the top 5 predictions are aligned below each character, ordered from most to least likely. For each prediction, the color intensity is proportional to its probability:

In[11]:=

$inspectPredictions[string_] := Block[{obj, chars, pred, predItems, charItems},
  obj = NetStateObject[
    NetModel["Wolfram JavaScript Character-Level Language Model V1"]];
  chars = Characters[string];
  (* Top 5 predictions after each character, with newline and tab made visible *)
  pred = Map[obj[#, {"TopProbabilities", 5}] &, chars] /. {"\n" -> "\\n", "\t" -> "\\t"};
  (* Color intensity encodes the predicted probability *)
  predItems = Map[Item[First[#], Background -> Opacity[Last[#], Darker[Green]]] &, pred, {2}];
  (* Shift by one so each prediction column sits under the character it tries to guess *)
  predItems = Prepend[Most[predItems], Table[Item["", Background -> Gray], 5]];
  charItems = Item[#, Background -> LightBlue] & /@ (chars /. {"\n" -> "\\n", "\t" -> "\\t"});
  Grid[
    Prepend[Transpose[predItems], charItems],
    Spacings -> {0.6, 0.2}, Dividers -> All, FrameStyle -> Gray
  ]
]$
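
A possible call is sketched below. The input string is a hypothetical example chosen for illustration; any JavaScript snippet can be passed instead.

```wolfram
(* Hypothetical input string; any JavaScript snippet can be used. *)
inspectPredictions["function add(a, b) { return a + b; }"]
```

The first column of predictions is left gray because the network has not read any character before the first one.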