Wolfram Function Repository
Instant-use add-on functions for the Wolfram Language
Function Repository Resource: StableDiffusionSynthesize
Synthesize images using the Stable Diffusion neural network
ResourceFunction["StableDiffusionSynthesize"][prompt] synthesize an image given a string or explicit text embedding vector as prompt. | |
ResourceFunction["StableDiffusionSynthesize"][prompt→latent] use an initial image or a noise as latent for the diffusion starting point. | |
ResourceFunction["StableDiffusionSynthesize"][prompt→latent→guidanceScale] specify a guidance scale. | |
ResourceFunction["StableDiffusionSynthesize"][{negativeprompt,prompt}→latent→guidanceScale] specify a guidance scale with a negative prompt. | |
ResourceFunction["StableDiffusionSynthesize"][<|"Prompt"→…,"NegativePrompt"→…,"Latent"→…,"GuidanceScale"→…,…|>] provide an association with explicit arguments. | |
ResourceFunction["StableDiffusionSynthesize"][prompt,n] generate n instances for the same prompt specification. | |
ResourceFunction["StableDiffusionSynthesize"][{p1,p2,…}] generate multiple images. | |
ResourceFunction["StableDiffusionSynthesize"][{p1,p2,…},n] generate multiple images for each prompt. |
Generate an image by giving a text prompt:
In[1]:= | ![]() |
Out[1]= | ![]() |
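A minimal sketch of this call, with an illustrative prompt (not the one used above):

```
(* basic text-to-image synthesis from a prompt string *)
ResourceFunction["StableDiffusionSynthesize"][
 "a watercolor painting of a lighthouse at sunset"]
```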
Generate multiple images:
In[2]:= | ![]() |
Out[2]= | ![]() |
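A sketch of the multi-image form, using the documented [prompt, n] signature with illustrative values:

```
(* four images for the same prompt specification *)
ResourceFunction["StableDiffusionSynthesize"]["an isometric pixel-art city block", 4]
```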
Guide an initial image with a prompt:
In[3]:= | ![]() |
Out[3]= | ![]() |
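A sketch of image-guided synthesis via the documented prompt→latent form; the test image and prompt are illustrative, and the resize to 512×512 is an assumption about the expected input resolution:

```
(* start the diffusion from an existing image instead of pure noise *)
init = ImageResize[ExampleData[{"TestImage", "House"}], {512, 512}];
ResourceFunction["StableDiffusionSynthesize"]["a fairy-tale cottage in a forest" -> init]
```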
Use a negative prompt for additional guidance (bottom row):
In[4]:= | ![]() |
Out[4]= | ![]() |
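A sketch using the documented association form with a negative prompt; both strings are illustrative:

```
(* the negative prompt steers generation away from the listed attributes *)
ResourceFunction["StableDiffusionSynthesize"][
 <|"Prompt" -> "a studio portrait of a red fox",
  "NegativePrompt" -> "blurry, low quality, extra limbs"|>]
```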
Use a precomputed text embedding:
In[5]:= | ![]() |
In[6]:= | ![]() |
Out[6]= | ![]() |
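The usage accepts an explicit text embedding vector in place of a string. The sketch below only illustrates the call pattern: the random array is a stand-in, and its 77×768 shape is an assumption; a real embedding would come from the model's CLIP text encoder:

```
(* stand-in for a precomputed text embedding; the 77x768 shape is assumed *)
embedding = RandomReal[{-1, 1}, {77, 768}];
ResourceFunction["StableDiffusionSynthesize"][embedding]
```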
Use an explicit initial noise:
In[7]:= | ![]() |
In[8]:= | ![]() |
Out[8]= | ![]() |
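A sketch with explicit starting noise via the prompt→latent form; the 4×64×64 latent shape (matching 512×512 output in Stable Diffusion v1) is an assumption about what the function expects:

```
(* fixed Gaussian noise makes the diffusion starting point reproducible *)
SeedRandom[1234];
noise = RandomVariate[NormalDistribution[0, 1], {4, 64, 64}];
ResourceFunction["StableDiffusionSynthesize"]["a misty mountain lake at dawn" -> noise]
```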
A higher guidance scale encourages generation of images that are more closely linked to the prompt, usually at the expense of lower image quality:
In[9]:= | ![]() |
In[10]:= | ![]() |
Out[10]= | ![]() |
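A sketch comparing a low and a high guidance scale through the documented "GuidanceScale" key; the prompt and scale values are illustrative:

```
(* higher guidance follows the prompt more literally, often at a quality cost *)
Table[ResourceFunction["StableDiffusionSynthesize"][
  <|"Prompt" -> "a bowl of ramen, studio photograph", "GuidanceScale" -> g|>],
 {g, {2, 15}}]
```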
Specify encoding strength (how much to transform the reference image):
In[11]:= | ![]() |
In[12]:= | ![]() |
Out[12]= | ![]() |
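A sketch of varying the encoding strength; the key name "EncodingStrength" is hypothetical (check the resource documentation for the actual option), while "Prompt" and "Latent" are documented keys:

```
(* "EncodingStrength" is a hypothetical key name, used for illustration only *)
init = ImageResize[ExampleData[{"TestImage", "House"}], {512, 512}];
Table[ResourceFunction["StableDiffusionSynthesize"][
  <|"Prompt" -> "a gingerbread house", "Latent" -> init, "EncodingStrength" -> s|>],
 {s, {0.3, 0.6, 0.9}}]
```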
Specify number of diffusion iterations (default is 50):
In[13]:= | ![]() |
Out[13]= | ![]() |
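A sketch of reducing the iteration count; the key name "Iterations" is hypothetical and stands in for whatever option the resource actually exposes for the number of diffusion steps:

```
(* "Iterations" is a hypothetical key name; the documented default is 50 steps *)
ResourceFunction["StableDiffusionSynthesize"][
 <|"Prompt" -> "a stained-glass hummingbird", "Iterations" -> 25|>]
```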
By default, Automatic progress reporting shows the latent images as the diffusion proceeds; ProgressReporting→False disables it:
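A sketch of turning the progress display off by passing ProgressReporting as an option (the prompt is illustrative):

```
(* suppress the latent-image progress display during diffusion *)
ResourceFunction["StableDiffusionSynthesize"][
 "a paper-cut silhouette of a sailing ship", ProgressReporting -> False]
```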
Return intermediate images for each diffusion iteration:
In[14]:= | ![]() |
Out[14]= | ![]() |
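A sketch of requesting the per-iteration images; the key name "IntermediateImages" is hypothetical and only illustrates the idea of toggling this behavior:

```
(* "IntermediateImages" is a hypothetical key name, used for illustration only *)
ResourceFunction["StableDiffusionSynthesize"][
 <|"Prompt" -> "an ink sketch of an old oak tree", "IntermediateImages" -> True|>]
```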
Return a pair {latents,result} containing the list of latents and the final result:
In[15]:= | ![]() |
Out[15]= | ![]() |
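A sketch of capturing the {latents, result} pair; the key name "ReturnLatents" is hypothetical, standing in for the actual option that selects this return form:

```
(* "ReturnLatents" is a hypothetical key name, used for illustration only *)
{latents, result} = ResourceFunction["StableDiffusionSynthesize"][
   <|"Prompt" -> "a charcoal drawing of a violin", "ReturnLatents" -> True|>];
```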
Specify custom neural network parts from different trained checkpoints, modified by Textual Inversion, LoRA or other techniques:
In[16]:= | ![]() |
Out[16]= | ![]() |
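A sketch of swapping in a custom network part; the "UNet" key and the checkpoint file are hypothetical, standing in for whatever part specification the resource accepts:

```
(* hypothetical: replace the UNet with one imported from a fine-tuned or LoRA-merged checkpoint *)
customUNet = Import["finetuned-unet.wlnet"];  (* hypothetical file *)
ResourceFunction["StableDiffusionSynthesize"][
 <|"Prompt" -> "a robot reading a book", "UNet" -> customUNet|>]
```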
By default, TargetDevice is "GPU", since the network is extremely slow and not recommended to run on "CPU":
In[17]:= | ![]() |
Out[17]= | ![]() |
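A sketch of making the default device explicit (the prompt is illustrative):

```
(* GPU is the default; CPU evaluation is extremely slow and not recommended *)
ResourceFunction["StableDiffusionSynthesize"][
 "a macro photograph of a snowflake", TargetDevice -> "GPU"]
```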
"UNetTargetDevice" and "UNetBatchSize" options can overwrite TargetDevice and BatchSize, which may be useful when Decoder can't handle the same BatchSize for decoding too many images:
Progressively modify the neural network to see how it gradually breaks down:
In[18]:= | ![]() |
In[19]:= | ![]() |
In[20]:= | ![]() |
In[21]:= | ![]() |
In[22]:= | ![]() |
Out[22]= | ![]() |
Wolfram Language 13.0 (December 2021) or above
This work is licensed under a Creative Commons Attribution 4.0 International License