Wolfram Function Repository
Instant-use add-on functions for the Wolfram Language
Function Repository Resource: StableDiffusionSynthesize
Synthesize images using the Stable Diffusion neural network
ResourceFunction["StableDiffusionSynthesize"][prompt] synthesizes an image from prompt, given as a string or as an explicit text embedding vector.
ResourceFunction["StableDiffusionSynthesize"][prompt→latent] uses latent, an initial image or noise, as the starting point of the diffusion.
ResourceFunction["StableDiffusionSynthesize"][prompt→latent→guidanceScale] also specifies a guidance scale.
ResourceFunction["StableDiffusionSynthesize"][{negativeprompt,prompt}→latent→guidanceScale] specifies a guidance scale together with a negative prompt.
ResourceFunction["StableDiffusionSynthesize"][<|"Prompt"→…,"NegativePrompt"→…,"Latent"→…,"GuidanceScale"→…,…|>] takes an association with explicitly named arguments.
ResourceFunction["StableDiffusionSynthesize"][prompt,n] generates n images for the same prompt specification.
ResourceFunction["StableDiffusionSynthesize"][{p1,p2,…}] generates an image for each of the prompt specifications pi.
ResourceFunction["StableDiffusionSynthesize"][{p1,p2,…},n] generates n images for each prompt specification.
Generate an image by giving a text prompt:
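As a minimal sketch of the single-argument form listed in the usage above, a call might look like the following. The prompt text is an arbitrary illustration and is not taken from the original example.

```wolfram
(* Minimal sketch: synthesize one image from a text prompt. *)
(* The prompt string is an illustrative assumption. *)
synthesize = ResourceFunction["StableDiffusionSynthesize"];
image = synthesize["a watercolor painting of a lighthouse at sunset"]
```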
Generate multiple images:
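The three multi-image forms from the usage above can be sketched as follows. The prompts and counts are hypothetical values chosen only for illustration.

```wolfram
synthesize = ResourceFunction["StableDiffusionSynthesize"];

(* Three images for the same prompt specification *)
variants = synthesize["a cat wearing a top hat", 3];

(* One image for each prompt in the list *)
perPrompt = synthesize[{"a cat wearing a top hat", "a dog wearing a top hat"}];

(* Two images for each prompt in the list *)
perPromptTwice = synthesize[{"a cat wearing a top hat", "a dog wearing a top hat"}, 2]
```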
Guide an initial image with a prompt:
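A rough sketch of the prompt→latent form with an image as the starting point is shown below. The test image and the prompt are assumptions made for illustration, and the page does not state whether the image has to be resized beforehand.

```wolfram
synthesize = ResourceFunction["StableDiffusionSynthesize"];

(* Any image can serve as the starting point; this built-in test image is an arbitrary choice *)
start = ExampleData[{"TestImage", "Mandrill"}];

(* The prompt guides how the initial image is transformed *)
guided = synthesize["an oil painting of a mandrill" -> start]
```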
Use a negative prompt for additional guidance (bottom row):
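One way a negative prompt might be supplied is through the association form from the usage above. This sketch assumes that the "Latent" and "GuidanceScale" keys may be omitted, which the page does not confirm, and both prompt strings are hypothetical.

```wolfram
synthesize = ResourceFunction["StableDiffusionSynthesize"];

(* Assumption: keys left out of the association fall back to their defaults *)
withNegative = synthesize[<|
   "Prompt" -> "a portrait photo of an astronaut",
   "NegativePrompt" -> "blurry, low quality"
   |>]
```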
Use a precomputed text embedding:
Use an explicit initial noise:
A higher guidance scale encourages generation of images that are more closely linked to the prompt, usually at the expense of image quality:
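The effect of the guidance scale could be explored with a sketch like the one below, which uses the "GuidanceScale" key of the association form. The scale values and the prompt are arbitrary assumptions. The sketch does not fix the initial noise, so the images presumably differ in more than the guidance scale alone.

```wolfram
synthesize = ResourceFunction["StableDiffusionSynthesize"];

(* Hypothetical guidance scales to compare *)
scales = {2, 7.5, 15};

(* One image per guidance scale, all for the same prompt *)
images = Table[
   synthesize[<|"Prompt" -> "a castle on a hill", "GuidanceScale" -> s|>],
   {s, scales}];

(* Show each scale above its image *)
Grid[{scales, images}]
```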
Specify the encoding strength (how strongly the reference image is transformed):
Specify the number of diffusion iterations (the default is 50):
With the default setting ProgressReporting→Automatic, latent images are displayed while the diffusion is in progress.
ProgressReporting→False disables this display.
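As a small sketch, progress reporting can be switched off by passing the option after the prompt. The prompt string is a hypothetical placeholder.

```wolfram
synthesize = ResourceFunction["StableDiffusionSynthesize"];

(* Suppress the display of intermediate latent images *)
quiet = synthesize["a bowl of fruit on a wooden table", ProgressReporting -> False]
```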
Return intermediate images for each diffusion iteration:
Return a pair {latents,result} containing the list of latents and the final result:
Specify custom neural network parts, such as parts from differently trained checkpoints or parts modified by Textual Inversion, LoRA or other techniques:
By default, TargetDevice is "GPU", because the network is extremely slow on "CPU" and running it there is not recommended:
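If no GPU is available, the device can be overridden explicitly, as in this sketch. The prompt is a hypothetical placeholder, and the evaluation should be expected to take a long time.

```wolfram
synthesize = ResourceFunction["StableDiffusionSynthesize"];

(* Run on the CPU instead of the default GPU; expect this to be very slow *)
onCPU = synthesize["a small red cube", TargetDevice -> "CPU"]
```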
"UNetTargetDevice" and "UNetBatchSize" options can overwrite TargetDevice and BatchSize, which may be useful when Decoder can't handle the same BatchSize for decoding too many images:
Progressively modify the neural network to see how it gradually breaks down:
Wolfram Language 13.0 (December 2021) or above
This work is licensed under a Creative Commons Attribution 4.0 International License