Wolfram Neural Net Repository
Immediate Computable Access to Neural Net Models
YOLO V8 Detect Trained on MS-COCO Data
Detect and localize objects in an image
YOLO (You Only Look Once) Version 8 by Ultralytics was, at its release, the latest version of the YOLO models. Like its predecessor, YOLO Version 5, it was trained with mosaic augmentation; unlike the original anchor-based YOLO Version 5, it is an anchor-free model. It introduces "C2f" blocks, which employ additional dense connections between bottleneck modules. YOLO Version 8 models outperform similarly sized models from previous versions.
Get the pre-trained net:
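A minimal sketch of this step, assuming the model is registered in the repository under the page title:

```wolfram
(* Assumption: the resource name matches the page title *)
net = NetModel["YOLO V8 Detect Trained on MS-COCO Data"]
```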
This model consists of a family of individual nets, each identified by a specific parameter combination. Inspect the available parameters:
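The query could be written as follows, using the standard "ParametersInformation" property of NetModel and assuming the resource name shown:

```wolfram
(* Lists each parameter with its description, default and allowed values *)
NetModel["YOLO V8 Detect Trained on MS-COCO Data", "ParametersInformation"]
```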
Pick a non-default net by specifying the parameters:
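A sketch of the general form. The parameter name "Size" and the value "L" are illustrative assumptions; substitute whatever the parameter listing reports for this model:

```wolfram
(* Hypothetical parameter choice: "Size" -> "L" is an assumption, not confirmed by this page *)
NetModel[{"YOLO V8 Detect Trained on MS-COCO Data", "Size" -> "L"}]
```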
Pick a non-default uninitialized net:
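One possible form, using the "UninitializedEvaluationNet" property of NetModel. The parameter name "Size" and the value "L" are again illustrative assumptions:

```wolfram
(* Returns the architecture without trained weights; "Size" -> "L" is a hypothetical choice *)
NetModel[{"YOLO V8 Detect Trained on MS-COCO Data", "Size" -> "L"},
 "UninitializedEvaluationNet"]
```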
Write an evaluation function to scale the result to the input image size and suppress the least probable detections:
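The following is a simplified sketch of the two ideas named above, not the page's original function. It assumes the raw boxes arrive as {x1, y1, x2, y2} in a 640×640 frame with y measured from the top, that the image was resized with its aspect ratio preserved and padded symmetrically, and that each box comes with a vector of class probabilities. The name `rescaleDetections` is hypothetical, and non-maximum suppression of overlapping boxes is left out:

```wolfram
(* Hypothetical helper, not the page's original function.
   boxes: list of {x1, y1, x2, y2} in net pixels, y measured from the top (assumed)
   probs: list of per-class probability vectors, one per box (assumed) *)
rescaleDetections[boxes_, probs_, {w_, h_}, threshold_ : 0.25, netSize_ : 640] :=
 Module[{scale, padX, padY, best, keep},
  scale = Max[w, h]/netSize;        (* original pixels per net pixel *)
  padX = (netSize - w/scale)/2;     (* horizontal padding, in net pixels *)
  padY = (netSize - h/scale)/2;     (* vertical padding, in net pixels *)
  best = Max /@ probs;              (* highest class probability of each box *)
  (* keep only boxes whose best class probability exceeds the threshold *)
  keep = Select[Range[Length[best]], best[[#]] > threshold &];
  Table[
   <|"Box" -> Rectangle[
       (* remove padding, flip y to a bottom-left origin, scale to image pixels *)
       scale*{boxes[[i, 1]] - padX, netSize - boxes[[i, 4]] - padY},
       scale*{boxes[[i, 3]] - padX, netSize - boxes[[i, 2]] - padY}],
     "Class" -> First[Ordering[probs[[i]], -1]],
     "Probability" -> best[[i]]|>,
   {i, keep}]]
```

For a 1280×720 image, `rescaleDetections[{{0, 140, 640, 500}}, {{0.1, 0.9}}, {1280, 720}]` maps the single box to `Rectangle[{0, 0}, {1280, 720}]` with class index 2.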
Obtain the detected bounding boxes with their corresponding classes and confidences for a given image:
The model's output is an Association containing the detected "Boxes" and "Classes":
The "Boxes" key is a list of Rectangle expressions corresponding to the bounding boxes of the detected objects:
The "Classes" key contains the classes of the detected objects:
Visualize the detection:
The network computes 8,400 bounding boxes and, for each box, the probability that the object belongs to each class:
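The box count can be reproduced with a little arithmetic, assuming a 640×640 input and the usual three detection scales with strides 8, 16 and 32, where an anchor-free head predicts one box per grid cell:

```wolfram
(* Assumptions: 640x640 input, detection strides 8, 16 and 32 *)
strides = {8, 16, 32};
gridCells = (640/strides)^2;   (* cells per scale: 80^2, 40^2 and 20^2 *)
Total[gridCells]               (* one box per cell across all three scales *)
```

The three grids contribute 6,400, 1,600 and 400 cells, which sum to 8,400.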
Visualize the bounding boxes scaled by their class probabilities:
Visualize all the boxes scaled by the probability that they contain a person:
Superimpose the person predictions on top of the input received by the net:
Inspect the number of parameters of all arrays in the net:
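A sketch using the "ArraysElementCounts" property of Information, assuming the resource name shown:

```wolfram
(* Element count of every learnable array, keyed by its position in the net *)
Information[
 NetModel["YOLO V8 Detect Trained on MS-COCO Data"],
 "ArraysElementCounts"]
```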
Obtain the total number of parameters:
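The total is available through the "ArraysTotalElementCount" property; a sketch under the same naming assumption:

```wolfram
(* Sum of the element counts over all arrays in the net *)
Information[
 NetModel["YOLO V8 Detect Trained on MS-COCO Data"],
 "ArraysTotalElementCount"]
```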
Obtain the layer type counts:
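This could be done with the "LayerTypeCounts" property, again assuming the resource name shown:

```wolfram
(* Number of occurrences of each layer type in the net *)
Information[
 NetModel["YOLO V8 Detect Trained on MS-COCO Data"],
 "LayerTypeCounts"]
```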
Display the summary graphic:
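A sketch using the "SummaryGraphic" property of Information, with the resource name assumed to match the page title:

```wolfram
(* Graphic overview of the net's structure *)
Information[
 NetModel["YOLO V8 Detect Trained on MS-COCO Data"],
 "SummaryGraphic"]
```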
Export the net to the ONNX format:
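A minimal sketch of the export. The file name "net.onnx" and the temporary directory are arbitrary choices, not taken from the page:

```wolfram
(* Assumption: file name and location are arbitrary; Export infers ONNX from the extension *)
onnxFile = Export[
  FileNameJoin[{$TemporaryDirectory, "net.onnx"}],
  NetModel["YOLO V8 Detect Trained on MS-COCO Data"]]
```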
Get the size of the ONNX file:
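Assuming the exported file sits at the hypothetical path below, its size in bytes could be read with FileByteCount:

```wolfram
(* Assumption: the net was previously exported to this path *)
onnxFile = FileNameJoin[{$TemporaryDirectory, "net.onnx"}];
FileByteCount[onnxFile]
```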
The size is similar to the byte count of the resource object:
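One way to look up that figure is the "ByteCount" property of NetModel. This is a sketch under the assumption that the property reports the size of the stored resource for this model:

```wolfram
(* Assumption: "ByteCount" gives the size in bytes of the stored model *)
NetModel["YOLO V8 Detect Trained on MS-COCO Data", "ByteCount"]
```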
Check some metadata of the ONNX model:
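A sketch reading two metadata elements from the file, assuming it sits at the hypothetical path below and that the ONNX importer exposes the "OpsetVersion" and "IRVersion" elements:

```wolfram
(* Assumption: the net was previously exported to this path *)
onnxFile = FileNameJoin[{$TemporaryDirectory, "net.onnx"}];
{Import[onnxFile, "OpsetVersion"], Import[onnxFile, "IRVersion"]}
```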
Import the model back into Wolfram Language. However, the NetEncoder and NetDecoder will be absent because they are not supported by ONNX:
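A sketch of the round trip, assuming the file sits at the hypothetical path below. The "Net" import element returns the network as a Wolfram Language net:

```wolfram
(* Assumption: the net was previously exported to this path *)
onnxFile = FileNameJoin[{$TemporaryDirectory, "net.onnx"}];
Import[onnxFile, "Net"]
```

Because the encoder and decoder are gone, the imported net takes and returns numeric arrays rather than images and class labels.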