# Wolfram Neural Net Repository

Immediate Computable Access to Neural Net Models

Detect and localize objects in an image

YOLO (You Only Look Once) Version 8 by Ultralytics is the latest version of the YOLO models. Like its predecessor, YOLO Version 5, it was trained with mosaic augmentation; unlike Version 5, which relies on predefined anchor boxes, it is an anchor-free model. It introduces "C2f" blocks, which add dense connections between bottleneck modules. YOLO Version 8 models outperform models of similar size from all previous versions.

- Microsoft COCO, a dataset for image recognition, segmentation, captioning, object detection and keypoint estimation, consisting of more than 300,000 images.

Get the pre-trained net:

In[1]:= |

Out[2]= |
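
A minimal sketch of this step, assuming the model is published under the repository name "YOLO V8 Detect Trained on MS-COCO Data" (the name does not appear in the text above, so check it against the repository listing before relying on it):

```wolfram
(* Assumed repository name; the first call downloads and caches the net *)
net = NetModel["YOLO V8 Detect Trained on MS-COCO Data"]
```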

This model consists of a family of individual nets, each identified by a specific parameter combination. Inspect the available parameters:

In[3]:= |

Out[4]= |
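
The parameter listing can be requested with the "ParametersInformation" property of NetModel. The model name below is an assumption, since the text does not spell it out:

```wolfram
(* Assumed repository name; returns a description of the available parameters *)
NetModel["YOLO V8 Detect Trained on MS-COCO Data", "ParametersInformation"]
```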

Pick a non-default net by specifying the parameters:

In[5]:= |

Out[6]= |

Pick a non-default uninitialized net:

In[7]:= |

Out[8]= |
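
A sketch of both selections, initialized and uninitialized. The model name, the parameter name "Size" and the value "L" are assumptions; the actual parameter names and values are whatever "ParametersInformation" reports:

```wolfram
name = "YOLO V8 Detect Trained on MS-COCO Data";  (* assumed repository name *)

(* Non-default member of the family; "Size" -> "L" is an assumed parameter setting *)
largeNet = NetModel[{name, "Size" -> "L"}]

(* Same architecture without trained weights *)
untrainedNet = NetModel[{name, "Size" -> "L"}, "UninitializedEvaluationNet"]
```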

Write an evaluation function to scale the result to the input image size and suppress the least probable detections:

In[9]:= |

In[10]:= |
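
The two operations named above, rescaling boxes to the input image and suppressing the least probable detections, can be sketched independently of the net. This is not the repository's evaluation function. The helper names `toImageBox`, `boxIoU` and `filterDetections` are hypothetical, and the sketch assumes that the net works on a 640×640 input in which the image is resized with its aspect ratio preserved and centered with padding, and that raw boxes are given as {center x, center y, width, height} with the origin at the top left.

```wolfram
(* Map a box {cx, cy, bw, bh} from net coordinates (assumed netSize x netSize,
   top-left origin, image centered with padding) to a Rectangle in the
   coordinates of the original w x h image (bottom-left origin) *)
toImageBox[{cx_, cy_, bw_, bh_}, {w_, h_}, netSize_ : 640] := Module[
  {max = Max[w, h], scale, padx, pady, x1, y1, x2, y2},
  scale = max/netSize;                        (* net pixels -> image pixels *)
  {padx, pady} = netSize (1 - {w, h}/max)/2;  (* padding on each side *)
  {x1, x2} = scale ({cx - bw/2, cx + bw/2} - padx);
  {y1, y2} = scale ({cy - bh/2, cy + bh/2} - pady);
  Rectangle[{x1, h - y2}, {x2, h - y1}]       (* flip the vertical axis *)
]

(* Intersection over union of two boxes given as {xmin, ymin, xmax, ymax} *)
boxIoU[{ax1_, ay1_, ax2_, ay2_}, {bx1_, by1_, bx2_, by2_}] := Module[
  {iw, ih, inter, union},
  iw = Max[0, Min[ax2, bx2] - Max[ax1, bx1]];
  ih = Max[0, Min[ay2, by2] - Max[ay1, by1]];
  inter = iw*ih;
  union = (ax2 - ax1) (ay2 - ay1) + (bx2 - bx1) (by2 - by1) - inter;
  If[union > 0, inter/union, 0]
]

(* Drop boxes scoring below detectionThreshold, then apply greedy non-maximum
   suppression; returns the indices of the boxes that are kept *)
filterDetections[boxes_List, scores_List,
  detectionThreshold_ : 0.25, overlapThreshold_ : 0.5] := Module[
  {order, keep = {}},
  order = Select[Ordering[scores, All, Greater], scores[[#]] >= detectionThreshold &];
  Do[
    If[AllTrue[keep, boxIoU[boxes[[i]], boxes[[#]]] <= overlapThreshold &],
      AppendTo[keep, i]],
    {i, order}
  ];
  keep
]
```

For example, with three corner-format boxes where the second overlaps the first heavily:

```wolfram
boxes = {{0, 0, 10, 10}, {1, 1, 11, 11}, {20, 20, 30, 30}};
scores = {0.9, 0.8, 0.7};
filterDetections[boxes, scores]               (* {1, 3} *)
toImageBox[{320, 320, 64, 64}, {1280, 720}]   (* Rectangle[{576, 296}, {704, 424}] *)
```

The default thresholds 0.25 and 0.5 are illustrative values, not ones taken from the text.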

Obtain the detected bounding boxes with their corresponding classes and confidences for a given image:

In[11]:= |

In[12]:= |

The model's output is an Association containing the detected "Boxes" and "Classes":

In[13]:= |

Out[13]= |

The "Boxes" key is a list of Rectangle expressions corresponding to the bounding boxes of the detected objects:

In[14]:= |

Out[14]= |

The "Classes" key contains the classes of the detected objects:

In[15]:= |

Out[15]= |

Visualize the detection:

In[16]:= |

Out[16]= |
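
A minimal sketch of such a visualization, using HighlightImage on stand-in data. The image and the `detection` association below are hypothetical placeholders with the same "Boxes" and "Classes" keys as described above, not actual model output:

```wolfram
(* Hypothetical stand-ins for the test image and the detection result *)
img = ConstantImage[0.8, {320, 240}];
detection = <|
  "Boxes" -> {Rectangle[{20, 30}, {140, 200}], Rectangle[{180, 60}, {300, 180}]},
  "Classes" -> {"person", "dog"}
|>;

(* Overlay the detected boxes on the image *)
HighlightImage[img, detection["Boxes"]]
```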

The network computes 8,400 bounding boxes and the probability that the object is of any given class:

In[17]:= |

In[18]:= |

Out[18]= |
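
The box count is consistent with a 640×640 input and detection heads at strides 8, 16 and 32, where each head predicts one box per grid cell. The input size and strides are assumptions based on the standard YOLO Version 8 configuration; the text above does not state them.

```wolfram
strides = {8, 16, 32};       (* assumed detection-head strides *)
gridSizes = 640/strides      (* {80, 40, 20} cells per side *)
Total[gridSizes^2]           (* 6400 + 1600 + 400 = 8400 *)
```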

Visualize the bounding boxes scaled by their class probabilities:

In[19]:= |

Out[25]= |

Visualize all the boxes scaled by the probability that they contain a person:

In[26]:= |

Out[26]= |

In[27]:= |

Out[27]= |

Superimpose the person predictions on top of the input received by the net:

In[28]:= |

Out[28]= |

Inspect the number of parameters of all arrays in the net:

In[29]:= |

Out[30]= |

Obtain the total number of parameters:

In[31]:= |

Out[32]= |

Obtain the layer type counts:

In[33]:= |

Out[34]= |

Display the summary graphic:

In[35]:= |

Out[36]= |
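
The four inspection steps above can be sketched together with Information. The model name is an assumption, as it is not given in the text:

```wolfram
net = NetModel["YOLO V8 Detect Trained on MS-COCO Data"];  (* assumed repository name *)

Information[net, "ArraysElementCounts"]      (* number of parameters in each array *)
Information[net, "ArraysTotalElementCount"]  (* total number of parameters *)
Information[net, "LayerTypeCounts"]          (* number of layers of each type *)
Information[net, "SummaryGraphic"]           (* summary graphic *)
```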

Export the net to the ONNX format:

In[37]:= |

Out[38]= |

Get the size of the ONNX file:

In[39]:= |

Out[39]= |

The size is similar to the byte count of the resource object:

In[40]:= |

Out[41]= |

Check some metadata of the ONNX model:

In[42]:= |

Out[42]= |

Import the model back into Wolfram Language. The NetEncoder and NetDecoder will be absent because ONNX does not support them:

In[43]:= |

Out[43]= |
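
The ONNX round trip described above can be sketched as follows. The model name and the output file location are assumptions chosen for illustration:

```wolfram
net = NetModel["YOLO V8 Detect Trained on MS-COCO Data"];  (* assumed repository name *)

(* Write the net to a temporary ONNX file; Export returns the file path *)
onnxFile = Export[FileNameJoin[{$TemporaryDirectory, "net.onnx"}], net];

FileByteCount[onnxFile]      (* size of the ONNX file in bytes *)

(* Read the file back as a net, without the encoder and decoder *)
imported = Import[onnxFile]
```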

- G. Jocher, A. Chaurasia, J. Qiu, "YOLO by Ultralytics."
- Available from: https://github.com/ultralytics/ultralytics
- Rights: GNU General Public License