Identifying people in photos using Python and Neural Networks

Detectron2 is Facebook AI Research’s collection of state-of-the-art object detection algorithms. It is the successor to Detectron and the maskrcnn-benchmark, and is written in Python on top of PyTorch. It supports bounding box detection, DensePose estimation, instance segmentation, keypoint detection, and other computer vision tasks.

Examples of its usage and research papers are located here.


Unfortunately, installation is only officially supported on Linux and macOS. If you have installed the Windows Subsystem for Linux, however, this should not be a problem.

conda install -c conda-forge detectron2;
pip install -U iopath==0.1.4 omegaconf opencv-python;

To make use of Detectron2, we begin by creating a Python script made up of the following parts:


As usual, we start by importing the relevant libraries.

import cv2
import detectron2
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
import matplotlib.pyplot as plt

Predictor Configuration

Next we need to load the pre-trained model. Here we choose the Mask R-CNN configuration and its trained weights from the COCO-InstanceSegmentation section of the model zoo, and set the score threshold that a detection must exceed to be kept.

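cfg = get_cfg()
# load the Mask R-CNN architecture settings that match the weights below
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
# score threshold; 0.5 is an assumed value, tune it for your own photos
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.DEVICE = "cpu" # run on the CPU, no GPU required
# create predictor
predictor = DefaultPredictor(cfg)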
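
To give a feel for what a score threshold does, here is a minimal, library-free sketch. It assumes each detection is a (label, score) pair; the function name `keep_confident` and the sample values are made up for illustration and are not part of Detectron2, which applies its threshold internally.

```python
def keep_confident(detections, threshold):
    """Return only the detections whose score exceeds the threshold."""
    return [(label, score) for label, score in detections if score > threshold]


# Hypothetical detections: (label, confidence score)
detections = [("person", 0.98), ("boat", 0.72), ("person", 0.41)]

confident = keep_confident(detections, threshold=0.5)
print(confident)  # [('person', 0.98), ('boat', 0.72)]
```

A higher threshold gives fewer false matches but may miss people who are partly hidden or poorly lit; a lower one does the opposite.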

Reading the Image File

Next we need to read our photographs. For this we use the OpenCV library.

file = 'filename.jpeg'
# OpenCV loads the image in BGR channel order, which the predictor expects by default
image = cv2.imread(file, cv2.IMREAD_COLOR)
# predict categories
output = predictor(image)
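
Since we usually have a whole folder of photographs rather than a single file, the snippet below is a small sketch of how the file names could be gathered first. The helper name `find_photos` and the list of extensions are assumptions for illustration, not part of the original script.

```python
from pathlib import Path

# Assumption: these are the extensions we care about; adjust as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_photos(folder):
    """Return the image files directly inside `folder`, sorted by name."""
    return sorted(
        path for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )
```

Each returned path can then be read with `cv2.imread(str(path))`. Note that `cv2.imread` returns `None` rather than raising an error when a file cannot be read, so it is worth checking the result before passing it to the predictor.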

Once we have run the predictor, we can check our results. To do this we extract each individual matched element and overlay them on the original image:

v = Visualizer(image[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(output["instances"].to("cpu"))

Once we have each segment, we use matplotlib to plot the result:

plt.imshow(out.get_image())
plt.show()

This produces the following:

Person matching AI example.
Source: Program output of image

Using this method it is also possible to explore each match individually. We start by creating a list of prediction categories and a bounding box per match.

# pair each predicted class index with its bounding box
matches = zip(output['instances'].get('pred_classes'), output['instances'].get('pred_boxes'))

By iterating through these pairs and selecting only those that correspond to people (category 0) …

for cls, box in matches:
    # skip all categories which do not correspond to people
    if int(cls) != 0:
        continue
    # get bounding box for person as integers (x0, y0, x1, y1)
    i = [int(k) for k in box]

…we can use the bounding box to crop the original image and save each individual match should we wish:

    # crop the original image using the bounding box
    img = image[i[1]:i[3], i[0]:i[2]]

A single person from the above selection.
Example Selection of a Person
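
Should a box ever extend past the image (for example, after adding a margin around it), slicing with negative or oversized indices can give surprising crops. The sketch below shows one way to guard against that; the helper name `crop_bounds` and the sample numbers are hypothetical.

```python
def crop_bounds(box, height, width):
    """Convert an (x0, y0, x1, y1) box into integer bounds inside the image."""
    x0, y0, x1, y1 = (int(value) for value in box)
    # Clamp each coordinate so slicing never starts below 0 or runs past the edge.
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    return x0, y0, x1, y1


# Hypothetical box that spills over the right-hand edge of a 640x480 image
x0, y0, x1, y1 = crop_bounds((600.7, 100.2, 655.9, 300.5), height=480, width=640)
print(x0, y0, x1, y1)  # 600 100 640 300
```

The crop itself is then `image[y0:y1, x0:x1]`, with rows (y) first and columns (x) second, matching the indexing used above.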

As with all machine learning algorithms, we cannot expect 100% accuracy, especially since many of our photos may be of poor quality, or contain unconventional angles/positions.

An example is seen below. Here the boat helm is misidentified because it appears in an unusual position (with respect to the algorithm’s training dataset, anyway).

A classified image of a boat with people.

In this article we looked at how to use the Mask R-CNN model from Facebook’s Detectron2 to detect people in an image. This means that we can now automatically describe the contents of a photo, or identify images we may need to check before publication.

Since the detector is not perfect, however, it is still a good idea to glance over unmatched photos before discarding them, just to make sure.
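
The triage described above could be sketched roughly as follows. This assumes the predicted class indices for each photo have already been collected into a plain dictionary; the function name `split_by_people` and the sample data are hypothetical.

```python
PERSON_CLASS = 0  # class index for "person", as used earlier in the article


def split_by_people(class_ids_per_photo):
    """Split photo names into those with and without a person detection."""
    with_people, without_people = [], []
    for name, class_ids in class_ids_per_photo.items():
        target = with_people if PERSON_CLASS in class_ids else without_people
        target.append(name)
    return with_people, without_people


# Hypothetical detector output: photo name -> list of predicted class indices
results = {"beach.jpeg": [0, 0, 8], "harbour.jpeg": [8], "empty.jpeg": []}
with_people, to_double_check = split_by_people(results)
print(with_people)      # ['beach.jpeg']
print(to_double_check)  # ['harbour.jpeg', 'empty.jpeg']
```

The second list is the one worth a quick manual look, since a missed person is more costly here than a false match.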
