Check that your Python interpreter is linked to the Jupyter notebook by printing a simple statement.

In [1]:
print('HOG-descriptor-from-scratch')
HOG-descriptor-from-scratch

The directory that contains this IPython notebook also contains a 'data' folder. The data folder has the following structure:

data

->pedestrians128x64

->pedestrians_neg

->img_test

You can download the data folder from:

https://drive.google.com/file/d/1YCXkb2muHz-m-nqNWtxCfuxa817DaWB6/view?usp=sharing

In [2]:
datadir = "data"
dataset = "pedestrians128x64"
datafile = "%s/%s.tar.gz" % (datadir, dataset)
In [3]:
extractdir = "%s/%s" % (datadir, dataset)
In [4]:
import cv2
import matplotlib.pyplot as plt
%matplotlib inline
In [5]:
for i in range(5):
    filename = "%s/per0010%d.ppm" % (extractdir, i)
    img = cv2.imread(filename)
    plt.subplot(1, 5, i + 1)
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.axis('off')

The HOG descriptor is used in OpenCV by calling the cv2.HOGDescriptor function. This function requires the following input arguments:

  • detection window size
  • block size
  • block stride
  • cell size
  • number of histogram bins
In [6]:
win_size = (48, 96)
block_size = (16, 16)
block_stride = (8, 8)
cell_size = (8, 8)
num_bins = 9
hog = cv2.HOGDescriptor(win_size, block_size, block_stride,
                        cell_size, num_bins)
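
As a quick sanity check, the descriptor length implied by these parameters can be read off the hog object directly (a short sketch using hog.getDescriptorSize()):

# Blocks per window: ((48 - 16) / 8 + 1) * ((96 - 16) / 8 + 1) = 5 * 11 = 55
# Each 16 x 16 block holds 2 x 2 cells with 9 bins each: 55 * 4 * 9 = 1980
print(hog.getDescriptorSize())  # expected: 1980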

The X_pos list holds randomly picked positive pedestrian samples; I then apply the HOG descriptor to each of them:

In [7]:
import numpy as np
import random
random.seed(42)
X_pos = []
for i in random.sample(range(900), 400):
    filename = "%s/per%05d.ppm" % (extractdir, i)
    img = cv2.imread(filename)
    if img is None:
        print('Could not find image %s' % filename)
        continue
    X_pos.append(hog.compute(img, (64, 64)))
Could not find image data/pedestrians128x64/per00000.ppm

I ended up with 399 training images, each of which has 1980 HOG feature values.

In [8]:
X_pos = np.array(X_pos, dtype=np.float32)
y_pos = np.ones(X_pos.shape[0], dtype=np.int32)
X_pos.shape, y_pos.shape
Out[8]:
((399, 1980, 1), (399,))
In [9]:
negdir = "%s/pedestrians_neg" % datadir

I loop through all the negative images in the directory using os.listdir() and cut five random 64 x 128 regions of interest (ROIs) out of each one:

In [10]:
import os
hroi = 128
wroi = 64
X_neg = []
for negfile in os.listdir(negdir):
    filename = '%s/%s' % (negdir, negfile)
    img = cv2.imread(filename)
    img = cv2.resize(img, (512, 512))
    for j in range(5):
        rand_y = random.randint(0, img.shape[0] - hroi)
        rand_x = random.randint(0, img.shape[1] - wroi)
        roi = img[rand_y:rand_y + hroi, rand_x:rand_x + wroi, :]
        X_neg.append(hog.compute(roi, (64, 64)))
In [11]:
X_neg = np.array(X_neg, dtype=np.float32)
y_neg = -np.ones(X_neg.shape[0], dtype=np.int32)
X_neg.shape, y_neg.shape
Out[11]:
((250, 1980, 1), (250,))
In [12]:
X = np.concatenate((X_pos, X_neg))
y = np.concatenate((y_pos, y_neg))
In [13]:
from sklearn import model_selection as ms
X_train, X_test, y_train, y_test = ms.train_test_split(
    X, y, test_size=0.2, random_state=42
)

Here I train an SVM:

In [14]:
def train_svm(X_train, y_train):
    svm = cv2.ml.SVM_create()
    svm.train(X_train, cv2.ml.ROW_SAMPLE, y_train)
    return svm
In [15]:
def score_svm(svm, X, y):
    from sklearn import metrics
    _, y_pred = svm.predict(X)
    return metrics.accuracy_score(y, y_pred)

After training the SVM, I compute the training and testing scores:

In [16]:
svm = train_svm(X_train, y_train)
score_svm(svm, X_train, y_train)
Out[16]:
1.0
In [17]:
score_svm(svm, X_test, y_test)
Out[17]:
0.6461538461538462

The training score is much higher than the testing score, which is a sign of overfitting. To tackle this, I find the false positives in the test set and, once they are found, append them to the training set and retrain. I repeat this procedure for a few rounds; as you can see below, I got about 65% test accuracy in the first round and 100% in the second, at which point no false positives remain.

In [18]:
score_train = []
score_test = []
for j in range(3):
    svm = train_svm(X_train, y_train)
    score_train.append(score_svm(svm, X_train, y_train))
    score_test.append(score_svm(svm, X_test, y_test))
    _, y_pred = svm.predict(X_test)
    false_pos = np.logical_and((y_test.ravel() == -1),
                               (y_pred.ravel() == 1))
    if not np.any(false_pos):
        print('no more false positives: done')
        break
    X_train = np.concatenate((X_train,
                              X_test[false_pos, :]),
                             axis=0)
    y_train = np.concatenate((y_train, y_test[false_pos]),
                             axis=0)
no more false positives: done
In [19]:
score_train
Out[19]:
[1.0, 1.0]
In [20]:
score_test
Out[20]:
[0.6461538461538462, 1.0]

Now that I have a fairly accurate model, I can move on to detection. The idea is to slide a small window across the image and determine whether each window contains a pedestrian. The window moves by a 'stride', which is simply a number of pixels. I make sure I don't cross the image boundary by using:

if ystart + hroi > img_test.shape[0]:

if xstart + wroi > img_test.shape[1]:

After this I extract the region of interest. If the ROI is classified as a pedestrian, I add it to the list of successes using this code:

if np.allclose(ypred, 1): found.append((ystart, xstart, hroi, wroi))

In [21]:
stride = 16
found = []
img_test = cv2.imread('img_test.jpg')
for ystart in np.arange(0, img_test.shape[0], stride):
    for xstart in np.arange(0, img_test.shape[1], stride):
        if ystart + hroi > img_test.shape[0]:
            continue
        if xstart + wroi > img_test.shape[1]:
            continue
        roi = img_test[ystart:ystart + hroi,
                       xstart:xstart + wroi, :]
        feat = np.array([hog.compute(roi, (64, 64))])
        _, ypred = svm.predict(feat)
        
        if np.allclose(ypred, 1):
            found.append((ystart, xstart, hroi, wroi))

I pass the trained SVM's parameters (support vectors and bias) to the 'hog' object so it can be used as a detector:

In [23]:
rho, _, _ = svm.getDecisionFunction(0)
sv = svm.getSupportVectors()
hog.setSVMDetector(np.append(sv.ravel(), rho))
---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-23-445de3dc92d7> in <module>
      1 rho, _, _ = svm.getDecisionFunction(0)
      2 sv = svm.getSupportVectors()
----> 3 hog.setSVMDetector(np.append(sv.ravel(), rho))

error: OpenCV(4.0.0) /Users/travis/build/skvark/opencv-python/opencv/modules/objdetect/src/hog.cpp:115: error: (-215:Assertion failed) checkDetectorSize() in function 'setSVMDetector'
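
The assertion fails because setSVMDetector expects a single weight vector whose length matches the descriptor size of the detection window (1980 values, plus one for the bias), whereas cv2.ml.SVM_create() defaults to an RBF kernel, so getSupportVectors() returns many support vectors and sv.ravel() is far too long. One possible fix, sketched after the recipe in OpenCV's train_HOG sample (this assumes a linear kernel is acceptable for this data):

# Retrain with a linear kernel so the model is described by a single weight vector.
svm = cv2.ml.SVM_create()
svm.setKernel(cv2.ml.SVM_LINEAR)
svm.train(X_train, cv2.ml.ROW_SAMPLE, y_train)

# For a linear SVM, getSupportVectors() returns one compressed weight vector
# of length 1980; the detector is that vector followed by -rho.
rho, _, _ = svm.getDecisionFunction(0)
sv = svm.getSupportVectors()
hog.setSVMDetector(np.append(sv[0], -rho).astype(np.float32))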

The size of people in the image can vary, hence I use detectMultiScale:

In [24]:
found = hog.detectMultiScale(img_test)
In [25]:
from matplotlib import patches
fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(cv2.cvtColor(img_test, cv2.COLOR_BGR2RGB))

for f in found:
    ax.add_patch(patches.Rectangle((f[0], f[1]), f[2], f[3],
                                   color='y', linewidth=3,
                                   fill=False))
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-25-000bf8d4fd69> in <module>
      5 
      6 for f in found:
----> 7     ax.add_patch(patches.Rectangle((f[0], f[1]), f[2], f[3],
      8                                    color='y', linewidth=3,
      9                                    fill=False))

IndexError: index 1 is out of bounds for axis 0 with size 1
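
The IndexError arises because, in opencv-python, detectMultiScale returns a tuple of (bounding boxes, confidence weights), so each f in the loop above is a whole array rather than a single (x, y, w, h) box. A sketch of the intended plotting loop (this also assumes hog has a valid SVM detector set, e.g. via the sketch above):

rects, weights = hog.detectMultiScale(img_test)
for (x, y, w, h) in rects:
    ax.add_patch(patches.Rectangle((x, y), w, h,
                                   color='y', linewidth=3,
                                   fill=False))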