Convolutional neural network with The Simpsons

recognizing Simpson characters - walkthrough

Posted on January 24, 2018

A convolutional neural network (CNN) is a type of neural network that is especially useful for image classification tasks. I trained a CNN on thousands of Simpsons images to recognise 10 characters from the TV show with an accuracy of more than 90 percent.

Getting the dataset

You can download the images from Kaggle. The dataset contains images for more than 20 Simpsons characters. From the characters with at least 1,000 sample images, I chose 10 to train my model.

Splitting images to training set and test set

I used 1,000 images for each chosen character, split into 800 for the training set and 200 for the test set.

The training_set and test_set folders each contain 10 subfolders (one per character), holding 800 and 200 images per character respectively.


And similarly for the test_set
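
A split like this can be scripted rather than done by hand. The sketch below is one possible way to do it, not necessarily how the folders above were created. The function name `split_character_folder`, the idea of a separate source folder per character, and the fixed random seed are all assumptions made for illustration.

```python
import os
import random
import shutil


def split_character_folder(src_dir, train_dir, test_dir, n_train=800, n_test=200, seed=0):
    """Copy n_train images from src_dir into train_dir and n_test into test_dir."""
    files = sorted(f for f in os.listdir(src_dir)
                   if os.path.isfile(os.path.join(src_dir, f)))
    if len(files) < n_train + n_test:
        raise ValueError("not enough images in " + src_dir)

    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(files)

    os.makedirs(train_dir, exist_ok=True)
    os.makedirs(test_dir, exist_ok=True)
    train_files = files[:n_train]
    test_files = files[n_train:n_train + n_test]
    for name in train_files:
        shutil.copy(os.path.join(src_dir, name), os.path.join(train_dir, name))
    for name in test_files:
        shutil.copy(os.path.join(src_dir, name), os.path.join(test_dir, name))
    return len(train_files), len(test_files)
```

It would be called once per character, for example `split_character_folder('raw/bart', 'simpsons/training_set/bart', 'simpsons/test_set/bart')`, where `raw/bart` is a hypothetical folder holding the downloaded images for that character.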

Installing Tensorflow and Keras

Keras is a high-level Python package for building neural networks. To use Keras, you need a backend such as TensorFlow.

Follow the instructions on the TensorFlow website to install it for your OS. I installed the GPU version for faster computation.
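
After installing, a quick way to see whether Python can find the packages is to look them up without importing them. This is only a convenience sketch; the helper name `installed` is made up for this post, and it does not check versions or GPU support.

```python
import importlib.util


def installed(package_names):
    """Return {name: True/False} without importing the packages themselves."""
    return {name: importlib.util.find_spec(name) is not None for name in package_names}


# Shows True for each package that Python can locate in the current environment.
print(installed(["tensorflow", "keras"]))
```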

Convolutional Neural Network

In machine learning, a convolutional neural network is a class of deep, feed-forward artificial neural networks that has been applied successfully to analyzing visual imagery. CNNs use a variation of multilayer perceptrons designed to require minimal preprocessing. Convolutional networks were inspired by biological processes, in that the connectivity pattern between neurons resembles the organization of the animal visual cortex.

CNNs use relatively little pre-processing compared to other image classification algorithms. This means that the network learns the filters that were hand-engineered in traditional algorithms. This independence from prior knowledge and human effort in feature design is a major advantage. CNNs have applications in image and video recognition, recommender systems and natural language processing.


The Simpsons Classifier

Importing libraries

# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing import image
from timeit import default_timer as timer
import numpy as np
import os

Setting Image size

image_height = 128
image_width = 128

I resized all my images to 128x128. You can choose a larger or smaller size depending on the computational power of your system.

Constructing the neural network

A CNN consists of an input layer, an output layer, and multiple hidden layers in between. My network stacks five convolution and pooling blocks, followed by a small fully connected layer and a 10-way softmax output layer.

Convolutional layers apply a convolution operation to the input, passing the result to the next layer. The convolution emulates the response of an individual neuron to visual stimuli. Each convolutional neuron processes data only for its receptive field.
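
To make the operation concrete, here is a minimal NumPy sketch of what a single 3x3 filter does to a single-channel image. It is an illustration under simplifying assumptions (one channel, no padding, stride 1, no bias or activation), not how Keras implements `Conv2D` internally. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this is a cross-correlation. The function name `conv2d_valid` and the hand-made filter are mine.

```python
import numpy as np


def conv2d_valid(img, kernel):
    """Slide kernel over img (no padding, stride 1) and sum the elementwise products."""
    kh, kw = kernel.shape
    out_h = img.shape[0] - kh + 1
    out_w = img.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output value only sees its own kh x kw receptive field.
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out


img = np.arange(25, dtype=float).reshape(5, 5)     # toy 5x5 "image"
vertical_edge = np.array([[1.0, 0.0, -1.0]] * 3)   # hand-made filter; a CNN learns these values
feature_map = conv2d_valid(img, vertical_edge)
print(feature_map.shape)   # (3, 3)
```

A 3x3 filter without padding shrinks each side by 2, which is why the 5x5 input becomes a 3x3 feature map.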

Max pooling uses the maximum value from each of a cluster of neurons at the prior layer, greatly reducing the size of the image after feature extraction.
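
The same idea in a few lines of NumPy, as a sketch for a single 2D feature map with a 2x2 window and stride 2 (the settings used by `MaxPooling2D(pool_size = (2, 2))` with its defaults). The function name `max_pool_2x2` and the sample values are made up for illustration.

```python
import numpy as np


def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2; an odd trailing row or column is dropped."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2]
    # Group pixels into 2x2 blocks, then keep the largest value of each block.
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))


fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 5],
               [0, 1, 7, 2],
               [6, 2, 3, 4]])
pooled = max_pool_2x2(fm)
print(pooled)   # [[4 5]
                #  [6 7]]
```

Each pooling step halves the height and width, so the feature map keeps a quarter of its values.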

# Initialising the CNN
predator = Sequential()
# Step 1 - Convolution
predator.add(Conv2D(64, (3, 3), activation="relu", input_shape=(image_height, image_width, 3)))
# Step 2 - Pooling
predator.add(MaxPooling2D(pool_size = (2, 2)))
# Adding a second convolutional layer
predator.add(Conv2D(128, (3, 3), activation="relu"))
predator.add(MaxPooling2D(pool_size = (2, 2)))
# Adding a third convolutional layer
predator.add(Conv2D(256, (3, 3), activation="relu"))
predator.add(MaxPooling2D(pool_size = (2, 2)))
# Adding a fourth convolutional layer
predator.add(Conv2D(128, (3, 3), activation="relu"))
predator.add(MaxPooling2D(pool_size = (2, 2)))
# Adding a fifth convolutional layer
predator.add(Conv2D(64, (3, 3), activation="relu"))
predator.add(MaxPooling2D(pool_size = (2, 2)))
# Step 3 - Flattening
predator.add(Flatten())    #flattens the 3D image array to a single row array
# Step 4 - Full connection
predator.add(Dense(units=32, activation="relu"))
predator.add(Dense(units=10, activation="softmax"))   #output layer with 10 neurons.. each corresponding to a character
# Compiling the CNN
predator.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
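
It can help to trace how the image shrinks on its way through the network. The sketch below redoes the shape arithmetic by hand, assuming the Keras defaults used above: no padding and stride 1 for `Conv2D`, and stride 2 for the 2x2 pooling. The helper `trace_shapes` is not part of Keras; calling `predator.summary()` should report the same shapes.

```python
def trace_shapes(size=128, conv_filters=(64, 128, 256, 128, 64), kernel=3, pool=2):
    """Return the (height, width, channels) shape after each conv + pool block."""
    shapes = []
    for filters in conv_filters:
        size = size - kernel + 1   # 'valid' convolution, stride 1
        size = size // pool        # 2x2 max pooling, stride 2, floor division
        shapes.append((size, size, filters))
    return shapes


shapes = trace_shapes()
for shape in shapes:
    print(shape)   # (63, 63, 64), (30, 30, 128), (14, 14, 256), (6, 6, 128), (2, 2, 64)

h, w, c = shapes[-1]
print("flattened length:", h * w * c)   # flattened length: 256
```

So the Flatten layer hands a vector of 256 values to the 32-unit dense layer.
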

Defining the training set and test set

We have placed our training and test images in the folders mentioned above. We now have to tell Keras where to find them.

# ImageDataGenerator augments the training images, creating multiple versions of the same image
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

# Test images are only rescaled, not augmented
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('simpsons/training_set',
                                                 target_size = (image_height, image_width),
                                                 batch_size = 50,
                                                 class_mode = 'categorical')
test_set = test_datagen.flow_from_directory('simpsons/test_set',
                                            target_size = (image_height, image_width),
                                            batch_size = 50,
                                            class_mode = 'categorical')
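
With `class_mode = 'categorical'`, each subfolder becomes one class, and Keras numbers the classes in alphanumeric order of the folder names. The real mapping is available as `training_set.class_indices`. The sketch below only mimics that ordering in plain Python; the helper name and the folder names are hypothetical, chosen to match the short names used later in this post.

```python
def class_indices_for(folder_names):
    """Mimic the alphanumeric folder-name ordering that Keras uses for class labels."""
    return {name: index for index, name in enumerate(sorted(folder_names))}


# Hypothetical folder names; the actual dataset folders may be named differently.
folders = ["homer", "bart", "lisa", "marge", "krusty",
           "moe", "ned", "milhouse", "charles", "principal"]
indices = class_indices_for(folders)
print(indices["bart"], indices["homer"], indices["principal"])   # 0 2 9
```

This ordering matters when reading the predictions: column 0 of the softmax output belongs to the first folder alphabetically, column 9 to the last.
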
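
Fitting the CNN to the images

# Train from the directory generators (fit_generator in 2018-era Keras; newer versions accept generators in fit)
predator.fit_generator(training_set,
                       steps_per_epoch = 16,
                       epochs = 100,
                       validation_data = test_set,
                       validation_steps = 4)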

This starts the training process. Running 100 epochs took me close to 10 minutes and reached an accuracy of more than 90 percent.
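
A note on the step counts: each step draws one batch from the generator, so the number of images seen per epoch is `steps_per_epoch * batch_size`. The arithmetic below uses the numbers from this post (batch size 50, 10 characters with 800 training and 200 test images each). The helper names are made up, and whether the smaller step counts were a deliberate choice to shorten training is something I can only guess at.

```python
import math


def images_per_epoch(steps_per_epoch, batch_size):
    """Number of images drawn from the generator in one epoch."""
    return steps_per_epoch * batch_size


def steps_for_full_pass(num_images, batch_size):
    """Steps needed to see every image once."""
    return math.ceil(num_images / batch_size)


print(images_per_epoch(16, 50))        # 800 images drawn per training epoch
print(images_per_epoch(4, 50))         # 200 images drawn per validation pass
print(steps_for_full_pass(8000, 50))   # 160 steps to see all 10 x 800 training images once
print(steps_for_full_pass(2000, 50))   # 40 steps to see all 10 x 200 test images once
```

With the settings above, each epoch therefore trains on a sample of about a tenth of the training set; setting `steps_per_epoch` to 160 would cover all of it at the cost of longer epochs.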

Prediction on a new picture

The Kaggle dataset also has an independent folder, predict, containing close to 500 images of the 10 characters.

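# prediction on a new picture
result = []
path = 'simpsons/predict'
files = os.listdir(path)

for file in files:
    test_image = image.load_img(os.path.join(path, file), target_size=(image_height, image_width))
    test_image = image.img_to_array(test_image)
    test_image = test_image / 255.0                   # same rescaling as the training generator
    test_image = np.expand_dims(test_image, axis=0)   # add the batch dimension: (1, height, width, 3)

    pred = predator.predict_on_batch(test_image)      # shape (1, 10): one probability per character
    result.append(pred)

result = np.asarray(result)                           # shape (number of files, 1, 10)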

[Images: Bart, Homer]

Not bad :)

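Generating a csv file with the results

import pandas as pd

index = files
predictions = result[:, 0]        # drop the singleton batch axis: (number of files, 10)
df = pd.DataFrame(index=index)

# Columns follow the alphabetical class order used by flow_from_directory
df['bart'] = predictions[:,0]
df['charles'] = predictions[:,1]
df['homer'] = predictions[:,2]
df['krusty'] = predictions[:,3]
df['lisa'] = predictions[:,4]
df['marge'] = predictions[:,5]
df['milhouse'] = predictions[:,6]
df['moe'] = predictions[:,7]
df['ned'] = predictions[:,8]
df['principal'] = predictions[:,9]

df = df.round().astype(int)       # round to 0/1 first; a plain int cast truncates towards zero
df.to_csv('predictions.csv')      # the output file name is an assumption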
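
If you would rather have a single predicted name per image than ten 0/1 columns, you can take the position of the largest probability in each row. This is a small sketch of that idea, not part of the original code. The class list assumes the alphabetical order used for the columns above, and the probabilities in the example are made up.

```python
import numpy as np

CLASS_NAMES = ["bart", "charles", "homer", "krusty", "lisa",
               "marge", "milhouse", "moe", "ned", "principal"]


def top_labels(probabilities, class_names=CLASS_NAMES):
    """Map each row of class probabilities to (label, probability) of the best class."""
    probabilities = np.asarray(probabilities)
    best = probabilities.argmax(axis=1)
    return [(class_names[i], float(row[i])) for i, row in zip(best, probabilities)]


# Made-up probabilities for two images, for illustration only.
example = np.zeros((2, 10))
example[0, 0] = 0.9
example[0, 2] = 0.1
example[1, 2] = 0.7
example[1, 5] = 0.3
print(top_labels(example))   # [('bart', 0.9), ('homer', 0.7)]
```

Keeping the probability next to the label also shows how confident the network was, which the rounded 0/1 table hides.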

Save the trained model

If you are happy with the predicted results, you can save the trained model with the code below.

#save weight
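
As a rough sketch of what saving and restoring can look like: Keras models provide `save_weights` and `load_weights`, and the weights can only be loaded back into a network with the same architecture. The helper name `save_and_reload_weights`, the `build_model` argument (any function that rebuilds and returns the network defined earlier) and the file name `simpsons_weights.h5` are all assumptions for illustration. Some newer Keras versions require the weights file name to end in `.weights.h5`.

```python
def save_and_reload_weights(model, build_model, weights_path="simpsons_weights.h5"):
    """Save model's weights, then load them into a freshly built copy of the network."""
    model.save_weights(weights_path)
    fresh_model = build_model()          # must recreate the same architecture
    fresh_model.load_weights(weights_path)
    return fresh_model
```

With the classifier from this post, the call would look like `restored = save_and_reload_weights(predator, build_model)`, where `build_model` is a function you write that repeats the layer definitions and the `compile` call from above.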


You can download my trained weights here.