A convolutional neural network (CNN) is a type of neural network that is especially useful for image classification. I trained a CNN on thousands of Simpsons images to recognise 10 characters from the TV show with an accuracy of more than 90 percent.
Getting the dataset
You can download the images from Kaggle. The dataset contains images of more than 20 Simpsons characters. From the characters with at least 1,000 sample images, I chose 10 to train my model.
Splitting images into a training set and a test set
I used a thousand sample images for each chosen character and split them into 800 for the training set and 200 for the test set.
The training_set and test_set folders each have 10 folders (1 per character), containing 800 and 200 images per character respectively.
The test_set folder is laid out the same way as training_set.
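The split described above can be scripted rather than done by hand. The following is a minimal sketch, not the script used for this post: the function name, the source folder layout (one folder of raw images per character) and the fixed random seed are all assumptions made for illustration.

```python
import random
import shutil
from pathlib import Path


def split_character_images(source_dir, output_root, n_train=800, n_test=200, seed=0):
    """Copy images of one character into training_set/ and test_set/ subfolders.

    source_dir  -- folder holding all images of a single character (assumed layout)
    output_root -- folder that will contain training_set/ and test_set/
    """
    source_dir = Path(source_dir)
    output_root = Path(output_root)
    images = sorted(p for p in source_dir.iterdir() if p.is_file())
    if len(images) < n_train + n_test:
        raise ValueError(
            f"{source_dir.name}: need {n_train + n_test} images, found {len(images)}"
        )

    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(images)
    splits = {
        "training_set": images[:n_train],
        "test_set": images[n_train:n_train + n_test],
    }
    for split_name, files in splits.items():
        # One subfolder per character inside each split folder.
        target = output_root / split_name / source_dir.name
        target.mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.copy2(f, target / f.name)
    return {name: len(files) for name, files in splits.items()}
```

Calling it once per character, for example split_character_images('raw/bart', 'simpsons') with a hypothetical raw/bart folder, would produce simpsons/training_set/bart and simpsons/test_set/bart.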
Installing TensorFlow and Keras
Keras is a very good Python package for neural networks. To use Keras, you need a backend such as TensorFlow.
Follow the instructions on the TensorFlow website to install the version for your OS. I installed the GPU version for faster computation.
Convolutional Neural Network
In machine learning, a convolutional neural network is a class of deep, feed-forward artificial neural networks that has been applied successfully to analyzing visual imagery. CNNs use a variation of multilayer perceptrons designed to require minimal preprocessing. Convolutional networks were inspired by biological processes, in that the connectivity pattern between neurons resembles the organization of the animal visual cortex.
CNNs use relatively little pre-processing compared to other image classification algorithms. This means that the network learns the filters that were hand-engineered in traditional algorithms. This independence from prior knowledge and human effort in feature design is a major advantage. CNNs have applications in image and video recognition, recommender systems and natural language processing.
The Simpsons Classifier
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.preprocessing.image import ImageDataGenerator
from timeit import default_timer as timer
from keras.preprocessing import image
import numpy as np
import os
Setting Image size
image_height = 128
image_width = 128
I resized all my images to 128x128. You can choose a larger or smaller size depending on the computational power of your system.
Constructing the neural network
A CNN consists of an input and an output layer, as well as multiple hidden layers. For my neural network I stacked five convolution and pooling blocks, followed by a small fully connected layer.
Convolutional layers apply a convolution operation to the input, passing the result to the next layer. The convolution emulates the response of an individual neuron to visual stimuli. Each convolutional neuron processes data only for its receptive field.
Max pooling uses the maximum value from each of a cluster of neurons at the prior layer, greatly reducing the size of the image after feature extraction.
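To make the two operations concrete, here is a small NumPy sketch of a single convolution followed by ReLU and 2x2 max pooling. It is only an illustration on a toy 6x6 array with a hand-picked filter. Keras implements these layers far more efficiently, and in a real CNN the filter values are learned rather than chosen. The function names are made up for this example.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2D `image` without padding (cross-correlation, as CNN layers do)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output value only sees a kh x kw patch: its receptive field.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))


image = np.arange(36, dtype=float).reshape(6, 6)        # toy 6x6 "image"
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)               # simple horizontal-gradient filter
features = np.maximum(conv2d_valid(image, kernel), 0)   # ReLU after the convolution
pooled = max_pool2d(features)
print(features.shape, pooled.shape)  # (4, 4) (2, 2)
```

The 3x3 filter shrinks the 6x6 input to 4x4, and pooling halves that to 2x2, which is the same pattern each block of the classifier applies to its input.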
# Initialising the CNN
predator = Sequential()

# Step 1 - Convolution
predator.add(Conv2D(64, (3, 3), activation="relu", input_shape=(image_height, image_width, 3)))

# Step 2 - Pooling
predator.add(MaxPooling2D(pool_size = (2, 2)))

# Adding a second convolutional layer
predator.add(Conv2D(128, (3, 3), activation="relu"))
predator.add(MaxPooling2D(pool_size = (2, 2)))

# Adding a third convolutional layer
predator.add(Conv2D(256, (3, 3), activation="relu"))
predator.add(MaxPooling2D(pool_size = (2, 2)))

# Adding a fourth convolutional layer
predator.add(Conv2D(128, (3, 3), activation="relu"))
predator.add(MaxPooling2D(pool_size = (2, 2)))

# Adding a fifth convolutional layer
predator.add(Conv2D(64, (3, 3), activation="relu"))
predator.add(MaxPooling2D(pool_size = (2, 2)))

# Step 3 - Flattening
predator.add(Flatten())  # flattens the 3D image array to a single row array

# Step 4 - Full connection
predator.add(Dense(units=32, activation="relu"))
predator.add(Dense(units=10, activation="softmax"))  # output layer with 10 neurons, each corresponding to a character

# Compiling the CNN
predator.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
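It can help to trace how the image shrinks on its way through the five blocks above. The sketch below does the arithmetic in plain Python, assuming the Keras defaults that the model code relies on: no padding and stride 1 for Conv2D, and a stride equal to the pool size for MaxPooling2D. The helper name is made up for this example.

```python
def trace_feature_map_sizes(input_size, filters, kernel=3, pool=2):
    """Return (side_length, channels) after each conv + pool block.

    Assumes 'valid' padding and stride 1 for the convolution,
    and non-overlapping pooling windows.
    """
    size = input_size
    trace = []
    for n_filters in filters:
        size = size - kernel + 1   # 3x3 convolution without padding
        size = size // pool        # 2x2 max pooling rounds down
        trace.append((size, n_filters))
    return trace


trace = trace_feature_map_sizes(128, [64, 128, 256, 128, 64])
print(trace)
# [(63, 64), (30, 128), (14, 256), (6, 128), (2, 64)]
side, channels = trace[-1]
print(side * side * channels)  # 256 values reach the Dense layers after Flatten
```

Under those assumptions a 128x128 input ends up as a 2x2 map with 64 channels, so a larger or smaller image size directly changes how many values the fully connected layers receive.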
Defining the training set and test set
We have placed our training and test images in the folders mentioned above. We now have to tell Python where to find them.
# ImageDataGenerator augments images, creating multiple versions of the same image
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('simpsons/training_set',
                                                 target_size = (image_height, image_width),
                                                 batch_size = 50,
                                                 class_mode = 'categorical')

test_set = test_datagen.flow_from_directory('simpsons/test_set',
                                            target_size = (image_height, image_width),
                                            batch_size = 50,
                                            class_mode = 'categorical')
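With class_mode = 'categorical', flow_from_directory treats every subfolder as one class and numbers the classes in alphanumeric order of the folder names. The sketch below mimics that mapping in plain Python so you can see which output neuron belongs to which character. The helper name is an assumption for illustration, and on the real generator the same information is available as training_set.class_indices.

```python
import os


def class_indices_from_folders(directory):
    """Mimic how class labels are derived: one class per subfolder, in sorted order."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}
```

For a directory containing the folders homer, bart and lisa, this returns bart as class 0, homer as class 1 and lisa as class 2, regardless of the order in which the folders were created.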
Fitting the CNN to the images
predator.fit_generator(training_set,
                       steps_per_epoch = 16,
                       epochs = 100,
                       validation_data = test_set,
                       validation_steps = 4)
This starts the training process. Running 100 epochs took me close to 10 minutes and returned an accuracy of more than 90 percent.
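Note that with fit_generator an epoch is simply steps_per_epoch batches, not necessarily one pass over all images. The arithmetic below uses the batch size of 50 and the 800/200 split per character from this post. The helper name is made up, and the step counts it prints are only what a full pass would need, not a recommendation.

```python
import math


def steps_to_cover(n_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(n_images / batch_size)


batch_size = 50
train_images = 10 * 800   # 10 characters x 800 training images
test_images = 10 * 200    # 10 characters x 200 test images

print(16 * batch_size)                            # 800 images per epoch with steps_per_epoch = 16
print(steps_to_cover(train_images, batch_size))   # 160 steps for a full pass over the training set
print(steps_to_cover(test_images, batch_size))    # 40 steps for a full pass over the test set
```

So each of the 100 epochs here draws 800 images, and the generator keeps cycling through the folders from one epoch to the next. The short epochs are one reason the whole run can finish in minutes.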
Prediction on a new picture
The Kaggle dataset also has an independent folder, predict, containing close to 500 images of the 10 characters.
# prediction on a new picture
result = []
path = 'simpsons/predict'
files = os.listdir(path)
for file in files:
    test_image = image.load_img(os.path.join(path, file), target_size=(image_height, image_width))
    test_image = image.img_to_array(test_image)
    test_image = np.expand_dims(test_image, axis=0)  # add a batch axis: (1, height, width, 3)
    pred = predator.predict_on_batch(test_image)
    result.append(pred)
result = np.asarray(result)  # shape: (number of images, 1, 10)
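One thing worth checking: the training generator rescales pixel values by 1/255, while the loop above passes raw 0-255 values to the model. The post does not say whether this affected the results, so treat the following as an optional sketch rather than a required fix. Both helper names are made up for illustration, and the class names are the column names used in the CSV step.

```python
import numpy as np

# Class order assumed to match the alphabetical folder order of the training set.
CLASS_NAMES = ['bart', 'charles', 'homer', 'krusty', 'lisa',
               'marge', 'milhouse', 'moe', 'ned', 'principal']


def prepare_batch(pixel_array, rescale=1.0 / 255):
    """Scale raw 0-255 pixels like the training generator and add a batch axis."""
    scaled = np.asarray(pixel_array, dtype="float32") * rescale
    return np.expand_dims(scaled, axis=0)


def top_prediction(probabilities, class_names=CLASS_NAMES):
    """Return the most likely class name and its probability for one image."""
    probabilities = np.asarray(probabilities).reshape(-1)
    best = int(np.argmax(probabilities))
    return class_names[best], float(probabilities[best])
```

In the loop, this would replace the expand_dims line with prepare_batch(image.img_to_array(test_image)). The model output for each image is then a row of 10 probabilities, and top_prediction picks the character with the highest one.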
Not bad :)
Generating a csv file with the results
import pandas as pd

index = files
predictions = result[:, 0]  # drop the batch axis: (images, 1, 10) -> (images, 10)
df = pd.DataFrame(index=index)
df['bart'] = predictions[:, 0]
df['charles'] = predictions[:, 1]
df['homer'] = predictions[:, 2]
df['krusty'] = predictions[:, 3]
df['lisa'] = predictions[:, 4]
df['marge'] = predictions[:, 5]
df['milhouse'] = predictions[:, 6]
df['moe'] = predictions[:, 7]
df['ned'] = predictions[:, 8]
df['principal'] = predictions[:, 9]
df = df.round().astype(int)  # round before casting so that 0.99 becomes 1 rather than 0
df.to_csv('simpsons/predictions.csv')
Save the trained model
If you are happy with the predicted results, you can save the trained model weights with the code below.
# save weights
predator.save_weights('simpsons/blog/simpsons-CNN.hdf5')
You can download my trained weights here.