Can a machine learn how to classify dog breeds?

With the superpower of Convolutional Neural Networks (CNNs), the answer is yes! But the most interesting part is understanding how they work. This post will help you become familiar with CNN architecture through a practical dog classifier project.

The project aims to identify various dog breeds based on an image. If the image is not of a dog, let’s say of a human, the detector will predict the dog breed that is most similar to that person.

Sample output of the model

To solve this problem, we separate the project into several steps:

We use accuracy as the metric to evaluate model performance. To calculate it, we use the model to predict the dog breed of every image in the test set and then find the percentage of images the model predicted correctly. Because this is a multi-class problem, accuracy is a good choice.
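As a quick illustration, this metric is simply the fraction of test images whose predicted breed matches the true label. A minimal sketch (the variable and function names are placeholders, not the original project code):

```python
import numpy as np

def test_accuracy(model, test_tensors, test_targets):
    """Fraction of test images whose predicted breed matches the true label."""
    # Predicted class index for each test image
    predictions = np.argmax(model.predict(test_tensors), axis=1)
    # True class index for each test image (targets are one-hot encoded)
    true_labels = np.argmax(test_targets, axis=1)
    return np.mean(predictions == true_labels)
```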

Before diving into the project details, it’s good to understand the inputs we have. The figure below displays the number of training images for each breed, sorted alphabetically by breed name.

It seems we don’t have a balanced dataset: some breeds have more images than others. The breed with the highest count has 77 files, while the breed with the lowest has only 26. However, most breeds have a sufficient sample size for training, so we don’t need to eliminate any breed before implementing the project.

For this dataset, we have:

The training set is quite small; however, it’s a reasonable size for studying and experimenting with CNN architectures (it usually doesn’t take too much time to train).

The cascade classifier will draw a bounding box if it detects a human face in the image:

Detected faces will be highlighted with a blue bounding box
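A minimal OpenCV sketch of this step, assuming the standard pre-trained Haar cascade XML file is available locally (the exact file path is an assumption):

```python
import cv2

# Load OpenCV's pre-trained Haar cascade for frontal faces
# (the local path to the XML file is an assumption)
face_cascade = cv2.CascadeClassifier('haarcascades/haarcascade_frontalface_alt.xml')

def face_detector(img_path):
    """Return True if at least one human face is detected in the image."""
    img = cv2.imread(img_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # cascades work on grayscale images
    faces = face_cascade.detectMultiScale(gray)
    # Each detection is a bounding box (x, y, w, h) that can be drawn on the image
    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)  # blue box in BGR
    return len(faces) > 0
```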

Out of our 100 provided human images, this cascade model detects a face in 99% of cases. On the other hand, it is only able to detect 12 dog faces out of 100 dog images. This result is predictable, as the model was not built for the purpose of detecting dogs.

There is only one failed case among the human images, because the person in that image doesn’t have a clearly visible face.

In my opinion, it’s fine to expect people to look directly at the camera. However, we could also combine multiple face-detector models (for example, haarcascade and mtcnn) to improve performance if necessary.
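If we did want to combine detectors, one simple option is to flag a face whenever either model fires. A rough sketch, assuming the `mtcnn` package is installed and reusing the `face_detector` Haar-cascade helper sketched above:

```python
import cv2
from mtcnn import MTCNN  # pip install mtcnn

mtcnn_detector = MTCNN()

def combined_face_detector(img_path):
    """Return True if either the Haar cascade or MTCNN finds a face."""
    if face_detector(img_path):  # Haar-cascade helper defined earlier
        return True
    # MTCNN expects an RGB array, while OpenCV loads images as BGR
    rgb = cv2.cvtColor(cv2.imread(img_path), cv2.COLOR_BGR2RGB)
    return len(mtcnn_detector.detect_faces(rgb)) > 0
```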

This model worked well because:

Model Selection:

When it comes to deep learning, we also have the Multilayer Perceptron (MLP) model. Both MLP and CNN models can be used to classify images. However, an MLP takes a flattened vector as input while a CNN takes a tensor, so a CNN can capture the spatial relationships between nearby pixels. From this point of view, CNN seems to be the better approach for real-world images, which are often complex, messy, and confusing.

For this specific project, some dog breeds are quite challenging, even for humans, to tell apart. The example below shows two dog breeds that are quite similar:

Model Pre-Processing:

This function pre-processes input to:

Then, we rescale the images by dividing every pixel in every image by 255, so each element in the tensor falls in the range 0 to 1.
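A sketch of such a pre-processing step with Keras (the 224×224 target size is the usual input size for the models used here; treat it as an assumption):

```python
import numpy as np
from tensorflow.keras.preprocessing import image

def path_to_tensor(img_path):
    """Load one image and convert it to a 4D tensor of shape (1, 224, 224, 3)."""
    img = image.load_img(img_path, target_size=(224, 224))  # resize to a fixed input size
    x = image.img_to_array(img)                              # (224, 224, 3) float array
    return np.expand_dims(x, axis=0)                         # add the batch dimension

def paths_to_tensor(img_paths):
    """Stack a list of image paths into one (N, 224, 224, 3) tensor, rescaled to [0, 1]."""
    tensors = np.vstack([path_to_tensor(p) for p in img_paths])
    return tensors.astype('float32') / 255.0  # rescale pixel values from [0, 255] to [0, 1]
```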

Model Architecture:

Following best practices, our first CNN model is built with the following architecture:

CNN model from scratch
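As a rough Keras sketch of a from-scratch CNN in this spirit (the filter counts, kernel sizes, and the 133-class output are illustrative assumptions, not the exact architecture shown in the figure):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense

model = Sequential([
    # Convolution + pooling blocks extract increasingly abstract spatial features
    Conv2D(16, (3, 3), activation='relu', input_shape=(224, 224, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    # Collapse the spatial dimensions, then classify into dog breeds
    GlobalAveragePooling2D(),
    Dense(133, activation='softmax'),  # number of breeds is an assumption
])
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
```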

For this architecture:

Model Results:

Fit history of the training and validation sets over 20 epochs

Due to the relatively small training dataset, it is hard for our from-scratch model to achieve higher accuracy. In this step, we apply transfer learning to reduce training time as well as to improve accuracy.

Model Architecture:

The model uses the pre-trained VGG-16 model as a fixed feature extractor, where the last convolutional output of VGG-16 is fed as input to our model. We only add a global average pooling layer and a fully connected layer, where the latter contains one node for each dog category and is equipped with a softmax activation.
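A minimal Keras sketch of this classifier head, assuming the VGG-16 bottleneck features for 224×224 inputs (shape 7×7×512) and 133 breed categories (the class count is an assumption):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense

# VGG-16's last convolutional block outputs (7, 7, 512) feature maps for 224x224 inputs;
# these pre-computed bottleneck features are the input to our small classifier head.
vgg16_head = Sequential([
    GlobalAveragePooling2D(input_shape=(7, 7, 512)),
    Dense(133, activation='softmax'),  # one node per dog breed (count assumed)
])
vgg16_head.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
```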

Model Results:

Compared to the previous model, this approach performs significantly better.

What if we try another model to see if the accuracy can be improved further? The following approaches are considered:

Model Architecture:

Model Results:

Using the third model, we make predictions for images containing dogs, humans, and other species to assess performance. Below are some example predictions from our model.
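The overall prediction logic described earlier can be sketched roughly as follows (`dog_detector` and `predict_breed` are hypothetical helper names standing in for the dog detector and breed classifier discussed above; `face_detector` is the Haar-cascade helper sketched earlier):

```python
def classify_image(img_path):
    """Route an image through the dog detector, the face detector, and the breed model."""
    if dog_detector(img_path):        # hypothetical helper: image contains a dog
        return f"This dog looks like a {predict_breed(img_path)}."
    if face_detector(img_path):       # image contains a human face
        return f"This human most resembles a {predict_breed(img_path)}."
    return "Neither a dog nor a human face was detected in this image."
```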

In summary, the test accuracy results are as follows:
