Bird Classification

This is a bird image classifier built with Python and machine learning.

Engineer: Linxi Wei
School: The Affiliated High School of Peking University
Area of Interest: Computer Science
Grade: Incoming Senior


First Milestone

My first milestone is learning the basics of machine learning, setting up the Raspberry Pi, and finding a dataset of bird images.

1. Machine learning

Machine learning works in the opposite direction from traditional programming. In traditional programming, you give the computer rules and data, and it produces answers. In machine learning, you give the computer data and answers, and it figures out the rules.
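
As a small illustration of that idea, the sketch below contrasts the two approaches on a toy problem. The temperature-conversion example and the function names are made up for illustration and are not part of the project: one function has the rule written by hand, the other recovers the same rule from example inputs and their answers using an ordinary least-squares fit.

```python
# Traditional programming: a person writes the rule.
def celsius_to_fahrenheit(c):
    return 1.8 * c + 32


# Machine learning: the rule (slope and intercept) is estimated from
# example inputs (data) and their known outputs (answers).
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


data = [-40.0, 0.0, 25.0, 100.0]                     # inputs
answers = [celsius_to_fahrenheit(c) for c in data]   # known outputs
slope, intercept = fit_line(data, answers)
print(f"learned rule: f = {slope:.2f} * c + {intercept:.2f}")
```

The fitted slope and intercept should come out very close to 1.8 and 32, the rule that was never shown to `fit_line` directly.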

To start, I learned TensorFlow, a library created by Google for building machine learning models, and wrote a sample program that recognizes handwritten digits from the MNIST dataset. Here is the code:

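import tensorflow as tf
mnist = tf.keras.datasets.mnist
(image_train, label_train), (image_test, label_test) = mnist.load_data()
image_train, image_test = image_train / 255.0, image_test / 255.0  # scale pixels to [0, 1]
# The listing was incomplete: the hidden layer, optimizer, and loss below are assumed.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(image_train, label_train, epochs=5)

model.evaluate(image_test, label_test, verbose=2)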
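
The last layer in the listing above uses a softmax activation, which turns the 10 raw scores into probabilities that sum to 1. The predicted digit is the index with the highest probability. Here is a minimal pure-Python sketch of that step, using made-up scores rather than real model output:

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for the digits 0-9 (not real model output).
scores = [0.1, 0.3, 2.5, 0.2, 0.0, 0.4, 0.1, 4.0, 0.3, 0.2]
probabilities = softmax(scores)
predicted_digit = probabilities.index(max(probabilities))
print(predicted_digit, round(sum(probabilities), 6))
```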

2. Setting up the Raspberry Pi

I successfully connected the Raspberry Pi to a monitor, a keyboard, and a mouse. It is now ready to receive my model and start working.



3. Finding a dataset

I found a very good dataset for bird classification. It contains 275 bird species: 39,364 training images, 1,375 test images (5 per species), and 1,375 validation images.

I uploaded the dataset from Google Drive to Google Colab and did some data preprocessing. For the next step, I will transform the data and build my classifier. I decided to use PyTorch and the pre-trained VGG16 model.

Useful code for mounting Google Drive in Colab so the dataset can be read:

from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
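
Once the drive is mounted, a quick sanity check is to count the images in each species folder. This sketch assumes one sub-folder per species with `.jpg` files inside, which is an assumption about how the dataset is laid out. The function names are hypothetical too.

```python
from pathlib import Path


def count_images_per_species(split_dir, extension=".jpg"):
    """Return {species_name: image_count} for one split directory.

    Assumes one sub-folder per species (an assumption about the layout).
    """
    counts = {}
    for species_dir in sorted(Path(split_dir).iterdir()):
        if species_dir.is_dir():
            counts[species_dir.name] = sum(
                1 for f in species_dir.iterdir()
                if f.suffix.lower() == extension
            )
    return counts


def summarize(counts):
    """Return (number of species, total number of images)."""
    return len(counts), sum(counts.values())
```

Calling `summarize(count_images_per_species(test_dir))`, where `test_dir` is wherever the test split sits under `/content/gdrive`, should give 275 species and 1,375 images if the folder matches the description above.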

Second Milestone

My second milestone is building my machine learning model. Because the dataset I chose is too large, I randomly selected 20 bird species to form my training set. I used VGG16 as my base model and built my own classifier on top of it. My final model looks like this:

Model: "model"
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 224, 224, 3)]     0         
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
flatten (Flatten)            (None, 25088)             0         
softmax (Dense)              (None, 20)                501780    
Total params: 15,216,468
Trainable params: 501,780
Non-trainable params: 14,714,688
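
The parameter counts in the summary can be checked by hand. A 3x3 convolution with `c_in` input channels and `c_out` filters has `(3*3*c_in + 1) * c_out` parameters (the weights plus one bias per filter), and a dense layer has `(inputs + 1) * outputs`. The sketch below recomputes the totals from the layer shapes listed above. The function and variable names are only for illustration.

```python
def conv3x3_params(c_in, c_out):
    """Weights (3*3*c_in per filter) plus one bias per filter."""
    return (3 * 3 * c_in + 1) * c_out


def dense_params(n_in, n_out):
    """One weight per input-output pair plus one bias per output."""
    return (n_in + 1) * n_out


# (input channels, output channels) for each convolution, read off the summary.
vgg16_convs = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

frozen = sum(conv3x3_params(c_in, c_out) for c_in, c_out in vgg16_convs)
trainable = dense_params(7 * 7 * 512, 20)  # flattened 7x7x512 features -> 20 species

print(frozen, trainable, frozen + trainable)  # prints 14714688 501780 15216468
```

The numbers match the summary: only the final dense layer is trained, while the VGG16 convolution weights stay fixed.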

After building the model, I began training it. I ran training and testing together for 5 epochs, and I used a GPU to make it finish more quickly. After training, my model reached an accuracy of 98%.
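
Accuracy here means the fraction of images whose predicted species matches the true label. Here is a minimal sketch of that calculation, using made-up labels rather than the project's actual predictions:

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical class indices for 10 test images (not real results).
actual = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
predicted = [0, 1, 2, 3, 4, 5, 6, 7, 8, 0]
print(accuracy(predicted, actual))  # prints 0.9
```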


Final Milestone

My final milestone is a complete bird image classifier, which I present in the video below. I built a shell around my model that defines its input and output data, and the model now works successfully. You can see my code on GitHub.
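
One way such a shell could look is sketched below. It is not the code from the repository: `make_classifier`, `predict_fn`, and the species names are hypothetical stand-ins. The wrapper takes any function that returns one probability per class and turns its output into a species name with a confidence value.

```python
def make_classifier(predict_fn, class_names):
    """Wrap a model so it maps an image to (species_name, confidence).

    predict_fn: hypothetical callable returning one probability per class.
    class_names: species names in the same order as the model outputs.
    """
    def classify(image):
        probabilities = list(predict_fn(image))
        if len(probabilities) != len(class_names):
            raise ValueError("model output size does not match class_names")
        best = max(range(len(probabilities)), key=probabilities.__getitem__)
        return class_names[best], probabilities[best]
    return classify


# Stand-in for a trained model: always returns the same probabilities.
def fake_predict(image):
    return [0.05, 0.90, 0.05]


classify = make_classifier(fake_predict, ["ALBATROSS", "BALD EAGLE", "CROW"])
print(classify(image=None))
```

Keeping the model behind a single `classify` function means the same wrapper works whether the prediction comes from a notebook or from the Raspberry Pi.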