What better way to impress your significant other?

Build a Lego Classifier with fastai

!pip install kaggle
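
The Kaggle CLI needs an API token before it can download anything. In case you haven't set that up yet, here is a minimal sketch of putting the token where the CLI expects it (this assumes you have already downloaded a kaggle.json token from your Kaggle account page and uploaded it next to this notebook; adjust the source path if yours lives elsewhere):

from pathlib import Path
import shutil

# The Kaggle CLI looks for its token in ~/.kaggle/kaggle.json by default.
kaggle_dir = Path.home()/'.kaggle'
kaggle_dir.mkdir(exist_ok=True)
shutil.copy('kaggle.json', kaggle_dir/'kaggle.json')
(kaggle_dir/'kaggle.json').chmod(0o600)  # the CLI warns if the token is world-readable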

We want to save the dataset into the folder /notebooks/storage/data/Lego-Classification:

!kaggle datasets download -d ihelon/lego-minifigures-classification -p /notebooks/storage/data/Lego-Classification

Unzip the data using Python's pathlib library

from pathlib import Path
p = Path('/notebooks/storage/data/Lego-Classification')
filename = Path('/notebooks/storage/data/Lego-Classification/lego-minifigures-classification.zip')

Just as before, we can use bash commands from within Jupyter Notebook. So let's do that to unzip our data: -q means quiet mode, and -d points to the directory to unzip the data into. Just see how well Python's pathlib and bash work together!

!unzip -q {str(filename)} -d {str(p/"train")}
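
If you prefer to stay entirely in Python, the standard library's zipfile module does the same job; a small equivalent sketch using the paths defined above:

import zipfile

# Extract the downloaded archive into the 'train' subfolder, just like the unzip call above.
with zipfile.ZipFile(filename) as zf:
    zf.extractall(p/"train")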

Imports

from fastbook import *
from fastai.vision.widgets import *
import pandas as pd

Let's now use fastai's "get_image_files()" function to see what the unzipped data looks like in our destination path:

fns = get_image_files(p/"train")
fns
(#316) [Path('/notebooks/storage/data/Lego-Classification/train/jurassic-world/0002/009.jpg'),Path('/notebooks/storage/data/Lego-Classification/train/jurassic-world/0002/010.jpg'),Path('/notebooks/storage/data/Lego-Classification/train/jurassic-world/0002/006.jpg'),Path('/notebooks/storage/data/Lego-Classification/train/jurassic-world/0002/001.jpg'),Path('/notebooks/storage/data/Lego-Classification/train/jurassic-world/0002/011.jpg'),Path('/notebooks/storage/data/Lego-Classification/train/jurassic-world/0002/005.jpg'),Path('/notebooks/storage/data/Lego-Classification/train/jurassic-world/0002/004.jpg'),Path('/notebooks/storage/data/Lego-Classification/train/jurassic-world/0002/013.jpg'),Path('/notebooks/storage/data/Lego-Classification/train/jurassic-world/0002/007.jpg'),Path('/notebooks/storage/data/Lego-Classification/train/jurassic-world/0002/008.jpg')...]

Remember, we put the data into the directory '/notebooks/storage/data/Lego-Classification'. A quick look at the output suggests the data is stored as follows: first the genre of the image (marvel/jurassic-world), then the ID of the figure (0001/0002 etc.). Within these folders we find many different pictures of that figure (001.jpg/002.jpg and so on).
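
Since get_image_files() returns pathlib Path objects, we can quickly double-check that structure by picking apart the first path from fns above (just a small illustrative sketch):

sample = fns[0]
sample.name                 # '009.jpg'          -> the individual picture
sample.parent.name          # '0002'             -> the ID of the figure
sample.parent.parent.name   # 'jurassic-world'   -> the genre

And let's confirm it by looking at the metadata: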

df = pd.read_csv(f'{p}/index.csv', index_col=0)
df.tail(5)
path class_id train-valid
311 marvel/0014/008.jpg 28 valid
312 marvel/0014/009.jpg 28 valid
313 marvel/0014/010.jpg 28 valid
314 marvel/0014/011.jpg 28 valid
315 marvel/0014/012.jpg 28 valid
df_metadata = pd.read_csv(f'{p}/metadata.csv', usecols=['class_id', 'lego_names', 'minifigure_name'])
df_metadata.head()
class_id lego_names minifigure_name
0 1 ['Spider Mech vs. Venom'] SPIDER-MAN
1 2 ['Spider Mech vs. Venom'] VENOM
2 3 ['Spider Mech vs. Venom'] AUNT MAY
3 4 ['Spider Mech vs. Venom'] GHOST SPIDER
4 5 ["Yoda's Hut"] YODA

Indeed, that's how this dataset is structured. What we want is a data structure that fastai's DataBlock can easily work with: something that gives us the filename, the label, and a flag telling us which images are for training and which are for validation. Luckily, we can get exactly that by combining the metadata:

datablock_df = pd.merge(df, df_metadata, left_on='class_id', right_on='class_id').loc[:,['path', 'class_id', 'minifigure_name', 'train-valid']]
datablock_df['is_valid'] = datablock_df['train-valid']=='valid'
datablock_df.head()
path class_id minifigure_name train-valid is_valid
0 marvel/0001/001.jpg 1 SPIDER-MAN train False
1 marvel/0001/002.jpg 1 SPIDER-MAN valid True
2 marvel/0001/003.jpg 1 SPIDER-MAN train False
3 marvel/0001/004.jpg 1 SPIDER-MAN train False
4 marvel/0001/005.jpg 1 SPIDER-MAN train False
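
Before handing this over to fastai, a quick sanity check doesn't hurt: how many images end up in training and how many in validation? A small optional sketch (the exact counts depend on the dataset version):

# Count the rows flagged for validation vs. training.
datablock_df['is_valid'].value_counts()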

The fastai documentation gives us a brief checklist to go through before we can make optimal use of the DataBlock:

what are the types of our inputs and targets? Images and labels.
where is the data? In a dataframe.
how do we know if a sample is in the training or the validation set? A column of our dataframe.
how do we get an image? By looking at the column path.
how do we know the label of an image? By looking at the column minifigure_name.
do we want to apply a function to a given sample? Yes, we need to resize everything to a given size.
do we want to apply a function to a batch after it's created? Yes, we want data augmentation.

lego_block = DataBlock(blocks=(ImageBlock, CategoryBlock),
                   splitter=ColSplitter(),
                   get_x=lambda x:p/"train"/f'{x[0]}',
                   get_y=lambda x:x[2],
                   item_tfms=Resize(224),
                   batch_tfms=aug_transforms())

Our DataBlock is now called lego_block. See how nicely it all fits together?

Let me briefly explain what the different steps within our lego_block are doing: first we tell the lego_block what to split the data on (ColSplitter defaults to col='is_valid'). get_x takes the path column (x[0]) and combines it with our base path p and the 'train' folder the images live in. get_y tells the lego_block where to find the labels (x[2], i.e. the minifigure_name column). Finally, item_tfms resizes every image to the same size and batch_tfms applies data augmentation (check out the fastai docs for more information).
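
If anything in this pipeline ever misbehaves, fastai can walk a single sample through every step and print the intermediate results, which makes debugging much easier; a small optional sketch:

# Shows how one row of datablock_df flows through get_x, get_y, the item and batch transforms.
lego_block.summary(datablock_df)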

dls = lego_block.dataloaders(datablock_df)
dls.show_batch()

Glorious!
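
If you want to double-check which classes the DataLoaders actually picked up, the vocabulary is one attribute away:

# The list of minifigure names our model will learn to predict.
dls.vocab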

fastai tries to make our life easier. This blog post is meant to show you how quickly and easily you can get a great classifier with it. In upcoming posts I will try to explain in more detail what is going on behind the scenes. But for now, let's enjoy how fast we can build our classifier with fastai!

learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(20)
epoch train_loss valid_loss error_rate time
0 5.044340 7.069784 0.980263 00:04
epoch train_loss valid_loss error_rate time
0 4.333549 5.407881 0.960526 00:05
1 4.193551 4.404543 0.947368 00:04
2 3.994802 3.605752 0.907895 00:04
3 3.721432 2.941005 0.802632 00:04
4 3.428093 2.336661 0.638158 00:05
5 3.092875 1.841397 0.532895 00:04
6 2.779624 1.470707 0.434211 00:05
7 2.484196 1.200388 0.348684 00:05
8 2.216630 1.011226 0.263158 00:05
9 1.988927 0.901224 0.236842 00:04
10 1.794780 0.819948 0.217105 00:05
11 1.624516 0.756500 0.184211 00:04
12 1.478024 0.716581 0.157895 00:05
13 1.354157 0.688189 0.164474 00:04
14 1.244102 0.673431 0.164474 00:05
15 1.149104 0.662224 0.164474 00:04
16 1.062147 0.652154 0.164474 00:04
17 0.985224 0.654423 0.177632 00:04
18 0.917764 0.653068 0.177632 00:05
19 0.858115 0.652137 0.184211 00:04
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)
interp.most_confused(min_val=2)
[('RON WEASLEY', 'HARRY POTTER', 2), ('SPIDER-MAN', 'FIREFIGHTER', 2)]

Not too bad, I would say. However, seeing an image of Ron Weasley and predicting it to be Harry Potter - I'm not sure how much that will impress your significant other. On the other hand, Captain America is predicted correctly 100% of the time.
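
If you want to see exactly which pictures the model got most wrong (and how confident it was), the interpretation object can plot them for us; a small optional sketch:

# Shows the nine validation images with the highest loss,
# each labelled with prediction, actual class, loss and probability.
interp.plot_top_losses(9, nrows=3)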

But we can still try to improve our model by training for a few more epochs and then fine-tuning with unfrozen weights. Let's check this out:

learn.fit_one_cycle(3, 3e-3)
epoch train_loss valid_loss error_rate time
0 0.061034 0.536347 0.151316 00:05
1 0.046480 0.631286 0.157895 00:04
2 0.039729 0.572698 0.157895 00:05
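
A quick aside: the learning rate of 3e-3 above was set by hand. If you'd rather not guess, fastai's learning-rate finder can suggest a value (and you can run it again after unfreezing to pick the lower rate for the next step); a small optional sketch:

# Plots the loss over a range of learning rates and returns a suggested value.
learn.lr_find()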

Then we unfreeze the parameters and train at a much lower learning rate:

learn.unfreeze()
learn.fit_one_cycle(2, lr_max=1e-5)
epoch train_loss valid_loss error_rate time
0 0.014817 0.483588 0.125000 00:05
1 0.016834 0.425858 0.098684 00:04

Wow! Down to an error rate of only about 10%. I think that's quite impressive! Let's look at the confusion matrix:

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)

With these results I am sure you will impress your significant other!
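
One last practical tip: if you want to keep this model around (for example for the small web app mentioned below), fastai can export the whole Learner, data pipeline included, to a single file. A minimal sketch (the filename is my own choice). One caveat: because our DataBlock uses lambda functions for get_x and get_y, export may complain that it cannot pickle them; swapping the lambdas for small named functions avoids that.

# Saves the trained model together with its preprocessing pipeline;
# it can be restored later with load_learner('lego_classifier.pkl').
learn.export('lego_classifier.pkl')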

In one of the next posts I will show you how to use Jupyter to easily set up a small Web App with the help of Binder. So stay tuned!

Lasse