How to classify Lego figures
!pip install kaggle
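If you haven't used the Kaggle API on this machine before, the CLI expects an API token at ~/.kaggle/kaggle.json. A minimal sketch, assuming you have already downloaded kaggle.json from your Kaggle account page into the current working directory:
# Put the Kaggle API token where the CLI looks for it
from pathlib import Path
import shutil

kaggle_dir = Path.home() / '.kaggle'
kaggle_dir.mkdir(exist_ok=True)
shutil.copy('kaggle.json', kaggle_dir / 'kaggle.json')
(kaggle_dir / 'kaggle.json').chmod(0o600)  # the CLI warns about world-readable tokens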
We want to save the dataset into the folder /notebooks/storage/data/Lego-Classification:
!kaggle datasets download -d ihelon/lego-minifigures-classification -p /notebooks/storage/data/Lego-Classification
from pathlib import Path
p = Path('/notebooks/storage/data/Lego-Classification')
filename = p / 'lego-minifigures-classification.zip'
Just as before, we can run bash commands from within Jupyter Notebook, so let's use one to unzip our data. -q stands for quiet mode, and -d points to the directory where the data should be extracted. Just see how well Python's pathlib and bash work together!
!unzip -q {str(filename)} -d {str(p/"train")}
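If you'd rather stay in pure Python, the standard library's zipfile module does the same job. A minimal sketch:
import zipfile

# Extract the archive into the train subfolder, mirroring the unzip command above
with zipfile.ZipFile(filename) as zf:
    zf.extractall(p / "train")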
from fastbook import *
from fastai.vision.widgets import *
import pandas as pd
Let's now use fastai's get_image_files() function to see what the unzipped data looks like in our destination path:
fns = get_image_files(p/"train")
fns
Remember, we put the data into our directory '/notebooks/storage/data/Lego-Classification'. A quick look suggests the data is organized as follows: first the theme of the image (marvel/jurassic-world), then the class of the figure (0001/0002 etc.). Within these folders we find many different pictures of that figure (001.jpg/002.jpg and so on). Let's confirm this by looking at the metadata.
df = pd.read_csv(f'{p}/index.csv', index_col=0)
df.tail(5)
df_metadata = pd.read_csv(f'{p}/metadata.csv', usecols=['class_id', 'lego_names', 'minifigure_name'])
df_metadata.head()
Indeed, that's how this dataset is structured. What we want is a data structure that fastai's DataBlock can easily work with. So we need something that gives us the filename, the label, and a flag indicating whether a sample belongs to the training or the validation set. Luckily we can get exactly this by combining the metadata:
datablock_df = pd.merge(df, df_metadata, on='class_id').loc[:, ['path', 'class_id', 'minifigure_name', 'train-valid']]
datablock_df['is_valid'] = datablock_df['train-valid']=='valid'
datablock_df.head()
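Before wiring this dataframe into a DataBlock, a quick sanity check on the split can't hurt. A small sketch (the exact counts depend on your copy of the dataset):
# Quick sanity check: how many samples go to training vs. validation?
datablock_df['is_valid'].value_counts()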
fastai gives us a brief checklist of what to think about before we can make optimal use of the DataBlock:
What are the types of our inputs and targets? Images and labels.
Where is the data? In a dataframe.
How do we know if a sample is in the training or the validation set? A column of our dataframe.
How do we get an image? By looking at the column path.
How do we know the label of an image? By looking at the column minifigure_name.
Do we want to apply a function to a given sample? Yes, we need to resize everything to a given size.
Do we want to apply a function to a batch after it's created? Yes, we want data augmentation.
lego_block = DataBlock(blocks=(ImageBlock, CategoryBlock),
                       splitter=ColSplitter(),
                       get_x=lambda r: p/"train"/r['path'],
                       get_y=lambda r: r['minifigure_name'],
                       item_tfms=Resize(224),
                       batch_tfms=aug_transforms())
Now our DataBlock is called lego_block. See how nicely it all fits together?
Let me briefly explain what the different steps within our lego_block are doing: first we tell the lego_block what to split our data on (ColSplitter defaults to col='is_valid'). get_x takes the path column of each row and combines it with our path p and the 'train' folder the images are located in. get_y tells the lego_block where to find the labels in our dataset (the minifigure_name column). Finally, we resize all images to the same size and apply data augmentation on each batch (check out the fastai docs for more information).
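If anything in the block misbehaves, fastai's summary method walks one sample through the whole pipeline step by step, which makes debugging much easier:
# Trace one sample through every step of the DataBlock pipeline
lego_block.summary(datablock_df)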
dls = lego_block.dataloaders(datablock_df)
dls.show_batch()
Glorious!
fastai tries to make our life easier. This blog is intended to show you how easily and quickly you can get a great classifier with it. In upcoming posts I will try to explain better what is going on behind the scenes. But for now, let's enjoy how fast we can build our classifier with fastai!
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(20)
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)
interp.most_confused(min_val=2)
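Another handy tool on the interpretation object is plot_top_losses, which shows the images the model was most wrong about, together with the predicted label, actual label, loss, and probability:
# Display the nine samples with the highest loss
interp.plot_top_losses(9, figsize=(12,12))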
Not too bad, I would say. However, seeing an image of Ronald Weasley and predicting it to be Harry Potter - I'm not sure how much this will impress your significant other. On the other hand, Captain America is predicted correctly 100% of the time.
But we can still try to improve our model by unfreezing the weights. Let's check this out:
learn.fit_one_cycle(3, 3e-3)
Then we will unfreeze the parameters and train at a lower learning rate:
learn.unfreeze()
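Where does the learning rate below come from? Instead of guessing, you could let fastai's learning-rate finder suggest a value now that the parameters are unfrozen. A quick sketch (the plot and suggestion vary from run to run):
# Plot loss against learning rate to pick a sensible lr_max for the unfrozen phase
learn.lr_find()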
learn.fit_one_cycle(2, lr_max=1e-5)
Wow! Down to an error rate of only 10%. I think that's quite impressive! Let's see the confusion matrix:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)
With these results I am sure you will impress your significant other!
In one of the next posts I will show you how to use Jupyter to easily set up a small Web App with the help of Binder. So stay tuned!
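If you want to keep the trained model around for that web app, fastai can export the whole Learner in one line. A sketch (the filename is my choice; note that a DataBlock built with lambdas can't be pickled, so you may need to replace the lambdas with named functions before exporting):
# Serialize the Learner (model weights plus data pipeline) for later inference
learn.export('lego_classifier.pkl')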
Lasse