Getting Started - Creating Categorical Image Predictor Models
My First three FastAI models in under 20 mins.
- Overview of this blog
- Step 1 : Installing fastai's latest version
- Step 2 : Defining an Image Search Function
- Step 3 : Downloading & Viewing a Searched image
- Cleaning the directories
- Step 4 - Organising the data
- Step 5 - Creating the DataBlock
- Step 6 - Creating the Vision Learner
- Step 7 - Predictions
- Step 8 - Using a different dataset and training the model
- Observations from all models
- End
- Appendix
Overview of this blog
This blog post contains a simple fastai model that I created by following lesson 1 of the latest fast.ai course. I have used a different dataset here (images of people's emotions, fetched using DuckDuckGo) and have closely followed the steps used by Jeremy to train an image classifier and make predictions with it. I'm new to deep learning as well as to blogging, so please keep the following in mind.
Note :
- This blog post was written with the intention of learning how to use Jupyter Notebooks with fastpages to create blog posts.
- This is not a very detailed tutorial, as my aim was just to get hands-on with fastai's code and create a simple blog post out of it.
- Future posts will focus more on the definitions of concepts and techniques, with a cell-by-cell walkthrough of the notebooks that I create as I follow the course.
- I have added an "Appendix" section at the bottom of the post with some useful (if basic) commands that I learned while working on this notebook.
Step 1 : Installing fastai's latest version
!pip install -Uqq fastbook
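Step 2 : Defining an Image Search Function
from fastcore.all import *
import time

def search_images(term, max_images=200):
    "Return up to `max_images` image URLs from a DuckDuckGo search for `term`"
    url = 'https://duckduckgo.com/'
    res = urlread(url,data={'q':term})
    # The search page contains a `vqd` token that is passed on to the image endpoint
    searchObj = re.search(r'vqd=([\d-]+)\&', res)
    requestUrl = url + 'i.js'
    params = dict(l='us-en', o='json', q=term, vqd=searchObj.group(1), f=',,,', p='1', v7exp='a')
    urls,data = set(),{'next':1}
    # Keep requesting result pages until we have enough URLs or run out of pages
    while len(urls)<max_images and 'next' in data:
        data = urljson(requestUrl,data=params)
        urls.update(L(data['results']).itemgot('image'))
        # The last page has no 'next' key, so only build the next URL when it exists
        if 'next' in data: requestUrl = url + data['next']
        time.sleep(0.2)  # short pause between requests
    return L(urls)[:max_images]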
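The token-extraction step inside search_images can be looked at in isolation. The sketch below runs the same regular expression against a made-up response string (the HTML fragment and the token value are invented for illustration and are not real DuckDuckGo output). The helper name extract_vqd is hypothetical, and it adds a guard for the case where no token is found, which the function above does not have.

```python
import re

def extract_vqd(html):
    """Return the vqd token embedded in a search page, or None if absent."""
    match = re.search(r'vqd=([\d-]+)\&', html)
    return match.group(1) if match else None

# Hypothetical response fragment, made up for this example.
sample = "<script>load('/d.js?q=sad+human&vqd=4-1234567890&kl=us-en')</script>"

print(extract_vqd(sample))                        # the captured token
print(extract_vqd("<html>no token here</html>"))  # None when nothing matches
```

Without such a guard, a page that lacks the token would make searchObj.group(1) fail with an AttributeError, which is a useful thing to know if the search ever stops working.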
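Step 3 : Downloading & Viewing a Searched image
from fastdownload import download_url
# Assumed search term: the cell that originally defined `urls` is missing from this post
urls = search_images('sad human photos', max_images=1)
dest = 'human_sad.jpg'
download_url(urls[0], dest, show_progress=False)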
from fastai.vision.all import *
im = Image.open(dest)
im.to_thumb(256,256)
download_url(search_images('happy human photos', max_images=1)[0], 'human_happy.jpg', show_progress=False)
Image.open('human_happy.jpg').to_thumb(256,256)
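Cleaning the directories
import shutil
downloaded_path = Path('happy_sad_angry_downloaded')
resized_path = Path("happy_sad_angry_resized")
# ignore_errors=True avoids a failure when the folders do not exist yet
shutil.rmtree(downloaded_path, ignore_errors=True)
shutil.rmtree(resized_path, ignore_errors=True)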
Step 4 - Organising the data
- Download the images for each category into the happy_sad_angry_downloaded directory
- Resize all the downloaded images and save the resized copies in the happy_sad_angry_resized directory
- Some photos might not download correctly, which could cause model training to fail, so we remove them:
searches = 'happy human','sad human','angry human'
downloaded_path = Path('happy_sad_angry_downloaded')
resized_path = Path("happy_sad_angry_resized")
for o in searches:
    dest_downloaded = (downloaded_path/o)
    dest_downloaded.mkdir(exist_ok=True, parents=True)
    download_images(dest_downloaded, urls=search_images(f'{o} photo'))
    resize_images(downloaded_path/o, max_size=400, dest=resized_path/o)
failed = verify_images(get_image_files(resized_path))
failed.map(Path.unlink)
len(failed)
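After the failed images are removed, it can help to check how many images are left in each category, since very uneven counts may affect training. The sketch below is a plain-Python helper for that; count_per_category is a hypothetical name and is not part of fastai. The demo runs on a temporary folder filled with empty placeholder files rather than real downloads.

```python
from pathlib import Path
from collections import Counter
import tempfile

def count_per_category(root, suffixes=(".jpg", ".jpeg", ".png")):
    """Count image files under root, grouped by their parent folder name."""
    counts = Counter()
    for f in Path(root).rglob("*"):
        if f.is_file() and f.suffix.lower() in suffixes:
            counts[f.parent.name] += 1
    return dict(counts)

# Demo on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    layout = {"happy human": 3, "sad human": 2, "angry human": 1}
    for name, n in layout.items():
        folder = Path(tmp) / name
        folder.mkdir(parents=True)
        for i in range(n):
            (folder / f"{i}.jpg").touch()
    demo_counts = count_per_category(tmp)

print(demo_counts)
```

In the notebook, the same helper could be pointed at resized_path once the download loop has finished.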
Step 5 - Creating the DataBlock
To train a model, we'll need DataLoaders, an object that contains:
- a training set (the images used to create a model), and
- a validation set (the images used to check the accuracy of a model, not used during training).
In fastai we can create that easily using a DataBlock, and view sample images from it:
DataBlock API
- The inputs are going to be images (ImageBlock) and the outputs are going to be categories (CategoryBlock).
- get_image_files is used to get the items we require.
- We define a splitter to split the dataset into training and validation sets. In this case, we are using a RandomSplitter with 20% of the data held out for validation.
- get_y provides the label for each image. Here, parent_label is the name of the parent folder of each image, i.e., happy human, sad human, angry human.
- Before training, resize each image to 192x192 pixels by "squishing" it (as opposed to cropping it).
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=[Resize(192, method='squish')]
).dataloaders(resized_path)
dls.show_batch(max_n=6)
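To make the labelling and splitting steps less abstract, here is a rough plain-Python sketch of the two ideas: taking the label from the parent folder name, and holding out a random 20% of items with a fixed seed. The functions parent_label_sketch and random_split_sketch are hypothetical stand-ins written for this post; fastai's own parent_label and RandomSplitter differ in detail, and the file paths are made up.

```python
import random
from pathlib import Path

def parent_label_sketch(path):
    """Label an image by the name of the folder that contains it."""
    return Path(path).parent.name

def random_split_sketch(items, valid_pct=0.2, seed=42):
    """Shuffle item indices with a fixed seed and hold out valid_pct of them."""
    idxs = list(range(len(items)))
    random.Random(seed).shuffle(idxs)
    cut = int(valid_pct * len(items))
    return idxs[cut:], idxs[:cut]  # (training indices, validation indices)

# Made-up file paths standing in for the resized images.
files = [f"happy_sad_angry_resized/{c}/{i}.jpg"
         for c in ("happy human", "sad human", "angry human")
         for i in range(10)]

train_idx, valid_idx = random_split_sketch(files)
print(len(train_idx), len(valid_idx))
print(parent_label_sketch(files[0]))
```

Fixing the seed means the same images land in the validation set on every run, which makes results from different training runs easier to compare.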
Step 6 - Creating the Vision Learner
Here is where the actual magic happens. We don't yet know the calculations going on underneath, so let's call it magic for now.
- We now train the model using the DataLoaders that we created in the previous step.
- We define error_rate as our metric. It is the fraction of validation images that the model classifies incorrectly (i.e., 1 - accuracy).
- We provide resnet18 as the architecture (a pre-trained model) to train our model. This is the basis of transfer learning, which will be covered in later posts.
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(3)
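For reference, fastai's error_rate metric is 1 minus accuracy, i.e. the fraction of validation images that are classified incorrectly. The sketch below computes the same quantity in plain Python on made-up labels; error_rate_sketch is a hypothetical helper and not fastai's implementation.

```python
def error_rate_sketch(predicted, actual):
    """Fraction of predictions that do not match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)

# Made-up validation labels and predictions for five images.
actual    = ["happy", "sad", "angry", "happy", "sad"]
predicted = ["happy", "sad", "happy", "happy", "angry"]

err = error_rate_sketch(predicted, actual)
print(f"error_rate = {err:.2f}, accuracy = {1 - err:.2f}")
```

Here two of the five made-up predictions are wrong, so the error rate is 0.4 and the accuracy is 0.6.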
Observations:
- The error_rate is currently around 0.4 in the model that I trained. However, in the official lesson 1 notebook, I saw an error_rate of roughly 0.01 after 3 epochs. The difference between the two error rates looks noteworthy. I suspect that the dataset isn't yet good enough (or augmented enough) for the model to train well.
- I have seen that the error_rate goes down in the 2nd epoch and then goes up again in the 3rd. I expected the loss to decrease (or the accuracy to increase) after each epoch. What does this pattern imply?
I'll wait for the subsequent lectures to get the answers to the above questions!
Step 7 - Predictions
is_happy,_,probs = learn.predict(PILImage.create('human_happy.jpg'))
print(f"This is a: {is_happy}.")
# probs follows the order of dls.vocab: index 1 is 'happy human'
print(f"Probability the person is happy: {probs[1]:.4f}")
print(learn.predict(PILImage.create('human_happy.jpg')))
is_sad,_,probs = learn.predict(PILImage.create('human_sad.jpg'))
print(f"This is a: {is_sad}.")
# index 2 is 'sad human'
print(f"Probability the person is sad: {probs[2]:.4f}")
print(learn.predict(PILImage.create('human_sad.jpg')))
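The index used with probs depends on the order of the classes. As far as I understand, fastai sorts the folder names alphabetically to build dls.vocab, so hard-coded positions like 1 and 2 only hold for this particular set of names. The sketch below shows a name-based lookup on a plain list of made-up probabilities; prob_of is a hypothetical helper, and the numbers are not real model output.

```python
# Assumed class names; in the notebook, check dls.vocab for the real order.
vocab = sorted(["happy human", "sad human", "angry human"])

# Made-up probabilities in vocab order, standing in for the tensor
# returned by learn.predict.
probs = [0.0033, 0.0001, 0.9966]

def prob_of(label, vocab, probs):
    """Look up a class probability by name rather than by position."""
    return probs[vocab.index(label)]

# Index of the largest probability, i.e. the predicted class.
best = max(range(len(probs)), key=probs.__getitem__)

print(vocab)
print(f"Predicted: {vocab[best]}")
print(f"Probability the person is sad: {prob_of('sad human', vocab, probs):.4f}")
```

Looking the probability up by name keeps the print statements correct even if a category is added or renamed later.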
Step 8 - Using a different dataset and training the model
Displaying the Tiger Image Downloaded for prediction
# Downloading a tiger image to predict
from fastdownload import download_url
urls = search_images('tiger photos', max_images=10)
dest = 'tiger.jpg'
download_url(urls[0], dest, show_progress=False)
from fastai.vision.all import *
im = Image.open(dest)
im.to_thumb(256,256)
Displaying the Cat Image Downloaded for prediction
# Downloading a cat image to predict
download_url(search_images('cat photos', max_images=1)[0], 'cat.jpg', show_progress=False)
Image.open('cat.jpg').to_thumb(256,256)
Downloading all the Tiger & Cat Images in the respective parent folder and displaying 6 images
searches = 'tiger','cat'
downloaded_path = Path('tiger_cat_downloaded')
resized_path = Path("tiger_cat_resized")
for o in searches:
    dest_downloaded = (downloaded_path/o)
    dest_downloaded.mkdir(exist_ok=True, parents=True)
    download_images(dest_downloaded, urls=search_images(f'{o} photo'))
    resize_images(downloaded_path/o, max_size=400, dest=resized_path/o)
failed = verify_images(get_image_files(resized_path))
failed.map(Path.unlink)
len(failed)
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=[Resize(192, method='squish')]
).dataloaders(resized_path)
dls.show_batch(max_n=6)
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(3)
Image.open('tiger.jpg').to_thumb(256,256)
print(learn.predict(PILImage.create('tiger.jpg')))
is_tiger,_,probs = learn.predict(PILImage.create('tiger.jpg'))
print(f"This is a: {is_tiger}.")
print(f"Probability : {probs[1]:.4f}")
Image.open('cat.jpg').to_thumb(256,256)
print(learn.predict(PILImage.create('cat.jpg')))
is_tiger,_,probs = learn.predict(PILImage.create('cat.jpg'))
print(f"This is a: {is_tiger}.")
print(f"Probability : {probs[0]:.4f}")
Displaying the BackPack Image Downloaded for prediction
# Downloading a backpack image to predict
from fastdownload import download_url
urls = search_images('ladies backpack', max_images=10)
dest = 'backpack.jpg'
download_url(urls[0], dest, show_progress=False)
from fastai.vision.all import *
im = Image.open(dest)
im.to_thumb(256,256)
Displaying the Purse Image Downloaded for prediction
# Downloading a purse image to predict
download_url(search_images('ladies purse', max_images=1)[0], 'purse.jpg', show_progress=False)
Image.open('purse.jpg').to_thumb(256,256)
- As you can see, the purse image contains multiple purses as well as a lady. Since this notebook doesn't do any explicit data cleaning, we'll leave the data as it is and proceed further.
Downloading all the BackPack & Purse Images in the respective parent folder and displaying 6 images
searches = 'ladies backpack','ladies purse'
downloaded_path = Path('backpack_purse_downloaded')
resized_path = Path("backpack_purse_resized")
for o in searches:
    dest_downloaded = (downloaded_path/o)
    dest_downloaded.mkdir(exist_ok=True, parents=True)
    download_images(dest_downloaded, urls=search_images(f'{o} photo'))
    resize_images(downloaded_path/o, max_size=400, dest=resized_path/o)
failed = verify_images(get_image_files(resized_path))
failed.map(Path.unlink)
len(failed)
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=[Resize(192, method='squish')]
).dataloaders(resized_path)
dls.show_batch(max_n=6)
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(3)
Image.open('backpack.jpg').to_thumb(256,256)
print(learn.predict(PILImage.create('backpack.jpg')))
is_backpack,_,probs = learn.predict(PILImage.create('backpack.jpg'))
print(f"This is a: {is_backpack}.")
print(f"Probability : {probs[0]:.4f}")
Image.open('purse.jpg').to_thumb(256,256)
print(learn.predict(PILImage.create('purse.jpg')))
is_backpack,_,probs = learn.predict(PILImage.create('purse.jpg'))
print(f"This is a: {is_backpack}.")
print(f"Probability : {probs[1]:.4f}")
Observations from all models
- Even though the underlying code remained identical for all three datasets, the trained models had significantly different error_rate and valid_loss values. We can reason about why the accuracy of the "Tiger vs Cat" model was the highest compared to the "Backpack vs Purse" and "Angry vs Sad vs Happy Emotion" models. The most likely reason for the difference is that we did almost no data cleaning on any of the images: while the cat and tiger images were already good enough for the model to train on, the other images needed further preprocessing and augmentation. For example, some images in the "happy human" dataset contain multiple people, and some "purse" images contain multiple purses along with a lady.
- All the models classified their sample test images correctly, with almost no data cleaning or feature engineering.
End
As you can see, we have successfully trained three models and made predictions with them in well under 20 mins. A couple of the models aren't quite good enough yet. What could be the possible reasons? I'll wait for the subsequent lectures to find out! This was a very brief introductory post on image classification models in fastai, with almost no tweaks to the parameters, pre-trained models, or data augmentation.
Appendix
The section below is completely optional. It lists some additional commands that tend to be useful in general, or that were completely new to me while doing this project.
- This cell can be run if you want to delete a whole folder along with its contents. It deletes all directories and files under the given path.
import shutil
path = Path('resized_fruits')
shutil.rmtree(path)
- Deleting only Files in a directory
flag_search = 'happy person'
path = Path('happy_sad_angry')
dest_flag_search = (path/flag_search) # Path to the "happy person" folder
files = os.listdir(dest_flag_search)
for fi in files:
    print(fi)
    os.unlink(dest_flag_search/fi)
- Deleting a Folder (Note: This will only work when the folders are empty)
searches = 'happy person','sad person','angry person'
path = Path('happy_sad_angry')
for o in searches:
    os.rmdir(path/o)
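The difference between os.rmdir and shutil.rmtree can be seen in a small self-contained demo. It works inside a throwaway temporary directory, so nothing from the notebook is touched; the folder and file names are placeholders.

```python
import os
import shutil
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())          # throwaway directory for the demo
folder = base / "happy person"
folder.mkdir()
(folder / "0.jpg").touch()               # empty placeholder file

try:
    os.rmdir(folder)                     # fails: the folder still has a file in it
    rmdir_failed = False
except OSError:
    rmdir_failed = True

shutil.rmtree(folder)                    # removes the folder and its contents
folder_gone = not folder.exists()

shutil.rmtree(base)                      # clean up the throwaway directory
print(rmdir_failed, folder_gone)
```

So os.rmdir is the safer choice when a folder is expected to be empty, while shutil.rmtree is for deliberately wiping everything under a path.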
Challenges/Errors Faced
- Same File error - See post : https://forums.fast.ai/t/same-file-path-error-while-resizing-images-lesson-1/97601
Note: While running the resize_images() method, the resized images were written with the same file name and path as the downloaded images, which produced the error. I'm not sure, though, why this error didn't appear in the original notebook for lesson 1.
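One way to catch this situation early, assuming the error comes from the download folder and the resize destination resolving to the same location, is to compare the two paths before resizing. The sketch below is a hypothetical guard written for this post (check_distinct_dirs is not a fastai function), and I have not confirmed that it covers every cause of the error in the forum thread.

```python
from pathlib import Path

def check_distinct_dirs(src, dest):
    """Raise early if src and dest resolve to the same directory."""
    src, dest = Path(src).resolve(), Path(dest).resolve()
    if src == dest:
        raise ValueError(f"source and destination are the same folder: {src}")
    return src, dest

# Separate folders pass the check.
check_distinct_dirs("happy_sad_angry_downloaded", "happy_sad_angry_resized")

# The same folder, written two different ways, is caught.
try:
    check_distinct_dirs("happy_sad_angry", "./happy_sad_angry")
    same_dir_caught = False
except ValueError:
    same_dir_caught = True

print(same_dir_caught)
```

Keeping separate downloaded and resized folders, as the main notebook does, sidesteps the problem altogether.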
Useful Commands learned
OS Specific
- To create a path to a folder or file :-
path = Path('happy_sad_angry/abc.jpg')
- To delete a directory :-
os.rmdir(<path>)
- To delete files in a folder :-
os.unlink(<filepath>)
- To list directories and files in a folder
os.listdir(<path>)
- To install latest versions of a library
!pip install -Uqq fastbook
- To get the OS environment variables (you can also use .get() to look up a single key) :-
os.environ
os.environ.get(<key>)
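Note that os.environ is a mapping rather than a function, so it is indexed or queried with .get() instead of being called. A small sketch, using made-up variable names that exist only for this example:

```python
import os

# HYPOTHETICAL_DATA_DIR is a made-up variable name for this example.
os.environ["HYPOTHETICAL_DATA_DIR"] = "happy_sad_angry_resized"

# .get() returns the value, or a default when the variable is not set.
data_dir = os.environ.get("HYPOTHETICAL_DATA_DIR")
missing = os.environ.get("HYPOTHETICAL_UNSET_VARIABLE", "fallback")

print(data_dir)
print(missing)
```

Using .get() with a default avoids the KeyError that os.environ[<key>] raises for a variable that is not set.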
FastAI Specific
- To download using URL
from fastdownload import download_url
download_url(<url>, <dest_file>, show_progress=False)
- FastAI's vision imports
from fastai.vision.all import *
- To download and resize images to equal resolution
download_images(<download_path>, urls=<list of url>)
resize_images(<downloaded_path>, max_size=400, dest=<resized_path>)
- To open an image from the path
PILImage.create('person_sad.jpg')
- To predict on a given item : returns the label (or category), the index to look up in the probability tensor, and the probabilities for all categories (as a tensor)
learn.predict(<item>)
Output : ('sad person', TensorBase(2), TensorBase([3.2509e-03, 1.1917e-04, 9.9663e-01]))
Pythonic Image
- To open an image lazily (i.e., the file is identified, but it remains open and the actual image data is not read until you try to process it) and create a 256x256 thumbnail
im = Image.open(<dest_file>)
im.to_thumb(256,256)
Pythonic Strings
- Formatted strings
o = 'happy person'
f'{o} photo'
Output : happy person photo
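The same f-string pattern is what builds the search queries in the download loops. A short sketch, reusing the category names from this appendix:

```python
searches = 'happy person', 'sad person', 'angry person'

# Build one search query per category with an f-string.
queries = [f'{o} photo' for o in searches]
print(queries)
```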