
How to scrape Google for Images to train your Machine Learning classifiers on

JJ Deng
5 min read · Feb 7, 2019


One of the most tedious parts of training an image classifier, or of working on any computer vision project, is gathering the images you’ll train your model on. In this article, I’ll guide you through a simple process for gathering and filtering your training images quickly. You don’t need to follow my process exactly; many alternative tools can do the same job, but this is how I go about things, at least on my own Windows laptop. This also assumes that you have the Fast.ai v1.0 library installed.

Step 1: Scrape Google Images

Here are the detailed steps, as per Fast.ai and Pyimagesearch:

  1. Disable your ad blockers!
  2. Search for images in Google Images using Chrome (I haven’t tested this in Firefox or other browsers).
  3. Scroll down until no more images load. You may need to click through the prompts a few times to load even more images.
  4. Press Ctrl-Shift-J to open the JavaScript console.
  5. Run the following command in the console:
urls = Array.from(document.querySelectorAll('.rg_di .rg_meta')).map(el=>JSON.parse(el.textContent).ou);
window.open('data:text/csv;charset=utf-8,' +…
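
To make the first line of that command easier to follow, here is a minimal sketch of the same extraction logic in plain JavaScript, written so it can run outside the browser (for example in Node). The function name `extractImageUrls` and the sample data are hypothetical; the only thing taken from the command above is that each `.rg_meta` element holds a JSON string whose `ou` field is collected, which appears to be the original image URL. Unlike the one-liner, this sketch also skips entries that are not valid JSON or that lack an `ou` field, which is an added assumption about how you might want to handle bad entries.

```javascript
// Hypothetical helper mirroring the console one-liner, but operating on
// plain strings instead of DOM elements so it can run without a browser.
function extractImageUrls(metaTexts) {
  const urls = [];
  for (const text of metaTexts) {
    try {
      const meta = JSON.parse(text);
      // Keep only entries that actually carry an `ou` string.
      if (typeof meta.ou === 'string' && meta.ou.length > 0) {
        urls.push(meta.ou);
      }
    } catch (err) {
      // Skip entries whose text is not valid JSON.
    }
  }
  return urls;
}

// Made-up sample data standing in for the text of each '.rg_meta' element.
const sampleMeta = [
  '{"ou": "https://example.com/cat1.jpg", "pt": "A cat"}',
  'not json',
  '{"pt": "No original URL here"}',
  '{"ou": "https://example.com/cat2.png"}',
];

const sampleUrls = extractImageUrls(sampleMeta);

// One URL per line, the shape you would want in a downloaded URL list.
console.log(sampleUrls.join('\n'));
```

In the browser, the equivalent input would be the `textContent` of every element matched by `document.querySelectorAll('.rg_di .rg_meta')`, as in the command above.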

