Understanding Convolution: A Key Concept in Image Processing and Machine Learning

Machine Learning Site
5 min read · Nov 30, 2023

When we process images, we alter or analyze the pixel values within the image. Whether it is image blurring, image sharpening, edge detection, or even object recognition (heard of YOLO?), the pixels are analyzed and transformed to yield the desired result. Whatever the transformation is, one common principle plays an important role in these image-processing tasks: convolution! Take a quick look here to see the capabilities of convolution and how you can use it on images.

If you have worked with image data, you might be familiar with the term “convolution”! By definition, convolution is a mathematical operation on two functions that yields a third function: one function is reversed and shifted, and the integral of its product with the other function is taken at each shift. Mathematically, it is described as follows:
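For reference, the standard textbook form of the continuous definition is written below. The symbols f, g, t, and τ are the conventional choice and are an assumption here:

```latex
(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau
```

Here g(t − τ) is the reversed and shifted copy of g, and the integral adds up its product with f over every τ.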

The formula does seem complicated, and it is challenging to wrap your head around the process. So let us set the formula aside for a moment and understand convolution via an example.

Understanding convolution through an example

[The example considered here is inspired by this article, so many thanks to its author].

Consider that you are working on a task in your company.
- You spend 6 hours on the first day of the task and 3 hours on the second day. So the task has the working hour sequence [6 3].
- Now, on the second day itself, you get another task. We assume that the sequence of working hours remains the same, i.e., [6 3]. So on the second day, you are working 3 hours on the first task and 6 hours on the second task. That is a total of 9 hours.
- Further, on the third day, you get TWO new tasks. So now you have 6+6 hours for the 2 new tasks and 3 pending hours from the previous task, a total of 15 hours. From here on, most of us will simply lose count! So there must be a simpler way to calculate this, and there is: convolution.

Let us form a row vector of the incoming tasks and call it f = [1 1 2 1 3]. This means that on day 1, you get 1 task. On day 2, you again get 1 task. 2 more tasks add up on day 3. On day 4, there is a single task and on the last day, you are piled up with 3 tasks. The working pattern for one single task remains the same, i.e., [6 3]. We’ll denote it by g.
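Before reaching for convolution, the bookkeeping can be done by hand: every new task adds its hours to the day it starts and to the day after. Here is a minimal sketch of that tally in plain Python, using f and g from above (the name hours_per_day is only illustrative):

```python
# f[d] = number of new tasks arriving on day d (from the text)
f = [1, 1, 2, 1, 3]
# g[k] = hours one task needs on its k-th day (from the text)
g = [6, 3]

# Every task occupies len(g) consecutive days, so the schedule
# spans len(f) + len(g) - 1 days in total.
hours_per_day = [0] * (len(f) + len(g) - 1)

for start_day, new_tasks in enumerate(f):
    for offset, hours in enumerate(g):
        # Each task that starts on start_day adds `hours`
        # to the day that lies `offset` days later.
        hours_per_day[start_day + offset] += new_tasks * hours

print(hours_per_day)  # [6, 9, 15, 12, 21, 9]
```

Days 2 and 3 come out as 9 and 15 hours, matching the counts worked out by hand in the list above.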

So we calculate the convolution between f and g. For that, we reverse the order of the elements of g, i.e., [6 3] becomes [3 6]. Now we slide this reversed g step by step under f, calculate the product of every pair of overlapping elements, and sum up all the products at each step.

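We ultimately get [6 9 15 12 21 9] as our end result. The second and third entries match the 9 and 15 hours we counted by hand above. As the tasks pile up, you will be working for 21 hours on the fifth day, with 9 hours still left over for a sixth day. Don’t worry, this is just a hypothetical scenario!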
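The flip-and-slide procedure can be sketched in a few lines of plain Python. The function name convolve_1d and the zero-padding approach are illustrative choices, not something taken from the original article:

```python
def convolve_1d(f, g):
    """Full 1-D convolution by flipping g and sliding it under f."""
    flipped = g[::-1]  # [6, 3] becomes [3, 6]
    pad = len(g) - 1
    # Zero-pad f so the flipped g can start with a single overlapping element.
    padded = [0] * pad + list(f) + [0] * pad
    result = []
    for shift in range(len(f) + len(g) - 1):
        window = padded[shift:shift + len(g)]
        # Multiply the overlapping elements and sum the products.
        result.append(sum(a * b for a, b in zip(window, flipped)))
    return result


print(convolve_1d([1, 1, 2, 1, 3], [6, 3]))  # [6, 9, 15, 12, 21, 9]
```

If NumPy is available, np.convolve([1, 1, 2, 1, 3], [6, 3]) computes the same full convolution.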

What we just did is convolution in one dimension. In the process, we found a pattern of increasing working tasks/hours in our vector f with the help of g.
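Written as a formula, the one-dimensional example is the discrete counterpart of the integral definition. The zero-based index n (the day) and the summation index k are notation assumed here for illustration:

```latex
(f * g)[n] = \sum_{k} f[k]\, g[n - k]
```

For the third day (n = 2), only k = 1 and k = 2 contribute, since g has just two entries: f[1] g[1] + f[2] g[0] = 1·3 + 2·6 = 15 hours.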

When it comes to images, we call our function g a filter. Note that the terms “filter” and “kernel” are interchangeable. We use these filters to recognize patterns in the input image (f). The process of convolution remains the same, i.e., the filter slides over the image pixels, and at every step, we calculate the sum of the products of the overlapping image and filter elements. The following GIF demonstrates the convolution process between an input image and a filter:

In this way, a filter convolved with an input image responds strongly wherever the image matches the filter's pattern, which is how convolution picks out patterns within the image.
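The same sliding sum in two dimensions can be sketched as follows. This is a minimal plain-Python version that assumes no padding and a stride of 1; the function name convolve_2d, the tiny image, and the 2x2 kernel are made up for illustration:

```python
def convolve_2d(image, kernel):
    """'Valid' 2-D convolution: the kernel stays fully inside the image."""
    kh, kw = len(kernel), len(kernel[0])
    # Flip the kernel in both directions (true convolution, not correlation).
    flipped = [row[::-1] for row in kernel[::-1]]
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    # Product of each overlapping image/filter pair.
                    total += image[r + i][c + j] * flipped[i][j]
            row.append(total)
        output.append(row)
    return output


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [1, 0, 2, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]
print(convolve_2d(image, kernel))
# [[4, 4, -2], [4, 4, -4], [-7, -6, -6]]
```

Many deep learning libraries skip the flip and compute cross-correlation instead, which makes no practical difference when the filter values are learned.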

Popular Image Filters in Convolution

Let us have a look at some of the popular filters for image processing:

Edge Detection Filter
Gaussian Blur Filter
Sharpening Filter
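To make the three filters concrete, here is a small sketch with commonly used 3x3 textbook kernels. These particular values are an assumption and may differ from the kernels shown in the original figures. The helper apply_kernel and the test image are hypothetical. All three kernels are symmetric, so flipping them would not change the result and the flip is omitted.

```python
# Common textbook 3x3 kernels (assumed; the article's own figures may differ).
EDGE = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]        # Laplacian edge detector
BLUR = [[1 / 16, 2 / 16, 1 / 16],
        [2 / 16, 4 / 16, 2 / 16],
        [1 / 16, 2 / 16, 1 / 16]]                # Gaussian blur approximation
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]  # sharpening


def apply_kernel(image, kernel):
    """Slide a square kernel over the image (no padding) and sum the products."""
    size = len(kernel)
    out = []
    for r in range(len(image) - size + 1):
        out.append([
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(size) for j in range(size))
            for c in range(len(image[0]) - size + 1)
        ])
    return out


# A 5x5 image with a vertical edge: dark (0) on the left, bright (10) on the right.
image = [[0, 0, 10, 10, 10] for _ in range(5)]

print(apply_kernel(image, EDGE))     # every row: [10, -10, 0]
print(apply_kernel(image, BLUR))     # every row: [2.5, 7.5, 10.0]
print(apply_kernel(image, SHARPEN))  # every row: [-10, 20, 10]
```

The edge filter is non-zero only next to the jump from 0 to 10, the blur turns the hard jump into a gradual ramp, and the sharpening filter exaggerates the jump by undershooting on the dark side and overshooting on the bright side.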

If you want a hands-on exercise with image processing and convolution using Python, I’ve got you covered. You can click on the following article to get started with image processing using Python:

https://machinelearningsite.com/introduction-to-opencv/

If you happen to have any issues understanding the code, feel free to DM me on Instagram: @machinelearningsite.

Conclusion

In conclusion, convolution is a fundamental concept in the field of image recognition and machine learning. Through the process of convolution, mathematical operations are applied to input data, enabling the extraction of meaningful features and patterns. This technique is widely used in object recognition models to detect and classify objects within images.

Convolution allows for the identification of specific features in an image by utilizing filters or kernels. These filters act as templates that highlight certain characteristics such as edges, textures, or shapes. By convolving these filters with the input image, the model can capture and analyze important visual information.

Your support motivates me to create interesting content. All I ask is that you follow me on social media:

Instagram: instagram.com/machinelearningsite
Facebook: facebook.com/machinelearningsite

Already interested in more topics? Join my newsletter today. Don’t worry, you won’t get spammed every day with unnecessary emails.

Originally published at https://machinelearningsite.com on November 30, 2023.

Written by Machine Learning Site

Automobile engineer / Software Developer in Germany | Specialized in Driver Assistance Systems | Working with Python, Machine Learning, Artificial Intelligence
