If you’ve heard about the transposed convolution and were confused about what it actually means, this article is written for you.

The content of this article is as follows:

  • The Need for Up-sampling
  • Why Transposed Convolution?
  • Convolution Operation
  • Going Backward
  • Convolution Matrix
  • Transposed Convolution Matrix
  • Summary

The Need for Up-sampling

When we use neural networks to generate images, it usually involves up-sampling from low resolution to high resolution.

There are various methods for up-sampling:

  • Nearest neighbor interpolation
  • Bi-linear interpolation
  • Bi-cubic interpolation
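
As a rough illustration of the simplest method in the list above, nearest neighbor interpolation, here is a minimal sketch in plain Python. The function name `upsample_nearest` and the integer `factor` parameter are assumptions made for this example, not code from the article; it just repeats every pixel `factor` times along both axes.

```python
def upsample_nearest(image, factor):
    """Up-sample a 2-D list of pixels by an integer factor.

    Each source pixel is copied into a factor x factor block,
    which is what nearest neighbor interpolation does for integer scales.
    """
    upsampled = []
    for row in image:
        # Repeat each pixel horizontally.
        wide_row = [pixel for pixel in row for _ in range(factor)]
        # Repeat the widened row vertically (copy it so rows are independent).
        upsampled.extend(list(wide_row) for _ in range(factor))
    return upsampled


low_res = [[1, 2],
           [3, 4]]
high_res = upsample_nearest(low_res, 2)
for row in high_res:
    print(row)
```

A 2x2 input becomes a 4x4 output with no new pixel values. Nothing here is learned from data, which is the contrast with the transposed convolution.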

Understanding the Maximum Entropy Principle

Have you ever wondered why we often use the normal distribution?

How do we derive it anyway?

Why do many probability distributions have the exponential term?

Are they related to each other?

If any of the above questions make you wonder, you are in the right place.

I will demystify it for you.

Fine or Not Fine

Suppose we want to predict if the weather of some place is fine or not.

How to derive the Euler-Lagrange equation

We use the calculus of variations to optimize functionals.

You read it right: functionals not functions.

But what are functionals? What does a functional really look like?

Moreover, there is this thing called the Euler-Lagrange equation.

What is it? How is it useful?

How do we derive such an equation?

If you have any of the above questions, you are in the right place.

I’ll demystify it for you.

The shortest path problem

Suppose we want to find out the shortest path from the point A to the point B.

Image via under license to Naoki Shibuya

Understanding why it works

Have you ever wondered why we use the Lagrange multiplier to solve constrained optimization problems?

Is it just a clever technique?

Since it is very easy to use, we learn it like basic arithmetic by practicing it until we can do it by heart.

But have you ever wondered why it works? Does it always work? If not, why not?

If you want to know the answers to these questions, you are in the right place.

I’ll demystify it for you.

An example constrained optimization problem

In case you are not familiar with what constrained optimizations are, I have written an article that explains…

Photo by Nuno Silva on Unsplash

Explained with a simple example

Have you ever wondered what constrained optimization problems are?

Oftentimes, the term “constrained optimization” is used as if everyone knows what it is.

But it may not be so obvious for people who have not been exposed to such terminology before.

If you would like to understand what constrained optimization is and how to approach such problems, you are in the right place.

I’ll demystify it for you.

What is a constrained optimization problem?

Suppose you are driving a car on a mountain road. You want to climb as high as possible to have a better view of the moon. …

What does KL stand for? Is it a distance measure? What does it mean to measure the similarity of two probability distributions?


If you want to intuitively understand what the KL divergence is, you are in the right place. I’ll demystify the KL divergence for you.

As I’m going to explain the KL divergence from the information theory point of view, you need to know the concepts of entropy and cross-entropy to fully understand this article. …
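
For readers who want something concrete to hold on to, here is a minimal sketch of the KL (Kullback-Leibler) divergence for two discrete distributions. The function name `kl_divergence` and the example distributions are assumptions chosen for illustration, and the sketch assumes `q` is non-zero wherever `p` is non-zero.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) in bits for discrete distributions over the same outcomes.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute nothing.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.5]  # hypothetical "true" distribution
q = [0.9, 0.1]  # hypothetical approximation

print(kl_divergence(p, q))  # divergence of q from p
print(kl_divergence(q, p))  # swapping the arguments gives a different value
print(kl_divergence(p, p))  # identical distributions give zero
```

The two printed values for `(p, q)` and `(q, p)` differ, which is one way to see that the KL divergence is not symmetric and therefore not a distance in the usual sense.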

What is it? Is there any relation to the entropy concept? Why is it used for classification loss? What about the binary cross-entropy?

Some of us might have used the cross-entropy for calculating classification losses and wondered why we use the natural logarithm. Some might have seen the binary cross-entropy and wondered whether it is fundamentally different from the cross-entropy or not. If so, reading this article should help to demystify those questions.

The word “cross-entropy” has “cross” and “entropy” in it, and it helps to understand the “entropy” part to understand the “cross” part.

So, let’s review the entropy…
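
To make the question about the binary cross-entropy concrete, the following sketch computes both quantities with the natural logarithm. The function names and the example numbers are assumptions made for illustration. It treats a binary label `y` with predicted probability `y_hat` as the two-outcome distributions `[y, 1 - y]` and `[y_hat, 1 - y_hat]`.

```python
import math


def cross_entropy(p, q):
    """Cross-entropy H(p, q) in nats; terms with p[i] == 0 are skipped."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def binary_cross_entropy(y, y_hat):
    """Binary cross-entropy for a label y and a predicted probability y_hat.

    Assumes 0 < y_hat < 1 so that both logarithms are defined.
    """
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))


y, y_hat = 1, 0.8  # hypothetical label and prediction
general = cross_entropy([y, 1 - y], [y_hat, 1 - y_hat])
binary = binary_cross_entropy(y, y_hat)
print(general, binary)
```

Under these assumptions both calls return the same number, which suggests that the binary cross-entropy is the two-class case of the general formula rather than a different quantity.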

Is it a disorder, uncertainty or surprise?

The idea of entropy is confusing at first because so many words are used to describe it: disorder, uncertainty, surprise, unpredictability, amount of information and so on. If you’ve been confused by the word “entropy”, you are in the right place. I am going to demystify it for you.

Who Invented Entropy and Why?

In 1948, Claude Shannon introduced the concept of information entropy in his paper “A Mathematical Theory of Communication”.

Shannon was looking for a way to efficiently send messages without losing any information.
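
The quantity Shannon defined can be sketched in a few lines. This is a minimal illustration of the standard formula H = -Σ p·log2(p), measured in bits. The function name `entropy` and the example distributions are assumptions made for this sketch, not taken from the article.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution p.

    Outcomes with zero probability are skipped because they contribute nothing.
    """
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]
four_equal_outcomes = [0.25, 0.25, 0.25, 0.25]

print(entropy(fair_coin))            # 1.0
print(entropy(biased_coin))          # less than 1 bit
print(entropy(four_equal_outcomes))  # 2.0
```

The fair coin needs one bit per outcome on average and four equally likely outcomes need two, while the biased coin needs less than one bit because its outcome is more predictable.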

This article shows Deep Convolutional Generative Adversarial Network (DCGAN) examples using different image data sets such as MNIST, SVHN, and CelebA.


If you see the above image and it does not make much sense, this article is written for you. I explain how GAN works using a simple project that generates hand-written digit images.

I use Keras on TensorFlow, and the notebook code is available on my GitHub.


GAN (Generative Adversarial Network) is a framework proposed by Ian Goodfellow, Yoshua Bengio and others in 2014.

A GAN can be trained to generate images from random noise. For example, we can train a GAN to generate digit images that look like hand-written digit images from the MNIST database.
