• It helps to reduce the internal covariate shift (ICS) so the distribution of inputs to the activations remains more stable.
  • BN lets us worry less about the scale of the parameters and their initialization.
  • It allows us to…
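
To make the first two points concrete, here is a minimal NumPy sketch of the batch normalization forward pass in training mode. The function name `batch_norm`, the toy shapes, and the 100x weight scaling are assumptions chosen for illustration; they are not taken from the article. It only shows that the normalized pre-activations keep roughly zero mean and unit variance even when the weights are rescaled.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Training-mode batch norm over the batch axis (illustrative sketch)."""
    mean = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                       # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)   # normalize each feature
    return gamma * x_hat + beta               # learnable scale and shift

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 4))                 # toy batch: 256 samples, 4 features
w = rng.normal(size=(4, 3))                   # hypothetical layer weights
w_large = 100.0 * w                           # same weights at a 100x larger scale

gamma, beta = np.ones(3), np.zeros(3)
out = batch_norm(x @ w, gamma, beta)
out_large = batch_norm(x @ w_large, gamma, beta)

# Both outputs have per-feature mean ~0 and std ~1, and they nearly coincide,
# so the input distribution seen by the activation does not depend on weight scale.
print(out.mean(axis=0), out.std(axis=0))
print(np.abs(out - out_large).max())
```

A full layer would also track running statistics for inference and learn `gamma` and `beta` by gradient descent; both are left out here to keep the sketch short.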

  • But this raises a question: why are we using this technique?
  • What are the benefits of using such techniques in the neural architectures we build?

What drives AlphaFold to tackle a 50-year-old grand challenge of biology?


A mathematical explanation of optimization of the linearly separable classifier using quadratic programming.



Ajinkya Jadhav

Machine Learning and Deep Learning Practitioner
