Don’t allow defaults to control your gradients, because gradients are precious for deep learning😬

Image grabbed from https://losslandscape.com/gallery/

Adam is a solid optimization algorithm, first published in 2014, but with the advances in architectures and compute since then it is no longer always the best choice. To see why better optimization strategies matter, consider the billionaire GPT-3 (in the algorithmic world this guy must be a billionaire, if such a thing exists😅), where the batch size runs into the millions. Now the question is: can Adam handle such a large batch? I seriously doubt it (well, you cannot validate my statement unless you're a billionaire…


Class Agnostic Video Repetition Counting (Repeating Net😵)

Neural networks have proven to be powerful and have become an industry-standard tool this decade. Transfer learning is widely adopted in computer vision and is a step towards generalization. The bar for neural networks will definitely go up in the coming decade, and more complex cognitive tasks will challenge them. This paper tackles one such task: a team from Google AI and DeepMind tried to solve the problem of counting the repetitions of a periodic process (heartbeats, planetary rotations). …


Deploying deep learning models in production can be challenging, as it involves far more than training models with good performance. Several distinct components need to be designed and developed in order to deploy a production-level deep learning system (seen below):

This post aims to be an engineering guideline for building production-level deep learning systems that will be deployed in real-world applications.

The material presented here is borrowed from Full Stack Deep Learning Bootcamp (by Pieter Abbeel at UC Berkeley, Josh Tobin at OpenAI, and Sergey Karayev at Turnitin), TFX workshop by Robert Crowe, and Pipeline.ai’s


Single Headed Attention RNN (Stephen Merity Has An RNN 🤷‍♀️)

Many researchers and practitioners gave up on RNNs and their variants after the advent of Transformers, but not the author. This piece of research is an eye-opener for those who think more compute is the only way forward. SoTA results are achievable in under 24 hours on a single GPU, “as the author is impatient 😎”.

Irrational as it seems I didn’t want to use a cluster in the cloud somewhere, watching the dollars leave my bank account — Author

This paper is not solely about the architectures and achieving SoTA, but questioning…


Transformers are widely known for their accomplishments in NLP. Recent work suggests that transformers generalize well and adapt to many tasks, and this capability of the architecture has become the primary reason for adopting transformers in the vision field.

What is DETR? Why DETR?

Well, DETR stands for “DEtection TRansformer”. It was proposed by FAIR (Facebook AI Research), and its reported benchmark results are slightly better than those of Faster R-CNN.

Existing object detection frameworks are carefully crafted around different design principles. Two-stage detectors (Faster R-CNN) predict boxes w.r.t. proposals, whereas single-stage methods (YOLO) make predictions w.r.t. anchors or a grid of possible object centers. The performance…


Electric motors are essential components in most industrial processes

With the high productivity levels at industrial plants, any unscheduled shutdown due to failure can be very disruptive to the production process

In industries such as nuclear power and petrochemicals, techniques that detect the early onset of a fault can prevent more serious problems. For this reason, many studies focus on early fault detection.

Several artificial intelligence techniques have been developed and applied to fault monitoring:

• Artificial Neural Networks (ANNs)

• Fuzzy Logic (FL)

• Support Vector Machines (SVMs)

It is known that different methods for induction motor…


The first step in implementing speech recognition is understanding how audio data works.

Sampling Frequency

The sampling frequency (or sample rate) is the number of samples per second in a sound. For example, if the sampling frequency is 44,100 Hz, a recording with a duration of 60 seconds will contain 2,646,000 samples.
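
The arithmetic above can be checked in a couple of lines. This is only a minimal sketch; the helper name `num_samples` is made up for illustration and does not come from any library.

```python
def num_samples(sampling_freq_hz: int, duration_s: float) -> int:
    """Samples in a recording = samples per second * duration in seconds."""
    return int(round(sampling_freq_hz * duration_s))


# 60 seconds at 44,100 Hz, as in the example above
print(num_samples(44100, 60))  # 2646000

# a 3-second clip at the same rate
print(num_samples(44100, 3))   # 132300
```

The same relation works in reverse: dividing the number of samples by the sampling frequency gives the duration in seconds.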

Audio files are very commonly sampled at 44,100 Hz (the CD-audio standard), though other sampling frequencies are also used.

Reading Audio File

Audio files are commonly stored as WAV files; reading one returns the sampling frequency and the actual audio samples.
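
To make "sampling frequency plus audio" concrete without needing any particular file, here is a small sketch that uses Python's standard-library `wave` module (a swap-in for illustration, not the SciPy reader this post uses). It writes a synthetic one-second tone and reads it back. The helper names `write_tone` and `read_wav`, the file name `tone.wav`, and the 440 Hz tone are all assumptions made up for this example.

```python
import math
import os
import struct
import tempfile
import wave


def write_tone(path, sampling_freq=44100, duration_s=1.0, tone_hz=440.0):
    """Write a mono, 16-bit sine tone so there is a WAV file to read back."""
    n = int(sampling_freq * duration_s)
    frames = b"".join(
        struct.pack(
            "<h",
            int(0.5 * 32767 * math.sin(2 * math.pi * tone_hz * i / sampling_freq)),
        )
        for i in range(n)
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)          # mono
        wf.setsampwidth(2)          # 2 bytes = 16-bit samples
        wf.setframerate(sampling_freq)
        wf.writeframes(frames)


def read_wav(path):
    """Return (sampling frequency, list of samples) for a mono 16-bit WAV file."""
    with wave.open(path, "rb") as wf:
        sampling_freq = wf.getframerate()
        n_frames = wf.getnframes()
        raw = wf.readframes(n_frames)
    samples = list(struct.unpack("<%dh" % n_frames, raw))
    return sampling_freq, samples


path = os.path.join(tempfile.mkdtemp(), "tone.wav")
write_tone(path)
sampling_freq, audio = read_wav(path)
duration = len(audio) / sampling_freq

print("Sampling frequency:", sampling_freq)
print("Samples:", len(audio))
print("Duration:", round(duration, 2), "seconds")
```

The duration is never stored separately; it falls out of the number of samples divided by the sampling frequency.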

Let’s read an audio file which is 3 seconds long

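import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile

# Read the file: returns the sampling frequency (Hz) and the samples as a NumPy array
sampling_freq, audio = wavfile.read('./input_read.wav')

# Inspect the signal
print('\nShape:', audio.shape)
print('Datatype:', audio.dtype)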
print ('Duration:', round(audio.shape[0] /…

Maharshi Yeluri

Data Scientist
