Eric Iacutone - Learn Stochastic Gradient Descent in 30 Minutes
Leave comments at: https://elixirforum.com/t/elixirconf-2023-eric-iacutone-learn-stochastic-gradient-descent-in-30-minutes/58723

I’ve struggled to understand the internals of stochastic gradient descent, or SGD. In this talk, we will explore SGD via an interactive Livebook example. SGD is the building block of neural networks: to understand how a neural network learns, we need to understand SGD.

The Micrograd framework, by Andrej Karpathy, helped build my intuition about SGD. From the README, it “…is a small and lightweight automatic differentiation library written in Python. It provides a simple implementation of gradient-based optimization algorithms, including stochastic gradient descent. It allows users to define and train simple computational graphs, compute gradients, and optimize parameters using SGD.” We will port this framework to Elixir and visualize how SGD works interactively in Livebook graphs, applying our Elixir-fied Micrograd framework.

We will explore a process known as backpropagation. How does a derivative work? What is a derivative measuring, and what is it telling us? How does a function’s output respond when we change its input by a small value “h”? Then we will complete the SGD loop with a forward pass that evaluates a loss function, training our network.

By the end of the talk, we will use what we have learned to solve for a linear function with SGD. You will come away with a fundamental understanding of how SGD works to optimize a loss function and train a neural network.
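To give a flavor of the derivative questions above, here is a minimal numerical sketch, not the talk’s actual Livebook code: we estimate the derivative of a made-up example function by nudging its input by a small step h.

```elixir
# An illustrative function; any smooth function would do here.
f = fn x -> 3.0 * x * x - 4.0 * x + 5.0 end

x = 3.0
h = 0.0001

# The derivative at x is roughly how much f changes when the input
# moves by h, divided by h (rise over run).
slope = (f.(x + h) - f.(x)) / h
IO.puts("approximate derivative of f at x=#{x}: #{slope}")
```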
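Backpropagation applies that same idea through a chain of operations. Here is a hand-written sketch of the chain rule through one prediction and a squared-error loss; the talk builds this up with a Micrograd-style computational graph instead, and the numbers here are purely illustrative.

```elixir
# Illustrative values: one data point (x, y) and starting parameters w, b.
x = 2.0
y = 5.0
w = 1.0
b = 0.0

# Forward pass: compute the prediction and the loss.
pred = w * x + b
err = pred - y
loss = err * err

# Backward pass: apply the chain rule from the loss back to the parameters.
d_loss_d_err = 2.0 * err        # d(err * err)/d(err)
d_loss_d_pred = d_loss_d_err    # d(pred - y)/d(pred) = 1
d_loss_d_w = d_loss_d_pred * x  # d(w * x + b)/d(w) = x
d_loss_d_b = d_loss_d_pred      # d(w * x + b)/d(b) = 1

IO.puts("loss=#{loss}, dL/dw=#{d_loss_d_w}, dL/db=#{d_loss_d_b}")
```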
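And a rough sketch of where we end up: a bare-bones SGD loop that fits a linear function y = w * x + b. The data points, learning rate, and step count below are illustrative assumptions, not the values used in the talk.

```elixir
# Four made-up points that lie exactly on y = 2x + 1.
data = [{1.0, 3.0}, {2.0, 5.0}, {3.0, 7.0}, {4.0, 9.0}]
learning_rate = 0.01

{w, b} =
  Enum.reduce(1..5000, {0.0, 0.0}, fn _step, {w, b} ->
    # The "stochastic" part: each step looks at one randomly chosen point.
    {x, y} = Enum.random(data)

    # Forward pass: prediction and error for that point.
    err = w * x + b - y

    # Backward pass: gradients of the squared error with respect to w and b.
    grad_w = 2.0 * err * x
    grad_b = 2.0 * err

    # Update step: nudge the parameters against the gradient.
    {w - learning_rate * grad_w, b - learning_rate * grad_b}
  end)

IO.puts("learned w=#{Float.round(w, 3)}, b=#{Float.round(b, 3)}")
```

Because these points lie exactly on one line, the per-sample updates agree near the solution, and w and b settle close to 2 and 1.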