Abstract
The central problem of distributed learning is credit assignment, which was solved in the 1980s with the invention of error backpropagation. Thirty years later, Backprop, along with a few more recent tricks, is the major workhorse underlying machine learning and remains state-of-the-art for supervised learning. However, weight updates under Backprop depend on recursive computations and require distinct output and error signals, features that are not shared by biological neurons and that are perhaps unnecessary. In this talk, I revisit Backprop and the credit assignment problem. The main results are: (1) that Backprop decomposes into a collection of local learning algorithms; (2) regret bounds for these sub-algorithms; and (3) a factorization of Backprop’s error signals. Using these results, I derive Kickback, a new algorithm for nonparametric regression that is significantly simpler than Backprop. Finally, I provide a sufficient condition for Kickback to follow error gradients, and show that Kickback matches Backprop’s performance on real-world regression benchmarks.
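The sketch below is a minimal, illustrative check of the factorization alluded to in result (3): for a network with rectifier units, a linear readout, a single regression output, and squared loss, Backprop's per-unit error signal splits into a global scalar output error multiplied by an error-independent "influence" term built from feedforward weights and unit gates. All names (W1, W2, w3, tau1, tau2) are illustrative; this is not the Kickback update rule itself, which is derived in the talk as a further simplification of this factorized signal.

```python
# Sketch: Backprop's error signal factorizes as (global scalar error) x (local influence).
# Assumptions: ReLU hidden units, linear scalar readout, squared loss.
import numpy as np

rng = np.random.default_rng(0)

# Tiny two-hidden-layer ReLU network with a single regression output.
n_in, n_h = 4, 8
W1 = rng.normal(scale=0.5, size=(n_h, n_in))
W2 = rng.normal(scale=0.5, size=(n_h, n_h))
w3 = rng.normal(scale=0.5, size=n_h)

def forward(x):
    z1 = W1 @ x
    h1 = np.maximum(z1, 0.0)
    z2 = W2 @ h1
    h2 = np.maximum(z2, 0.0)
    y = w3 @ h2          # scalar prediction
    return z1, h1, z2, h2, y

x, target = rng.normal(size=n_in), 1.0
z1, h1, z2, h2, y = forward(x)
err = y - target          # scalar output error under squared loss

# --- Backprop: error signals computed by the usual backward recursion ---
d3 = err                               # delta at the output
d2 = (w3 * d3) * (z2 > 0)              # delta_2 = w3 * delta_3 * relu'(z2)
d1 = (W2.T @ d2) * (z1 > 0)            # delta_1 = W2^T delta_2 * relu'(z1)
grad_W1_bp = np.outer(d1, x)

# --- Factorized form: delta = err * tau, where tau ("influence") does not
# depend on the error at all; only the scalar err is broadcast top-down. ---
tau2 = w3 * (z2 > 0)                   # influence of layer-2 units on the output
tau1 = (W2.T @ tau2) * (z1 > 0)        # influence of layer-1 units
grad_W1_factored = np.outer(err * tau1, x)

# The two gradients coincide: the error enters only as a broadcast scalar.
print(np.allclose(grad_W1_bp, grad_W1_factored))   # True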