http://neuralnetworksanddeeplearning.com/chap4.html
In essence, we're using our single-layer neural networks to build a lookup table for the function. And we'll be able to build on this idea to provide a general proof of universality.