ML theory papers

Mert Pilanci

Neural Networks are Convex Regularizers

  • Mert Pilanci.
  • Any finite two-layer ReLU network has an equivalent convex optimization problem.
  • Talk at Stanford. An audience member said that you want SGD to converge to a local minimum (for generalization?). That doesn’t sound right.
  • A previous result showed that infinite-width networks are convex. This paper, on the other hand, gives a convex problem that can actually be implemented and solved (a rough sketch follows this list).
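
The reformulation, as I understand it: each candidate ReLU activation pattern D_i = diag(1[X g_i >= 0]) gets a pair of weight vectors, and training becomes a group-regularized convex program over those pairs. Below is a minimal cvxpy sketch under that reading; the data, the sampled patterns, and all variable names are my own illustration, not the paper’s code.

```python
import cvxpy as cp
import numpy as np

# Toy problem sizes and data; all values here are made up for illustration.
rng = np.random.default_rng(0)
n, d, P = 20, 3, 8         # samples, features, sampled activation patterns
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
beta = 0.1                 # regularization strength

# Sample ReLU activation patterns D_i = diag(1[X g_i >= 0]) from random gates g_i.
G = rng.standard_normal((d, P))
D = (X @ G >= 0).astype(float)   # column i holds the diagonal of D_i

V = cp.Variable((d, P))          # one (v_i, w_i) pair of weight vectors per pattern
W = cp.Variable((d, P))
fit = y - cp.sum(cp.multiply(D, X @ (V - W)), axis=1)
reg = cp.sum(cp.norm(V, axis=0)) + cp.sum(cp.norm(W, axis=0))  # group-lasso penalty

constraints = []
for i in range(P):
    s = 2.0 * D[:, i] - 1.0      # (2 D_i - I): keeps weights consistent with pattern i
    constraints += [cp.multiply(s, X @ V[:, i]) >= 0,
                    cp.multiply(s, X @ W[:, i]) >= 0]

problem = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(fit) + beta * reg), constraints)
problem.solve()
print("convex objective value:", problem.value)
```

As I read it, the paper enumerates the activation patterns exactly rather than sampling them; the random sampling above is only a cheap approximation of that enumeration.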

All Local Minima are Global for Two-Layer ReLU Neural Networks

  • Mert Pilanci.
  • This arXiv paper looks similar (same authors).
  • The paper characterizes all local minima of such two-layer ReLU networks.
  • There is precedent for “no spurious local minima”-type results.
  • This article by different authors also discusses “all local minima are global”-type results.

Topology of deep neural networks (Lek-Heng Lim)

  • The Betti numbers of the data decrease as it passes through the layers, which is offered as an explanation for the success of many-layer networks (a toy sketch follows this list).
  • Compares ReLU with sigmoid: the topology changes more quickly with ReLU, and the non-smoothness assists in changing the topology. Maybe this is why quantization has helped.
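
A toy numerical sketch of the phenomenon, not the paper’s experiment: track a crude proxy for beta_0 (connected components of an eps-neighborhood graph) as two concentric circles are pushed through random ReLU layers. The fixed eps, the layer width, and the use of random weights are all my choices for illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def betti0(points, eps):
    """Crude Betti_0 proxy: connected components of the eps-neighborhood graph."""
    adj = squareform(pdist(points)) < eps
    n_components, _ = connected_components(csr_matrix(adj), directed=False)
    return n_components

# Two concentric circles: one component each, so beta_0 = 2 at the input.
theta = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
Z = np.vstack([circle, 2.0 * circle])

rng = np.random.default_rng(0)
print("input: beta_0 =", betti0(Z, eps=0.3))
for layer in range(1, 5):
    W = rng.standard_normal((Z.shape[1], 2))
    Z = np.maximum(Z @ W, 0.0)  # random ReLU layer
    print(f"after layer {layer}: beta_0 = {betti0(Z, eps=0.3)}")
```

With random weights the two components often merge within a few layers; the paper’s claim, as I read it, is that trained ReLU networks reduce the Betti numbers systematically.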