Minimum width for universal approximation using ReLU networks on compact domain

Part of International Conference on Learning Representations 2024 (ICLR 2024) Conference


Authors

Namjun Kim, Chanho Min, Sejun Park

Abstract

It has been shown that deep neural networks of a large enough width are universal approximators, but they are not if the width is too small. There have been several attempts to characterize the minimum width $w_{\min}$ enabling the universal approximation property; however, only a few of them found the exact values. In this work, we show that the minimum width for $L^p$ approximation of $L^p$ functions from $[0,1]^{d_x}$ to $\mathbb R^{d_y}$ is exactly $\max\{d_x,d_y,2\}$ if the activation function is ReLU-Like (e.g., ReLU, GELU, Softplus). Compared to the known result for ReLU networks, $w_{\min}=\max\{d_x+1,d_y\}$ when the domain is $\mathbb R^{d_x}$, our result is the first to show that approximation on a compact domain requires smaller width than on $\mathbb R^{d_x}$. We next prove a lower bound on $w_{\min}$ for uniform approximation using general activation functions including ReLU: $w_{\min}\ge d_y+1$ if $d_x<d_y\le 2d_x$.
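As a quick illustration of the two width formulas above (the dimensions $d_x=3$, $d_y=2$ are chosen purely for illustration and are not from the paper), the $L^p$ minimum width for ReLU networks compares as follows:

$$
w_{\min}\big([0,1]^{3}\to\mathbb R^{2}\big)=\max\{3,2,2\}=3,
\qquad
w_{\min}\big(\mathbb R^{3}\to\mathbb R^{2}\big)=\max\{3+1,2\}=4,
$$

so restricting the input domain to the compact cube lowers the required width by one in this example.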