PLDI 2024
Mon 24 - Fri 28 June 2024 Copenhagen, Denmark

In this article, we present a new method for implementing, on a fixed-point architecture, a neural network whose weights are floating-point numbers. The originality of our approach is that the fixed-point formats are computed by solving an integer optimization problem derived from the neural network model, which constrains the accuracy of the computations and of the results at each point of the network. We can therefore mathematically bound the error between the results returned by the floating-point and fixed-point versions of the network. In addition to a formal description of our method, we describe a prototype that implements it. Our tool accepts the most common neural network layers (fully connected, convolutional, max-pooling, etc.), uses an optimizing SMT solver to compute the fixed-point formats, and synthesizes fixed-point C code from the TensorFlow model of the network. Experimental results show that our tool achieves good performance while keeping the relative numerical error below a given tolerance threshold. Furthermore, the results show that the synthesized fixed-point networks consume less time and energy than their floating-point counterparts on a typical embedded platform based on an STM32 Nucleo board.
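As a rough illustration of the kind of fixed-point C code the approach targets, the sketch below computes one neuron of a fully connected layer entirely in integer arithmetic. It is a minimal, hypothetical example, not the tool's actual output: the fractional bit counts FRAC_IN, FRAC_W and FRAC_OUT are fixed by hand here, whereas in the method described above they would be determined by the optimizing SMT solver so that the overall error bound is respected.

```c
#include <stdint.h>

/* Hypothetical fixed-point formats: in the actual method these fractional
 * bit counts are the unknowns of the integer optimization problem. */
#define FRAC_IN   10  /* fractional bits of input activations  */
#define FRAC_W    12  /* fractional bits of weights            */
#define FRAC_OUT  10  /* fractional bits of the output value   */

/* One neuron of a fully connected layer in fixed point.
 * x: n inputs in Q.FRAC_IN, w: n weights in Q.FRAC_W,
 * bias: already aligned to FRAC_IN + FRAC_W fractional bits.
 * Returns the pre-activation value in Q.FRAC_OUT. */
static int32_t fc_neuron_fixed(const int16_t *x, const int16_t *w,
                               int64_t bias, int n)
{
    int64_t acc = bias;                        /* wide accumulator, FRAC_IN + FRAC_W frac bits */
    for (int i = 0; i < n; i++)
        acc += (int64_t)x[i] * (int64_t)w[i];  /* each product has FRAC_IN + FRAC_W frac bits  */

    int shift = FRAC_IN + FRAC_W - FRAC_OUT;   /* rescale to the output format                 */
    acc += (int64_t)1 << (shift - 1);          /* round to nearest                             */
    return (int32_t)(acc >> shift);
}
```

With such a format choice, the final rounding step introduces an error of at most 2^-(FRAC_OUT+1), i.e. half a unit in the last place of the output format; it is this kind of per-operation bound that the integer optimization problem can aggregate across the whole network to guarantee the tolerance threshold.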