A common subexpression elimination-based compression method for the constant matrix multipication
Loading...
Date
2022
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2022.
Abstract
The execution time, resource and energy costs of deep learning applications become much more important as their popularity grows. The Constant Matrix Multi plication has been studied for a long time and takes place in deep learning applications. Reducing the computation cost of those applications is a highly active research topic. The weights are pruned or quantized while satisfying the desired accuracy requirement. The pruned matrices are compressed into one-dimensional arrays without data loss. Matrix multiplication is performed by processing those arrays without decompression. Processing one-dimensional arrays to perform matrix multiplication is deployed on vari ous hardware platforms that employ Central Processing Unit, Graphics Processor Unit and Field-Programmable Gate Array. The deployments can also be supported with common subexpression elimination methods to reduce the number of multiplications, additions and storage size. However, the state-of-the-art methods do not scale well for the large constant matrices as they reach hours for extracting common subexpressions in a 200 × 200 matrix. In this thesis, a random search-based common subexpression elimination method is constructed to reduce the run-time of the algorithm. The algo rithm produces an adder tree for a 1000 × 1000 matrix in a minute. The Compressed Sparse Row format is extended to build a one-dimensional compression notation for the proposed method. Simulations for a single-core embedded system show that the latency is reduced by 80% for a given 100×100 matrix compared to the state-of-the- art methods. The storage size of the sparse matrices is also reduced by more than half in the experiments compared to the Compressed Sparse Row format.