arxiv:2310.05218

Accelerating Machine Learning Primitives on Commodity Hardware

Published on Oct 8, 2023

Abstract

Sliding Window Sum algorithms have been used successfully for training and inference of Deep Neural Networks. We have previously shown how both 1-D pooling and convolution primitives can be expressed as sliding sums and evaluated by compute kernels with a shared structure. In this paper, we present an extensive study of the Sliding Window convolution technique as a more efficient alternative to the General Matrix Multiplication (GEMM) based convolution commonly used in Deep Neural Networks (DNNs). The Sliding Window technique addresses the memory-bloat problem of GEMM-based convolution and demonstrates a significant speedup in 2-D convolution. We explore the performance of this technique across a range of implementations, including custom kernels for specific filter sizes. Our results suggest that Sliding Window compute kernels can outperform GEMM-based convolution on a CPU and even on dedicated hardware accelerators. This could promote a wider adoption of AI on low-power and low-memory devices without the need for specialized hardware. We also discuss the compatibility of model compression methods and optimized network architectures with the Sliding Window technique, encouraging further research in these areas.
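
To make the idea concrete, here is a minimal sketch in C of a direct 1-D convolution evaluated as a sliding window sum. It is purely illustrative, not the paper's actual kernel, and all names in it are hypothetical. Unlike GEMM-based (im2col) convolution, which first materializes a copy of the input roughly filter-size times larger, the window here slides over the input in place:

```c
#include <stdio.h>

/* Hypothetical illustration, not the paper's kernel: a direct 1-D
 * convolution computed as a sliding window sum. No im2col buffer
 * (roughly k times the input size) is ever materialized. */
static void conv1d_sliding(const float *x, int n,
                           const float *w, int k,
                           float *y /* length n - k + 1 */)
{
    for (int i = 0; i <= n - k; ++i) {
        float acc = 0.0f;
        for (int j = 0; j < k; ++j)
            acc += w[j] * x[i + j]; /* window starting at x[i] */
        y[i] = acc;
    }
}

int main(void)
{
    const float x[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    const float w[3] = {0.25f, 0.5f, 0.25f};
    float y[6];

    conv1d_sliding(x, 8, w, 3, y);
    for (int i = 0; i < 6; ++i)
        printf("%g ", y[i]); /* prints: 2 3 4 5 6 7 */
    printf("\n");
    return 0;
}
```

An im2col-based alternative would first copy each length-3 window into a row of a 6x3 matrix and then multiply that matrix by the weight vector via a GEMM call; the sliding-window formulation skips that copy entirely, which is the memory-bloat issue the abstract refers to.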
