Transmitting updates of high-dimensional neural network (NN) models between client IoT devices and the central aggregating server is a persistent bottleneck in collaborative learning, especially in unreliable real-world IoT networks where congestion, latency, and bandwidth constraints are common. In this setting, gradient quantization is an effective way to reduce the number of bits transmitted per model update, but it comes at the cost of an elevated error floor caused by the higher variance of the stochastic gradients. In this paper, we propose ElastiCL, an elastic quantization strategy that achieves both transmission efficiency and a low error floor by dynamically altering the number of quantization levels during training on distributed IoT devices. Experiments on training ResNet-18 and a vanilla CNN show that ElastiCL converges with far fewer transmitted bits than setups with a fixed number of quantization levels, with little or no compromise in training and test accuracy.
The paper was published at ACM SenSys 2021.
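
The trade-off described in the abstract, fewer bits per update against higher gradient variance, can be illustrated with a generic unbiased stochastic quantizer. The sketch below is an assumption for illustration only: it uses a QSGD-style uniform quantizer and a naive fixed-length bit count, and the names `stochastic_quantize` and `bits_per_coordinate` are hypothetical. It is not ElastiCL's quantizer, and it does not reproduce the rule ElastiCL uses to change the number of levels, which the abstract does not specify.

```python
import numpy as np


def stochastic_quantize(grad, levels, rng):
    """Unbiased stochastic quantization onto `levels` uniform steps of ||grad||.

    Illustrative QSGD-style quantizer (an assumption, not the paper's method).
    Each |g_i| / ||g|| is randomly rounded to a neighbouring multiple of
    1 / levels so that the expectation of the output equals `grad`.
    """
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return np.zeros_like(grad)
    scaled = np.abs(grad) / norm * levels        # values in [0, levels]
    lower = np.floor(scaled)
    prob_up = scaled - lower                     # probability of rounding up
    rounded = lower + (rng.random(grad.shape) < prob_up)
    return np.sign(grad) * rounded * norm / levels


def bits_per_coordinate(levels):
    """Naive fixed-length cost: one sign bit plus the level index."""
    return 1 + int(np.ceil(np.log2(levels + 1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad = rng.standard_normal(10_000)           # stand-in for a model update
    for levels in (1, 3, 15, 255):
        q = stochastic_quantize(grad, levels, rng)
        mse = float(np.mean((q - grad) ** 2))
        print(levels, bits_per_coordinate(levels), mse)
```

Running the script prints, for each number of levels, the per-coordinate bit cost and the mean squared quantization error; the error shrinks as the number of levels grows while the bit cost rises. An elastic strategy would vary `levels` over the course of training rather than fixing it in advance.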