Abstract
Devices with limited computing resources use small AI models to achieve low-latency inference. However, the accuracy of these small models is typically much lower than that of a bigger model trained and deployed where computing resources are relatively abundant. We describe DyCo, a novel system that preserves the privacy of stream data and dynamically improves the accuracy of the small models used on devices. Unlike knowledge distillation or federated learning, DyCo treats AI models as black boxes. It uses a semi-supervised approach, leveraging existing training frameworks and network model architectures, to periodically train contextualized, smaller models for resource-constrained devices. DyCo uses a bigger, highly accurate model in the edge-cloud to auto-label the data received from each sensor stream. Training in the edge-cloud (as opposed to the public cloud) ensures data privacy, and bespoke models for thousands of live data streams can be designed in parallel by using multiple edge-clouds. DyCo uses the auto-labeled data to periodically re-train stream-specific, bespoke small models. To reduce the periodic training costs, DyCo uses different policies based on stride, accuracy, and confidence information.
We evaluate our system, and the contextualized models, using two object detection models (for vehicles and for people) and two datasets: a public benchmark and a real-world proprietary dataset. Our results show that DyCo increases the mAP of small models by an average of 16.3% (and up to 20%) on the public benchmark, and by an average of 19.0% (and up to 64.9%) on the real-world dataset. DyCo also decreases the training costs for contextualized models by more than an order of magnitude.
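The auto-label-and-retrain loop described above can be sketched as follows. This is an illustrative assumption, not DyCo's actual code: the function names, the dictionary-based detection format, and the toy stand-in "teacher" model are all hypothetical, and the two policy checks merely approximate the stride- and accuracy-based policies the abstract mentions.

```python
def auto_label(frames, teacher, conf_threshold=0.5):
    """Pseudo-label stream frames with the big edge-cloud model,
    keeping only detections above a confidence threshold."""
    labeled = []
    for frame in frames:
        dets = [d for d in teacher(frame) if d["score"] >= conf_threshold]
        if dets:
            labeled.append((frame, dets))
    return labeled


def should_retrain(step, stride, small_model_map, map_floor):
    """Combine two retraining policies: retrain the small model every
    `stride` steps, or sooner if its measured accuracy drops too low."""
    return step % stride == 0 or small_model_map < map_floor


# Toy stand-in for the big model: one detection whose confidence
# grows with the (fake) frame id, just to exercise the threshold.
def toy_teacher(frame):
    return [{"box": (0, 0, 10, 10), "cls": "vehicle", "score": frame / 10.0}]


frames = list(range(1, 11))             # pretend frame ids 1..10
data = auto_label(frames, toy_teacher)  # only high-confidence frames kept
print(len(data))                                                              # 6
print(should_retrain(step=6, stride=3, small_model_map=0.8, map_floor=0.5))   # True
print(should_retrain(step=7, stride=3, small_model_map=0.8, map_floor=0.5))   # False
```

The confidence threshold doubles as the third (confidence-based) policy lever: raising it trades label coverage for label quality in the periodically re-trained small model.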
DyCo: Dynamic, Contextualized AI Models