Figures and Tables from this paper
- figure 1
- table 1
- figure 2
- table 2
- figure 3
- figure 4
- table 4
- figure 5
- table 5
- figure 6
Topics
Single Instruction Multiple Data, Caffe, TensorFlow, Deep Convolutional Neural Networks, Graphical Processing Units, Torch, Convolutional Neural Network, Neural Network, Multiply-accumulate, Hardware Resources
29 Citations
- Agathe Archet, Nicolas Ventroux, Nicolas Gac, François Orieux
- 2023
Computer Science, Engineering
2023 26th Euromicro Conference on Digital System…
This paper studies deep neural network design and inference options for each accelerator in heterogeneous systems-on-chip, and formulates guidelines for making the best use of the computing and energy-efficiency capabilities published by manufacturers with the default TensorRT mapping.
- Stephan Holly, Alexander Wendt, Martin Lechner
- 2020
Computer Science, Engineering
2020 11th International Green and Sustainable…
This work provides a measurement base for power estimation on NVIDIA Jetson devices, and analyzes the effects of different CPU and GPU settings on power consumption, latency, and energy for complete DNNs as well as for individual layers.
- Chunrong Yao, Wantao Liu, Wei Jiang
- 2021
Computer Science, Engineering
Concurr. Comput. Pract. Exp.
This paper conducts a comprehensive study on the model-level and layer-level energy efficiency of popular CNN models and proposes a revenue model to allow an optimal trade-off between energy efficiency and latency.
- Crefeda Faviola Rodrigues, G. Riley, M. Luján
- 2020
Computer Science, Engineering
ArXiv
This work provides a comprehensive analysis of building regression-based predictive models for deep learning on mobile devices, based on empirical measurements gathered from the SyNERGY framework, and shows that simple layer-type features achieve a model complexity of 4 to 32 times less for convolutional layer predictions for a similar accuracy compared to predictive models using more complex features adopted by previous approaches.
- Charles Edison Tripp, J. Perr-Sauer, Erik A. Bensen
- 2024
Computer Science, Engineering
ArXiv
This work introduces the BUTTER-E dataset, an augmentation to the BUTTER Empirical Deep Learning dataset, containing energy consumption and performance data from 63,527 individual experimental runs spanning 30,582 distinct configurations, and proposes a straightforward and effective energy model that accounts for network size, computing, and memory hierarchy.
- Ramyad Hadidi, Jiashen Cao, Yilun Xie, Bahar Asgari, T. Krishna, Hyesoon Kim
- 2019
Computer Science, Engineering
2019 IEEE International Symposium on Workload…
This paper characterizes several commercial edge devices on popular frameworks using well-known convolution neural networks (CNNs), a type of DNN, and analyzes the impact of frameworks, their software stack, and their implemented optimizations on the final performance.
- Radosvet Desislavov, Fernando Martínez-Plumed, José Hernández-Orallo
- 2023
Computer Science, Engineering
Sustain. Comput. Informatics Syst.
- Chen Li, A. Tsourdos, Weisi Guo
- 2024
Engineering, Computer Science
IEEE Transactions on Artificial Intelligence
This article is the first to develop a bottom-up transistor operations (TOs) approach to expose the role of nonlinear activation functions and neural network structure and statistically model the energy scaling laws as opposed to absolute consumption values.
- Jiaju Ren, Zhiwen Yu, Tao Xing, Helei Cui, Yaxing Chen, Bin Guo
- 2023
Computer Science, Engineering
2023 IEEE Smart World Congress (SWC)
The design and implementation of EnergySense, an energy-efficient scheduling framework, is reported; it improves the computing power of low-power ubiquitous sensors and provides a longer life cycle for feature-map extraction.
- Jaron Fontaine, A. Shahid, Robbe Elsas, Amina Seferagić, I. Moerman, E. D. Poorter
- 2020
Computer Science, Engineering
2020 IEEE 92nd Vehicular Technology Conference…
This work proposes a deep learning solution using convolutional neural networks, cheap software-defined radios, and efficient embedded platforms such as NVIDIA's Jetson Nano to enable smart spectrum management without the need for expensive and power-hungry hardware.
...
...
21 References
- Crefeda Faviola Rodrigues, G. Riley, M. Luján
- 2017
Computer Science, Engineering
2017 IEEE International Symposium on Workload…
This work presents a novel evaluation framework for measuring energy and performance for deep neural networks using ARM's Streamline Performance Analyser integrated with standard deep learning frameworks such as Caffe and cuDNN v5.
- Xiaqing Li, Guangyan Zhang, H. Howie Huang, Zhufan Wang, Weimin Zheng
- 2016
Computer Science
2016 45th International Conference on Parallel…
A comprehensive comparison of these implementations of convolutional neural networks over a wide range of parameter configurations is conducted; potential performance bottlenecks are investigated, and a number of opportunities for further optimization are pointed out.
- Tien-Ju Yang, Yu-hsin Chen, V. Sze
- 2017
Computer Science, Engineering
2017 IEEE Conference on Computer Vision and…
This work proposes an energy-aware pruning algorithm for CNNs that directly uses the energy consumption of a CNN to guide the pruning process, and shows that reducing the number of target classes in AlexNet greatly decreases the number of weights but has a limited impact on energy consumption.
- Yangqing Jia, Evan Shelhamer, Trevor Darrell
- 2014
Computer Science
ACM Multimedia
Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.
- Robert Adolf, Saketh Rama, Brandon Reagen, Gu-Yeon Wei, D. Brooks
- 2016
Computer Science, Engineering
2016 IEEE International Symposium on Workload…
This paper assembles Fathom: a collection of eight archetypal deep learning workloads, ranging from the familiar deep convolutional neural network of Krizhevsky et al., to the more exotic memory networks from Facebook's AI research group, and focuses on understanding the fundamental performance characteristics of each model.
- A. Canziani, Adam Paszke, E. Culurciello
- 2016
Computer Science
ArXiv
This work presents a comprehensive analysis of important metrics in practical applications (accuracy, memory footprint, parameters, operation count, inference time, and power consumption) and argues that it provides a compelling set of information to help design and engineer efficient DNNs.
- M. Verhelst, Bert Moons
- 2017
Computer Science, Engineering
IEEE Solid-State Circuits Magazine
Evaluating powerful but large deep neural networks under power budgets in the milliwatt or even microwatt range requires a significant improvement in processing energy efficiency.
- Jinhua Tao, Zidong Du, Tianshi Chen
- 2018
Computer Science, Engineering
Journal of Computer Science and Technology
This paper proposes BenchIP, a benchmark suite and benchmarking methodology for intelligence processors that is used to evaluate various hardware platforms, including CPUs, GPUs, and accelerators, and that will be open-sourced soon.
- Song Han, Huizi Mao, W. Dally
- 2016
Computer Science, Engineering
ICLR
This work introduces "deep compression", a three-stage pipeline (pruning, trained quantization, and Huffman coding) that works together to reduce the storage requirement of neural networks by 35x to 49x without affecting their accuracy.
- F. Iandola
- 2016
Computer Science, Engineering
ArXiv
This dissertation develops a methodology that enables systematic exploration of the design space of CNNs and develops an effective methodology for discovering the “right” CNN architectures to meet the needs of practical applications.
...
...