Top 3 Arxiv Papers Today in Hardware Architecture


2.006 Mikeys
#1. NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units
Bongjoon Hyun, Youngeun Kwon, Yujeong Choi, John Kim, Minsoo Rhu
To satisfy the compute and memory demands of deep neural networks, neural processing units (NPUs) are widely being utilized for accelerating deep learning algorithms. Similar to how GPUs have evolved from a slave device into a mainstream processor architecture, it is likely that NPUs will become first class citizens in this fast-evolving heterogeneous architecture space. This paper makes a case for enabling address translation in NPUs to decouple the virtual and physical memory address space. Through a careful data-driven application characterization study, we root-cause several limitations of prior GPU-centric address translation schemes and propose a memory management unit (MMU) that is tailored for NPUs. Compared to an oracular MMU design point, our proposal incurs only an average 0.06% performance overhead.
more | pdf | html
Figures
Tweets
arxiv_org: NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units. https://t.co/FJiBVKdYcY https://t.co/RCkPNZjycJ
BrundageBot: NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units. Bongjoon Hyun, Youngeun Kwon, Yujeong Choi, John Kim, and Minsoo Rhu https://t.co/tVRzx2GYvw
Underfox3: KAIST researchers proposed a throughput-centric NPU MMU design that effectively handles high burst of translation requests, achieving only an average 0.06% performance overhead while allowing CPUs and NPUs to share a unified global address space. https://t.co/j4DZ8Sd5WN https://t.co/F6wCYnM6vL
arxivml: "NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units", Bongjoon Hyun, Young… https://t.co/hZOTLtFFf8
Memoirs: NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units. https://t.co/gBkckjJq5L
ElectronNest: RT @arxiv_org: NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units. https://t.co/FJiBVKdYcY https:/…
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 5
Total Words: 13061
Unqiue Words: 3506

2.002 Mikeys
#2. Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores
Fabian Schuiki, Florian Zaruba, Torsten Hoefler, Luca Benini
Single-issue processor cores are very energy efficient but suffer from the von Neumann bottleneck, in that they must explicitly fetch and issue the loads/storse necessary to feed their ALU/FPU. Each instruction spent on moving data is a cycle not spent on computation, limiting ALU/FPU utilization to 33% on reductions. We propose "Stream Semantic Registers" to boost utilization and increase energy efficiency. SSR is a lightweight, non-invasive RISC-V ISA extension which implicitly encodes memory accesses as register reads/writes, eliminating a large number of loads/stores. We implement the proposed extension in the RTL of an existing multi-core cluster and synthesize the design for a modern 22nm technology. Our extension provides a significant, 2x to 5x, architectural speedup across different kernels at a small 11% increase in core area. Sequential code runs 3x faster on a single core, and 3x fewer cores are needed in a cluster to achieve the same performance. The utilization increase to almost 100% in leads to a 2x energy...
more | pdf | html
Figures
Tweets
Underfox3: Researchers have proposed Stream Semantic Registers: A lightweight, non-invasive RISC-V ISA extension which implicitly encodes memory accesses as register reads/writes, eliminating a large number of loads/stores. #RISCV https://t.co/35247gJBFW https://t.co/T38eqDDJhI
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 13835
Unqiue Words: 3569

2.001 Mikeys
#3. FPGA Energy Efficiency by Leveraging Thermal Margin
Behnam Khaleghi, Sahand Salamat, Mohsen Imani, Tajana Rosing
Cutting edge FPGAs are not energy efficient as conventionally presumed to be, and therefore, aggressive power-saving techniques have become imperative. The clock rate of an FPGA-mapped design is set based on worst-case conditions to ensure reliable operation under all circumstances. This usually leaves a considerable timing margin that can be exploited to reduce power consumption by scaling voltage without lowering clock frequency. There are hurdles for such opportunistic voltage scaling in FPGAs because (a) critical paths change with designs, making timing evaluation difficult as voltage changes, (b) each FPGA resource has particular power-delay trade-off with voltage, (c) data corruption of configuration cells and memory blocks further hampers voltage scaling. In this paper, we propose a systematical approach to leverage the available thermal headroom of FPGA-mapped designs for power and energy improvement. By comprehensively analyzing the timing and power consumption of FPGA building blocks under varying temperatures and...
more | pdf | html
Figures
Tweets
Underfox3: In this paper is proposed a power and energy optimization techniques for FPGAs using a thermal-aware voltage scaling, showing up to 36% power reduction with the same performance, and 66% total energy saving when energy is the optimization target. https://t.co/mUEl7Pklmy https://t.co/wqeZ1uHq3I
Github
None.
Youtube
None.
Other stats
Sample Sizes : None.
Authors: 4
Total Words: 9645
Unqiue Words: 2963

About

Assert is a website where the best academic papers on arXiv (computer science, math, physics), bioRxiv (biology), BITSS (reproducibility), EarthArXiv (earth science), engrXiv (engineering), LawArXiv (law), PsyArXiv (psychology), SocArXiv (social science), and SportRxiv (sport research) bubble to the top each day.

Papers are scored (in real-time) based on how verifiable they are (as determined by their Github repos) and how interesting they are (based on Twitter).

To see top papers, follow us on twitter @assertpub_ (arXiv), @assert_pub (bioRxiv), and @assertpub_dev (everything else).

To see beautiful figures extracted from papers, follow us on Instagram.

Tracking 225,721 papers.

Search
Sort results based on if they are interesting or reproducible.
Interesting
Reproducible
Categories
All
Astrophysics
Cosmology and Nongalactic Astrophysics
Earth and Planetary Astrophysics
Astrophysics of Galaxies
High Energy Astrophysical Phenomena
Instrumentation and Methods for Astrophysics
Solar and Stellar Astrophysics
Condensed Matter
Disordered Systems and Neural Networks
Mesoscale and Nanoscale Physics
Materials Science
Other Condensed Matter
Quantum Gases
Soft Condensed Matter
Statistical Mechanics
Strongly Correlated Electrons
Superconductivity
Computer Science
Artificial Intelligence
Hardware Architecture
Computational Complexity
Computational Engineering, Finance, and Science
Computational Geometry
Computation and Language
Cryptography and Security
Computer Vision and Pattern Recognition
Computers and Society
Databases
Distributed, Parallel, and Cluster Computing
Digital Libraries
Discrete Mathematics
Data Structures and Algorithms
Emerging Technologies
Formal Languages and Automata Theory
General Literature
Graphics
Computer Science and Game Theory
Human-Computer Interaction
Information Retrieval
Information Theory
Machine Learning
Logic in Computer Science
Multiagent Systems
Multimedia
Mathematical Software
Numerical Analysis
Neural and Evolutionary Computing
Networking and Internet Architecture
Other Computer Science
Operating Systems
Performance
Programming Languages
Robotics
Symbolic Computation
Sound
Software Engineering
Social and Information Networks
Systems and Control
Economics
Econometrics
General Economics
Theoretical Economics
Electrical Engineering and Systems Science
Audio and Speech Processing
Image and Video Processing
Signal Processing
General Relativity and Quantum Cosmology
General Relativity and Quantum Cosmology
High Energy Physics - Experiment
High Energy Physics - Experiment
High Energy Physics - Lattice
High Energy Physics - Lattice
High Energy Physics - Phenomenology
High Energy Physics - Phenomenology
High Energy Physics - Theory
High Energy Physics - Theory
Mathematics
Commutative Algebra
Algebraic Geometry
Analysis of PDEs
Algebraic Topology
Classical Analysis and ODEs
Combinatorics
Category Theory
Complex Variables
Differential Geometry
Dynamical Systems
Functional Analysis
General Mathematics
General Topology
Group Theory
Geometric Topology
History and Overview
Information Theory
K-Theory and Homology
Logic
Metric Geometry
Mathematical Physics
Numerical Analysis
Number Theory
Operator Algebras
Optimization and Control
Probability
Quantum Algebra
Rings and Algebras
Representation Theory
Symplectic Geometry
Spectral Theory
Statistics Theory
Mathematical Physics
Mathematical Physics
Nonlinear Sciences
Adaptation and Self-Organizing Systems
Chaotic Dynamics
Cellular Automata and Lattice Gases
Pattern Formation and Solitons
Exactly Solvable and Integrable Systems
Nuclear Experiment
Nuclear Experiment
Nuclear Theory
Nuclear Theory
Physics
Accelerator Physics
Atmospheric and Oceanic Physics
Applied Physics
Atomic and Molecular Clusters
Atomic Physics
Biological Physics
Chemical Physics
Classical Physics
Computational Physics
Data Analysis, Statistics and Probability
Physics Education
Fluid Dynamics
General Physics
Geophysics
History and Philosophy of Physics
Instrumentation and Detectors
Medical Physics
Optics
Plasma Physics
Popular Physics
Physics and Society
Space Physics
Quantitative Biology
Biomolecules
Cell Behavior
Genomics
Molecular Networks
Neurons and Cognition
Other Quantitative Biology
Populations and Evolution
Quantitative Methods
Subcellular Processes
Tissues and Organs
Quantitative Finance
Computational Finance
Economics
General Finance
Mathematical Finance
Portfolio Management
Pricing of Securities
Risk Management
Statistical Finance
Trading and Market Microstructure
Quantum Physics
Quantum Physics
Statistics
Applications
Computation
Methodology
Machine Learning
Other Statistics
Statistics Theory
Feedback
Online
Stats
Tracking 225,721 papers.