PLDI 2024
Mon 24 - Fri 28 June 2024 Copenhagen, Denmark
Mon 24 Jun 2024 10:40 - 10:55 at Iceland - Optimization Chair(s): Aviral Shrivastava

User-mode Dynamic Binary Translation (DBT) has recently received renewed interest, not least due to Apple’s transition towards the Arm ISA, supported by a DBT compatibility layer for x86 legacy applications. While receiving praise for its performance, execution of legacy applications through Apple’s Rosetta 2 technology still incurs a performance penalty when compared to direct host execution. A particular limitation of Rosetta 2 is that code is either executed exclusively as native Arm code, or as translated Arm code. In particular, mixed mode execution of native Arm code and translated code is not possible. This is a missed opportunity, especially in the case of shared libraries where both optimized x86 and Arm versions of the same library are available. In this paper, we develop mixed mode execution capabilities for shared libraries in a DBT system, eliminating the need to translate code where a highly optimised native version already exists. Our novel execution model intercepts calls to shared library functions in the DBT system and automatically redirects them to their faster host counterparts, making better use of the underlying host ISA. To ease the burden for the developer, we make use of an Interface Description Language (IDL) to capture library function signatures, from which relevant stubs and data marshalling code are generated automatically. We have implemented our novel mixed mode execution approach in the open-source QEMU DBT system, and demonstrate both ease of use and performance benefits for three popular libraries (standard C Math library, SQLite, and OpenSSL). Our evaluation confirms that with minimal developer effort, accelerated host execution of shared library functionality results in speedups between 2.7x and 6.3x on average, and up to 28x for x86 legacy applications on an Arm host system.
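The call redirection described in the abstract can be illustrated with a toy model. The sketch below is not the authors' QEMU implementation. It assumes a hypothetical signature table standing in for the IDL, a hypothetical registry of host-native functions, and guest arguments passed as little-endian bytes. All names (`SIGNATURES`, `HOST_NATIVE`, `call_guest_function`, `translate_and_run`) are invented for illustration.

```python
import math
import struct

# Hypothetical IDL-style signature table: symbol -> (argument formats, return format).
# The struct format codes stand in for the types an IDL would declare.
SIGNATURES = {
    "sqrt": ("d", "d"),
    "pow": ("dd", "d"),
}

# Hypothetical registry of host-native implementations, keyed by symbol name.
HOST_NATIVE = {
    "sqrt": math.sqrt,
    "pow": math.pow,
}


def translate_and_run(symbol, raw_args):
    """Stand-in for the slow path: executing the guest library under translation."""
    raise NotImplementedError(f"no native version of {symbol}; guest code would be translated")


def call_guest_function(symbol, raw_args):
    """Dispatch a guest call, preferring a host-native implementation if one is registered."""
    if symbol not in HOST_NATIVE:
        return translate_and_run(symbol, raw_args)
    arg_fmt, ret_fmt = SIGNATURES[symbol]
    # Unmarshal guest arguments (little-endian, as on x86) into host values.
    args = struct.unpack("<" + arg_fmt, raw_args)
    result = HOST_NATIVE[symbol](*args)
    # Marshal the result back into the guest's byte representation.
    return struct.pack("<" + ret_fmt, result)


# Usage: a guest call to pow(2.0, 10.0) is served by the host's math library.
guest_args = struct.pack("<dd", 2.0, 10.0)
guest_result = call_guest_function("pow", guest_args)
print(struct.unpack("<d", guest_result)[0])
```

In this model, the generated stubs correspond to the unpack and pack steps derived from the signature table. A real DBT would also need to handle pointers, structures, and callbacks, which this sketch omits.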

Mon 24 Jun

Displayed time zone: Windhoek

10:40 - 12:20
Optimization (LCTES) at Iceland
Chair(s): Aviral Shrivastava Arizona State University
10:40
15m
Talk
Accelerating Shared Library Execution in a DBT
LCTES
Tom Spink University of St Andrews, Björn Franke University of Edinburgh
10:55
15m
Talk
Efficient Implementation of Neural Networks Usual Layers on Fixed-Point Architectures
LCTES
Dorra Ben Khalifa University of Toulouse - ENAC, Matthieu Martel Université de Perpignan Via Domitia
11:10
15m
Talk
TinySeg: Model Optimizing Framework for Image Segmentation on Tiny Embedded Systems
LCTES
Byungchul Chae Kyung Hee University, Jiae Kim Kyung Hee University, Seonyeong Heo Kyung Hee University
11:25
10m
Break
Break - 10 minutes
LCTES

11:35
15m
Talk
MixPert: Optimizing Mixed-Precision Floating-Point Emulation on GPU Integer Tensor Cores
LCTES
Zejia Lin Sun Yat-sen University, Aoyuan Sun Sun Yat-sen University, Xianwei Zhang Sun Yat-sen University, Yutong Lu Sun Yat-sen University
11:50
15m
Talk
Optimistic and Scalable Global Function Merging
LCTES
12:05
15m
Talk
(Invited paper) Language-Based Deployment Optimization for Random Forest
LCTES
Jannik Malcher TU Dortmund University, Daniel Biebert TU Dortmund University, Kuan-Hsun Chen University of Twente, Sebastian Buschjäger TU Dortmund University, Christian Hakert TU Dortmund University, Jian-Jia Chen TU Dortmund University