Accelerating Shared Library Execution in a DBT (LCTES 2024 - Languages, Compilers, Tools and Theory of Embedded Systems)

Who

Tom Spink, Björn Franke

Track

LCTES 2024

Time Zone

The program is currently displayed in (GMT+02:00) Windhoek.

Use conference time zone: (GMT+02:00) WindhoekSelect other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Mon 24 Jun 2024 10:40 - 10:55 at Iceland - Optimization Chair(s): Aviral Shrivastava

Abstract

User-mode Dynamic Binary Translation (DBT) has recently received renewed interest, not least due to Apple’s transition towards the Arm ISA, supported by a DBT compatibility layer for x86 legacy applications. While receiving praise for its performance, execution of legacy applications through Apple’s Rosetta 2 technology still incurs a performance penalty when compared to direct host execution. A particular limitation of Rosetta 2 is that code is either executed exclusively as native Arm code, or as translated Arm code. In particular, mixed mode execution of native Arm code and translated code is not possible. This is a missed opportunity, especially in the case of shared libraries where both optimized x86 and Arm versions of the same library are available. In this paper, we develop mixed mode execution capabilities for shared libraries in a DBT system, eliminating the need to translate code where a highly optimised native version already exists. Our novel execution model intercepts calls to shared library functions in the DBT system and automatically redirects them to their faster host counterparts, making better use of the underlying host ISA. To ease the burden for the developer, we make use of an Interface Description Language (IDL) to capture library function signatures, from which relevant stubs and data marshalling code are generated automatically. We have implemented our novel mixed mode execution approach in the open-source QEMU DBT system, and demonstrate both ease of use and performance benefits for three popular libraries (standard C Math library, SQLite, and OpenSSL). Our evaluation confirms that with minimal developer effort, accelerated host execution of shared library functionality results in speedups between 2.7x and 6.3x on average, and up to 28x for x86 legacy applications on an Arm host system.

Tom Spink

University of St Andrews

United Kingdom

Björn Franke

University of Edinburgh