Don’t Write, but Return: Replacing Output Parameters with Algebraic Data Types in C-to-Rust Translation
Translating legacy system programs from C to Rust is a promising way to enhance their reliability. To alleviate the burden of manual translation, automatic C-to-Rust translation is desirable. However, existing translators fail to generate Rust code fully utilizing Rust’s language features, including algebraic data types. In this work, we focus on tuples and Option
/Result
types, an important subset of algebraic data types. They are used as functions’ return types to represent those returning multiple values and those that may fail. Due to the absence of these types, C programs use output parameters, i.e., pointer-type parameters for producing outputs, to implement such functions. As output parameters make code less readable and more error-prone, their use is discouraged in Rust. To address this problem, this paper presents a technique for removing output parameters during C-to-Rust translation. This involves three steps: (1) syntactically translating C code to Rust using an existing translator; (2) analyzing the Rust code to extract information related to output parameters; and (3) transforming the Rust code using the analysis result. The second step poses several challenges, including the identification and classification of output parameters. To overcome these challenges, we propose a static analysis based on abstract interpretation, complemented by the notion of abstract read/write sets, which approximate the sets of read/written pointers, and two sensitivities: write set sensitivity and nullity sensitivity. Our evaluation shows that the proposed technique is (1) scalable, with the analysis and transformation of 190k LOC within 213 seconds, (2) useful, with the detection of 1,670 output parameters across 55 real-world C programs, and (3) mostly correct, with 25 out of 26 programs passing their test suites after the transformation.