DSI: Automated detection of dynamic data structures in C programs and binary code
|Author(s):||Rupprecht, Thomas; Boockmann, Jan; White, David H.; Lüttgen, Gerald|
|Editors:||Amme, Wolfram; Heinze, Thomas|
|Title of the compilation:||Programmiersprachen und Grundlagen der Programmierung : 19. Kolloquium, KPS 2017 ; Weimar, 25.–27. September 2017 ; Tagungsband|
|Publisher Information:||Weimar : Friedrich-Schiller-Universität Jena|
|Year of publication:||2017|
|Series ; Volume:||Jenaer Schriften zur Mathematik und Informatik ; Math/Inf/02/2017|
Program comprehension is an important task for software engineers who maintain legacy code, as well as for reverse engineers who analyse binary executables such as malware. Detecting dynamic, i.e., pointer-based, data structures is particularly challenging because of the complex pointer usage found in real-world software.
This paper presents the key results of the DFG-funded project "Learning Data Structure Behaviour from Executions of Pointer Programs" (DSI), in which dynamic analysis techniques have been developed to identify dynamic data structures in C programs and x86 binary code. DSI's analysis uses a novel memory abstraction that compactly describes pointer-based data structures, such as linked lists and binary trees, together with their interconnections, such as parent-child nesting. On top of this abstraction, an evidence-collecting approach derives a natural-language description of the observed data structure with the help of a systematic taxonomy. The inferred data structure information is helpful not only for program comprehension but also for other use cases, including software verification and software visualization.
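To make the notion of parent-child nesting concrete, the following minimal C sketch builds the kind of heap shape the abstract refers to: a doubly linked list of parent nodes, each of which owns a singly linked list of child nodes. It is an illustration only and is not taken from the paper or from DSI; the type and function names (`struct parent`, `struct child`, `build_nested`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Child element: node of a singly linked list (hypothetical type). */
struct child {
    int value;
    struct child *next;
};

/* Parent element: node of a doubly linked list that also points to the
 * head of its own child list. This pointer is the parent-child nesting. */
struct parent {
    struct parent *prev;
    struct parent *next;
    struct child *children;
};

/* Build n_parents parents, each with up to n_children children.
 * Construction stops early if an allocation fails. */
struct parent *build_nested(int n_parents, int n_children) {
    assert(n_parents >= 0 && n_children >= 0);
    struct parent *head = NULL, *tail = NULL;
    for (int i = 0; i < n_parents; i++) {
        struct parent *p = malloc(sizeof *p);
        if (p == NULL) break;
        p->prev = tail;
        p->next = NULL;
        p->children = NULL;
        for (int j = 0; j < n_children; j++) {
            struct child *c = malloc(sizeof *c);
            if (c == NULL) break;
            c->value = j;
            c->next = p->children;   /* prepend to the child list */
            p->children = c;
        }
        if (tail != NULL) tail->next = p; else head = p;
        tail = p;
    }
    return head;
}

/* Number of nodes in the parent list. */
size_t count_parents(const struct parent *head) {
    size_t n = 0;
    for (; head != NULL; head = head->next) n++;
    return n;
}

/* Total number of child nodes reachable from all parents. */
size_t count_children(const struct parent *head) {
    size_t n = 0;
    for (; head != NULL; head = head->next)
        for (const struct child *c = head->children; c != NULL; c = c->next)
            n++;
    return n;
}

/* Check the doubly-linked invariant: p->next->prev == p for every node. */
int dll_is_consistent(const struct parent *head) {
    if (head != NULL && head->prev != NULL) return 0;
    for (; head != NULL; head = head->next)
        if (head->next != NULL && head->next->prev != head) return 0;
    return 1;
}

/* Release every child list and every parent node. */
void free_nested(struct parent *head) {
    while (head != NULL) {
        struct parent *next_parent = head->next;
        for (struct child *c = head->children; c != NULL; ) {
            struct child *next_child = c->next;
            free(c);
            c = next_child;
        }
        free(head);
        head = next_parent;
    }
}
```

Assuming no allocation fails, `build_nested(3, 2)` yields three parent nodes and six child nodes in total. The forward and backward links between parents and the per-parent `children` pointer are the sort of pointer relationships that a dynamic analysis would have to observe at run time in order to describe the structure.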
|Release Date:||20. November 2018|