Disassembly of Executable Code Revisited
Benjamin Schwarz,
Saumya Debray,
Gregory Andrews
Department of Computer Science
University of Arizona
Tucson, AZ 85721, U.S.A.
Abstract
Machine code disassembly routines form a fundamental component of software
systems that statically analyze or modify executable programs. The task of
disassembly is complicated by indirect jumps and by
the presence of non-executable data (jump
tables, alignment bytes, etc.) in the instruction stream. Existing
disassembly algorithms cannot always cope with
executable files containing such features and, when they fail, do so
silently: they produce incorrect disassemblies without any indication
that the results are incorrect. This can be a serious problem,
since it can compromise the correctness of a binary rewriting tool.
In this paper we examine two commonly used disassembly algorithms and illustrate
their shortcomings. We propose a hybrid approach that performs better
than these algorithms in the sense that it is able to detect situations
where the disassembly may be incorrect and limit the
extent of such disassembly errors. Experimental results indicate that
the algorithm is quite effective: the amount of code flagged as incurring
disassembly errors is usually quite small.
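
To make the silent-failure problem concrete, the following is a minimal sketch of how data embedded in the instruction stream can mislead a straight-line decoder (often called a linear sweep). The instruction set, its opcodes, and the names TOY_ISA, linear_sweep, and TRUE_INSTRUCTIONS are hypothetical and chosen only for illustration; this is not the hybrid algorithm proposed in the paper, nor a model of any real architecture.

```python
# Toy variable-length instruction set (hypothetical, for illustration only):
# opcode -> (mnemonic, total instruction length in bytes)
TOY_ISA = {
    0x01: ("nop", 1),
    0x02: ("jmp", 2),   # opcode + 1-byte forward displacement
    0x03: ("load", 3),  # opcode + 2-byte immediate
    0xFF: ("ret", 1),
}


def linear_sweep(code):
    """Decode bytes one after another, assuming every byte is code."""
    decoded = []
    pc = 0
    while pc < len(code):
        # Unknown opcodes would be reported as "(bad)" and skipped.
        mnemonic, length = TOY_ISA.get(code[pc], ("(bad)", 1))
        decoded.append((pc, mnemonic))
        pc += length
    return decoded


# Layout of the toy program:
#   offset 0: jmp +1      (bytes 02 01), jumps over the data byte
#   offset 2: data byte 03 (not code, but it looks like a "load" opcode)
#   offset 3: nop
#   offset 4: nop
#   offset 5: ret
program = bytes([0x02, 0x01, 0x03, 0x01, 0x01, 0xFF])

# What a correct disassembly would report (known here by construction).
TRUE_INSTRUCTIONS = [(0, "jmp"), (3, "nop"), (4, "nop"), (5, "ret")]

swept = linear_sweep(program)
for offset, mnemonic in swept:
    print(offset, mnemonic)
```

In this sketch the data byte at offset 2 is decoded as a three-byte load, which swallows the two real nop instructions that follow it. Every byte still decodes to a valid opcode, so the sweep never reports an error, which is the kind of undetected mistake the abstract refers to.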