Predicate Analysis and If-Conversion in an Itanium Link-Time Optimizer
Noah Snavely,
Saumya Debray,
Gregory Andrews
Department of Computer Science
University of Arizona
Tucson, AZ 85721, U.S.A.
Abstract
EPIC architectures, such as the Intel IA-64 (Itanium),
combine explicit instruction-level parallelism with instruction predication.
Generating efficient code for such architectures requires effective use
of predication. In particular, conditional branches and multiple code
blocks should be replaced by single, branch-free code blocks when doing
so would lead to faster code. This process, known as if-conversion,
is generally carried out early in the code-generation process; hence
subsequent analyses and optimizations have to deal with predicated code.
This paper examines an alternative approach in which code is unpredicated
during disassembly, the internal representations are virtually identical
to those used for a conventional architecture (specifically the IA-32 Pentium),
and if-conversion is done late in the compilation process,
at the same time as instruction scheduling and just before code layout.
This paper also presents new algorithms for analyzing predicated code
and evaluates their efficacy.
We show that our approach is able to produce code that
is denser (fewer nop instructions) and almost as fast as the best code
produced by the Intel ecc compiler on the SPECint-2000 benchmark suite.
On the same programs, our predicate analysis and if-conversion algorithms
yield an average speed improvement of slightly more than 4% over the
best code produced by the gcc compiler.
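
As an illustration of the if-conversion transformation described above, the following C sketch contrasts a branching computation with a branch-free equivalent. It is not taken from the paper's optimizer: the function names abs_diff_branch and abs_diff_predicated are hypothetical, the predicate is modeled as an ordinary integer flag, and the IA-64 mnemonics in the comments are only an approximate guide to what predicated code might look like.

```c
/* Branching form: a conditional branch selects one of two code blocks. */
int abs_diff_branch(int a, int b) {
    int r;
    if (a > b) {
        r = a - b;   /* "then" block */
    } else {
        r = b - a;   /* "else" block */
    }
    return r;
}

/* If-converted form (illustrative): both arms are computed, each guarded
   by one of two complementary predicates, and no branch is needed.
   Here the predicate p is modeled as 0 or 1 (an assumption of this sketch). */
int abs_diff_predicated(int a, int b) {
    int p  = (a > b);   /* roughly: cmp.gt p1, p2 = a, b      */
    int t1 = a - b;     /* roughly: (p1) sub r = a, b         */
    int t2 = b - a;     /* roughly: (p2) sub r = b, a         */
    /* Select the arm whose predicate is true, using arithmetic only. */
    return p * t1 + (1 - p) * t2;
}
```

Both functions compute the same value for inputs whose difference fits in an int. Whether the branch-free form is actually faster depends on factors such as branch predictability and available issue slots, which is why the decision is best made with scheduling information at hand.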