

Home > Research > Compilers

Compiler transformations are applied under strict conditions to ensure the semantic equivalence of the code before and after a codeoptimizing transformation. The conditions for loop transformations are based on extensive loop code analysis to find linear and nonlinear induction variables, determine crossiteration data dependences, and other looprelated properties. Loop parallelization for multicore machines gives reasonable speedups, because most of the execution time of (scientific) applications is spent in loops. We are investigating new loop analysis techniques based on the Chains of Recurrences (CR) algebra. We applied the method to automatic loop parallelization, SIMD vectorization, and strength reduction. We are also investigating efficient methods for loop iteration space bounding, with applications to worstcase execution time (WCET) estimation.

Techniques for Fast Evaluation of Numerical Functions

Y. Shou and R. van Engelen, Automatic SIMD Vectorization of Chains of Recurrences, in the proceedings of the ACM International Conference on Supercomputing (ICS), 2008.
Many computational tasks require repeated evaluation of functions over structured grids, such as plotting in a coordinate system, rendering of parametric objects in 2D and 3D, numerical grid generation, and signal processing. In this paper, we present a method to speed up closedform function evaluations over grids by vectorizing Chains of Recurrences (CR) forms of these functions. Results show a significant performance increase of our method over Intel's SVML and scalar CRs, from doubling the execution speed to running an order of magnitude faster. An autotuning tool is developed for optimal performance and accuracy.
Data Dependence Analysis Techniques

J. Birch, R. van Engelen, K. Gallivan, and Y. Shou, An Empirical Evaluation of Chains of Recurrences for Array Dependence Testing, in the proceedings of to the ACM/IEEE International Conference on Parallel Architectures and Compilation Techniques (PACT), 2006, pages 295304.
This paper evaluates a new approach for fast and accurate nonlinear array dependence testing using Chains of Recurrences (CRs). We implemented a new CRbased Banerjee test in the Polaris compiler and compared the results to the Omega test and Range test on a set of SPEC and LAPACK Benchmark programs. The experimental results suggest that a CR enhancement can dramatically increase the effectiveness of a dependence test without a significant cost increase.
R. van Engelen, J. Birch, Y. Shou, B. Walsh, and K. Gallivan, A Unified Framework for Nonlinear Dependence Testing and Symbolic Analysis, in the proceedings of the ACM International Conference on Supercomputing (ICS), 2004, pages 106115.
This paper presents a unified approach for generalized induction variable recognition and substitution, pointer analysis, analysis of conditionally updated variables, value range analysis, array region analysis, and nonlinear dependence testing. The analysis techniques share a welldefined uniform approach based on the Chains of Recurrences algebra. The paper introduces a new set of analysis algorithms that accurately handle conditional control flow, pointer arithmetic, and nonlinear symbolic expressions in loops, which are known to be problematic for conventional restructuring compilers.
J. Birch, R. van Engelen, and K. Gallivan, Value Range Analysis of Conditionally Updated Variables and Pointers, in the proceedings of the workshop on Compilers for Parallel Computing (CPC), 2004, pages 265276.
This paper presents a method for value range analysis on conditionally updated induction variables and pointers which have no closedform functions. The value range method determines the memory access patterns of pointers and arrays indexed with conditionally updated induction variables. Important applications of this method are in array bounds analysis and data dependence testing.
R. van Engelen, J. Birch, and K. Gallivan, Array Data Dependence Testing with the Chains of Recurrences Algebra, in the proceedings of the IEEE International Workshop on Innovative Architectures for Future Generation HighPerformance Processors and Systems (IWIA), January 2004, pages 7081.
This paper presents an approach to dependence testing in the presence of nonlinear and nonclosed array index expressions and pointer references. A recurrence formulation is uded to enhance the accuracy of standard dependence algorithms such as the extreme value test and range test. Because the recurrence forms are easily converted to closed forms (when they exist), induction variable substitution and array recovery can be delayed until after the loop is analyzed.
See also an extended report of this paper:
R. van Engelen, J. Birch, Y. Shou, and K. Gallivan, Array Data Dependence Testing with the Chains of Recurrences Algebra, Technical Report TR041201, Departement of Computer Science, Florida State University, 2004.
Induction Variable Recognition and Variable Classification in Linear Time

Y. Shou, R. van Engelen, and J. Birch, FlowSensitive LoopVariant Variable Classification in Linear Time, in the proceedings of the International Workshop on Languages and Compilers for Parallel Computing (LCPC), 2007.
This paper presents an efficient algorithm for classifying generalized induction variables and more complicated flowsensitive loopvariant variables that have arbitrary conditional update patterns along multiple paths in a loop nest. Variables are recognized and translated into closedform functions, such as linear, polynomial, geometric, wraparound, periodic, and mixer functions. The remaining flowsensitive variables (those that have no closed forms) are bounded by tight bounding functions on their value sequences by bounds derived from our extensions of the Chains of Recurrences (CR#) algebra. The classification algorithm has a linear worstcase execution time in the size of the SSA region of a loop nest. Classification coverage and performance results for the SPEC2000 benchmarks are given and compared to other methods.
See also an extended report of this paper:
Y. Shou, R. van Engelen, and J. Birch, FlowSensitive LoopVariant Variable Classification in Linear Time, Technical Report TR071005, Departement of Computer Science, Florida State University, 2007.
Y. Shou, R. van Engelen, J. Birch, and K. Gallivan, Toward Efficient FlowSensitive Induction Variable Analysis and Dependence Testing for Loop Optimization, in the proceedings of the ACM SouthEast conference, 2006, pages 16 (best paper award).
This paper presents the CR# (CRsharp) algebra, which is a nontrivial extension of the Chains of Recurrences (CR) algebra. The CR# algebra is introduced, efficient CR#based algorithms presented, and several potential applications for compiler analysis are described.
See also an extended report on which this paper is based:
R. van Engelen, The CR# Algebra and its Application in Loop Analysis and Optimization, Technical Report TR041223, Department of Computer Science, Florida State University, 2004.
R. van Engelen and K. Gallivan, An Efficient Algorithm for PointertoArray Access Conversion for Compiling and Optimizing DSP Applications, in the proceedings of the 2001 International Workshop on Innovative Architectures for Future Generation HighPerformance Processors and Systems (IWIA 2001), 1819 January 2001, Maui, Hawaii, pages 8089.
In this paper, we present a novel algorithm for converting pointerbased code to code with explicit array accesses. The conversion enables a compiler to perform data flow analysis and loop optimizations on DSP codes that exhibit considerable levels of pointer arithmetic in C.
R. van Engelen, Efficient Symbolic Analysis for Optimizing Compilers, in the proceedings of the ETAPS International Conference on Compiler Construction (CC), 2001, LNCS 2027, pages 118132.
This paper introduces a generic algorithm for generalized induction variable substitution with the Chains of Recurrences (CR) algebra in compilers. Value sequences of polynomial, geometric, and mixed forms are recognized. The algorithm proceeds in three stages: (1) bottomup scan of a loop body to collect recurrence relations, (2) translation into CR forms with simplification using the CR algebra, and (3) translation to closedforms for induction variable substitution.
See also an extended report of this paper:
R. van Engelen, Symbolic Evaluation of Chains of Recurrences for Loop Optimization, Technical Report TR000102, Department of Computer Science, Florida State University, 2000.
Parametric Timing Estimation by Formula Construction and its Application to DVS

R. van Engelen, K. Gallivan, and B. Walsh, Parametric Timing Estimation With the NewtonGregory Formulae, Journal of Concurrency and Computation: Practice and Experience, Volume 18, Number 10, September 2006, pages 14341464.
This paper presents a formula construction method for parametric worstcase execution time (WCET) estimation of loops. The method determines a parametric bound on the iteration space size of loops with both affine and nonaffine loop bounds in an efficient manner using a formulation based on Newton–Gregory interpolating polynomials. Parametric WCET formulae can be derived for loops with symbolic bounds, nonrectangular loops, zerotrip loops, loops with multiple critical paths, and loops with nonunit strides
R. van Engelen, K. Gallivan, and B. Walsh, Tight Timing Estimation With the NewtonGregory Formulae, in the proceedings of the Compilers for Parallel Computing (CPC) workshop, January 810, 2003, Amsterdam, Netherlands, pages 321330.
This is a shorter conference paper of the journal article referenced above.
B. Walsh, R. van Engelen, K. Gallivan, J. Birch, and Y. Shou, Parametric IntraTask Dynamic Voltage Scheduling, in the proceedings of the IEEE PACT Workshop on Compilers and Operating Systems for Low Power (COLP), 2003.
In this paper, parametric WCET formulae of loops are constructed and used to guide dynamic voltage scaling (DVS) of regions of code containing loops. A new Ptype scaling point is introduced to control DVS at the loop level.
R. van Engelen and K. Gallivan, Tight NonLinear Loop Timing Estimation, in the proceedings of the 2002 International Workshop on Innovative Architectures for Future Generation HighPerformance Processors and Systems (IWIA), January 2002, Maui, Hawaii, pages 2126.
In this paper, we propose a new framework to compute tight parameterized WCET bounds for multiple loop nests that may include non rectangular loops, zerotrip loops, and loops with nonunit strides. The framework requires only very simple symbolic manipulation capabilities and is restricted only by certain monotonic properties required of the iteration spaces of the inner loops.
C. Healy, M. Sjodin, V. Rustagi, D. Whalley, and R. van Engelen, Supporting Timing Analysis by Automatic Bounding of Loop Iterations, in RealTime Systems journal, Special Issue by P. Puschner and A. Burns on WorstCase ExecutionTime Analysis, 2000, pages 121148.
This paper describes three complementary methods to support timing analysis by bounding the number of loop iterations: (1) an algorithm to determine the minimum and maximum number of iterations of loops with multiple exits, (2) when the number of iterations is dependent on unknown values of variables, the user is asked to provide bounds for these variables, and (3) a method to tightly predict the execution time of inner loops whose number of iterations is dependent on counter variables of outer level loops by formulating the total number of iterations of a loop in terms of summations and solving the resulting equation.
C. Healy, R. van Engelen, and D. Whalley, A General Approach for Tight Timing Predictions of NonRectangular Loops, in WIP proceedings of the 5th IEEE RealTime Technology and Application Symposium, June 24, 1999, Vencouver, BC, Canada, pages 1114.
This is a shorter conference paper of the journal article referenced above.
