Home People Research Software Links of interest Contact us
Home > Research > Compilers

Compiler transformations are applied under strict conditions to ensure the semantic equivalence of the code before and after a code-optimizing transformation. The conditions for loop transformations are based on extensive loop code analysis to find linear and nonlinear induction variables, determine cross-iteration data dependences, and other loop-related properties. Loop parallelization for multicore machines gives reasonable speedups, because most of the execution time of (scientific) applications is spent in loops. We are investigating new loop analysis techniques based on the Chains of Recurrences (CR) algebra. We applied the method to automatic loop parallelization, SIMD vectorization, and strength reduction. We are also investigating efficient methods for loop iteration space bounding, with applications to worst-case execution time (WCET) estimation.

Techniques for Fast Evaluation of Numerical Functions


Y. Shou and R. van Engelen, Automatic SIMD Vectorization of Chains of Recurrences, in the proceedings of the ACM International Conference on Supercomputing (ICS), 2008.

Many computational tasks require repeated evaluation of functions over structured grids, such as plotting in a coordinate system, rendering of parametric objects in 2D and 3D, numerical grid generation, and signal processing. In this paper, we present a method to speed up closed-form function evaluations over grids by vectorizing Chains of Recurrences (CR) forms of these functions. Results show a significant performance increase of our method over Intel's SVML and scalar CRs, from doubling the execution speed to running an order of magnitude faster. An auto-tuning tool is developed for optimal performance and accuracy.

Data Dependence Analysis Techniques


J. Birch, R. van Engelen, K. Gallivan, and Y. Shou, An Empirical Evaluation of Chains of Recurrences for Array Dependence Testing, in the proceedings of to the ACM/IEEE International Conference on Parallel Architectures and Compilation Techniques (PACT), 2006, pages 295-304.

This paper evaluates a new approach for fast and accurate nonlinear array dependence testing using Chains of Recurrences (CRs). We implemented a new CR-based Banerjee test in the Polaris compiler and compared the results to the Omega test and Range test on a set of SPEC and LAPACK Benchmark programs. The experimental results suggest that a CR enhancement can dramatically increase the effectiveness of a dependence test without a significant cost increase.

R. van Engelen, J. Birch, Y. Shou, B. Walsh, and K. Gallivan, A Unified Framework for Nonlinear Dependence Testing and Symbolic Analysis, in the proceedings of the ACM International Conference on Supercomputing (ICS), 2004, pages 106-115.

This paper presents a unified approach for generalized induction variable recognition and substitution, pointer analysis, analysis of conditionally updated variables, value range analysis, array region analysis, and nonlinear dependence testing. The analysis techniques share a well-defined uniform approach based on the Chains of Recurrences algebra. The paper introduces a new set of analysis algorithms that accurately handle conditional control flow, pointer arithmetic, and nonlinear symbolic expressions in loops, which are known to be problematic for conventional restructuring compilers.

J. Birch, R. van Engelen, and K. Gallivan, Value Range Analysis of Conditionally Updated Variables and Pointers, in the proceedings of the workshop on Compilers for Parallel Computing (CPC), 2004, pages 265-276.

This paper presents a method for value range analysis on conditionally updated induction variables and pointers which have no closed-form functions. The value range method determines the memory access patterns of pointers and arrays indexed with conditionally updated induction variables. Important applications of this method are in array bounds analysis and data dependence testing.

R. van Engelen, J. Birch, and K. Gallivan, Array Data Dependence Testing with the Chains of Recurrences Algebra, in the proceedings of the IEEE International Workshop on Innovative Architectures for Future Generation High-Performance Processors and Systems (IWIA), January 2004, pages 70-81.

This paper presents an approach to dependence testing in the presence of nonlinear and non-closed array index expressions and pointer references. A recurrence formulation is uded to enhance the accuracy of standard dependence algorithms such as the extreme value test and range test. Because the recurrence forms are easily converted to closed forms (when they exist), induction variable substitution and array recovery can be delayed until after the loop is analyzed.

See also an extended report of this paper:

R. van Engelen, J. Birch, Y. Shou, and K. Gallivan, Array Data Dependence Testing with the Chains of Recurrences Algebra, Technical Report TR-041201, Departement of Computer Science, Florida State University, 2004.

Induction Variable Recognition and Variable Classification in Linear Time


Y. Shou, R. van Engelen, and J. Birch, Flow-Sensitive Loop-Variant Variable Classification in Linear Time, in the proceedings of the International Workshop on Languages and Compilers for Parallel Computing (LCPC), 2007.

This paper presents an efficient algorithm for classifying generalized induction variables and more complicated flow-sensitive loop-variant variables that have arbitrary conditional update patterns along multiple paths in a loop nest. Variables are recognized and translated into closed-form functions, such as linear, polynomial, geometric, wrap-around, periodic, and mixer functions. The remaining flow-sensitive variables (those that have no closed forms) are bounded by tight bounding functions on their value sequences by bounds derived from our extensions of the Chains of Recurrences (CR#) algebra. The classification algorithm has a linear worst-case execution time in the size of the SSA region of a loop nest. Classification coverage and performance results for the SPEC2000 benchmarks are given and compared to other methods.

See also an extended report of this paper:

Y. Shou, R. van Engelen, and J. Birch, Flow-Sensitive Loop-Variant Variable Classification in Linear Time, Technical Report TR-071005, Departement of Computer Science, Florida State University, 2007.

Y. Shou, R. van Engelen, J. Birch, and K. Gallivan, Toward Efficient Flow-Sensitive Induction Variable Analysis and Dependence Testing for Loop Optimization, in the proceedings of the ACM SouthEast conference, 2006, pages 1-6 (best paper award).

This paper presents the CR# (CR-sharp) algebra, which is a nontrivial extension of the Chains of Recurrences (CR) algebra. The CR# algebra is introduced, efficient CR#-based algorithms presented, and several potential applications for compiler analysis are described.

See also an extended report on which this paper is based:

R. van Engelen, The CR# Algebra and its Application in Loop Analysis and Optimization, Technical Report TR-041223, Department of Computer Science, Florida State University, 2004.

R. van Engelen and K. Gallivan, An Efficient Algorithm for Pointer-to-Array Access Conversion for Compiling and Optimizing DSP Applications, in the proceedings of the 2001 International Workshop on Innovative Architectures for Future Generation High-Performance Processors and Systems (IWIA 2001), 18-19 January 2001, Maui, Hawaii, pages 80-89.

In this paper, we present a novel algorithm for converting pointer-based code to code with explicit array accesses. The conversion enables a compiler to perform data flow analysis and loop optimizations on DSP codes that exhibit considerable levels of pointer arithmetic in C.

R. van Engelen, Efficient Symbolic Analysis for Optimizing Compilers, in the proceedings of the ETAPS International Conference on Compiler Construction (CC), 2001, LNCS 2027, pages 118-132.

This paper introduces a generic algorithm for generalized induction variable substitution with the Chains of Recurrences (CR) algebra in compilers. Value sequences of polynomial, geometric, and mixed forms are recognized. The algorithm proceeds in three stages: (1) bottom-up scan of a loop body to collect recurrence relations, (2) translation into CR forms with simplification using the CR algebra, and (3) translation to closed-forms for induction variable substitution.

See also an extended report of this paper:

R. van Engelen, Symbolic Evaluation of Chains of Recurrences for Loop Optimization, Technical Report TR-000102, Department of Computer Science, Florida State University, 2000.

Parametric Timing Estimation by Formula Construction and its Application to DVS


R. van Engelen, K. Gallivan, and B. Walsh, Parametric Timing Estimation With the Newton-Gregory Formulae, Journal of Concurrency and Computation: Practice and Experience, Volume 18, Number 10, September 2006, pages 1434-1464.

This paper presents a formula construction method for parametric worst-case execution time (WCET) estimation of loops. The method determines a parametric bound on the iteration space size of loops with both affine and non-affine loop bounds in an efficient manner using a formulation based on Newton–Gregory interpolating polynomials. Parametric WCET formulae can be derived for loops with symbolic bounds, non-rectangular loops, zero-trip loops, loops with multiple critical paths, and loops with non-unit strides

R. van Engelen, K. Gallivan, and B. Walsh, Tight Timing Estimation With the Newton-Gregory Formulae, in the proceedings of the Compilers for Parallel Computing (CPC) workshop, January 8-10, 2003, Amsterdam, Netherlands, pages 321-330.

This is a shorter conference paper of the journal article referenced above.

B. Walsh, R. van Engelen, K. Gallivan, J. Birch, and Y. Shou, Parametric Intra-Task Dynamic Voltage Scheduling, in the proceedings of the IEEE PACT Workshop on Compilers and Operating Systems for Low Power (COLP), 2003.

In this paper, parametric WCET formulae of loops are constructed and used to guide dynamic voltage scaling (DVS) of regions of code containing loops. A new P-type scaling point is introduced to control DVS at the loop level.

R. van Engelen and K. Gallivan, Tight Non-Linear Loop Timing Estimation, in the proceedings of the 2002 International Workshop on Innovative Architectures for Future Generation High-Performance Processors and Systems (IWIA), January 2002, Maui, Hawaii, pages 21-26.

In this paper, we propose a new framework to compute tight parameterized WCET bounds for multiple loop nests that may include non- rectangular loops, zero-trip loops, and loops with non-unit strides. The framework requires only very simple symbolic manipulation capabilities and is restricted only by certain monotonic properties required of the iteration spaces of the inner loops.

C. Healy, M. Sjodin, V. Rustagi, D. Whalley, and R. van Engelen, Supporting Timing Analysis by Automatic Bounding of Loop Iterations, in Real-Time Systems journal, Special Issue by P. Puschner and A. Burns on Worst-Case Execution-Time Analysis, 2000, pages 121-148.

This paper describes three complementary methods to support timing analysis by bounding the number of loop iterations: (1) an algorithm to determine the minimum and maximum number of iterations of loops with multiple exits, (2) when the number of iterations is dependent on unknown values of variables, the user is asked to provide bounds for these variables, and (3) a method to tightly predict the execution time of inner loops whose number of iterations is dependent on counter variables of outer level loops by formulating the total number of iterations of a loop in terms of summations and solving the resulting equation.

C. Healy, R. van Engelen, and D. Whalley, A General Approach for Tight Timing Predictions of Non-Rectangular Loops, in WIP proceedings of the 5th IEEE Real-Time Technology and Application Symposium, June 2-4, 1999, Vencouver, BC, Canada, pages 11-14.

This is a shorter conference paper of the journal article referenced above.





Last modified Sunday, May 11, 2008
Copyright: Robert van Engelen (C)