Publications

This page contains some of the research papers associated with the ROSE project over the last several years. We feel that the latest papers are the best ones, which is likely typical of any ambitious project, but we have included everything for completeness. We hope that each paper makes clear its underlying goal of supporting high-level abstractions, together with our attempts to address the performance issues that such abstractions raise in scientific computing.

2015

  • [DOI] C. Liao, P. Lin, D. J. Quinlan, Y. Zhao, and X. Shen, “Enhancing domain specific language implementations through ontology,” in Proceedings of the 5th international workshop on domain-specific languages and high-level frameworks for high performance computing, New York, NY, USA, 2015, p. 3:1–3:9.
    [Bibtex]
    @inproceedings{Liao:2015:EDS:2830018.2830022,
    author = {Liao, Chunhua and Lin, Pei-Hung and Quinlan, Daniel J. and Zhao, Yue and Shen, Xipeng},
    title = {Enhancing Domain Specific Language Implementations Through Ontology},
    booktitle = {Proceedings of the 5th International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing},
    series = {WOLFHPC '15},
    year = {2015},
    isbn = {978-1-4503-4016-8},
    location = {Austin, Texas},
    pages = {3:1--3:9},
    articleno = {3},
    numpages = {9},
    url = {http://doi.acm.org/10.1145/2830018.2830022},
    doi = {10.1145/2830018.2830022},
    acmid = {2830022},
    publisher = {ACM},
    address = {New York, NY, USA},
    keywords = {compiler, domain-specific language, high-performance computing, knowledge base, ontology},
    }

[pdf][presentation slide] 

  • [DOI] P. Lin, C. Liao, D. J. Quinlan, and S. Guzik, “Experiences of using the openmp accelerator model to port DOE stencil applications,” in Openmp: heterogenous execution and data movements – 11th international workshop on openmp, IWOMP 2015, aachen, germany, october 1-2, 2015, proceedings, 2015, pp. 45-59.
    [Bibtex]
    @inproceedings{DBLP:conf/iwomp/LinLQG15,
    author = {Pei-Hung Lin and
    Chunhua Liao and
    Daniel J. Quinlan and
    Stephen Guzik},
    title = {Experiences of Using the OpenMP Accelerator Model to Port {DOE} Stencil
    Applications},
    booktitle = {OpenMP: Heterogenous Execution and Data Movements - 11th International
    Workshop on OpenMP, {IWOMP} 2015, Aachen, Germany, October 1-2, 2015,
    Proceedings},
    pages = {45--59},
    year = {2015},
    crossref = {DBLP:conf/iwomp/2015},
    url = {http://dx.doi.org/10.1007/978-3-319-24595-9_4},
    doi = {10.1007/978-3-319-24595-9_4},
    timestamp = {Thu, 01 Oct 2015 15:09:30 +0200},
    biburl = {http://dblp.uni-trier.de/rec/bib/conf/iwomp/LinLQG15},
    bibsource = {dblp computer science bibliography, http://dblp.org}
    }

[pdf][presentation slide] 

  • [DOI] Y. Yan, P. Lin, C. Liao, B. R. de Supinski, and D. J. Quinlan, “Supporting multiple accelerators in high-level programming models,” in Proceedings of the sixth international workshop on programming models and applications for multicores and manycores, New York, NY, USA, 2015, pp. 170-180.
    [Bibtex]
    @inproceedings{multiGPU2014,
    author = {Yan, Yonghong and Lin, Pei-Hung and Liao, Chunhua and de Supinski, Bronis R. and Quinlan, Daniel J.},
    title = {Supporting Multiple Accelerators in High-level Programming Models},
    booktitle = {Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores},
    series = {PMAM '15},
    year = {2015},
    isbn = {978-1-4503-3404-4},
    location = {San Francisco, California},
    pages = {170--180},
    numpages = {11},
    url = {http://doi.acm.org/10.1145/2712386.2712405},
    doi = {10.1145/2712386.2712405},
    acmid = {2712405},
    publisher = {ACM},
    address = {New York, NY, USA},
    }

[pdf][presentation slide] 

2014

  • [DOI] M. Schordan, P. Lin, D. Quinlan, and L. Pouchet, “Verification of polyhedral optimizations with constant loop bounds in finite state space computations,” in Leveraging applications of formal methods, verification and validation. specialized techniques and applications, T. Margaria and B. Steffen, Eds., Springer Berlin Heidelberg, 2014, vol. 8803, pp. 493-508.
    [Bibtex]
    @incollection{schordan2014,
    year={2014},
    isbn={978-3-662-45230-1},
    booktitle={Leveraging Applications of Formal Methods, Verification and Validation. Specialized Techniques and Applications},
    volume={8803},
    series={Lecture Notes in Computer Science},
    editor={Margaria, Tiziana and Steffen, Bernhard},
    doi={10.1007/978-3-662-45231-8_41},
    title={Verification of Polyhedral Optimizations with Constant Loop Bounds in Finite State Space Computations},
    url={http://dx.doi.org/10.1007/978-3-662-45231-8_41},
    publisher={Springer Berlin Heidelberg},
    author={Schordan, Markus and Lin, Pei-Hung and Quinlan, Dan and Pouchet, Louis-Noël},
    pages={493--508},
    language={English}
    }

2013

  • C. Liao, Y. Yan, B. R. de Supinski, D. J. Quinlan, and B. Chapman, “Early experiences with the openmp accelerator model,” in Openmp in the era of low power devices and accelerators, Springer, 2013, pp. 84-98.
    [Bibtex]
    @incollection{liao2013,
    title={Early Experiences with the OpenMP Accelerator Model},
    author={Liao, Chunhua and Yan, Yonghong and de Supinski, Bronis R and Quinlan, Daniel J and Chapman, Barbara},
    booktitle={OpenMP in the Era of Low Power Devices and Accelerators},
    pages={84--98},
    year={2013},
    publisher={Springer}
    }

[pdf][presentation slide] 

In this paper, we examine the newly released accelerator directives and create an initial reference implementation, referred to as HOMP (Heterogeneous OpenMP). Focused on targeting NVIDIA GPUs, our work is based on an existing OpenMP implementation in the ROSE source-to-source compiler infrastructure. HOMP includes extensions to parse the new constructs, represent them in the AST, and handle other compiler translation details. Further, we provide initial runtime support. For our evaluation, we have adapted a few existing OpenMP codes to use the accelerator model directives and present preliminary performance results. Finally, we critique the accelerator model in terms of its impact on developers and compiler writers and suggest possible improvements.

2012

  • H. Ma, Q. Chen, L. Wang, C. Liao, and D. Quinlan, “Openmp-checker: detecting concurrency errors of openmp programs using hybrid program analysis,” in Poster paper icpp’12, the 41st international conference on parallel processing, 2012.
    [Bibtex]
    @incollection{ma2012,
    title={OpenMP-Checker: Detecting Concurrency Errors of OpenMP Programs Using Hybrid Program Analysis},
    author={Hongyi Ma and Qichang Chen and Liqiang Wang and Chunhua Liao and Daniel Quinlan},
    booktitle={Poster paper ICPP’12, The 41st International Conference on Parallel Processing},
    year = {2012},
    month = {September},
    location={Pittsburgh, PA}
    }

This paper presents a novel technique to detect data races and deadlocks of OpenMP programs, using hybrid program analysis. Specifically, we use an SMT-solver based static analysis to analyze OpenMP source code. Then we use a dynamic analysis to confirm, or rule out, the potential errors. The static analysis narrows down the code regions and events that need to be monitored, significantly reducing the overhead of the dynamic analysis. Our experiments show that OpenMP-Checker is more scalable and accurate at pinpointing concurrency errors within a set of chosen benchmarks, compared to the two commercial tools, Sun Thread Analyzer and Intel Thread Checker.

  • J. Lidman, D. J. Quinlan, C. Liao, and S. A. McKee, “Rose:: fttransform-a source-to-source translation framework for exascale fault-tolerance research,” in Dependable systems and networks workshops (dsn-w), 2012 ieee/ifip 42nd international conference on, 2012, pp. 1-6.
    [Bibtex]
    @inproceedings{lidman2012rose,
    title={ROSE:: FTTransform-A source-to-source translation framework for exascale fault-tolerance research},
    author={Lidman, Jacob and Quinlan, Daniel J and Liao, Chunhua and McKee, Sally A},
    booktitle={Dependable Systems and Networks Workshops (DSN-W), 2012 IEEE/IFIP 42nd International Conference on},
    pages={1--6},
    year={2012},
    organization={IEEE}
    }

This paper presents a compiler-based transformation released in ROSE and demonstrates the use of Triple Modular Redundancy as an approach to provide HPC software with fault tolerance against transient faults, as we expect them to manifest themselves on future Exascale architectures. The paper presents performance results showing that, for a randomly selected subset of benchmarks, the overhead of this extra layer of support is about 20%. We expect that this may be competitive with future approaches to fault tolerance based on checkpoint/restart, which may be much more expensive or even intractable for Exascale. This work is released as a framework within ROSE to support research work in this area by ourselves and collaborators.
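The general idea behind Triple Modular Redundancy can be illustrated with a minimal sketch: run a computation three times and take a majority vote, so that one corrupted replica is masked. This is only an illustration in Python and does not reproduce the ROSE source-to-source transformation described in the paper; the names `tmr_execute` and `make_faulty_adder` are hypothetical and exist only for this example.

```python
from collections import Counter


def tmr_execute(computation, *args):
    """Run `computation` three times and return the majority result.

    Hypothetical helper for illustration only. Results must be hashable
    so that they can be counted.
    """
    results = [computation(*args) for _ in range(3)]
    value, votes = Counter(results).most_common(1)[0]
    if votes < 2:
        # All three replicas disagree, so a single vote cannot mask the fault.
        raise RuntimeError("no two replicas agree; fault cannot be masked")
    return value


def make_faulty_adder(fault_on_call):
    """Build an adder that returns a corrupted sum on one chosen call.

    Simulates a transient fault by flipping the lowest bit of the result
    on call number `fault_on_call` (1-based); 0 means no fault is injected.
    """
    calls = {"n": 0}

    def add(a, b):
        calls["n"] += 1
        if calls["n"] == fault_on_call:
            return (a + b) ^ 1  # simulated single-bit corruption
        return a + b

    return add


if __name__ == "__main__":
    # The second replica is corrupted, but the vote still yields the right sum.
    print(tmr_execute(make_faulty_adder(2), 2, 3))
```

The sketch replicates a whole function call for simplicity; at what granularity a real tool replicates code is a separate design decision that this example does not address.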

  • S. Royuela, A. Duran, C. Liao, and D. J. Quinlan, “Auto-scoping for openmp tasks,” in Openmp in a heterogeneous world, Springer, 2012, pp. 29-43.
    [Bibtex]
    @incollection{royuela2012auto,
    title={Auto-scoping for OpenMP tasks},
    location={Pittsburgh, PA},
    author={Royuela, Sara and Duran, Alejandro and Liao, Chunhua and Quinlan, Daniel J},
    booktitle={OpenMP in a Heterogeneous World},
    pages={29--43},
    year={2012},
    publisher={Springer}
    }

This paper presents an auto-scoping algorithm to work with OpenMP tasks. (Auto-scoping is the process of automatically determining the data-sharing attributes of variables in OpenMP programs.) This is a complex challenge because it is uncertain when a task will be executed, which makes it harder to determine what parts of the program will run concurrently. We also introduce an implementation of the algorithm and results with several benchmarks showing that the algorithm is able to correctly scope a large percentage of the variables appearing in them.

  • S. M. F. Rahman, J. Guo, A. Bhat, C. Garcia, M. H. Sujon, Q. Yi, C. Liao, and D. Quinlan, “Studying the impact of application-level optimizations on the power consumption of multi-core architectures,” in Proceedings of the 9th conference on computing frontiers, 2012, pp. 123-132.
    [Bibtex]
    @inproceedings{rahman2012studying,
    title={Studying the impact of application-level optimizations on the power consumption of multi-core architectures},
    author={Rahman, Shah Mohammad Faizur and Guo, Jichi and Bhat, Akshatha and Garcia, Carlos and Sujon, Majedul Haque and Yi, Qing and Liao, Chunhua and Quinlan, Daniel},
    booktitle={Proceedings of the 9th conference on Computing Frontiers},
    pages={123--132},
    year={2012},
    organization={ACM}
    }

[pdf]

This paper presents an extensive study of the impact of application-level optimizations on both the performance and power efficiency of applications from a wide range of scientific and embedded systems domains. We observe that application-level optimizations often have a much larger impact on performance than on power consumption. However, optimizing for performance does not necessarily lead to better power consumption, and vice versa. Compared to sequential applications, multithreaded applications give more room for performance and power improvements. Additionally, a number of optimizations, including loop and thread affinity optimizations, have shown great potential in supporting collective enhancement of both performance and power efficiency. Our experimental results provide several insights to help exploit these optimizations effectively.

  • T. Nguyen, P. Cicotti, E. Bylaska, D. Quinlan, and S. B. Baden, “Bamboo: translating mpi applications to a latency-tolerant, data-driven form,” in Proceedings of the international conference on high performance computing, networking, storage and analysis, 2012, p. 39.
    [Bibtex]
    @inproceedings{nguyen2012bamboo,
    title={Bamboo: translating MPI applications to a latency-tolerant, data-driven form},
    author={Nguyen, Tan and Cicotti, Pietro and Bylaska, Eric and Quinlan, Dan and Baden, Scott B},
    booktitle={Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis},
    pages={39},
    year={2012},
    organization={IEEE Computer Society Press}
    }

[pdf]

2011

  • J. Shalf, D. Quinlan, and C. Janssen, “Rethinking hardware-software codesign for exascale systems,” Computer, vol. 44, iss. 11, pp. 22-30, 2011.
    [Bibtex]
    @article{shalf2011rethinking,
    title={Rethinking hardware-software codesign for exascale systems},
    author={Shalf, John and Quinlan, Dan and Janssen, Curtis},
    journal={Computer},
    volume={44},
    number={11},
    pages={22--30},
    year={2011},
    publisher={IEEE}
    }

This paper presents work combining the LBL node simulator, the SNL network simulator, and the ROSE compiler to demonstrate analysis of software and the workflow required for such tools to analyze the power requirements of HPC code, using autotuning to define optimal points in the design space. The paper lays out an approach to co-design at the start of work that is part of the CoDEX project, led by LBL and including both SNL and LLNL.

  • D. Quinlan and C. Liao, “The rose source-to-source compiler infrastructure,” in Cetus users and compiler infrastructure workshop, in conjunction with pact 2011, 2011.
    [Bibtex]
    @inproceedings{rose2011,
    title={The ROSE Source-to-Source Compiler Infrastructure},
    author={Dan Quinlan and Chunhua Liao},
    booktitle={Cetus Users and Compiler Infrastructure Workshop, in conjunction with PACT 2011},
    year={2011},
    month={October}
    }

 pdf

  • M. J. Sottile, C. E. Rasmussen, W. N. Weseloh, R. W. Robey, D. Quinlan, and J. Overbey, “Foropencl: transformations exploiting array syntax in fortran for accelerator programming,” in 2nd international workshop on gpus and scientific applications (gpusca 2011), 2011, p. 23.
    [Bibtex]
    @inproceedings{sottile2011foropencl,
    title={ForOpenCL: Transformations Exploiting Array Syntax in Fortran for Accelerator Programming},
    author={Sottile, Matthew J and Rasmussen, Craig E and Weseloh, Wayne N and Robey, Robert W and Quinlan, Daniel and Overbey, Jeffrey},
    booktitle={2nd International Workshop on GPUs and Scientific Applications (GPUScA 2011)},
    pages={23},
    year={2011}
    }

This paper presents an OpenCL code generator leveraging the semantics of the F90 array constructs. Such GPU work is expected to be an important part of future Exascale programming environments. This work demonstrates how ROSE is used to support the analysis of the input code, as well as the translation and code generation required to produce OpenCL code for GPUs.

  • P. Pirkelbauer, C. Liao, T. Panas, and D. Quinlan, “Runtime detection of c-style errors in upc code,” in Proceedings of fifth conference on partitioned global address space programming models, pgas, 2011.
    [Bibtex]
    @inproceedings{pirkelbauer2011runtime,
    title={Runtime detection of c-style errors in upc code},
    author={Pirkelbauer, Peter and Liao, Chunhua and Panas, Thomas and Quinlan, Dan},
    booktitle={Proceedings of fifth conference on partitioned global address space programming models, PGAS},
    volume={11},
    year={2011}
    }

pdf

This paper presents work to define a dynamic analysis for correctness of UPC usage and leverages the RTED test suite from Iowa State University. This work is released in ROSE and shows how to build a dynamic analysis level of support to catch errors as represented by test codes in the RTED test suite for UPC. The correctness of using programming models is an important aspect of the design of future programming models for Exascale. This paper shows how to design dynamic analysis-based tools to evaluate correct use of the UPC language's programming model.

2010

  • C. Liao, D. J. Quinlan, T. Panas, and B. R. de Supinski, “A rose-based openmp 3.0 research compiler supporting multiple runtime libraries,” in Beyond loop level parallelism in openmp: accelerators, tasking and more, Springer, 2010, pp. 15-28.
    [Bibtex]
    @incollection{liao2010rose,
    title={A ROSE-based OpenMP 3.0 research compiler supporting multiple runtime libraries},
    author={Liao, Chunhua and Quinlan, Daniel J and Panas, Thomas and de Supinski, Bronis R},
    booktitle={Beyond Loop Level Parallelism in OpenMP: Accelerators, Tasking and More},
    pages={15--28},
    year={2010},
    publisher={Springer}
    }

 pdf

  • C. Liao, D. J. Quinlan, J. J. Willcock, and T. Panas, “Semantic-aware automatic parallelization of modern applications using high-level abstractions,” International journal of parallel programming, vol. 38, iss. 5-6, pp. 361-378, 2010.
    [Bibtex]
    @article{liao2010semantic,
    title={Semantic-aware automatic parallelization of modern applications using high-level abstractions},
    author={Liao, Chunhua and Quinlan, Daniel J and Willcock, Jeremiah J and Panas, Thomas},
    journal={International Journal of Parallel Programming},
    volume={38},
    number={5-6},
    pages={361--378},
    year={2010},
    publisher={Springer}
    }

pdf

2009

  • C. Liao, D. Quinlan, and T. Panas, “Towards an abstraction-friendly programming model for high productivity and high performance computing,” Lawrence Livermore National Laboratory (LLNL), Livermore, CA 2009.
    [Bibtex]
    @techreport{liao2009towards,
    title={Towards an Abstraction-Friendly Programming Model for High Productivity and High Performance Computing},
    author={Liao, Chunhua and Quinlan, D and Panas, Thomas},
    year={2009},
    institution={Lawrence Livermore National Laboratory (LLNL), Livermore, CA}
    }

pdf  This is a short position paper.

  • C. Liao, D. J. Quinlan, R. Vuduc, and T. Panas, “Effective source-to-source outlining to support whole program empirical optimization,” in Languages and compilers for parallel computing, Springer, 2010, pp. 308-322.
    [Bibtex]
    @incollection{liao2010effective,
    title={Effective source-to-source outlining to support whole program empirical optimization},
    author={Liao, Chunhua and Quinlan, Daniel J and Vuduc, Richard and Panas, Thomas},
    booktitle={Languages and Compilers for Parallel Computing},
    pages={308--322},
    year={2010},
    publisher={Springer}
    }

pdf 

This paper describes our work using ROSE to build an effective source-to-source outliner in order to support whole-program empirical optimization (also called autotuning). The ROSE outliner addresses the problem of extracting tunable kernels out of large-scale applications, thereby helping to convert the challenging whole-program tuning problem into a set of more manageable kernel tuning tasks. In particular, the outliner can generate kernels that preserve the performance characteristics of the tuning targets and can be easily handled by other tools. This work also demonstrates how one can use ROSE’s compiler analyses to enhance the quality of source-to-source translation.

  • A. Sæbjørnsen, J. Willcock, T. Panas, D. Quinlan, and Z. Su, “Detecting code clones in binary executables,” in Proceedings of the eighteenth international symposium on software testing and analysis, 2009, pp. 117-128.
    [Bibtex]
    @inproceedings{saebjornsen2009detecting,
    title={Detecting code clones in binary executables},
    author={S{\ae}bj{\o}rnsen, Andreas and Willcock, Jeremiah and Panas, Thomas and Quinlan, Daniel and Su, Zhendong},
    booktitle={Proceedings of the eighteenth international symposium on Software testing and analysis},
    pages={117--128},
    year={2009},
    organization={ACM}
    }

pdf

  • T. Panas and D. Quinlan, “Techniques for software quality analysis of binaries: applied to windows and linux,” Defects, vol. 9, pp. 6-10, 2009.
    [Bibtex]
    @article{panas2009techniques,
    title={Techniques for software quality analysis of binaries: Applied to Windows and Linux},
    author={Panas, Thomas and Quinlan, Daniel},
    journal={DEFECTS},
    volume={9},
    pages={6--10},
    year={2009}
    }

 pdf

  • C. Liao, D. J. Quinlan, J. J. Willcock, and T. Panas, “Extending automatic parallelization to optimize high-level abstractions for multicore,” in Evolving openmp in an age of extreme parallelism, Springer, 2009, pp. 28-41.
    [Bibtex]
    @incollection{liao2009extending,
    title={Extending automatic parallelization to optimize high-level abstractions for multicore},
    author={Liao, Chunhua and Quinlan, Daniel J and Willcock, Jeremiah J and Panas, Thomas},
    booktitle={Evolving OpenMP in an Age of Extreme Parallelism},
    pages={28--41},
    year={2009},
    publisher={Springer}
    }

 pdf 

This paper describes an approach to extending automatic parallelization to optimize applications written using high-level abstractions. This work exemplifies a typical usage of ROSE and is our initial work on the general subject of how to leverage the semantics associated with high-level abstractions to enable more optimizations.

2008

  • T. Panas, “Signature visualization of software binaries,” in Proceedings of the 4th acm symposium on software visualization, 2008, pp. 185-188.
    [Bibtex]
    @inproceedings{panas2008signature,
    title={Signature visualization of software binaries},
    author={Panas, Thomas},
    booktitle={Proceedings of the 4th ACM symposium on Software visualization},
    pages={185--188},
    year={2008},
    organization={ACM}
    }

 pdf

  • D. J. Quinlan, G. Barany, and T. Panas, “Towards distributed memory parallel program analysis,” in Scalable program analysis, Dagstuhl, Germany, 2008.
    [Bibtex]
    @InProceedings{quinlan_et_al:DSP:2008:1568,
    author =  {Daniel J. Quinlan and Gerg{\"o} Barany and Thomas Panas},
    title =  {Towards Distributed Memory Parallel Program Analysis},
    booktitle =  {Scalable Program Analysis},
    year =  {2008},
    editor =  {Florian Martin and Hanne Riis Nielson and Claudio Riva and Markus Schordan},
    number =  {08161},
    series =  {Dagstuhl Seminar Proceedings},
    ISSN =  {1862-4405},
    publisher =  {Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany},
    address =  {Dagstuhl, Germany},
    URL = {http://drops.dagstuhl.de/opus/volltexte/2008/1568},
    annote =  {Keywords: Parallel computing, attribute evaluation, program analysis}
    }

 pdf

2007

  • D. Quinlan, G. Barany, and T. Panas, “Shared and distributed memory parallel security analysis of large-scale source code and binary applications,” Lawrence Livermore National Laboratory (LLNL), Livermore, CA 2007.
    [Bibtex]
    @techreport{quinlan2007shared,
    title={Shared and Distributed Memory Parallel Security Analysis of Large-Scale Source Code and Binary Applications},
    author={Quinlan, Dan and Barany, Gergo and Panas, Thomas},
    year={2007},
    institution={Lawrence Livermore National Laboratory (LLNL), Livermore, CA}
    }

 pdf

  • T. Panas, T. Epperly, D. Quinlan, A. Saebjornsen, and R. Vuduc, “Communicating software architecture using a unified single-view visualization,” in Engineering complex computer systems, 2007. 12th ieee international conference on, 2007, pp. 217-228.
    [Bibtex]
    @inproceedings{panas2007communicating,
    title={Communicating software architecture using a unified single-view visualization},
    author={Panas, Thomas and Epperly, Thomas and Quinlan, Daniel and Saebjornsen, Andreas and Vuduc, Richard},
    booktitle={Engineering Complex Computer Systems, 2007. 12th IEEE International Conference on},
    pages={217--228},
    year={2007},
    organization={IEEE}
    }

 pdf

  • D. J. Quinlan, R. W. Vuduc, and G. Misherghi, “Techniques for specifying bug patterns,” in Proceedings of the 2007 acm workshop on parallel and distributed systems: testing and debugging, 2007, pp. 27-35.
    [Bibtex]
    @inproceedings{quinlan2007techniques,
    title={Techniques for specifying bug patterns},
    author={Quinlan, Daniel J and Vuduc, Richard W and Misherghi, Ghassan},
    booktitle={Proceedings of the 2007 ACM workshop on Parallel and distributed systems: testing and debugging},
    pages={27--35},
    year={2007},
    organization={ACM}
    }

 pdf

  • T. Panas, D. Quinlan, and R. Vuduc, “Analyzing and visualizing whole program architectures,” in Icse workshop on aerospace software engineering (aerose), minneapolis, mn, 2007.
    [Bibtex]
    @inproceedings{panas2007analyzing,
    title={Analyzing and Visualizing Whole Program Architectures},
    author={Panas, T and Quinlan, D and Vuduc, R},
    booktitle={ICSE Workshop on Aerospace Software Engineering (AeroSE), Minneapolis, MN},
    year={2007}
    }

 pdf

  • T. Panas, D. Quinlan, and R. Vuduc, “Tool support for inspecting the code quality of hpc applications,” in Proceedings of the 3rd international workshop on software engineering for high performance computing applications, 2007, p. 2.
    [Bibtex]
    @inproceedings{panas2007tool,
    title={Tool support for inspecting the code quality of hpc applications},
    author={Panas, Thomas and Quinlan, Dan and Vuduc, Richard},
    booktitle={Proceedings of the 3rd International Workshop on Software Engineering for High Performance Computing Applications},
    pages={2},
    year={2007},
    organization={IEEE Computer Society}
    }

 pdf

2006

  • R. Vuduc, M. Schulz, D. Quinlan, B. De Supinski, and A. Sæbjørnsen, “Improving distributed memory applications testing by message perturbation,” in Proceedings of the 2006 workshop on parallel and distributed systems: testing and debugging, 2006, pp. 27-36.
    [Bibtex]
    @inproceedings{vuduc2006improving,
    title={Improving distributed memory applications testing by message perturbation},
    author={Vuduc, Richard and Schulz, Martin and Quinlan, Dan and De Supinski, Bronis and S{\ae}bj{\o}rnsen, Andreas},
    booktitle={Proceedings of the 2006 workshop on Parallel and distributed systems: testing and debugging},
    pages={27--36},
    year={2006},
    organization={ACM}
    }

 pdf

  • D. Quinlan, R. Vuduc, T. Panas, J. Härdtlein, and A. Sæbjørnsen, “Support for whole-program analysis and the verification of the one-definition rule in c++,” Paul e. black, helen gill, and w. bradley martin (co-chairs), vol. 500, p. 27, 2006.
    [Bibtex]
    @article{quinlan2006support,
    title={Support for Whole-Program Analysis and the Verification of the One-Definition Rule in C++},
    author={Quinlan, Dan and Vuduc, Richard and Panas, Thomas and H{\"a}rdtlein, Jochen and S{\ae}bj{\o}rnsen, Andreas},
    journal={Paul E. Black, Helen Gill, and W. Bradley Martin (co-chairs)},
    volume={500},
    pages={27},
    year={2006}
    }

 pdf

  • Parameterization and Search-space Exploitation of Loop Fusion, 2006 pdf .
    This represents some recent work on empirical optimization (not yet published). This entry should likely be removed until it is published.

2001-2005

  • B. S. White, S. A. McKee, B. R. de Supinski, B. Miller, D. Quinlan, and M. Schulz, “Improving the computational intensity of unstructured mesh applications,” in Proceedings of the 19th annual international conference on supercomputing, 2005, pp. 341-350.
    [Bibtex]
    @inproceedings{white2005improving,
    title={Improving the computational intensity of unstructured mesh applications},
    author={White, Brian S and McKee, Sally A and de Supinski, Bronis R and Miller, Brian and Quinlan, Daniel and Schulz, Martin},
    booktitle={Proceedings of the 19th annual international conference on Supercomputing},
    pages={341--350},
    year={2005},
    organization={ACM}
    }

 pdf 

This paper is about the optimization of unstructured grid applications and represents preparatory work for future automated transformations specific to unstructured grid applications within DOE using ROSE.

  • Q. Yi and D. Quinlan, “Applying loop optimizations to object-oriented abstractions through general classification of array semantics,” in Languages and compilers for high performance computing, Springer, 2005, pp. 253-267.
    [Bibtex]
    @incollection{yi2005applying,
    title={Applying loop optimizations to object-oriented abstractions through general classification of array semantics},
    author={Yi, Qing and Quinlan, Dan},
    booktitle={Languages and Compilers for High Performance Computing},
    pages={253--267},
    year={2005},
    publisher={Springer}
    }

  pdf 

This paper outlines an approach to the optimization of user-defined abstractions. This work represents a substantial goal for ROSE and is our initial work on the general subject of how to write code at a very high level of abstraction and have the lower-level code required for good performance generated automatically. This paper covers the details of optimizing object-oriented abstractions using ROSE. Unfortunately, ROSE is not mentioned anywhere in the paper, a ridiculous oversight, but oh well. The subject is the optimization, not the ROSE compiler infrastructure.

  • D. Quinlan, M. Schordan, Q. Yi, and A. Saebjornsen, Classification and utilization of abstractions for optimization, Springer, 2006.
    [Bibtex]
    @book{quinlan2006classification,
    title={Classification and utilization of abstractions for optimization},
    author={Quinlan, Dan and Schordan, Markus and Yi, Qing and Saebjornsen, Andreas},
    year={2006},
    publisher={Springer}
    }

 pdf This paper is a general introduction to recent work in the ROSE project.

  • Schordan M., Quinlan D., “A Source-To-Source Architecture for User-Defined Optimizations”, Joint Modular Languages Conference held in conjunction with EuroPar’03, Austria, August 2003 pdf .
    This paper covers the architecture of ROSE as a project.
  • Daniel J. Quinlan, Markus Schordan, Qing Yi, Bronis R. de Supinski: Semantic-Driven Parallelization of Loops Operating on User-Defined Containers. LCPC 2003: 524-538 pdf .

This paper is the informal proceedings version and demonstrates the optimization of generalized container abstractions and is related to Active Library research (or so I understand). It is also related to Telescoping Language research. The paper demonstrates a few of the newest features in ROSE and has served as an introduction for the authors into the optimization of the STL library more generally.

  • Daniel J. Quinlan, Markus Schordan, Qing Yi, Bronis R. de Supinski: A C++ Infrastructure for Automatic Introduction and Translation of OpenMP Directives. WOMPAT 2003: 13-25 pdf .

This paper demonstrates the use of ROSE to recognize OpenMP pragmas and, using the Nanos OpenMP runtime library, build a subset of an OpenMP-specific compiler for C++.

  • Quinlan, D. J., Miller, B., Philip, B., and Schordan, M. 2002. Treating a User-Defined Parallel Library as a Domain-Specific Language. In Proceedings of the 16th international Parallel and Distributed Processing Symposium (April 15 – 19, 2002). IEEE Computer Society, Washington, DC, 324. pdf .

This paper is specific to compile-time optimization of array classes. It demonstrates what was at the time the most current work on the compile-time optimization of an array class library. ROSE is more general, but this paper is very specific to the optimization of a single library.

  • Quinlan, D. Schordan, M. Philip, B. Kowarschik, M. “Parallel Object-Oriented Framework Optimization”, Special Issue of Concurrency: Practice and Experience (2003), also in Proceedings of Conference on Parallel Compilers (CPC2001), Edinburgh, Scotland, June 2001. pdf .

This is one of the first papers on ROSE, presented at CPC2001 and later updated for publication in the journal Concurrency: Practice and Experience.

  • Quinlan, D., Schordan, M., Philip, B., Kowarschik, M. “The Specification of Source-To-Source Transformations for the Compile-Time Optimization of Parallel Object-Oriented Scientific Applications”, Submitted to Parallel Processing Letters, also in Proceedings of 14th Workshop on Languages and Compilers for Parallel Computing (LCPC2001), Cumberland Falls, KY, August 1-3 2001. pdf .

This paper specified some elements of what later became the string-based AST rewrite mechanism used in ROSE.

  • D. Quinlan and B. Philip, “ROSETTA: The Compile-Time Recognition of Object-Oriented Library Abstractions and Their Use Within User Applications”, in Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2001), 2001 pdf .

This paper describes the development of a tool, ROSETTA, which builds object-oriented Intermediate Representations (IRs) for compilers. It is a tool used within ROSE to build the SAGE III IR which we use internally with the EDG front-end. It is specific to details of the internal ROSE compiler infrastructure.

2000 and Earlier

  • Quinlan, D., “ROSE: Compiler Support for Object-Oriented Frameworks” Proceedings of Conference on Parallel Compilers (CPC2000), Aussois, France, January 2000. Also published in special issue of Parallel Processing Letters, Vol. 10. pdf 

This paper was an introduction to the work being done at the time on ROSE, complete with a more detailed motivation for compile-time optimization of specific libraries.

  • Kei Davis and Dan Quinlan, ROSE II: An Optimizing Code Transformer for C++ Object-Oriented Array Class Libraries, World Multiconference on Systemics, Cybernetics and Informatics and 5th International Conference on Information Systems Analysis and Synthesis Vol.5: Computer Science and Engineering, Jul 31-Aug 4, 1999, Orlando, Florida pdf 
    This paper presents preliminary work on the compile-time optimization of array class libraries.
  • F. Bassetti, K. Davis, D. Quinlan, “C++ Expression Templates Performance Issues in Scientific Computing,” 12th International Parallel Processing Symposium (IPPS), p. 635, 1998 pdf .

This paper discusses the different approaches to the optimization of array class libraries. Optimization of array class libraries led to the development of ROSE as a project, though ROSE is not at all specific to array class libraries and addresses the optimization of libraries generally. This paper can be helpful in understanding what work was done using language template features within C++ before attempting to address the optimization issues more generally at compile time. Prior work started on ROSE had been abandoned because of the perceived significant advantages of template meta-programming techniques for scientific computing. Several papers on the details of template use were written; this is the most complete of them. It is included with these papers to provide a bit of perspective (currently historical).

Related Papers

This section is incomplete.

  • R. Parsons and D. Quinlan, “A++/P++ array classes for architecture independent finite difference computations,” in Proceedings of the second annual object-oriented numerics conference (oonski’94), 1994.
    [Bibtex]
    @INPROCEEDINGS{PQ94,
    author = {Rebecca Parsons and Dan Quinlan},
    title = {A++/{P}++ Array Classes for Architecture Independent Finite Difference
    Computations},
    booktitle = {Proceedings of the Second Annual Object-Oriented Numerics Conference
    (OONSKI'94)},
    year = {1994},
    month = {April}
    }
  • M. Lemke and D. Quinlan, “P++, a c++ virtual shared grids based programming environment for architecture-independent development of structured grid applications,” in Proceedings of the conpar/vapp v, 1992.
    [Bibtex]
    @INPROCEEDINGS{PQ92,
    author = {M. Lemke and Dan Quinlan},
    title = {P++, a C++ Virtual Shared Grids Based Programming Environment for
    Architecture-Independent Development of Structured Grid Applications},
    booktitle = {Proceedings of the CONPAR/VAPP V},
    year = {1992},
    location = {Lyon, France},
    month = {September}
    }
  • D. Brown, W. Henshaw, and D. Quinlan, “Overture: a framework for the complex geometries,” in Proceedings of the iscope’99 conference, 1999.
    [Bibtex]
    @INPROCEEDINGS{PQ99,
    author = {D. Brown and W. Henshaw and Dan Quinlan},
    title = {OVERTURE: A Framework for the Complex Geometries},
    booktitle = {Proceedings of the ISCOPE'99 Conference},
    year = {1999},
    location = {San Francisco, CA},
    month = {December}
    }
  • Guy Steele’s “Growing A Language”
  • Telescoping Languages work at Rice (Ken)
  • Plus/Minus Languages (Bjarne)