Journal Articles

Apan Qasem and Joshua Magee. Improving TLB performance on current chip multiprocessor architectures through demand-driven superpaging. Software Practice and Experience (SPE), 43(6):705-729, 2013.

Santosh Sarangkar and Apan Qasem. MATS: A model-driven adaptive tuning system for parallel workloads. Journal of Parallel and Cloud Computing (JPCC), 1(2):50-64, 2012.

Apan Qasem. High-level language extensions for fast execution of pipeline-parallelized code on current chip multi-processor systems. International Journal of Programming Languages and Applications (IJPLA), 2(3):1-12, 2012.

Apan Qasem. Architectural considerations for compiler-guided unroll-and-jam of CUDA kernels. American Journal of Computer Architecture, 1(2):12-20, 2012.

Apan Qasem. Autotuning strategies for reducing synchronization costs in multithreaded kernels. Journal of Systems and Software, 2(4):152-165, 2012.

Hammad Rashid, Clara Novoa, Mark McKenney, and Apan Qasem. Efficient parallel solutions to the integral knapsack problem on current chip-multiprocessor systems. International Journal of Parallel, Emergent and Distributed Systems (IJPEDS), 27(1):19-44, 2012.

Apan Qasem and Ken Kennedy. Model-guided empirical tuning of loop fusion. International Journal of High Performance Systems Architecture (IJHPSA), 1(3):183-198, 2008.

Apan Qasem, Ken Kennedy, and John M. Mellor-Crummey. Automatic tuning of whole applications using direct search and a performance-based transformation system. The Journal of Supercomputing, 36(2):183-196, 2006.

Conference and Workshop Papers

Jim Holt, George Bazzera, Apan Qasem, Jason Miller, and Henry Hoffmann. A pattern language for adaptive parallel software. In Proceedings of the 20th International Conference on Pattern Languages of Programs, 2013.

Christopher R. Hyatt, Greg R. LaKomski, Dan Tamir, and Apan Qasem. Power aware task matching and migration in heterogeneous processing environments. In Proceedings of the 15th Annual TECHCON Conference, 2013.

Shwetha Shankar, Dan Tamir, and Apan Qasem. Towards an operating system based framework for energy-efficient scheduling of parallel workloads. In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), 2013.

Apan Qasem and Schmichael Chen. Using macro features in learning algorithms for optimizing dense-matrix computations. Technical Report CSTR12-17, Dept. of Computer Science, Texas State University, January 2012.

Apan Qasem, Michael Jason Cade, and Dan Tamir. Improved energy efficiency for multithreaded kernels through model-based autotuning. In Proceedings of the 2012 IEEE Green Technology Conference (GTC12), pages 1-12, 2012.

Swapneela Unkule, Christopher Shaltz, and Apan Qasem. Automatic restructuring of GPU kernels for exploiting inter-thread data locality. In Proc. Int'l. Conf. on Compiler Construction (CC12), pages 21-40, 2012.

Apan Qasem. Efficient execution of time-step computations with pipelined parallelism and inter-thread data locality optimizations. In Proceedings of the 2012 PPOPP International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM12), pages 27-35, 2012.

Apan Qasem and Dan Tamir. Memory performance diagnosis through feedback synthesis. In Proceedings of the Workshop on Feedback-Directed Compiler Optimization for Multi-Core Architectures (COMA12, a HiPEAC workshop), pages 5-10, 2012.

Faizur Rahman, Qing Yi, and Apan Qasem. Understanding stencil code performance on multicore architectures. In Conf. Computing Frontiers (CF11), pages 30-45, 2011.

Swapneela Unkule and Apan Qasem. Register pressure aware code transformations on GPU. In 24th International Conference on High Performance Computing, Networking, Storage and Analysis - Companion Volume (SC11), pages 19-20, 2011.

Clara Novoa, Apan Qasem, Hammad Rashid, and Mark McKenney. Dynamic programming solutions for the integral knapsack problem on multicore architectures, (extended abstract). In 11th INFORMS Computing Society Conference, (ICS11), 2011.

Santosh Sarangkar and Apan Qasem. Intelligent feedback for fast and effective autotuning, (extended poster abstract). In 23rd International Conference on High Performance Computing, Networking, Storage and Analysis - Companion Volume (SC10), 2010.

Qing Yi, Jichi Guo, and Apan Qasem. Evaluating the role of optimization-specific search heuristics in effective autotuning (short paper). In 23rd International Workshop on Languages and Compilers for Parallel Computing (LCPC10), 2010.

Apan Qasem. Locality-conscious superpaging for improved TLB behavior of stencil computations. In Proceedings of the 2010 International Conference on High Performance Computing Systems (HPCS10), 2010.

Qing Yi, Santosh Sarangkar, and Apan Qasem. Improving autotuning efficiency and portability through feedback diagnostics. In Proceedings of the Fifth International Workshop on Automatic Performance Tuning (iWAPT10), 2010.

Hammad Rashid, Clara Novoa, and Apan Qasem. An evaluation of parallel knapsack algorithms on multicore architectures. In Proceedings of the 2010 International Conference on Scientific Computing (CSC10), pages 230-235, 2010.

Santosh Sarangkar and Apan Qasem. Restructuring parallel loops to curb false sharing on multicore architectures. In 24th IEEE International Symposium on Parallel and Distributed Processing (IPDPS Workshops), pages 1-7, 2010.

Apan Qasem, Jichi Guo, Faizur Rahman, and Qing Yi. Exposing tunable parameters in multi-threaded numerical code. In Network and Parallel Computing, IFIP International Conference, (NPC10), pages 46-60, 2010.

Joshua Magee and Apan Qasem. A case for compiler-driven superpage allocation. In Proceedings of the 47th Annual Southeast Regional Conference, (ACMSE09), 2009.

Michael Jason Cade and Apan Qasem. Balancing locality and parallelism on shared-cache multi-core systems. In 11th IEEE International Conference on High Performance Computing and Communications (HPCC09), pages 188-195, 2009.

Qing Yi and Apan Qasem. Exploring the optimization space of dense linear algebra kernels. In 21st International Workshop on Languages and Compilers for Parallel Computing (LCPC08), pages 343-355, 2008.

Apan Qasem. Evaluating an early-stop criterion and a statistical pruning strategy of the optimization search space. In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA), pages 506-510, 2008.

Apan Qasem and Ken Kennedy. Pruning the optimization search space using architecture-aware cost models. In Proceedings of the First Workshop on Statistical and Machine Learning Approaches Applied to Architecture and Compilation (SMART07), 2007.

Apan Qasem and Ken Kennedy. Profitable loop fusion and tiling using model-driven empirical search. In Proceedings of the 20th Annual International Conference on Supercomputing (ICS), pages 249-258, 2006.

Apan Qasem and Ken Kennedy. A cache-conscious profitability model for empirical tuning of loop fusion. In 18th International Workshop on Languages and Compilers for Parallel Computing, (LCPC), pages 106-120, 2005.

Apan Qasem, Ken Kennedy, and John Mellor-Crummey. Automatic tuning of whole applications using direct search and a performance-based transformation system. In Proceedings of the Los Alamos Computer Science Institute 5th Annual Symposium (LACSI04), 2004.

Robert Fowler, John Mellor-Crummey, Guohua Jin, and Apan Qasem. A source-to-source loop transformation tool (extended poster abstract). In Proceedings of the Los Alamos Computer Science Institute 3rd Annual Symposium (LACSI02), 2002.

Apan Qasem, David B. Whalley, Xin Yuan, and Robert van Engelen. Using a swap instruction to coalesce loads and stores. In 7th International Euro-Par Conference Parallel Processing, (EuroPar01), pages 235-240, 2001.

Apan Qasem. Automatic Tuning of Scientific Applications. PhD thesis, Rice University, July 2007.


Technical Reports

Theses

  • Hammad Rashid, Parallel Knapsack Algorithms on Multicore Architectures, Master's Thesis, (Advisor: Apan Qasem), Texas State University, May 2010.
  • Joshua A. Magee, Automated Compiler Driven Superpage Allocation and its Applications, Master's Thesis, (Advisor: Apan Qasem), Texas State University, Dec 2008.
  • Michael Jason Cade, Balancing Data Locality and Parallelism for Improved Application Performance on Multi-core Platforms, Master's Thesis, (Advisor: Apan Qasem), Texas State University, Dec 2008.
  • Apan Qasem, Automatic Tuning of Scientific Applications, Ph.D. Dissertation, Rice University, Jul 2007.
  • Apan Qasem, Using a Swap Instruction to Reduce Memory Accesses in Applications, Master's Thesis, Florida State University, May 2001.

Disclaimer

The material on this page is presented to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Usage/Citation Stats

DBLP entry

Google Scholar

CiteseerX

Microsoft Academic Research

Co-authors

WikiCFP

Contact

Apan Qasem
Department of Computer Science
Texas State University
601 University Dr
San Marcos, TX 78666

Office: Nueces 218
Phone: (512) 245-0347
Fax: (512) 245-8750
E-mail: apan "AT" txstate · edu