Publications – Habanero Extreme Scale Software Research Lab

2025

Mapping Spiking Neural Networks to Heterogeneous Crossbar Architectures using Integer Linear Programming. Devin Pohl, Aaron Young, Kazi Asifuzzaman, Narasinga Rao Miniskar, Jeffrey S. Vetter. 2025 Design, Automation & Test in Europe Conference (DATE). March 2025.
Asynchronous Distributed-Memory Parallel Algorithm for k-mer Counting. Souvadra Hati, Akihiro Hayashi, Richard Vuduc. 39th IEEE International Parallel & Distributed Processing Symposium (IPDPS25). June 2025.
Enhancing Productivity and Performance of HClib-Actor with Efficient Task Termination. Youssef Elmougy, Nirjhar Deb, Akihiro Hayashi, Vivek Sarkar. 27th Workshop on Advances in Parallel and Distributed Computational Models (APDCM, co-located with IPDPS25) (to appear).
Divide, Conquer, and Match: A Distributed and Asynchronous Approach for Subgraph Isomorphism. Youssef Elmougy, Akihiro Hayashi, Vivek Sarkar. Workshop on Graphs, Architectures, Programming, and Learning (GrAPL, co-located with IPDPS25).
ActorISx: Exploiting Asynchrony for Scalable High-Performance Integer Sort. Youssef Elmougy, Shubhendra Pal Singhal, Akihiro Hayashi, Vivek Sarkar. IEEE TCSC International Scalable Computing Challenge (SCALE 2025, co-located with CCGRID25).

2024

Asynchronous Distributed-Memory Parallel Algorithms for Influence Maximization. Shubhendra Pal Singhal, Souvadra Hati, Jeffrey Young, Vivek Sarkar, Akihiro Hayashi, Richard Vuduc. International Conference for High Performance Computing, Networking, Storage, and Analysis (SC’24). Nov 2024.
ActorProf: A Framework for Profiling and Visualizing Fine-grained Asynchronous Bulk Synchronous Parallel Execution. Jiawei Yang, Shubhendra Pal Singhal, Jun Shirako, Akihiro Hayashi, Vivek Sarkar. Workshop on Programming and Performance Visualization Tools (ProTools2024, co-located with SC24). November 2024.
Enabling User-level Asynchronous Tasking in the FA-BSP Model – Case Study: Distributed Triangle Counting. Akihiro Hayashi, Shubhendra Pal Singhal, Youssef Elmougy, Jiawei Yang. The Vivek Sarkar Festschrift Symposium (VIVEKFEST2024, co-located with SPLASH24). October 2024.
Intrepydd: Toward Performance, Productivity, and Portability for Massive Heterogeneous Parallelism. Jun Shirako, Tong Zhou, Akihiro Hayashi. The Vivek Sarkar Festschrift Symposium (VIVEKFEST2024, co-located with SPLASH24). October 2024.
On the Cloud We Can’t Wait: Asynchronous Actors Perform Even Better on the Cloud. Aniruddha Mysore, Youssef Elmougy, Akihiro Hayashi. The Vivek Sarkar Festschrift Symposium (VIVEKFEST2024, co-located with SPLASH24). October 2024.
Visualizing Correctness Issues in OpenMP Programs. Feiyang Jin, Alan Tao, Lechen Yu, Vivek Sarkar. IWOMP 2024. September 2024.
Bottleneck Scenarios in use of the Conveyors Message Aggregation Library. Shubhendra Pal Singhal, Akihiro Hayashi, Vivek Sarkar. IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS’24, Poster Session). May 2024.
Asynchronous Distributed Actor-Based Approach to Jaccard Similarity for Genome Comparisons. Youssef Elmougy, Akihiro Hayashi, Vivek Sarkar. 39th International Conference ISC High Performance. May 2024.
A Distributed, Asynchronous Algorithm for Large-Scale Internet Network Topology Analysis. Youssef Elmougy, Akihiro Hayashi, Vivek Sarkar. IEEE TCSC International Scalable Computing Challenge (SCALE 2024, co-located with CCGRID24) Recipient of Best SCALE Challenge Award. May 2024.
APPy: Annotated Parallelism for Python on GPUs. Tong Zhou, Jun Shirako, Vivek Sarkar. The 33rd ACM SIGPLAN International Conference on Compiler Construction (CC ’24), March 2024.

2023

Early notice: GenAI-based Datarace Fix for Real-World Golang Programs. Feiyang Jin, Zhizhou Zhang, Rajkishore Barik, Gautam Korlam, and Milind Chabbi. Machine Learning for Systems Workshops at 37th NeurIPS, December 2023.
Concrete Type Inference for Code Optimization Using Machine Learning with SMT Solving. Fangke Ye, Jisheng Zhao, Jun Shirako, Vivek Sarkar. SPLASH/OOPSLA 2023, October 2023.
Towards Safe HPC: Productivity and Performance via Rust interfaces for a Distributed C++ Actors library. John T. Parrish, Nicole Wren, Tsz Hang Kiang, Akihiro Hayashi, Jeffrey Young, Vivek Sarkar. 20th International Conference on Managed Programming Languages & Runtimes (MPLR, co-located with SPLASH). October 2023.
Dynamic Determinacy Race Detection for Task-Parallel Programs with Promises. Feiyang Jin, Lechen Yu, Tiago Cogumbreiro, Jun Shirako, and Vivek Sarkar. European Conference on Object-Oriented Programming (ECOOP), July 2023.
Enabling CHIP-SPV in Chapel GPUAPI module. Jisheng Zhao, Akihiro Hayashi, Brice Videau, and Vivek Sarkar. The 10th Annual Chapel Implementers and Users Workshop (CHIUW 2023), June 2023.
Enabling Multi-threading in Heterogeneous Quantum-Classical Programming Models. Akihiro Hayashi, Austin Adams, Jeffrey Young, Alexander McCaskey, Eugene Dumitrescu, Vivek Sarkar, Thomas M. Conte. IPDPS Workshop on Quantum Computing Algorithms, Systems, and Applications (Q-CASA, co-located with IPDPS23). May 2023. [arXiv]
Highly Scalable Large-Scale Asynchronous Graph Processing using Actors. Youssef Elmougy, Akihiro Hayashi, Vivek Sarkar. IEEE TCSC International Scalable Computing Challenge (SCALE 2023, co-located with CCGRID23) Recipient of Best SCALE Challenge Award. May 2023.
A Fine-grained Asynchronous Bulk Synchronous parallelism model for PGAS applications. Sri Raj Paul, Akihiro Hayashi, Kun Chen, Youssef Elmougy, Vivek Sarkar. Journal of Computational Science, April 2023.

2022

Leveraging the Dynamic Program Structure Tree to Detect Data Races in OpenMP Programs. Lechen Yu, Feiyang Jin, Joachim Protze, Vivek Sarkar. Sixth International Workshop on Software Correctness for HPC Applications (Correctness), November 2022.
MiniKokkos: A Calculus of Portable Parallelism. Feiyang Jin, J. Jacobson, S. D. Pollard, Vivek Sarkar. Sixth International Workshop on Software Correctness for HPC Applications (Correctness), November 2022.
ReACT: Redundancy-Aware Code Generation for Tensor Expressions. Tong Zhou, Ruiqin Tian, Rizwan Ashraf, Gokcen Kestor, Roberto Gioiosa, Vivek Sarkar. The 31st International Conference on Parallel Architectures and Compilation Techniques (PACT), October 2022.
Automatic Parallelization of Python programs for Distributed Heterogeneous Computing. Jun Shirako, Akihiro Hayashi, Sri Raj Paul, Alexey Tumanov, and Vivek Sarkar. 28th International European Conference on Parallel and Distributed Computing (EuroPar), August 2022.
A Multi-Level Platform-Independent GPU API for High-Level Programming Models. Akihiro Hayashi, Sri Raj Paul, and Vivek Sarkar. HPC on Heterogeneous Hardware Workshop (H3, co-located with ISC22), June 2022.
Accelerating CHAMPS on GPUs. Akihiro Hayashi, Sri Raj Paul and Vivek Sarkar. The 9th Annual Chapel Implementers and Users Workshop (CHIUW 2022), June 2022.
A Productive and Scalable Actor-based Programming System for PGAS Applications. Sri Raj Paul, Akihiro Hayashi, Kun Chen, and Vivek Sarkar. The 22nd International Conference on Computational Science (ICCS 2022), June 2022.
Optimized Scheduling and Resource Allocation for Thread Parallel Architectures. Sana Damani, Ph.D. Thesis, May 2022.
Memory Access Scheduling to Reduce Thread Migrations. Sana Damani, Prithayan Barua, Vivek Sarkar. ACM SIGPLAN 2022 International Conference on Compiler Construction (CC 2022).
GPU Subwarp Interleaving. Sana Damani, Mark Stephenson, Ram Rangan, Daniel R. Johnson, Rishkul Kulkarni, and Stephen W. Keckler. The 28th IEEE International Symposium on High-Performance Computer Architecture (HPCA 2022), April 2022.

2021

OpenMP application experiences: Porting to accelerated nodes. Seonmyeong Bak, Colleen Bertoni, Swen Boehm, Reuben Budiardja, Barbara M. Chapman, Johannes Doerfert, Markus Eisenbach, Hal Finkel, Oscar Hernandez, Joseph Huber, Shintaro Iwasaki, Vivek Kale, Paul R.C. Kent, JaeHyuk Kwack, Meifeng Lin, Piotr Luszczek, Ye Luo, Buu Pham, Swaroop Pophale, Kiran Ravikumar, Vivek Sarkar, Thomas Scogland, Shilei Tian, P.K. Yeung. Parallel Computing (2022), October 2021.
SHMEM-ML: Leveraging OpenSHMEM and Apache Arrow for Scalable, Composable Machine Learning. Max Grossman, Steve Poole, Howard Pritchard, Vivek Sarkar. OpenSHMEM and Related Technologies Workshop 2021, September 2021.
Linear Promises: Towards Safer Concurrent Programming. Ohad Rau, Caleb Voss, Vivek Sarkar. 35th European Conference on Object-Oriented Programming (ECOOP), July 2021.
GPUAPI: Multi-level Chapel Runtime API for GPUs. Akihiro Hayashi, Sri Raj Paul and Vivek Sarkar. The 8th Annual Chapel Implementers and Users Workshop (CHIUW 2021), June 4, 2021. [slides] [video]
Task-Graph Scheduling Extensions for Efficient Synchronization and Communication. Seonmyeong Bak, Oscar Hernandez, Mark Gates, Piotr Luszczek, Vivek Sarkar. Proceedings of the 35th ACM International Conference on Supercomputing (ICS), June 2021. [arXiv]
ARBALEST: Dynamic Detection of Data Mapping Issues in Heterogeneous OpenMP Applications. Lechen Yu, Joachim Protze, Oscar Hernandez, Vivek Sarkar. 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2021.
Towards Chapel-based Exascale Tree Search Algorithms: dealing with multiple GPU accelerators. Tiago Carneiro, Nouredine Melab, Akihiro Hayashi, Vivek Sarkar. 18th International Conference on High Performance Computing & Simulation (HPCS 2020), March 2021. Recipient of Outstanding Paper Award.
An Ownership Policy and Deadlock Detector for Promises. Caleb Voss, Vivek Sarkar. Proceedings of the 26th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, March 2021.
Compiler analysis and optimization of memory management in modern processors. Prithayan Barua, Ph.D. Thesis, January 2021.

2020

Addressing Logical Deadlocks through Task-Parallel Language Design. Caleb Voss, Ph.D. Thesis, December 2020.
Runtime Approaches to Improve the Efficiency of Hybrid and Irregular Applications. Seonmyeong Bak, Ph.D. Thesis, December 2020.
Intrepydd: Performance, Productivity, and Portability for Data Science Application Kernels. Tong Zhou, Jun Shirako, Anirudh Jain, Sriseshan Srikanth, Thomas Conte, Richard Vuduc, Vivek Sarkar. Proceedings of the 2020 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, Onward! 2020, November 2020. [Video Presentation]
Integrating Inter-Node Communication with a Resilient Asynchronous Many-Task Runtime System. Sri Raj Paul, Akihiro Hayashi, Matthew Whitlock, Seonmyeong Bak, Keita Teranishi, Jackson Mayo, Max Grossman, Vivek Sarkar. ExaMPI’20: Proceedings of the 2020 Workshop on Exascale MPI, November 2020.
HOOVER: Leveraging OpenSHMEM for High Performance, Flexible Streaming Graph Applications. Max Grossman, Howard Pritchard, Vivek Sarkar, Steve Poole. The 3rd Annual Parallel Applications Workshop, Alternatives To MPI+X, November 2020.
PLINY: An End-To-End Framework for Big Code Analytics. Vivek Sarkar. October 2020.
Marvel: A Data-centric Compiler for DNN Operators on Spatial Accelerators. Prasanth Chatarasi, Hyoukjun Kwon, Natesh Raina, Saurabh Malik, Vaisakh Haridas, Angshuman Parashar, Michael Pellauer, Tushar Krishna, and Vivek Sarkar. arXiv preprint:2002.07752, 2020.
Vyasa: A High-Performance Vectorizing Compiler for Tensor Convolutions on the Xilinx AI Engine. Prasanth Chatarasi, Stephen Neuendorffer, Samuel Bayliss, Kees Vissers, and Vivek Sarkar. Proceedings of the 24th IEEE High Performance Extreme Computing Conference (HPEC’20), September 2020.
Advanced Graph-Based Deep Learning for Probabilistic Type Inference. Fangke Ye, Jisheng Zhao, Vivek Sarkar. arXiv preprint arXiv:2009.05949, 2020.
A Study of Memory Anomalies in OpenMP Applications. Lechen Yu, Joachim Protze, Oscar Hernandez, Vivek Sarkar. 16th International Workshop on OpenMP (IWOMP), September 2020.
OmpMemOpt: Optimized Memory Movement for Heterogeneous Computing. Prithayan Barua, Jisheng Zhao, Vivek Sarkar. 27th International European Conference on Parallel and Distributed Computing (EuroPar), August 2020.
Advancing Compiler Optimizations for General-Purpose and Domain-Specific Parallel Architectures. Prasanth Chatarasi, Ph.D. Thesis, July 2020.
Enabling Parallelism and Optimizations in Data Mining Algorithms for Power-law Data. Ankush Mandal, Ph.D. Thesis, July 2020.
MISIM: A Neural Code Semantics Similarity System Using the Context-Aware Semantics Structure. Fangke Ye, Shengtian Zhou, Anand Venkat, Ryan Marcus, Nesime Tatbul, Jesmin Jahan Tithi, Paul Petersen, Timothy Mattson, Tim Kraska, Pradeep Dubey, Vivek Sarkar, Justin Gottschlich. arXiv preprint arXiv:2006.05265, 2020.
Exploring a multi-resolution GPU programming model for Chapel. Akihiro Hayashi, Sri Raj Paul, Vivek Sarkar. 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), May 2020.
MAESTRO: A Data-Centric Approach to Understand Reuse, Performance, and Hardware Cost of DNN Mappings. Hyoukjun Kwon, Prasanth Chatarasi, Vivek Sarkar, Tushar Krishna, Michael Pellauer, Angshuman Parashar. IEEE Micro Vol. 40, no. 3, May-June 2020.
Context-Aware Parse Trees. Fangke Ye, Shengtian Zhou, Anand Venkat, Ryan Marcus, Paul Petersen, Jesmin Jahan Tithi, Tim Mattson, Tim Kraska, Pradeep Dubey, Vivek Sarkar, Justin Gottschlich. arXiv preprint arXiv:2003.11118, 2020.
Speculative reconvergence for improved SIMT efficiency. Sana Damani, Daniel R. Johnson, Mark Stephenson, Stephen W. Keckler, Eddie Yan, Michael McKeown, Olivier Giroux. Proceedings of the 18th ACM/IEEE International Symposium on Code Generation and Optimization (CGO), February 2020.

2019

Experimental Insights from the Rogues Gallery. Jeffrey S Young, Jason Riedy, Thomas M Conte, Vivek Sarkar, Prasanth Chatarasi, Sriseshan Srikanth. 2019 IEEE International Conference on Rebooting Computing (ICRC), November 2019.
Common Subexpression Convergence: A New Code Optimization for SIMT Processors. Sana Damani and Vivek Sarkar. 32nd Workshop on Languages and Compilers for Parallel Computing (LCPC), October 2019.
Understanding Reuse, Performance, and Hardware Cost of DNN Dataflows: A Data-Centric Approach. Hyoukjun Kwon, Prasanth Chatarasi, Michael Pellauer, Angshuman Parashar, Vivek Sarkar, Tushar Krishna. The 52nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), October 2019.
OMPSan: Static Verification of OpenMP’s Data Mapping Constructs. Prithayan Barua, Jun Shirako, Whitney Tsang, Jeeva Paudel, Wang Chen, Vivek Sarkar. 15th International Workshop on OpenMP (IWOMP), September 2019. Recipient of Best Paper Award.
Enabling Resilience in Asynchronous Many-Task Programming Models. Sri Raj Paul, Akihiro Hayashi, Nicole Slattengren, Hemanth Kolla, Matthew Whitlock, Seonmyeong Bak, Keita Teranishi, Jackson Mayo and Vivek Sarkar. 25th International European Conference on Parallel and Distributed Computing (Euro-Par), August 2019.
Optimized Execution of Parallel Loops via User-Defined Scheduling Policies. Seonmyeong Bak, Yanfei Guo, Pavan Balaji, Vivek Sarkar. Proceedings of the 48th International Conference on Parallel Processing (ICPP), August 2019.
GPUIterator: bridging the gap between Chapel and GPU platforms. Akihiro Hayashi, Sri Raj Paul, Vivek Sarkar. Proceedings of the ACM SIGPLAN 6th on Chapel Implementers and Users Workshop (CHIUW), co-located with PLDI’19, June 2019.
T2S-Tensor: Productively Generating High-Performance Spatial Hardware for Dense Tensor Computations. Nitish Kumar Srivastava, Hongbo Rong, Prithayan Barua, Guanyu Feng, Huanqi Cao, Zhiru Zhang, Vivek Sarkar, Wenguang Chen, Paul Petersen, Geoff Lowney, Christopher Hughes, Timothy Mattson, Pradeep Dubey. 27th IEEE International Symposium On Field-Programmable Custom Computing Machines, April 2019.
Transitive Joins: A Sound and Efficient Online Deadlock-Avoidance Policy. Caleb Voss, Tiago Cogumbreiro, Vivek Sarkar. ACM Conference on Principles and Practice of Parallel Programming (PPoPP), February 2019.
Valence: Variable Length Calling Context Encoding. Tong Zhou, Michael R. Jantz, Prasad A. Kulkarni, Kshitij A. Doshi, Vivek Sarkar. 28th International Conference on Compiler Construction (CC), February 2019.
Performance evaluation of OpenMP’s target construct on GPUs – exploring compiler optimizations. Akihiro Hayashi, Jun Shirako, Ettore Tiotto, Robert Ho, Vivek Sarkar. International Journal of High Performance Computing and Networking (IJHPCN), 13(1): 54-69 (2019).

2018

Topkapi: Parallel and Fast Sketches for Finding Top-K Frequent Elements. Ankush Mandal, He Jiang, Anshumali Shrivastava, Vivek Sarkar. Advances in Neural Information Processing Systems 31 (NeurIPS), December 2018.
Mapping High Level Parallel Programming Models to Asynchronous Many-Task (AMT) Runtimes. Sri Raj Paul, Ph.D. Thesis, December 2018.
Using Polyhedral Analysis to Verify OpenMP Applications are Data Race Free. Fangke Ye, Markus Schordan, Chunhua Liao, Pei-Hung Lin, Ian Karlin, Vivek Sarkar. 2018 IEEE/ACM 2nd International Workshop on Software Correctness for HPC Applications (Correctness), November 2018.
Detecting MPI usage anomalies via partial program symbolic execution. Fangke Ye, Jisheng Zhao, Vivek Sarkar. The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC18), November 2018.
A Preliminary Study of Compiler Transformations for Graph Applications on the Emu System. Prasanth Chatarasi, Vivek Sarkar. Proceedings of the Workshop on Memory Centric High Performance Computing (MCHPC, co-located with SC18), November 2018.
A Unified Runtime for PGAS and Event-Driven Programming. Sri Raj Paul, Kun Chen, Akihiro Hayashi, Max Grossman, Vivek Sarkar. Fourth International IEEE Workshop on Extreme Scale Programming Models and Middleware (ESPM2, co-located with SC18), November 2018.
Cost-driven thread coarsening for GPU kernels. Prithayan Barua, Jun Shirako, Vivek Sarkar. 27th International Conference on Parallel Architectures and Compilation Techniques (PACT), November 2018.
In-Register Parameter Caching for Dynamic Neural Nets with Virtual Persistent Processor Specialization. Farzad Khorasani, Hodjat Asghari Esfeden, Nael Abu-Ghazaleh, Vivek Sarkar. The 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), October 2018.
Using Dynamic Compilation to Achieve Ninja Performance for CNN Training on Many-Core Processors. Ankush Mandal, Raj Barik, Vivek Sarkar. 24th International European Conference on Parallel and Distributed Computing (Euro-Par), August 2018.
GT-Race: graph traversal based data race detection for asynchronous many-task parallelism. Lechen Yu, Vivek Sarkar. 24th International European Conference on Parallel and Distributed Computing (Euro-Par), August 2018.
RegMutex: Inter-Warp GPU Register Time-Sharing. Farzad Khorasani, Hodjat Asghari Esfeden, Amin Farmahini-Farahani, Nuwan Jayasena, Vivek Sarkar. 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), June 2018.
Porting DMRG++ Scientific Application to OpenPOWER. Arghya Chatterjee, Gonzalo Alvarez, E. D’Azevedo, Wael Elwasif, Oscar Hernandez, Vivek Sarkar. International Workshop on OpenPOWER for HPC (IWOPH, co-located with ISC’18), June 2018.
HOOVER: Distributed, Flexible, and Scalable Streaming Graph Processing on OpenSHMEM. Max Grossman, Howard Pritchard, Tony Curtis, Vivek Sarkar. Workshop on OpenSHMEM and Related Technologies (OpenSHMEM 2018), March 2018.
Parallel Sparse Flow-Sensitive Points-to Analysis. Jisheng Zhao, Michael G. Burke, Vivek Sarkar. Proceedings of the 2018 International Conference on Compiler Construction (CC 2018), February 2018.
Modeling the Conflicting Demands of Multi-Level Parallelism and Temporal/Spatial Locality in Affine Scheduling. Oleksandr Zinenko, Chandan Reddy, Sven Verdoolaege, Jun Shirako, Tobias Grosser, Vivek Sarkar, Albert Cohen. Proceedings of the 2018 International Conference on Compiler Construction (CC 2018), February 2018.
General-purpose Programming Techniques for Emerging Systems with Non-volatile Byte-addressable Random Access Memory. Kumud Bhandari, Ph.D. Thesis, January 2018.

2017

Optimizing Web Virtual Reality. Rabimba Karanjai. M.S. Thesis, December 2017.
Graph500 on OpenSHMEM: Using A Practical Survey of Past Work to Motivate Novel Algorithmic Developments. Max Grossman, Howard Pritchard, Zoran Budimlić, Vivek Sarkar. Proceedings of the Second Annual PGAS Applications Workshop (PAW 17), November 2017.
Exploration of Supervised Machine Learning Techniques for Runtime Selection of CPU vs. GPU Execution in Java Programs. Gloria Kim, Akihiro Hayashi, Vivek Sarkar. Fourth Workshop on Accelerator Programming Using Directives (WACCPD), November 2017. (co-located with SC17)
Chapel-on-X: Exploring Tasking Runtimes for PGAS Languages. Akihiro Hayashi, Sri Raj Paul, Max Grossman, Jun Shirako, Vivek Sarkar. Third IEEE Workshop on Extreme Scale Programming Models and Middleware (ESPM2), November 2017. (co-located with SC17)
Deadlock Avoidance in Parallel Programs with Futures: Why parallel tasks should not wait for strangers. Tiago Cogumbreiro, Rishi Surendran, Francisco Martins, Vivek Sarkar, Vasco T. Vasconcelos, and Max Grossman. In ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). ACM, 2017.
DAMMP: A Distributed Actor Model for Mobile Platforms. Arghya Chatterjee, Srdjan Milakovic, Bing Xue, Zoran Budimlic, Vivek Sarkar. 14th International Conference on Managed Languages & Runtimes (ManLang’17), September 2017. [slides]
Implementation and Evaluation of OpenSHMEM Contexts Using OFI Libfabric. Workshop on OpenSHMEM and Related Technologies. Big Compute and Big Data Convergence, August 2017.
Distributed Communication Middleware for a Selector Model. Bing Xue. M.S. Thesis, August 2017.
Optimizing Convolutions in State-of-the-art Convolutional Neural Networks on Intel Xeon Phi. Ankush Mandal. M.S. Thesis, July 2017.
Exploring Tradeoffs in Parallel Implementations of Futures in C++. Jonathan Sharman. M.S. Thesis, July 2017. [slides]
Enhanced Data and Task Abstractions for Extreme-scale Runtime Systems. Nick Vrvilo. PhD Thesis, July 2017. [resource page]
A Marshalled Data Format for Pointers in Relocatable Data Blocks. Nick Vrvilo, Lechen Yu and Vivek Sarkar. In Proceedings of the 2017 ACM SIGPLAN International Symposium on Memory Management (ISMM). June 2017.
Dyna: Toward a self-optimizing declarative language for machine learning applications. Tim Vieira, Matthew Francis-Landau, Nathaniel Wesley Filardo, Farzad Khorasani, and Jason Eisner. In Proceedings of the First ACM SIGPLAN Workshop on Machine Learning and Programming Languages, June 2017.
Performance Evaluation of OpenMP’s Target Construct on GPUs. Akihiro Hayashi, Jun Shirako, Ettore Tiotto, Robert Ho, Vivek Sarkar. International Journal of High Performance Computing and Networking (IJHPCN), June 2017.
Preparing an Online Java Parallel Computing Course. Vivek Sarkar, Max Grossman, Zoran Budimlic and Shams Imam. 7th NSF/TCPP Workshop on Parallel and Distributed Computing Education (EduPar-17). May 2017.
A Pluggable Framework for Composable HPC Scheduling Libraries. Max Grossman, Vivek Kumar, Nick Vrvilo, Zoran Budimlic, Vivek Sarkar. The Seventh International Workshop on Accelerators and Hybrid Exascale Systems (AsHES). May 2017. [slides]
Formalization of Habanero Phasers using Coq. Tiago Cogumbreiro, Jun Shirako, and Vivek Sarkar. Journal of Logical and Algebraic Methods in Programming (JLAMP), March 2017.
Extending the Polyhedral Compilation Model for Debugging and Optimization of SPMD-style Explicitly-Parallel Programs. Prasanth Chatarasi. M.S. Thesis, April 2017 [slides].
Enabling Distributed Reconfiguration in an Actor Model. Arghya Chatterjee. M.S. Thesis, April 2017.
Debugging, Repair, and Synthesis of Task-Parallel Programs. Rishi Surendran. Ph.D. Thesis, March 2017.
Productive Programming Systems for Heterogeneous Supercomputers. Max Grossman. Ph.D. Thesis, February 2017.
Optimized Two-Level Parallelization for GPU Accelerators using the Polyhedral Model. Jun Shirako, Akihiro Hayashi, Vivek Sarkar. Proceedings of the 2017 International Conference on Compiler Construction (CC 2017), February 2017 [slides].

2016

PIPES: A Language and Compiler for Task-Based Programming on Distributed-Memory Clusters. Martin Kong, Louis-Noël Pouchet, P. Sadayappan, Vivek Sarkar. The Conference on High Performance Computing, Networking, Storage and Analysis (SC16), November 2016.
Static Cost Estimation for Data Layout Selection on GPUs. Yuhan Peng, Max Grossman, Vivek Sarkar. 7th International Workshop in Performance Modeling, Benchmarking, and Simulation of High Performance Computer Systems (PMBS16, co-located with SC16). November 2016.
Fine-grained parallelism in probabilistic parsing with Habanero Java. Matthew Francis-Landau (Johns Hopkins University), Bing Xue (Rice University), Jason Eisner (Johns Hopkins University), and Vivek Sarkar (Rice University). In Proceedings of the Sixth Workshop on Irregular Applications: Architectures and Algorithms (IA3, co-located with SC16), November 2016 [slides].
Exploring Compiler Optimization Opportunities for the OpenMP 4.x Accelerator Model on a POWER8+GPU Platform. Akihiro Hayashi, Jun Shirako, Ettore Tiotto, Robert Ho, Vivek Sarkar. Third Workshop on Accelerator Programming Using Directives (WACCPD, co-located with SC16), November 2016.
Optimized Distributed Work-Stealing. Vivek Kumar, Karthik Murthy, Vivek Sarkar and Yili Zheng. 6th workshop on Irregular Applications: Architectures and Algorithms (IA^3), ACM, November 2016 [slides].
Automatic Parallelization of Pure Method Calls via Conditional Future Synthesis. Rishi Surendran and Vivek Sarkar. 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2016), November 2016.
Pedagogy and Tools for Teaching Parallel Computing at the Sophomore Undergraduate Level. Max Grossman, Maha Aziz, Heng Chi, Anant Tibrewal, Shams Imam, Vivek Sarkar. Journal of Parallel and Distributed Computing Special Issue on Parallel, Distributed, and High Performance Computing Education. 2016.
OpenMP as a High-Level Specification Language for Parallelism. Max Grossman, Jun Shirako, Vivek Sarkar. International Workshop on OpenMP (IWOMP), October 2016.
An Extended Polyhedral Model for SPMD Programs and its use in Static Data Race Detection. Prasanth Chatarasi, Jun Shirako, Martin Kong, Vivek Sarkar. The 29th International Workshop on Languages and Compilers for Parallel Computing (LCPC), September 2016 [slides].
The Open Community Runtime: A Runtime System for Extreme Scale Computing. Timothy G. Mattson, Romain Cledat, Vincent Cave, Vivek Sarkar, Zoran Budimlic, Sanjay Chatterjee, Josh Fryman, Ivan Ganev, Robin Knauerhase, Min Lee, Benoît Meister, Brian Nickerson, Nick Pepperling, Bala Seshasayee, Sagnak Tasirlar, Justin Teller, Nick Vrvilo. In 2016 IEEE High Performance Extreme Computing Conference (HPEC ’16).
Dynamic Determinacy Race Detection for Task Parallelism with Futures. Rishi Surendran and Vivek Sarkar. 16th International Conference on Runtime Verification (RV’16), September 2016.
Declarative Tuning for Locality in Parallel Programs. Sanjay Chatterjee, Nick Vrvilo, Zoran Budimlic, Kathleen Knobe, Vivek Sarkar. The 45th International Conference on Parallel Processing (ICPP-2016), August 2016. [slides]
Integrating Asynchronous Task Parallelism with OpenSHMEM. Max Grossman, Vivek Kumar, Zoran Budimlic, Vivek Sarkar. OpenSHMEM Workshop, August 2016.
A Distributed Selectors Runtime System for Java Applications. Arghya Chatterjee, Branko Gvoka, Bing Xue, Zoran Budimlic, Shams Imam, Vivek Sarkar. 13th International Conference on the Principles and Practice of Programming on the Java Platform: virtual machines, languages, and tools (PPPJ’16), August 2016 [slides].
Design and verification of distributed phasers. Karthik Murthy, Sri Raj Paul, Kuldeep S. Meel, Tiago Cogumbreiro, and John M. Mellor-Crummey. 23rd International European Conference on Parallel and Distributed Computing (EuroPAR), August 2016.
Brief Announcement: Dynamic Determinacy Race Detection for Task Parallelism with Futures. Rishi Surendran and Vivek Sarkar. 28th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), July 2016.
SWAT: A Programmable, In-Memory, Distributed, High-Performance Computing Platform. Max Grossman, Vivek Sarkar. International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC), May 2016.
Efficient Checkpointing of Multi-Threaded Applications as a Tool for Debugging, Performance Tuning, and Resilience. Max Grossman, Vivek Sarkar. IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2016.
Formalization of phase ordering. Tiago Cogumbreiro, Jun Shirako, Vivek Sarkar. Programming Language Approaches to Concurrency- and Communication-cEntric Software (PLACES 2016), April 2016 [resource page].
Automatic Data Layout Generation and Kernel Mapping for CPU+GPU Architectures. Deepak Majeti, Kuldeep Meel, Raj Barik and Vivek Sarkar. 25th International Conference on Compiler Construction (CC 2016), March 2016.
Static Data Race Detection for SPMD Programs via an Extended Polyhedral Representation. Prasanth Chatarasi, Jun Shirako, Vivek Sarkar. 6th International Workshop on Polyhedral Compilation Techniques (IMPACT 2016), January 2016 [slides].

2015

Efficient Static and Dynamic Memory Management Techniques for Multi-GPU System. Max Grossman, Mauricio Araya-Polo. Workshop on Runtime Systems for Extreme Scale Programming Models and Architectures. November 2015.
Distributed, Heterogeneous Scheduling Techniques Motivated by Production Geophysical Applications. Max Grossman, Mauricio Araya-Polo. Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers. November 2015.
Concurrent Collections. Kathleen Knobe, Michael G. Burke, and Frank Schlimbach. Programming Models for Parallel Computing, Chapter 11, pages 247-280. Pavan Balaji, editor. The MIT Press, November 2015.
Auto-Grading for Parallel Programs. Maha Aziz, Heng Chi, Anant Tibrewal, Max Grossman, Vivek Sarkar. Workshop on Education for High-Performance Computing (EduHPC, co-located with SC15). November 2015.
LLVM-based Communication Optimizations for PGAS Programs. Akihiro Hayashi, Jisheng Zhao, Michael Ferguson, Vivek Sarkar. The 2nd Workshop on the LLVM Compiler Infrastructure in HPC (LLVM, co-located with SC15), November, 2015.
Model Checking Task Parallel Programs using Gradual Permissions. Eric Mercer, Peter Anderson, Nick Vrvilo, and Vivek Sarkar. 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), New ideas category, Lincoln, Nebraska, November 2015. [paper]
Optimized Event-Driven Runtime Systems for Programmability and Performance. Sagnak Tasirlar. Ph.D. Thesis, October 2015.
Extending Polyhedral Model for Analysis and Transformations of OpenMP Programs. Prasanth Chatarasi, and Vivek Sarkar. PACT ACM Student Research Competition, October 2015. [accepted as poster with accompanying extended abstract][poster].
Polyhedral Optimizations of Explicitly Parallel Programs. Prasanth Chatarasi, Jun Shirako, and Vivek Sarkar. 24th International Conference on Parallel Architectures and Compilation Techniques (PACT), October 2015. One of four papers selected for Best Paper session [slides].
Compiling and Optimizing Java 8 Programs for GPU Execution. Kazuaki Ishizaki, Akihiro Hayashi, Gita Koblents, Vivek Sarkar. 24th International Conference on Parallel Architectures and Compilation Techniques (PACT), October 2015.
Exploiting Parallelism in Mobile Devices. Arghya Chatterjee, Timothy Newton, Tom Roush, Hunter Tidwell, Vivek Sarkar. SPLASH 2015 Poster Session, October 2015. [accepted as poster with accompanying extended abstract][paper].
Heterogeneous Work-stealing across CPU and DSP cores. Vivek Kumar, Alina Sbirlea, Ajay Jayaraj, Zoran Budimlic, Deepak Majeti, Vivek Sarkar. 19th IEEE High Performance Extreme Computing conference (HPEC’15). September 2015. [paper]
Polyhedral Optimizations for a Data-Flow Graph Language. Alina Sbirlea, Jun Shirako, Louis-Noel Pouchet, Vivek Sarkar. The 28th International Workshop on Languages and Compilers for Parallel Computing (LCPC), September 2015.
HJ-OpenCL: Reducing the Gap Between the JVM and Accelerators. Max Grossman, Shams Imam, Vivek Sarkar. 12th International Conference on the Principles and Practice of Programming on the Java Platform (PPPJ’15), September 2015. [paper]
Machine-Learning-based Performance Heuristics for Runtime CPU/GPU Selection. Akihiro Hayashi, Kazuaki Ishizaki, Gita Koblents, Vivek Sarkar. 12th International Conference on the Principles and Practice of Programming on the Java Platform: virtual machines, languages, and tools (PPPJ’15), September 2015.
A Composable Deadlock-free Approach to Object-based Isolation. Shams Imam, Jisheng Zhao, Vivek Sarkar. 21st International European Conference on Parallel and Distributed Computing (Euro-Par’15), August 2015. [paper]
Elastic Tasks: Unifying Task Parallelism and SPMD Parallelism with an Adaptive Runtime. Alina Sbirlea, Kunal Agrawal, Vivek Sarkar. 21st International European Conference on Parallel and Distributed Computing (Euro-Par’15), August 2015.
Data Layout Optimization for Portable Performance. Kamal Sharma, Ian Karlin, Jeff Keasler, James McGraw, Vivek Sarkar. 21st International European Conference on Parallel and Distributed Computing (Euro-Par’15), August 2015.
Load Balancing Prioritized Tasks via Work-Stealing. Shams Imam, Vivek Sarkar. 21st International European Conference on Parallel and Distributed Computing (Euro-Par’15), August 2015. [paper]
HadoopCL2: Motivating the Design of a Distributed, Heterogeneous Programming System With Machine-Learning Applications. Max Grossman, Mauricio Breternitz, Vivek Sarkar. IEEE Transactions on Parallel and Distributed Systems (TPDS), 2015.
High-level execution models for multicore architectures. Alina Sbirlea. Ph.D. Thesis, July 2015.
The Eureka Programming Model for Speculative Task Parallelism. Shams Imam, Vivek Sarkar. 29th European Conference on Object-Oriented Programming (ECOOP), July 2015. [paper]
Race Detection in Two Dimensions. Dimitar Dimitrov, Martin Vechev, Vivek Sarkar. 27th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), June 2015.
Heterogeneous Habanero-C (H2C): A Portable Programming Model for Heterogeneous Processors. Deepak Majeti, Vivek Sarkar. Programming Models, Languages and Compilers for Manycore and Heterogeneous Architectures (PLC), 25th May 2015, Hyderabad, India. Co-located with IPDPS 2015.
Cooperative Execution of Parallel Tasks with Synchronization Constraints. Shams Imam. Ph.D. Thesis, May 2015.
Memory and Communication Optimizations for Macro-dataflow Programs. Dragos Sbirlea. Ph.D. Thesis, May 2015.
Portable Programming Models for Heterogeneous Platforms. Deepak Majeti. Ph.D. Thesis, May 2015.
JPF Verification of Habanero Java Programs using Gradual Type Permission Regions. Peter Anderson, Nick Vrvilo, Eric Mercer, and Vivek Sarkar. 2015. SIGSOFT Softw. Eng. Notes 40, 1 (February 2015), 1-5. [doi]
Parallelizing a Discrete Event Simulation Application Using the Habanero-Java Multicore Library. Wei-Cheng Xiao, Jisheng Zhao, and Vivek Sarkar. The 6th International Workshop on Programming Models and Applications for Multicore and Manycores (PMAM 2015), Feb 2015.
Polyhedral Transformations of Explicitly Parallel Programs. Prasanth Chatarasi, Jun Shirako, Vivek Sarkar. 5th International Workshop on Polyhedral Compilation Techniques (IMPACT 2015), January 2015. [slides]

2014

Oil and Water Can Mix: An Integration of Polyhedral and AST-based Transformations. Jun Shirako, Louis-Noel Pouchet, Vivek Sarkar. IEEE Conference on High Performance Computing, Networking, Storage and Analysis (SC’14), November 2014. [slides] One of eight Best Paper Finalists in conference.
HabaneroUPC++: a Compiler-free PGAS Library. Vivek Kumar, Yili Zheng, Vincent Cave, Zoran Budimlic, Vivek Sarkar. 8th International Conference on Partitioned Global Address Space Programming Models (PGAS14), October 2014. [slides]
HJ-Viz: A New Tool for Visualizing, Debugging and Optimizing Parallel Programs. Peter Elmers, Hongyu Li, Shams Imam, Vivek Sarkar. SPLASH 2014 Poster Session, October 2014. [accepted as poster with accompanying extended abstract]. [paper]
Selectors: Actors with Multiple Guarded Mailboxes. Shams Imam, Vivek Sarkar. 4th International Workshop on Programming based on Actors, Agents, and Decentralized Control (AGERE! 2014), October 2014. [paper, slides]
Savina – An Actor Benchmark Suite. Shams Imam, Vivek Sarkar. 4th International Workshop on Programming based on Actors, Agents, and Decentralized Control (AGERE! 2014), October 2014. [paper, slides]
Habanero-Java Library: a Java 8 Framework for Multicore Programming. Shams Imam, Vivek Sarkar. 11th International Conference on the Principles and Practice of Programming on the Java Platform: virtual machines, languages, and tools (PPPJ’14), September 2014. [paper, slides]
ADHA: Automatic Data layout framework for Heterogeneous Architectures. Deepak Majeti, Kuldeep S. Meel, Rajkishore Barik and Vivek Sarkar. (Poster) International Conference on Parallel Architectures and Compilation Techniques (PACT), August 2014.
Locality Transformations of Computation and Data for Portable Performance. Kamal Sharma. Ph.D. Thesis, August 2014.
Bounded Memory Scheduling of Dynamic Task Graphs. Dragos Sbirlea, Zoran Budimlic, Vivek Sarkar. International Conference on Parallel Architectures and Compilation Techniques (PACT), August 2014.
DFGR: an Intermediate Graph Representation for Macro-Dataflow Programs. Alina Sbirlea, Louis-Noel Pouchet, Vivek Sarkar. Fourth Workshop on Dataflow Execution Models for Extreme Scale Computing – in conjunction with PACT 2014 (DFM 2014)
Asynchronous Checkpoint/Restart for the Concurrent Collections Model. Nick Vrvilo. M.S. Thesis, August 2014. [thesis resources page]
Cooperative Scheduling of Parallel Tasks with General Synchronization Patterns. Shams Imam, Vivek Sarkar. 28th European Conference on Object-Oriented Programming (ECOOP), July 2014. [paper, slides]
Test-Driven Repair of Data Races in Structured Parallel Programs. Rishi Surendran, Raghavan Raman, Swarat Chaudhuri, John Mellor-Crummey, and Vivek Sarkar. 35th ACM Conference on Programming Language Design and Implementation (PLDI), June 2014.
Exploiting Implicit Parallelism in Dynamic Array Programming Languages. Shams Imam, Vivek Sarkar, David Leibs, Peter B. Kessler. ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming (ARRAY ’14), June 2014. [paper, slides]
A Case for Cooperative Scheduling in X10’s Managed Runtime. Shams Imam, Vivek Sarkar. The 2014 X10 Workshop (X10’14), June 2014. [paper, slides]
Optimized Runtime Systems for MapReduce Applications in Multi-core Clusters. Yunming Zhang. M.S. Thesis, May 2014.
LLVM Optimizations for PGAS Programs -Case Study: LLVM Wide Optimization in Chapel-. Akihiro Hayashi, Rishi Surendran, Jisheng Zhao, Michael Ferguson, Vivek Sarkar. The 1st Chapel Implementers and Users Workshop (co-located with IPDPS2014), May 2014.
Inter-iteration Scalar Replacement Using Array SSA Form. Rishi Surendran, Rajkishore Barik, Jisheng Zhao, Vivek Sarkar. The 23rd International Conference on Compiler Construction (CC 2014), April 2014.
Dynamic Determinism Checking for Structured Parallelism. Edwin Westbrook, Raghavan Raman, Jisheng Zhao, Zoran Budimlic, Vivek Sarkar. The 5th Workshop on Determinism and Correctness in Parallel Programming (WoDet 2014), March 2014.

2013

A Decoupled non-SSA Global Register Allocation using Bipartite Liveness Graphs. Rajkishore Barik, Jisheng Zhao and Vivek Sarkar. ACM Transactions on Architecture and Code Optimization (TACO), Volume 10 Issue 4, December 2013.
Automatic Detection of Inter-application Permission Leaks in Android Applications. Dragos Sbirlea, Michael G. Burke, Salvatore Guarnieri, Marco Pistoia, Vivek Sarkar. IBM Journal of Research and Development (Volume 57, Issue 6), November – December 2013.
Runtime Systems for Extreme Scale Platforms. Sanjay Chatterjee. Ph.D. Thesis, December 2013.
Isolation for Nested Task Parallelism. Jisheng Zhao, Roberto Lublinerman, Zoran Budimlic, Swarat Chaudhuri, Vivek Sarkar. The 29th International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), October 2013.
Speculative Execution of Parallel Programs with Precise Exception Semantics on GPUs. Akihiro Hayashi, Max Grossman, Jisheng Zhao, Jun Shirako, Vivek Sarkar. The 26th International Workshop on Languages and Compilers for Parallel Computing (LCPC), September 2013.
Expressing DOACROSS Loop Dependencies in OpenMP. Jun Shirako, Priya Unnikrishnan, Sanjay Chatterjee, Kelvin Li, Vivek Sarkar. 9th International Workshop on OpenMP (IWOMP), September 2013.
Accelerating Habanero-Java Programs with OpenCL Generation. Akihiro Hayashi, Max Grossman, Jisheng Zhao, Jun Shirako, Vivek Sarkar. 10th International Conference on the Principles and Practice of Programming in Java (PPPJ), September 2013.
Interprocedural Strength Reduction of Critical Sections in Explicitly-Parallel Programs. Rajkishore Barik, Jisheng Zhao, Vivek Sarkar. The 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT), September 2013.
The Flexible Preconditions Model for Macro-Dataflow Execution. Dragos Sbirlea, Alina Sbîrlea, Kyle B. Wheeler, Vivek Sarkar. The 3rd Data-Flow Execution Models for Extreme Scale Computing (DFM), September 2013.
Compiler-Driven Data Layout Transformation for Heterogeneous Platforms. Deepak Majeti, Rajkishore Barik, Jisheng Zhao, Vivek Sarkar and Max Grossman. The International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (HeteroPar’2013), August 2013.
Oil and Water can mix! Experiences with integrating Polyhedral and AST-based Transformations. Jun Shirako, Vivek Sarkar. 17th Workshop on Compilers for Parallel Programming (CPC), July 2013.
HJ-Hadoop: An Optimized MapReduce Runtime for Multi-core Systems. Yunming Zhang, Alan Cox, Vivek Sarkar. 5th USENIX Workshop on Hot Topics in Parallelism (HotPar ’13), June 2013. [accepted as poster with accompanying paper]. [slides]
A Transformation Framework for Optimizing Task-Parallel Programs. V. Krishna Nandivada, Jun Shirako, Jisheng Zhao, Vivek Sarkar. ACM Transactions on Programming Languages and Systems (TOPLAS), Volume 35, May 2013.
Integrating Asynchronous Task Parallelism with MPI. Sanjay Chatterjee, Sagnak Tasirlar, Zoran Budimlic, Vincent Cave, Milind Chabbi, Max Grossman, Yonghong Yan, Vivek Sarkar. IEEE International Parallel & Distributed Processing Symposium (IPDPS), May 2013.
HadoopCL: MapReduce on Distributed Heterogeneous Platforms Through Seamless Integration of Hadoop and OpenCL. Max Grossman, Mauricio Breternitz, Vivek Sarkar. International Workshop on High Performance Data Intensive Computing, May 2013 (co-located with IPDPS 2013).
Programming Models and Runtimes for Heterogeneous Systems. Max Grossman. M.S. Thesis, April 2013.
Finish Accumulators: a Deterministic Reduction Construct for Dynamic Task Parallelism. Jun Shirako, Vincent Cave, Jisheng Zhao, Vivek Sarkar. The 4th Workshop on Determinism and Correctness in Parallel Programming (WoDet), March 2013.
DOE ASCAC report on Synergistic Challenges in Data-Intensive Science and Exascale Computing, Vivek Sarkar et al, March 2013.

2012

Integrating Task Parallelism with Actors. Shams Imam, Vivek Sarkar. Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), October 2012. [paper, slides]
Rice ROSE Compositional Analysis and Transformation Framework (R2CAT). Jisheng Zhao, Michael Burke, Vivek Sarkar. LLNL Technical Report 590233, October 2012.
Determinacy and Repeatability of Parallel Program Schemata. Jack B. Dennis, Guang R. Gao, Vivek Sarkar. Workshop on Data-Flow Execution Models for Extreme Scale Computing (DFM 2012).
Folding of Tagged Single Assignment Values for Memory-Efficient Parallelism. Dragos Sbirlea, Kathleen Knobe, Vivek Sarkar. International European Conference on Parallel and Distributed Computing (Euro-Par), August 2012.
A Practical Approach to DOACROSS Parallelization. Priya Unnikrishnan, Jun Shirako, Kit Barton, Sanjay Chatterjee, Raul Silvera, Vivek Sarkar. International European Conference on Parallel and Distributed Computing (Euro-Par), August 2012.
Dynamic Data Race Detection for Structured Parallelism. Raghavan Raman. Ph.D. Thesis, August 2012.
Design, Verification and Applications of a New Read-Write Lock Algorithm. Jun Shirako, Nick Vrvilo, Eric G. Mercer, Vivek Sarkar. 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), June 2012.
Scalable and Precise Dynamic Data Race Detection for Structured Parallelism. Raghavan Raman, Jisheng Zhao, Vivek Sarkar, Martin Vechev, Eran Yahav. 33rd ACM SIGPLAN conference on Programming Language Design and Implementation (PLDI), June 2012. [slides]
- An extended version of this paper along with the correctness proofs can be found in Technical Report TR12-01.
Mapping a Data-Flow Programming Model onto Heterogeneous Platforms. Alina Sbirlea, Yi Zou, Zoran Budimlic, Jason Cong, Vivek Sarkar. Conference on Languages, Compilers, Tools and Theory for Embedded Systems (LCTES), June 2012. [slides] [doi]
Practical Permissions for Race-Free Parallelism. Edwin Westbrook, Jisheng Zhao, Zoran Budimlic, Vivek Sarkar. 26th European Conference on Object-Oriented Programming (ECOOP), June 2012.
CnC-Python: Multicore Programming with High Productivity. Shams Imam, Vivek Sarkar. 4th USENIX Workshop on Hot Topics in Parallelism (HotPar ’12), June 2012. [accepted as poster with accompanying paper]. [paper]
Efficient data race detection for async-finish parallelism. Raghavan Raman, Jisheng Zhao, Vivek Sarkar, Martin T. Vechev, Eran Yahav. Formal Methods in System Design (FMSD) 41(3): 321-347 (2012).
Mapping a Dataflow Programming Model onto Heterogeneous Architectures. Alina Sbirlea. Master’s Thesis, May 2012.
Habanero-Scala: A Hybrid Programming model integrating Fork/Join and Actor models. Shams Imam. Master’s Thesis, May 2012. [paper, slides]
Habanero-Scala: Async-Finish Programming in Scala. Shams Imam, Vivek Sarkar. The Third Scala Workshop (Scala Days 2012), April 2012. [paper, slides]
Analytical Bounds for Optimal Tile Size Selection. Jun Shirako, Kamal Sharma, Naznin Fauzia, Louis-Noel Pouchet, J. Ramanujam, P. Sadayappan, Vivek Sarkar. Proceedings of the 2012 International Conference on Compiler Construction (CC 2012), April 2012.
Report on Inter-Agency Workshop on HPC Resilience at Extreme Scale. (Editor: John T. Daly.) February 2012.
The Tuning Language for Concurrent Collections. Kathleen Knobe, Michael G. Burke. Proceedings of the 2012 International Workshop on Compilers for Parallel Computing (CPC 2012), January 2012.

2011

Integrating Stream Parallelism and Task Parallelism in a Dataflow Programming Model. Dragos Sbirlea, Master’s thesis, December 2011.
Delegated Isolation. Roberto Lublinerman, Jisheng Zhao, Zoran Budimlic, Swarat Chaudhuri, Vivek Sarkar. Proceedings of OOPSLA 2011, October 2011.
SCnC: Efficient Unification of Streaming with Dynamic Task Parallelism. Dragos Sbirlea, Jun Shirako, Ryan Newton, Vivek Sarkar. Proceedings of the Data-Flow Execution Models for Extreme Scale Computing (DFM 2011), in conjunction with PACT 2011, October 2011.
Intermediate Language Extensions for Parallelism. Jisheng Zhao, Vivek Sarkar. 5th Workshop on Virtual Machine and Intermediate Languages (VMIL’11), October 2011.
Interfacing Chapel with Traditional HPC Programming Languages. Adrian Prantl, Thomas Epperly, Shams Imam, Vivek Sarkar. PGAS11 Proceedings, October 2011. [paper, slides]
Permission Regions for Race-Free Parallelism. Edwin Westbrook, Jisheng Zhao, Zoran Budimlic, Vivek Sarkar. Proceedings of the 2nd International Conference on Runtime Verification (RV ’11), September 2011.
Data-Driven Tasks and their Implementation. Sagnak Tasirlar, Vivek Sarkar. Proceedings of the International Conference on Parallel Processing (ICPP) 2011, September 2011. [slides]
Dynamic Task Parallelism with a GPU Work-Stealing Runtime System. Sanjay Chatterjee, Max Grossman, Alina Sbirlea, Vivek Sarkar. 2011 Workshop on Languages and Compilers for Parallel Computing (LCPC), September 2011. [slides]
Habanero-Java: the New Adventures of Old X10. Vincent Cave, Jisheng Zhao, Jun Shirako, Vivek Sarkar. 9th International Conference on the Principles and Practice of Programming in Java (PPPJ), August 2011.
DrHJ — a lightweight pedagogic IDE for Habanero Java. Jarred Payne, Vincent Cave, Raghavan Raman, Mathias Ricken, Robert Cartwright, Vivek Sarkar. Tool Demonstration paper, 9th International Conference on the Principles and Practice of Programming in Java (PPPJ), August 2011.
Hardware and Software Tradeoffs for Task Synchronization on Manycore Architectures. Yonghong Yan, Sanjay Chatterjee, David Orozco, Elkin Garcia, Zoran Budimlic, Jun Shirako, Robert Pavel, Guang R. Gao, Vivek Sarkar. Proceedings of Euro-Par 2011, August 2011.
Unifying Barrier and Point-to-Point Synchronization in OpenMP with Phasers. Jun Shirako, Kamal Sharma, Vivek Sarkar. 7th International Workshop on OpenMP (IWOMP), June 2011. [slides]
Communication Optimizations for Distributed-Memory X10 Programs. Rajkishore Barik, Jisheng Zhao, David Grove, Igor Peshansky, Zoran Budimlic, Vivek Sarkar. 25th IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2011.
Scheduling Macro-Dataflow Programs on Task-Parallel Runtime Systems. Sagnak Tasirlar, Master’s thesis, April 2011. [slides]
Subregion Analysis and Bounds Check Elimination for High Level Arrays. Mackale Joyner, Zoran Budimlic, Vivek Sarkar. Proceedings of the 2011 International Conference on Compiler Construction (CC 2011), April 2011.
Lightweight Dynamic Task Creation and Scheduling on the Intel Single Chip Cloud (SCC) Processor. Deepak Majeti. Fourth Workshop on Programming Language Approaches to Concurrency and Communication-Centric Software (PLACES 2011), April 2011. [slides]
Customizable Domain-Specific Computing. Jason Cong, Vivek Sarkar, Glenn Reinman, Alex Bui. IEEE Design & Test, 2:28, pp.6-15, March 2011.
Deterministic Reductions in an Asynchronous Parallel Language. Zoran Budimlic, Michael Burke, Kathleen Knobe, Ryan Newton, David Peixotto, Vivek Sarkar, Edwin Westbrook. The 2nd Workshop on Determinism and Correctness in Parallel Programming (WoDet), March 2011.
The Concurrent Collections Programming Model. Michael G. Burke, Kathleen Knobe, Ryan Newton, Vivek Sarkar. David Padua (Ed.), Encyclopedia of Parallel Computing, Springer New York, 2011.

2010

Efficient Selection of Vector Instructions using Dynamic Programming. Rajkishore Barik, Jisheng Zhao, Vivek Sarkar. MICRO-43, December 2010.
The Concurrent Collections Programming Model. Michael G. Burke, Kathleen Knobe, Ryan Newton, Vivek Sarkar. Technical Report TR 10-12, Department of Computer Science, Rice University, December 2010. (Preprint of a chapter in Encyclopedia of Parallel Computing, 2011.)
Efficient Data Race Detection for Async-Finish Parallelism. Raghavan Raman, Jisheng Zhao, Vivek Sarkar, Martin Vechev, Eran Yahav. Proceedings of the 1st International Conference on Runtime Verification (RV ’10). November 2010. Recipient of Best Paper Award. [slides]
CnC-CUDA: Declarative Programming for GPU’s. Max Grossman, Alina Simion Sbirlea, Zoran Budimlic, Vivek Sarkar. 2010 Workshop on Languages and Compilers for Parallel Computing (LCPC), October 2010. [doi]
Parallel Object-Oriented Scientific Computing with Habanero-Java. Zoran Budimlic, Vincent Cave, Jun Shirako, Yonghong Yan, Jisheng Zhao, Vivek Sarkar, Michael Glinsky, James Gunning. 9th Workshop on Parallel/High-Performance Object-Oriented Scientific Computing (POOSC’10), co-located with SPLASH 2010, October 2010.
Modeling and Mapping for Customizable Domain-Specific Computing. Zoran Budimlic, Alex Bui, Jason Cong, Glenn Reinman, Vivek Sarkar. Workshop on Concurrency for the Application Programmer (CAP), co-located with SPLASH 2010, October 2010.
Comparing the Usability of Library vs. Language Approaches to Task Parallelism. Vincent Cave, Zoran Budimlic, Vivek Sarkar. Workshop on Evaluation and Usability of Programming Languages and Tools (PLATEAU), co-located with SPLASH 2010, October 2010.
Reducing Task Creation and Termination Overhead in Explicitly Parallel Programs. Jisheng Zhao, Jun Shirako, V. Krishna Nandivada, Vivek Sarkar. The Nineteenth International Conference on Parallel Architectures and Compilation Techniques (PACT), September 2010.
Verifying Determinism of Structured Parallel Programs. Martin Vechev, Eran Yahav, Raghavan Raman, Vivek Sarkar. Proceedings of the 17th International Static Analysis Symposium (SAS 2010), September 2010.
A Scalable Locality-aware Adaptive Work-stealing Scheduler for Multi-core Task Parallelism. Yi Guo. Ph.D. Thesis, August 2010.
Concurrent Collections. Zoran Budimlic, Michael Burke, Vincent Cave, Kathleen Knobe, Geoff Lowney, Ryan Newton, Jens Palsberg, David Peixotto, Vivek Sarkar, Frank Schlimbach, Sagnak Tasirlar. Scientific Programming, 18:3-4, pp. 203-217, August 2010.
A Study of a Software Cache Implementation of the OpenMP Memory Model for Multicore and Manycore Architecture. Chen Chen, Joseph B. Manzano, Ge Gan, Guang R. Gao, Vivek Sarkar. Proceedings of Euro-Par 2010, August 2010.
SLAW: a Scalable Locality-aware Adaptive Work-stealing Scheduler. Yi Guo, Jisheng Zhao, Vincent Cave, Vivek Sarkar. 24th IEEE International Parallel and Distributed Processing Symposium (IPDPS), April 2010.
Hierarchical Phasers for Scalable Synchronization and Reduction. Jun Shirako, Vivek Sarkar. 24th IEEE International Parallel and Distributed Processing Symposium (IPDPS), April 2010.
Compiler Support for Work-Stealing Parallel Runtime Systems. Raghavan Raman, Jisheng Zhao, Zoran Budimlic, and Vivek Sarkar. Rice Technical Report, TR10-02, March 2010.
Software Challenges in Extreme Scale Systems. V. Sarkar, W. Harrod, A.E. Snavely. SciDAC Review Special Issue on Advanced Computing: The Roadmap to Exascale, pp. 60-65, January 2010.

2009

Hierarchical Place Trees: A Portable Abstraction for Task Parallelism and Data Movement. Yonghong Yan, Jisheng Zhao, Yi Guo, Vivek Sarkar. Proceedings of the 22nd Workshop on Languages and Compilers for Parallel Computing (LCPC), October 2009.
Efficient Optimization of Memory Accesses in Parallel Programs. Rajkishore Barik. Ph.D. thesis, October 2009.
DARPA Exascale Software Study report, Vivek Sarkar et al, September 2009.
Interprocedural Load Elimination for Dynamic Optimization of Parallel Programs. Rajkishore Barik, Vivek Sarkar. The Eighteenth International Conference on Parallel Architectures and Compilation Techniques (PACT), September 2009. One of three papers selected for Best Paper session.
JCUDA: a Programmer-Friendly Interface for Accelerating Java Programs with CUDA. Yonghong Yan, Max Grossman, Vivek Sarkar. Proceedings of Euro-Par 2009, August 2009.
Chunking Parallel Loops in the Presence of Synchronization. Jun Shirako, Jisheng Zhao, V. Krishna Nandivada, Vivek Sarkar. Proceedings of the 2009 ACM International Conference on Supercomputing (ICS), June 2009.
Work-First and Help-First Scheduling Policies for Terminally Strict Parallel Programs. Yi Guo, Rajkishore Barik, Raghavan Raman, Vivek Sarkar. 23rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2009.
Phaser Accumulators: a New Reduction Construct for Dynamic Parallelism. Jun Shirako, David Peixotto, Vivek Sarkar, William Scherer. 23rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2009.
Compiler Support for Work-Stealing Parallel Runtime Systems. Raghavan Raman, M.S. Thesis, May 2009.
Programming Efficiency in Parallel Computing. Keisha Cumber, Stephanie Diehl, Chuck Koelbel, and Vivek Sarkar. 2009 Richard Tapia Celebration of Diversity in Computing Conference, April 2009.
Declarative Aspects of Memory Management in the Concurrent Collections Parallel Programming Model. Zoran Budimlic, Aparna Chandramowlishwaran, Kathleen Knobe, Geoff Lowney, Vivek Sarkar, Leo Treggiari. Proceedings of DAMP 2009 Workshop (Declarative Aspects of Multicore Programming), co-located with POPL, January 2009.
Multicore Implementations of the Concurrent Collections Programming Model. Zoran Budimlic, Aparna Chandramowlishwaran, Kathleen Knobe, Geoff Lowney, Vivek Sarkar, Leo Treggiari. Proceedings of the 2009 Workshop on Compilers for Parallel Computing (CPC), January 2009.

2008

Array Optimizations for High Productivity Programming Languages. Mackale Joyner. Ph.D. Thesis, September 2008.
Minimum Lock Assignment: A Method for Exploiting Concurrency Among Critical Sections. Yuan Zhang, Vugranam Sreedhar, Weirong Zhu, Vivek Sarkar, Guang Gao. Proceedings of the 21st Workshop on Languages and Compilers for Parallel Computing (LCPC), July 2008.
Phasers: a Unified Deadlock-Free Construct for Collective and Point-to-point Synchronization. Jun Shirako, David Peixotto, Vivek Sarkar, William Scherer. Proceedings of the 2008 ACM International Conference on Supercomputing (ICS), June 2008.
Array Optimizations for Parallel Implementations of High Productivity Languages. Mackale Joyner, Zoran Budimlic, Vivek Sarkar, Rui Zhang. Proceedings of the HIPS-POHLL workshop, co-located with IPDPS. April 2008.
Type Inference for Locality Analysis of Distributed Data Structures. Satish Chandra, Vijay Saraswat, Vivek Sarkar, Rastislav Bodik. Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming (PPoPP), February 2008.

2007

Language Extensions in Support of Compiler Parallelization. Jun Shirako, Hironori Kasahara, Vivek Sarkar. Proceedings of the Twentieth Workshop on Languages and Compilers for Parallel Computing (LCPC), October 2007.
Optimizing Array Accesses in High Productivity Languages. Mackale Joyner, Zoran Budimlic, Vivek Sarkar. Proceedings of the 2007 High Performance Computation Conference (HPCC), September 2007.

Acknowledgment

This material is based upon work supported by the National Science Foundation under Grants No. 0833166, 0938018, 0926127, 0964520, 1302570. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).