OpenACC User Group Meetups: Meeting Minutes

The OpenACC organization has so far hosted four user group meetups, co-located with:

  • Supercomputing Conference (SC16) @ Salt Lake City, Utah
  • GTC 2017 @ San Jose, CA
  • ISC 2017 @ Frankfurt, Germany
  • Supercomputing Conference (SC17) @ Denver, CO

On average, these meetups have been attended by over 65 interested scientists, researchers, developers and students from various organizations, institutions and companies across the globe.

Meetup outcomes include:

  • A Slack communication channel was created so that all attendees can stay in touch and learn from each other. In less than a year, it grew to over 300 members.
  • Several senior researchers, such as Jack Wells, Jeff Vetter and Raghu Kumar (National Center for Atmospheric Research, NCAR), shared their experiences using OpenACC on real-world scientific applications.
  • Scientific codes were solicited for the yearly GPU programming hackathons organized by OLCF and run by Fernanda Foertter; these are training forums where teams submit their applications and, upon selection, are paired with mentors who help migrate their codes to large-scale supercomputers, including TITAN and JURECA
  • Hackathon organizers from NASA, CSCS, BNL and TU-Dresden shared their experiences of how OpenACC has enabled them to do more science and less programming via 5-day hackathons.
  • Mentors from the user group meetups were recruited for these hackathons
  • Newer collaborations have emerged among participants at the meetups
  • Sample codes and small benchmark codes were collected in the OpenACC open-source GitHub repository for those who would like to teach OpenACC at their respective institutions and organizations
  • Newer locations to host OpenACC training and workshops were identified to span the US and other countries
  • Recently edited books on OpenACC were given to the user group attendees.

[Photos: OpenACC User Group, Hilton, Almaden Ballroom; GTC 2017; SC17 (Jack Wells; Raghu, NCAR)]






OpenACC User Group co-located with GTC 2018: Meeting Minutes

Thank you very much for coming!!! Hope you all had fun!! 🙂

Please share your feedback with us.

Here are some pics from the event!

My very special thanks to Julia Levites (NVIDIA), Andi Moore (PGI) and, of course, Duncan Poole (President, OpenACC) for their help with planning and execution of this meetup!!! 🙂

John Stone’s (UIUC) and Randy Allen’s (Mentor Graphics) slides are available here.

My very special thanks go out to both these speakers for setting the stage at the meetup!!!!

Michael Wolfe’s talk (PGI)

  • True deep copy (PGI is currently working on an implementation)
  • Asynchronous feature for multicore architectures
  • Multiple devices
  • Abstractions for memory hierarchies
  • Defining how OpenACC works with devices that share part of memory such as CUDA unified memory
  • Request for examples of applications or algorithms that need to be expressed differently for GPU and CPU, to drive a discussion of how to express them more abstractly allowing the implementation (compiler and runtime) to select the appropriate implementation.

July 31 – OpenACC Scientific Application User feedback session at Oak Ridge National Laboratory (ORNL)

  • If you are using OpenACC for porting your scientific applications and have been thinking about newer directives that the specification currently lacks, talk to us!

  • Your talk doesn’t need to be formal at all. What we would like to hear about are the challenges you faced using OpenACC for your applications, the newer features/directives you are thinking about, and the compiler bugs you encountered in your porting process.

  • You are welcome to come in person to ORNL or dial in via webex.

  • Write to me if you wish to be scheduled to give us your feedback.


Access to Teaching Materials

Please contact Julia Levites. These materials have helped me a lot, and I am sure they will help you too!!

Two Books:

  • “OpenACC for Programmers: Concepts and Strategies” -> GitHub consisting of exercises, hands-on examples, and solutions, edited by Sunita Chandrasekaran and Guido Juckeland

  • “Parallel Programming with OpenACC” -> GitHub consisting of exercises, hands-on examples, and solutions, edited by Rob Farber



On behalf of the wonderful OpenACC team,

Sunita Chandrasekaran

Director, OpenACC User Adoption

OpenACC Tutorial + Workshop

Thank you for taking the time to read my blog on UDEL’s GPU Hackathon and checking out the article 🙂

After the success of the GPU Hackathon (linked to the video) in early May this year, NVIDIA and PGI decided to come back to the University of Delaware campus to host lectures and workshops on OpenACC from June 7th through June 9th, 2016.

Yay!!!! Thank you NVIDIA/PGI! 🙂

Tutorial on Day 1 was open to all registrants.

Presented by NVIDIA team members Mathew Colgrove, Abel Brown and Barton Fiske, the workshop introduced programming techniques using OpenACC and included topics such as optimization and profiling methods for GPU programming. The teams used PGI OpenACC compilers.


Barton Fiske of NVIDIA kicked off the 3-day workshop by sharing how NVIDIA has emerged as the world leader in visual computing, along with the different programs NVIDIA offers to academia, including the teaching kit.


The lectures and hands-on exercises were given by Mathew Colgrove of NVIDIA’s PGI compiler team covering the use of OpenACC on GPU accelerated systems. GPUs, as you know, are the most pervasive parallel computing model, used by over 300,000 developers worldwide.

Attendees included faculty, undergraduate and postgraduate students, and research scientists from Computer & Information Sciences, Electrical and Computer Engineering, Chemical and Biomolecular Engineering, and Mechanical Engineering at the University of Delaware, as well as faculty and students from Prof. Tomasz Smolinski’s team @ Delaware State University and from Prof. Haklin Kimm @ East Stroudsburg University, Pennsylvania.

When you are teaching CS to non-CS students, don’t you feel you are on top of the world? 🙂 I do! Interdisciplinary research is so important!


OpenACC compilers can also target multicore platforms. Yes, they can. Read more.

So if you want to try the OpenACC programming model on your quad-core or dual-core laptop, you can simply download the OpenACC Toolkit (free for academia), which includes the popular PGI Accelerator Fortran/C Compiler and developer tools for acceleration with OpenACC (this is just in case you do not have a PGI compiler license yet). More useful resources along with online course materials.


Mini-workshop on Day 2 and Day 3: 

A team of 4 from East Stroudsburg University (ESU) and a team of 6 from Delaware State University (DSU) worked on parallelizing their codes, which implemented an Evolutionary Algorithm, a Dynamic Programming algorithm and a Satellite Image Processing algorithm.

The team from ESU had been using MATLAB for image processing and had been trying to use OpenMP and OpenACC directives. To them, it seemed impossible to use directives for their image processing code.

A CS Master’s student from ESU, Aakashdeep Goyal, says: “The workshop was not only limited to discuss the OpenACC framework but also provided a background study of the various existing parallel processing alternatives through open discussions.”

So this is the part I enjoy the most about the Hackathon as well as the workshop. It’s just a brilliant forum to brainstorm ideas on the white board with mentors and a bunch of eagerly-awaiting-to-learn participants. This group learnt that “libraries” are the way to go!! NVIDIAns helped the team use OpenCV libraries instead of MATLAB, and the team was able to integrate them with OpenMP on Ubuntu 14.04, using Eclipse. Since the code was in MATLAB to begin with, the team spent both days converting the code to OpenCV.

Now that the team has undergone vigorous training on OpenACC and knows how to use OpenCV, they plan to use OpenACC directives with the C++ OpenCV code and, later on, CUDA. The aim is to test the non-parametric regression model along with other filtering algorithms for edge detection and linkage using the OpenACC directives. The team is confident of having working OpenACC code within the next several weeks. (This sounds positive so stay tuned for updates :-)!)

Another algorithm, presented by one of their team members, Zuqing Z Li, was the Dynamic Programming algorithm. This is a classic wavefront-based problem! Every cell depends on all of its neighboring cells, which makes it a very interesting problem: unless you fully compute the upper triangle, you cannot compute the cells of the leading diagonal, and so on and so forth. Other research groups have used CUDA to exploit wavefront parallelization. So we discussed with the team some of the CUDA strategies that could be carried over to OpenACC, and the team is looking forward to implementing some of those strategies and probably even using MPI + OpenACC across nodes.
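To make the wavefront idea concrete, here is a minimal sketch in C. It is not the team's code: the longest-common-subsequence recurrence, the function name `lcs_wavefront` and the flattened table layout are all assumptions chosen for illustration. Each cell reads its north, west and north-west neighbors, so all cells on one anti-diagonal can be computed in parallel once the previous two diagonals are done.

```c
#include <stdlib.h>
#include <string.h>

/* Longest-common-subsequence length, filled in wavefront (anti-diagonal)
 * order. The LCS recurrence is a stand-in chosen for illustration only. */
int lcs_wavefront(const char *a, const char *b)
{
    int n = (int)strlen(a);
    int m = (int)strlen(b);
    int w = m + 1;                        /* row stride of the flattened table */
    int *t = calloc((size_t)(n + 1) * (size_t)w, sizeof *t);
    if (t == NULL)
        return -1;

    /* Row 0 and column 0 stay zero. Cell (i, j) sits on anti-diagonal
     * d = i + j and reads only cells from diagonals d - 1 and d - 2. */
    #pragma acc data copy(t[0:(n + 1) * w]) copyin(a[0:n], b[0:m])
    {
        for (int d = 2; d <= n + m; ++d) {
            int lo = (d - m > 1) ? d - m : 1;  /* first valid row on diagonal d */
            int hi = (d - 1 < n) ? d - 1 : n;  /* last valid row on diagonal d */

            /* Cells on the same diagonal do not depend on each other. */
            #pragma acc parallel loop
            for (int i = lo; i <= hi; ++i) {
                int j = d - i;
                int up = t[(i - 1) * w + j];
                int left = t[i * w + (j - 1)];
                int best = (up > left) ? up : left;
                if (a[i - 1] == b[j - 1])
                    best = t[(i - 1) * w + (j - 1)] + 1;
                t[i * w + j] = best;
            }
        }
    }

    int result = t[n * w + m];
    free(t);
    return result;
}
```

Without an OpenACC compiler the pragmas are ignored and the code runs serially with the same result. The outer loop over diagonals stays sequential; only the inner loop is parallel, which is why short diagonals near the corners leave the device under-used.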


The team from DSU brainstormed parallelization of an Evolutionary Algorithm. These are algorithms inspired by the biological model of evolution. The Genetic Algorithm (GA) is the most common type of Evolutionary Algorithm. The team came to the workshop with the bulk of the evolutionary library already written, and their goal was to learn ways to parallelize the algorithm. As the slide presented by Prof. Tomasz Smolinski shows, the library, written in C++, has been in development since 1997 (lots of legacy code!!!)

The team’s goal was to transplant their Multi-Objective Evolutionary Algorithms (MOEA) library onto the GPU platform. The library is application-agnostic, and has been successfully utilized in various domains, including computational modeling of neurons, signal decomposition, and mining for association rules in large data sets. Ultimately, the library will be the engine behind their open-source application, called NeRvolver, which will allow users from all over the world to generate and analyze neuronal models through a web interface.

The team spent most of Day 2 brainstorming with Tristan Vanderbruggen and Robert Searles, mentors from the University of Delaware and expert accelerator programmers, about how to manage moving data between the host and the device.

Aha moment!! After several white board sessions, the conclusion was that the new code would create the initial genotypes on the GPU, after which crossover and mutation would occur. Then these individuals would be sent to the simulator, which returns the fitness values of these models to the GPU. On the GPU, they also hoped to store their archive of elite models, which would be updated throughout the simulation.

But wait a minute, that was not all. There was yet another challenge: the size of the archive would change over time and become larger than the population (i.e. the size of each generation), so how should the appropriate space be allocated on the device???

Well – I guess they were glad that they have identified the challenge! 🙂 Sometimes finding the problem can be a challenge (Now, how many of you have experienced that!! ;-))

By the end of Day 3, with Mat Colgrove’s help, the team had working OpenACC C++ code for the algorithm!!! The code had several compute kernels, indicating that it was thoroughly compute-intensive and could benefit from GPUs using OpenACC.

Although they are at the beginning of the tunnel at the moment, Karla M Miletti, an undergraduate student in the CIS department at DSU, is hopeful of taking this to the next level. She says:

“Before Wednesday we were simply hopeful we could use GPU’s to optimize our algorithm since the evolutionary library actually passes control to a simulator (such as Neuron) which usually runs sequentially. However I think we managed to find a good application of OpenACC and high performance computing to our evolutionary algorithm. Eventually we hope to figure out how to parallelize the simulator”.


UDEL GPU Programming Hackathon, 2016

DAY 5 (LAST DAY) May 06 – 2016 (Scroll down for Day 1, 2, 3, 4 updates)

<for more details on the Hackathon>


OMG! I cannot begin to describe what a tremendous experience it was to have these several fabulous teams for a WEEK at UDEL. HATS OFF TO MENTORS FROM NVIDIA, PGI, ORNL, UTK & CORNELL. If you would like to get a 90-second overview of this week-long hackathon, check out:

Are you curious how intense the week was? (We had a 7PM reservation for beer night at a local restaurant. It was close to 8PM and nobody stopped hacking or left the room!) :-)!  Well thankfully our reservation wasn’t cancelled when we finally got there :-).


One of the teams, from Chemical & Biomolecular Engineering and with quite a limited background in Computer Science, moved from Windows to Linux this week (and naturally got a round of applause for just that) and has already noticed improvement in their numbers. #CSforall matters!!!


Teams that hadn’t used OpenACC before participating in the Hackathon picked up the high-level directive-based model pretty quickly, moved code to TITAN and even started to optimize and observed some speedup! #Directives matter! For one of the other teams, the jury was still out on OpenACC vs CUDA vs X. More testing and investigation to do.

Other random notes: One technique to reduce launch overhead is to merge multiple memcpy calls into one by allocating a single array and doing pointer math. Sometimes the CPU can be the bottleneck? Oh, and this shouldn’t be surprising: to get better speedup on the GPU, you might want to consider moving smaller kernels to the CPU. The size of the data MATTERS to get the BEST out of the GPU.
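The "one array plus pointer math" tip can be sketched as follows. This is an illustration under assumptions, not code from any of the teams: the `packed_fields` struct, its field names and the two helper functions are hypothetical. Three logical arrays live in one allocation, so a single data clause moves all of them in one transfer instead of three.

```c
#include <stdlib.h>

/* Three logical arrays carved out of one allocation (hypothetical layout). */
typedef struct {
    double *base;        /* start of the single allocation */
    double *x, *y, *z;   /* views into base, n elements each */
    size_t n;
} packed_fields;

/* Allocates one buffer and sets up the views with pointer arithmetic.
 * Returns 0 on success, -1 if the allocation fails. */
int packed_fields_init(packed_fields *p, size_t n)
{
    p->base = malloc(3 * n * sizeof *p->base);
    if (p->base == NULL)
        return -1;
    p->x = p->base;
    p->y = p->base + n;
    p->z = p->base + 2 * n;
    p->n = n;
    return 0;
}

/* z = x + y, with all three fields moved by one copy clause. */
void packed_fields_add(packed_fields *p)
{
    double *buf = p->base;
    size_t n = p->n;

    #pragma acc parallel loop copy(buf[0:3 * n])
    for (size_t i = 0; i < n; ++i)
        buf[2 * n + i] = buf[i] + buf[n + i];
}
```

The caller still reads and writes `p->x`, `p->y` and `p->z` as if they were separate arrays, and releases everything with a single `free(p->base)`. Copying the whole buffer both ways is the simplest form; splitting inputs and outputs into separate clauses would need separate sub-ranges of `buf`.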

THINK MEMORY FIRST! It’s all about memory. Platforms are only getting more and more complex with deeper and deeper memory hierarchies. Plenty of research to do. Ph.D. students: are you listening?

If you want to make some real progress at a Hackathon, you are better off breaking down a large code (several thousands of LOC) into several sub-problems.

If it is legacy code that is too optimized for CPU, you are better off starting at a less mature point for a GPU implementation.

Depending on code characteristics, it may take quite an amount of refactoring to benefit from GPUs.

Sometimes you may have a mini app working OK with correct results and satisfactory performance, but you may not see a similar outcome on a real app.

Migrating legacy code is tough, time consuming, energy draining with not a lot of hope. However you ‘cannot’ afford to NOT be in the game. Architectures are changing – RAPIDLY. The applications ‘have’ to catch up. Refactoring has to be an option to be seriously considered.

Report compiler bugs. Workarounds are quick fixes and not a permanent solution.

One team with a highly complicated C++ code with regular data access patterns (originally designed for OpenMP + MPI + SIMD on Intel KNL) is, after going through this GPU programming experience, now hopeful and quite determined to move their code to GPUs.

Once the mentors are assigned, make sure you bring them up to speed on the algorithm, its complexity and your expectations.

While waving good-bye, one of the teams asked “could we have a 2-week hackathon?” :-)! My eyes lit up! And just that made my day and that of Fernanda Foertter (my co-organizer from Oak Ridge National Lab)!!!

Programming exascale machines is a challenge, but with such training events, I think it is a pleasant challenge!!!


DAY 4, May 05 – 2016 (Scroll down for Day 1, 2, and 3 updates)


The Plateau of Enlightenment !!

More bugs and fixes. Eureka moments. Aha moments! “Hey, why didn’t I try this on Day 1?” moments! I hope these pictures give you an idea of the mood in the room 🙂

Mixed feelings and experiences about using pinned/managed memory, but the bottom line is that finding the best strategy to use memory and caches efficiently is the key to success! Other tips: expose enough parallelism to saturate the device, and keep data copies between the CPU and GPU to a minimum, or at least overlap the communication with computation.

You may use the ‘async’ clause on a parallel or kernels construct to launch work into a queue asynchronously, for example to execute loops asynchronously; this also helps with pipelining. However, if the operations were already saturating the device, do not expect the ‘async’ clause to help much; in other words, do not expect the operations to interleave.
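As a rough sketch of what pipelining with ‘async’ can look like, the function below splits an array into chunks and alternates between two queues, so that the transfer of one chunk can overlap with the computation on another. The function name `scale_pipelined`, the chunking scheme and the choice of two queues are assumptions made for this example, not something the teams reported using.

```c
#include <stddef.h>

#define NUM_QUEUES 2   /* hypothetical choice; tune for the device */

/* Multiplies every element by factor, one chunk at a time. Each chunk's
 * upload, kernel and download go into the same async queue, so they stay
 * ordered, while different queues are free to overlap. */
void scale_pipelined(float *data, size_t n, size_t chunk, float factor)
{
    if (chunk == 0)
        chunk = 1;

    #pragma acc data create(data[0:n])
    {
        int q = 0;
        for (size_t start = 0; start < n; start += chunk) {
            size_t len = (n - start < chunk) ? n - start : chunk;

            #pragma acc update device(data[start:len]) async(q)
            #pragma acc parallel loop async(q)
            for (size_t i = start; i < start + len; ++i)
                data[i] *= factor;
            #pragma acc update self(data[start:len]) async(q)

            q = (q + 1) % NUM_QUEUES;
        }
        /* Block until every queue has drained before leaving the region. */
        #pragma acc wait
    }
}
```

If a single chunk's kernel already saturates the device, the queues will mostly run back to back and the gain comes only from hiding the copies, which matches the caveat above.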

Other optimizations that were tried and tested included tiling, nested gangs and loop fission. Some students continued to explore better ways to program on multicore using OpenACC.

‘Hero’ profiler tools – nvprof, TAU among others. Without these tools helping identify optimization opportunities, we would be nowhere! Profilers help identify causes for performance limitations; is it due to memory bandwidth? Compute Resources? Latency issues? One of the teams was digging deeper into tuning low-level parameters by leveraging nvprof output.

By popular request from the teams, an hour was dedicated to learning more about TAU. Although I didn’t record this talk, check out a similar talk from the Extreme Scale Computing Program:

And we were sugar high by the end of the day! 🙂

DAY 3, May 04 – 2016 (Scroll down for Day 1 and 2 updates)


New day, New beginning, New ideas and strategies = Slope of Hope !!!


Some codes exposed interesting bugs and corner cases, which have been reported and filed. These are usually considered the best case studies for improving compiler implementations.



Fine-tuned, manageable kernels with reduced LOC have now been ported to GPUs using OpenACC. They are performing better than on CPUs, and the team is looking to further improve the performance by exploring the gang, worker and vector levels of parallelism. Another team is investigating how to overlap communication with computation.
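For readers new to these levels of parallelism, here is a small sketch of how they are typically spelled in OpenACC. The dense matrix-vector product and the name `matvec` are stand-ins picked for this example; they are not one of the teams' kernels. Rows are spread across gangs, and the dot product inside each row is spread across vector lanes with a reduction.

```c
/* y = A x for a dense row-major matrix with the given number of rows
 * and columns. Gangs take rows; vector lanes share one row's dot product. */
void matvec(const double *a, const double *x, double *y, int rows, int cols)
{
    #pragma acc parallel loop gang copyin(a[0:rows * cols], x[0:cols]) copyout(y[0:rows])
    for (int r = 0; r < rows; ++r) {
        double sum = 0.0;

        #pragma acc loop vector reduction(+:sum)
        for (int c = 0; c < cols; ++c)
            sum += a[r * cols + c] * x[c];

        y[r] = sum;
    }
}
```

The worker level sits between the two and is usually introduced when there is a third loop to map; whether explicit mapping beats the compiler's default choice is something to check with a profiler rather than assume.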


Codes with deeper and deeper nested templates could, to an extent, be tackled with the motto ‘comment and conquer’. The team is now moving on and considering looking into a smaller C++ code. One of the other teams is exploring OpenACC’s interoperability with OpenMP and even considering using 2 GPUs. Larger datasets matter here!

If you don’t already know, OpenACC codes can run on multicore (Note: use PGI 15.10 onwards if you want to run your OpenACC code on a multicore platform). Do not miss PGI’s Michael Wolfe’s article on “OpenACC for Multicore CPUs”.

A team with a chemical and biomolecular engineering background that had never used CUDA or programmed GPUs has now profiled their code to find the ‘hot spots’, ported the code to the TITAN supercomputer and is already seeing some speedup! Isn’t that fascinating?!


DAY 2, May 03 –  2016 (Scroll down for Day 1 updates)


So what was Day 2 like?

Let’s start with a math equation, shall we? 🙂

Programmers’ patience tested!

Profilers like TAU and nvprof are every team’s best friends at this point. One team’s kernel is over 5K LOC, showing low latency and a poor data access pattern. Another team is working on C++ codes and, as observed in past hackathons, it’s been a challenge to use OpenACC on such codes, which are deeply nested and heavily templated. One of the other teams is already seeing ~1.2x speedup on GPUs using OpenACC compared with OpenMP.

Some of the optimizations that the teams have been using include loop reorganization, kernel splitting, and flattening call structures for C++ codes. Some are restructuring their codes and trying to use the memory and caches efficiently.

Corner cases are being reported to the compiler developers and this sort of feedback is really important to improve OpenACC compiler implementations!!

Here you go with a screen full of errors – well, actually it doesn’t fit within a screen!

Blue screen of despair 😉



DAY 1, May 02 – 2016


Today, May 02 2016, the GPU Programming Hackathon, co-organized with Oak Ridge National Laboratory, kicked off at the University of Delaware. Dr. Eric Nielsen from NASA Langley gave an invited talk on FUN3D, a large-scale computational fluid dynamics solver for complex aerodynamic flows seen in a broad range of aerospace (and other) applications.

6 teams are participating in this hackathon, which aims to meet several kinds of expectations: a) educate teams to program GPUs using the high-level directive-based programming models OpenACC and OpenMP, b) train teams to accelerate their codes on GPUs or CPUs, and c) provide teams with a clear roadmap on how to tap into the massive potential that GPUs can offer.

Several mentors from NVIDIA/PGI, UTK, Cornell, ORNL and UDEL with extensive programming experience are on site at UDEL to work with the teams and help them meet their expectations.

Teams are paired up with mentors. Goals are set. White boards are filled up with equations, tables, figures and what not! Codes are being migrated to ORNL’s TITAN (the world’s second largest supercomputer) as we speak!

It’s Showtime, folks!!

The participating teams are:

  • NASA Langley

FUN3D is a large-scale computational fluid dynamics solver for complex aerodynamic flows seen in a broad range of aerospace (and other) applications. FUN3D solves the Navier-Stokes equations on unstructured meshes using a node-based finite volume spatial discretization with implicit time advancement. FUN3D is approximately 800,000 lines of code.

FUN3D is predominantly written in Fortran 200x. The code can make use of a broad range of third-party libraries, depending on the application. At a minimum, an MPI implementation must be available, as well as one of the supported partitioning packages (Metis, ParMetis, or Zoltan).

FUN3D is used across the country on a wide range of systems, from desktops to large HPC resources. The code has been run on up to 80,000 CPU cores on TITAN. Select kernels have been ported to GPU, with the majority of effort to date spent on the workhorse linear solver (multicolor point-implicit). The ports use some OpenACC and some CUDA Fortran, through an ongoing collaboration with NVIDIA.

FUN3D is widely used to support major national research and engineering efforts, both within NASA and among groups across U.S. industry, the Department of Defense, and academia. A past collaboration with the Department of Energy received the Gordon Bell Prize. Some applications that FUN3D currently supports include:

– NASA aeronautics research, spanning fixed-wing applications, rotary-wing vehicles, and supersonic boom mitigation efforts.
– Design and analysis of NASA’s new Space Launch System.
– Analysis of re-entry deceleration concepts for NASA space missions, such as supersonic retro-propulsion and hypersonic inflatable aerodynamic decelerator systems.

– Development of commercial crew spacecraft at companies such as SpaceX.
– Timely analysis of vehicles and weapons systems for U.S. military efforts around the world.
– Efficient green energy concept development, such as wind turbine design and drag minimization for long-haul trucking.

  • Brookhaven National Lab’s Lattice QCD

Lattice QCD is a numerical simulation approach for solving the high-dimensional non-linear problems in strong interactions. Lattice QCD is an indispensable tool for nuclear and particle physics. The heart of Lattice QCD is Monte Carlo simulation, in which the dominant numerical cost (>90%) is matrix inversion of the type Ax=b or A^+ A x = b. Due to the high numerical cost, Lattice QCD simulations typically run on massively parallel computers and on PC clusters of up to hundreds of nodes.
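To give a feel for why the matrix-vector product dominates, here is a toy conjugate gradient solver for the A^+ A x = b form. Everything about it is a simplifying assumption: a small, real, dense, non-singular matrix (so A^+ is just the transpose), hypothetical function names, and no preconditioning. Production lattice QCD solvers work on huge sparse complex operators and are far more elaborate.

```c
#include <stdlib.h>

/* y = A x for a dense row-major n x n matrix. */
static void apply_a(const double *a, const double *x, double *y, int n)
{
    for (int i = 0; i < n; ++i) {
        double s = 0.0;
        for (int j = 0; j < n; ++j)
            s += a[i * n + j] * x[j];
        y[i] = s;
    }
}

/* y = A^T x, reading the same storage column by column. */
static void apply_at(const double *a, const double *x, double *y, int n)
{
    for (int j = 0; j < n; ++j) {
        double s = 0.0;
        for (int i = 0; i < n; ++i)
            s += a[i * n + j] * x[i];
        y[j] = s;
    }
}

static double dot(const double *u, const double *v, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; ++i)
        s += u[i] * v[i];
    return s;
}

/* Solves (A^T A) x = b by conjugate gradient without ever forming A^T A.
 * Assumes A is non-singular. Returns the number of iterations used,
 * or -1 if the work buffer cannot be allocated. */
int cg_normal(const double *a, const double *b, double *x, int n,
              int max_iter, double tol)
{
    double *r = malloc(4 * (size_t)n * sizeof *r);  /* r, p, t, q in one buffer */
    if (r == NULL)
        return -1;
    double *p = r + n, *t = p + n, *q = t + n;

    for (int i = 0; i < n; ++i) {
        x[i] = 0.0;
        r[i] = b[i];
        p[i] = b[i];
    }

    double rs = dot(r, r, n);
    int k = 0;
    while (k < max_iter && rs > tol * tol) {
        apply_a(a, p, t, n);                 /* t = A p */
        apply_at(a, t, q, n);                /* q = A^T A p */
        double alpha = rs / dot(t, t, n);    /* p.q equals |A p|^2 */
        for (int i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * q[i];
        }
        double rs_new = dot(r, r, n);
        for (int i = 0; i < n; ++i)
            p[i] = r[i] + (rs_new / rs) * p[i];
        rs = rs_new;
        ++k;
    }

    free(r);
    return k;
}
```

Each iteration costs two matrix-vector products plus a few vector updates, so the time spent in `apply_a` and `apply_at` is what a port to an accelerator would target first.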

The particular code that the team aims to port to GPUs using OpenACC is the newly engineered Grid library. It is written in C++11 at the top level, with a vectorized data layout and SIMD intrinsics targeting current and upcoming Intel CPUs with long vector registers. Right now it has OpenMP for threading and MPI for communications. The code has a total of about 60,000 lines and is evolving. But the Dslash compute kernel, mainly matrix vector multiplications, that is needed in the matrix inversions is relatively localized.

Right now Grid runs on PC clusters with Intel CPUs and achieves about 25% of peak single-node performance in single precision on the Cori Phase I machine with Intel Haswell CPUs (~600 GFlops/node); it is one of the best Lattice QCD CPU codes available to date. Turning on communications drops the performance down to about 1/4 of the single-node performance.

Grid is a new piece of code still under development, so the current user community is limited to the developers and early users. It is expected that eventually Grid will be used by many users in the lattice QCD community on the CORAL machines, and on the exascale computers further down the road. But to ensure that it will have the widest user base, it will need to be portable across different platforms. Hence the interest in using OpenACC to port it to GPUs.

The constructs of Grid have portability in mind. The OpenMP pragmas are contained in macros, which can be replaced with OpenACC pragmas on the first pass. It will be interesting to see how much more tuning is required to achieve good performance on the GPUs.
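The macro approach can be sketched with the standard C99 `_Pragma` operator. The macro names `DO_PRAGMA` and `PARALLEL_FOR` and the `axpy` loop below are hypothetical and are not taken from Grid; they only show how one loop annotation can expand to an OpenACC directive, an OpenMP directive, or nothing, depending on how the file is compiled.

```c
/* Turns its argument into a string and hands it to _Pragma. */
#define DO_PRAGMA(directive) _Pragma(#directive)

#if defined(_OPENACC)
/* OpenACC build: forward the data clauses to the directive. */
#define PARALLEL_FOR(clauses) DO_PRAGMA(acc parallel loop clauses)
#elif defined(_OPENMP)
/* OpenMP build: host threads share memory, so the clauses are dropped. */
#define PARALLEL_FOR(clauses) DO_PRAGMA(omp parallel for)
#else
/* Plain serial build. */
#define PARALLEL_FOR(clauses)
#endif

/* y = alpha * x + y, annotated once for all three builds. */
void axpy(int n, double alpha, const double *x, double *y)
{
    PARALLEL_FOR(copyin(x[0:n]) copy(y[0:n]))
    for (int i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}
```

A first pass like this only swaps the directive. Data placement, loop scheduling and the vectorized data layout usually need separate attention afterwards, which is the tuning question raised above.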

  • National Cancer Institute – CBIIT Team

The application determines RNA structure using data from small angle x-ray scattering experiments. The application has been optimized for CPU performance and parallelized with OpenMP and MPI. Preliminary explorations with GPU technologies have been performed, and a several-fold speedup is expected on GPUs. With the increasing need for RNA structures in biological applications and availability of instrument data, the code is expected to have a much broader impact across a large biophysical structure and molecular modeling community.
The primary application runs locally on the Biowulf cluster (non-GPU) and on the Mira system at Argonne National Laboratory.

  • UDEL CIS Dr. John Cavazos’s team

The application takes a graph-based representation of a program and detects whether that program is malicious; if it is, the application categorizes it into the appropriate family of malicious code based on its characteristics. This application analyzes program similarity. In particular, the focus is on static program graphs. There is a very large set of graphs (hundreds of thousands, even up to millions) on which similarity analysis needs to be performed. The algorithm currently runs on multi-core CPUs, and the goal is to port it to the GPU. A minimum of 78% efficiency is achieved across 16 CPU cores, and a similar efficiency is expected on the GPU since the problem is embarrassingly parallel. The code uses Python and C++.


  • UDEL Dr. Michael Klein’s Chem & Biomolecular Engineering Team

This application is called the Kinetic Model Builder. As its name might imply, it is used to build models based on chemical engineering principles. In order for a user to obtain a model of their engineering system, the following inputs must be given: a reaction network (e.g. A->B, BC), properties of the species involved, reactor type and conditions, and chemical kinetic parameters. With these inputs, the application creates a set of ordinary differential equations based on microkinetics and solves them using the C-based variable-order differential equation solver (CVODES) from Lawrence Livermore National Laboratory. If the user has data for the output (e.g. output concentrations) of their engineering system, then the kinetic parameters may be optimized via an adapted simulated annealing (ASA) algorithm written at Caltech.

Everything in the Kinetic Model Builder is written in C++ with object oriented coding in mind and runs on the CPU. The focus would be to optimize the code. There are 7,433 lines of ASA code. The code of ASA was written in C at CalTech but has been adapted with more C++ characteristics for its use in the Kinetic Model Builder here at UD. In terms of performance it only uses ~20% of the CPU and 20 MB of memory with no noticeable memory leaks. The time to find a solution becomes an issue and scales linearly depending on model size, with larger solution times for larger models. Performance gain is envisioned using GPU.

  • UDEL ECE Dr. Guang Gao’s Team

This application, from the physics domain, is a simple iterative solver that uses a 5-point stencil. The goal is to extend the team’s open-source dataflow-based runtime system, called DARTS, to support heterogeneous architectures with both CPU and GPU tasks. Specifically, we are coding a benchmark of a generic 5-point stencil for our RTS that runs on both CPU and GPU. The runtime system is about 10,000 lines of code, and there are several implementations of the stencil benchmarks, each around 500 LOC.
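For readers who have not met a 5-point stencil before, here is a minimal sketch written with OpenACC directives. It is not the DARTS benchmark: the Jacobi-style averaging rule, the fixed boundary values and the name `jacobi5` are assumptions made for this example. Each interior cell is replaced by the average of its four direct neighbors, and the boundary cells never change.

```c
#include <stdlib.h>
#include <string.h>

/* Runs iters Jacobi sweeps of a 5-point stencil on a row-major nx-by-ny
 * grid, in place. Boundary cells are left untouched.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int jacobi5(double *grid, int nx, int ny, int iters)
{
    size_t cells = (size_t)nx * (size_t)ny;
    double *next = malloc(cells * sizeof *next);
    if (next == NULL)
        return -1;
    memcpy(next, grid, cells * sizeof *next);

    /* Keep both buffers on the device for the whole iteration loop. */
    #pragma acc data copy(grid[0:cells]) copyin(next[0:cells])
    {
        for (int it = 0; it < iters; ++it) {
            /* Every interior update reads only the old grid. */
            #pragma acc parallel loop collapse(2) present(grid, next)
            for (int i = 1; i < nx - 1; ++i)
                for (int j = 1; j < ny - 1; ++j)
                    next[i * ny + j] = 0.25 * (grid[(i - 1) * ny + j] +
                                               grid[(i + 1) * ny + j] +
                                               grid[i * ny + (j - 1)] +
                                               grid[i * ny + (j + 1)]);

            /* Publish the new values for the next sweep. */
            #pragma acc parallel loop collapse(2) present(grid, next)
            for (int i = 1; i < nx - 1; ++i)
                for (int j = 1; j < ny - 1; ++j)
                    grid[i * ny + j] = next[i * ny + j];
        }
    }

    free(next);
    return 0;
}
```

Copying back after each sweep keeps the sketch short; swapping the two buffers instead would save one pass over memory per iteration.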

The runtime uses C++ and relies on the open-source libraries HWLOC and Intel TBB. The application uses both CPU and GPU, and the goal is to enable the runtime system to use both at the same time, exploiting the whole machine.

The expectation is to get higher performance than with a classical CPU-based (and CPU-only) approach. Right now the team is seeing ~3× speedups (compared with sequential execution time) using CUDA. The expectation is to improve this by at least ~6× if possible (this is the max peak performance on selected input sizes using OpenMP).