Kevin Pouget / Latest news from Kevin Pouget
Senior software engineer, doctor ès debugging

Entitlement-Free Deployment of the NVIDIA GPU Operator on OpenShift (/gpu-entitlement-free/)
blog-post | PSAP | 2021-12-14

Last night, we published a blog post presenting the work I led over the past few months, in which we removed the need to deploy RHEL entitlement certificates in order to build and deploy the GPU driver of the NVIDIA GPU Operator. This requirement for a valid RHEL subscription key was a major burden for OpenShift GPU computing customers, as the key generation and deployment process couldn’t be properly automated.

This work was a great cooperation effort, as it required the enhancement of multiple parts of the overall software stack:

  • first at the team level, with enhancements of the Node Feature Discovery (Eduardo Arango) and the OpenShift Driver Toolkit container image (David Gray, Zvonko Kaiser and Ashish Kamra),
  • then at the project level, with core OpenShift corner-case bugs discovered and solved, revolving around the Driver Toolkit dynamic image-streams,
  • finally at the inter-company Open Source cooperation level, with the NVIDIA Cloud-Native team (Shiva Krishna Merla) reviewing the PRs, providing valuable feedback and spotting bugs through the successive rewrites of the logic!

Link to the blog post

Timeline of the project milestones (in 2021):

  • May 27th..June 1st: the idea of using the Driver Toolkit for entitlement-free deployment arises from a Slack discussion about solving disconnected-cluster deployment challenges. We quickly confirm that, with minor changes, the DTK provides everything required to build the NVIDIA driver.

  • July 30th..August 11th: working POC of the GPU Operator building the driver without entitlement, with no modification of the operator itself: only a bit of configuration plus manually baked YAML files

  • August 26th..~November 15th: intensive work to add seamless upgrade support to the POC and get it all polished, tested and merged in the GPU Operator

  • December 2nd: GPU Operator v1.9.0 is released, with entitlement-free deployment enabled by default on OpenShift \o/

It’s funny to see that it took only a couple of days to get the first POC working, while integrating the seamless upgrade support took two full months of work!

(Seamless upgrade support addresses the fact that, at a given time during a cluster upgrade, different nodes may run different versions of the OS. With one container image for all OS versions, no worries: the driver deployment works all the time; but with one container image per OS version, that’s another story! This is covered in depth in the blog post.)
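The per-OS-version matching problem can be sketched in a few lines of Python (a toy illustration written for this post, not the operator’s actual logic; the image names and OS-version strings are made up):

```python
# Sketch: mid-upgrade, nodes in the same cluster run different OS versions.
# With one driver container image per OS version, each node must be paired
# with the image built for its exact version. (Hypothetical names.)

def pick_driver_image(node_os_version, available_images):
    """Return the driver image built for this node's OS version,
    or None if no matching image exists yet."""
    return available_images.get(node_os_version)

# Two OS versions coexist while the cluster upgrade is in progress:
nodes = {
    "worker-0": "rhcos-4.9.0",   # already upgraded
    "worker-1": "rhcos-4.8.14",  # not yet upgraded
}

images = {
    "rhcos-4.8.14": "gpu-driver:rhcos-4.8.14",
    "rhcos-4.9.0": "gpu-driver:rhcos-4.9.0",
}

for node, os_version in nodes.items():
    print(f"{node}: {pick_driver_image(os_version, images)}")
```

The hard part handled by the real implementation is keeping that image set complete and current as the cluster upgrades underneath it, which is where the Driver Toolkit image-streams come in.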

Using NVIDIA A100’s Multi-Instance GPU to Run Multiple Workloads in Parallel on a Single GPU (/a100_benchmark/)
blog-post | PSAP | 2021-08-26

Today, my work on benchmarking NVIDIA A100 Multi-Instance GPUs running multiple AI/ML workloads in parallel has been published on the OpenShift blog:

The new Multi-Instance GPU (MIG) feature lets GPUs based on the NVIDIA Ampere architecture run multiple GPU-accelerated CUDA applications in parallel in a fully isolated way. The compute units of the GPU, as well as its memory, can be partitioned into multiple MIG instances. Each of these instances presents as a stand-alone GPU device from the system perspective and can be bound to any application, container, or virtual machine running on the node. At the hardware level, each MIG instance has its own dedicated resources (compute, cache, memory), so the workload running in one instance does not affect what is running on the other ones.

In collaboration with NVIDIA, we extended the GPU Operator to give OpenShift users the ability to dynamically reconfigure the geometry of the MIG partitioning. The geometry of the MIG partitioning is the way hardware resources are bound to MIG instances, so it directly influences their performance and the number of instances that can be allocated. The A100-40GB, which we used for this benchmark, has eight compute units and 40 GB of RAM. When MIG mode is enabled, the eighth compute unit is reserved for resource management.
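As an illustration of why the geometry matters, the seven-usable-slice constraint can be sketched as a toy validity check (written for this post, not the GPU Operator’s actual code; the profile names follow NVIDIA’s `<slices>g.<memory>gb` convention for the A100-40GB):

```python
# Toy check of a MIG partitioning geometry for an A100-40GB:
# 8 compute units in total, 7 usable for MIG instances (one is
# reserved for resource management when MIG mode is enabled).
USABLE_SLICES = 7

# Compute slices consumed by each A100-40GB MIG profile.
PROFILE_SLICES = {
    "1g.5gb": 1,
    "2g.10gb": 2,
    "3g.20gb": 3,
    "4g.20gb": 4,
    "7g.40gb": 7,
}

def geometry_is_valid(profiles):
    """True if the requested instances fit in the usable compute slices."""
    return sum(PROFILE_SLICES[p] for p in profiles) <= USABLE_SLICES

print(geometry_is_valid(["3g.20gb", "2g.10gb", "2g.10gb"]))  # 3+2+2 = 7, fits
print(geometry_is_valid(["4g.20gb", "4g.20gb"]))             # 8 > 7, rejected
```

Picking a geometry is thus a trade-off between the number of instances and the resources each one gets, which is exactly what the benchmark explores.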

NVIDIA A100 MIG benchmark

Multi-Instance GPU Support with the GPU Operator v1.7.0 (/mig-support/)
blog-post | Work | 2021-06-15

Today, my work on enabling NVIDIA GPU Operator to support the A100 Multi-Instance GPU capability has been released, and we published a blog post on the topic:

Version 1.7.0 of the GPU Operator has just landed in the OpenShift OperatorHub, with many different updates. We are proud to announce that this version comes with support for the NVIDIA Multi-Instance GPU (MIG) feature on the A100 and A30 Ampere cards. MIG is the capability of the NVIDIA GPU card to be partitioned into multiple instances that are exposed to pods as independent GPUs.

This MIG support on the GPU Operator comes from a joint effort between the NVIDIA Cloud Native team and Red Hat Performance and Latency Sensitive Applications (PSAP) team.

NVIDIA A100 MIG support

Benchmarking HPC workloads on OpenShift (/devconf/)
talk | Presentation | PSAP | 2021-02-19

Today, our talk (with my team-mate David Gray) entitled Benchmarking HPC workloads on OpenShift was presented at DevConf.CZ 2021. It’s a reworked version of what we presented at the SuperComputing 2020 OpenShift Gathering, this time more focused on my benchmarking work.

In this session, we’ll demonstrate how we used OpenShift as a proof-of-concept high-performance computing (HPC) platform for running scientific workloads.

We’ll present the set of tools and operators that were used to set up the HPC environment, then we’ll introduce two scientific applications, Gromacs and Specfem, that we benchmarked on this cluster.

We’ll detail how we ran Specfem on OpenShift with the help of a K8s Go client coordinating the application build and execution, and we’ll introduce the tool we designed to run the extensive benchmarking.

Finally, we’ll present the performance results on a 32-node cluster, comparing OpenShift against an identical bare-metal cluster.

HPC on OpenShift: Deploying Scientific Workloads on OpenShift (/openshift_commons/)
talk | Presentation | PSAP | 2020-11-17

Today, our talk (with my team-mate David Gray) entitled HPC on OpenShift: Deploying Scientific Workloads on OpenShift with the MPI Operator was presented at the OpenShift Commons workshop of the KubeCon NA conference.

High Performance Computing (HPC) workloads increasingly rely on the use of containers that make applications easier to manage, preserve their dependencies and add portability across different environments. Red Hat OpenShift Container Platform is a Kubernetes-based platform for deploying containerized applications on shared compute resources.

In this talk we will show how to effectively deploy two scientific applications, GROMACS and SPECFEM3D Globe, on OpenShift with the MPI Operator from the Kubeflow project, using two different distributed shared filesystems, Lustre and CephFS.

We also published in-depth blog posts on this topic:

OpenShift Commons

Recording: SPICE Adaptive Streaming (/spice_recording/)
talk | Presentation | 2020-04-14

Focused version

Today, I recorded a video clip presenting my work on SPICE Adaptive Streaming (based on my last talk in Grenoble).

More demos about the SPICE project are available on the website.

As part of the SPICE Adaptive Streaming project, we developed a toolbox for studying the performance of real-time video streaming. The toolbox consists of:

(1) a recording infrastructure that collects performance indicators in the guest/host/client systems,

(2) a scripted benchmarking engine that controls the systems’ configuration and the video encoding settings, and benchmarks each set of parameters one by one, and

(3) a graph visualization GUI that plots the benchmark results and allows studying the impact of each parameter’s variations.
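The core of such a scripted benchmarking engine, sweeping every combination of parameters one by one, can be sketched in a few lines (a simplified illustration; the parameter names and the placeholder benchmark function are made up for the example):

```python
import itertools

# Sketch of a parameter-sweep benchmarking loop: the engine enumerates
# every combination of settings and benchmarks them one by one.
# (Illustrative only; the real toolbox drives the guest/host/client
# systems and the video encoder with each configuration.)
parameter_space = {
    "framerate": [30, 60],
    "bitrate_mbps": [5, 10, 20],
}

def run_benchmark(settings):
    """Placeholder: apply the settings and collect performance indicators."""
    return {"settings": settings, "indicators": {}}

results = []
names = list(parameter_space)
for values in itertools.product(*parameter_space.values()):
    settings = dict(zip(names, values))
    results.append(run_benchmark(settings))

print(len(results))  # 2 framerates x 3 bitrates = 6 benchmark runs
```

The recorded indicators from each run are what the visualization GUI then plots against the swept parameters.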

In the current step, we are working on a mathematical model of the resource usage (CPU, GPU, etc.).

SPICE Adaptive Streaming @ Grenoble Python User Group (/meetup-grenoble/)
talk | Presentation | 2020-02-18

Grenoble Python User Group

This evening, I will present the work I’m currently doing as part of the Red Hat SPICE team on the performance measurement and analysis of real-time video streaming, at the Grenoble Python User Group meetup.

Working on SPICE at Red Hat (/redhat/)
2019-06-03

On June 3rd, 2019, I joined Red Hat’s SPICE team, working remotely from my home in the Grenoble area, France.

With teammates spread all over Italy, the UK, Poland, Israel, Brazil and the US, I will work on SPICE, Red Hat’s solution for remote virtual desktops: you run virtual machines on a powerful server, and you access them transparently over the LAN or the Internet.

SPICE offers features such as:

  • USB redirection (plug your mouse/keyboard/USB stick into your computer, and it shows up as if plugged into the VM),
  • file drag-and-drop, to seamlessly transfer files from your computer to the VM, as well as shared directories,
  • shared clipboard, for transparent copy-and-paste.

Happy to join Red Hat

ExaNode/ExaNeSt @ eXdci'18 ExascaleHPC workshop (/hipeac-valencia/)
talk | Presentation | 2019-01-21

HiPEAC conference workshop

Today, my colleague Alvise Rigo is in Valencia, Spain, at the HiPEAC conference, to present our work on virtualisation at the EuroExaScale joint workshop between the ExaNoDe, ExaNeSt, ECOSCALE and EuroEXA projects.

The talk is entitled:

OpenCL Accelerator Para-Virtualization

It presents the work I led on OpenCL software API Remoting.

VOSYS September Newsletter (/newsletter-18-09/)
newsletter | virtualization | 2018-09-01

Virtual Open Systems’ September newsletter was published today, with two articles about my work:

Mixed-critical virtualization: VOSySmcs, mixed-criticality graphic support

VOSySmcs consists of a full-fledged software stack supporting the modern generation of car virtual cockpits, where the In-Vehicle Infotainment (IVI) system and the Instrument Digital Cluster are consolidated and interact on a single platform. Indeed, traditional gauges and lamps are replaced by digital screens, offering opportunities for new functions and interactivity. Vehicle information, entertainment, navigation, camera/video and device connectivity are being combined into displays. However, these different pieces of information do not share the same level of criticality, and the consolidation of mixed-critical applications represents a real challenge.

In this context, VOSySmcs includes a mixed-criticality graphic support that enables the integration of safety-critical and non-critical information on a single display, while providing rendering guarantees for the safety-critical output. In addition, VOSySmcs supports GPU virtualization in order to provide hardware acceleration capabilities to the Virtual Machines running in the non-critical partition, such as Linux, Android, etc.


Computation acceleration: OpenCL inside VMs and containers

As part of the ExaNoDe H2020 research project, Virtual Open Systems develops a software API remoting solution for OpenCL. OpenCL is an open standard maintained by the Khronos Group, used for offloading computation tasks into accelerators, such as GPUs and FPGAs.

Software API remoting is a para-virtualization technique that allows accessing a host native library from inside a virtual machine. It operates by intercepting API function calls from the application in the guest system and forwarding them to a helper process on the host through shared memory pages. API remoting for containers can be achieved similarly, by replacing the host-to-VM communication layer (based on Virtio) with Linux inter-process communication mechanisms.

To comply with the high performance requirements of OpenCL usage, it is important to reduce as much as possible the overhead of the API remoting layer. Hence, the work has focused on passing the data buffers (which may account for several gigabytes of memory) with zero copies, thanks to guest physical page lookup and remapping.
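The interception-and-forwarding idea can be sketched in miniature (pure illustration: a made-up one-function “API”, with an in-process call standing in for the shared-memory/Virtio transport):

```python
# Minimal sketch of API remoting: the guest-side proxy does not execute
# the call itself; it serializes the function name and arguments and
# forwards them to a host-side dispatcher, which invokes the native
# implementation. (Toy example; the real transport crosses the VM
# boundary through shared memory pages.)

# Host side: the "native library" implementation (hypothetical).
def host_create_buffer(size):
    return {"kind": "buffer", "size": size}

HOST_DISPATCH = {"clCreateBuffer": host_create_buffer}

def host_handle(request):
    """Host helper process: look up and run the real implementation."""
    func = HOST_DISPATCH[request["name"]]
    return func(*request["args"])

# Guest side: the proxy library intercepting application calls.
def guest_call(name, *args):
    request = {"name": name, "args": args}  # would be written to shared pages
    return host_handle(request)             # would cross the VM boundary

buf = guest_call("clCreateBuffer", 4096)
print(buf["size"])  # 4096
```

In the real implementation, the expensive part is not this dispatch but moving the data buffers, which is why the zero-copy page remapping described above matters.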