Videos

High-Performance Data Science with Modern C++: Xeus-Cling and G3P

I presented this webinar on February 12th, 2025, as a part of a series of weekly Compute Ontario Colloquia. It opens a series of talks about using modern C++ for high-performance data science. In this first talk, we discuss the pros and cons of using C++ for data science projects, and then cover Xeus-Cling for rapid prototyping of C++ code and G3P for embedding plots and charts in a Jupyter notebook. The whole series will be available as an executable book (https://armin.sobhani.me/high-performance-data-science-with-modern-cpp).

Friday, February 14, 2025 Read

The Emergence of WebAssembly (Wasm) in Scientific Computing

I presented this webinar on August 7th, 2024, as a part of a series of weekly Compute Ontario Colloquia. Developed collaboratively by major browser vendors, including Mozilla, Google, Microsoft, and Apple, WebAssembly (Wasm) addresses the limitations of traditional web programming languages like JavaScript. But what makes it so compelling for scientists? First, Wasm allows code written in languages like C/C++, Fortran or Rust to be compiled into its instruction format and run directly in the browser, making it accessible to anyone without installation hassles and eliminating the need for external servers. Second, with Wasm, developers can reuse existing code with near-native performance, without having to rewrite it in JavaScript. Join us as we explore how Wasm is reshaping scientific workflows and empowering researchers worldwide.

Friday, September 20, 2024 Read

p2rng – A C++ Parallel Random Number Generator Library for the Masses

I presented this webinar on October 18th, 2023, as a part of a series of weekly Compute Ontario Colloquia. p2rng (https://github.com/arminms/p2rng) is a modern header-only C++ library for parallel algorithmic (pseudo) random number generation supporting OpenMP, CUDA, ROCm and oneAPI. Playing fair, which is mostly needed for debugging and unit testing, is one of the unique features of p2rng: with the same seed and distribution, you always get the same sequence of random numbers on all supported platforms. p2rng provides parallel versions of STL’s std::generate() and std::generate_n() algorithms with the same interface. In this seminar we start with a quick review of preliminary concepts about algorithmic random number generators in general and parallelization techniques in particular. We then continue with the standard way of generating random numbers with STL algorithms and show how to turn them into parallel versions using p2rng.

Monday, October 23, 2023 Read

CUDA, ROCm, oneAPI – All for One or One for All?

I presented this webinar on April 19th, 2023, as a part of a series of weekly Compute Ontario Colloquia. For a long time, CUDA was the platform of choice for developing applications running on NVIDIA’s GPUs. That has started to change in recent years with the introduction of AMD’s ROCm and Intel’s oneAPI, both of which support GPUs from other vendors. ROCm targets both AMD and NVIDIA GPUs, while oneAPI applications, using the recently released drivers by Codeplay, can run on NVIDIA and AMD GPUs in addition to Intel’s. The question this seminar tries to answer is: if you want to start a project targeting GPUs in 2023, what would be your platform of choice? Should you go with one or all of them? Later in the seminar, a boilerplate framework named one4all is introduced that streamlines the process of developing applications targeting all the above platforms. The framework already supports unit testing with Catch2 and benchmarking with Google Benchmark.

Friday, April 21, 2023 Read

Remote Development on Clusters with VSCode - Part II

I presented this webinar on September 7th, 2022 as a part of a series of regular biweekly General Interest Webinars run by SHARCNET. Visual Studio Code (VSCode) is a free and open-source code editor developed by Microsoft for Windows, Linux, and macOS. It includes support for debugging, embedded Git version control, syntax highlighting, intelligent code completion, snippets, and code refactoring. In Part I of this seminar we covered configuring the SSH agent and config file, version control with external repositories, makefile and CMake support, and remote debugging on compute nodes using the proxy jump method:

Wednesday, September 7, 2022 Read

Remote Development on Clusters with VSCode - Part I

I presented this webinar on January 12th, 2022 as a part of a series of regular biweekly General Interest Webinars run by SHARCNET. Visual Studio Code (VSCode) is a free and open-source code editor developed by Microsoft for Windows, Linux, and macOS. It includes support for debugging, embedded Git version control, syntax highlighting, intelligent code completion, snippets, and code refactoring. It is also extensible and customizable, so users can install extensions to add new languages, themes, and debuggers, and change the editor’s keyboard shortcuts and preferences.

Wednesday, January 12, 2022 Read

Scalable Memory Allocation for Parallel Algorithms

I presented this webinar on March 17, 2021 as a part of a series of regular biweekly General Interest Webinars run by SHARCNET. In a multithreaded C/C++ program that uses a standard allocator not designed for threads, memory allocation can become a bottleneck. This is caused first by threads competing for a lock on a shared global heap, and second by caching effects. Programs that run this way are not scalable and may slow down as the number of cores increases. Scalable memory allocators such as Intel’s TBB allocators, FreeBSD’s jemalloc and Google’s TCMalloc solve this problem by providing various optimizations such as per-CPU caches, thread-private heaps, sized deletes and fast/slow path improvements. You can easily gain a 20-30% performance improvement for parallel sections, and even 4X in extreme cases, by simply relinking with a scalable memory allocator. This webinar will tell you all about these allocators, with a live session running some benchmarks at the end. Materials presented during the live session are available on GitHub.

Wednesday, March 17, 2021 Read

How to Use C++ Parallel Algorithms in a Distributed Memory Setup (i.e. MPI)

I presented this webinar on July 29th, 2020 as a part of a series of regular biweekly General Interest Webinars run by SHARCNET. Last year and earlier this year, SHARCNET presented two webinars introducing C++17 parallel algorithms (first webinar; second webinar). One interesting question came up frequently: is it possible to use them in an MPI setup? This seminar tries to address that question. First, there will be a very short intro to C++17 parallel algorithms, followed by an overview of the Partitioned Global Address Space (PGAS) parallel programming model. Then, the DASH C++ template library will be introduced. A live demonstration of installing DASH and building programs with it concludes the webinar. You can find the material presented during the live session on GitHub.

Wednesday, July 29, 2020 Read

Dipping into C++17 Parallel Algorithms with Intel's Parallel STL

I presented this webinar on February 27th, 2019 as a part of a series of regular biweekly General Interest Webinars run by SHARCNET. If you are programming or have already developed with C++, there is a good chance that you have used Standard Template Library (STL) containers and algorithms in your code. In that case, you can easily boost the performance of your existing code with the parallel algorithms introduced in C++17. The good news is you do not have to wait until support for the parallel algorithms is added to the C++ compiler of your choice. Intel’s Parallel STL is a fairly complete implementation of the C++ standard library algorithms with support for execution policies, as specified in the ISO/IEC 14882:2017 standard, also known as C++17. It is a standalone header-only library available for free on GitHub (https://github.com/intel/parallelstl). It can work with any C++11 compiler that works with Intel’s Threading Building Blocks (TBB), which is also available for free at https://www.threadingbuildingblocks.org/. In addition, if you want to use non-standard vectorization (unsequenced policies), your compiler should support OpenMP 4.0 SIMD constructs. Intel has offered to donate its implementation to both GCC and Clang.

Wednesday, February 27, 2019 Read

Harnessing the Power of Heterogeneous Computing using Boost.Compute + OpenCL

I presented this webinar on August 15th, 2018 as a part of a series of regular biweekly General Interest Webinars run by SHARCNET. The Boost.Compute library provides a C++ interface to multi-core CPU and GPGPU computing platforms based on OpenCL. It provides a high-level, STL-like API and is portable to a wide variety of parallel accelerators including GPUs, FPGAs, and multi-core CPUs. This seminar gives an overview of the library and demonstrates how to write and execute high-performance C++ applications on SHARCNET clusters.

Wednesday, August 15, 2018 Read

Visual Studio Code – Your Next Coding Companion for Advanced Research Computing

I presented this webinar on February 28th, 2018 as a part of a series of regular biweekly General Interest Webinars run by SHARCNET. Visual Studio Code (VSCode) is a free and open-source code editor developed by Microsoft for Windows, Linux, and macOS. It includes support for debugging, embedded Git version control, syntax highlighting, intelligent code completion, snippets, and code refactoring. It is also extensible and customizable, so users can install extensions to add new languages, themes, and debuggers, and change the editor’s keyboard shortcuts and preferences.

Wednesday, February 28, 2018 Read

Automating Software Build Process using CMake - Part II

I presented this webinar on April 26th, 2017 as a part of a series of regular biweekly General Interest Webinars run by SHARCNET. CMake is a cross-platform, free and open-source build system that allows you to automatically build, test, verify, package and deploy software in a compiler-independent manner. In Part I of this seminar we introduced CMake and the first three steps of our tutorial:

Wednesday, April 26, 2017 Read