This training introduces participants to High-Performance Data Analytics (HPDA), focusing on the fundamentals of High-Performance Computing (HPC) architectures and their application to large-scale data processing.
It is intended for researchers, engineers, and data scientists who work with large-scale data analytics and want to use cluster computing resources efficiently through HPC technologies.
Training Topics
✅ Introduction to HPC Architectures
- Parallel processing and its impact on data analytics performance
- Key hardware components in high-performance clusters
✅ HPC vs. HTP (High-Throughput Processing)
- Differences in workload management between HPC and HTP
- Use cases: scientific simulations vs. large-scale data analytics
- Choosing the right approach for different computational tasks
✅ PLGrid Resources for HPDA Users
- Overview of PLGrid infrastructure and available computing nodes
- Accessing and managing computing resources in PLGrid
- Practical guidelines for running HPDA workloads on PLGrid clusters
✅ HPDA on a PLGrid supercomputer: hands-on session
- Using the Python environment for HPDA deployed on the computing clusters at ACC Cyfronet AGH
- Example data analysis workflow using the SLURM scheduler and Python on multiple nodes
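To give a feel for the multi-node workflow mentioned above, here is a minimal sketch of one common way a Python script can divide work among tasks started by SLURM (for example with `srun`). It is not taken from the course materials: the `shard_for_task` helper and the `part-NNN.csv` file names are hypothetical. `SLURM_PROCID` and `SLURM_NTASKS` are environment variables that SLURM sets for each launched task; the defaults let the script also run on a laptop as a single task.

```python
import os


def shard_for_task(items, task_id, n_tasks):
    """Return the items assigned to one task, distributed round-robin."""
    return items[task_id::n_tasks]


def main():
    # SLURM sets these per task; fall back to a single local task otherwise.
    task_id = int(os.environ.get("SLURM_PROCID", "0"))
    n_tasks = int(os.environ.get("SLURM_NTASKS", "1"))

    # Hypothetical input files; a real workflow would list an actual dataset.
    files = [f"part-{i:03d}.csv" for i in range(10)]

    my_files = shard_for_task(files, task_id, n_tasks)
    print(f"task {task_id} of {n_tasks} would process {len(my_files)} files")


if __name__ == "__main__":
    main()
```

Each task sees a disjoint subset of the inputs, so the tasks need no communication until their partial results are combined.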
Technical Requirements:
Participants will need to bring their own laptops and have access to a web browser. No additional software installation is required, as all computations will be performed on remote HPC resources.
The training will be conducted by the following experts:
Klemens Noga, PhD (ACC Cyfronet AGH)
Leszek Grzanka, PhD (AGH/IFJ PAN)