Liao-ET-HPC-workshop-final - ROSE compiler infrastructure

A node-level programming model framework for exascale computing*
By Chunhua (Leo) Liao, Stephen Guzik, Dan Quinlan
LLNL-PRES-539073
Lawrence Livermore National Laboratory
* Proposed for LDRD FY'12, initially funded by ASC/FRIC and now being moved back to LDRD
We are building a framework for creating node-level parallel programming models for exascale
▪ Problem:
  • Exascale machines: more challenges to programming models
  • Parallel programming models: important but increasingly lag behind node-level architectures
▪ Goal:
  • Speed up designing, evolving, and adopting programming models for exascale
▪ Approach:
  • Identify and implement common building blocks in node-level programming models so both researchers and developers can quickly construct or customize their own models
▪ Deliverables:
  • A node-level programming model framework (PMF) with building blocks at language, compiler, and library levels
  • Example programming models built using the PMF
Programming models bridge algorithms and machines and are implemented through components of the software stack
[Figure: Algorithm, Programming Model, Abstract Machine (express); Software Stack: Language, Compiler, Library; Application, compile/link, Executable, execute, Real Machine]
Measures of success:
  • Expressiveness
  • Performance
  • Programmability
  • Portability
  • Efficiency
  • …
Parallel programming models are built on top of sequential ones and use a combination of language/compiler/library support
▪ Sequential programming model:
  • Abstract machine (overly simplified): one CPU with its memory
  • Language: general-purpose languages (GPL): C/C++/Fortran
  • Compiler: sequential compiler
  • Library: optional sequential libraries
▪ Parallel, shared memory (e.g. OpenMP):
  • Abstract machine (overly simplified): CPU, …, CPU sharing one memory
  • Language: GPL + directives
  • Compiler: sequential compiler + OpenMP support
  • Library: OpenMP runtime library
▪ Parallel, distributed memory (e.g. MPI):
  • Abstract machine (overly simplified): CPU-memory pairs, …, connected by an interconnect
  • Language: GPL + calls to MPI libraries
  • Compiler: sequential compiler
  • Library: MPI library
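As a small illustration of the "GPL + directives" approach listed for shared memory, the following C++ sketch (not taken from the slides) annotates a sequential loop with a standard OpenMP directive. The function name `sum_of_squares` is hypothetical. A sequential compiler ignores the pragma and computes the same result; a compiler with OpenMP support parallelizes the loop and links the OpenMP runtime library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example function: sums i*i for i in [0, n).
// With OpenMP enabled (e.g. -fopenmp on GCC/Clang) the loop runs in parallel;
// without it the pragma is ignored and the loop runs sequentially.
std::int64_t sum_of_squares(int n) {
    assert(n >= 0);  // precondition: non-negative trip count
    std::int64_t sum = 0;
    #pragma omp parallel for reduction(+ : sum)
    for (int i = 0; i < n; ++i) {
        sum += static_cast<std::int64_t>(i) * i;
    }
    return sum;
}
```

The `reduction` clause gives each thread a private partial sum and combines them at the end, which is why the directive can be added without restructuring the sequential code.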
Problem: programming models will become a limiting factor for exascale computing if no drastic measures are taken
▪ Future exascale architectures:
  • Clusters of many-core nodes, abundant threads
  • Deep memory hierarchy, CPU+GPU, …
  • Power and resilience constraints, …
▪ (Node-level) programming models:
  • Increasingly complex design space
  • Conflicting goals: performance, power, productivity, expressiveness
▪ Current situation:
  • Programming model researchers: struggle to design/build individual models to find the right one in the huge design space
  • Application developers: stuck with stale models: insufficient high-level models and tedious low-level ones
Solution: we are building a programming model framework (PMF) to address exascale challenges
A three-level, open framework to facilitate building node-level programming models for exascale architectures
[Figure: the three levels of the PMF and the programming models built from them by reuse and customization]
▪ Level 1, language extensions: Directive 1, …, Directive n
▪ Level 2, compiler support (ROSE): Tool 1, …, Tool n
▪ Level 3, runtime library: Function 1, …
▪ Reuse & customize:
  • Programming model 1: language ext., compiler sup., runtime lib.
  • Programming model 2: compiler sup., runtime lib.
  • …
  • Programming model n: runtime lib.
We will serve both researchers and developers, engage lab applications, and target heterogeneous architectures
▪ Users:
  • Programming model researchers: explore design space
  • Experienced application developers: build custom models targeting current and future machines
▪ Scope of this project:
  • DOE/LLNL applications
  • Heterogeneous architectures: CPUs + GPUs
  • Example building blocks: parallelism, heterogeneity, data locality, power efficiency, thread scheduling, etc.
  • Two major example programming models built using PMF
The programming model framework vastly increases the flexibility in how the HPC stack can be used for application development.
Example 1: researchers use the programming model framework to extend a higher-level model (OpenMP) to support GPUs
▪ OpenMP: a high-level, popular node-level programming model for shared memory programming
  • High demand for GPU support (within a node)
▪ PMF: provides a set of selectable, customizable building blocks
  • Language: directives, like #acc_region, #data_region, #acc_loop, #data_copy, #device, etc.
  • Compiler: parser builder, outliner, loop tiling, loop collapsing, dependence analysis, etc., based on ROSE
  • Runtime: thread management, task scheduling, data transfer, load balancing, etc.
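To give a feel for one of the compiler building blocks named above, loop collapsing, the sketch below shows by hand the kind of rewrite such a tool might perform: a two-level loop nest becomes a single loop whose iteration space is easier to distribute across many threads. This is an illustration only, not ROSE output or a ROSE API; the function names `scale_nested` and `scale_collapsed` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original form: a two-level loop nest over a rows x cols row-major array.
void scale_nested(std::vector<double>& a, std::size_t rows, std::size_t cols, double s) {
    assert(a.size() == rows * cols);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            a[i * cols + j] *= s;
        }
    }
}

// Collapsed form: one loop of rows * cols iterations.
// The original indices (i, j) are recovered from the single index k.
void scale_collapsed(std::vector<double>& a, std::size_t rows, std::size_t cols, double s) {
    assert(a.size() == rows * cols);
    for (std::size_t k = 0; k < rows * cols; ++k) {
        std::size_t i = k / cols;
        std::size_t j = k % cols;
        a[i * cols + j] *= s;
    }
}
```

Both functions update every element exactly once, so they produce identical arrays; the collapsed version simply exposes all `rows * cols` iterations at one loop level.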
Using PMF to extend OpenMP for GPUs
[Figure: building blocks of the programming model framework are reused and customized to form OpenMP extended for GPUs]
▪ Level 1, language extensions (Directive 1, …, Directive n):
  • #pragma omp acc_region
  • #pragma omp acc_region_loop
  • #pragma omp acc_loop
▪ Level 2, compiler support (ROSE) (Tool 1, …, Tool n):
  • Pragma_parsing()
  • Outlining_for_GPU()
  • Insert_runtime_call()
  • Optimize_memory()
▪ Level 3, runtime library (Function 1, …):
  • Dispatch_tasks()
  • Balancing_load()
  • Transfer_data()
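The slide names the compiler and runtime building blocks but does not show what they produce. The sketch below is a hand-written guess at the general shape of outlining plus an inserted runtime call: the loop body moves into its own function, its variables are packed into a struct, and the call site hands both to a runtime entry point. All names here (`LoopArgs`, `outlined_loop_body`, `dispatch_tasks`, `scale`) are hypothetical stand-ins, not ROSE or PMF APIs, and the chunks run sequentially on the host where a real runtime would launch them on a GPU or across threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical argument pack: the variables the original loop body used.
struct LoopArgs {
    const double* in;
    double* out;
    double factor;
};

// Hypothetical outlined function: the loop body, moved into its own function
// and restricted to the index range [begin, end).
void outlined_loop_body(std::size_t begin, std::size_t end, void* raw) {
    LoopArgs* args = static_cast<LoopArgs*>(raw);
    for (std::size_t i = begin; i < end; ++i) {
        args->out[i] = args->factor * args->in[i];
    }
}

using TaskFn = void (*)(std::size_t, std::size_t, void*);

// Hypothetical runtime entry point: splits [0, n) into chunks and runs each one.
// Chunks run one after another here; this is only a stand-in for real dispatch.
// Returns the number of chunks that were run.
std::size_t dispatch_tasks(TaskFn fn, std::size_t n, std::size_t chunk, void* args) {
    assert(chunk > 0);
    std::size_t tasks = 0;
    for (std::size_t begin = 0; begin < n; begin += chunk) {
        fn(begin, std::min(begin + chunk, n), args);
        ++tasks;
    }
    return tasks;
}

// What a translated call site might look like: the annotated loop is replaced
// by argument packing followed by a call into the runtime.
std::vector<double> scale(const std::vector<double>& in, double factor) {
    std::vector<double> out(in.size());
    LoopArgs args{in.data(), out.data(), factor};
    dispatch_tasks(outlined_loop_body, in.size(), 4, &args);
    return out;
}
```

Passing the arguments through a single pointer keeps the runtime interface independent of any particular loop, which is one common reason outliners pack variables into a struct.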
Example 2: application developers use PMF to explore a lower-level, domain-specific programming model
▪ Target lab application:
  • Lattice-Boltzmann algorithm with adaptive mesh refinement for direct numerical simulation studies of how wall roughness affects turbulence transition
  • Stencil operations on structured arrays
▪ Requirements:
  • Concurrent, balanced execution on CPU & GPU
  • Users do not like translating OpenMP to GPU
  • Want the power to express lower-level details like data decomposition
  • Exploit domain features: a box-based approach for describing data layout and regions for numerical solvers
  • Target current and future architectures
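The requirements above mention stencil operations, data decomposition, and a box-based description of regions, without defining any of them. The sketch below is one possible reading, under stated assumptions: a hypothetical `Box` type describes a rectangular index region, `split_rows` decomposes it into two parts (for example one for the CPU and one for the GPU), and a 5-point averaging stencil is applied to each part. None of these names or the stencil itself come from the slides, and everything runs on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical box: the rectangular index region [ilo, ihi) x [jlo, jhi).
struct Box {
    int ilo, ihi, jlo, jhi;
    int rows() const { return ihi - ilo; }
};

// Split a box along i so that the first part holds about `fraction` of the rows.
std::pair<Box, Box> split_rows(const Box& b, double fraction) {
    assert(fraction >= 0.0 && fraction <= 1.0);
    int cut = b.ilo + static_cast<int>(b.rows() * fraction);
    return {Box{b.ilo, cut, b.jlo, b.jhi}, Box{cut, b.ihi, b.jlo, b.jhi}};
}

// Row-major linear index of cell (i, j) in a grid with ny columns.
std::size_t idx(int i, int j, int ny) {
    return static_cast<std::size_t>(i) * static_cast<std::size_t>(ny) + static_cast<std::size_t>(j);
}

// 5-point averaging stencil applied to the cells of `region` in an nx x ny grid.
// The region must lie in the grid interior so that all neighbors exist.
void stencil(const std::vector<double>& in, std::vector<double>& out,
             int nx, int ny, const Box& region) {
    assert(region.ilo >= 1 && region.ihi <= nx - 1);
    assert(region.jlo >= 1 && region.jhi <= ny - 1);
    for (int i = region.ilo; i < region.ihi; ++i) {
        for (int j = region.jlo; j < region.jhi; ++j) {
            out[idx(i, j, ny)] = 0.2 * (in[idx(i, j, ny)] +
                                        in[idx(i - 1, j, ny)] + in[idx(i + 1, j, ny)] +
                                        in[idx(i, j - 1, ny)] + in[idx(i, j + 1, ny)]);
        }
    }
}
```

Because the stencil reads only from `in` and writes only to `out`, applying it to the two halves of a split box gives the same result as applying it to the whole box, which is what makes this kind of decomposition safe to run concurrently.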
Using the PMF to implement the domain-specific programming model (ongoing work with many unknown details)
[Figure: workflow from language features through a first compilation to a final executable]
▪ Language feature: use a sequential language, CUDA, and pragmas to describe algorithms
  • C++ (main algorithm infrastructure)
  • Pragmas (gluing and supplemental semantics)
  • CUDA (describe kernels)
▪ Compiler (first compilation): compiler support and building blocks
  • Generate code to help with chores
  • Custom code generation for multiple architectures (Architecture A, Architecture B)
  • Output: source code that can be compiled using native compilers
▪ Executable: final compilation using native compilers, linking with a runtime library
  • Scheduling among CPUs and GPUs
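The slides leave the CPU/GPU scheduling policy open. One simple option, assumed here purely for illustration, is a static split in which each device receives work in proportion to its measured throughput. The names `Split` and `proportional_split` are hypothetical and not part of the PMF.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result of a static split of n work items between two devices.
struct Split {
    std::size_t cpu_items;
    std::size_t gpu_items;
};

// Assign items in proportion to measured throughput (items per second).
// The CPU share is rounded down; the GPU receives the remainder.
Split proportional_split(std::size_t n, double cpu_rate, double gpu_rate) {
    assert(cpu_rate >= 0.0 && gpu_rate >= 0.0 && cpu_rate + gpu_rate > 0.0);
    double cpu_share = cpu_rate / (cpu_rate + gpu_rate);
    std::size_t cpu_items = static_cast<std::size_t>(static_cast<double>(n) * cpu_share);
    return Split{cpu_items, n - cpu_items};
}
```

A static split like this is cheap but assumes throughput stays constant; a runtime that rebalances during execution would need feedback that this sketch does not model.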
Summary
▪ We are building a framework instead of a single programming model for exascale node architectures
  • Building blocks: language, compiler, runtime
  • Two major example programming models
▪ Programming model researchers
  • Quickly design and implement solutions to exascale challenges
  • E.g., explore OpenMP extensions for GPUs
▪ Experienced application developers
  • Ability to directly change the software stack
  • E.g., compose domain-specific programming models
Thank you!