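Publisher: China Machine Press (机械工业出版社)
Year: 2011
List price: 65.0 yuan
This book shows, step by step, how to develop efficient parallel programs with MPI, Pthreads, and OpenMP. It teaches readers how to develop and debug distributed-memory and shared-memory programs and how to evaluate their performance.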
CHAPTER 1 Why Parallel Computing?
1.1 Why We Need Ever-Increasing Performance
1.2 Why We're Building Parallel Systems
1.3 Why We Need to Write Parallel Programs
1.4 How Do We Write Parallel Programs?
1.5 What We'll Be Doing
1.6 Concurrent, Parallel, Distributed
1.7 The Rest of the Book
1.8 A Word of Warning
1.9 Typographical Conventions
1.10 Summary
1.11 Exercises
CHAPTER 2 Parallel Hardware and Parallel Software
2.1 Some Background
2.1.1 The von Neumann architecture
2.1.2 Processes, multitasking, and threads
2.2 Modifications to the von Neumann Model
2.2.1 The basics of caching
2.2.2 Cache mappings
2.2.3 Caches and programs: an example
2.2.4 Virtual memory
2.2.5 Instruction-level parallelism
2.2.6 Hardware multithreading
2.3 Parallel Hardware
2.3.1 SIMD systems
2.3.2 MIMD systems
2.3.3 Interconnection networks
2.3.4 Cache coherence
2.3.5 Shared-memory versus distributed-memory
2.4 Parallel Software
2.4.1 Caveats
2.4.2 Coordinating the processes/threads
2.4.3 Shared-memory
2.4.4 Distributed-memory
2.4.5 Programming hybrid systems
2.5 Input and Output
2.6 Performance
2.6.1 Speedup and efficiency
2.6.2 Amdahl's law
2.6.3 Scalability
2.6.4 Taking timings
2.7 Parallel Program Design
2.7.1 An example
2.8 Writing and Running Parallel Programs
2.9 Assumptions
2.10 Summary
2.10.1 Serial systems
2.10.2 Parallel hardware
2.10.3 Parallel software
2.10.4 Input and output
2.10.5 Performance
2.10.6 Parallel program design
2.10.7 Assumptions
2.11 Exercises
CHAPTER 3 Distributed-Memory Programming with MPI
3.1 Getting Started
3.1.1 Compilation and execution
3.1.2 MPI programs
3.1.3 MPI_Init and MPI_Finalize
3.1.4 Communicators, MPI_Comm_size and MPI_Comm_rank
3.1.5 SPMD programs
3.1.6 Communication
3.1.7 MPI_Send
3.1.8 MPI_Recv
3.1.9 Message matching
3.1.10 The status_p argument
3.1.11 Semantics of MPI_Send and MPI_Recv
3.1.12 Some potential pitfalls
3.2 The Trapezoidal Rule in MPI
3.2.1 The trapezoidal rule
3.2.2 Parallelizing the trapezoidal rule
3.3 Dealing with I/O
3.3.1 Output
3.3.2 Input
3.4 Collective Communication
3.4.1 Tree-structured communication
3.4.2 MPI_Reduce
3.4.3 Collective vs. point-to-point communications
3.4.4 MPI_Allreduce
3.4.5 Broadcast
3.4.6 Data distributions
3.4.7 Scatter
3.4.8 Gather
3.4.9 Allgather
3.5 MPI Derived Datatypes
3.6 Performance Evaluation of MPI Programs
3.6.1 Taking timings
3.6.2 Results
3.6.3 Speedup and efficiency
3.6.4 Scalability
3.7 A Parallel Sorting Algorithm
3.7.1 Some simple serial sorting algorithms
3.7.2 Parallel odd-even transposition sort
3.7.3 Safety in MPI programs
3.7.4 Final details of parallel odd-even sort
3.8 Summary
3.9 Exercises
3.10 Programming Assignments
CHAPTER 4 Shared-Memory Programming with Pthreads
4.1 Processes, Threads, and Pthreads
4.2 Hello, World
4.2.1 Execution
4.2.2 Preliminaries
4.2.3 Starting the threads
4.2.4 Running the threads
4.2.5 Stopping the threads
4.2.6 Error checking
4.2.7 Other approaches to thread startup
4.3 Matrix-Vector Multiplication
4.4 Critical Sections
4.5 Busy-Waiting
4.6 Mutexes
4.7 Producer-Consumer Synchronization and Semaphores
4.8 Barriers and Condition Variables
4.8.1 Busy-waiting and a mutex
4.8.2 Semaphores
4.8.3 Condition variables
4.8.4 Pthreads barriers
4.9 Read-Write Locks
4.9.1 Linked list functions
4.9.2 A multi-threaded linked list
4.9.3 Pthreads read-write locks
4.9.4 Performance of the various implementations
4.9.5 Implementing read-write locks
4.10 Caches, Cache Coherence, and False Sharing
4.11 Thread-Safety
4.11.1 Incorrect programs can produce correct output
4.12 Summary
4.13 Exercises
4.14 Programming Assignments
CHAPTER 5 Shared-Memory Programming with OpenMP
5.1 Getting Started
5.1.1 Compiling and running OpenMP programs
5.1.2 The program
5.1.3 Error checking
5.2 The Trapezoidal Rule
5.2.1 A first OpenMP version
5.3 Scope of Variables
5.4 The Reduction Clause
5.5 The parallel for Directive
5.5.1 Caveats
5.5.2 Data dependences
5.5.3 Finding loop-carried dependences
5.5.4 Estimating π
5.5.5 More on scope
5.6 More About Loops in OpenMP: Sorting
5.6.1 Bubble sort
5.6.2 Odd-even transposition sort
5.7 Scheduling Loops
5.7.1 The schedule clause
5.7.3 The dynamic and guided schedule types
5.7.4 The runtime schedule type
5.7.5 Which schedule?
5.8 Producers and Consumers
5.8.1 Queues
5.8.2 Message-passing
5.8.3 Sending messages
5.8.4 Receiving messages
5.8.5 Termination detection
5.8.6 Startup
5.8.7 The atomic directive
5.8.8 Critical sections and locks
5.8.9 Using locks in the message-passing program
5.8.10 critical directives, atomic directives, or locks?
5.8.11 Some caveats
5.9 Caches, Cache Coherence, and False Sharing
5.10 Thread-Safety
5.10.1 Incorrect programs can produce correct output
5.11 Summary
5.12 Exercises
5.13 Programming Assignments
CHAPTER 6 Parallel Program Development
6.1 Two n-Body Solvers
6.1.1 The problem
6.1.2 Two serial programs
6.1.3 Parallelizing the n-body solvers
6.1.4 A word about I/O
6.1.5 Parallelizing the basic solver using OpenMP
6.1.6 Parallelizing the reduced solver using OpenMP
6.1.7 Evaluating the OpenMP codes
6.1.8 Parallelizing the solvers using pthreads
6.1.9 Parallelizing the basic solver using MPI
6.1.10 Parallelizing the reduced solver using MPI
6.1.11 Performance of the MPI solvers
6.2 Tree Search
6.2.1 Recursive depth-first search
6.2.2 Nonrecursive depth-first search
6.2.3 Data structures for the serial implementations
6.2.6 A static parallelization of tree search using pthreads
6.2.7 A dynamic parallelization of tree search using pthreads
6.2.8 Evaluating the pthreads tree-search programs
6.2.9 Parallelizing the tree-search programs using OpenMP
6.2.10 Performance of the OpenMP implementations
6.2.11 Implementation of tree search using MPI and static partitioning
6.2.12 Implementation of tree search using MPI and dynamic partitioning
6.3 A Word of Caution
6.4 Which API?
6.5 Summary
6.5.1 Pthreads and OpenMP
6.5.2 MPI
6.6 Exercises
6.7 Programming Assignments
CHAPTER 7 Where to Go from Here
References
Index
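Written in tutorial style, the book starts with short programming examples and builds up, step by step, to more challenging programs. It focuses on the design, debugging, and performance evaluation of distributed-memory and shared-memory programs. It uses programming models such as MPI, Pthreads, and OpenMP, and emphasizes hands-on development of parallel programs. Parallel programming is no longer a discipline reserved for specialists. Anyone who wants to fully exploit the computing power of clusters and multicore processors needs to learn distributed-memory and shared-memory parallel programming techniques. An Introduction to Parallel Programming (English edition), by Peter S. Pacheco, shows step by step how to develop efficient parallel programs with MPI, Pthreads, and OpenMP, and teaches readers how to develop and debug distributed-memory and shared-memory programs and how to evaluate their performance.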
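Book details | |||
Title | An Introduction to Parallel Programming (并行程序设计导论) | ||
Series | Classic Original Editions Library (经典原版书库) | ||
ISBN | 9787111358282 | ||
Place of publication | Beijing | Publisher | China Machine Press |
Edition | 1st | Printing | 1 |
List price (yuan) | 65.0 | Language | English |
Dimensions | 25 × 17 | Binding | Paperback |
Pages | 370 | Print run | 3000 |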
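An Introduction to Parallel Programming (并行程序设计导论) was published by China Machine Press in September 2011. Its Chinese Library Classification number is TP311.11, and its subject heading is: parallel programs, program design, English.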
Author: Peter S. Pacheco (USA)