Lecture 25

Review -- Distributed Parallel Programming

   exchange values problem
      coordinator   --   manager/workers
      exchange with all (symmetric)   --   heartbeat
      ring   --   pipeline

   distributed parallel computing paradigms:
      saw manager/workers last time
      will look at heartbeat and pipeline today


Heartbeat Algorithms

  what:  divide work (evenly)

         Worker[i]::  while(not done) {
                        compute
                        exchange values with neighbors (send .. receive)
                      }

  applications:  Section 9.2 -- image processing, cellular automata
                 Chapter 11 -- grid, particle, and matrix computations

  Jacobi iteration -- recall the problem

         draw rectangular grid and some of the points
         new values of points are average of previous values of
            the four neighbors

         compute from old -> new and then swap, or
         unroll loop once and compute from old -> new then new -> old
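
The old -> new update with a swap can be sketched sequentially as follows. This is a minimal illustration rather than the text's program; the function names and the 4x4 test grid are assumptions made for the example.

```python
def jacobi_sweep(old, new):
    """Set each interior point of `new` to the average of its four neighbors in `old`."""
    rows, cols = len(old), len(old[0])
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = (old[i - 1][j] + old[i + 1][j]
                         + old[i][j - 1] + old[i][j + 1]) / 4.0


def jacobi(grid, iterations):
    """Run `iterations` sweeps, swapping the roles of the two grids after each one."""
    old = [row[:] for row in grid]
    new = [row[:] for row in grid]   # boundary values are copied once and never change
    for _ in range(iterations):
        jacobi_sweep(old, new)
        old, new = new, old          # swap: the freshest values are now in `old`
    return old


# Hypothetical test grid: top boundary held at 1.0, everything else 0.0.
grid = [[1.0] * 4] + [[0.0] * 4 for _ in range(3)]
result = jacobi(grid, 1)
```

Swapping the two references avoids copying the grid after every sweep; unrolling the loop once achieves the same effect without the swap.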

         divide into strips (or blocks)

         Worker[i]:  while (...) {
                       exchange edges
                       compute new values
                       exchange edges
                       compute old values (from the new ones)
                     }

         the exchanges provide a "fuzzy" barrier
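
A heartbeat worker for Jacobi iteration might look roughly like the sketch below. It simulates the workers with Python threads and uses unbounded `queue.Queue` objects as message channels; the names `worker` and `heartbeat_jacobi`, the row-strip layout with one ghost row on each side, and the assumption that the number of workers divides the number of interior rows are all choices made for this illustration. It uses the swap form of the loop rather than the unrolled one.

```python
import queue
import threading


def worker(w, nworkers, strip, iterations, chan, results):
    """One heartbeat worker; `strip` holds its rows plus one ghost row above and below."""
    up, down = w - 1, w + 1
    old = [row[:] for row in strip]
    new = [row[:] for row in strip]
    cols = len(old[0])
    for _ in range(iterations):
        # Exchange edges: sends never block, so send first and then receive.
        if up >= 0:
            chan[(w, up)].put(old[1][:])
        if down < nworkers:
            chan[(w, down)].put(old[-2][:])
        if up >= 0:
            old[0] = chan[(up, w)].get()
        if down < nworkers:
            old[-1] = chan[(down, w)].get()
        # Compute new values for the rows this worker owns.
        for i in range(1, len(old) - 1):
            for j in range(1, cols - 1):
                new[i][j] = (old[i - 1][j] + old[i + 1][j]
                             + old[i][j - 1] + old[i][j + 1]) / 4.0
        old, new = new, old
    results[w] = old[1:-1]


def heartbeat_jacobi(grid, nworkers, iterations):
    """Split the interior rows into equal strips and run one thread per strip."""
    per = (len(grid) - 2) // nworkers   # assumes nworkers divides the interior rows
    chan = {}
    for w in range(nworkers - 1):       # one FIFO channel per direction per neighbor pair
        chan[(w, w + 1)] = queue.Queue()
        chan[(w + 1, w)] = queue.Queue()
    results = [None] * nworkers
    threads = []
    for w in range(nworkers):
        lo = 1 + w * per
        strip = grid[lo - 1: lo + per + 1]
        t = threading.Thread(target=worker,
                             args=(w, nworkers, strip, iterations, chan, results))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    out = [grid[0][:]]
    for rows in results:
        out.extend(rows)
    out.append(grid[-1][:])
    return out


# Hypothetical 8x6 grid: top boundary 1.0, everything else 0.0.
grid = [[1.0] * 6] + [[0.0] * 6 for _ in range(7)]
parallel = heartbeat_jacobi(grid, 3, 5)
```

There is no global barrier here. A worker can run ahead of a neighbor by at most one iteration, because it blocks on the receive until that neighbor's edge for the current iteration arrives.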

         better code:  overlap communication and computation

            send my edges to neighbors; compute interior
            receive neighbors' edges; compute my new edges


Pipeline Algorithms

   what:  divide work evenly
          compute and circulate data among workers

   pipeline structures (Figure 9.5) -- open, closed, or circular

   when:  used when workers need all the data, not just edges from neighbors

   applications:  matrix multiplication -- Section 9.3
                  n-body problem -- Section 11.2 (more below)

   a different kind of pipeline -- wavefronts

      consider Gauss-Seidel iteration (Sections 11.1 and 12.2)

      "raster scan" of grid -- hence can update in place

         picture of update order (Figure 12.6)
         data dependencies
         idea of loop skewing (Figure 12.6)
         this results in wavefront parallelism, which can be implemented
            by a pipeline

         show how to do this by using column strips and having one
            worker per strip
         after a worker updates a row of its strip, it sends the row to the
            next worker to its right
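
The effect of loop skewing can be checked with a small sequential experiment: updating the grid one anti-diagonal (wavefront) at a time gives the same values as the raster scan, and the points on a single wavefront do not depend on each other. The sketch below shows only the reordering, not the pipelined column-strip program; the function names and the test grid are assumptions made for this example.

```python
def gauss_seidel_raster(grid):
    """One in-place Gauss-Seidel sweep in raster-scan order (row by row)."""
    g = [row[:] for row in grid]
    n, m = len(g), len(g[0])
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            g[i][j] = (g[i - 1][j] + g[i][j - 1]
                       + g[i + 1][j] + g[i][j + 1]) / 4.0
    return g


def gauss_seidel_wavefront(grid):
    """The same sweep, visiting interior points one anti-diagonal (i + j = d) at a time."""
    g = [row[:] for row in grid]
    n, m = len(g), len(g[0])
    for d in range(2, (n - 2) + (m - 2) + 1):
        # Every point on wavefront d needs new values only from wavefront d - 1
        # and old values from wavefront d + 1, so this inner loop could run in parallel.
        for i in range(max(1, d - (m - 2)), min(n - 2, d - 1) + 1):
            j = d - i
            g[i][j] = (g[i - 1][j] + g[i][j - 1]
                       + g[i + 1][j] + g[i][j + 1]) / 4.0
    return g


# Hypothetical 5x6 grid with arbitrary starting values.
grid = [[float((i * j) % 5) for j in range(6)] for i in range(5)]
same = gauss_seidel_raster(grid) == gauss_seidel_wavefront(grid)
```

In the column-strip pipeline, the wavefront shows up as a stagger: while one worker updates row r of its strip, the worker to its right can be updating row r - 1.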
         

Distributed Algorithms for the N-body Problem -- Section 11.2

   recall the problem:  calculate forces
                        barrier
                        move bodies
                        barrier
        then repeat
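
For reference, one sequential time step of the n**2 algorithm might be sketched as below. The two-dimensional layout, the gravitational force law with `G` defaulting to 1.0, the simple velocity-then-position update, and all function names are assumptions chosen to keep the example small; the text's program may differ in these details.

```python
import math


def calculate_forces(pos, mass, G=1.0):
    """Return the net force on each body, visiting each pair of bodies once."""
    n = len(pos)
    force = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            dist = math.hypot(dx, dy)
            mag = G * mass[i] * mass[j] / dist ** 2
            fx, fy = mag * dx / dist, mag * dy / dist
            force[i][0] += fx          # force on i points toward j
            force[i][1] += fy
            force[j][0] -= fx          # equal and opposite force on j
            force[j][1] -= fy
    return force


def move_bodies(pos, vel, mass, force, dt):
    """Update velocities from the forces, then positions from the new velocities."""
    for i in range(len(pos)):
        vel[i][0] += force[i][0] / mass[i] * dt
        vel[i][1] += force[i][1] / mass[i] * dt
        pos[i][0] += vel[i][0] * dt
        pos[i][1] += vel[i][1] * dt


# Hypothetical two-body system, initially at rest.
pos = [[0.0, 0.0], [2.0, 0.0]]
vel = [[0.0, 0.0], [0.0, 0.0]]
mass = [1.0, 1.0]
force = calculate_forces(pos, mass)   # first phase
move_bodies(pos, vel, mass, force, dt=1.0)   # second phase
```

In the parallel versions a barrier separates the two phases, because no body may move until every force that depends on its old position has been computed.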

   distributed programs (for the n**2 algorithm):
     the challenges are to divide up the work and to have all the data you need
     (I outlined the following three approaches; details are in the text.)

   manager/workers paradigm

      tasks in bag are pairs of blocks of bodies; e.g.,
          (1,1), (1,2), (1,3), (2,2), (2,3), (3,3)
      each worker needs data on all bodies
      workers get tasks, compute forces between all bodies in that pair of blocks
      workers exchange info after calculating forces
      workers also exchange bodies after moving them
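
The bag of tasks can be generated mechanically. The sketch below builds the block pairs listed above and checks that, taken together, they cover every pair of bodies exactly once. The helper names and the 6-body, 3-block layout are assumptions made for the example.

```python
from itertools import combinations, combinations_with_replacement, product


def make_tasks(nblocks):
    """All pairs of block numbers (a, b) with a <= b, numbering blocks from 1."""
    return list(combinations_with_replacement(range(1, nblocks + 1), 2))


def body_pairs(task, blocks):
    """The pairs of bodies whose forces a worker computes for one task."""
    a, b = task
    if a == b:
        return list(combinations(blocks[a], 2))    # pairs within a single block
    return list(product(blocks[a], blocks[b]))     # pairs across two blocks


# Hypothetical layout: 6 bodies in 3 blocks of 2.
blocks = {1: [0, 1], 2: [2, 3], 3: [4, 5]}
tasks = make_tasks(3)
all_pairs = [pair for t in tasks for pair in body_pairs(t, blocks)]
```

Tasks on the diagonal, such as (2,2), contain fewer body pairs than off-diagonal ones, so the tasks are not all the same size.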

   heartbeat paradigm

      use unevenly sized blocks
      algorithm (for each worker)
         send bodies to lower-numbered workers
         calculate local forces
         receive bodies, compute, send back (from/to higher-numbered workers)
         receive bodies and add forces in (from lower-numbered workers)
         move local bodies

   pipeline paradigm

      assign bodies by stripes (or reverse stripes), not by strips or blocks
      algorithm for each worker:
         send my bodies along
         compute local forces
         receive new bodies
         compute forces I'm responsible for
         send on those bodies and forces on them
         receive my bodies back and add in forces on them from
           lower-numbered bodies
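
The reason for assigning by stripes can be seen by counting work. The sketch below assumes that the owner of a body computes its forces with every higher-numbered body, and it reads "reverse stripes" as reversing the worker order on every other stripe; both are assumptions made for this illustration, as are the function names.

```python
def assign_blocks(n, p):
    """Contiguous blocks of equal size (assumes p divides n)."""
    size = n // p
    return [i // size for i in range(n)]


def assign_stripes(n, p):
    """Deal bodies out round-robin: 0, 1, ..., p-1, 0, 1, ..."""
    return [i % p for i in range(n)]


def assign_reverse_stripes(n, p):
    """Like stripes, but reverse the worker order on every other stripe."""
    owner = []
    for i in range(n):
        stripe, offset = divmod(i, p)
        owner.append(offset if stripe % 2 == 0 else p - 1 - offset)
    return owner


def pair_counts(owner, p):
    """Number of force pairs per worker, charging pair (i, j), i < j, to the owner of i."""
    n = len(owner)
    load = [0] * p
    for i in range(n):
        load[owner[i]] += n - 1 - i
    return load


# Hypothetical size: 12 bodies on 3 workers.
blocks_load = pair_counts(assign_blocks(12, 3), 3)            # [38, 22, 6]
stripes_load = pair_counts(assign_stripes(12, 3), 3)          # [26, 22, 18]
reverse_load = pair_counts(assign_reverse_stripes(12, 3), 3)  # [22, 22, 22]
```

With equal-size blocks the first worker does most of the work, which is also why the heartbeat version uses blocks of different sizes.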

   tradeoffs -- see Table 11.1

      these algorithms pass different numbers of messages of different sizes