Streaming multigrid for gradient-domain operations on large images
 
ACM Trans. Graphics (SIGGRAPH), 27(3), 2008.
Perform k multigrid V-cycles in just k+1 streaming passes over the data.
Abstract:
 We introduce a new tool to solve the large linear systems arising from gradient-domain image processing.
 Specifically, we develop a streaming multigrid solver, which needs just two sequential passes over
 out-of-core data.  This fast solution is enabled by a combination of three techniques: (1) use of
 second-order finite elements (rather than traditional finite differences) to reach sufficient accuracy in a
 single V-cycle, (2) temporally blocked relaxation, and (3) multi-level streaming to pipeline the
 restriction and prolongation phases into single streaming passes.  A key contribution is the extension of
 the B-spline finite-element method to be compatible with the forward-difference gradient representation
 commonly used with images.  Our streaming solver is also efficient for in-memory images, due to its fast
 convergence and excellent cache behavior.  Remarkably, it can outperform spatially adaptive solvers that
 exploit application-specific knowledge.  We demonstrate seamless stitching and tone-mapping of gigapixel
 images in about an hour on a notebook PC.
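
 To make the gradient-domain setup concrete, the following is a minimal in-memory sketch of a multigrid V-cycle on a 1D signal. It is not the solver described above: it uses standard second-difference stencils rather than B-spline finite elements, it does no streaming or temporal blocking, and every function name (relax, restrict, prolong, v_cycle, solve_from_gradients) is hypothetical. It assumes zero boundary values and a power-of-two number of intervals. It only illustrates how forward-difference gradients turn into a linear system that V-cycles solve quickly.

```python
def residual(u, f, h):
    """Return r = f - A u for A = tridiag(-1, 2, -1) / h^2; boundary entries stay 0."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r


def relax(u, f, h, sweeps):
    """Run in-place Gauss-Seidel sweeps over the interior nodes."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])


def restrict(r):
    """Full-weighting restriction to a grid with half as many intervals."""
    nc = (len(r) - 1) // 2
    rc = [0.0] * (nc + 1)
    for j in range(1, nc):
        rc[j] = 0.25 * r[2 * j - 1] + 0.5 * r[2 * j] + 0.25 * r[2 * j + 1]
    return rc


def prolong(ec, n_fine):
    """Linear interpolation of a coarse correction onto n_fine fine nodes."""
    e = [0.0] * n_fine
    for j in range(len(ec)):
        e[2 * j] = ec[j]
    for j in range(len(ec) - 1):
        e[2 * j + 1] = 0.5 * (ec[j] + ec[j + 1])
    return e


def v_cycle(u, f, h):
    """One V-cycle: relax, restrict the residual, recurse, prolong, relax."""
    if len(u) <= 3:
        # Coarsest grid has a single interior unknown: solve it exactly.
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])
        return u
    relax(u, f, h, 2)
    rc = restrict(residual(u, f, h))
    ec = v_cycle([0.0] * len(rc), rc, 2.0 * h)
    e = prolong(ec, len(u))
    for i in range(1, len(u) - 1):
        u[i] += e[i]
    relax(u, f, h, 2)
    return u


def solve_from_gradients(g, cycles):
    """Recover u (zero at both ends) from forward differences g[j] = u[j+1] - u[j]."""
    n = len(g)              # number of intervals, assumed to be a power of two
    h = 1.0 / n
    f = [0.0] * (n + 1)
    for i in range(1, n):
        # Negative divergence of the gradient field, scaled to match A.
        f[i] = (g[i - 1] - g[i]) / (h * h)
    u = [0.0] * (n + 1)
    for _ in range(cycles):
        v_cycle(u, f, h)
    return u


# Usage: take gradients of a known signal, then reconstruct it.
N = 64
target = [(i / N) * (1.0 - i / N) for i in range(N + 1)]
g = [target[i + 1] - target[i] for i in range(N)]
u = solve_from_gradients(g, 12)
err = max(abs(a - b) for a, b in zip(u, target))
```

 In this toy setting err measures how closely the reconstruction matches the signal whose gradients were supplied. In applications such as stitching or tone-mapping, the gradient field g would be edited before solving, so no exact target exists and the solver returns the least-squares fit.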
Hindsights:
 In later work, we distributed the solver computation over a cluster to handle terapixel images,
 and generalized the approach to operate over spherical imagery.