The Steerable Pyramid
What is a steerable pyramid?
The Steerable Pyramid is a linear multi-scale, multi-orientation image
decomposition that provides a useful front-end for image-processing
and computer vision applications. We developed this
representation in 1990, in order to overcome the limitations of
orthogonal separable wavelet decompositions that were then becoming popular
for image processing (specifically, those representations are heavily aliased,
and do not represent oblique orientations well).
Once the orthogonality constraint is dropped, it makes sense to
completely reconsider the filter design problem (as opposed to just
re-using orthogonal wavelet filters in a redundant representation, as
is done in cycle-spinning or undecimated wavelet transforms!).
Detailed information may be found in the
references listed below.
The basis functions of the steerable pyramid are Kth-order directional
derivative operators (for any choice of K) that come in different sizes
and K+1 orientations.
As directional derivatives, they span a rotation-invariant subspace
(i.e., they are equivariant with respect to rotation), and they are
designed and sampled such that the whole transform forms a tight frame.
An example decomposition of an image of a white disk on
a black background is shown to the right. This particular steerable
pyramid contains 4 orientation subbands, at 2 scales. The smallest
subband is the residual lowpass information. The residual highpass
subband is not shown.
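The steerability of the directional-derivative basis can be illustrated with a minimal numerical sketch. It assumes only that the angular part of a Kth-order directional derivative at angle phi is cos^K(theta - phi) in the frequency domain, and that the K+1 basis orientations are evenly spaced over [0, pi). The function names and the least-squares solve are illustrative choices, not taken from the distributed code.

```python
import numpy as np
from math import comb

def orientation_centers(K):
    """K+1 evenly spaced orientations in [0, pi) (an assumed spacing)."""
    return np.pi * np.arange(K + 1) / (K + 1)

def angular_basis(theta, K):
    """Columns are cos^K(theta - theta_j), the angular part of each basis filter."""
    return np.cos(theta[:, None] - orientation_centers(K)[None, :]) ** K

def steering_weights(phi, K, n_samples=360):
    """Weights b_j such that cos^K(theta - phi) = sum_j b_j * cos^K(theta - theta_j)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    basis = angular_basis(theta, K)
    target = np.cos(theta - phi) ** K
    weights, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return weights

K = 3  # third-order derivatives, hence 4 orientations
theta = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
basis = angular_basis(theta, K)

# Steering: synthesize the response at an arbitrary angle from the fixed basis.
phi = 0.3
steered = basis @ steering_weights(phi, K)
steering_error = np.max(np.abs(steered - np.cos(theta - phi) ** K))

# Tight-frame ingredient: the squared angular responses sum to a constant.
energy = np.sum(basis ** 2, axis=1)
expected_energy = (K + 1) * comb(2 * K, K) / 4 ** K
print(steering_error, energy.min(), energy.max(), expected_energy)
```

The steering error should sit at the level of floating-point rounding, because the rotated copies of cos^K span a (K+1)-dimensional space. The constant sum of squares is what allows the oriented bands at one scale to be normalized so that they jointly preserve energy.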
The block diagram for the decomposition (both analysis and
synthesis) is shown to the right.
Initially, the image is separated into low and highpass
subbands, using filters L0 and H0. The lowpass subband is then divided
into a set of oriented bandpass subbands and a low(er)-pass subband.
This low(er)-pass subband is subsampled by a factor of 2 in the X and Y directions.
The recursive (pyramid) construction is achieved by
inserting a copy of the shaded portion of the diagram at the location
of the solid circle (i.e., the lowpass branch).
More detailed
descriptions may be found in the references (below).
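The recursion described above can be sketched by listing the subbands it produces and their sizes. This is a bookkeeping sketch only: it performs no filtering, the function name `subband_shapes` is hypothetical, and rounding odd sizes up when subsampling is an assumption.

```python
def subband_shapes(height, width, levels, orientations):
    """Return (name, shape) pairs in the order the block diagram produces them."""
    # H0 branch: the residual highpass band stays at full resolution.
    bands = [("residual highpass", (height, width))]
    h, w = height, width
    for level in range(levels):
        # Oriented bandpass bands are taken from the current lowpass signal
        # before it is subsampled, so they share its resolution.
        for j in range(orientations):
            bands.append((f"scale {level}, orientation {j}", (h, w)))
        # The low(er)-pass branch is subsampled by 2 in X and Y, then the
        # shaded portion of the diagram is applied to it again.
        h, w = (h + 1) // 2, (w + 1) // 2
    bands.append(("residual lowpass", (h, w)))
    return bands

# The example decomposition above: 4 orientation subbands at 2 scales.
for name, shape in subband_shapes(64, 64, levels=2, orientations=4):
    print(name, shape)
```

For a 64 x 64 input this lists ten subbands, with the residual lowpass band as the smallest, which matches the description of the example decomposition.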
What advantages does it have over separable orthonormal wavelets?
The steerable pyramid performs a polar-separable decomposition in the
frequency domain, thus allowing independent representation of scale
and orientation. Since it is a tight frame, it obeys the generalized
form of Parseval's Equality: The vector-length (L2-norm) of the
coefficients equals that of the original signal.
More importantly, the representation is translation-invariant
(i.e., the subbands are aliasing-free, or equivariant with respect
to translation) and rotation-invariant (i.e., the
subbands are steerable, or equivariant with respect to
rotation). This can make a big difference in applications that
involve representation of position or orientation of image structure.
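The polar-separable structure and the Parseval property can be checked numerically with a small frequency-domain sketch. It makes several assumptions that are not from the distributed code: the radial masks are raised cosines in log2 of the radius, the angular masks are cos^K, the subbands are left undecimated, and the complex phase factor that would make odd-order subbands real is omitted, so the subbands are complex here.

```python
import numpy as np
from math import comb

def polar_grid(n):
    """Radius (radians/sample) and angle of every 2-D DFT frequency of an n x n image."""
    freqs = 2.0 * np.pi * np.fft.fftfreq(n)
    wx, wy = np.meshgrid(freqs, freqs)
    return np.hypot(wx, wy), np.arctan2(wy, wx)

def transition(r, cutoff):
    """0 below cutoff/2, 1 above cutoff, linear in log2(r) over the octave between."""
    with np.errstate(divide="ignore"):
        return np.clip(np.log2(r / cutoff) + 1.0, 0.0, 1.0)

def low_mask(r, cutoff):
    return np.cos(0.5 * np.pi * transition(r, cutoff))

def high_mask(r, cutoff):
    return np.sin(0.5 * np.pi * transition(r, cutoff))

def pyramid_masks(n, levels, k):
    """Frequency masks: highpass residual, k oriented bands per level, lowpass residual."""
    r, theta = polar_grid(n)
    K = k - 1                                            # derivative order
    alpha = np.sqrt(4.0 ** K / (k * comb(2 * K, K)))     # squared angular masks sum to 1
    masks = [high_mask(r, np.pi)]                        # H0
    low = low_mask(r, np.pi)                             # L0
    cutoff = np.pi
    for _ in range(levels):
        cutoff /= 2.0
        radial = low * high_mask(r, cutoff)              # scale: radius only
        for j in range(k):
            angular = alpha * np.cos(theta - np.pi * j / k) ** K   # orientation: angle only
            masks.append(radial * angular)
        low = low * low_mask(r, cutoff)                  # lowpass passed to the next level
    masks.append(low)
    return masks

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))
masks = pyramid_masks(32, levels=2, k=4)
spectrum = np.fft.fft2(image)
subbands = [np.fft.ifft2(m * spectrum) for m in masks]

mask_power = sum(m ** 2 for m in masks)                  # should equal 1 at every frequency
coefficient_energy = sum(np.sum(np.abs(b) ** 2) for b in subbands)
image_energy = np.sum(image ** 2)
print(mask_power.min(), mask_power.max(), coefficient_energy, image_energy)
```

Because the squared masks sum to one at every frequency, the total coefficient energy matches the image energy. Each oriented mask is a product of a function of radius and a function of angle, which is what lets scale and orientation be represented independently.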
The primary drawback is that the representation is overcomplete by a
factor of 4k/3, where k is the number of orientation bands (k = K+1 for
Kth-order derivative filters). Also, the filter design problem is messy,
so the space-domain implementation does not give perfect reconstruction
(although the errors are small enough for most applications). We
typically use a frequency-domain implementation, which provides perfect
reconstruction, but the resulting filters exhibit more spatial "ringing".
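The factor of 4k/3 can be recovered with a short count. The sketch assumes that only the oriented bandpass subbands are counted (not the residual bands), that each scale holds k subbands, and that each successive scale has one quarter as many samples as the previous one because of the subsampling by 2 in X and Y.

```latex
\[
  \frac{\text{number of coefficients}}{\text{number of pixels}}
  \;=\; k \sum_{s=0}^{\infty} \left(\frac{1}{4}\right)^{s}
  \;=\; \frac{k}{1 - \tfrac{1}{4}}
  \;=\; \frac{4k}{3}
\]
```

With k = 4 orientation bands this gives 16/3, or roughly 5.3 coefficients per pixel. With k = 1 it reduces to 4/3, the figure listed for the Laplacian pyramid in the table of properties.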
Here is a table comparing properties to other well-known transforms
(more information about these transforms may be found in this
book chapter):
| Property | Steerable Pyramid | Separable Orthog. Wavelet | Laplacian Pyramid | Gabor (octave) | Block DCT |
|---|---|---|---|---|---|
| jointly-localized (space/frequency) | yes | yes (can be) | yes | not inverse | not in frequency |
| translation-equivariant (no aliasing) | yes (approx) | no | yes (approx) | no | no |
| oriented kernels | yes | no (not diagonals) | N/A | yes | no |
| rotation-equivariant (steerable) | yes (approx) | no | N/A | no | no |
| tight frame (self-inverting) | yes (approx) | yes | no | no | yes |
| overcompleteness | 4k/3 | 1 | 4/3 | 1 | 1 |
For what applications is the steerable pyramid useful?
Applications include:
orientation analysis,
noise removal and enhancement,
transient detection,
texture representation and synthesis.
Some more examples:
Bill Freeman's
Filip Rooms'
How can I try it out?
Matlab source code is available in
our GitHub
repository.
More information may be found in the
README
file. A listing of the contents of this file may be found in the
Contents
file.
The latest modifications to the program are described in the
ChangeLog
file.
Some older C
source code is also available, although the filters accompanying
this code are not very accurate. More information may be found in the
README file.
Partial List of References
Steerable Pyramid Transforms
M Unser, N Chenouard, and D Van De Ville
Steerable Pyramids and Tight Wavelet Frames in L2(R^d)
IEEE Trans. Image Processing, 20(10):2705-2721, Oct 2011.
J Portilla and E P Simoncelli
A Parametric Texture Model based on Joint Statistics
of Complex Wavelet Coefficients
Int'l Journal of Computer Vision. October, 2000.
Abstract and code [This paper describes and uses a complex steerable pyramid]
R Manduchi, P Perona and D Shy.
Efficient Deformable Filter Banks.
IEEE Trans Signal Processing, 46(4):1168-1173, 1998.
A Karasaridis and E Simoncelli.
A Filter Design Technique for
Steerable Pyramid Image Transforms.
Int'l Conf. Acoustics Speech and Signal Processing.
Atlanta GA, May 1996.
Abstract & Download
E P Simoncelli and W T Freeman.
The Steerable Pyramid: A Flexible Architecture for
Multi-Scale Derivative Computation.
IEEE Second Int'l Conf on Image Processing.
Washington DC, October 1995.
Abstract & Download [This paper describes the decomposition implemented in the source code above]
H Greenspan, S Belongie, R Goodman, P Perona, S Rakshit,
and C H Anderson.
Overcomplete steerable pyramid filters and rotation invariance.
Proceedings CVPR 1994, pp. 222-228.
D Shy and P Perona.
X-Y Separable Pyramid Steerable Scalable Kernels.
Proceedings CVPR 1994, pp. 237-244.
E P Simoncelli, W T Freeman, E H Adelson and D J Heeger.
Shiftable Multi-Scale Transforms
[or, "What's Wrong with Orthonormal Wavelets"].
IEEE Trans. Information Theory, Special Issue on Wavelets.
Vol. 38, No. 2, pp. 587-607, March 1992.
Abstract & Download
Applications
S Lyu and E P Simoncelli.
Modeling multiscale subbands of photographic images with fields of Gaussian scale mixtures
IEEE Trans. Patt. Analysis and Machine Intelligence, Apr 2009.
Abstract
[state-of-the-art denoising, as of 2010 - equal to BM3D]
J A Guerrero-Colón, E P Simoncelli and J Portilla.
Image denoising using mixtures of Gaussian scale mixtures
Proc 15th IEEE Int'l Conf on Image Proc, Oct 2008.
Abstract
M Raphan, EP Simoncelli.
Optimal denoising in redundant representations
IEEE Trans Image Processing, Aug 2008.
Abstract
J Portilla, M Wainwright, V Strela, E P Simoncelli.
Image denoising using a scale mixture of Gaussians in the wavelet domain
IEEE Transactions on Image Processing, Nov 2003.
Abstract, Code, Download
J Portilla and E P Simoncelli
A Parametric Texture Model based on Joint Statistics
of Complex Wavelet Coefficients
Int'l Journal of Computer Vision. October, 2000.
Abstract, Code, Download
E P Simoncelli
Bayesian Denoising of Visual Images in the Wavelet Domain
In
Bayesian Inference in Wavelet Based Models.
eds. P Müller and B Vidakovic.
Springer-Verlag, Lecture Notes in Statistics 141, August 1999.
Abstract & download
E P Simoncelli and J Portilla
Texture Characterization via Joint Statistics
of Wavelet Coefficient Magnitudes.
5th IEEE Int'l Conf on Image Processing.
Chicago, IL. Oct 4-7, 1998.
Abstract & download
D Heeger and J Bergen.
Pyramid-based Texture Analysis/Synthesis.
Proceedings, ACM Siggraph, August, 1995.
E P Simoncelli and E H Adelson.
Noise Removal via Bayesian Wavelet Coring.
IEEE Third Int'l Conf on Image Processing.
Lausanne Switzerland, September 1996.
Abstract
JS Nimeroff, E Simoncelli, J Dorsey.
Efficient re-rendering of naturally illuminated environments.
5th Annual Eurographics Symposium on Rendering, 1994.
Abstract
Steerability, Steerable filters, Derivative filters
H Farid and E P Simoncelli.
Differentiation of discrete multi-dimensional signals
IEEE Trans Image Processing,
13(4):496-508, Apr 2004.
Abstract & Download
J W Zweck and L R Williams.
Euclidean group invariant computation of stochastic completion fields
using shiftable-twistable functions,
December, 1999.
H Farid and E P Simoncelli
Optimally rotation-equivariant directional derivative kernels
Int'l Conf Computer Analysis of Images and Patterns, 207-214, Kiel, Germany, 1997.
Abstract
P Teo and Y Hel-Or.
A Computational Group-Theoretic Approach to Steerable Functions.
STAN-CS-TN-96-33, Dept. of Computer Science, Stanford University,
April 1996.
E Simoncelli and H Farid.
Steerable Wedge Filters for Local Orientation Analysis.
IEEE Trans. Image Processing, Sept 1996.
Abstract /
Full PostScript (461k)
E P Simoncelli.
A Rotation-Invariant Pattern Signature.
3rd IEEE Int'l Conf on Image Processing.
Lausanne Switzerland, Sept 1996.
Abstract /
Full Text (377k, ps.gz)
M Michaelis and G Sommer.
A Lie Group-Approach to Steerable Filters.
Pattern Recognition Letters, v16, n11.
November, 1995. pp. 1165-1174.
W Beil.
Steerable Filters and Invariance Theory.
Pattern Recognition Letters, v16, n11, 1994.
pp. 453-460.
Klas Nordberg,
Signal Representation and Processing Using Operator Groups.
Linkoping University Dissertation, No. 366. 1994.
Eero P Simoncelli.
Design of Multi-dimensional Derivative Filters.
IEEE First Int'l Conf on Image Processing.
Austin TX, November 1994.
Abstract /
Full PostScript (74k)
J Segman and Y Y Zeevi. Image Analysis by Wavelet-Type Transforms:
Group Theoretic Approach. J. Mathematical Imaging and Vision 3,
pp 51-77, 1993.
Pietro Perona.
Steerable-scalable kernels for
edge detection and junction analysis.
2nd European Conf. Computer Vision (1992), pp. 3-18.
W T Freeman and E H Adelson.
The Design and Use of Steerable Filters.
IEEE Trans. Patt. Anal. Mach. Intell.,
Vol 13 Num 9, pp 891-906, September 1991.
J G Daugman.
Six Formal Properties of Anisotropic
Visual Filters: Structural Principles and Frequency/Orientation
Selectivity.
IEEE Trans. Systems, Man, and Cybernetics. vol 13, pp 882-887. 1983.
H Knutsson and G H Granlund.
Texture analysis using two-dimensional quadrature filters.
IEEE Computer Society Workshop on Computer
Architecture for Pattern Analysis and Image Database Management,
1983, pp. 206-213.
Per-Erik Danielsson.
Rotation-Invariant Linear Operators with Directional Response.
5th Int'l Conf. Patt. Rec., Miami,
December, 1980.
Related (Multi-Scale, Oriented) Image Transforms
JL Starck, EJ Candès and DL Donoho.
The Curvelet Transform for Image Denoising.
IEEE Trans Image Processing, 11(6):670-684, 2002.
R Navarro, A Tabernero, and G Cristobal.
Image Representations with Gabor Wavelets and its Applications.
Advances in Imaging and Electron Physics, vol 97, 1996.
E P Simoncelli & E H Adelson.
Non-separable Extensions of Quadrature Mirror Filters
to Multiple Dimensions
Proceedings of the IEEE, 78(4): 652-664, April, 1990.
Abstract
E P Simoncelli and E H Adelson.
Subband Image Coding with Hexagonal Quadrature Mirror Filters.
Proc. Picture Coding Symposium, Cambridge, MA, March 1990.
Abstract
M Porat and Y Zeevi.
The Generalized Gabor Scheme of Image
Representation in Biological and Machine Vision
IEEE Trans. PAMI. 10:452-468, 1988.
J G Daugman. Complete discrete 2-D Gabor transforms
by neural networks for image analysis and compression.
IEEE Trans. ASSP,
36(7): 1169-1179, 1988.
A B Watson. The cortex transform: rapid computation of
simulated neural images. Comp. Vis. Graph. Image Proc.
39:311-327, 1987.
Full text (pdf)
General Books on Wavelets
Barbara Burke Hubbard.
The World According to Wavelets.
A.K. Peters, Wellesley MA, 1996.
Stephane Mallat
A Wavelet Tour of Signal Processing.
Academic Press, 1998.
Gilbert Strang & Truong Nguyen.
Wavelets and Filter Banks.
Wellesley-Cambridge Press, Wellesley MA, 1996.
Martin Vetterli & Jelena Kovacevic.
Wavelets and Subband Coding.
Prentice Hall, 1995.
Revised: 26 Jun 2008.
Created: mid 1995.