Of course. Let’s embark on a detailed journey through the Fast Fourier Transform (FFT) algorithm, using your provided C++ code as our guide. We will break down the problem, the brilliant mathematical idea behind its solution, the mechanics of the algorithm itself, and finally, how each line of code masterfully implements this complex theory.
1. The Problem: The Slowness of Polynomial Multiplication
At its heart, the problem is to compute the product of two polynomials,
Let’s represent them in their coefficient form:
The product polynomial, let’s call it
How do we find the coefficient
This operation is known as a convolution of the coefficient sequences
A straightforward implementation of this formula would involve a nested loop:
1 | // Pseudocode for naive polynomial multiplication |
The outer loop runs about
2. The Core Idea: A Change in Perspective
The difficulty lies in the convolution operation in the coefficient domain. The brilliant insight of the FFT-based approach is to switch to a different representation of polynomials where multiplication is trivial.
This is the point-value representation.
A polynomial of degree
So, instead of representing
We choose
for for
Now, what is the point-value representation of the product
for
This point-wise multiplication takes only
This leads to a new three-step plan for polynomial multiplication:
- Evaluation (Coefficient to Point-Value): Given the coefficients of
and , pick a set of points and evaluate both polynomials at these points. This is also called the Discrete Fourier Transform (DFT). - Point-wise Multiplication: Create the point-value representation of
by multiplying the values of and at each point. This takes time. - Interpolation (Point-Value to Coefficient): Convert the point-value representation of
back into its coefficient form. This is the Inverse Discrete Fourier Transform (IDFT).
The challenge now is to make steps 1 and 3 fast. If we pick arbitrary points and evaluate naively (e.g., using Horner’s method), each evaluation takes
3. The Magic Ingredient: Complex Roots of Unity
The special points we choose are the complex roots of unity.
An
Here,
These roots have beautiful, symmetric properties that are key to the FFT algorithm:
- Periodicity:
. - Symmetry:
. - Reduction:
. This property is the cornerstone of the divide-and-conquer approach, as it connects the roots of size to the roots of size .
The Discrete Fourier Transform (DFT) of a coefficient vector
The Inverse DFT surprisingly has a very similar form. Given the values
Notice the only two differences: we use the negative powers of the roots (
4. The Algorithm: Fast Fourier Transform (FFT)
The FFT is a divide-and-conquer algorithm to compute the DFT in
We split the coefficients of
Let’s define two new polynomials of degree
Now we can write
The goal is to compute
Using the reduction property
This is a beautiful recurrence. We have expressed the DFT of a size-
Using the symmetry property
Let’s summarize for
Let $E_k = \text{DFT}(A_{even})k = A{even}(\omega_{N/2}^k)
Then:
This is the famous butterfly operation. We take two values from the smaller DFTs (
The recurrence relation for the runtime is
5. From Recursion to Iteration: Deconstructing the Code
Recursive FFT is elegant but can be slow due to function call overhead. The provided code uses a more efficient iterative, bottom-up approach. Let’s see how it maps to the theory.
5.1. main() - The Grand Plan
1 | int main() { |
This is a direct implementation of our three-step plan.
- Padding: The size of the transform,
N(calledlimin the code), must be a power of two large enough to hold the product polynomial’s coefficients. revInit(N): This pre-computes a permutation that is crucial for the iterative algorithm. We’ll examine it next.cdb: This is acomplex<double>. The coefficients are loaded into the real part.FFT(A, 1): The1signifies a forward FFT (DFT).FFT(A, -1): The-1signifies an inverse FFT (IDFT).- Rounding:
A[i].real() + 0.5is a standard trick to round a positive floating-point number to the nearest integer, mitigating small precision errors.
5.2. revInit() - The Bit-Reversal Permutation
1 | inline void revInit(int n_lim) { |
If you trace the recursive divide-and-conquer calls, you’ll notice an interesting pattern. At the base case (when subproblems have size 1), the coefficients are not in their original order. For
- Level 0:
{a0, a1, a2, a3, a4, a5, a6, a7} - Level 1:
{a0, a2, a4, a6}and{a1, a3, a5, a7} - Level 2:
{a0, a4},{a2, a6},{a1, a5},{a3, a7} - Level 3 (base):
{a0}, {a4}, {a2}, {a6}, {a1}, {a5}, {a3}, {a7}
The final order of indices is {0, 4, 2, 6, 1, 5, 3, 7}. Let’s look at their binary representations (for
- 0 (000) -> 0 (000)
- 1 (001) -> 4 (100)
- 2 (010) -> 2 (010)
- 3 (011) -> 6 (110)
- 4 (100) -> 1 (001)
- etc.
The final position of coefficient a_i is the index whose binary representation is the reverse of the binary representation of i. This is the bit-reversal permutation.
The iterative FFT starts by reordering the input array according to this permutation. This pre-shuffling allows it to build the solution bottom-up, combining adjacent pairs into blocks of 2, then blocks of 2 into blocks of 4, and so on, mimicking the recursive calls returning.
The line rev[i] = (rev[i >> 1] >> 1) | ((i & 1) << (L - 1)); is a clever dynamic programming way to compute this. The reversal of i is related to the reversal of i >> 1 (i.e., i without its last bit). We take the reversal of i>>1, shift it right to make space, and pop the last bit of i (i & 1) into the most significant position.
5.3. FFT() - The Iterative Engine
1 | inline void FFT(vector<cdb>& a, int op) { |
Let’s analyze this masterpiece of an algorithm.
Bit-Reversal: The first loop does the pre-shuffling we just discussed.
if (i < rev[i])ensures we swap each pair only once. The arrayais now in the order the recursive base cases would be.Iterative Loops: The algorithm now builds the DFT bottom-up.
for (int len = 1; len < lim; len <<= 1): This outer loop controls the size of the subproblems we are combining.lenis half the size of the blocks we are forming. It starts withlen=1(combining size-1 DFTs to make size-2 DFTs), thenlen=2(combining size-2 to make size-4), and so on, up tolen=lim/2.for (int i = 0; i < lim; i += (len << 1)): This loop iterates through the starting indices of the blocks. The block size is2*len. Forlen=1, it jumpsi=0, 2, 4, .... Forlen=2, it jumpsi=0, 4, 8, ....for (int j = 0; j < len; ++j): This is the core butterfly loop. It iterateslentimes to perform thelenbutterfly operations required to merge two sub-blocks of sizeleninto one large block of size2*len.u = a[i + j]: This is the result from the “even” part sub-DFT, corresponding to our. v = a[i + j + len] * w: This is the result from the “odd” part sub-DFT, multiplied by the twiddle factor, corresponding to our. a[i + j] = u + v;anda[i + j + len] = u - v;: This is the butterfly operation itself, storing the results back in place.
Twiddle Factors:
wlen: This is the principal root for the current stage,. The angle is PI / lenbecause the full circle is, and we are dividing it into parts. The opparameter neatly handles both DFT (op=1) and IDFT (op=-1, which conjugates the root by flipping the sign of the sine term).w: This is the current twiddle factor,. It starts at 1 and is updated in each step by multiplying with wlen.
Inverse FFT Scaling:
- As derived from the IDFT formula, if we’re doing an inverse transform (
op == -1), we must divide all resulting coefficients bylim(which is). This code does it conveniently at the very end.
- As derived from the IDFT formula, if we’re doing an inverse transform (
6. Standard Code
1 |
|
7. Conclusion: The Symphony of Code and Math
The FFT algorithm for polynomial multiplication is a testament to the power of algorithmic thinking and mathematical insight. By viewing polynomials not as lists of coefficients but as functions to be evaluated, we unlock a path to immense speedup.
The process is a perfectly choreographed dance:
- Setup: We determine the necessary size
N, a power of two, to avoid cyclic convolution issues and to fit the algorithm’s structure. - Transformation (DFT): We use the iterative FFT, powered by the bit-reversal permutation and butterfly operations based on the properties of complex roots of unity, to transform our coefficient representations into point-value representations in
time. - The Simple Step: We perform the trivial
point-wise multiplication, which corresponds to the complex convolution in the coefficient domain. - Transformation Back (IDFT): We use the exact same FFT algorithm, with a minor tweak to the angle and a final scaling step, to interpolate the resulting point-values back into the final coefficients, again in
time.
- Title:
- Author: Firsry
- Created at : 2025-09-14 19:20:29
- Updated at : 2025-09-14 19:23:08
- Link: https://firsryfan.github.io/2025/09/14/学术:浅谈快速傅里叶变换 FFT/
- License: This work is licensed under CC BY-NC-SA 4.0.