Matrix multiplication in C++
Track completion, mastery, and revision.
Matrix Multiplication in C++: Naive vs. Strassen's Algorithm
Matrix multiplication is a fundamental operation in linear algebra with widespread applications in computer graphics, machine learning, physics simulations, and scientific computing.
In this tutorial, we will explore two primary ways to implement matrix multiplication in C++:
- The Standard (Naive) Approach — $O(n^3)$ time complexity.
- Strassen's Algorithm — An optimized divide-and-conquer approach with $O(n^{2.81})$ time complexity.
1. Standard (Naive) Matrix Multiplication
The standard method directly implements the mathematical definition of matrix multiplication.
Algorithm Overview
For two matrices $A$ (of size $m \times n$) and $B$ (of size $n \times p$), their product $C$ (of size $m \times p$) is computed by taking the dot product of the rows of $A$ with the columns of $B$:
$$C[i][j] = \sum_{k=0}^{n-1} A[i][k] \times B[k][j]$
Compatibility Rule: Matrix multiplication $A \times B$ is only defined if the number of columns in matrix $A$ is equal to the number of rows in matrix $B$.
C++ Implementation
#include <iostream>
using namespace std;
const int MAX_SIZE = 5;
void multiply(int A[MAX_SIZE][MAX_SIZE], int B[MAX_SIZE][MAX_SIZE],
int result[MAX_SIZE][MAX_SIZE], int r1, int c1, int r2, int c2) {
if (c1 != r2) {
cout << "Matrix multiplication not possible!\n";
return;
}
for (int i = 0; i < r1; i++) {
for (int j = 0; j < c2; j++) {
result[i][j] = 0;
for (int k = 0; k < c1; k++) {
result[i][j] += A[i][k] * B[k][j];
}
}
}
}
void display(int matrix[MAX_SIZE][MAX_SIZE], int rows, int cols) {
for (int i = 0; i < rows; i++) {
for (int j = 0; j < cols; j++) {
cout << matrix[i][j] << " ";
}
cout << "\n";
}
}
int main() {
int A[MAX_SIZE][MAX_SIZE], B[MAX_SIZE][MAX_SIZE], C[MAX_SIZE][MAX_SIZE];
int r1, c1, r2, c2;
cout << "Enter rows and columns for first matrix: ";
cin >> r1 >> c1;
cout << "Enter rows and columns for second matrix: ";
cin >> r2 >> c2;
if (c1 != r2) {
cout << "Error: Columns of first matrix must equal rows of second matrix!\n";
return 1;
}
cout << "Enter elements of first matrix:\n";
for (int i = 0; i < r1; i++)
for (int j = 0; j < c1; j++)
cin >> A[i][j];
cout << "Enter elements of second matrix:\n";
for (int i = 0; i < r2; i++)
for (int j = 0; j < c2; j++)
cin >> B[i][j];
multiply(A, B, C, r1, c1, r2, c2);
cout << "Resultant matrix:\n";
display(C, r1, c2);
return 0;
}
Sample Output
Enter rows and columns for first matrix: 2 3
Enter rows and columns for second matrix: 3 2
Enter elements of first matrix:
1 2 3
4 5 6
Enter elements of second matrix:
7 8
9 10
11 12
Resultant matrix:
58 64
139 154
Complexity Analysis
- Time Complexity: $\mathcal{O}(n^3)$ — Three nested loops run up to $n$ times each (assuming square matrices of size $n \times n$).
- Space Complexity: $\mathcal{O}(n^2)$ — To store the input and output matrices. The auxiliary space used by the multiplication algorithm itself is $\mathcal{O}(1)$.
2. Strassen's Algorithm
Strassen's Algorithm is a divide-and-conquer method that reduces the number of recursive matrix multiplications required for square matrices.
Algorithm Overview
A standard divide-and-conquer approach splits two $n \times n$ matrices into eight $n/2 \times n/2$ submatrices, requiring 8 multiplications and 4 additions. This still results in a time complexity of $\mathcal{O}(n^3)$.
Strassen's key insight was that we can calculate the product using only 7 multiplications and 18 additions/subtractions.
By reducing the number of multiplications (which are computationally expensive compared to additions), the recurrence relation becomes:
$$T(n) = 7T(n/2) + O(n^2)$$
Using the Master Theorem, this yields a time complexity of:
$$\mathcal{O}(n^{\log_2 7}) \approx \mathcal{O}(n^{2.81})$$
Note on Implementation: A fully generic Strassen's algorithm recursively partitions matrices down to a base case. To keep the code clean and focus on the mathematical formulas, the implementation below demonstrates Strassen's algebraic formulas operating on $2 \times 2$ sub-blocks within $4 \times 4$ structures.
C++ Implementation
#include <iostream>
using namespace std;
void inputMatrix(double mat[4][4]) {
for(int i = 0; i < 4; i++) {
for(int j = 0; j < 4; j++) {
cin >> mat[i][j];
}
}
}
void strassenMultiply(double A[4][4], double B[4][4]) {
// Dividing into 2x2 submatrices
double A11 = (A[0][0] + A[1][1]) * (B[0][0] + B[1][1]);
double A12 = (A[0][1] + A[1][1]) * B[1][0];
double A21 = A[1][0] * (B[0][1] - B[1][1]);
double A22 = A[1][1] * (B[1][0] - B[0][0]);
double B11 = (A[0][0] + A[0][1]) * B[1][1];
double B21 = (A[1][0] - A[0][0]) * (B[0][0] + B[0][1]);
double B22 = (A[0][1] - A[1][1]) * (B[1][0] + B[1][1]);
// Calculating result submatrices
double C11 = A11 + A22 - B11 + B22;
double C12 = A21 + B11;
double C21 = A12 + A22;
double C22 = A11 + A21 - B11 - B21;
// Output result
cout << "Result Matrix:\n";
cout << C11 << " " << C12 << "\n";
cout << C21 << " " << C22 << "\n";
}
int main() {
double A[4][4], B[4][4];
cout << "Enter matrix A (4x4):\n";
inputMatrix(A);
cout << "Enter matrix B (4x4):\n";
inputMatrix(B);
strassenMultiply(A, B);
return 0;
}
Sample Output
Enter matrix A (4x4):
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
Enter matrix B (4x4):
16 15 14 13
12 11 10 9
8 7 6 5
4 3 2 1
Result Matrix:
386 444
962 1108
Comparison
| Metric | Naive Matrix Multiplication | Strassen's Algorithm | | :--- | :--- | :--- | | Time Complexity | $\mathcal{O}(n^3)$ | $\mathcal{O}(n^{2.81})$ | | Space Complexity | $\mathcal{O}(n^2)$ | $\mathcal{O}(n^2)$ (higher constant factor due to recursion stack) | | Implementation Complexity | Simple and intuitive | Complex, recursive division | | Best For | Small matrices ($n < 100$) | Large matrices | | Submatrix Multiplications | 8 multiplications | 7 multiplications |
Conclusion
The choice between these two algorithms depends heavily on the scale of your data:
- Use the Naive Method for small matrices. It has minimal overhead, is easy to debug, and benefits from excellent CPU cache locality because of its simple loop structures.
- Use Strassen's Algorithm when dealing with very large matrices where the asymptotic savings in multiplications outweigh the overhead of recursive function calls and extra memory allocations.
For state-of-the-art applications, hybrid approaches are often used: Strassen's algorithm recursively divides the matrices until they reach a threshold size (e.g., $64 \times 64$), at which point the system switches to the standard naive method to maximize hardware efficiency.
Finished reading?