As a beginner, I haven't dived into the mathematical details of compressed sensing yet. At this stage, I think I only need to understand what a norm represents in order to see the big picture; I will go back and learn the math details later.
What is a norm?
“Mathematically a norm is a total size or length of all vectors in a vector space or matrix. For simplicity, we can say that the higher the norm is, the bigger the (value in) matrix or vector is.”
1. L0 norm
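First, the L0 norm is not actually a norm, because it does not satisfy all the properties a norm must have (for example, scaling a vector by a non-zero constant leaves its L0 value unchanged).
For now, it is enough to know that the L0 norm of a vector is just the number of non-zero entries in that vector.
So we can say that the L0 norm measures the sparsity of a vector: if x is k-sparse (has k non-zero entries), then L0(x) = k.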
Example: L0([3, 4, 0]) = 2, L0([7, 0]) = 1
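The count can be sketched in a few lines of Python. This is only an illustration: the function name l0 is made up for this post and is not a library API. The last call also shows the scaling problem: doubling every entry does not change the result, which a true norm would not allow.

```python
def l0(x):
    """Count the non-zero entries of a vector (the so-called L0 'norm')."""
    return sum(1 for v in x if v != 0)


print(l0([3, 4, 0]))  # 2
print(l0([7, 0]))     # 1

# [6, 8, 0] is 2 * [3, 4, 0], yet the count is still 2, not 4.
print(l0([6, 8, 0]))  # 2
```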
Minimizing the L0 norm is a non-convex optimization problem and is NP-hard in general, so in compressed sensing it is usually replaced by an L1-minimization problem, which is convex.
2. L1 norm
L1 norm of a vector: the sum of the absolute values of all elements in the vector
Example: L1([3, 4]) = 7
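A minimal Python sketch of the vector L1 norm, assuming the vector is a plain list of numbers (the name l1 is only illustrative):

```python
def l1(x):
    """Sum of the absolute values of the entries of a vector."""
    return sum(abs(v) for v in x)


print(l1([3, 4]))   # 7
print(l1([-3, 4]))  # 7, because the sign of an entry does not matter
```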
L1 norm of a matrix: compute the sum of the absolute values of the elements in each column, then take the largest of these column sums
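The matrix version can be sketched the same way. Here the matrix is assumed to be a list of rows, and matrix_l1 and the sample matrix A are hypothetical names chosen for this example.

```python
def matrix_l1(a):
    """Largest absolute column sum of a matrix given as a list of rows."""
    # zip(*a) regroups the rows into columns.
    return max(sum(abs(v) for v in col) for col in zip(*a))


A = [[1, -2],
     [3, 4]]

# Column sums: |1| + |3| = 4 and |-2| + |4| = 6, so the result is 6.
print(matrix_l1(A))  # 6
```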
3. L2 norm
L2 norm of a vector: the length of the vector, i.e. the square root of the sum of its squared elements. This is the usual Euclidean distance, the shortest (straight-line) distance from one point to another.
Example: L2([3, 4]) = 5
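A short Python sketch of the same computation, using only the standard math module (the function name l2 is an illustrative choice):

```python
import math


def l2(x):
    """Euclidean length: square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for v in x))


print(l2([3, 4]))  # 5.0
```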
4. L-∞ norm
L-∞ norm of a vector: the largest absolute value among the elements of the vector
Example: L∞([-6, 4, 2]) = 6
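In Python this is a one-liner; the name linf is made up for this sketch:

```python
def linf(x):
    """Largest absolute value among the entries of a vector."""
    return max(abs(v) for v in x)


print(linf([-6, 4, 2]))  # 6
```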