Convolution is a weighted moving average with one signal flipped back to front:
The equation is the same as for correlation except that the second signal (y[k - n]) is flipped back to front.
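For reference, the convolution sum can be written out as below. Only the term y[k - n] appears in the text; the names x for the first signal and r for the result are assumptions made here for illustration, and the limits of the sum are left open because they depend on the lengths of the signals.

```latex
r[k] = \sum_{n} x[n] \, y[k - n]
```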
The diagram shows how the unknown signal can be identified.
The diagram shows how a single point of the convolution function is calculated:
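As a rough sketch of the same calculation in code, the hypothetical function below computes one point of the convolution. The names `convolution_point`, `x`, `y` and `k` are illustrative assumptions, and samples outside either signal are assumed to be zero.

```python
def convolution_point(x, y, k):
    """Return r[k] = sum over n of x[n] * y[k - n].

    Samples outside either list are treated as zero (an assumption).
    """
    acc = 0.0
    for n in range(len(x)):
        m = k - n  # index into the second signal, which runs backwards as n increases
        if 0 <= m < len(y):
            acc += x[n] * y[m]  # multiply, then accumulate
    return acc


x = [1.0, 2.0, 3.0]
y = [0.5, 0.25]
print(convolution_point(x, y, 1))  # x[0]*y[1] + x[1]*y[0] = 1.25
```

Each pass through the loop is one multiplication followed by one addition to a running total.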
Convolution is computationally expensive. If one signal has length M and the other has length N, then calculating the whole convolution function directly takes N * M multiplications.
Note that each step multiplies two samples and then adds the product to a running total. This is typical of DSP operations and is called a 'multiply/accumulate' operation. It is the reason that DSP processors are designed to perform a multiplication and an addition in parallel.
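The multiplication count can be checked with a minimal sketch of direct convolution. The function name `convolve_direct` and the counter are assumptions added for illustration, not part of any library.

```python
def convolve_direct(x, y):
    """Directly convolve two finite sequences and count the multiplications."""
    n_len, m_len = len(x), len(y)
    result = [0.0] * (n_len + m_len - 1)
    multiplies = 0
    for n in range(n_len):
        for m in range(m_len):
            # The product x[n] * y[m] contributes to output point k = n + m.
            result[n + m] += x[n] * y[m]
            multiplies += 1
    return result, multiplies


r, count = convolve_direct([1.0, 2.0, 3.0], [0.5, 0.25])
print(r)      # [0.5, 1.25, 2.0, 0.75]
print(count)  # 6, which is N * M = 3 * 2
```

The two nested loops make the N * M cost visible: every sample of one signal is multiplied by every sample of the other.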
Convolution is used for digital filtering.
Convolution is preferred to correlation for filtering because of how the frequency spectra of the two signals interact. Convolving two signals is equivalent to multiplying the frequency spectra of the two signals together, which is easily understood and is what we mean by filtering. Correlation is equivalent to multiplying the complex conjugate of the frequency spectrum of one signal by the frequency spectrum of the other. Complex conjugation is less intuitive, so convolution is used for digital filtering. Calculating a convolution by transforming both signals, multiplying their frequency spectra and transforming back is called fast convolution. With a fast Fourier transform this takes far fewer operations than the direct sum when the signals are long.
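The equivalence between convolution and multiplication of spectra can be illustrated with a small sketch. It uses a plain discrete Fourier transform written out in full, so it shows the principle but is not itself fast; a practical fast convolution would use an FFT instead. The function names `dft` and `convolve_via_spectra` are assumptions for illustration. Both signals are zero-padded to length N + M - 1 so that the result matches the ordinary (non-circular) convolution.

```python
import cmath


def dft(signal, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    size = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(size):
        total = 0j
        for t in range(size):
            total += signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / size)
        out.append(total / size if inverse else total)
    return out


def convolve_via_spectra(x, y):
    """Convolve x and y by multiplying their frequency spectra."""
    size = len(x) + len(y) - 1
    # Zero-pad both signals so the circular result equals the linear convolution.
    x_padded = list(x) + [0.0] * (size - len(x))
    y_padded = list(y) + [0.0] * (size - len(y))
    spectrum_x = dft(x_padded)
    spectrum_y = dft(y_padded)
    product = [a * b for a, b in zip(spectrum_x, spectrum_y)]
    # The inputs are real, so the imaginary parts are only rounding error.
    return [value.real for value in dft(product, inverse=True)]


result = convolve_via_spectra([1.0, 2.0, 3.0], [0.5, 0.25])
print([round(v, 6) for v in result])
```

Up to rounding error, the output should agree with the values obtained by summing the products directly in the time domain.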