Featured image of post Efficient use of computation

Efficient use of computation

Optimising code for fast operation on CPUs and GPUs

CPU

Efficient computation and hardware utilisation underpins most of the methods used in deep learning learning applications.

Different implementations of matrix multiplication on cpu

Why it is faster to access in the mkn pattern

Making things faster: multithreading

For this, we are chaning the problem to model the heat spread in a room

Room shown at different times

Now we can utilise all the cores in the machine to make the simulation go a lot faster.

Performance when using multiple cores, measured in Mega lattice updates/s

Making things even faster: using a gpu

The idea of using blocking is to allow each thread to work on more than one element of the C matrix before being released. This might improve performance, because the overhead of assigning new threads to calculate each element is higher than assigning new work to each thread.

Different implementations of matrix multiplication on cpu

Built with Hugo
Theme Stack designed by Jimmy