Performance Programming: Threading and Memory Latency

Marho Onothoja
2 min read · May 11, 2021


In the previous article of this series we saw one of the major pitfalls of threading: race conditions that arise when threads handle shared resources. Read more about it here.

In today’s edition of threading pitfalls, we will look at another downside of using threads and at the solution that arose to address it. This problem is known as memory latency.

Take the following analogy, for example:

Imagine you had a drawer filled with dirty clothes, and let’s assume you had the option to either wash them by hand or throw them into a washing machine and have it do the washing for you. If you were to wash by hand, you would have to wash the clothes one at a time. This is how code typically runs: one step at a time.

Now, if you wanted to make this activity concurrent using threads, you would get more buckets and split the clothes among them, taking turns washing from each bucket for a fixed amount of time, say 1 minute (this turn-taking rule is the scheduler).

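You now have multiple buckets with the clothes split among them. However, you are still one person and have to move from bucket to bucket every minute, as the scheduler dictates. This is known as context switching. Context switching, moving from one task to another, causes an overhead: each move costs time in which nothing gets washed, because you have to put down what you were doing and pick up where you left off in the next bucket. For a processor, that means saving one thread's state and loading another's, much of it from memory, and the waiting involved is what is referred to here as memory latency.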

For a small program this may seem insignificant, but as the workload your program handles grows, so does the overhead generated by context switching.
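To make the bucket analogy concrete, here is a minimal sketch of one worker rotating over several buckets. It is a toy model, not a measurement of any real scheduler: the function name `round_robin` and the parameters `time_slice` and `switch_cost` are made up for illustration, and time is counted in abstract "minutes".

```python
from collections import deque


def round_robin(work_units, time_slice, switch_cost):
    """Simulate one worker rotating over buckets of work.

    work_units  -- minutes of washing in each bucket (hypothetical numbers)
    time_slice  -- minutes spent on a bucket before the scheduler moves on
    switch_cost -- minutes lost every time the worker changes bucket
    Returns (total_time, number_of_switches).
    """
    queue = deque(w for w in work_units if w > 0)
    total_time = 0
    switches = 0
    while queue:
        remaining = queue.popleft()
        ran = min(remaining, time_slice)
        total_time += ran
        remaining -= ran
        if queue:
            # Another bucket is waiting, so the worker has to move to it
            # and pays the context-switch cost.
            total_time += switch_cost
            switches += 1
        if remaining > 0:
            # Unfinished bucket goes to the back of the line.
            queue.append(remaining)
    return total_time, switches


if __name__ == "__main__":
    # Two equal buckets, growing workload, 1-minute slices, 1 minute per switch.
    for size in (3, 30, 300):
        total, switches = round_robin([size, size], time_slice=1, switch_cost=1)
        print(f"work={2 * size:4d}  switches={switches:4d}  total={total:4d}")
```

In this model the number of switches grows with the amount of work, so the time lost to switching grows with it. A single bucket never pays the cost at all, which is the "wash one at a time" case from the start of the analogy.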

One solution, or rather a workaround, since it suffers from the same problem to a significantly smaller degree, is micro-threads.


What exactly are micro-threads, and how do they differ from threads?

Micro-threads are a static partition of a basic block into concurrently executing fragments, which execute on a single processor and share a micro-context. Fundamentally they function the same way threads do. The main difference between micro-threads and threads is the large drop in memory latency: context switching overhead is much smaller for micro-threads.

Micro-threads mainly hide memory latency inside each core by overlapping computation with memory requests. This can make micro-threads run several times faster than normal threads, reaching upwards of 3 times the speed of a typical multi-threading setup.
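The idea of overlapping computation with memory requests can be sketched with a small cycle-counting model. This is only an illustration under stated assumptions, not how any particular processor implements micro-threads: the functions `run_serial` and `run_interleaved`, the `(compute, latency)` step format, and all cycle counts are invented for the example. Each fragment computes for a few cycles, then issues a memory request and cannot continue until it returns.

```python
import heapq


def run_serial(fragments):
    """Run fragments one after another; every memory wait stalls the core."""
    return sum(compute + latency
               for frag in fragments
               for compute, latency in frag)


def run_interleaved(fragments, switch_cost=0):
    """Run fragments on one core, switching to a ready fragment during waits.

    fragments   -- list of fragments; each is a list of (compute, latency) steps
    switch_cost -- assumed cycles added each time a fragment is dispatched
    Returns the cycle at which the last memory request completes.
    """
    # Heap entries: (cycle at which the fragment can resume, fragment, step).
    ready = [(0, i, 0) for i, frag in enumerate(fragments) if frag]
    heapq.heapify(ready)
    clock = 0
    finish = 0
    while ready:
        ready_at, i, step = heapq.heappop(ready)
        # If no fragment is ready yet, the core idles until one is.
        clock = max(clock, ready_at)
        compute, latency = fragments[i][step]
        clock += switch_cost + compute
        # The memory request runs in the background while others compute.
        done_at = clock + latency
        finish = max(finish, done_at)
        if step + 1 < len(fragments[i]):
            heapq.heappush(ready, (done_at, i, step + 1))
    return finish


if __name__ == "__main__":
    # Four fragments, each with 3 steps: 2 cycles of compute, 6 of memory wait.
    fragments = [[(2, 6)] * 3 for _ in range(4)]
    print("serial:     ", run_serial(fragments))       # 96
    print("interleaved:", run_interleaved(fragments))  # 30
```

With these made-up numbers the core always has another fragment to work on while a request is outstanding, so most of the waiting disappears. With a single fragment there is nothing to overlap and both functions return the same total, and raising `switch_cost` eats into the gain, which matches the point that micro-threads only help because their switches are cheap.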