Performance Programming: Threading and Resource Sharing.

Marho Onothoja
4 min read · Mar 26, 2021


In this post I hope to expand a bit on one of the core features of multi-threading and provide clarity as to how it works. So let’s get into it.

The first thing you need to realize is that although multi-threading and multiprocessing are aimed at the same goal, speed, threads and processes are not the same thing. Threads actually live inside processes. To put it in context as much as possible: "Processes are the abstraction of running programs" while "Threads are the units of execution in a process". In short, every thread belongs to a process: a single-threaded application is a process with one thread, and a multi-threaded application is a process with several.

Basically, multi-threading is the act of taking a process (an instance of an application) that has multiple sequences of actions to perform (multiple things to do) and splitting that work up. Remember that a single process spins up a single thread by default. If the process has multiple tasks to execute, it makes sense to assign some of those tasks to different threads so that activities can run without waiting for each other. This way your application runs concurrently: not simultaneously, but close.
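As a minimal sketch of the idea, here are two simulated I/O-bound tasks (the `simulated_download` function and the timings are illustrative, not from the article) running on separate threads so their waits overlap instead of stacking up:

```python
import threading
import time

def simulated_download(name):
    # Stand-in for an I/O-bound task, such as a network request.
    time.sleep(0.2)
    print(f"{name} finished")

# The process starts with a single main thread; spinning up two more
# lets the waits overlap instead of running back to back.
start = time.perf_counter()
workers = [threading.Thread(target=simulated_download, args=(f"task-{i}",))
           for i in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
elapsed = time.perf_counter() - start  # close to 0.2s, not the 0.4s a sequential run would take
```

Run sequentially, the two 0.2-second waits would cost about 0.4 seconds; run on two threads, the total stays close to 0.2 seconds.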

But wait: if it's all one application running, what happens if input entered into the application undergoes several changes by other threads before a particular thread can get to it? Wouldn't everything fall apart?

Yes, under normal circumstances it could. The key fact here is that multiple threads all run within one process, and hence they share its resources rather than possessing their own.

image from www.cs.uic.edu

This image gives a very detailed picture of how threads look: each has its own registers (including its own program counter) and stack, but shares everything else with its sibling threads, including the address space, heap, and static data, as well as attributes inherited from the parent process such as the process ID and signal handlers; find the full, comprehensive list here.

Let me give an illustration. Say you live in a room with three other people, and everyone has to eat, right? Now let's assume everyone shares the same refrigerator, and because they are all such good friends they prefer to share food instead of keeping their own. If John wants to eat, he gets to the fridge and throws something together from whatever is available. If James gets to the fridge afterward, he has to make something out of whatever ingredients are left or out of John's leftovers (if there are any). The contents of the fridge are altered after John is done and altered further after James, but all the other roommates have to make do with whatever the refrigerator currently holds. In short, if a thread modifies a particular variable, all other threads that work with that variable have to work with that modified data.

One question you’re probably asking from reading this illustration is

What happens when everybody wants to cook and eat at the same time?

Well, this brings up a new problem, what we refer to as race conditions: one of the greatest pitfalls of working with threads.

The solution to this problem is a concept known as thread-safety: different threads accessing the same resources without producing unpredictable, nondeterministic results.

There are a few ways to achieve thread-safety:

  1. Don’t share data between threads
  2. Make the shared data immutable. This is similar to the principle the functional programming paradigm works with
  3. Synchronization
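The first approach, not sharing data at all, is often done with message passing. As a sketch (the `square` worker and queue names are illustrative), each thread hands its result to a thread-safe `queue.Queue` instead of mutating shared state, and only the main thread ever touches the collected results:

```python
import queue
import threading

# Workers never mutate shared state; they hand results to a thread-safe queue.
results_q = queue.Queue()

def square(n):
    results_q.put(n * n)

threads = [threading.Thread(target=square, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Only the main thread reads the results, so no locking code is needed here.
results = sorted(results_q.get() for _ in range(4))
print(results)  # [0, 1, 4, 9]
```

Because `queue.Queue` handles its own internal locking, the threads never need to coordinate directly.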

Multi-threaded programs should not depend on the relative timing between interfering threads; otherwise debugging becomes a nightmare, as the execution of code blocks becomes subject to chance and the end result is nondeterministic. To beat this, engineers must take the time to carefully craft their thread-safety solutions.

Synchronization and Deadlocks

Synchronization prevents threads from accessing shared data at the same time, which prevents corrupted state. It does this by using locks to synchronize access to resources between threads, determining which thread has access to which resource at any particular point in the running of a program.
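Here is a minimal sketch of this with Python's `threading.Lock` (the counter scenario is illustrative): four threads each increment a shared counter, and the lock guarantees that the read-modify-write on `counter` never interleaves between threads.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        # Acquiring the lock makes the read-modify-write on counter atomic
        # with respect to the other threads; the with-block releases it.
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000: no updates were lost
```

Without the lock, two threads could read the same old value of `counter` and one increment would be lost.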

There are two things to keep in mind to ensure that a synchronized solution is thread-safe:

  1. Every shared mutable variable must be guarded by some lock. The data must not be read or written except inside a synchronized block that acquires that lock. This prevents unexpected results and possible corruption of shared state.
  2. If a single invariant involves multiple shared mutable variables, then all the variables involved must be guarded by the same lock. Once a thread acquires the lock, the invariant must be reestablished before the lock is released.
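The second rule can be sketched with a hypothetical account-transfer example (the `checking`/`savings` variables and amounts are illustrative): the invariant is that the two balances always sum to 100, and a single lock guards both variables so the invariant is reestablished before the lock is released.

```python
import threading

# Invariant: checking + savings must always equal 100.
checking, savings = 60, 40
balance_lock = threading.Lock()  # ONE lock guards BOTH variables

def transfer(amount):
    global checking, savings
    # Both updates happen inside the same critical section, so no other
    # thread can observe the invariant while it is temporarily broken.
    with balance_lock:
        checking -= amount
        savings += amount

threads = [threading.Thread(target=transfer, args=(1,)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(checking + savings)  # still 100: the invariant held throughout
```

If `checking` and `savings` were guarded by two different locks, a thread could see the money leave one account before it arrived in the other.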

However, using locks for synchronization introduces another problem, called deadlock. Locks require threads to wait for a lock to be released by the thread holding it, but this also makes it possible to reach a situation where two threads are each waiting for a lock the other holds, preventing both from making progress and halting the running program altogether.
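One common guard against this, sketched below with illustrative lock names, is a global lock ordering: if one thread took `lock_a` then `lock_b` while another took `lock_b` then `lock_a`, each could end up waiting on the other forever. Having every thread acquire the locks in the same order rules that out.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
results = []

def worker(tag):
    for _ in range(1000):
        # Every thread acquires lock_a before lock_b, never the reverse,
        # so no cycle of "each holds what the other needs" can form.
        with lock_a:
            with lock_b:
                results.append(tag)

t1 = threading.Thread(target=worker, args=("t1",))
t2 = threading.Thread(target=worker, args=("t2",))
t1.start(); t2.start()
t1.join(); t2.join()

print(len(results))  # 2000: both threads finished, no deadlock
```

Swapping the nesting order in just one of the threads would make this program liable to hang forever.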

Threads are not the be-all and end-all of performance boosting, not even close. You must know when to use them, and in languages like Python they should be a last resort, for when everything else fails.
