Learn how to master Go's concurrency model with goroutines and channels. This guide covers mechanics, patterns, best practices, and performance optimization for building efficient, scalable concurrent applications.

Reminder, for anyone who has been waiting for it, the first concurrency article is out now. You can read it now. It actually exists.
Concurrent Computing 00: Concurrency, Threads, and OS Task Scheduling
Ever wondered how programs deal with multiple things at once? How can your device take in massive amounts of data, and manage to do something useful with it, in real time?
The answer is concurrency, but I'd forgive you if you don't fully understand what that means. We're going to take a close look at how computers handle concurrency, and how that term may not mean what you think it does.
This is the first in a series of articles, as there was too much to cover to fit it all in a single post. This isn't directly related to any of my current projects, and all this is just leading up to a rant about something I found interesting. However, as usual, the ultimate goal is simply to make information easily accessible to others.
Considering that I already mentioned that this is the first in a series, I probably don't need to say this, but this is going to be another long one. An entire series of long ones, in fact, with this likely being amongst the shortest.
CC-0.01 Multitasking and the OS's Role in Concurrency
Let's start by looking at how modern operating systems enable many processes to run, seemingly, and sometimes literally, at the same time.
While most contemporary machines have multiple processors and/or cores, modern operating systems can still maintain concurrency, even if only a single CPU is available. (Note that I will be using "core" and "CPU/processor" interchangeably, as the difference, from our perspective, will be negligible.) In order to achieve this, the OS kernel uses a technique known as preemptive multitasking.
I plan to give an in-depth look at OS multitasking when I look at OS development, so we'll be taking a fairly surface-level look at it here. In the simplest terms, the OS divides work that needs to be performed into tasks. When the OS goes to work on a task, it starts a timer. If the timer runs out, the kernel will preempt the task, save its context (CPU registers, stack information, etc.), set the task aside, and switch to something else, a process known as context switching. The time for which a task is allowed to run before being preempted is known as its quantum.
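To make the timer-and-preempt cycle a little more concrete, here is a toy simulation in Python. It is only a sketch: the function name round_robin, the task names, and the idea of measuring work in abstract "units" are all made up for illustration, and a real kernel scheduler is far more involved (priorities, blocking, multiple CPUs, and so on).

```python
from collections import deque

def round_robin(tasks, quantum):
    """Simulate preemptive round-robin scheduling (a toy model only).

    tasks: dict mapping a task name to the units of work it needs.
    quantum: maximum units a task may run before being preempted.
    Returns the list of (name, units_run) time slices in execution order.
    """
    ready = deque(tasks.items())  # the run queue of (name, remaining work)
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        ran = min(quantum, remaining)  # run until done or the timer expires
        timeline.append((name, ran))
        remaining -= ran
        if remaining > 0:
            # Preempted: "save the context" (here, just the remaining work)
            # and set the task aside at the back of the queue.
            ready.append((name, remaining))
    return timeline

timeline = round_robin({"A": 5, "B": 2, "C": 4}, quantum=2)
print(timeline)
```

Each entry in the timeline is one time slice. Task A needs three slices to finish, and the other tasks get to run in between them, which is the whole point of preemption.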
Imagine a medical clinic. There may be only a single doctor available, but they will often still see multiple patients in a day. If a patient's problem requires more work than can be done in a single day, the doctor will update their records, and reschedule them for another day. In this example, the doctor is our CPU, and the patients are our scheduled tasks. Although, for this analogy to reflect a real OS, the doctor would also need to take time to act as the receptionist, accountant, nurse, etc. as the OS (typically) uses the same CPU(s) for running the scheduler as it does to actually run the scheduled tasks, so it's a rather odd clinic.
Because tasks may be preempted at any time, and any waiting task may be scheduled to run, independent of all other tasks, all tasks can progress independent of one another. Even if one or more tasks are blocked (for example, waiting on I/O from a storage device or a network connection) and are unable to progress, the OS may continue to work on other tasks.
This is concurrency: When tasks are executed in an arbitrary order, independent of each other. The tasks "share time", hence why early multitasking systems were often referred to as "time-sharing systems". Concurrency does not guarantee that the tasks will be executed at the same time (in parallel). It only ensures that tasks may progress asynchronously.
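As a small illustration of one task blocking while another keeps going, here is a sketch using Python's threading module. The threading.Event is only a stand-in for a slow device, and the names blocked_task, busy_task, and io_ready are invented for this example.

```python
import threading

io_ready = threading.Event()  # stands in for "the device has data"
log = []                      # record of which task finished, in order

def blocked_task():
    io_ready.wait()           # blocks, like a read() waiting on a disk
    log.append("blocked_task done")

def busy_task():
    total = sum(range(1000))  # pure computation, never blocks
    log.append(f"busy_task done: {total}")

t1 = threading.Thread(target=blocked_task)
t2 = threading.Thread(target=busy_task)
t1.start()
t2.start()
t2.join()       # busy_task finishes even though blocked_task is stuck
io_ready.set()  # the "I/O" completes, unblocking the first task
t1.join()
print(log)
```

Note that nothing here requires the two tasks to run at the same instant. The only guarantee being shown is that the blocked task doesn't hold up the other one.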
CC-0.02 Programs, Tasks, Processes, and Threads
So far, I've been referring to the work scheduled by the OS, generically, as "tasks". While the terms "thread" and "task" are generally used interchangeably (particularly when looking at things in Linux terms, as I tend to do), I will use "task" to mean "any piece of work scheduled by the OS", while I will use "thread" to refer to "scheduled work belonging to a specific process". To really understand what that means, let's look at how the OS executes programs.
In order to execute anything, we first need a program to run. This usually comes in the form of an executable binary file. These files contain segments for the text of the program, which contains the actual executable code, as well as all the other static and constant data that the program needs to actually function. The OS will locate the needed segments in the binary file, and map them into memory.
Once the OS has an in-memory image of the program we want to run, we can technically start executing the program. But, since we're executing many programs at once, we need to not only differentiate between them, but also isolate them from one another, so that they (ideally) can't trample one another. In order to deal with this, the OS assigns each running instance of a program a process, with a unique ID and its own address space, open file table, security privileges, etc.
Once the OS creates the process, it can schedule a new thread for that process that will begin executing the program from a special entry point that will handle any additional setup that needs to be done before calling the program's main() function (or whatever the equivalent point is for the program we're executing).
From this perspective, a process is a running instance of a program, and acts as a container for the threads of that program, and provides the environment they're executing in. Threads within a process share the same address space, and are a part of the same instance of a program, but have their own call stacks, and their own context that gets saved when a context switch occurs. This enables threads to function independently of one another, but still directly interact with one another, and share all the resources and privileges associated with the process they belong to. This differs from how processes interact with one another, which requires them to, in some way, go through the OS.
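Here is a rough Python sketch of that relationship: several threads inside one process, each with its own local variables, all writing into the same shared list. The names worker, shared, and pids are invented for the example, and it leans on the fact that list.append happens to be safe to call from multiple threads in CPython.

```python
import os
import threading

shared = []  # lives in the process's address space, visible to every thread
pids = []    # the process ID each thread observes

def worker(n):
    local = n * n  # a local variable, private to this thread's call stack
    shared.append((threading.current_thread().name, local))
    pids.append(os.getpid())

threads = [threading.Thread(target=worker, args=(i,), name=f"t{i}") for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared))
print(set(pids))
```

Every thread reports the same process ID, and every thread's result lands in the same list without any help from the OS.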
Before someone comes after me for skipping over interpreted languages, in that case, the interpreter is the program, and the source being executed is used as instructions for that program. This also applies to languages using an intermediate bytecode representation, like Java. JIT (Just-In-Time) compilation is similar, but could be looked at as either generating a new, in-memory, executable image, or extending the existing image.
By default, a process is started with only a single thread, usually called the main thread. If a program wants to leverage the OS to perform work concurrently, it has two options:
It can start, and cooperate with, other processes. This can be complex and expensive to deal with, but does work, and has been used to achieve concurrency since before threads became a standard inclusion in OSs. This also has the advantage that a failure in one process (usually) won't immediately cause other cooperating processes to fail (assuming the OS properly isolates processes). This is still a perfectly valid option, but has become less common due to the overhead associated with cooperation between, and management of, separate processes, and the widespread availability of multithreading APIs.
It can request that the OS start additional threads attached to the current process. This has the advantage that the threads don't need to (directly) go through the OS in order to share data and communicate, reducing the overhead required for cooperation. The disadvantage is that a single crash can take down all the threads, as they're all a part of the same process. It also introduces potentially subtle, and complex, issues that arise when dealing with concurrent tasks that directly share resources, especially regions of memory. This, and the issues associated with it, will be our primary focus for these articles.
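For contrast with the threaded approach, here is a minimal sketch of the first option, cooperating processes, in Python. The child is a second Python interpreter started with subprocess.run, and the two sides can only exchange data through pipes provided by the OS. The child_code string and the messages are invented for the example.

```python
import os
import subprocess
import sys

# The program the child process will run. It has its own address space,
# so it sees none of the parent's variables.
child_code = (
    "import os, sys\n"
    "data = sys.stdin.read()\n"            # input arrives through an OS pipe
    "print(os.getpid(), data.upper())\n"   # output goes back through another pipe
)

result = subprocess.run(
    [sys.executable, "-c", child_code],
    input="hello from parent",
    capture_output=True,
    text=True,
    check=True,
)
child_pid, reply = result.stdout.strip().split(" ", 1)
print(os.getpid(), child_pid, reply)
```

Even for a message this small, the data is copied through the kernel twice (once in each direction), which is the overhead that sharing an address space avoids.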
While concurrency alone can be extremely useful, particularly by allowing some tasks to block, while others continue to run, it would be even better if we could execute concurrent tasks in parallel.
The good news is that modern OSs have us covered there as well, providing support for executing concurrent tasks on many CPUs using Symmetric Multiprocessing (SMP). On multiprocessor systems, the OS will attempt to distribute the load of scheduled tasks across the available CPUs. This enables the system to, quite literally, do multiple things at once. This is more generally known as parallel processing. The "symmetric" term comes from the fact that all CPUs on the system share the same physical address space, and are treated as equal by the OS. There is nuance to this, but that is a topic for another time. Likewise, the alternative, Asymmetric Multiprocessing (AMP), is a subject that I would also like to discuss, at a later time.
CC-0.03 Applications of Concurrency via Multithreading
Thus far, we've mostly talked about what concurrency is, and how the OS achieves it. I've also alluded to the advantages of using threads to allow an application to perform tasks concurrently, such as the fact that one thread can block on I/O while the others continue, but haven't really explained what that means, or why we care.
Let's change that.
Let us consider a web server, handling hundreds, thousands, or even millions of simultaneous connections. The server needs to wait for, and accept new connections, while also receiving traffic from, and responding to, existing connections. If that sounds like a lot, that's because it is.
There are a number of ways an application can deal with this kind of situation. Because I have the space now, let's take a look at some of the more traditional options, then we'll look at how threads compare:
Non-blocking I/O - Typically, whenever we want to read or write on a file or network socket, I/O functions will cause our program to block if they can't immediately be completed. When this happens, the OS stops the program, and (typically) switches into the kernel. If the task can't be completed, such as when waiting for network traffic, or waiting for a disk to retrieve or store data, the OS deschedules the task (similar to if its quantum had expired), marks it as "blocked", and won't reschedule it again until the requested operation is completed. If we consider our medical clinic analogy from earlier, this would be as if our doctor needed to wait for test results before proceeding to treat their patient, or perhaps if our patient needed to wait for a specialist to be available. Obviously, if we're handling many network connections, we don't want our program to hang forever waiting on traffic from one client while others are waiting. In order to work around this, we can ask the OS to mark our socket descriptors as non-blocking. When we try to perform I/O on a non-blocking descriptor, the function will always return immediately, even if the operation couldn't be completed. This fixes the issue of our application hanging on an I/O operation, but introduces the issue of backing off when an operation can't be completed. Worse yet, it also introduces the chance of partially completed operations, which will make our design much more complex. On top of all that, we now have the issue of our application needing to poll for when our files are ready using a busy wait (think of a child in the back of a car asking "Are we there yet?" every 5 nanoseconds), even if there's nothing happening. This wastes CPU time, making our application needlessly expensive.
This can be improved somewhat by having the program sleep, often with exponential back off, when there's little to no traffic, however this can cause delays to be introduced when traffic arrives while our application is sleeping. This shows up on the client's end as additional latency, making the connection appear slower than it really is. It's also worth noting that our I/O operations may still incur the cost of making syscalls, even if they fail, which can result in excessive time spent context switching between our program and the kernel to try and perform our I/O operation (context switching is always expensive).
Polling Functions - Rather than trying our I/O operation on every file, and risking blocking, or individually polling on non-blocking files, we can instead use functions like select() and poll(). These functions let us tell the OS which file descriptors we're waiting on, and what events we're waiting for. The functions will block, similar to blocking I/O, but will return when any of our file descriptors are ready (or when a timer runs out), telling us which file descriptors are ready, so that we can be sure that we can actually perform our I/O operation. This can still result in I/O operations blocking, for example, if we're trying to read a specific amount of data, but less data than we want is available. However, when combined with non-blocking I/O, it makes for a pretty efficient solution, eliminating blocking, busy waiting, and unneeded syscalls. It doesn't, however, improve the problem of complexity when dealing with non-blocking I/O.
Asynchronous I/O - If we can't afford to block, but also don't want to deal with the issues of non-blocking I/O, we can instead use asynchronous I/O to inform the OS of the operation we want to perform. The OS will schedule the operation to be performed, and return immediately. Once the requested operation is complete, the OS will inform the process, usually using some form of signal. This sounds good in theory, but if you've ever gotten into the weeds of dealing with signals, you'll know just how awful they can be (Woe, signal coalescing be upon ye.).
Client Processes - In this case, whenever a new client connection comes in, we start an entirely new process. The new process is then free to block on the I/O for the single client that it's responsible for. This actually works quite well, and is sometimes still used today. It has the advantage that the client processes can be scheduled to run in parallel on multiprocessor systems, an advantage none of the above offer. However, as I mentioned previously, there can be rather significant overhead involved in multiprocess applications, particularly if the processes need to communicate with one another, and if the client process needs to perform additional I/O that could block for a significant amount of time, we're right back where we started. For our example case, this would probably be fine, but we're not here to talk about multiprocess design.
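The first two options above can be sketched in a few lines of Python. This is only an illustration, assuming a POSIX-like system: socket.socketpair() stands in for a real client connection, and selectors.DefaultSelector is Python's wrapper around whichever readiness mechanism the platform offers (select(), poll(), or a newer relative), so it may not literally call the functions named above.

```python
import selectors
import socket

# A connected pair of local sockets, standing in for a server and a client.
server_side, client_side = socket.socketpair()
server_side.setblocking(False)

# 1. Non-blocking read with nothing to read: returns immediately with an error
#    instead of putting the program to sleep.
try:
    server_side.recv(1024)
    would_block = False
except BlockingIOError:
    would_block = True

# 2. Readiness polling: ask the OS which descriptors are ready.
sel = selectors.DefaultSelector()
sel.register(server_side, selectors.EVENT_READ)
ready_before = sel.select(timeout=0)  # nothing has been sent yet

client_side.sendall(b"ping")
ready_after = sel.select(timeout=1)   # blocks until readable or the timer runs out
data = server_side.recv(1024) if ready_after else b""

sel.unregister(server_side)
sel.close()
server_side.close()
client_side.close()
print(would_block, len(ready_before), len(ready_after), data)
```

With only one socket this looks like overkill, but the same select() call can watch thousands of descriptors at once, which is where it pays off. A real server would also need to handle partial reads and writes, which this sketch ignores.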
While all of the above solutions can solve the problem of handling an arbitrary number of simultaneous connections, most of them introduce a frustrating level of complexity, with only limited scalability. Only the multiprocess design has the advantage of allowing us to continue using blocking I/O, while scaling well on modern systems, thanks to the fact that the OS is free to schedule the individual processes concurrently, or even in parallel on systems with multiple CPUs. However, as mentioned, if our tasks need to communicate, we have to call out to the OS to allow them to cooperate. It would be nice if we could skip that overhead, allow each task to run concurrently, and continue to use blocking I/O.
Good news! Multithreading allows us to do exactly that.
Rather than starting a full process for every new connection, we can ask the OS to schedule the work within the same address space by requesting it start additional threads. These threads can block in I/O calls without getting in each other's way, and share data by updating shared data within the address space of the process (there's a catch to this, but we'll get to that in later articles). Threads also tend to have (marginally) reduced resource overhead when compared to full processes, as they share much of their resources. While the difference in resource overhead between processes and threads is typically minor, it can become much more significant when dealing with an especially large number of tasks.
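A thread-per-connection design might look something like the following Python sketch. To keep it self-contained, socket.socketpair() is used in place of a listening socket and accept(), and the handler name handle_client and the uppercase "protocol" are invented for the example.

```python
import socket
import threading

def handle_client(conn):
    """Serve one connection using plain blocking I/O."""
    with conn:
        while True:
            data = conn.recv(1024)  # blocks only this thread
            if not data:            # empty read: the peer closed the connection
                break
            conn.sendall(data.upper())

clients = []
threads = []
for _ in range(3):
    # Stand-in for accept(): each pair is one "connection".
    server_end, client_end = socket.socketpair()
    t = threading.Thread(target=handle_client, args=(server_end,))
    t.start()
    threads.append(t)
    clients.append(client_end)

replies = []
for i, c in enumerate(clients):
    c.sendall(f"client {i}".encode())
    replies.append(c.recv(1024).decode())
    c.close()  # the handler sees end-of-file and exits

for t in threads:
    t.join()
print(replies)
```

Each handler is written as if its client were the only one in the world. While the first client is being served, the other two handler threads are sitting blocked in recv(), costing essentially nothing.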
So clearly threads are the magic solution, and we should use them for everything?
Well, no. The entire next article will be dedicated to the basics of threads, and designing ways to work around the problems they present. But, this article is already pretty long, so I think it's about time to wrap up. However, I have just a couple more things I wanted to mention before closing this out.
CC-0.04 Coroutines and Green Threads
It wouldn't be fair to talk about concurrency without mentioning coroutines.
Unlike threads, which are an example of preemptive multitasking (at least, in modern OSs), and are scheduled by the OS, coroutines are cooperative, and scheduled by the program itself. They allow the programmer to define special functions (routines, hence "coroutine" from "cooperative routine") which yield their time after reaching specific points, allowing other scheduled coroutines to run. This allows coroutines to run concurrently, in that the routines will share the CPU time of their parent task and appear to progress independently of each other.
The advantage of coroutines is that the cooperative time sharing prevents long-running routines from hogging the CPU. Additionally, a coroutine implementation may prioritize routines to be ordered in a way that is more efficient for the CPU, though this isn't necessarily guaranteed. They also have very little overhead when compared to threads, as they don't require intervention from the OS. They also don't require many of the special synchronization primitives that full threads require, an issue that will be discussed in the upcoming articles.
The disadvantage of coroutines is that a routine will only yield its time when it reaches a point where it can do so. This means that a routine taking an especially long time can starve other waiting routines of CPU time. They also can't take advantage of SMP, as they're scheduled by a single thread of the application, not by the OS. This means that a blocked coroutine will block the entire thread, and prevent all other scheduled routines within the same thread from running, until the blocked routine can continue. However, there's no reason that a programmer couldn't combine the use of threads with coroutines to get the best of both.
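Python's generators make it easy to sketch both the cooperative behavior and the starvation problem. This is a toy: the scheduler function run and the routines routine and greedy are invented for the example, and real coroutine runtimes (including Python's own asyncio) are considerably more sophisticated.

```python
from collections import deque

def routine(name, steps, log):
    """A cooperative routine: does one step of work, then yields the CPU."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # explicit yield point

def greedy(name, steps, log):
    """Does all of its work before its first yield, starving the others."""
    for i in range(steps):
        log.append(f"{name}{i}")
    yield

def run(routines):
    """A minimal round-robin scheduler running on a single thread."""
    ready = deque(routines)
    while ready:
        r = ready.popleft()
        try:
            next(r)          # resume the routine until its next yield point
            ready.append(r)  # it may have more work: back of the queue
        except StopIteration:
            pass             # the routine has finished

fair_log = []
run([routine("a", 2, fair_log), routine("b", 2, fair_log)])

starved_log = []
run([greedy("g", 3, starved_log), routine("b", 2, starved_log)])
print(fair_log)
print(starved_log)
```

In the first run the two routines interleave step by step. In the second, nothing can force the greedy routine off the CPU, so the other routine doesn't get a turn until all of the greedy one's work is done.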
Adjacent to coroutines are green threads, sometimes called user threads. The difference between coroutines and green threads isn't always obvious, and the terms are sometimes used interchangeably. However, I would draw the distinction that coroutines are purely cooperative, are usually included as part of a language, and allow complete routines to be defined as singular functions. Green threads, on the other hand, are typically provided as an external library, may require breaking routines into smaller, individual functions, and in some more advanced implementations, may even provide primitive forms of preemptive multitasking (likely via some assembly language hackery). Though they still suffer from the limitation that they cannot, on their own, take advantage of SMP, as the threads are still scheduled by the application, not the OS.
Some early pthread implementations were implemented as green threads, as the POSIX standard doesn't define the actual scheduling mechanism to be used for the threads. Though, most modern systems use the pthread interface to provide their native threading implementation, so this shouldn't be something you'll have to worry about.
Final Thoughts
As I've said, this is just the first part of a series of articles. In the next one we'll actually start using threads, and run head-first into the problems that can show up when dealing with them. Particularly when they need to cooperate.
Despite all the work I've put into this article, it's hard to say I'm happy with where it is. Even so, I don't think I can do better without feedback, so I suppose the time has come to simply release it, and see what others have to say. Chances are this article will continue to be revised as time goes on. I hope you found it interesting, if not informative.
For now, I think this is more than long enough. So, go be curious, ask questions, learn things, share knowledge, grow, and be exceptional.
Communicating Sequential Processes
Forget for a while about computers and computer programming, and think instead about objects in the world around us, which act and interact with us and with each other in accordance with some characteristic pattern of behaviour.
The two approaches share their most important characteristic, namely a sound mathematical basis for reasoning about specifications, designs and implementations; and either of them can be used for both theoretical investigations and for practical applications.
— C.A.R. Hoare
We don't talk about memory_order_consume.

Probably won't make a whole blog post about this for a long time. But I think the idea of "atomic variables" is silly, and felt like shitposting about it, so enjoy some memes. The word "atomic" describes behavior. How do you read and write the data? Atomically! The data itself doesn't have behavior. It's just a noun, it doesn't go around verbing on its own. It's the act of access that can be described as atomic or transactional, which is something you do to the data. And this is why, if you look at what's actually happening on the hardware, we have a variety of atomic instructions (like compare-and-swap), but no atomic types. I'm not saying we should take all our best practices from assembly, but this is an example of a design being coherent and honest, because you have to be when doing hardware. So why are atomic types a thing in multiple languages? Well, I think you can directly blame OOP, which is usually a safe choice in the same way that you can blame everything on capitalism and usually be correct. Ascribing behavior to nouns is classic OOP-brain. It's not just any flavor of deranged, it's the specific flavor where someone writes methods all day, and then needs an ergonomic abstraction for concurrency and says "oh heck I know just the hammer for this nail."
Python 3.14t: Free-Threaded Python and the End of the GIL
After decades of discussion and multiple failed attempts, Python has finally achieved what many thought impossible: true multi-threaded parallelism without the Global Interpreter Lock (GIL). Python 3.14t (the “t” stands for “threaded”) represents the culmination of PEP 703 and years of careful engineering. This comprehensive guide explores what free-threaded Python means for your applications,…
Java Concurrency – Iter Academy
👨🏫 Instructor: Iter Academy ⭐ Rating: ⭐⭐⭐⭐☆ (Popular for advanced Java learners) ⏱ Duration: Self-paced 📚 Category: Java • Concurrency • Multithreading
🔗 Free Course Link: https://t.me/LearnUdemyFree
💡 Master advanced Java concepts — threads, concurrency, synchronization, parallel processing — essential for high-performance applications! ⏳ Limited-time free coupon — enroll now before it expires!