Disclaimer: this guide focuses on upcoming features, currently only available via pull request.
Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially—with one completing before the next starts.
This is a property of a system—whether a program, computer, or a network—where there is a separate execution point or “thread of control” for each process. A concurrent system is one where a computation can advance without waiting for all other computations to complete.
Concurrent computing is a form of modular programming. In its paradigm an overall computation is factored into subcomputations that may be executed concurrently. Pioneers in the field of concurrent computing include Edsger Dijkstra, Per Brinch Hansen, and C.A.R. Hoare.
[Source: English Wikipedia.]
In simpler terms, concurrent execution means two things happen at once. This is great, but how do you do it in OpenFL/Lime?
Choosing the right tool for the job
This guide covers three classes. Lime’s two concurrency classes, and Thread
, the standard class they’re based on.
But before you pick a class, first consider whether you should use threads at all.
- Can you detect any slowdown? If not, threads won’t help, and may even slow things down.
- How often do your threads interact with the outside world? The more often they transfer information, the slower and less safe they’ll be.
If you have a slow and self-contained task, that’s when you consider using threads.
Demo project
I think a specific example will make this guide easier to follow. Suppose I’m using libnoise to generate textures. I’ve created a feature-complete app, and the core of the code looks something like this:
private function generatePattern(workArea:Rectangle):Void {
//Allocate four bytes per pixel.
var bytes:ByteArray = new ByteArray(
Std.int(workArea.width) * Std.int(workArea.height));
//Run getValue() for every pixel.
for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
//getValue() returns a value in the range [-1, 1], and we need
//to convert to [0, 255].
var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
if(value > 255) {
value = 255;
} else if(value < 0) {
value = 0;
}
//Store it as a color.
bytes.writeInt(value << 16 | value << 8 | value);
}
}
//Draw the pixels to the canvas.
bytes.position = 0;
canvas.setPixels(workArea, bytes);
bytes.clear();
}
The problem is, this code makes the app lock up. Sometimes for a fraction of a second, sometimes for seconds on end. It all depends on which pattern it’s working on.
(If you have a beefy computer and this looks fine to you, try fullscreen.)
A good user interface responds instantly when the user clicks, rather than locking up. Clearly this app needs improvement, and since the bulk of the work is self-contained, I decide I’ll solve this problem using threads. Now I have two problems.
The easiest option is to use Haxe’s Thread
class. Since I know a single function is responsible for the freezing, all I need to do is change how I call that function.
-generatePattern(new Rectangle(0, 0, canvas.width, canvas.height));
+Thread.create(generatePattern.bind(new Rectangle(0, 0, canvas.width, canvas.height)));
View full changes
Thread.create()
requires a zero-argument function, so I use bind()
to supply the rectangle argument. With that done, create()
makes a new thread, and the app no longer freezes.
I’d love to show this in action, but it doesn’t work in HTML5. Sorry.
The downside is, the app now prints a bunch of “null pointer” messages. It turns out I’ve added a race condition.
Thread safety basics
The problem with Haxe’s threads is the fact that they’re just so convenient. You can access any variable from any thread, which is great if you don’t mind all the subtle errors.
My generatePattern()
function has two problem variables:
module
is a class variable, and the main thread updates it with every click. However, generatePattern()
assumes module
will stay the same the whole time. Worse, module
briefly becomes null each time it changes, and that can cause the “null pointer” race condition I mentioned above.
canvas
is also a class variable, which is modified during generatePattern()
. If multiple threads are going at once, it’s possible to modify canvas
from two threads simultaneously. canvas
is a BitmapData
, so I suspect it will merely produce a garbled image. If you do the same to other object types, it could permanently break that object.
Before I go into too much detail, let’s try a simple solution.
-Thread.create(generatePattern.bind(new Rectangle(0, 0, canvas.width, canvas.height)));
+lastCreatedThread = Thread.create(module, generatePattern.bind(new Rectangle(0, 0, canvas.width, canvas.height)));
-private function generatePattern(workArea:Rectangle):Void {
+private function generatePattern(module:ModuleBase, workArea:Rectangle):Void {
//Allocate four bytes per pixel.
var bytes:ByteArray = new ByteArray(
Std.int(workArea.width) * Std.int(workArea.height));
//Run getValue() for every pixel.
for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
//getValue() returns a value in the range [-1, 1], and we need
//to convert to [0, 255].
var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
if(value > 255) {
value = 255;
} else if(value < 0) {
value = 0;
}
//Store it as a color.
bytes.writeInt(value << 16 | value << 8 | value);
}
}
+ //If another thread was created after this one, don't draw anything.
+ if(Thread.current() != lastCreatedThread) {
+ return;
+ }
+
//Draw the pixels to the canvas.
bytes.position = 0;
canvas.setPixels(workArea, bytes);
bytes.clear();
}
View full changes
Step one, pass module
as an argument. That way, the function won’t be affected when the class variable changes. Step two, enforce a rule that only the last-created thread can modify canvas
.
Even then, there’s still at least one theoretical race condition in the above block of code. Can you spot it?
Whether or not you find it isn’t the point I’m trying to make. My point is that thread safety is hard, and you shouldn’t try to achieve it alone. I can spot several types of race condition, and I still don’t trust myself to write perfect code. No, if you want thread safety, you need some guardrails. Tools and design patterns that can take the guesswork out.
My favorite rule of thumb is that every object belongs to one thread, and only that thread may modify that value. And if possible, only that thread should access the value, though that’s less important. Oftentimes, this means making a copy of a value before passing it, so that the receiving thread can own the copy. This rule of thumb means generatePattern()
can’t call canvas.setPixels()
as shown above, since the main thread owns canvas
. Instead, it should send a thread-safe message back and allow the main thread to set the pixels.
And guess what? Lime’s Future
and ThreadPool
classes provide just the tools you need to do that. In fact, they’re designed as a blueprint for thread-safe code. If you follow the blueprint they offer, and you remember to copy your values when needed, your risk will be vastly reduced.
Lime’s Future
class is based on the general concept of futures and promises, wherein a “future” represents a value that doesn’t exist yet, but will exist in the future (hence the name).
For instance, BitmapData.loadFromFile()
returns a Future<BitmapData>
, representing the image that will eventually exist. It’s still loading for now, but if you add an onComplete
listener, you’ll get the image as soon as it’s ready.
I want to do pretty much the exact same thing in my sample app, creating a Future<BitmapData>
that will wait for the value returned by generatePattern()
. For this to work, I need to rewrite generatePattern()
so that it actually does return a value.
As discussed under thread safety basics, I want to take both module
and workArea
as arguments. However, Future
limits me to one argument, so I combine my two values into one anonymous structure named state
.
-private function generatePattern(workArea:Rectangle):Void {
+private static function generatePattern(state: { module:ModuleBase, workArea:Rectangle }):ByteArray {
//Allocate four bytes per pixel.
var bytes:ByteArray = new ByteArray(
Std.int(workArea.width) * Std.int(workArea.height));
//Run getValue() for every pixel.
for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
//getValue() returns a value in the range [-1, 1], and we need
//to convert to [0, 255].
var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
if(value > 255) {
value = 255;
} else if(value < 0) {
value = 0;
}
//Store it as a color.
bytes.writeInt(value << 16 | value << 8 | value);
}
}
- //Draw the pixels to the canvas.
- bytes.position = 0;
- canvas.setPixels(workArea, bytes);
- bytes.clear();
+ return bytes;
}
Now I call the function, listen for the return value, and draw the pixels.
-generatePattern(new Rectangle(0, 0, canvas.width, canvas.height));
+future = Future.withEventualValue(generatePattern, { module: module, workArea: new Rectangle(0, 0, canvas.width, canvas.height) }, MULTI_THREADED);
+
+//Store a copy of future
at this point in time.
+var expectedFuture:Future<ByteArray> = future;
+
+//Add a listener for later.
+future.onComplete(function(bytes:ByteArray):Void {
+ //If another thread was created after this one, don't draw anything.
+ if(future != expectedFuture) {
+ return;
+ }
+
+ //Draw the pixels to the canvas.
+ bytes.position = 0;
+ canvas.setPixels(new Rectangle(0, 0, canvas.width, canvas.height), bytes);
+ bytes.clear();
+});
View full changes
This event listener always runs on the main thread, meaning only the main thread ever updates canvas
, which is super helpful for thread safety. I still check whether another thread was created, but that’s only to make sure I’m drawing the right image, not because there’s a risk of two being drawn at once.
And this time, I can show you an HTML5 demo! Thanks to the use of threads, the app responds instantly after every click.
I should probably also mention that I set Future.FutureWork.maxThreads = 2
. This means you can have two threads running at once, but any more will have to wait. Click enough times in a row, and even fast patterns will become slow. Not because they themselves slowed down, but because they’re at the back of the line. The app has to finish calculating all the previous patterns first.
(If the problem isn’t obvious from the small demo, try fullscreen.)
This seems pretty impractical. Why would the app spend all this time calculating the old patterns when it knows it won’t display them? Well, the reason is that you can’t cancel a Future
once started. For that, and for other advanced features, you want to use ThreadPool
directly instead of indirectly.
Oh yeah, did I mention that Future
is built on top of ThreadPool
? Hang on while I go check. …Apparently I never mentioned it. Well, Future
is built on top of ThreadPool
. It tries to provide the same features in a more convenient way, but doesn’t provide all the features. Canceling jobs, sending progress updates,
Thread pools are a common way to make threads more efficient. It takes time to start up and shut down a thread, so why not reuse it instead? Lime’s ThreadPool
class follows this basic pattern, though it prioritizes cross-platform compatibility, thread safety, and ease of use over performance.
When using ThreadPool
, you’ll also need to be aware of its parent class, WorkOutput
, as that’s your ticket to thread-safe message transfer. You’ll receive a WorkOutput
instance as an argument (with the benefit that it can’t become null unexpectedly), and it has all the methods you need for communication.
sendComplete()
and sendError()
convey that your job succeeded/failed. When you call one of them, ThreadPool
dispatches onComplete
or onError
as appropriate, and then initiates the thread recycling process. Don’t call them if you aren’t done!
sendProgress()
works differently: you can call it as much as you like, with whatever type of data you like. It has no special meaning other than what you come up with. Unsurprisingly, sendProgress()
corresponds to onProgress
.
generatePattern()
only needs sendComplete()
, at least for now.
-private function generatePattern(workArea:Rectangle):Void {
+private static function generatePattern(state: { module:ModuleBase, workArea:Rectangle }, output:WorkOutput):Void {
//Allocate four bytes per pixel.
var bytes:ByteArray = new ByteArray(
Std.int(workArea.width) * Std.int(workArea.height));
//Run getValue() for every pixel.
for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
//getValue() returns a value in the range [-1, 1], and we need
//to convert to [0, 255].
var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
if(value > 255) {
value = 255;
} else if(value < 0) {
value = 0;
}
//Store it as a color.
bytes.writeInt(value << 16 | value << 8 | value);
}
}
- //Draw the pixels to the canvas.
- bytes.position = 0;
- canvas.setPixels(workArea, bytes);
- bytes.clear();
+ output.sendComplete(bytes, [bytes]);
}
Hmm, what’s up with “sendComplete(bytes, [bytes])
“? Looks kind of redundant.
Well, each of the “send” functions takes an optional array argument that improves performance in HTML5. It’s great for transferring ByteArray
s and similar packed data containers, but be aware that these containers will become totally unusable. That’s no problem at the end of the function, but be careful if using this with sendProgress()
.
With generatePattern()
updated, the next step to initialize my ThreadPool
.
//minThreads = 1, maxThreads = 1.
threadPool = new ThreadPool(1, 1, MULTI_THREADED);
threadPool.onComplete.add(function(bytes:ByteArray):Void {
//Draw the pixels to the canvas.
bytes.position = 0;
canvas.setPixels(new Rectangle(0, 0, canvas.width, canvas.height), bytes);
bytes.clear();
});
This time, I didn’t include a “latest thread” check. Instead, I plan to cancel old jobs, ensuring that they never dispatch an onComplete
event at all.
-generatePattern(new Rectangle(0, 0, canvas.width, canvas.height));
+threadPool.cancelJob(jobID);
+jobID = threadPool.run(generatePattern, { module: module, workArea: new Rectangle(0, 0, canvas.width, canvas.height) });
This works well enough in the simplest case, but the full app actually isn’t this simple. The full app actually has several classes listening for events, and they all receive each other’s events. To solve this, they each have to filter.
Allow me to direct your attention to ThreadPool.activeJob
. This variable is made available specifically during onComplete
, onError
, or onProgress
events, and it tells you where the event came from.
threadPool.onComplete.add(function(bytes:ByteArray):Void {
+ if(threadPool.activeJob.id != jobID) {
+ return;
+ }
+
//Draw the pixels to the canvas.
bytes.position = 0;
canvas.setPixels(new Rectangle(0, 0, canvas.width, canvas.height), bytes);
bytes.clear();
});
View full changes
Now, let’s see how the demo looks.
It turns out, setting maxThreads = 1
was a bad idea. Even calling cancelJob()
isn’t enough: the app still waits to finish the current job before starting the next. (As before, viewing in fullscreen may make the problem more obvious.)
When a function has already started, cancelJob()
does two things: (1) it bans the function call from dispatching events, and (2) it politely encourages the function to exit. There’s no way to force it to stop, so polite requests are all we get. If only generatePattern()
was more cooperative.
Virtual threads
Also known as green threads for historical reasons, virtual threads are what happens when you want thread-like behavior in a single-threaded environment.
As it happens, it was JavaScript’s definition of “async” that gave me the idea for this feature. JavaScript’s async
keyword runs a function right on the main thread, but sometimes puts that function on pause to let other functions run. Only one thing ever runs at once, but since they take turns, it still makes sense to call them “asynchronous” or “concurrent.”
Most platforms don’t support anything like the async
keyword, but we can imitate the behavior by exiting the function and starting it again later. Doesn’t sound very convenient, but unlike some things I tried, it’s simple, it’s reliable, and it works on every platform.
Exiting and restarting forms the basis for Lime’s virtual threads: instead of running a function on a background thread, run a small bit of that function each frame. The function is responsible for returning after a brief period, because if it takes too long the app won’t be able to draw the next frame in time. Then ThreadPool
or FutureWork
is responsible for scheduling it again, so it can continue. This behavior is also known as “cooperative multitasking” – multitasking made possible by functions voluntarily passing control to one another.
Here’s an outline for a cooperative function.
- The first time the function is called, it performs initialization and does a little work.
- By the end of the call, it stores its progress for later.
- When the function is called again, it checks for stored progress and determines that this isn’t the first call. Using this stored data, it continues from where it left off, doing a little more work. Then it stores the new data and exits again.
- Step 3 repeats until the function detects an end point. Then it calls
sendComplete()
or (if using Future
) returns a non-null value.
ThreadPool
or FutureWork
stops calling the function, and dispatches the onComplete
event.
This leaves the question of where you should store that data. In single-threaded mode, you can put it wherever you like. However, this type of cooperation is also useful in multi-threaded mode so that functions can be canceled, and storing data in class variables isn’t always thread safe. Instead, I recommend using the state
argument. Which is, incidentally, why I like to call it “state
.” It provides the initial input and stores progress.
Typically, state
will have some mandatory values (supplied by the caller) and some optional ones (initialized and updated by the function itself). If the optional ones are missing, that indicates it’s the first iteration.
-private static function generatePattern(state: { module:ModuleBase, workArea:Rectangle }, output:WorkOutput):Void {
+private static function generatePattern(state: { module:ModuleBase, workArea:Rectangle, ?y:Int, ?bytes:ByteArray }, output:WorkOutput):Void {
- //Allocate four bytes per pixel.
- var bytes:ByteArray = new ByteArray(
- Std.int(workArea.width) * Std.int(workArea.height));
+ var bytes:ByteArray = state.bytes;
+
+ //If it's the first iteration, initialize the optional values.
+ if(bytes == null) {
+ //Allocate four bytes per pixel.
+ state.bytes = bytes = new ByteArray(
+ Std.int(workArea.width) * Std.int(workArea.height));
+
+ state.y = Std.int(workArea.top);
+ }
+
+ //Each iteration, determine how much work to do.
+ var endY:Int = state.y + (output.mode == MULTI_THREADED ? 50 : 5);
+ if(endY > Std.int(workArea.bottom)) {
+ endY = Std.int(workArea.bottom);
+ }
//Run getValue() for every pixel.
- for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
+ for(y in state.y...endY) {
for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
//getValue() returns a value in the range [-1, 1], and we need
//to convert to [0, 255].
var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
if(value > 255) {
value = 255;
} else if(value < 0) {
value = 0;
}
//Store it as a color.
bytes.writeInt(value << 16 | value << 8 | value);
}
}
+ //Save progress.
+ state.y = endY;
+
+ //Don't call sendComplete() until actually done.
+ if(state.y >= Std.int(workArea.bottom)) {
output.sendComplete(bytes, [bytes]);
+ }
}
Note that I do more work per iteration in multi-threaded mode. There’s no need to return too often; just often enough to exit if the job’s been canceled. It also incurs overhead in HTML5, so it’s best not to overdo it.
Single-threaded mode is the polar opposite. There’s minimal overhead, and you get better timing if the function is very short. Ideally, short enough to run 5+ times a frame with time left over. On a slow computer, it’ll automatically reduce the number of times per frame to prevent lag.
Next, I tell ThreadPool
to use single-threaded mode, and I specify a value of 3/4. This value indicates what fraction of the main thread’s processing power should be spent on this ThreadPool
. I’ve elected to take up 75% of it, leaving 25% for other tasks. Since I know those other tasks aren’t very intense, this is plenty.
-threadPool = new ThreadPool(1, 1, MULTI_THREADED);
+threadPool = new ThreadPool(1, 1, SINGLE_THREADED, 3/4);
View full changes
Caution: reduce this number if creating multiple single-threaded ThreadPool
s. workLoad
s from different pools add together, and can easily add up to 1 or more. That means 100% (or more) of the available time each frame gets spent on virtual threads, slowing the app down.
In any case, it’s time for another copy of the demo. Since we’re nearing the end, I also went ahead and implemented progress events. Now you can watch the progress in (closer to) real time.
These changes also benefit multi-threaded mode, so I created another multi-threaded version for comparison. With progress events, you can now see the slight pause when it spins up a new web worker (which isn’t that often, since it keeps two of them running).
(For comparison, here they both are in fullscreen: virtual threads, web workers.)
I don’t know, I like them both. Virtual threads have the benefit of being lighter weight, while web workers have the benefit of being real threads, meaning you could run eight in parallel without slowing the main thread.
My advice? Write code that works both ways, as shown in this guide. Keep your options open, since the configuration that works best for a small app may not be what works best for a big one. Good luck out there!