December 2021 – I Call Haxe!

Guide to threads in Lime

Disclaimer: this guide focuses on upcoming features, currently only available via Git.

Concurrent computing

Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially—with one completing before the next starts.

This is a property of a system—whether a program, computer, or a network—where there is a separate execution point or “thread of control” for each process. A concurrent system is one where a computation can advance without waiting for all other computations to complete.

Concurrent computing is a form of modular programming. In its paradigm an overall computation is factored into subcomputations that may be executed concurrently. Pioneers in the field of concurrent computing include Edsger Dijkstra, Per Brinch Hansen, and C.A.R. Hoare.

[Source: English Wikipedia.]

In simpler terms, concurrent execution means two things happen at once. This is great, but how do you do it in OpenFL/Lime?

Choosing the right tool for the job

This guide covers three classes: Lime’s two concurrency classes, and Thread, the standard class they’re based on.

Class	`Thread`	`Future`	`ThreadPool`
Source	Haxe	Lime	Lime
Ease of use	★★★★★	★★★★☆	★★★☆☆
Thread safety	★☆☆☆☆	★★★★☆	★★★★☆
HTML5 support	No	Yes	Yes

But before you pick a class, first consider whether you should use threads at all.

Can you detect any slowdown? If not, threads won’t help, and may even slow things down.
How often do your threads interact with the outside world? The more often they transfer information, the slower and less safe they’ll be.

If you have a slow and self-contained task, that’s when you consider using threads.

Demo project

I think a specific example will make this guide easier to follow. Suppose I’m using libnoise to generate textures. I’ve created a feature-complete app, and the core of the code looks something like this:

private function generatePattern(workArea:Rectangle):Void {
    //Allocate four bytes per pixel.
    var bytes:ByteArray = new ByteArray(
        Std.int(workArea.width) * Std.int(workArea.height) * 4);
    
    //Run getValue() for every pixel.
    for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
        for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
            //getValue() returns a value in the range [-1, 1], and we need
            //to convert to [0, 255].
            var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
            
            if(value > 255) {
                value = 255;
            } else if(value < 0) {
                value = 0;
            }
            
            //Store it as a color.
            bytes.writeInt(value << 16 | value << 8 | value);
        }
    }
    
    //Draw the pixels to the canvas.
    bytes.position = 0;
    canvas.setPixels(workArea, bytes);
    bytes.clear();
}
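
To make the arithmetic in the inner loop concrete, here’s a small standalone sketch in plain JavaScript (not the app’s Haxe code). The helper names toGray() and packGray() are made up for illustration, and Math.trunc() stands in for Std.int(), which likewise truncates toward zero.

```javascript
// Convert a noise value in [-1, 1] to a gray level in [0, 255].
// Math.trunc() plays the role of Haxe's Std.int() here.
function toGray(noiseValue) {
    const value = Math.trunc(128 + 128 * noiseValue);
    // Clamp, since an input of exactly 1 maps to 256.
    return Math.min(255, Math.max(0, value));
}

// Pack one gray level into the red, green, and blue channels.
function packGray(value) {
    return (value << 16) | (value << 8) | value;
}

console.log(toGray(-1), toGray(0), toGray(1)); // 0 128 255
console.log(packGray(toGray(0)).toString(16)); // 808080
```

The clamp matters mostly at the top end: without it, a noise value of 1 would overflow the 8-bit channel.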

The problem is, this code makes the app lock up. Sometimes for a fraction of a second, sometimes for seconds on end. It all depends on which pattern it’s working on.

(If you have a beefy computer and this looks fine to you, try fullscreen.)

A good user interface responds instantly when the user clicks, rather than locking up. Clearly this app needs improvement, and since the bulk of the work is self-contained, I decide I’ll solve this problem using threads. ~~Now I have two problems.~~

Using `Thread`

The easiest option is to use Haxe’s Thread class. Since I know a single function is responsible for the freezing, all I need to do is change how I call that function.

-generatePattern(new Rectangle(0, 0, canvas.width, canvas.height));
+Thread.create(generatePattern.bind(new Rectangle(0, 0, canvas.width, canvas.height)));

Thread.create() requires a zero-argument function, so I use bind() to supply the rectangle argument. With that done, create() makes a new thread, and the app no longer freezes.
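
If bind() is unfamiliar, this rough JavaScript sketch shows the same idea of pre-filling an argument to get a zero-argument function. The describeArea() function is invented for the example, and note one difference: JavaScript’s bind() takes a `this` value first, while Haxe’s does not.

```javascript
// A function that needs one argument, like generatePattern(workArea).
function describeArea(workArea) {
    return workArea.width + "x" + workArea.height;
}

// bind() pre-fills the argument and returns a zero-argument function.
// (The leading null is the `this` value, which this function ignores.)
const job = describeArea.bind(null, { width: 640, height: 480 });

// Nothing has run yet. The work happens only when the job is called,
// which is what lets a thread API decide when and where to call it.
console.log(job.length); // 0
console.log(job()); // 640x480
```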

I’d love to show this in action, but it doesn’t work in HTML5. Sorry.

The downside is, the app now prints a bunch of “null pointer” messages. It turns out I’ve added a race condition.

Thread safety basics

The problem with Haxe’s threads is the fact that they’re just so convenient. You can access any variable from any thread, which is great if you don’t mind all the subtle errors.

My generatePattern() function has two problem variables:

module is a class variable, and the main thread updates it with every click. However, generatePattern() assumes module will stay the same the whole time. Worse, module briefly becomes null each time it changes, and that can cause the “null pointer” race condition I mentioned above.
canvas is also a class variable, which is modified during generatePattern(). If multiple threads are going at once, it’s possible to modify canvas from two threads simultaneously. canvas is a BitmapData, so I suspect it will merely produce a garbled image. If you do the same to other object types, it could permanently break that object.
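
The first of those problems can be imitated without real threads. The sketch below is plain JavaScript, so it’s single-threaded; a generator that pauses after each row stands in for the background thread, and the names (shared, unsafeJob, runWithInterruption) are hypothetical. It’s meant only to show the shape of the bug, not how Haxe schedules threads.

```javascript
// Shared state, standing in for the class variable `module`.
const shared = { module: { getValue: (x, y) => 0.5 } };

// A background job that pauses after each row. Each pause is a point
// where the "main thread" may run.
function* unsafeJob(rows) {
    const results = [];
    for (let y = 0; y < rows; y++) {
        // Reads the shared variable every time, like the original code.
        results.push(shared.module.getValue(0, y));
        yield;
    }
    return results;
}

// Run the job, letting `interrupt` run once after the first row.
function runWithInterruption(job, interrupt) {
    let step = job.next();
    interrupt();
    while (!step.done) {
        step = job.next();
    }
    return step.value;
}

let errorMessage = null;
try {
    // The "main thread" clears `module` while the job is mid-way.
    runWithInterruption(unsafeJob(3), () => { shared.module = null; });
} catch (error) {
    errorMessage = error.message;
}
console.log(errorMessage !== null); // true: the job hit a null reference
```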

Before I go into too much detail, let’s try a simple solution.

-Thread.create(generatePattern.bind(new Rectangle(0, 0, canvas.width, canvas.height)));
+lastCreatedThread = Thread.create(generatePattern.bind(module, new Rectangle(0, 0, canvas.width, canvas.height)));

-private function generatePattern(workArea:Rectangle):Void {
+private function generatePattern(module:ModuleBase, workArea:Rectangle):Void {
    //Allocate four bytes per pixel.
    var bytes:ByteArray = new ByteArray(
        Std.int(workArea.width) * Std.int(workArea.height) * 4);
    
    //Run getValue() for every pixel.
    for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
        for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
            //getValue() returns a value in the range [-1, 1], and we need
            //to convert to [0, 255].
            var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
            
            if(value > 255) {
                value = 255;
            } else if(value < 0) {
                value = 0;
            }
            
            //Store it as a color.
            bytes.writeInt(value << 16 | value << 8 | value);
        }
    }
    
+   //If another thread was created after this one, don't draw anything.
+   if(Thread.current() != lastCreatedThread) {
+       return;
+   }
+   
    //Draw the pixels to the canvas.
    bytes.position = 0;
    canvas.setPixels(workArea, bytes);
    bytes.clear();
}

Step one, pass module as an argument. That way, the function won’t be affected when the class variable changes. Step two, enforce a rule that only the last-created thread can modify canvas.
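
Here are the same two steps boiled down to a single-threaded JavaScript sketch. A generator stands in for the thread, and every name in it (app, safeJob, startJob, finishJob) is hypothetical. Because nothing here truly runs in parallel, it can only illustrate the intent of the two rules, not prove them safe.

```javascript
// Shared state owned by the "main thread".
const app = { module: { getValue: () => 0.5 }, latestJobId: 0, canvas: null };

// Step one: the job receives `module` as an argument, so later changes
// to app.module can't affect it.
function* safeJob(module, rows) {
    const results = [];
    for (let y = 0; y < rows; y++) {
        results.push(module.getValue(0, y));
        yield; // Pause point where other code may run.
    }
    return results;
}

// Step two: remember which job is newest, and let only that one draw.
function startJob(rows) {
    const jobId = ++app.latestJobId;
    return { jobId, steps: safeJob(app.module, rows) };
}

function finishJob(job) {
    let step = job.steps.next();
    while (!step.done) {
        step = job.steps.next();
    }
    if (job.jobId === app.latestJobId) {
        app.canvas = step.value;
        return true;
    }
    return false; // A newer job exists, so discard this result.
}

const first = startJob(2);
app.module = null; // The main thread swaps modules mid-job...
app.module = { getValue: () => -0.5 };
const second = startJob(2);

console.log(finishJob(first)); // false
console.log(finishJob(second)); // true
console.log(app.canvas); // [ -0.5, -0.5 ]
```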

Even then, there’s still at least one theoretical race condition in the above block of code. Can you spot it?

Whether or not you find it isn’t the point I’m trying to make. My point is that thread safety is hard, and you shouldn’t try to achieve it alone. I can spot several types of race condition, and I still don’t trust myself to write perfect code. No, if you want thread safety, you need some guardrails. Tools and design patterns that can take the guesswork out.

My favorite rule of thumb is that every object belongs to one thread, and only that thread may modify that value. And if possible, only that thread should access the value, though that’s less important. Oftentimes, this means making a copy of a value before passing it, so that the receiving thread can own the copy. This rule of thumb means generatePattern() can’t call canvas.setPixels() as shown above, since the main thread owns canvas. Instead, it should send a thread-safe message back and allow the main thread to set the pixels.
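
As a rough picture of that ownership rule, consider this JavaScript sketch. It’s single-threaded, so a plain array can play the role of the message queue; with real threads you’d need a thread-safe queue instead. The names inbox, workerJob, and drainInbox are invented for the example.

```javascript
// Messages waiting for the main thread, which owns `canvas`.
const inbox = [];
const canvas = { pixels: null };

// The worker owns everything it creates. It never touches `canvas`.
// It only queues a message containing a copy of its result.
function workerJob(workArea) {
    const scratch = new Uint8Array(workArea.width * workArea.height).fill(128);
    inbox.push({ type: "complete", pixels: scratch.slice() });
    // The worker may keep reusing its own buffer afterwards.
    scratch.fill(0);
}

// Only the main thread applies results, one message at a time.
function drainInbox() {
    while (inbox.length > 0) {
        const message = inbox.shift();
        if (message.type === "complete") {
            canvas.pixels = message.pixels;
        }
    }
}

workerJob({ width: 2, height: 2 });
drainInbox();
console.log(Array.from(canvas.pixels)); // [ 128, 128, 128, 128 ]
```

Because the message carries a copy, the worker scribbling over its scratch buffer afterwards has no effect on what the main thread draws.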

And guess what? Lime’s Future and ThreadPool classes provide just the tools you need to do that. In fact, they’re designed as a blueprint for thread-safe code. If you follow the blueprint they offer, and you remember to copy your values when needed, your risk will be vastly reduced.

Using `Future`

Lime’s Future class is based on the general concept of futures and promises, wherein a “future” represents a value that doesn’t exist yet, but will exist in the future (hence the name).

For instance, BitmapData.loadFromFile() returns a Future<BitmapData>, representing the image that will eventually exist. It’s still loading for now, but if you add an onComplete listener, you’ll get the image as soon as it’s ready.
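
If the concept is new to you, a stripped-down future might look something like this. TinyFuture is a hypothetical JavaScript class written for this example; Lime’s Future has more features (errors, progress, and so on) and its internals may differ.

```javascript
// A hypothetical, minimal future.
class TinyFuture {
    constructor() {
        this.isComplete = false;
        this.value = undefined;
        this.listeners = [];
    }

    // Register a listener. If the value already exists, call it now.
    onComplete(listener) {
        if (this.isComplete) {
            listener(this.value);
        } else {
            this.listeners.push(listener);
        }
        return this;
    }

    // Called by whoever produces the value, once it's ready.
    complete(value) {
        if (this.isComplete) {
            return;
        }
        this.isComplete = true;
        this.value = value;
        for (const listener of this.listeners) {
            listener(value);
        }
        this.listeners = [];
    }
}

const received = [];
const future = new TinyFuture();
future.onComplete((image) => received.push("early: " + image));
future.complete("image.png"); // The "load" finishes.
future.onComplete((image) => received.push("late: " + image));
console.log(received); // [ 'early: image.png', 'late: image.png' ]
```

The point is that callers never need to know whether the value is ready yet. They add a listener either way.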

I want to do pretty much the exact same thing in my sample app, creating a Future<BitmapData> that will wait for the value returned by generatePattern(). For this to work, I need to rewrite generatePattern() so that it actually does return a value.

As discussed under thread safety basics, I want to take both module and workArea as arguments. However, Future limits me to one argument, so I combine my two values into one anonymous structure named state.

-private function generatePattern(workArea:Rectangle):Void {
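+private static function generatePattern(state: { module:ModuleBase, workArea:Rectangle }):ByteArray {
+    //Unpack the values that were bundled into `state`.
+    var module:ModuleBase = state.module;
+    var workArea:Rectangle = state.workArea;
+    
    //Allocate four bytes per pixel.
    var bytes:ByteArray = new ByteArray(
        Std.int(workArea.width) * Std.int(workArea.height) * 4);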
    
    //Run getValue() for every pixel.
    for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
        for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
            //getValue() returns a value in the range [-1, 1], and we need
            //to convert to [0, 255].
            var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
            
            if(value > 255) {
                value = 255;
            } else if(value < 0) {
                value = 0;
            }
            
            //Store it as a color.
            bytes.writeInt(value << 16 | value << 8 | value);
        }
    }
    
-    //Draw the pixels to the canvas.
-    bytes.position = 0;
-    canvas.setPixels(workArea, bytes);
-    bytes.clear();
+    return bytes;
}

Now I call the function, listen for the return value, and draw the pixels.

-generatePattern(new Rectangle(0, 0, canvas.width, canvas.height));
+future = Future.withEventualValue(generatePattern, { module: module, workArea: new Rectangle(0, 0, canvas.width, canvas.height) }, MULTI_THREADED);
+
+//Store a copy of `future` at this point in time.
+var expectedFuture:Future<ByteArray> = future;
+
+//Add a listener for later.
+future.onComplete(function(bytes:ByteArray):Void {
+   //If another thread was created after this one, don't draw anything.
+   if(future != expectedFuture) {
+       return;
+   }
+   
+   //Draw the pixels to the canvas.
+   bytes.position = 0;
+   canvas.setPixels(new Rectangle(0, 0, canvas.width, canvas.height), bytes);
+   bytes.clear();
+});

This event listener always runs on the main thread, meaning only the main thread ever updates canvas, which is super helpful for thread safety. I still check whether another thread was created, but that’s only to make sure I’m drawing the right image, not because there’s a risk of two being drawn at once.

And this time, I can show you an HTML5 demo! Thanks to the use of threads, the app responds instantly after every click.

I should probably also mention that I set Future.FutureWork.maxThreads = 2. This means you can have two threads running at once, but any more will have to wait. Click enough times in a row, and even fast patterns will become slow. Not because they themselves slowed down, but because they’re at the back of the line. The app has to finish calculating all the previous patterns first.

(If the problem isn’t obvious from the small demo, try fullscreen.)
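
The back-of-the-line effect is easy to reproduce on paper. This JavaScript sketch simulates a queue with a thread limit; the durations are made-up numbers in arbitrary time units, and simulateQueue() is a hypothetical helper rather than anything Lime provides.

```javascript
// Simulate jobs that start in order, with at most `maxThreads` running
// at once.
function simulateQueue(durations, maxThreads) {
    // The time at which each thread next becomes free.
    const freeAt = new Array(maxThreads).fill(0);
    return durations.map((duration) => {
        // Give the job to whichever thread frees up first.
        let thread = 0;
        for (let i = 1; i < freeAt.length; i++) {
            if (freeAt[i] < freeAt[thread]) {
                thread = i;
            }
        }
        const start = freeAt[thread];
        freeAt[thread] = start + duration;
        return { start, finish: start + duration };
    });
}

// Four slow patterns, then one fast one, all requested at time 0.
const timeline = simulateQueue([4, 4, 4, 4, 1], 2);
console.log(timeline[4]); // { start: 8, finish: 9 }
```

The last job needs only one time unit, yet it finishes at time 9 because it spends eight units waiting behind jobs whose results will never be shown.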

This seems pretty impractical. Why would the app spend all this time calculating the old patterns when it knows it won’t display them? Well, the reason is that you can’t cancel a Future once started. For that, and for other advanced features, you want to use ThreadPool directly instead of indirectly.

Oh yeah, did I mention that Future is built on top of ThreadPool? Hang on while I go check. …Apparently I never mentioned it. Well, Future is built on top of ThreadPool. It tries to provide the same features in a more convenient way, but doesn’t provide all the features. If you want to cancel jobs or send progress updates, you’ll need ThreadPool.

Using `ThreadPool`

Thread pools are a common way to make threads more efficient. It takes time to start up and shut down a thread, so why not reuse it instead? Lime’s ThreadPool class follows this basic pattern, though it prioritizes cross-platform compatibility, thread safety, and ease of use over performance.

When using ThreadPool, you’ll also need to be aware of its parent class, WorkOutput, as that’s your ticket to thread-safe message transfer. You’ll receive a WorkOutput instance as an argument (with the benefit that it can’t become null unexpectedly), and it has all the methods you need for communication.

sendComplete() and sendError() convey that your job succeeded/failed. When you call one of them, ThreadPool dispatches onComplete or onError as appropriate, and then initiates the thread recycling process. Don’t call them if you aren’t done!

sendProgress() works differently: you can call it as much as you like, with whatever type of data you like. It has no special meaning other than what you come up with. Unsurprisingly, sendProgress() corresponds to onProgress.
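
To picture how the three “send” methods relate to the three events, here’s a loose JavaScript imitation. createOutput() is hypothetical, and its choice to ignore anything sent after completion is an assumption of this sketch, not a documented guarantee of Lime’s WorkOutput.

```javascript
// A hypothetical stand-in for the output object. The method names
// mirror the guide. The implementation is guessed for illustration.
function createOutput(listeners) {
    let finished = false;
    return {
        sendProgress(data) {
            if (!finished) listeners.onProgress(data);
        },
        sendComplete(data) {
            if (finished) return;
            finished = true;
            listeners.onComplete(data);
        },
        sendError(data) {
            if (finished) return;
            finished = true;
            listeners.onError(data);
        },
    };
}

const events = [];
const output = createOutput({
    onProgress: (data) => events.push("progress " + data),
    onComplete: (data) => events.push("complete " + data),
    onError: (data) => events.push("error " + data),
});

// A job reports progress as often as it likes, then finishes once.
output.sendProgress("25%");
output.sendProgress("75%");
output.sendComplete("done");
output.sendProgress("too late"); // Ignored in this sketch.
console.log(events); // [ 'progress 25%', 'progress 75%', 'complete done' ]
```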

generatePattern() only needs sendComplete(), at least for now.

-private function generatePattern(workArea:Rectangle):Void {
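+private static function generatePattern(state: { module:ModuleBase, workArea:Rectangle }, output:WorkOutput):Void {
+   //Unpack the values that were bundled into `state`.
+   var module:ModuleBase = state.module;
+   var workArea:Rectangle = state.workArea;
+   
    //Allocate four bytes per pixel.
    var bytes:ByteArray = new ByteArray(
        Std.int(workArea.width) * Std.int(workArea.height) * 4);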
    
    //Run getValue() for every pixel.
    for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
        for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
            //getValue() returns a value in the range [-1, 1], and we need
            //to convert to [0, 255].
            var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
            
            if(value > 255) {
                value = 255;
            } else if(value < 0) {
                value = 0;
            }
            
            //Store it as a color.
            bytes.writeInt(value << 16 | value << 8 | value);
        }
    }
    
-   //Draw the pixels to the canvas.
-   bytes.position = 0;
-   canvas.setPixels(workArea, bytes);
-   bytes.clear();
+   output.sendComplete(bytes, [bytes]);
}

Hmm, what’s up with “sendComplete(bytes, [bytes])”? Looks kind of redundant.

Well, each of the “send” functions takes an optional array argument that improves performance in HTML5. It’s great for transferring ByteArrays and similar packed data containers, but be aware that these containers will become totally unusable. That’s no problem at the end of the function, but be careful if using this with sendProgress().
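
In the browser, this behavior presumably comes from transferable objects, where a buffer is handed over rather than copied. The JavaScript sketch below shows the effect using structuredClone(), which accepts the same kind of transfer list as postMessage(). How Lime wires this up internally is not shown here.

```javascript
// A packed buffer, like the one backing a ByteArray.
const buffer = new ArrayBuffer(16);
new Uint8Array(buffer).fill(255);

// Transfer the buffer instead of copying it.
const received = structuredClone(buffer, { transfer: [buffer] });

console.log(received.byteLength); // 16
console.log(buffer.byteLength); // 0: the original is now detached
```

The data arrives intact on the receiving side, but the sender is left holding an empty, detached buffer. That’s the “totally unusable” part.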

With generatePattern() updated, the next step is to initialize my ThreadPool.

//minThreads = 1, maxThreads = 1.
threadPool = new ThreadPool(1, 1, MULTI_THREADED);
threadPool.onComplete.add(function(bytes:ByteArray):Void {
    //Draw the pixels to the canvas.
    bytes.position = 0;
    canvas.setPixels(new Rectangle(0, 0, canvas.width, canvas.height), bytes);
    bytes.clear();
});

This time, I didn’t include a “latest thread” check. Instead, I plan to cancel old jobs, ensuring that they never dispatch an onComplete event at all.

-generatePattern(new Rectangle(0, 0, canvas.width, canvas.height));
+threadPool.cancelJob(jobID);
+jobID = threadPool.run(generatePattern, { module: module, workArea: new Rectangle(0, 0, canvas.width, canvas.height) });

This works well enough in the simplest case, but the full app isn’t this simple. It has several classes listening for events, and they all receive each other’s events. To solve this, each listener has to filter out the events that aren’t meant for it.

Allow me to direct your attention to ThreadPool.activeJob. This variable is made available specifically during onComplete, onError, or onProgress events, and it tells you where the event came from.

threadPool.onComplete.add(function(bytes:ByteArray):Void {
+   if(threadPool.activeJob.id != jobID) {
+       return;
+   }
+   
    //Draw the pixels to the canvas.
    bytes.position = 0;
    canvas.setPixels(new Rectangle(0, 0, canvas.width, canvas.height), bytes);
    bytes.clear();
});

Now, let’s see how the demo looks.

It turns out, setting maxThreads = 1 was a bad idea. Even calling cancelJob() isn’t enough: the app still waits to finish the current job before starting the next. (As before, viewing in fullscreen may make the problem more obvious.)

When a function has already started, cancelJob() does two things: (1) it bans the function call from dispatching events, and (2) it politely encourages the function to exit. There’s no way to force it to stop, so polite requests are all we get. If only generatePattern() was more cooperative.
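
The difference between a cooperative and an uncooperative function can be sketched in a few lines of JavaScript. Everything here is hypothetical (createJob, cancel, stubborn, cooperative), and for simplicity the cancel request is issued before the function runs rather than part-way through.

```javascript
// A hypothetical job record. cancel() only sets a flag. The work
// function has to notice the flag and stop on its own.
function createJob() {
    return { canceled: false, rowsDone: 0, events: [] };
}

function cancel(job) {
    job.canceled = true;
}

// An uncooperative function: it never checks the flag.
function stubborn(job, rows) {
    for (let y = 0; y < rows; y++) job.rowsDone++;
    // (1) The event is suppressed, but the work was still done.
    if (!job.canceled) job.events.push("complete");
}

// A cooperative function: it checks the flag before every row.
function cooperative(job, rows) {
    for (let y = 0; y < rows; y++) {
        if (job.canceled) return; // (2) Exit as soon as asked.
        job.rowsDone++;
    }
    job.events.push("complete");
}

const a = createJob();
cancel(a);
stubborn(a, 1000);

const b = createJob();
cancel(b);
cooperative(b, 1000);

console.log(a.rowsDone, a.events.length); // 1000 0
console.log(b.rowsDone, b.events.length); // 0 0
```

Neither job dispatches an event, but only the cooperative one actually frees up its thread.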

Green/virtual threads

Green threads are what happens when you want thread-like behavior in a single-threaded environment. (“Virtual threads” can mean the same thing, but Java seems to be claiming the term for something else.)

As it happens, it was JavaScript’s definition of “async” that gave me the idea for this feature. JavaScript’s async keyword runs a function right on the main thread, but sometimes puts that function on pause to let other functions run. Only one thing ever runs at once, but since they take turns, it still makes sense to call them “asynchronous” or “concurrent.”

Most platforms don’t support anything like the async keyword, but we can imitate the behavior by exiting the function and starting it again later. Doesn’t sound very convenient, but unlike some things I tried, it’s simple, it’s reliable, and it works on every platform.

Exiting and restarting forms the basis for Lime’s green threads: instead of running a function on a background thread, run a small bit of that function each frame. The function is responsible for returning after a brief period, because if it takes too long the app won’t be able to draw the next frame in time. Then ThreadPool or FutureWork is responsible for scheduling it again, so it can continue. This behavior is also known as “cooperative multitasking” – multitasking made possible by functions voluntarily passing control to one another.

Here’s an outline for a cooperative function.

1. The first time the function is called, it performs initialization and does a little work.
2. By the end of the call, it stores its progress for later.
3. When the function is called again, it checks for stored progress and determines that this isn’t the first call. Using this stored data, it continues from where it left off, doing a little more work. Then it stores the new data and exits again.
4. Step 3 repeats until the function detects an end point. Then it calls sendComplete() or (if using Future) returns a non-null value.
5. ThreadPool or FutureWork stops calling the function, and dispatches the onComplete event.

This leaves the question of where you should store that data. In single-threaded mode, you can put it wherever you like. However, this type of cooperation is also useful in multi-threaded mode so that functions can be canceled, and storing data in class variables isn’t always thread safe. Instead, I recommend using the state argument. Which is, incidentally, why I like to call it “state.” It provides the initial input and stores progress.

Typically, state will have some mandatory values (supplied by the caller) and some optional ones (initialized and updated by the function itself). If the optional ones are missing, that indicates it’s the first iteration.
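
Before applying this to generatePattern(), here’s the outline in miniature, as a JavaScript sketch. The task (summing integers) and the field names (limit, chunkSize, next, total) are invented for the example, and the while loop is only a stand-in for whatever scheduler keeps calling the function.

```javascript
// Sum the integers from 1 to state.limit, a few at a time.
// Returns null until finished, then returns the total.
function sumInChunks(state) {
    // First call: the optional fields are missing, so initialize them.
    if (state.next === undefined) {
        state.next = 1;
        state.total = 0;
    }

    // Do a little work, then stop.
    const end = Math.min(state.next + state.chunkSize, state.limit + 1);
    for (let i = state.next; i < end; i++) {
        state.total += i;
    }

    // Store progress for the next call.
    state.next = end;

    return state.next > state.limit ? state.total : null;
}

// A stand-in for the scheduler: keep calling until a value comes back.
const state = { limit: 10, chunkSize: 4 };
let calls = 0;
let result = null;
while (result === null) {
    result = sumInChunks(state);
    calls++;
}
console.log(result, calls); // 55 3
```

The caller supplies only limit and chunkSize. The function adds next and total itself, and their absence is how it recognizes the first call.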

-private static function generatePattern(state: { module:ModuleBase, workArea:Rectangle }, output:WorkOutput):Void {
+private static function generatePattern(state: { module:ModuleBase, workArea:Rectangle, ?y:Int, ?bytes:ByteArray }, output:WorkOutput):Void {
    //Unpack the values that were bundled into `state`.
    var module:ModuleBase = state.module;
    var workArea:Rectangle = state.workArea;
    
-   //Allocate four bytes per pixel.
-   var bytes:ByteArray = new ByteArray(
-       Std.int(workArea.width) * Std.int(workArea.height) * 4);
+   var bytes:ByteArray = state.bytes;
+   
+   //If it's the first iteration, initialize the optional values.
+   if(bytes == null) {
+       //Allocate four bytes per pixel.
+       state.bytes = bytes = new ByteArray(
+           Std.int(workArea.width) * Std.int(workArea.height) * 4);
+       
+       state.y = Std.int(workArea.top);
+   }
+   
+   //Each iteration, determine how much work to do.
+   var endY:Int = state.y + (output.mode == MULTI_THREADED ? 50 : 5);
+   if(endY > Std.int(workArea.bottom)) {
+       endY = Std.int(workArea.bottom);
+   }
    
    //Run getValue() for every pixel.
-    for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
+   for(y in state.y...endY) {
        for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
            //getValue() returns a value in the range [-1, 1], and we need
            //to convert to [0, 255].
            var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
            
            if(value > 255) {
                value = 255;
            } else if(value < 0) {
                value = 0;
            }
            
            //Store it as a color.
            bytes.writeInt(value << 16 | value << 8 | value);
        }
    }
    
+   //Save progress.
+   state.y = endY;
+   
+   //Don't call sendComplete() until actually done.
+   if(state.y >= Std.int(workArea.bottom)) {
        output.sendComplete(bytes, [bytes]);
+   }
}

Note that I do more work per iteration in multi-threaded mode. There’s no need to return too often; just often enough to exit if the job’s been canceled. It also incurs overhead in HTML5, so it’s best not to overdo it.

Single-threaded mode is the polar opposite. There’s minimal overhead, and you get better timing if the function is very short. Ideally, short enough to run 5+ times a frame with time left over. On a slow computer, it’ll automatically reduce the number of times per frame to prevent lag.

Next, I tell ThreadPool to use single-threaded mode, and I specify a workLoad of 3/4. This value indicates what fraction of the main thread’s processing power should be spent on this ThreadPool. I’ve elected to take up 75% of it, leaving 25% for other tasks. Since I know those other tasks aren’t very intense, this is plenty.

-threadPool = new ThreadPool(1, 1, MULTI_THREADED);
+threadPool = new ThreadPool(1, 1, SINGLE_THREADED, 3/4);

Caution: reduce this number if creating multiple single-threaded ThreadPools. If two pools each have a workLoad of 3/4, then they’ll take up 150% of the allocated time per frame, and your app will slow down by (at least) 50%. Instead, try to keep the combined workLoad under 1.
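
The arithmetic behind that caution, as a JavaScript sketch. It assumes a pool’s time budget is simply workLoad multiplied by the frame time, which is a simplification for illustration; the 50 FPS figure is chosen only to keep the numbers round.

```javascript
// Time available per frame at a given frame rate, in milliseconds.
function frameTime(fps) {
    return 1000 / fps;
}

// Assumption: a pool's budget is simply workLoad times the frame time.
function poolBudget(workLoad, fps) {
    return workLoad * frameTime(fps);
}

const fps = 50; // 20 ms per frame.
const workLoads = [3 / 4, 3 / 4];
const combined = workLoads.reduce((sum, load) => sum + load, 0);

console.log(poolBudget(3 / 4, fps)); // 15
console.log(combined); // 1.5
console.log(combined * frameTime(fps)); // 30, versus 20 available
```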

In any case, it’s time for another copy of the demo. Since we’re nearing the end, I also went ahead and implemented progress events. Now you can watch the progress in (closer to) real time.

These changes also benefit multi-threaded mode, so I created another multi-threaded version for comparison. With progress events, you can now see the slight pause when it spins up a new web worker (which isn’t that often, since it keeps two of them running).

(For comparison, here they both are in fullscreen: green threads, web workers.)

I don’t know, I like them both. Green threads have the benefit of being lighter weight, while web workers have the benefit of being real threads, meaning you could run eight in parallel without slowing the main thread.

My advice? Write code that works both ways, as shown in this guide. Keep your options open, since the configuration that works best for a small app may not be what works best for a big one. Good luck out there!

Web Workers in Lime

If you haven’t already read my guide to threads, I suggest starting there.

I’ve spent the last month implementing web worker support in Lime. (Edit: and then I spent another month after posting this.) It turned out to be incredibly complicated, and though I did my best to include documentation in the code, I think it’s worth a blog post too. Let’s go over what web workers are, why you might want to use them, and why you might not want to use them.

To save space, I’m going to assume you’ve heard of threads, race conditions, and threads in Haxe.

About `BackgroundWorker` and `ThreadPool`

BackgroundWorker and ThreadPool are Lime’s two classes for safely managing threads. They were added back in 2015, and have stayed largely unchanged since. (Until this past month, but I’ll get to that.)

The two classes fill different roles. BackgroundWorker is ideal for one-off jobs, while ThreadPool is a bit more complex but offers performance benefits when doing multiple jobs in a row.

BackgroundWorker isn’t too different from calling Thread.create() – both make a thread and run a single job. The main difference is that BackgroundWorker builds in safety features.

Recently, Haxe added its own thread pool implementations: FixedThreadPool has a constant number of threads, while ElasticThreadPool tries to add and remove threads based on demand. Lime’s ThreadPool does a combination of the two: you can set the minimum and maximum number of threads, and it will vary within that range based on demand. Plus it offers structure and safety features, just like BackgroundWorker. On the other hand, ThreadPool lacks ElasticThreadPool’s threadTimeout feature, so threads will exit instantly if they don’t have a job to do.

I always hate reinventing the wheel. Why does Lime need a ThreadPool class when Haxe already offers two? (Ignoring the fact that Lime’s came first.) Just because of thread safety? There are other ways to achieve that.

If only Haxe’s thread pools worked in JavaScript…

Web workers

Mozilla describes web workers as “a simple means for web content to run scripts in background threads.” “Simple” is a matter of perspective, but they do allow you to create background threads in JavaScript.

Problem is, they have two fundamental differences from Haxe’s threads, which is why Haxe doesn’t include them in ElasticThreadPool and FixedThreadPool.

Web workers use source code.
Web workers are isolated.

Workers use source code

Web workers execute a JavaScript file, not a JavaScript function. Fortunately, it is usually possible to turn a function back into source code, simply by calling toString(). Usually. Let’s start with how this works in pure JavaScript:

function add(a, b) {
    return a + b;
}

console.log(add(1, 2)); //Output: 3
console.log(add.toString()); //Output:
//function add(a, b) {
//    return a + b;
//}

That first log() call is just to show the function working. The second shows that we get the function source code as a string. It even preserved our formatting!

If we look at the examples in Mozilla’s documentation, we find that toString() goes to great lengths to preserve the original formatting.

`toString()` input	`toString()` output
`function f(){}`	`"function f(){}"`
`class A { a(){} }`	`"class A { a(){} }"`
`function* g(){}`	`"function* g(){}"`
`a => a`	`"a => a"`
`({ a(){} }.a)`	`"a(){}"`
`({ [0](){} }[0])`	`"[0](){}"`
`Object.getOwnPropertyDescriptor({ get a(){} }, "a").get`	`"get a(){}"`
`Object.getOwnPropertyDescriptor({ set a(x){} }, "a").set`	`"set a(x){}"`
`Function.prototype.toString`	`"function toString() { [native code] }"`
`(function f(){}.bind(0))`	`"function () { [native code] }"`
`Function("a", "b")`	`"function anonymous(a\n) {\nb\n}"`

That’s weird. In two of those cases, the function body – the meat of the code – has been replaced with “[native code]”. (That isn’t even valid JavaScript!) As the documentation explains:

If the toString() method is called on built-in function objects or a function created by Function.prototype.bind, toString() returns a native function string

In other words, if we ever call bind() on a function, we can’t get its source code, meaning we can’t use it in a web worker. And wouldn’t you know it, Haxe automatically calls bind() on certain functions.
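
You can check this for yourself in a few lines of JavaScript. hasUsableSource() is a hypothetical helper written for this example; it would also reject a function whose own body happened to contain the text “[native code]”, so treat it as a sketch rather than a robust test.

```javascript
function add(a, b) {
    return a + b;
}

// Hypothetical helper: can this function's source be shipped to a worker?
function hasUsableSource(fn) {
    return !fn.toString().includes("[native code]");
}

const bound = add.bind(null);

console.log(bound(1, 2)); // 3: the bound function still works...
console.log(hasUsableSource(add)); // true
console.log(hasUsableSource(bound)); // false: ...but its body is hidden
```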

Let’s try writing some Haxe code to call toString(). Ideally, we want to write a function in Haxe, have Haxe translate it to JavaScript, and then get its JavaScript source code.

class Test {
    static function staticAdd(a, b) {
        return a + b;
    }
    
    function add(a, b) {
        return a + b;
    }
    
    static function main() {
        var instance = new Test();
        
        trace(staticAdd(1, 2));
        trace(instance.add(2, 3));
        
        #if js
        trace((cast staticAdd).toString());
        trace((cast instance.add).toString());
        #end
    }
    
    inline function new() {}
}

If you try this code, you’ll get the following output:

Test.hx:15: 3
Test.hx:16: 5
Test.hx:18: staticAdd(a,b) {
        return a + b;
    }
Test.hx:19: function() {
    [native code]
}

The first two lines prove that both functions work just fine. staticAdd is printed exactly like it appears in the JavaScript file. But instance.add is all wrong. Let’s look at the JS source to see why:

static main() {
    let instance = new Test();
    console.log("Test.hx:15:",Test.staticAdd(1,2));
    console.log("Test.hx:16:",instance.add(2,3));
    console.log("Test.hx:18:",Test.staticAdd.toString());
    console.log("Test.hx:19:",$bind(instance,instance.add).toString());
}

Yep, there it is. Haxe inserted a call to $bind(), a function that – perhaps unsurprisingly – calls bind().
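
For a sense of what a helper like that buys you and what it costs, here’s a JavaScript sketch. bindMethod() is a hypothetical simplification written in the spirit of $bind(); Haxe’s real helper differs in its details.

```javascript
class Counter {
    constructor() {
        this.count = 0;
    }
    increment() {
        this.count++;
        return this.count;
    }
}

// A hypothetical helper in the spirit of $bind().
function bindMethod(instance, method) {
    return method.bind(instance);
}

const counter = new Counter();

// Without binding, the function forgets which instance it came from.
const loose = counter.increment;
let failed = false;
try {
    loose();
} catch (error) {
    failed = true;
}

// With binding, it works, at the cost of a "[native code]" toString().
const bound = bindMethod(counter, counter.increment);
console.log(failed); // true
console.log(bound()); // 1
console.log(bound.toString().includes("[native code]")); // true
```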

Turns out, Haxe always inserts $bind() when you try to refer to an instance function. This is in fact required: otherwise, the function couldn’t access the instance it came from. But it also means we can’t use instance functions in web workers. Or can we?

After a lot of frustration and effort, I came up with ThreadFunction. Read the source if you want details; otherwise, the one thing to understand is that it can only remove the $bind() call if you convert to ThreadFunction ASAP. If you have a variable (or function argument) representing a function, that variable (or argument) must be of type ThreadFunction.

//Instead of this...
class DoesNotWork {
    public var threadFunction:Dynamic -> Void;
    
    public function new(threadFunction:Dynamic -> Void) {
        this.threadFunction = threadFunction;
    }
    
    public function runThread():Void {
        new BackgroundWorker().run(threadFunction);
    }
}

//...you want to do this.
class DoesWork {
    public var threadFunction:ThreadFunction<Dynamic -> Void>;
    
    public function new(threadFunction:ThreadFunction<Dynamic -> Void>) {
        this.threadFunction = threadFunction;
    }
    
    public function runThread():Void {
        new BackgroundWorker().run(threadFunction);
    }
}

class Main {
    private static function main():Void {
        new DoesWork(test).runThread(); //Success
        new DoesNotWork(test).runThread(); //Error
    }
    
    private static function test(_):Void {
        trace("Hello from a background thread!");
    }
}

Workers are isolated

Once we have our source code, creating a worker is simple. We take the string and add some boilerplate code, then construct a Blob out of this code, then create a URL for the blob, then create a worker for that URL, then send a message to the worker to make it start running. Or maybe it isn’t so simple, but it does work.
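
Spelled out in JavaScript, that chain of steps might look roughly like this. The boilerplate in buildWorkerSource() is a made-up minimal wrapper, far simpler than what Lime generates, and startWorker() only runs in a browser (it relies on Blob, URL.createObjectURL(), and Worker), so it’s defined here but not called.

```javascript
// Step 1: wrap the function's source in boilerplate that calls it
// whenever a message arrives.
function buildWorkerSource(fn) {
    return "const job = " + fn.toString() + ";\n"
        + "onmessage = (event) => postMessage(job(event.data));\n";
}

// Steps 2-5: Blob, URL, Worker, first message. Browser-only.
function startWorker(fn, message, onResult) {
    const blob = new Blob([buildWorkerSource(fn)], { type: "text/javascript" });
    const url = URL.createObjectURL(blob);
    const worker = new Worker(url);
    worker.onmessage = (event) => {
        onResult(event.data);
        worker.terminate();
        URL.revokeObjectURL(url);
    };
    worker.postMessage(message);
    return worker;
}

function double(x) {
    return x * 2;
}

console.log(buildWorkerSource(double));
// In a browser: startWorker(double, 21, (result) => console.log(result));
```

Note that double() is a plain, unbound function. Pass a bound one, and the generated source would contain “[native code]” and fail to parse.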

Web workers execute a JavaScript source file. The code in the file can only access other code in that file, plus a small number of specific functions and classes. But most of your app resides in the main JS file, and is off-limits to workers.

This is in stark contrast to Haxe’s threads, which can access anything. Classes, functions, variables, you name it. Sharing memory like this does of course allow for race conditions, but as mentioned above, BackgroundWorker and ThreadPool help prevent those.

For a simple example:

class Main {
    private static var luckyNumber:Float;
    
    private static function main():Void {
        luckyNumber = Math.random() * 777;
        new BackgroundWorker().run(test);
    }
    
    private static function test(_):Void {
        trace("Hello from a background thread!");
        trace("Your lucky number is: " + luckyNumber);
    }
}

On most targets, any thread can access the Main.luckyNumber variable, so test() will work. But in JavaScript, neither Main nor luckyNumber will have been defined in the worker’s file. And even if they were defined in that file, they’d just be copies. The value will be wrong, and the main thread won’t receive any changes made.

So… how do you transfer data?

Passing messages

I’ve glossed over this so far, but BackgroundWorker.run() takes up to two arguments. The first, of course, is the ThreadFunction to run. The second is a message to pass to that function, which can be any type. (And if you need multiple values, you can pass an array.)

Originally, BackgroundWorker was designed to be run multiple times, each time reusing the same function but working on a new set of data. It wasn’t well-optimized (ThreadPool is much more appropriate for that) nor well-tested, but it was very convenient for implementing web workers.

See, web workers also have a message-passing protocol, allowing us to send an object to the background thread. You know, an object like BackgroundWorker.run()’s second argument:

class Main {
    private static var luckyNumber:Float;
    
    private static function main():Void {
        luckyNumber = Math.random() * 777;
        new BackgroundWorker().run(test, luckyNumber);
    }
    
    private static function test(luckyNumber:Float):Void {
        trace("Hello from a background thread!");
        trace("Your lucky number is: " + luckyNumber);
    }
}

The trick is, instead of trying to access Main.luckyNumber (which is on the main thread), test() takes an argument, which is the same value except copied to the worker thread. You can actually transfer a lot of data this way:

new BackgroundWorker().run(test, {
    luckyNumber: Math.random() * 777,
    imageURL: "https://www.example.com/image.png",
    cakeRecipe: File.getContent("cake.txt"),
    calendar: Calendar.getUpcomingEvents(10)
});

Bear in mind that your message will be copied using the structured clone algorithm, a deep copy algorithm that cannot copy functions. This sets limits on what kinds of messages you can pass. You can’t pass a function without first converting it to ThreadFunction, nor can you pass an object that contains functions, such as a class instance.
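You can see both halves of that rule in plain JavaScript using structuredClone(), which runs the same algorithm without needing a worker. This is only a sketch of the browser’s behavior, not of anything Lime does, and it assumes a runtime that provides structuredClone() (current browsers, or Node.js 17 and later).

```javascript
var message = {
    luckyNumber: 7,
    imageURL: "https://www.example.com/image.png",
    nested: { values: [1, 2, 3] }
};

// Plain data is deep-copied, so the copy shares nothing with the original.
var copy = structuredClone(message);
copy.nested.values.push(4);
console.log(message.nested.values.length); // 3
console.log(copy.nested.values.length);    // 4

// An object holding a function is rejected outright.
var cloneError = null;
try {
    structuredClone({ luckyNumber: 7, callback: function () {} });
} catch (error) {
    cloneError = error;
}
console.log(cloneError.name); // "DataCloneError"
```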

Copying your message is key to how JavaScript prevents race conditions: memory is never shared between threads, so two threads can’t accidentally access the same memory location at the wrong time. But if there’s no sharing, how does the main thread get any information back from the worker?

Returning results

Web workers don’t just receive messages, they can send them back. The rules are the same: everything is copied, no functions, etc.

The BackgroundWorker class provides three functions for this, each representing something different: sendProgress() for status updates, sendError() if something goes horribly wrong, and sendComplete() for the final product. (You may recall that workers don’t normally have access to Haxe functions, but these three are inlined. Inline functions work fine.)
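Since a web worker only has a single postMessage() channel, three kinds of message have to be told apart somehow. Here is one way that could work, sketched in plain JavaScript. The { type, message } format and the makeSender() and dispatch() helpers are assumptions made up for illustration, not Lime’s real internals, and structuredClone() stands in for the copy that postMessage() would make.

```javascript
// Hypothetical wire format: { type, message }. Lime's real format may differ.
function makeSender(postMessage) {
    function send(type) {
        return function (message, transferList) {
            postMessage({ type: type, message: message }, transferList || []);
        };
    }
    return {
        sendProgress: send("progress"),
        sendError: send("error"),
        sendComplete: send("complete")
    };
}

// Main-thread side: route each incoming message to the matching handler.
function dispatch(data, handlers) {
    var handler = handlers[data.type];
    if (handler) {
        handler(data.message);
    }
}

var log = [];
var handlers = {
    progress: function (m) { log.push("progress: " + m); },
    error: function (m) { log.push("error: " + m); },
    complete: function (m) { log.push("complete: " + m.join(",")); }
};

// Stand-in for the worker's postMessage(): copy the data, then deliver it.
var sender = makeSender(function (data, transferList) {
    dispatch(structuredClone(data, { transfer: transferList }), handlers);
});

sender.sendProgress(0.5);
sender.sendComplete([1, 2, 3]);
console.log(log.join("; ")); // progress: 0.5; complete: 1,2,3
```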

It’s at about this point we need to talk about another problem with copying data. One common reason to use background threads is to process large amounts of data. Suppose you produce 10 MB of data, and you want to pass it back once finished. Your computer is going to have to make an exact copy of all that data, and it’ll end up taking 20 MB in all. Don’t get me wrong, it’s doable, but it’s hardly ideal.

It’s possible to save both time and memory using transferable objects. If you’ve stored your data in an ArrayBuffer, you can simply pass a reference back to the main thread, no copying required. The worker thread loses access to it, and then the main thread gains access (because unlike Haxe, JavaScript is very strict about sharing memory).

ArrayBuffer can be annoying to use on its own, so it’s fortunate that all the wrappers are natively available. By “wrappers,” I’m talking about Float32Array, Int16Array, UInt8Array, and so on. As long as you can represent your data as a sequence of numbers, you should be able to find a matching wrapper.

Transferring a buffer looks like this: backgroundWorker.sendComplete(buffer, [buffer]). I know that looks redundant, and at first I thought maybe backgroundWorker.sendComplete(null, [buffer]) could work instead. But the trick is, the main thread will only receive the first argument (a.k.a. the message). If the message doesn’t contain some kind of reference to buffer, then the main thread won’t have any way to access buffer.

That said, the two arguments don’t have to be identical. You can pass a wrapper (e.g., an Int16Array) as the message, and transfer the buffer inside: backgroundWorker.sendComplete(int16Array, [int16Array.buffer]). The Int16Array numeric properties (byteLength, byteOffset, and length) will be copied, but the underlying buffer will be moved instead.
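To watch a transfer happen without setting up a worker, you can again use structuredClone(), which accepts the same kind of transfer list as postMessage(). This is a sketch of the underlying JavaScript behavior only (sendComplete() itself isn’t involved), and it assumes a runtime with structuredClone(), such as a current browser or Node.js 17 and later.

```javascript
// The data lives in the ArrayBuffer underneath the Int16Array.
var int16Array = new Int16Array([10, 20, 30, 40]);
var buffer = int16Array.buffer;

// Message: the wrapper. Transfer list: the buffer inside it.
var received = structuredClone(int16Array, { transfer: [buffer] });

console.log(received.length);   // 4
console.log(received[2]);       // 30
console.log(buffer.byteLength); // 0, because the sender's buffer is now detached
console.log(int16Array.length); // 0, and so is the view on top of it

// Transferring a buffer the message never mentions: the sender still
// loses it, but the receiver gets nothing to access it with.
var lost = new ArrayBuffer(8);
var nothing = structuredClone(null, { transfer: [lost] });
console.log(nothing);           // null
console.log(lost.byteLength);   // 0
```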
