Status update: spring 2022

Last time, I said I was done with rabbit holes. I was wrong. Turns out, after escaping one rabbit hole, I promptly fell down the next.

First it was threads. I spent all of March working on an app to demonstrate threads in action. Sure I probably spent more time on it than strictly necessary, but I got a cool demo out of it. Plus, working on a practical example helps catch bugs and iron out rough edges.

With that done, I turned my focus to updating the 21 C++ libraries Lime uses. (Well, 19 out of the 21, anyway.) I forget exactly why I decided it needed to be done then and there, but I think a big part was the fact that people were raising concerns about large unresolved problems in OpenFL/Lime, and this was one I could solve. Technically I started last year, and this was me rolling up my sleeves and finishing it.

A couple stats for comparison: I spent ~4 months working on threads, and produced 53 commits. I spent ~2.5 months working on the C++ libraries, and produced 102 commits. And that’s why you can’t measure effort purely based on “amount of code written.” The thread classes took far more effort to figure out, but a lot of that effort was directed into figuring out which approach to take, and I only committed the version I settled on. The alternative options silently vanished into the ether.

All the while, I spent bits of time here and there working on issue reports. Tracking down odd bugs, adding small features, stuff like that. There are several changes in the pipeline, and if you want the technical details, check out my other post. (Scroll to the bottom for more on the big 19-library update.)

Big changes coming Lime’s way

Lime hasn’t received a new release version in over a year, even though development has kept going the whole time. As we gear up for the next big release, I wanted to take a look at what’s on the way.

Merged changes

These changes have already been accepted into the develop branch, meaning they’ll be available in the very next release.

Let’s cut right to the chase. We have a bunch of new features coming out:

80f83f6 by joshtynjala ensures that every change submitted to Lime gets tested against Haxe 3. That means everything else in this list is backwards-compatible.
#1456 by m0rkeulv, #1465 by me, and openfl#2481 by m0rkeulv enable “streaming” music from a file. That is to say, it’ll only load a fraction of the file into memory at a time, just enough to keep the song going. Great for devices with limited memory!
- Usage: simply open the sound file with Assets.getMusic() instead of Assets.getSound().
#1519 by Apprentice-Alchemist and #1536 by me update Android’s minimum-sdk-version to 21. (SDK 21 equates to Android 5, which is still very old. Only about 1% of devices still use anything older than that.) We’re trying to strike a balance between “supporting every device that ever existed” and “getting the benefit of new features.”
- Tip: to go back, set <config:android minimum-sdk-version="16" />.
#1510 by ninjamuffin99 and Cheemsandfriends adds support for changing audio pitch. Apparently this feature has been missing since OpenFL 4, but now it’s back!
- Usage: since AS3 never supported pitch, OpenFL probably won’t either. Use Lime’s AudioSource class directly.
81d682d by joshtynjala adds Window.setTextInputRect(), meaning that your text field will remain visible when an onscreen keyboard opens and covers half the app.
#1552 by me adds a brand-new JNISafety interface, which helps ensure that when you call Haxe from Java (using JNI), the code runs on the correct thread.
- Personal story time: back when I started developing for Android, I couldn’t figure out why my apps kept crashing when Java code called Haxe code. In the end I gave up and structured everything to avoid doing that, even at the cost of efficiency. For instance, I wrote my android6permissions library without any callbacks (because that would involve Java calling Haxe). Instead of being able to set an event listener and receive a notification later, you had to actively call hasPermission() over and over (because at least Haxe calling Java didn’t crash). Now thanks to JNISafety, the library finally has the callbacks it always should have had.
2e31ae9 by joshtynjala stores and sends cookies with HTTP requests. Now you can connect to a server or website and have it remember your session.
#1509 by me makes Lime better at choosing which icon to use for an app. Previously, if you had both an SVG and an exact-size PNG, there was no way to use the PNG. Nor was there any way for HXP projects to override icons from a Haxelib.
- Developers: if you’re making a library, set a negative priority to make it easy to override your icon. (Between -1 and -10 should be fine.) If you aren’t making a library and are using HXP, set a positive priority to make your icon override others.

…But that isn’t to say we neglected the bug fixes either:

02617a8 by Dimensionscape fixes an issue causing HTTP requests to slow down.
#1485 by justin-espedal fixes a widely-reported crash on iOS. On specific devices, the garbage collector would be initialized incorrectly, causing the app to crash on startup nearly 100% of the time.
2160962 by joshtynjala adds a missing framework on iOS. Before this, Lime wouldn’t compile to iOS at all. (Well, at least until you added Metal.framework yourself. But it’s much better if Lime does it for you.)
afbd7e1 by me and ShaharMS fixes an error when you click “cancel” in a file selection window.
#1470 by justin-espedal fixes an error when downloading assets from certain hosts (notably, itch.io).
#1533 by pozirk, #1496 by me, and #1453 by 414n fix various Android-related errors.

HashLink update

Pull request #1517 by Apprentice-Alchemist is an enormous change that gets its own section.

For context, HashLink is the new(er) virtual machine for Haxe, intended to replace Neko as the “fast compilation for fast testing” target. While I find compilation to be a little bit slower than Neko, it performs a lot better once compiled.

Prior to now, Lime compiled to HashLink 1.10, which was three years old. Pull request #1517 covers two versions and three years’ worth of updates. From the release notes, we can look forward to:

new HashLink CPU Profiler for JIT
improved 64 bit support – windows release now 64 bit by default
new GC architecture and improvements / bugs fixes
support for hot reload
better stack primitives for faster haxe throw
captured stack for closures (when debugger connected)
and more!

(As someone who uses HashLink for rapid testing, I’m always happy to see debugging/profiling improvements.)

Apprentice-Alchemist put a lot of effort into this one, and it shows. Months of testing, responding to CI errors, and making changes in response to feedback.

Perhaps most importantly in the long run, they designed this update to make future updates easier. That paid off on April 28, when HashLink 1.12 came out at 3:21 PM GMT, and Apprentice-Alchemist had the pull request up to date by 5:45!

Pending changes

Lime has several pull requests that are – as of this writing – still in the “request” phase. I expect most of these to get merged, but not necessarily in time for the next release.

Again, let’s start with new features:

#1518 by me overhauls Lime’s threading tools. This is a massive change, but I’ll spare you the details because I’ve already gone into depth.
#1472 by nixbody improves Unicode support on some targets.
#1529 by junsred and #1540 by me improve clipboard behavior on the HTML5 and Linux targets, respectively.
#1539 by me makes JNI a bit smarter, and adds useful notes.

…And then the bug fixes:

#1529 by arm32x fixes static debug builds on Windows.
#1538 by me fixes errors that came up if you didn’t cd to an app’s directory before running it. Now you can run it from wherever you like.
#1500 by me modernizes the Android build process. This should be the end of that “deprecated Gradle features were used” warning.

Submodule update

#1531 by me is another enormous change that gets its own section. By far the second biggest pull request I’ve submitted this year.

Lime needs to be able to perform all kinds of tasks, from rendering image files to drawing vector shapes to playing sound to decompressing zip files. It’s a lot to do, but fortunately, great open-source libraries already exist for each task. We can use Cairo for shape drawing, libpng to open PNGs, OpenAL to play sound, and so on.

In the past, Lime has relied on a copy of each of these libraries. For example, we would open up the public SDL repository, download the src/ and include/ folders, and paste their contents into our own repo. Then we’d tell Lime to use our copy as a submodule.

This was… not ideal. Every time we wanted to update, we had to manually download and upload all the files. And if we wanted to try a couple different versions (maybe 1.5 didn’t work and we wanted to see if 1.4 was any better), that’s a whole new download-upload cycle. Plus we sometimes made little customizations unique to our copy repos. Whenever we downloaded fresh copies of the files, we’d have to patch our customizations back in. It’s no wonder no one ever wanted to update the submodules!

If only there was a tool to make this easier. Something that can download files from GitHub and then upload them again. Preferably a tool capable of choosing the correct revision to download, and merging our changes into the downloaded files. I don’t know, some kind of software for controlling versions…

(It’s Git. The answer is Git.)

There was honestly no reason for us to be maintaining our own copy of each repo, much less updating our copies by hand. 20 out of Lime’s 21 submodules have an official public Git repo available, and so I set out to make use of them. It took a lot of refactoring, copying, and debugging. A couple times I had to submit bug reports to the project in question, while other times I was able to work around an issue by setting a specific flag. But I’m pleased to announce that Lime’s submodules now all point to the official repositories rather than some never-updated knockoffs. If we want to use an update, all we have to do is point Lime to the newer commit, and Git will download all the right files.

But I wasn’t quite done. Since it was now so easy to update these libraries, I did so. A couple of them are still a few versions out of date due to factors outside of Lime’s control, but the vast majority are fully up-to-date.

Other improvements that happened along the way:

More documentation, so that if anyone else stumbles across the C++ backend, they won’t be as lost as I was.
Support for newer Android NDKs. Well, NDK 21 specifically. Pixman’s assembly code prevents us from updating further. (Still better than being stuck with NDK 15 specifically!)
Updating zlib fixes a significant security issue (CVE-2018-25032).

Looking forward

OpenFL and Lime are in a state of transition at the moment. We have an expanded leadership team, we’ll be aiming for more frequent releases, and there’s plenty of active discussion of where to go next. Is Lime getting so big that we ought to split it into smaller projects? Maybe! Is it time to stop letting AS3 dictate what OpenFL can and can’t do? Some users say so!

Stay tuned, and we’ll find out soon enough. Or better yet, join the forums or Discord and weigh in!

Guide to threads in Lime

Disclaimer: this guide focuses on upcoming features, currently only available via Git.

Concurrent computing

Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially—with one completing before the next starts.

This is a property of a system—whether a program, computer, or a network—where there is a separate execution point or “thread of control” for each process. A concurrent system is one where a computation can advance without waiting for all other computations to complete.

Concurrent computing is a form of modular programming. In its paradigm an overall computation is factored into subcomputations that may be executed concurrently. Pioneers in the field of concurrent computing include Edsger Dijkstra, Per Brinch Hansen, and C.A.R. Hoare.

[Source: English Wikipedia.]

In simpler terms, concurrent execution means two things happen at once. This is great, but how do you do it in OpenFL/Lime?

Choosing the right tool for the job

This guide covers three classes. Lime’s two concurrency classes, and Thread, the standard class they’re based on.

Class	`Thread`	`Future`	`ThreadPool`
Source	Haxe	Lime	Lime
Ease of use	★★★★★	★★★★☆	★★★☆☆
Thread safety	★☆☆☆☆	★★★★☆	★★★★☆
HTML5 support	No	Yes	Yes

But before you pick a class, first consider whether you should use threads at all.

Can you detect any slowdown? If not, threads won’t help, and may even slow things down.
How often do your threads interact with the outside world? The more often they transfer information, the slower and less safe they’ll be.

If you have a slow and self-contained task, that’s when you consider using threads.

Demo project

I think a specific example will make this guide easier to follow. Suppose I’m using libnoise to generate textures. I’ve created a feature-complete app, and the core of the code looks something like this:

private function generatePattern(workArea:Rectangle):Void {
    //Allocate four bytes per pixel.
    var bytes:ByteArray = new ByteArray(
        Std.int(workArea.width) * Std.int(workArea.height));
    
    //Run getValue() for every pixel.
    for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
        for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
            //getValue() returns a value in the range [-1, 1], and we need
            //to convert to [0, 255].
            var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
            
            if(value > 255) {
                value = 255;
            } else if(value < 0) {
                value = 0;
            }
            
            //Store it as a color.
            bytes.writeInt(value << 16 | value << 8 | value);
        }
    }
    
    //Draw the pixels to the canvas.
    bytes.position = 0;
    canvas.setPixels(workArea, bytes);
    bytes.clear();
}

The problem is, this code makes the app lock up. Sometimes for a fraction of a second, sometimes for seconds on end. It all depends on which pattern it’s working on.

(If you have a beefy computer and this looks fine to you, try fullscreen.)

A good user interface responds instantly when the user clicks, rather than locking up. Clearly this app needs improvement, and since the bulk of the work is self-contained, I decide I’ll solve this problem using threads. ~~Now I have two problems.~~

Using `Thread`

The easiest option is to use Haxe’s Thread class. Since I know a single function is responsible for the freezing, all I need to do is change how I call that function.

-generatePattern(new Rectangle(0, 0, canvas.width, canvas.height));
+Thread.create(generatePattern.bind(new Rectangle(0, 0, canvas.width, canvas.height)));

View full changes

Thread.create() requires a zero-argument function, so I use bind() to supply the rectangle argument. With that done, create() makes a new thread, and the app no longer freezes.

I’d love to show this in action, but it doesn’t work in HTML5. Sorry.

The downside is, the app now prints a bunch of “null pointer” messages. It turns out I’ve added a race condition.

Thread safety basics

The problem with Haxe’s threads is the fact that they’re just so convenient. You can access any variable from any thread, which is great if you don’t mind all the subtle errors.

My generatePattern() function has two problem variables:

module is a class variable, and the main thread updates it with every click. However, generatePattern() assumes module will stay the same the whole time. Worse, module briefly becomes null each time it changes, and that can cause the “null pointer” race condition I mentioned above.
canvas is also a class variable, which is modified during generatePattern(). If multiple threads are going at once, it’s possible to modify canvas from two threads simultaneously. canvas is a BitmapData, so I suspect it will merely produce a garbled image. If you do the same to other object types, it could permanently break that object.

Before I go into too much detail, let’s try a simple solution.

-Thread.create(generatePattern.bind(new Rectangle(0, 0, canvas.width, canvas.height)));
+lastCreatedThread = Thread.create(module, generatePattern.bind(new Rectangle(0, 0, canvas.width, canvas.height)));

-private function generatePattern(workArea:Rectangle):Void {
+private function generatePattern(module:ModuleBase, workArea:Rectangle):Void {
    //Allocate four bytes per pixel.
    var bytes:ByteArray = new ByteArray(
        Std.int(workArea.width) * Std.int(workArea.height));
    
    //Run getValue() for every pixel.
    for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
        for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
            //getValue() returns a value in the range [-1, 1], and we need
            //to convert to [0, 255].
            var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
            
            if(value > 255) {
                value = 255;
            } else if(value < 0) {
                value = 0;
            }
            
            //Store it as a color.
            bytes.writeInt(value << 16 | value << 8 | value);
        }
    }
    
+   //If another thread was created after this one, don't draw anything.
+   if(Thread.current() != lastCreatedThread) {
+       return;
+   }
+   
    //Draw the pixels to the canvas.
    bytes.position = 0;
    canvas.setPixels(workArea, bytes);
    bytes.clear();
}

View full changes

Step one, pass module as an argument. That way, the function won’t be affected when the class variable changes. Step two, enforce a rule that only the last-created thread can modify canvas.

Even then, there’s still at least one theoretical race condition in the above block of code. Can you spot it?

Whether or not you find it isn’t the point I’m trying to make. My point is that thread safety is hard, and you shouldn’t try to achieve it alone. I can spot several types of race condition, and I still don’t trust myself to write perfect code. No, if you want thread safety, you need some guardrails. Tools and design patterns that can take the guesswork out.

My favorite rule of thumb is that every object belongs to one thread, and only that thread may modify that value. And if possible, only that thread should access the value, though that’s less important. Oftentimes, this means making a copy of a value before passing it, so that the receiving thread can own the copy. This rule of thumb means generatePattern() can’t call canvas.setPixels() as shown above, since the main thread owns canvas. Instead, it should send a thread-safe message back and allow the main thread to set the pixels.

And guess what? Lime’s Future and ThreadPool classes provide just the tools you need to do that. In fact, they’re designed as a blueprint for thread-safe code. If you follow the blueprint they offer, and you remember to copy your values when needed, your risk will be vastly reduced.

Using `Future`

Lime’s Future class is based on the general concept of futures and promises, wherein a “future” represents a value that doesn’t exist yet, but will exist in the future (hence the name).

For instance, BitmapData.loadFromFile() returns a Future<BitmapData>, representing the image that will eventually exist. It’s still loading for now, but if you add an onComplete listener, you’ll get the image as soon as it’s ready.

I want to do pretty much the exact same thing in my sample app, creating a Future<BitmapData> that will wait for the value returned by generatePattern(). For this to work, I need to rewrite generatePattern() so that it actually does return a value.

As discussed under thread safety basics, I want to take both module and workArea as arguments. However, Future limits me to one argument, so I combine my two values into one anonymous structure named state.

-private function generatePattern(workArea:Rectangle):Void {
+private static function generatePattern(state: { module:ModuleBase, workArea:Rectangle }):ByteArray {
    //Allocate four bytes per pixel.
    var bytes:ByteArray = new ByteArray(
        Std.int(workArea.width) * Std.int(workArea.height));
    
    //Run getValue() for every pixel.
    for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
        for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
            //getValue() returns a value in the range [-1, 1], and we need
            //to convert to [0, 255].
            var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
            
            if(value > 255) {
                value = 255;
            } else if(value < 0) {
                value = 0;
            }
            
            //Store it as a color.
            bytes.writeInt(value << 16 | value << 8 | value);
        }
    }
    
-    //Draw the pixels to the canvas.
-    bytes.position = 0;
-    canvas.setPixels(workArea, bytes);
-    bytes.clear();
+    return bytes;
}

Now I call the function, listen for the return value, and draw the pixels.

-generatePattern(new Rectangle(0, 0, canvas.width, canvas.height));
+future = Future.withEventualValue(generatePattern, { module: module, workArea: new Rectangle(0, 0, canvas.width, canvas.height) }, MULTI_THREADED);
+
+//Store a copy of `future` at this point in time.
+var expectedFuture:Future<ByteArray> = future;
+
+//Add a listener for later.
+future.onComplete(function(bytes:ByteArray):Void {
+   //If another thread was created after this one, don't draw anything.
+   if(future != expectedFuture) {
+       return;
+   }
+   
+   //Draw the pixels to the canvas.
+   bytes.position = 0;
+   canvas.setPixels(new Rectangle(0, 0, canvas.width, canvas.height), bytes);
+   bytes.clear();
+});

View full changes

This event listener always runs on the main thread, meaning only the main thread ever updates canvas, which is super helpful for thread safety. I still check whether another thread was created, but that’s only to make sure I’m drawing the right image, not because there’s a risk of two being drawn at once.

And this time, I can show you an HTML5 demo! Thanks to the use of threads, the app responds instantly after every click.

I should probably also mention that I set Future.FutureWork.maxThreads = 2. This means you can have two threads running at once, but any more will have to wait. Click enough times in a row, and even fast patterns will become slow. Not because they themselves slowed down, but because they’re at the back of the line. The app has to finish calculating all the previous patterns first.

(If the problem isn’t obvious from the small demo, try fullscreen.)

This seems pretty impractical. Why would the app spend all this time calculating the old patterns when it knows it won’t display them? Well, the reason is that you can’t cancel a Future once started. For that, and for other advanced features, you want to use ThreadPool directly instead of indirectly.

Oh yeah, did I mention that Future is built on top of ThreadPool? Hang on while I go check. …Apparently I never mentioned it. Well, Future is built on top of ThreadPool. It tries to provide the same features in a more convenient way, but doesn’t provide all the features. If you want to cancel jobs or send progress updates, you’ll need ThreadPool.

Using `ThreadPool`

Thread pools are a common way to make threads more efficient. It takes time to start up and shut down a thread, so why not reuse it instead? Lime’s ThreadPool class follows this basic pattern, though it prioritizes cross-platform compatibility, thread safety, and ease of use over performance.

When using ThreadPool, you’ll also need to be aware of its parent class, WorkOutput, as that’s your ticket to thread-safe message transfer. You’ll receive a WorkOutput instance as an argument (with the benefit that it can’t become null unexpectedly), and it has all the methods you need for communication.

sendComplete() and sendError() convey that your job succeeded/failed. When you call one of them, ThreadPool dispatches onComplete or onError as appropriate, and then initiates the thread recycling process. Don’t call them if you aren’t done!

sendProgress() works differently: you can call it as much as you like, with whatever type of data you like. It has no special meaning other than what you come up with. Unsurprisingly, sendProgress() corresponds to onProgress.

generatePattern() only needs sendComplete(), at least for now.

-private function generatePattern(workArea:Rectangle):Void {
+private static function generatePattern(state: { module:ModuleBase, workArea:Rectangle }, output:WorkOutput):Void {
    //Allocate four bytes per pixel.
    var bytes:ByteArray = new ByteArray(
        Std.int(workArea.width) * Std.int(workArea.height));
    
    //Run getValue() for every pixel.
    for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
        for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
            //getValue() returns a value in the range [-1, 1], and we need
            //to convert to [0, 255].
            var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
            
            if(value > 255) {
                value = 255;
            } else if(value < 0) {
                value = 0;
            }
            
            //Store it as a color.
            bytes.writeInt(value << 16 | value << 8 | value);
        }
    }
    
-   //Draw the pixels to the canvas.
-   bytes.position = 0;
-   canvas.setPixels(workArea, bytes);
-   bytes.clear();
+   output.sendComplete(bytes, [bytes]);
}

Hmm, what’s up with “sendComplete(bytes, [bytes])“? Looks kind of redundant.

Well, each of the “send” functions takes an optional array argument that improves performance in HTML5. It’s great for transferring ByteArrays and similar packed data containers, but be aware that these containers will become totally unusable. That’s no problem at the end of the function, but be careful if using this with sendProgress().

With generatePattern() updated, the next step to initialize my ThreadPool.

//minThreads = 1, maxThreads = 1.
threadPool = new ThreadPool(1, 1, MULTI_THREADED);
threadPool.onComplete.add(function(bytes:ByteArray):Void {
    //Draw the pixels to the canvas.
    bytes.position = 0;
    canvas.setPixels(new Rectangle(0, 0, canvas.width, canvas.height), bytes);
    bytes.clear();
});

This time, I didn’t include a “latest thread” check. Instead, I plan to cancel old jobs, ensuring that they never dispatch an onComplete event at all.

-generatePattern(new Rectangle(0, 0, canvas.width, canvas.height));
+threadPool.cancelJob(jobID);
+jobID = threadPool.run(generatePattern, { module: module, workArea: new Rectangle(0, 0, canvas.width, canvas.height) });

This works well enough in the simplest case, but the full app actually isn’t this simple. The full app actually has several classes listening for events, and they all receive each other’s events. To solve this, they each have to filter.

Allow me to direct your attention to ThreadPool.activeJob. This variable is made available specifically during onComplete, onError, or onProgress events, and it tells you where the event came from.

threadPool.onComplete.add(function(bytes:ByteArray):Void {
+   if(threadPool.activeJob.id != jobID) {
+       return;
+   }
+   
    //Draw the pixels to the canvas.
    bytes.position = 0;
    canvas.setPixels(new Rectangle(0, 0, canvas.width, canvas.height), bytes);
    bytes.clear();
});

View full changes

Now, let’s see how the demo looks.

It turns out, setting maxThreads = 1 was a bad idea. Even calling cancelJob() isn’t enough: the app still waits to finish the current job before starting the next. (As before, viewing in fullscreen may make the problem more obvious.)

When a function has already started, cancelJob() does two things: (1) it bans the function call from dispatching events, and (2) it politely encourages the function to exit. There’s no way to force it to stop, so polite requests are all we get. If only generatePattern() was more cooperative.

Green/virtual threads

Green threads are what happens when you want thread-like behavior in a single-threaded environment. (“Virtual threads” can mean the same thing, but Java seems to be claiming the term for something else.)

As it happens, it was JavaScript’s definition of “async” that gave me the idea for this feature. JavaScript’s async keyword runs a function right on the main thread, but sometimes puts that function on pause to let other functions run. Only one thing ever runs at once, but since they take turns, it still makes sense to call them “asynchronous” or “concurrent.”

Most platforms don’t support anything like the async keyword, but we can imitate the behavior by exiting the function and starting it again later. Doesn’t sound very convenient, but unlike some things I tried, it’s simple, it’s reliable, and it works on every platform.

Exiting and restarting forms the basis for Lime’s green threads: instead of running a function on a background thread, run a small bit of that function each frame. The function is responsible for returning after a brief period, because if it takes too long the app won’t be able to draw the next frame in time. Then ThreadPool or FutureWork is responsible for scheduling it again, so it can continue. This behavior is also known as “cooperative multitasking” – multitasking made possible by functions voluntarily passing control to one another.

Here’s an outline for a cooperative function.

The first time the function is called, it performs initialization and does a little work.
By the end of the call, it stores its progress for later.
When the function is called again, it checks for stored progress and determines that this isn’t the first call. Using this stored data, it continues from where it left off, doing a little more work. Then it stores the new data and exits again.
Step 3 repeats until the function detects an end point. Then it calls sendComplete() or (if using Future) returns a non-null value.
ThreadPool or FutureWork stops calling the function, and dispatches the onComplete event.

This leaves the question of where you should store that data. In single-threaded mode, you can put it wherever you like. However, this type of cooperation is also useful in multi-threaded mode so that functions can be canceled, and storing data in class variables isn’t always thread safe. Instead, I recommend using the state argument. Which is, incidentally, why I like to call it “state.” It provides the initial input and stores progress.

Typically, state will have some mandatory values (supplied by the caller) and some optional ones (initialized and updated by the function itself). If the optional ones are missing, that indicates it’s the first iteration.

-private static function generatePattern(state: { module:ModuleBase, workArea:Rectangle }, output:WorkOutput):Void {
+private static function generatePattern(state: { module:ModuleBase, workArea:Rectangle, ?y:Int, ?bytes:ByteArray }, output:WorkOutput):Void {
-   //Allocate four bytes per pixel.
-   var bytes:ByteArray = new ByteArray(
-       Std.int(workArea.width) * Std.int(workArea.height));
+   var bytes:ByteArray = state.bytes;
+   
+   //If it's the first iteration, initialize the optional values.
+   if(bytes == null) {
+       //Allocate four bytes per pixel.
+       state.bytes = bytes = new ByteArray(
+           Std.int(workArea.width) * Std.int(workArea.height));
+       
+       state.y = Std.int(workArea.top);
+   }
+   
+   //Each iteration, determine how much work to do.
+   var endY:Int = state.y + (output.mode == MULTI_THREADED ? 50 : 5);
+   if(endY > Std.int(workArea.bottom)) {
+       endY = Std.int(workArea.bottom);
+   }
    
    //Run getValue() for every pixel.
-    for(y in Std.int(workArea.top)...Std.int(workArea.bottom)) {
+   for(y in state.y...endY) {
        for(x in Std.int(workArea.left)...Std.int(workArea.right)) {
            //getValue() returns a value in the range [-1, 1], and we need
            //to convert to [0, 255].
            var value:Int = Std.int(128 + 128 * module.getValue(x, y, 0));
            
            if(value > 255) {
                value = 255;
            } else if(value < 0) {
                value = 0;
            }
            
            //Store it as a color.
            bytes.writeInt(value << 16 | value << 8 | value);
        }
    }
    
+   //Save progress.
+   state.y = endY;
+   
+   //Don't call sendComplete() until actually done.
+   if(state.y >= Std.int(workArea.bottom)) {
        output.sendComplete(bytes, [bytes]);
+   }
}

Note that I do more work per iteration in multi-threaded mode. There’s no need to return too often; just often enough to exit if the job’s been canceled. It also incurs overhead in HTML5, so it’s best not to overdo it.

Single-threaded mode is the polar opposite. There’s minimal overhead, and you get better timing if the function is very short. Ideally, short enough to run 5+ times a frame with time left over. On a slow computer, it’ll automatically reduce the number of times per frame to prevent lag.

Next, I tell ThreadPool to use single-threaded mode, and I specify a workLoad of 3/4. This value indicates what fraction of the main thread’s processing power should be spent on this ThreadPool. I’ve elected to take up 75% of it, leaving 25% for other tasks. Since I know those other tasks aren’t very intense, this is plenty.

-threadPool = new ThreadPool(1, 1, MULTI_THREADED);
+threadPool = new ThreadPool(1, 1, SINGLE_THREADED, 3/4);

View full changes

Caution: reduce this number if creating multiple single-threaded ThreadPools. If two pools each have a workLoad of 3/4, then they’ll take up 150% of the allocated time per frame, and your app will slow down by (at least) 50%. Instead, try to keep the combined workLoad under 1.

In any case, it’s time for another copy of the demo. Since we’re nearing the end, I also went ahead and implemented progress events. Now you can watch the progress in (closer to) real time.

These changes also benefit multi-threaded mode, so I created another multi-threaded version for comparison. With progress events, you can now see the slight pause when it spins up a new web worker (which isn’t that often, since it keeps two of them running).

(For comparison, here they both are in fullscreen: green threads, web workers.)

I don’t know, I like them both. Green threads have the benefit of being lighter weight, while web workers have the benefit of being real threads, meaning you could run eight in parallel without slowing the main thread.

My advice? Write code that works both ways, as shown in this guide. Keep your options open, since the configuration that works best for a small app may not be what works best for a big one. Good luck out there!

Web Workers in Lime

If you haven’t already read my guide to threads, I suggest starting there.

I’ve spent the last month implementing web worker support in Lime. (Edit: and then I spent another month after posting this.) It turned out to be incredibly complicated, and though I did my best to include documentation in the code, I think it’s worth a blog post too. Let’s go over what web workers are, why you might want to use them, and why you might not want to use them.

To save space, I’m going to assume you’ve heard of threads, race conditions, and threads in Haxe.

About `BackgroundWorker` and `ThreadPool`

BackgroundWorker and ThreadPool are Lime’s two classes for safely managing threads. They were added back in 2015, and have stayed largely unchanged since. (Until this past month, but I’ll get to that.)

The two classes fill different roles. BackgroundWorker is ideal for one-off jobs, while ThreadPool is a bit more complex but offers performance benefits when doing multiple jobs in a row.

BackgroundWorker isn’t too different from calling Thread.create() – both make a thread and run a single job. The main difference is that BackgroundWorker builds in safety features.

Recently, Haxe added its own thread pool implementations: FixedThreadPool has a constant number of threads, while ElasticThreadPool tries to add and remove threads based on demand. Lime’s ThreadPool does a combination of the two: you can set the minimum and maximum number of threads, and it will vary within that range based on demand. Plus it offers structure and safety features, just like BackgroundWorker. On the other hand, ThreadPool lacks ElasticThreadPool‘s threadTimeout feature, so threads will exit instantly if they don’t have a job to do.

I always hate reinventing the wheel. Why does Lime need a ThreadPool class when Haxe already offers two? (Ignoring the fact that Lime’s came first.) Just because of thread safety? There are other ways to achieve that.

If only Haxe’s thread pools worked in JavaScript…

Web workers

Mozilla describes web workers as “a simple means for web content to run scripts in background threads.” “Simple” is a matter of perspective, but they do allow you to create background threads in JavaScript.

Problem is, they have two fundamental differences from Haxe’s threads, which is why Haxe doesn’t include them in ElasticThreadPool and FixedThreadPool.

Web workers use source code.
Web workers are isolated.

Workers use source code

Web workers execute a JavaScript file, not a JavaScript function. Fortunately, it is usually possible to turn a function back into source code, simply by calling toString(). Usually. Let’s start with how this works in pure JavaScript:

function add(a, b) {
    return a + b;
}

console.log(add(1, 2)); //Output: 3
console.log(add.toString()); //Output:
//function add(a, b) {
//    return a + b;
//}

That first log() call is just to show the function working. The second shows that we get the function source code as a string. It even preserved our formatting!

If we look at the examples, we find that it goes to great lengths to preserve the original formatting.

`toString()` input	`toString()` output
`function f(){}`	`"function f(){}"`
`class A { a(){} }`	`"class A { a(){} }"`
`function* g(){}`	`"function* g(){}"`
`a => a`	`"a => a"`
`({ a(){} }.a)`	`"a(){}"`
`({ [0](){} }[0])`	`"[0](){}"`
`Object.getOwnPropertyDescriptor({ get a(){} }, "a").get`	`"get a(){}"`
`Object.getOwnPropertyDescriptor({ set a(x){} }, "a").set`	`"set a(x){}"`
`Function.prototype.toString`	`"function toString() { [native code] }"`
`(function f(){}.bind(0))`	`"function () { [native code] }"`
`Function("a", "b")`	`"function anonymous(a\n) {\nb\n}"`

That’s weird. In two of those cases, the function body – the meat of the code – has been replaced with “[native code]”. (That isn’t even valid JavaScript!) As the documentation explains:

If the toString() method is called on built-in function objects or a function created by Function.prototype.bind, toString() returns a native function string

In other words, if we ever call bind() on a function, we can’t get its source code, meaning we can’t use it in a web worker. And wouldn’t you know it, Haxe automatically calls bind() on certain functions.

Let’s try writing some Haxe code to call toString(). Ideally, we want to write a function in Haxe, have Haxe translate it to JavaScript, and then get its JavaScript source code.

class Test {
    static function staticAdd(a, b) {
        return a + b;
    }
    
    function add(a, b) {
        return a + b;
    }
    
    static function main() {
        var instance = new Test();
        
        trace(staticAdd(1, 2));
        trace(instance.add(2, 3));
        
        #if js
        trace((cast staticAdd).toString());
        trace((cast instance.add).toString());
        #end
    }
    
    inline function new() {}
}

If you try this code, you’ll get the following output:

Test.hx:15: 3
Test.hx:16: 5
Test.hx:18: staticAdd(a,b) {
        return a + b;
    }
Test.hx:19: function() {
    [native code]
}

The first two lines prove that both functions work just fine. staticAdd is printed exactly like it appears in the JavaScript file. But instance.add is all wrong. Let’s look at the JS source to see why:

static main() {
    let instance = new Test();
    console.log("Test.hx:15:",Test.staticAdd(1,2));
    console.log("Test.hx:16:",instance.add(2,3));
    console.log("Test.hx:18:",Test.staticAdd.toString());
    console.log("Test.hx:19:",$bind(instance,instance.add).toString());
}

Yep, there it is. Haxe inserted a call to $bind(), a function that – perhaps unsurprisingly – calls bind().

Turns out, Haxe always inserts $bind() when you try to refer to an instance function. This is in fact required: otherwise, the function couldn’t access the instance it came from. But it also means we can’t use instance functions in web workers. Or can we?

After a lot of frustration and effort, I came up with ThreadFunction. Read the source if you want details; otherwise, the one thing to understand is that it can only remove the $bind() call if you convert to ThreadFunction ASAP. If you have a variable (or function argument) representing a function, that variable (or argument) must be of type ThreadFunction.

//Instead of this...
class DoesNotWork {
    public var threadFunction:Dynamic -> Void;
    
    public function new(threadFunction:Dynamic -> Void) {
        this.threadFunction = threadFunction;
    }
    
    public function runThread():Void {
        new BackgroundWorker().run(threadFunction);
    }
}

//...you want to do this.
class DoesWork {
    public var threadFunction:ThreadFunction<Dynamic -> Void>;
    
    public function new(threadFunction:ThreadFunction<Dynamic -> Void>) {
        this.threadFunction = threadFunction;
    }
    
    public function runThread():Void {
        new BackgroundWorker().run(threadFunction);
    }
}

class Main {
    private static function main():Void {
        new DoesWork(test).runThread(); //Success
        new DoesNotWork(test).runThread(); //Error
    }
    
    private static function test(_):Void {
        trace("Hello from a background thread!");
    }
}

Workers are isolated

Once we have our source code, creating a worker is simple. We take the string and add some boilerplate code, then construct a Blob out of this code, then create a URL for the blob, then create a worker for that URL, then send a message to the worker to make it start running. Or maybe it isn’t so simple, but it does work.

Web workers execute a JavaScript source file. The code in the file can only access other code in that file, plus a small number of specific functions and classes. But most of your app resides in the main JS file, and is off-limits to workers.

This is in stark contrast to Haxe’s threads, which can access anything. Classes, functions, variables, you name it. Sharing memory like this does of course allow for race conditions, but as mentioned above, BackgroundWorker and ThreadPool help prevent those.

For a simple example:

class Main {
    private static var luckyNumber:Float;
    
    private static function main():Void {
        luckyNumber = Math.random() * 777;
        new BackgroundWorker().run(test);
    }
    
    private static function test(_):Void {
        trace("Hello from a background thread!");
        trace("Your lucky number is: " + luckyNumber);
    }
}

On most targets, any thread can access the Main.luckyNumber variable, so test() will work. But in JavaScript, neither Main nor luckyNumber will have been defined in the worker’s file. And even if they were defined in that file, they’d just be copies. The value will be wrong, and the main thread won’t receive any changes made.

So… how do you transfer data?

Passing messages

I’ve glossed over this so far, but BackgroundWorker.run() takes up to two arguments. The first, of course, is the ThreadFunction to run. The second is a message to pass to that function, which can be any type. (And if you need multiple values, you can pass an array.)

Originally, BackgroundWorker was designed to be run multiple times, each time reusing the same function but working on a new set of data. It wasn’t well-optimized (ThreadPool is much more appropriate for that) nor well-tested, but it was very convenient for implementing web workers.

See, web workers also have a message-passing protocol, allowing us to send an object to the background thread. You know, an object like BackgroundWorker.run()‘s second argument:

class Main {
    private static var luckyNumber:Float;
    
    private static function main():Void {
        luckyNumber = Math.random() * 777;
        new BackgroundWorker().run(test, luckyNumber);
    }
    
    private static function test(luckyNumber:Float):Void {
        trace("Hello from a background thread!");
        trace("Your lucky number is: " + luckyNumber);
    }
}

The trick is, instead of trying to access Main.luckyNumber (which is on the main thread), test() takes an argument, which is the same value except copied to the worker thread. You can actually transfer a lot of data this way:

new BackgroundWorker().run(test, {
    luckyNumber: Math.random() * 777,
    imageURL: "https://www.example.com/image.png",
    cakeRecipe: File.getContent("cake.txt"),
    calendar: Calendar.getUpcomingEvents(10)
});

Bear in mind that your message will be copied using the structured clone algorithm, a deep copy algorithm that cannot copy functions. This sets limits on what kinds of messages you can pass. You can’t pass a function without first converting it to ThreadFunction, nor can you pass an object that contains functions, such as a class instance.

Copying your message is key to how JavaScript prevents race conditions: memory is never shared between threads, so two threads can’t accidentally access the same memory location at the wrong time. But if there’s no sharing, how does the main thread get any information back from the worker?

Returning results

Web workers don’t just receive messages, they can send them back. The rules are the same: everything is copied, no functions, etc.

The BackgroundWorker class provides three functions for this, each representing something different. sendProgress() for status updates, sendError() if something goes horribly wrong, and sendComplete() for the final product. (You may recall that workers don’t normally have access to Haxe functions, but these three are inlined. Inline functions work fine.)

It’s at about this point we need to talk about another problem with copying data. One common reason to use background threads is to process large amounts of data. Suppose you produce 10 MB of data, and you want to pass it back once finished. Your computer is going to have to make an exact copy of all that data, and it’ll end up taking 20 MB in all. Don’t get me wrong, it’s doable, but it’s hardly ideal.

It’s possible to save both time and memory using transferable objects. If you’ve stored your data in an ArrayBuffer, you can simply pass a reference back to the main thread, no copying required. The worker thread loses access to it, and then the main thread gains access (because unlike Haxe, JavaScript is very strict about sharing memory).

ArrayBuffer can be annoying to use on its own, so it’s fortunate that all the wrappers are natively available. By “wrappers,” I’m talking about Float32Array, Int16Array, UInt8Array, and so on. As long as you can represent your data as a sequence of numbers, you should be able to find a matching wrapper.

Transferring a buffer looks like this: backgroundWorker.sendComplete(buffer, [buffer]). I know that looks redundant, and at first I thought maybe backgroundWorker.sendComplete(null, [buffer]) could work instead. But the trick is, the main thread will only receive the first argument (a.k.a. the message). If the message doesn’t contain some kind of reference to buffer, then the main thread won’t have any way to access buffer.

That said, the two arguments don’t have to be identical. You can pass a wrapper (e.g., an Int16Array) as the message, and transfer the buffer inside: backgroundWorker.sendComplete(int16Array, [int16Array.buffer]). The Int16Array numeric properties (byteLength, byteOffset, and length) will be copied, but the underlying buffer will be moved instead.

Coding in multiple languages

Quick: what programming language is Run 3 written in? Haxe, of course. (It’s even in the title of my blog.) But that isn’t all. Check out this list:

Haxe
Neko
Java
JavaScript
ActionScript
C++
Objective-C
Python
Batch
Bash
Groovy

Those are the programming languages that are (or were) involved in the development of Run 3/Run Mobile. Some are on the list because of Haxe’s capabilities, others because of Haxe’s limits.

Haxe’s defining feature is its ability to compile to other languages. This is great if you want to write a game for multiple platforms. JavaScript runs in browsers, C++ runs on desktop, Neko compiles quickly during testing, ActionScript… well, we don’t talk about ActionScript anymore. And that’s why those four are on the list.

Batch and Bash are good at performing simple file operations. Copying files, cleaning up old folders, etc. That’s also why Python is on the list: I have a Python file that runs after each build and performs simple file operations. Add 1 to the build count, create a zip file from a certain folder, etc. Honestly it doesn’t make much difference which language you use for the simple stuff, and I don’t remember why I chose Python. Nowadays I’d definitely choose Haxe for consistency.

The rest are due to mobile apps. Android apps are written in Java or Kotlin, then compiled using Groovy. Haxe does compile to Java, but it has a reputation for being slow. Therefore, OpenFL tries to use as much C++ as possible, only briefly using Java to get the game up and running.

iOS is similar: apps are typically written in Objective-C or Swift, and Haxe doesn’t compile to either of those. But you can have a simple Objective-C file start the app, then switch to C++.

Even leaving aside Python, Batch, and Bash, that’s a lot of languages. Some of them are independent, but others have to run at the same time and even interact. How does all that work?

Source-to-source compilation

Let’s start with source-to-source compilation (the thing I said was Haxe’s defining feature) and what it means. Suppose I’m programming in Haxe and compiling to JavaScript.

Now, by default Haxe code can only call other Haxe code. Say there’s a fancy JavaScript library that does calligraphy, and I want to draw some large shiny letters. If I was writing JavaScript, I could call the drawCalligraphy() function no problem, but not in Haxe.

To accomplish the same thing in Haxe, I need some special syntax to insert JavaScript code. Something like this:

//Haxe code (what I write)
var fruitOptions = ["apple", "orange", "pear", "lemon"];
var randomFruit = fruitOptions[Std.int(Math.random() * fruitOptions.length)];
js.Syntax.code('drawCalligraphy(randomFruit);');
someHaxeFunction(randomFruit);

//JS code (generated when the above is compiled)
let fruitOptions = ["apple","orange","pear","lemon"];
let randomFruit = fruitOptions[Math.random() * fruitOptions.length | 0];
drawCalligraphy(randomFruit);
someHaxeFunction(randomFruit);

Note how similar the Haxe and JavaScript functions end up being. It almost feels like I shouldn’t need the special syntax at all. As you can see from the final line of code, function calls are left unchanged. If I typed drawCalligraphy(randomFruit) in Haxe, it would become drawCalligraphy(randomFruit) in JavaScript, which would work perfectly. Problem is, it doesn’t compile. `drawCalligraphy` isn’t a Haxe function, so Haxe throws an error.

Well, that’s where externs come in. By declaring an “extern” function, I tell Haxe “this function will exist at runtime, so don’t throw compile errors when you see it.” (As a side-effect, I’d better type the function name right, because Haxe won’t check my work.)

tl;dr: Since Haxe creates code in another programming language, you can talk to other code in that language. If you compile to JS, you can talk to JS.

Starting an iOS app

Each Android or iOS app has a single defined “entry point,” which has to be a Java/Kotlin/Objective-C/Swift file. Haxe can compile to (some of) these, but there’s really no point. It’s easier and better to literally type out a Java/Kotlin/Objective-C/Swift file, which is exactly what Lime does.

I’ve written about this before, but as a refresher, Lime creates an Xcode project as one step along the way to making an iOS app. At this point, the Haxe code has already been compiled into C++, in a form usable on iOS. Lime then copy-pastes in some Objective-C files and an Xcode project file, which Xcode compiles to make a fully-working iOS app. (And it’s a real project; you could even edit it in Xcode, though that isn’t recommended.)

And that’s enough to get the app going. When compiled side-by-side, C++ and Objective-C++ can talk to one another, as easily as JavaScript can communicate with JavaScript. Main.mm (the Objective-C entry point) calls a C++ function, which calls another C++ function, and so on until eventually one of them calls the compiled Haxe function. Not as simple as it could be, but it has the potential to be quite straightforward.

Unlike Android.

Shared libraries

A shared library or shared object is a file that is intended to be shared by executable files and further shared object files. Modules used by a program are loaded from individual shared objects into memory at load time or runtime, rather than being copied by a linker when it creates a single monolithic executable file for the program.

Traditionally, shared library/object files are toolkits. Each handles a single task (or group of related tasks), like network connections or 3D graphics. The “shared” part of the name means many different programs can use the library at once, which is great if you have a dozen programs connecting to the net and don’t want to have to download a dozen copies of the network connection library.

I mention this to highlight that Lime does something odd when compiling for Android. All of your Haxe-turned-C++ code goes in one big shared object file named libApplicationMain.so. But this “shared” object never gets shared. It’s only ever used by one app, because, well, it is that app. Everything outside of libApplicationMain.so is essentially window dressing; it’s there to get the C++ code started. I’m not saying Lime is wrong to do this (in fact, the NDK documentation tells you to do it), I’m just commenting on the linguistic drift.

To get the app started, Lime loads the shared object and then passes the name of the main C++ function to SDL, which loads the function and then calls it. Bit roundabout, but whatever works.

tl;dr: A shared library is a pre-compiled group of code. Before calling a function, you need two steps: load the library, then load the function. On Android, one of these functions is basically “run the entire app.”

Accessing Android/iOS features from Haxe

If your Haxe code is going into a shared object, then tools like externs won’t work. How does a shared object send messages to Java/Objective-C? I’ve actually answered this one before with examples, but I didn’t really explain why, so I’ll try to do that.

On Android, you call JNI.createStaticMethod() to get a reference to a single Java function, as long as the Java function is declared publicly. Once you have this reference, you can call the Java function. If you want more functions, you call JNI (Java Native Interface) multiple times.
On iOS, you call CFFI.load() to get a reference to a single C (Objective or otherwise) function, as long as the C function is a public member of a shared library. Once you have this reference, you can call the C function. If you want more functions, you call CFFI (C Foreign Function Interface) multiple times.

Gotta say, there are a lot of similarities, and I’m guessing that isn’t a coincidence. Lime is actually doing a lot of work under the hood in both cases, with the end goal of keeping them simple.

But wait a minute. Why is iOS using shared libraries all of a sudden? We’re compiling to C++ and talking to Objective-C; shouldn’t extern functions be enough? In fact, they are enough. Shared libraries are optional here, though recommended for organization and code consistency.

You might also note that last time I described calling a shared library, it took extra steps (load the library, load the function, call the function). This is some of the work Lime does under the hood. The CFFI class combines the “load library” and “load function” steps into one, keeping any open libraries for later use. (Whereas C++ doesn’t really do “convenience.”)

tl;dr: On Android, Haxe code can call Java functions using JNI. iOS extensions are designed to mimic this arrangement, though you use CFFI instead of JNI.

Why I wrote this post

Looking back after writing this, I have to admit it’s one of my less-informative blog posts. I took a deep dive into how Lime works, yes, but very little here is useful to an average OpenFL/Lime user. If you want to use CFFI or JNI, you’d be better off reading my old blog post instead.

Originally, this post was supposed to be a couple paragraphs leading into to another Android progress report. (And I’d categorized it under “development,” which is hardly accurate.) But the more I wrote, the clearer it became that I wasn’t going to get to the progress report. I almost abandoned this post, but I was learning new things, so I decided to put it out there.

(For instance, it had never occurred to me that CFFI was optional on iOS. It may well be the best option, but since it is just an option rather than mandatory, I’ll want to double-check.)

Advanced Layout

I previously wrote about Skylark Studio’s Layout library, explaining how it works and pointing out one of its many flaws. Today I’d like to cover more of its flaws, and hopefully convince you to use my library (“Advanced Layout”) instead of Skylark’s (“Skylark Layout”).

To begin, let’s take another look at how you’re meant to use Skylark Layout. If you’ve already read my “Not-so-simple SWF Layout” post, feel free to skip to the next section.

The SimpleSWFLayout sample project seems to be the only canonical example, so that’s what we’ll use. This project has two main parts:

Assets/layout.swf stores the layout. It has three colored rectangles arranged in a specific way.
Source/Main.hx has the code that scales the three rectangles.

When you run the sample, you get something like this:

As you can see, the three rectangles behave differently: the “background” rectangle always fills the stage, the “header” rectangle stretches across the top, and the “column” rectangle stretches across the left.

Simplifying

When I started to dig into the code, I realized that it’s a lot more verbose than it needs to be. Let’s simplify SimpleSWFLayout!

-import layout.Layout;
-import layout.LayoutItem;
+using layout.LayoutPreserver;

That isn’t much simpler; it’s just different.

-private var layout:Layout;
-layout = new Layout ();

You definitely need a Layout object. That’s what stores information about your layout, and without storing that information, the library won’t know how to arrange those rectangles.

But you don’t need to be the one to create that object. Advanced Layout will do this automatically, and only if necessary.

-layout.resize (stage.stageWidth, stage.stageHeight);
-stage.addEventListener (Event.RESIZE, stage_onResize);
-private function stage_onResize (event:Event):Void {
-   layout.resize (stage.stageWidth, stage.stageHeight);
-}

When using Skylark Layout, you have to call layout.resize() to make it have any effect. You also have to listen for RESIZE events, and call layout.resize() again each time one of those happens.

Once again, Advanced Layout handles all that for you, so you can safely delete those lines.

That just leaves three lines, but they aren’t as easy to deal with.

layout.addItem (new LayoutItem (clip.getChildByName ("Background"), STRETCH, STRETCH, false, false));
layout.addItem (new LayoutItem (clip.getChildByName ("Header"), TOP, STRETCH, true, false));
layout.addItem (new LayoutItem (clip.getChildByName ("Column"), STRETCH, LEFT, false, true));

LayoutItem is the set of instructions for one particular rectangle. (In this case, the background rectangle.) layout.addItem() is how the Layout knows about that LayoutItem.

Once again, just because these lines are required doesn’t mean you have to be the one to type them out.

Most of that code will be the same every time. You’ll always call the addItem() function, and you’ll always create a new LayoutItem instance. And also – spoiler alert – those two booleans are nearly meaningless.

Only three pieces of information per line are unique:

clip.getChildByName("name") is the rectangle that’s being scaled.
STRETCH or TOP indicates whether the rectangle should scale vertically or just stay at the top.
STRETCH or LEFT indicates whether the rectangle should scale horizontally or just stay at the side.

There’s no way around it*. These three things change each time, so you have to type them out. But what if that was all you had to type?

-layout.addItem (new LayoutItem (clip.getChildByName ("Background"), STRETCH, STRETCH, false, false));
+clip.getChildByName("Background").stickToAllSides();
-layout.addItem (new LayoutItem (clip.getChildByName ("Header"), TOP, STRETCH, true, false));
+var header = clip.getChildByName("Header");
+header.stickToTop();
+header.stickToLeftAndRight();
-layout.addItem (new LayoutItem (clip.getChildByName ("Column"), STRETCH, LEFT, false, true));
+var column = clip.getChildByName("Column");
+column.stickToLeft();
+column.stickToTopAndBottom();

Much better! You just saved a lot of characters and made your code more readable.

*Actually, there is a way around it, but that’s a different blog post.

Wording

So the next thing you’re probably wondering is, why are all these functions named “stickToXYZ“?

To answer that, let’s take another look at what this demo does:

Watch the header. It starts out a with a small margin between it and the three edges it almost touches. When it scales, those margins remain the same.

This isn’t an accident. When you tell Skylark Layout “TOP,” what you’re really saying is “don’t scale this vertically; just keep it exactly the same distance from the top at all times.” It looks at the header and makes a note of how far it currently is from the top: 16 pixels. Whenever the stage scales, it places the header 16 pixels from the top.

STRETCH is similar, but a bit more complicated. In this case, Skylark Layout looks at both the left and the right margin (12 and 15.8, respectively). Then when the stage scales, it places the header’s left edge at 12 pixels from the left, and then scales it until the right edge is 15.8 pixels from the right.

Skylark Layout also provides a CENTER option, even though it isn’t shown in the example. In that case, it keeps your rectangle’s center the same distance (and direction) from the center of the stage.

Advanced Layout provides all the same features, just with a different name. Click and drag in this demo to get a sense of how it works:

In particular, look at the centered objects. See how the same point stays in contact with the center line? It’s as if someone put a drop of glue on that point and glued it to the line.

Obviously there’s no actual glue involved, but I decided to use the metaphor anyway. Objects move around if you “stick” them to an edge and then move that edge. If you attach an object to two opposite edges, it will stretch when you move the edges apart.

If you attach an object to all four edges, then it will stretch in both directions. That’s why the background is set to stickToAllSides().

Conclusion

Here are the reasons you should use Advanced Layout rather than Skylark Layout:

No need to initialize everything yourself.
Easier-to-read code.
More ways to lay out your objects. (This post barely scratches the surface.)
You can remove objects from a Layout to avoid memory leaks.
You can nest Layouts.
Compatible with HaxeFlixel and HaxeUI. (Not well-tested.)
No extra empty space on big screens. (Unless you specifically enable it.)
Extensibility.

I’m not aware of any reason to use Skylark Layout rather than Advanced Layout. Everything Skylark can do, Advanced can do better.

Scaling Strategy 2: Stage Dimensions

Instead of the current screen’s resolution, you could look at the stage dimensions.

I know that the “screen resolution” method means that one inch on your screen is one inch on theirs. That can be nice to have, especially when dealing with tablets. On the other hand, it’s not so great for small screens. A one-inch-wide button on a low-end device could fill a quarter of the screen!

Plus, you can’t even get at the screen resolution in Flash, and HTML5 may or may not be able to access that information.

The “stage dimensions” method, on the other hand, works on all platforms, handles smaller screens gracefully, and is much easier to test.

Using the stage’s dimensions

Let’s say I’ve convinced you to give this method a shot. How does it work?

The first step is to take the stage dimensions and convert them into scaleX and scaleY values. Then scale all your objects using these values.

Flash defines four stage scale modes, each representing a different way to generate scaleX and scaleY. Let’s take a look.

NO_SCALE

Click and drag in this demo to get a sense of the NO_SCALE option. (Requires JavaScript.)

Somehow I doubt this is the one you’re looking for.

Implementing NO_SCALE

You already implemented it. Moving on!

EXACT_FIT

Quick: what’s the simplest way you can think of to handle scaling?

Remember, you have access to stageWidth and stageHeight, and you can set scaleX and scaleY.

…

If you guessed “scale x and y independently,” you’re right!

Hey, I never said it was a good way to handle scaling. I just said that it’s simple.

Implementing EXACT_FIT

var scaleX:Float = stage.stageWidth / 800;
var scaleY:Float = stage.stageHeight / 600;

Lib.current.scaleX = scaleX;
Lib.current.scaleY = scaleY;

That’s it!

Note: you can’t set stage.scaleX and stage.scaleY. In most cases, Lib.current works equally well, which is why I used that. (Just avoid adding children directly to the stage.)

NO_BORDER

Obviously, you don’t want to stretch your content like that. Somehow, you have to make sure that each individual object keeps the same proportions.

There are two main ways to do this, and NO_BORDER is one of them.

Implementing NO_BORDER

NO_BORDER considers the same two values as EXACT_FIT, but it picks only one. Specifically, it picks the larger one.

var scaleX:Float = stage.stageWidth / 800;
var scaleY:Float = stage.stageHeight / 600;

var scale:Float = Math.max(scaleX, scaleY);

Lib.current.scaleX = scale;
Lib.current.scaleY = scale;

Imagine your stage is made wider, so that scaleX comes out to 1.55, and scaleY is still 1.

NO_BORDER will choose the larger of the two values (1.55), making the stage the right horizontal size, but too large vertically.

SHOW_ALL

If NO_BORDER tends to make objects too large, SHOW_ALL errs on the side of keeping them small.

Implementing SHOW_ALL

Like NO_BORDER, except with Math.min().

//SHOW_ALL
var scaleX:Float = stage.stageWidth / 800;
var scaleY:Float = stage.stageHeight / 600;

var scale:Float = Math.min(scaleX, scaleY);

Lib.current.scaleX = scale;
Lib.current.scaleY = scale;

Given the choice between 1.55 and 1, SHOW_ALL will choose the smaller value. This ensures that everything will fit, though there will also be extra space.

In most apps, extra space is better than objects being cut off, making SHOW_ALL the clear winner.

Scaling Strategy 1: Pixel Density

In my previous post, I explained why the Layout library isn’t suited for high-resolution screens. The short version is, that library is designed for a constant pixel density. Any layout that looked good on a high-resolution screen would look terrible on a low-resolution one, and vice versa.

The SimpleSWFLayout sample project, as it appears on a high-resolution device.

Here, the header is not scaled vertically, which is why it appears so short. Ideally, we’d want a way to increase its height based on the device’s resolution.

Fortunately, OpenFL provides a way to access the pixel density of the current screen: Capabilities.screenDPI. Now all we need is a layout library that supports pixel density scaling. Conveniently, I happen to have made one!

To convert SimpleSWFLayout to my library, follow these instructions.

Enabling pixel-density-based scaling

Scale.stageScale.screenDPI();

Add that somewhere, and you’re done!

Also make sure not to compile for Flash or HTML5. Capabilities.screenDPI just doesn’t work on those platforms. :/

Same device as before, but now I can actually see the details!

Ok, so there aren’t any details to see. But if there were details, I’d totally be able to see them!

Not-so-simple SWF Layout

OpenFL provides a sample project named SimpleSWFLayout, demonstrating how you can set up a layout that scales along with the window.

Three items are being resized here:

The background scales to fit the screen.
The header scales horizontally to take up almost all of the window (except for a small margin on both sides). It doesn’t scale vertically.
The column scales vertically to take up most of the space below the header, and it doesn’t scale horizontally. It has a larger margin below it than above it.

All this is automatic, courtesy of the Layout library!

How it works

The Layout library uses some clever tricks to make this easier on you. But if you don’t know that these tricks exist, you can’t really take advantage of them, can you? Let’s fix that.

The project is called SimpleSWFLayout because it loads its three assets from a SWF file (layout.swf, found in the asset folder). For convenience, it also includes the layout.fla file used to generate the swf.

If you have Flash Professional, you can open up layout.fla and see that it has all three items already arranged correctly. The header already has a 12 pixel margin above it, and a 16 pixel margin on each side. The column is already placed 20 pixels below the header, 16 pixels from the left, and 54 pixels above the bottom.

So all you need to do is tell the library that you want to keep these margins constant. That’s what’s going on when the example specifies TOP, STRETCH for the header and STRETCH, LEFT for the column. (Traditionally, x comes before y, but for some reason the opposite is true here. I don’t know why.)

The base stage dimensions

If you’re paying really close attention, you may notice that the background in layout.fla is 800×600, while the SimpleSWFLayout project starts at 800×500.

Don’t worry! It doesn’t need to match.

The background is what defines the “base” stage size. The margins will be calculated based on the size of the background, and then scaled to fit the actual stage size. It’s just like if the window started at 800×600 and was promptly scaled down to 800×500.

Whenever you add a new object, it will use the base stage size to figure out what margins it needs. Even if you’ve already scaled the window, it’ll refer to the background’s original width and height.

Other alignments

The example demonstrates how to align something to the left and top. But that’s not much of a demonstration, is it? To keep an object 16 pixels away from the left edge, all the Layout library has to do is not change its x coordinate.

But you can align objects to the right, bottom, or center of the screen too. If you align to the right, the Layout library will update the x coordinate so that the object moves with the right edge of the screen. However far it started from the right, that’s how far it will be after resizing. Same with aligning to the bottom.

(Well, mostly. There’s some inconsistent behavior for small windows, but I won’t get into that.)

If you choose to align an object to the center of the screen, the Layout library will remember how far it was from the center, and keep it that far. I like to think of this as “pinning” the object to the center, because it’s like you stick a pin in the object wherever it crosses the center line. Whenever the center line moves, that point on the object follows that point on the center line. (The object doesn’t actually have to touch the center line to be “pinned,” but I still like the metaphor.)

One of my first criticisms of this library was that it didn’t allow you to place objects relative to one another, and keep them that way. But that just isn’t true.

Let’s say your main menu has three buttons right next to one another. You put all three in a line just below the center, and pin them (both horizontally and vertically). When you resize the stage, all three buttons will move by the same amount in the same direction.

They’re staying in the same position relative to one another, and they’re staying just below the center. Looks good!

The problem

There’s something wrong with this implementation. Have you figured it out yet?

Hint: there are only two options for scaling. Either you keep the size constant, or you fill the stage, minus a constant value.

Take a moment to think about it.

(Or just scroll down if you aren’t in the mood for games.)

…

The SimpleSWFLayout sample project, rendered in an enormous window. — The SimpleSWFLayout sample project, as it appears in an extra-large window. This screenshot was taken on a Nexus 6.

That right there is a lot of empty space.

The problem is those constant values I mentioned. A constant 16-pixel margin looks good enough on screens around 800×600, but when the screen is over 2000 pixels wide, it just gets lost. And a constant width and height for the column and header respectively means they take up less and less of the screen as it gets bigger.

In short, the Layout library is not, in itself, enough for your mobile layout needs. Fortunately, I’ve created a more advanced alternative. Documentation coming soon!

HXP Projects

If you look closely at the OpenFL “docs” page, you’ll notice it lists two options under “project files.” The first of the two is the XML format you know and love, or at least know, but what’s up with the second?

Before I answer that question, I have some questions for you to answer.

Is your project file getting out of hand? Do you wish the XML format supported more advanced string manipulation? Do you find yourself typing out multiple <set> tags to simulate the “or” operator? Have you ever tried and failed to copy a directory with the <template> tag?

If you answered “yes” to any of these questions, HXP files may be right for you. Consult your physician today!

So anyway, what are HXP files? The short answer is, they’re HX files with an extra P.

The P stands for “Project,” so the long answer is, they’re project files which let you use Haxe instead of XML. You can write functions, manipulate strings, declare variables, and even access Haxe’s standard library.

Let’s try it out!

Ok, find a sample project that uses an HXP file, and… oh. There aren’t any, are there?

Fine, we’ll use DisplayingABitmap. I’ve created an HXP version of the project file, available here. Save it as project.hxp, delete project.xml, and you’re ready to give it a try.

Lime recognizes the file format, so just run lime test neko and it’ll find the new project file. It should show the OpenFL logo in the middle of the screen.

Congratulations! Through no fault of your own, you have gotten an HXP project file to work!

Converting your XML file to HXP

Running a sample project is all well and good, but what about that much larger project file you want to convert? Or perhaps that large file that you don’t want to convert because it will be so much work, but you have to convert anyway? Sometimes it’s the latter.

To make your job easier, I recommend copying over these functions. With those in place, here’s a conversion guide:

- <set name="godMode" />
+ defines.set("godMode", 1);
+ //Or if you only need it for conditionals:
+ var godMode:Bool = true;

- <setenv name="DARK_AND_STORMY_NIGHT" />
+ setenv("DARK_AND_STORMY_NIGHT", "1");
+ //That second value must be a string and cannot be skipped.

- <setenv name="TRUE" value="FALSE" />
+ setenv("TRUE", "FALSE");
+ //Please do not do this.

- <haxedef name="Robert'); DROP TABLE Students;--" />
+ haxedefs["Robert'); DROP TABLE Students;--"] = 1;
+ //Please do not do this either.

- <haxedef name="dump=pretty" />
+ haxedefs["dump"] = "pretty";

- <haxeflag name="-dce" value="std" />
+ haxeflags.push("-dce std");

- <section if="flash"> </section>
+ if(target == Platform.FLASH) { }

- <section if="godMode" unless="mobile"> </section>
+ if(defines.exists("godMode") && platformType != PlatformType.MOBILE) { }

- <ndll name="regexp" />
+ ndll("regexp");

- <assets path="assets/text" rename="text" include="*.txt|*xml" exclude="*.rtf" />
+ includeAssets("assets/text", "text", ["*.txt", "*.xml"], ["*.rtf"]);

- <certificate path="correcthorse.keystore" password="batterystaple" alias="AzureDiamond" alias-password="hunter2" />
+ certificate = new Keystore("correcthorse.keystore", "batterystaple", "AzureDiamond", "hunter2");

- <dependency name="GooglePlayServices" path="pathtoGooglePlayServices" />
+ dependency("GooglePlayServices", "pathtoGooglePlayServices");

- <dependency name="GameKit.framework" />
+ dependency("GameKit.framework");

- <dependency path="/Library/Frameworks/LameKit.framework" />
+ dependencyByPath("/Library/Frameworks/LameKit.framework");

- <include path="industrial/application.xml" />
+ merge(new ProjectXMLParser("industrial/application.xml", defines));

- <template path="templates/CureForCancer.java" rename="src/com/jir/research/CureForCancer.java" />
+ templateFile("templates/CureForCancer.java", "src/com/jir/research/CureForCancer.java");

- <template path="templates/" rename="src/com/jir/research/" />
+ templatePaths.push("templates/");

Those are most of the existing tags, but if I missed one, you can always refer to the source.

Template directories

I also tossed in a new feature: template directories that actually do what you expect them to do.

If you’ve ever tried passing a folder to <template path="x">, you’ve probably been very disappointed. While Lime is quite good at taking a single file, processing it as a template, and putting it where you want, it absolutely refuses to do that with directories.To be specific, if you specify a directory, Lime expects to find a folder structure exactly matching its own template directory. A subdirectory for each target, and a very specific arrangement of files within those subdirectories.I guess that can be useful, but it really isn’t what you’d expect. If you can copy individual files into place…

<template path="templates/SourceFile0.java" rename="src/com/example/SourceFile0.java" />
<template path="templates/SourceFile1.java" rename="src/com/example/SourceFile1.java" />
<template path="templates/SourceFile2.java" rename="src/com/example/SourceFile2.java" />

…Why can’t you do it in bulk?

<template path="templates/" rename="src/com/example/" />

I honestly don’t know why XML projects work that way, but I do know you can make it work in HXP!

templateDirectory("templates/", "src/com/example/");

One step at a time

If this is too much work to do all at once, then keep your old project file around, and import it into the new one:

merge(new ProjectXMLParser("old_application.xml", defines));

With that done, you’re free to move items over one at a time. Just be warned that some features, such as the <template> tag for folders, may quit working correctly. If this happens, focus on converting them first.

Merged changes

HashLink update

Pending changes

Submodule update

Looking forward

Choosing the right tool for the job

Demo project

Using Thread

Thread safety basics

Using Future

Using ThreadPool

Green/virtual threads

About BackgroundWorker and ThreadPool

Web workers

Workers use source code

Workers are isolated

Passing messages

Returning results

Source-to-source compilation

Starting an iOS app

Shared libraries

Accessing Android/iOS features from Haxe

Why I wrote this post

Implementing NO_SCALE

Implementing EXACT_FIT

Implementing NO_BORDER

Implementing SHOW_ALL

Using `Thread`

Using `Future`

Using `ThreadPool`

About `BackgroundWorker` and `ThreadPool`