The Short Version
This post is inspired by the famous-to-me writeup on Building a self-contained game in C# under 8 kilobytes by Michal Strehovský. I love the idea of optimising and wanted to learn by following in his footsteps. So I took another simple and common game, Wordle, and produced a self-contained .exe that I could begin shrinking. The results were: 62,091 KB at the start and 1,011 KB at the end.
This massive reduction of ~98% was achieved via code improvements from IL inspection, compiler flags, linker flags, and more. If you want the longer version, read on.
I started out this little project with these goals in mind:
- Make a fun little game in C#
- Make it self-contained
- Shrink it to 1MB or under
I've always liked optimisation. A couple of examples:
- Teasing out efficiencies with micro-optimisations in C#
- Making sure the least number of utensils/dishes are dirtied while making food
- Factorio and Satisfactory
I ran across this fantastic post about making a fully self-contained C# Snake game in 8KB and thought "I want to do that!".
The 8KB game of Snake by Michal Strehovský uses some really cool things I didn't know were possible and I'll be emulating his success up until he replaces the base class libraries.
Or in other words, this goes beyond "just use <TrimMode>link</TrimMode> to shrink your .exe", but not as far as rewriting deep and core .NET functionality. To shrink our .exe we will be playing with the experimental native AOT, shaving off safeties, and making fewer guarantees. Or if we think of this as a car, we'd be getting rid of the spare tire, removing the doors, and throwing the seats out.
The self-contained part means this .exe will be able to run on a machine without .NET installed. That means all the dependencies and the runtime have to come along and be embedded in it, which is a lot of heft.
What to apply this fun knowledge to? The inspirational post uses the classic game, Snake. I wanted something different, but also a game that's even simpler to code. Turns out that at the time of writing, Wordle is well popular across the internet and great to write. It's a simple word game where the user gets six chances to guess a five-letter word. Each attempt will tell the user if a letter is:
- Correct, but in the wrong place
From this, the user deduces what the word is.
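The feedback rule described above can be sketched in C#. This is a minimal illustration and not the code from the TinyWordle repo: the names LetterResult and WordleScorer are hypothetical, and it assumes the answer is lowercase a-z. It uses the usual two-pass approach so that repeated letters in a guess are only flagged as many times as they appear in the answer.

```csharp
using System;

// Hypothetical names for illustration; not taken from the TinyWordle repo.
public enum LetterResult { Absent, Present, Correct }

public static class WordleScorer
{
    // Scores a guess against the answer. Assumes the answer is lowercase a-z.
    public static LetterResult[] Score(string guess, string answer)
    {
        if (guess.Length != answer.Length)
            throw new ArgumentException("Guess and answer must be the same length.");

        var result = new LetterResult[guess.Length]; // every slot starts as Absent
        var remaining = new int[26];                  // unmatched answer letters, 'a'..'z'

        // Pass 1: mark exact matches and count the answer letters left unmatched.
        for (int i = 0; i < guess.Length; i++)
        {
            if (guess[i] == answer[i])
                result[i] = LetterResult.Correct;
            else
                remaining[answer[i] - 'a']++;
        }

        // Pass 2: right letter in the wrong place, limited by what is left over.
        for (int i = 0; i < guess.Length; i++)
        {
            if (result[i] == LetterResult.Correct) continue;
            int index = guess[i] - 'a';
            if (index >= 0 && index < 26 && remaining[index] > 0)
            {
                result[i] = LetterResult.Present;
                remaining[index]--;
            }
        }

        return result;
    }
}
```

For example, scoring the guess "allee" against "apple" marks the first "l" as present and the second "l" as absent, because the answer only contains one "l".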
A little bit of tinkering later, and the original TinyWordle version was born. It has various inefficiencies, dead code, and code smells, but I needed a starting point somewhere and this version worked as expected.
Feel free to check out the repo, however it's in a much more raw format compared to this post as I wanted to use it both as the experiment and notes for said experiment. You will see many recorded attempts which did nothing to help (there were also many more attempts left unrecorded) as they served as reminders for the next time.
I wanted to preserve each major attempt so I could quickly refer back to them, like an animator flipping back to the page underneath, which is why you'll see these attempts as separate folders rather than branches.
Each of these attempts makes an effort to link back to relevant documentation. Esoteric compiler/linker knowledge can be frustrating to re-find and, to be honest, I don't fully understand a couple of points, but they'll be linked there waiting to be understood in the future.
From here, I'll talk about the defining attempts, how much they helped, and why they helped. As mentioned earlier, this will be in a slightly different and more readable format compared to the repo readme, skipping or collating various attempts.
Each attempt will be published in Release mode for 64-bit Windows machines.
The original codebase was a fun afternoon of working out how to code Wordle. It's pretty close to the original sans having a list of legal words - which means "asfgp" would be a valid guess word. I decided on pre-loading seven words which get randomly selected so a player could have a potential week's worth of TinyWordle (but might get repeats, but oh well ¯\_(ツ)_/¯).
I learned a little about making console games from this post around Console Games - Snake.
It was also a great excuse to get on using the new dotnet commands. For instance, the project started out as a simple dotnet new console. Then I set up the .csproj file with the following:
<PropertyGroup>
  <OutputType>Exe</OutputType>
  <TargetFramework>net6.0</TargetFramework>
  <ImplicitUsings>enable</ImplicitUsings>
  <Nullable>enable</Nullable>
  <PublishSingleFile>true</PublishSingleFile>
</PropertyGroup>
With the focus on <PublishSingleFile>true</PublishSingleFile> which gives us our final single file.
dotnet publish -r win-x64 -c Release
Total binary size: 62,091 KB
Coming in at 62,091 KB, the race to shrink the self-contained app is on. (Who am I racing? Just me I guess, but the whole experience felt like a race).
Attempt 1: Trimming Low Hanging Fruit
Trimming. If you look up how to shrink a .NET binary this is where you will probably end up. This is also where a lot of other write ups stop because it works really well and it's really simple. By just adding the following to your .csproj file you could get massive size improvements:
<PropertyGroup>
  <PublishTrimmed>true</PublishTrimmed>
</PropertyGroup>
In short, trimming is when unused code is removed. Since in this case the whole needed .NET is coming with us, there's a lot of stuff that can go, for example: LINQ. However, do watch out as this does not work in all scenarios and only works by static analysis. Check the incompatibilities documentation for more on the caveats. Thankfully in this case, the code is simple and takes well to trimming.
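To make the "only works by static analysis" caveat concrete, here is a small hedged sketch. TrimmingCaveat and its methods are hypothetical names, not part of TinyWordle. The first method references a type directly, so the trimmer can see it is needed. The second only learns the type name at run time, which is the kind of pattern trimming cannot reason about.

```csharp
#nullable enable
using System;

// Hypothetical illustration; not code from the TinyWordle repo.
public static class TrimmingCaveat
{
    // Direct reference: static analysis can see StringBuilder is used and keeps it.
    public static object CreateStatically() => new System.Text.StringBuilder();

    // The type name only exists as a string at run time, so static analysis
    // cannot tell which type is required. In a trimmed app the type may have
    // been removed, in which case this returns null.
    public static object? CreateByName(string typeName)
    {
        Type? type = Type.GetType(typeName);
        return type is null ? null : Activator.CreateInstance(type);
    }
}
```

In an untrimmed build, CreateByName("System.Text.StringBuilder") finds the type. In a trimmed build the outcome depends on whether anything else kept that type alive, which is why reflection-heavy code needs extra care.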
The resulting binary size is much smaller.
dotnet publish -r win-x64 -c Release
Total binary size: 11,189 KB
Or 50,902 KB smaller than the original.
Attempt 3: Trimming the Game Code
Remember, I'll be skipping my more useless attempts, so we're jumping straight to attempt 3. Turns out the game's own assembly can be marked as trimmable too, which allows more code to be trimmed. This is the first of the steps I don't fully understand, but feel free to read the documentation on trimming additional assemblies.
<ItemGroup>
  <TrimmableAssembly Include="TinyWordle" />
</ItemGroup>
dotnet publish -r win-x64 -c Release
Total binary size: 11,173 KB
A whopping -16 KB.
Attempt 4: The Experimental Native AOT
This is some cool stuff and it's a fork of CoreCLR featured in the Snake post. It produces your binary as native machine code ready for a given environment with all libraries folded in together. This is opposed to creating IL bytecode that is compiled to machine code at run time by the JIT compiler. Read all about it from the runtime lab feature branch for Native-AOT.
It's a simple NuGet package (though not served from NuGet.org) install to get this up and running and the docs are more than helpful. It's great to see this so well documented even if it's an experiment.
You do however have to install the C++ Development module of Visual Studio for this to work. I also had to remove the PublishSingleFile element from the .csproj file otherwise the publish fails when using this.
After installing, and with no code changes:
dotnet publish -r win-x64 -c Release
Total binary size: 4,348 KB
That's -6,825 KB for zero effort. Impressive.
Attempt 5: Uncultured
Looking at the root documentation it looks like we can optimise! Our first flag is InvariantGlobalization, which strips away all the culture data, string specifics, datetime formats, and more.
<PropertyGroup>
  <InvariantGlobalization>true</InvariantGlobalization>
</PropertyGroup>
dotnet publish -r win-x64 -c Release
Total binary size: 4,127 KB
A tidy -221 KB removed.
Attempt 6: Optimise for Small
A great straightforward flag called IlcOptimizationPreference which can be set to optimise for file size.
<PropertyGroup>
  <IlcOptimizationPreference>Size</IlcOptimizationPreference>
</PropertyGroup>
dotnet publish -r win-x64 -c Release
Total binary size: 4,058 KB
A nice -69 KB.
Attempt 7: Removing Identical Code
Next up is setting IlcFoldIdenticalMethodBodies which, according to the docs, can get a bit weird with stack traces as functions could point to unexpected function locations.
<PropertyGroup>
  <IlcFoldIdenticalMethodBodies>true</IlcFoldIdenticalMethodBodies>
</PropertyGroup>
dotnet publish -r win-x64 -c Release
Total binary size: 3,859 KB
Keeping steady with another -199 KB.
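As a rough picture of what "identical method bodies" means, consider the hypothetical pair below (FoldingSketch is an illustrative name, not from TinyWordle). Both methods do the same work on the same parameter types, so they are the kind of candidates a compiler could keep as a single copy of machine code. Whether these two would actually be folded is up to the compiler and is an assumption here.

```csharp
// Hypothetical illustration; not code from the TinyWordle repo.
public static class FoldingSketch
{
    // Two differently named methods whose bodies do exactly the same thing.
    public static int AreaOfRectangle(int width, int height) => width * height;

    public static int TotalPrice(int quantity, int unitPrice) => quantity * unitPrice;
}
```

If both names end up sharing one body, a stack trace that passes through TotalPrice could plausibly be reported as AreaOfRectangle, which is the weirdness the docs warn about.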
Attempt 8: Disabling Reflection
A big one with potentially big consequences too. Setting IlcDisableReflection to true disables all the reflection-based metadata generation. .NET uses a lot of reflection to get around and this flag bins everything except the most basic reflection calls such as typeof, so use this one at your own risk.
<PropertyGroup>
  <IlcDisableReflection>true</IlcDisableReflection>
</PropertyGroup>
dotnet publish -r win-x64 -c Release
Total binary size: 1,167 KB
Best shrinkage for a while, coming in at -2,692 KB.
Note: Graphs moving onwards will be zoomed in.
Attempt 10: Who Needs the Stack Trace?
Moving along to attempt 10, IlcGenerateStackTraceData is set to false. Good luck getting easily understandable and meaningful stack traces now.
<PropertyGroup>
  <IlcGenerateStackTraceData>false</IlcGenerateStackTraceData>
</PropertyGroup>
dotnet publish -r win-x64 -c Release
Total binary size: 1,038 KB
A cute -129 KB. Meaning we're getting really close to sub 1024 KB (or 1MB).
From here, the attempts start to shave off the slimmest of margins and, in the GitHub readme, each attempt begins to be comprised of many sub-attempts.
Attempt 11: Modifying the Original Code
Seeing as a lot of flags had been used up, it was time to turn inwards and see what code I as a developer could remove. Seeing as I was only 14 KB away from my goal, this shouldn't be too hard... Right?
I began to use dnSpy to decompile and understand the resulting .dll version of the .exe. Why the .dll? Because when I opened the .exe it kept its secrets from me and I wasn't able to get to the code.
I began to understand that some core libraries in .NET cannot be trimmed and will be left in the final .exe regardless. Things like the garbage collector and other necessary code aren't even looked at for trimming.
However, other functions are. The following image shows the first attempt at removing unneeded functions from the string types by comparing references between the previous attempt (10) and the current attempt (11).
The space saving by removing the following calls from TinyWordle was decent:
Console.ReadKey(): -2 KB
Console.WriteLine(): -1 KB
String.ToLower(): -5 KB
String.Contains(): -512 B (just bytes)
Some of these choices sacrificed some usability. For example, removing .ToLower() means no upper case characters will ever match, but as long as our user doesn't use any upper case they'll be fine.
Then I opted to use the cool Random code from the Snake game, which removed the dependency on the Random class, saving another 2 KB.
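To give a feel for how small a hand-rolled random number generator can be, here is a hedged sketch. It is not copied from the Snake repo or from TinyWordle: TinyRandom is a hypothetical name, and the generator is a classic linear congruential generator using widely published constants. It is nowhere near as good as the built-in Random class, but picking one of seven words does not need much.

```csharp
// Hypothetical illustration of a tiny linear congruential generator (LCG).
// Not the code from the Snake repo or the TinyWordle repo.
public struct TinyRandom
{
    private uint _state;

    public TinyRandom(uint seed)
    {
        _state = seed;
    }

    // Advances the state: next = (a * state + c) mod 2^31.
    public uint Next()
    {
        // unchecked so the multiplication wraps even if overflow checking is enabled.
        _state = unchecked(1103515245u * _state + 12345u) % 2147483648u;
        return _state;
    }

    // Returns a value in the range [0, maxExclusive).
    public int Next(int maxExclusive)
    {
        return (int)(Next() % (uint)maxExclusive);
    }
}
```

A possible use would be seeding it from something like Environment.TickCount and calling Next(7) to pick an index into the seven preloaded words.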
All in all, the savings summed up to -10 KB from the
dotnet publish -r win-x64 -c Release
Total binary size: 1,028 KB
Attempts 12, 13, and 14: The Kitchen Sink
A single picture can sum up the big amount of attempted changes, and the far bigger amount of unwritten changes:
Let's start with what didn't work:
- Changing things from
- More flags
- Swapping out calls with other calls, such as
  - String.Contains() with my own implementation
  - Console.SetCursorPosition() to do all the console work
- Trying to trim
- Prevent compiler optimisations. For instance, sometimes a string types will invoke op_Equality(), but it seems even if this did happen, it's part of the base class that's never trimmed - maybe ¯\_(ツ)_/¯
- Removing references to
- Rewriting the code (awfully)
- Many, many more undocumented ideas
Then what did work?
- I noticed in dnSpy I could swap out Console.Write(string) calls by making all the .ToString(). -1 KB
- [MethodImpl(MethodImplOptions.AggressiveInlining)] but only on some methods. -1 KB
- Removing the manifest metadata. - 512 B
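For anyone who hasn't met the inlining attribute from the list above, here is a minimal sketch of how it is applied. InliningSketch and its methods are hypothetical and not taken from TinyWordle. The attribute is only a hint asking the compiler to paste the method body into its callers instead of emitting a call, so it can make a binary smaller or larger depending on the method, which matches it only helping on some methods.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical illustration; not code from the TinyWordle repo.
public static class InliningSketch
{
    // A tiny helper is a typical inlining candidate: the call overhead
    // can cost more than the body itself.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool IsLowerAsciiLetter(char c) => c >= 'a' && c <= 'z';

    // Counts the lowercase a-z characters in the text using the helper above.
    public static int CountLetters(string text)
    {
        int count = 0;
        foreach (char c in text)
        {
            if (IsLowerAsciiLetter(c)) count++;
        }
        return count;
    }
}
```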
The last one around removing manifest metadata was interesting as I'd never realised it existed before. If you open the .dll in Visual Studio it will show the manifest metadata.
This is an XML document that I don't think the .exe needs in order to run. I understand it can hold all sorts of things the .exe needs, such as requiring UAC, but that's not my problem for TinyWordle.
The original manifest looked like:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
  <assemblyIdentity version="1.0.0.0" name="MyApplication.app"/>
  <trustInfo xmlns="urn:schemas-microsoft-com:asm.v2">
    <security>
      <requestedPrivileges xmlns="urn:schemas-microsoft-com:asm.v3">
        <requestedExecutionLevel level="asInvoker" uiAccess="false"/>
      </requestedPrivileges>
    </security>
  </trustInfo>
</assembly>
After creating a custom one, it looked like... well, a 0 byte file.
Just needed to reference this in the .csproj:
<PropertyGroup>
  <ApplicationManifest>app.manifest</ApplicationManifest>
</PropertyGroup>
And the savings were made.
Oh and as a side note, remember to delete the obj folder after changing the manifest or else you might get this:
LINK : fatal error LNK1123: failure during conversion to COFF: file invalid or corrupt
All in all, after all of that we're at -2 KB. While the extra -512 B was visible in the file size, it didn't round in a way that reflected it.
dotnet publish -r win-x64 -c Release
Total binary size: 1,026 KB
It was around here that I became stuck. These three attempts and all of their sub attempts ran me out of ideas. Roughly two weeks of daily tinkering occurred.
I know the Snake guy went in and used some linker magic to remove base class implementations saving a huge amount of space... But I didn't understand enough and I wanted to prove that it was possible without breaking into deep command line and linker magic.
Attempt 15: We Did It
Adrift and lost for ideas, I trawled through the GitHub project for the Snake game and noticed, in the .csproj file, a reference to LinkerArg. Could there be a flag I could switch without having to know the deep magics?
Yes, and it's in the Microsoft C++ documentation as Linker options. There are a fair few of them and I read and tried them all. Luckily one flag worked.
/DYNAMICBASE:NO, to quote the documentation:
The /DYNAMICBASE option modifies the header of an executable image, a .dll or .exe file, to indicate whether the application should be randomly rebased at load time
It's on by default, so let's see what happens if we turn it off...
<ItemGroup>
  <LinkerArg Include="/DYNAMICBASE:NO" />
</ItemGroup>
dotnet publish -r win-x64 -c Release
Total binary size: 1,011 KB
TIME! After two weeks of fruitless trying, TinyWordle is finally under 1024 KB thanks to the -15 KB from removing address randomisation. And that's it, that's the goal met. 🎊
Celebration, Takeaways, and Conclusion 🥳
I was elated to finally get under the 1MB goal. Originally I had no idea if it was even feasible without going into replacing the base libraries. Maybe if I attempt this in a few more years I'll have more knowledge to make it even tinier.
The full graph and table of data:
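The figures quoted throughout this post can be tabulated with a few lines of C#. The sizes below are copied from the publish results above. SizeReport and its member names are hypothetical and only exist for this sketch.

```csharp
using System;

// Hypothetical helper for tabulating the results quoted in this post.
public static class SizeReport
{
    // (step, size in KB) pairs taken from the publish results in this post.
    public static readonly (string Step, int SizeKb)[] Steps =
    {
        ("Original", 62091),
        ("1: PublishTrimmed", 11189),
        ("3: TrimmableAssembly", 11173),
        ("4: Native AOT", 4348),
        ("5: InvariantGlobalization", 4127),
        ("6: IlcOptimizationPreference", 4058),
        ("7: IlcFoldIdenticalMethodBodies", 3859),
        ("8: IlcDisableReflection", 1167),
        ("10: IlcGenerateStackTraceData", 1038),
        ("11: Code changes", 1028),
        ("12-14: Kitchen sink", 1026),
        ("15: /DYNAMICBASE:NO", 1011),
    };

    // Overall reduction from the first step to the last, as a percentage.
    public static double TotalReductionPercent()
    {
        int first = Steps[0].SizeKb;
        int last = Steps[Steps.Length - 1].SizeKb;
        return 100.0 * (first - last) / first;
    }

    // Prints each step with its size and the change from the previous step.
    public static void Print()
    {
        for (int i = 0; i < Steps.Length; i++)
        {
            int delta = i == 0 ? 0 : Steps[i].SizeKb - Steps[i - 1].SizeKb;
            Console.WriteLine($"{Steps[i].Step,-34}{Steps[i].SizeKb,8:N0} KB{delta,10:N0} KB");
        }
        Console.WriteLine($"Total reduction: {TotalReductionPercent():F1}%");
    }
}
```

Going from 62,091 KB to 1,011 KB is a drop of 61,080 KB, which is where the ~98% in the short version comes from.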
I learned a lot. Whether it was having a better understanding of IL compilation, knowing more of what happens when I hit "build" in Visual Studio, or just knowing some of the available levers to flip. Almost like that stereotypical school frog dissection to see how it works. Prodding and poking at one place causes another to move. 🐸
Now and again I also used SharpLab to see what IL would pop out if I tried something. Then I'd remember I had dnSpy, which meant I could try it for real, right there and then, with my own code.
Where to from here? It would probably be a clean rewrite as a single file then replacing the base class libraries just as the Snake post did. Maybe that will be a post one day when/if I get to properly understanding that.
Oh and the graphs were generated from chart.xkcd.
Thank you for reading this far. This has been the most fun little project I've done in a while, for something I really enjoy doing.