How to Natively Read .tgz Files With the New C# TarReader Class
As of .NET 7, we can natively decompress and unpack .tgz/.tar.gz archives without third-party libraries or complex code.
We now have the System.Formats.Tar namespace, which allows us to natively interact with .tar files.
This was a fantastic API proposal by Carlos Sanchez, and it aligns with making .NET cross-platform, as *nix-based systems often deal with tarballs for archiving.
Since just reading .tar files is literally in the documentation, let's mix this new API with the existing GZipStream API to open another commonly seen *nix file type: the compressed, GZipped version of a .tar, called .tgz.
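using System;
using System.Formats.Tar;
using System.IO;
using System.IO.Compression;

// Decompress the gzip layer into memory so the tar data ends up in a seekable stream.
using var gzip = new GZipStream(tgzStream, CompressionMode.Decompress);
using var unzippedStream = new MemoryStream();
await gzip.CopyToAsync(unzippedStream);
unzippedStream.Seek(0, SeekOrigin.Begin);

// Walk the archive; GetNextEntry() returns null once there are no more entries.
using var reader = new TarReader(unzippedStream);
while (reader.GetNextEntry() is TarEntry entry)
{
    Console.WriteLine($"Entry name: {entry.Name}, entry type: {entry.EntryType}");
    //entry.ExtractToFile(destinationFileName: Path.Join("D:/MyExtractionFolder/", entry.Name), overwrite: false);
}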
Notes
- tgzStream is any Stream object that represents the .tgz file.
- Why copy over the GZipStream to a MemoryStream? We can't seek the GZipStream.
- The commented line is ripped straight from the documentation and is how to save the iterated file from the archive.
And finally, the output of the WriteLine() calls above:
[Image: console output listing each entry's name and type]
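If you don't have a .tgz handy, the same steps can be tried end to end in memory. The sketch below is only an illustration: BuildSampleTgz and ListEntryNamesAsync are hypothetical helper names, and the two file names and their contents are made up. It builds a tiny archive with TarWriter and GZipStream, then lists it with the same decompress, copy, seek and read sequence used earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Formats.Tar;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Threading.Tasks;

// Build a small .tgz in memory, then list it with the same steps as the post.
using var sampleTgz = BuildSampleTgz();
List<string> names = await ListEntryNamesAsync(sampleTgz);
Console.WriteLine(string.Join(", ", names));

// Hypothetical helper: writes two made-up text files into a gzip-compressed tar.
static MemoryStream BuildSampleTgz()
{
    var tgz = new MemoryStream();
    using (var gzip = new GZipStream(tgz, CompressionMode.Compress, leaveOpen: true))
    using (var writer = new TarWriter(gzip))
    {
        foreach (var (name, text) in new[] { ("package/readme.txt", "hello"), ("package/data.txt", "world") })
        {
            var entry = new PaxTarEntry(TarEntryType.RegularFile, name)
            {
                DataStream = new MemoryStream(Encoding.UTF8.GetBytes(text))
            };
            writer.WriteEntry(entry);
        }
    } // disposing the writer and the GZipStream flushes everything into tgz
    tgz.Seek(0, SeekOrigin.Begin); // rewind so a reader starts at the first byte
    return tgz;
}

// Hypothetical helper: decompress, buffer, rewind, then iterate the tar entries.
static async Task<List<string>> ListEntryNamesAsync(Stream tgzStream)
{
    using var gzip = new GZipStream(tgzStream, CompressionMode.Decompress);
    using var unzippedStream = new MemoryStream();
    await gzip.CopyToAsync(unzippedStream);
    unzippedStream.Seek(0, SeekOrigin.Begin);

    using var reader = new TarReader(unzippedStream);
    var entryNames = new List<string>();
    while (reader.GetNextEntry() is TarEntry entry)
    {
        entryNames.Add(entry.Name);
    }
    return entryNames;
}
```

Buffering into a MemoryStream keeps the sketch close to the original code; for very large archives that copy holds the whole decompressed tar in memory, so treat this as a starting point rather than a tuned implementation.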
I also really like how the while loop is written. Originally I had a loose TarEntry? variable above the loop to hold onto the iteration value, but thanks to newer pattern matching we can put it all on the same line! Inspiration from Nick Craver.
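To make the difference concrete, here is a small side-by-side sketch. It is not Nick's original snippet, and it uses StringReader.ReadLine() as a stand-in for TarReader.GetNextEntry() purely so it runs without an archive; both methods return null when there is nothing left, which is what the `is` pattern relies on. The sample text is made up.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

var text = "alpha\nbeta\ngamma";

// Older style: a loose nullable variable declared outside the loop.
var oldStyle = new List<string>();
using (var reader = new StringReader(text))
{
    string? line;
    while ((line = reader.ReadLine()) != null)
    {
        oldStyle.Add(line);
    }
}

// Pattern style: the type pattern only matches non-null values,
// so the loop ends on null and `line` is scoped to the loop itself.
var patternStyle = new List<string>();
using (var reader = new StringReader(text))
{
    while (reader.ReadLine() is string line)
    {
        patternStyle.Add(line);
    }
}

Console.WriteLine(string.Join(",", oldStyle));
Console.WriteLine(string.Join(",", patternStyle));
```

Both loops collect the same three lines; the second one just avoids the extra declaration.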
Troubleshooting
You might run into the following:
System.IO.InvalidDataException: 'Found truncated data while decoding.'
You may need to seek the tgzStream variable to the beginning before using it in the code above, like so:
tgzStream.Seek(0, SeekOrigin.Begin);
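If the stream comes from code you don't control, a small guard can do the rewind for you. This is only a sketch: RewindIfPossible is a hypothetical helper name, and the three bytes written here are just a placeholder to move the position, not a real archive. It assumes the stream's Position is no longer at the start, for example because something has already written to it or read from it.

```csharp
using System;
using System.IO;

// Simulate a stream that was just filled, leaving Position at the end.
var tgzStream = new MemoryStream();
var placeholder = new byte[] { 0x1f, 0x8b, 0x08 };
tgzStream.Write(placeholder, 0, placeholder.Length);
long positionBefore = tgzStream.Position;

RewindIfPossible(tgzStream);
long positionAfter = tgzStream.Position;

Console.WriteLine($"Before: {positionBefore}, after: {positionAfter}");

// Hypothetical helper: only seeks when the stream supports it,
// since calling Seek on a non-seekable stream throws NotSupportedException.
static void RewindIfPossible(Stream stream)
{
    if (stream.CanSeek && stream.Position != 0)
    {
        stream.Seek(0, SeekOrigin.Begin);
    }
}
```

For streams that cannot seek, such as a network response, the helper does nothing, so you would need to buffer the data yourself before handing it to GZipStream.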
To Conclude
I hope you found this post useful. I had to write it after I realised how easy it is nowadays to open up NPM .tgz payloads for my project to create Pokémon spritesheets.