How to Get the Standard Metadata for a PDF With iText7 and C#

In this post I'll show you how to read the standard set of metadata from a PDF with iText7 and C#, in a way that makes it clear what's happening and lets you copy the code straight into your codebase.

The following is a plug-and-play block of code that makes no assumptions about your codebase. There is most probably a more efficient way to fit it to your needs, such as passing more things around by reference or adding guard clauses. The example also uses the newer C# 8 using declaration syntax, but feel free to swap that out for the usual using statement syntax.

The code below takes the PDF as a byte[], opens it inside using declarations so that the stream, reader and document are all disposed, and extracts the standard metadata into a Dictionary<string, string> for you to consume. If you already have the PdfDocument object in your code, then your task is even easier.

// Namespaces this method needs
using System.Collections.Generic;
using System.IO;
using iText.Kernel.Pdf;

public Dictionary<string, string> GetStandardMetadata(byte[] pdf)
{
    var metadataDictionary = new Dictionary<string, string>();

    // Wrap the bytes in a stream and open the PDF for reading.
    // The using declarations dispose all three objects when the method exits.
    using var inputStream = new MemoryStream(pdf);
    using var reader = new PdfReader(inputStream);
    using var document = new PdfDocument(reader);

    // The document info object exposes the standard metadata fields
    var documentInfo = document.GetDocumentInfo();

    metadataDictionary.Add("Title", documentInfo.GetTitle());
    metadataDictionary.Add("Author", documentInfo.GetAuthor());
    metadataDictionary.Add("Subject", documentInfo.GetSubject());
    metadataDictionary.Add("Creator", documentInfo.GetCreator());
    metadataDictionary.Add("Producer", documentInfo.GetProducer());
    metadataDictionary.Add("Keywords", documentInfo.GetKeywords());

    return metadataDictionary;
}
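
If you already have an open PdfDocument, the stream and reader setup drops away. Here's a minimal sketch of that variant. The class and method names (PdfMetadataSketch, GetStandardMetadataFromDocument) are made up for illustration and are not part of iText, and I'm assuming the caller opened the document and will dispose of it.

```csharp
using System.Collections.Generic;
using iText.Kernel.Pdf;

// Hypothetical helper class, named for this example only.
public static class PdfMetadataSketch
{
    // Assumption: the caller owns the document, so nothing is disposed here.
    public static Dictionary<string, string> GetStandardMetadataFromDocument(PdfDocument document)
    {
        var documentInfo = document.GetDocumentInfo();

        // Same six fields as the byte[] version above
        return new Dictionary<string, string>
        {
            ["Title"] = documentInfo.GetTitle(),
            ["Author"] = documentInfo.GetAuthor(),
            ["Subject"] = documentInfo.GetSubject(),
            ["Creator"] = documentInfo.GetCreator(),
            ["Producer"] = documentInfo.GetProducer(),
            ["Keywords"] = documentInfo.GetKeywords()
        };
    }
}
```

Leaving disposal to the caller keeps the helper safe to call in the middle of other work on the same document.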

Want to quickly spin this up to play with it? You can get your PDF byte[] via:

var pdf = File.ReadAllBytes(@"C:\document.pdf");
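
To try it end to end, something like the following console program should do. It's a sketch rather than a finished tool: it assumes the itext7 NuGet package is referenced, it repeats the method using the older using statement syntax plus a simple guard clause, and the "(not set)" placeholder is my own choice for fields that come back as null.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using iText.Kernel.Pdf;

public static class Program
{
    public static void Main()
    {
        // Same sample path as above; point it at a PDF that exists on your machine.
        var pdf = File.ReadAllBytes(@"C:\document.pdf");

        foreach (var entry in GetStandardMetadata(pdf))
        {
            // Assumption: a field missing from the PDF comes back as null.
            var value = entry.Value ?? "(not set)";
            Console.WriteLine(entry.Key + ": " + value);
        }
    }

    // Same logic as the method above, with classic using statements and a guard clause.
    private static Dictionary<string, string> GetStandardMetadata(byte[] pdf)
    {
        if (pdf == null || pdf.Length == 0)
        {
            throw new ArgumentException("PDF content must not be null or empty.", nameof(pdf));
        }

        using (var inputStream = new MemoryStream(pdf))
        using (var reader = new PdfReader(inputStream))
        using (var document = new PdfDocument(reader))
        {
            var documentInfo = document.GetDocumentInfo();

            return new Dictionary<string, string>
            {
                ["Title"] = documentInfo.GetTitle(),
                ["Author"] = documentInfo.GetAuthor(),
                ["Subject"] = documentInfo.GetSubject(),
                ["Creator"] = documentInfo.GetCreator(),
                ["Producer"] = documentInfo.GetProducer(),
                ["Keywords"] = documentInfo.GetKeywords()
            };
        }
    }
}
```

The dictionary is fully built before the using blocks exit, so nothing in the result depends on the disposed document.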