Core PDF document management covers the operations that appear in nearly every PDF workflow: opening or creating a document, accessing pages and their content, reading and writing annotations, extracting text, and working with interactive actions. Aspose.PDF FOSS for .NET provides a .NET 8+ API that handles all of these tasks through a consistent object model centered on the Document and Page types.
Document Lifecycle: Create, Open, Save
Every workflow starts with either creating a new document or loading an existing one. Document.Create() returns a new, empty Document instance. Document.Open(data) accepts a byte[] or Stream and parses the PDF structure:
using var doc = Document.Create();
doc.Pages.Add();
var page = doc.Pages[1];
var action = PdfAction.CreateUri("https://aspose.com");
page.Annotations.AddLinkAnnotation(new Rectangle(50, 700, 200, 720), action);
using var ms = new MemoryStream();
doc.Save(ms);
ms.Position = 0;
using var doc2 = Document.Open(ms.ToArray());
var annot = (LinkAnnotation)doc2.Pages[1].Annotations[1];
Console.WriteLine(annot.Uri); // https://aspose.com
This snippet creates a new document, adds a page, attaches a link annotation to page 1, saves to a MemoryStream, reloads the document from the saved bytes, and verifies that the annotation is preserved after reload. The Document.Save(Stream) overload writes the full updated structure; Document.ToArray() returns the bytes directly.
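As a rough sketch of the byte-array path, the following round-trips a document through Document.ToArray() and Document.Open(byte[]) without an intermediate stream. It uses only members named in this article; the exact signatures and return types are assumptions taken from the text, and the using directive for the library namespace is left as a placeholder because the article does not state it.

```csharp
using System;
// Add the using directive for the Aspose.PDF FOSS namespace here;
// the article does not state which namespace the types live in.

using var doc = Document.Create();
doc.Pages.Add();

// Assumption: ToArray() returns the serialized PDF as a byte[].
byte[] bytes = doc.ToArray();

// Reopen from the in-memory bytes, as Document.Open(data) is described above.
using var reopened = Document.Open(bytes);
Console.WriteLine($"Serialized {bytes.Length} bytes, {reopened.Pages.Count} page(s) after reload");
```

This form is convenient when the bytes go straight to a database column or an HTTP response body rather than to a file.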
Page Access and the Pages Collection
Pages are accessed through the Pages property using a 1-based integer index: doc.Pages[1] returns the first page, doc.Pages[doc.Pages.Count] returns the last. Each Page object exposes:
- Annotations: the AnnotationCollection for that page, supporting AddLinkAnnotation and enumeration of existing annotations by type.
- Content stream operators: low-level access to page drawing commands via the Operators collection.
The 1-based index is consistent throughout the API: annotation lookups, GoTo action destination page indices, and all other page references use the same convention.
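To make the 1-based convention concrete, here is a minimal sketch of a loop over every page. It assumes only the members already mentioned in this article (Pages.Add, Pages.Count, the Pages indexer, and Annotations.Count); the namespace import is a placeholder because the article does not state it.

```csharp
using System;
// Add the using directive for the Aspose.PDF FOSS namespace here;
// the article does not state which namespace the types live in.

using var doc = Document.Create();
doc.Pages.Add();
doc.Pages.Add();

// Valid indices run from 1 to Pages.Count inclusive, so the loop
// starts at 1 and uses <= rather than the usual 0 and <.
for (int i = 1; i <= doc.Pages.Count; i++)
{
    var page = doc.Pages[i];
    Console.WriteLine($"Page {i}: {page.Annotations.Count} annotation(s)");
}
```

Starting the loop at 0 out of habit is the most likely mistake here, since every other .NET collection indexer is 0-based.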
Text Extraction with TextFragmentAbsorber
TextFragmentAbsorber is the standard entry point for reading text from a PDF page. Use the no-argument constructor to extract all fragments, or pass a search phrase to filter to matching text:
using var doc = Document.Open("path/to/document.pdf");
var absorber = new TextFragmentAbsorber("Hello");
absorber.Visit(doc.Pages[1]);
foreach (var fragment in absorber.TextFragments)
{
    Console.WriteLine($"Found: {fragment.Text} size={fragment.FontSize}");
}
Call absorber.Visit(page) to process a single page. The resulting TextFragments collection contains TextFragment objects; each exposes Text (the string content) and FontSize (the rendered point size). A second overload TextFragmentAbsorber(searchPhrase, isRegex) accepts a regular expression flag for pattern-based search.
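The regex overload could be used along these lines. This is a sketch under assumptions, not a confirmed usage: it assumes the second constructor argument is a bool that switches on regular-expression matching, and it creates a fresh absorber per page because the article does not say whether results accumulate across Visit calls. FindDates is a hypothetical helper name, and the namespace import is a placeholder.

```csharp
using System;
using System.Collections.Generic;
// Add the using directive for the Aspose.PDF FOSS namespace here;
// the article does not state which namespace the types live in.

// Hypothetical helper: collects the text of every fragment that matches
// an ISO-style date (for example 2024-01-31) on any page of the document.
static List<string> FindDates(Document doc)
{
    var matches = new List<string>();
    for (int i = 1; i <= doc.Pages.Count; i++)
    {
        // Assumption: passing true as isRegex treats the phrase as a regex.
        var absorber = new TextFragmentAbsorber(@"\d{4}-\d{2}-\d{2}", true);
        absorber.Visit(doc.Pages[i]);
        foreach (var fragment in absorber.TextFragments)
        {
            matches.Add(fragment.Text);
        }
    }
    return matches;
}

using var doc = Document.Create();
doc.Pages.Add();
var dates = FindDates(doc);
Console.WriteLine($"Matches found: {dates.Count}");
```

In practice the document would come from Document.Open rather than Document.Create; a blank page is used here only so the sketch needs no input file.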
Interactive Actions
PdfAction is the factory class for all standard PDF action types. Four static factory methods cover the most common cases:
- PdfAction.CreateUri(uri): opens a URL in the user’s browser.
- PdfAction.CreateGoTo(pageIndex, fitType): navigates to a specific page within the document.
- PdfAction.CreateJavaScript(script): executes a JavaScript string in the PDF viewer.
- PdfAction.CreateLaunch(filePath): launches an external file or application.
Actions are attached to pages through annotations. AnnotationCollection.AddLinkAnnotation(rectangle, action) places a LinkAnnotation over the specified rectangular region with the given action attached:
using var doc = Document.Create();
doc.Pages.Add();
var page = doc.Pages[1];
var action = PdfAction.CreateUri("https://example.org");
var annot = page.Annotations.AddLinkAnnotation(new Rectangle(10, 10, 100, 30), action);
Console.WriteLine(page.Annotations.Count); // 1
Console.WriteLine(annot.AnnotationType); // Link
AnnotationType.Link confirms the annotation was created as a link type. The Rectangle constructor takes left, bottom, right, and top coordinates in PDF user space units.
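The same pattern should apply to the other factory methods. As a hedged sketch, the following attaches a JavaScript action instead of a URI action. It assumes CreateJavaScript takes the script source as a string and that its result can be passed to AddLinkAnnotation in the same way as a URI action; the namespace import is a placeholder because the article does not state it.

```csharp
using System;
// Add the using directive for the Aspose.PDF FOSS namespace here;
// the article does not state which namespace the types live in.

using var doc = Document.Create();
doc.Pages.Add();
var page = doc.Pages[1];

// Assumption: CreateJavaScript accepts the script as a plain string.
// app.alert is part of the Acrobat JavaScript API; other viewers may ignore it.
var action = PdfAction.CreateJavaScript("app.alert('Hello from page 1');");

// Rectangle arguments are left, bottom, right, top in PDF user space units.
var annot = page.Annotations.AddLinkAnnotation(new Rectangle(10, 40, 100, 60), action);
Console.WriteLine(annot.AnnotationType);
```

Whether the script actually runs depends on the viewer: many PDF readers disable or restrict JavaScript by default.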
Document Metadata and Licensing
The Document class exposes IsLicensed (returns true in the FOSS build), JavaScript (a JavaScriptCollection for document-level scripts), and DocumentActions for lifecycle events such as BeforeClosing and BeforeSaving.
The FOSS package is installed via NuGet:
dotnet add package Aspose.Pdf.Foss --version 0.1.0-alpha
No license key is required. The MIT license permits use in commercial applications without attribution requirements beyond preserving the license notice.