SummarizationProcessor

The SummarizationProcessor class enables you to generate concise summaries of PDF documents using Large Language Models (LLMs). It inherits from the abstract AIProcessorBase class, which provides common functionality for all AI processors. It automatically handles large documents by splitting them into smaller chunks when needed, making it suitable for documents of any size.

Public API

Properties

  • Settings: Gets or sets the settings that will be used for summarization.

Methods

  • Task<string> Summarize(ISimpleTextDocument document): Generates a summary of the provided document. The document parameter is an ISimpleTextDocument containing the text to be summarized.

Events

  • EventHandler<SummaryResourcesCalculatedEventArgs> SummaryResourcesCalculated: Raised before the actual summarization begins, providing an estimate of the resource usage. SummaryResourcesCalculatedEventArgs exposes three properties: EstimatedCallsRequired (the number of API calls required), EstimatedTokensRequired (the number of tokens to be processed), and ShouldContinueExecution (a Boolean flag indicating whether to proceed with summarization; it defaults to true for single-call operations and false for multi-call operations).
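
The estimates reported by SummaryResourcesCalculated can drive an automatic approve-or-cancel decision. The following is a minimal sketch of such a policy, not part of the Telerik API: ResourceEstimate is a hypothetical stand-in that mirrors only the three documented properties, SummaryBudget and its limits are illustrative names, and the integer property types are an assumption.

```csharp
using System;

// Hypothetical stand-in mirroring the three properties documented above.
// The real SummaryResourcesCalculatedEventArgs ships with the Telerik library;
// the int types used here are an assumption.
public sealed class ResourceEstimate
{
    public int EstimatedCallsRequired { get; set; }
    public int EstimatedTokensRequired { get; set; }
    public bool ShouldContinueExecution { get; set; }
}

// Illustrative helper (not a Telerik type) that encodes a simple budget policy.
public static class SummaryBudget
{
    // Returns true only when both estimates fit within the caller's limits.
    public static bool IsWithinBudget(int estimatedCalls, int estimatedTokens, int maxCalls, int maxTokens)
    {
        return estimatedCalls <= maxCalls && estimatedTokens <= maxTokens;
    }

    // Writes the decision into the flag that controls whether summarization proceeds.
    public static void Apply(ResourceEstimate e, int maxCalls, int maxTokens)
    {
        e.ShouldContinueExecution = IsWithinBudget(
            e.EstimatedCallsRequired, e.EstimatedTokensRequired, maxCalls, maxTokens);
    }
}
```

In a real handler, the same check could be applied to the event arguments directly, for example by assigning the result of SummaryBudget.IsWithinBudget to e.ShouldContinueExecution, provided the estimate properties are integers as assumed here.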

SummarizationProcessorSettings

The SummarizationProcessorSettings class provides configuration options for the summarization process:

  • PromptAddition: Gets or sets an addition for the prompt used for summarization. It can be used for clarification purposes.

[C#] Example 1: Configuring SummarizationProcessorSettings

// Create a summarization processor with settings
// (iChatClient and maxTokenCount are set up as shown in Example 2)
using (SummarizationProcessor summarizationProcessor = new SummarizationProcessor(iChatClient, maxTokenCount))
{
    // Configure the summarization settings
    summarizationProcessor.Settings.PromptAddition = "Focus on the key points and main arguments. ";

    // Rest of the code...
}

Usage Example

Example 2 below demonstrates how to use SummarizationProcessor to generate a summary of a PDF document. To set up the AI client as shown in that example, see the AI Provider Setup section.

Handling Large Documents

For large documents that exceed the token limit of the model, SummarizationProcessor automatically splits the document into smaller chunks and processes them separately:

  1. The document is split into chunks that fit within the model's token limit.
  2. Each chunk is summarized individually.
  3. The individual summaries are combined and sent for a final summarization.

This approach allows the processor to efficiently handle documents of any size, but it increases the number of API calls required. The SummaryResourcesCalculated event provides information about the expected resource usage, allowing you to decide whether to proceed with the operation.
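
The three steps above can be sketched in a few lines. This is an illustrative approximation, not Telerik's implementation: ChunkedSummarySketch, the four-characters-per-token heuristic, and the summarizeAsync delegate (standing in for one LLM call) are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ChunkedSummarySketch
{
    // Rough heuristic (assumption): about 4 characters per token.
    private const int CharsPerToken = 4;

    // Step 1: split the text into pieces that fit within the token limit.
    public static List<string> SplitIntoChunks(string text, int maxTokensPerChunk)
    {
        if (maxTokensPerChunk <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxTokensPerChunk));
        }

        int maxChars = maxTokensPerChunk * CharsPerToken;
        var chunks = new List<string>();
        for (int start = 0; start < text.Length; start += maxChars)
        {
            chunks.Add(text.Substring(start, Math.Min(maxChars, text.Length - start)));
        }
        return chunks;
    }

    // Steps 2 and 3: summarize each chunk, then summarize the combined partial summaries.
    // summarizeAsync is a hypothetical delegate standing in for a single LLM call.
    public static async Task<string> SummarizeAsync(
        string text, int maxTokensPerChunk, Func<string, Task<string>> summarizeAsync)
    {
        List<string> chunks = SplitIntoChunks(text, maxTokensPerChunk);
        if (chunks.Count <= 1)
        {
            // The text fits in one request, so a single call is enough.
            return await summarizeAsync(text);
        }

        var partialSummaries = new List<string>();
        foreach (string chunk in chunks)
        {
            partialSummaries.Add(await summarizeAsync(chunk));
        }

        // Final pass over the joined partial summaries.
        return await summarizeAsync(string.Join(Environment.NewLine, partialSummaries));
    }
}
```

In this sketch a document that splits into n chunks costs n + 1 calls, which is why the call estimate grows with document size. The sketch does not handle the case where the joined partial summaries themselves exceed the limit.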

[C#] Example 2: Using SummarizationProcessor

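using System;
using System.IO;
using System.Threading.Tasks;
using Azure.AI.OpenAI;
using Microsoft.Extensions.AI;
using OpenAI.Chat;
using Telerik.Windows.Documents.Fixed.FormatProviders.Pdf;
using Telerik.Windows.Documents.Fixed.Model;
// Also import the Telerik namespaces that define SummarizationProcessor and
// ISimpleTextDocument; their names depend on the installed Telerik packages.

// Returns Task rather than void so that callers can await the method and observe exceptions
public async Task SummarizeDocument()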
{
    // Load the PDF document
    string filePath = @"path\to\your\document.pdf";
    PdfFormatProvider formatProvider = new PdfFormatProvider();
    RadFixedDocument fixedDocument;

    using (FileStream fs = File.OpenRead(filePath))
    {
        fixedDocument = formatProvider.Import(fs, TimeSpan.FromSeconds(10));
    }

    // Convert the document to a simple text representation
    ISimpleTextDocument plainDoc = fixedDocument.ToSimpleTextDocument();

    // Set up the AI client (Azure OpenAI in this example)
    string key = "AZUREOPENAI_KEY";
    string endpoint = "AZUREOPENAI_ENDPOINT";
    string model = "gpt-4o-mini";

    Azure.AI.OpenAI.AzureOpenAIClient azureClient = new AzureOpenAIClient(
        new Uri(endpoint),
        new Azure.AzureKeyCredential(key),
        new Azure.AI.OpenAI.AzureOpenAIClientOptions());
    ChatClient chatClient = azureClient.GetChatClient(model);

    IChatClient iChatClient = new OpenAIChatClient(chatClient);
    int maxTokenCount = 128000;

    using (SummarizationProcessor summarizationProcessor = new SummarizationProcessor(iChatClient, maxTokenCount))
    {
        // Configure the summarization settings (optional)
        summarizationProcessor.Settings.PromptAddition = "Focus on the key points and main arguments. ";

        // Subscribe to the SummaryResourcesCalculated event to monitor token usage
        summarizationProcessor.SummaryResourcesCalculated += (sender, e) =>
        {
            Console.WriteLine($"This summarization will require approximately {e.EstimatedTokensRequired} tokens " +
                             $"and {e.EstimatedCallsRequired} API calls.");

            // Multi-call operations are not approved by default (ShouldContinueExecution
            // starts as false) to avoid unexpected API usage and costs
            if (e.EstimatedCallsRequired > 1)
            {
                Console.WriteLine("Document is large and will require multiple API calls.");

                // Set to true to approve the extra calls; false (shown here) cancels the summarization
                e.ShouldContinueExecution = false;
            }
            }
        };

        try
        {
            // Generate the summary
            string summary = await summarizationProcessor.Summarize(plainDoc);
            Console.WriteLine("Document Summary:");
            Console.WriteLine(summary);
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("Summarization was cancelled.");
        }
    }
}
