Read Saved PDF File
The automation scenario requires opening a saved PDF file, reading its content, and validating it.
The approach described here uses a third-party, open-source dll and applies to WPF and Desktop tests. Test Studio web tests let you validate a PDF file in the browser out of the box.
Solution
- We prepared a sample using the third-party library iTextSharp.dll, which can be downloaded from its official page.
Note
You can choose any other external library with similar functionality and implement the code for the actions that suit your needs. Copy the unzipped dll into the project root folder and add a reference to it in the Test Studio project from that location.
- Create a coded step in your test and add the following usings (or Imports for VB.NET) at the top of the file:
```csharp
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;
using System.IO;
```

```vb
Imports iTextSharp.text.pdf
Imports iTextSharp.text.pdf.parser
Imports System.IO
```
- The snippet below opens a PDF file stored locally on disk, reads its content, and writes it to the test execution log. The sample uses the .NET StringBuilder class to append the text from each page of the PDF file.
```csharp
// Note: StringBuilder and Encoding come from the System.Text namespace

// Define the name of the file to open
string fileName = "C:\\pathToYourPDF\\PDFName.pdf";

// Define the StringBuilder that stores the content read from the PDF
StringBuilder text = new StringBuilder();

// Verify that the PDF file exists and open it
if (File.Exists(fileName))
{
    // Initialize the pdfReader
    PdfReader pdfReader = new PdfReader(fileName);

    // Go through the pages of the PDF file, read their content and append it
    for (int page = 1; page <= pdfReader.NumberOfPages; page++)
    {
        ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
        string currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);

        currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)));
        text.Append(currentText);
    }

    // Output the collected text to the test execution log
    Log.WriteLine(text.ToString());

    // Close the pdfReader
    pdfReader.Close();
}
```
```vb
' Note: StringBuilder and Encoding come from the System.Text namespace

' Define the name of the file to open (VB.NET strings do not escape backslashes)
Dim fileName As String = "folder\pdfFileName.pdf"

' Define the StringBuilder that stores the content read from the PDF
Dim text As New StringBuilder()

' Verify that the PDF file exists and open it
If File.Exists(fileName) Then
    ' Initialize the pdfReader
    Dim pdfReader As New PdfReader(fileName)

    ' Go through the pages of the PDF file, read their content and append it
    For page As Integer = 1 To pdfReader.NumberOfPages
        Dim strategy As ITextExtractionStrategy = New SimpleTextExtractionStrategy()
        Dim currentText As String = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy)

        currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.[Default], Encoding.UTF8, Encoding.[Default].GetBytes(currentText)))
        text.Append(currentText)
    Next

    ' Output the collected text to the test execution log
    Log.WriteLine(text.ToString())

    ' Close the pdfReader
    pdfReader.Close()
End If
```
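The sample above only writes the extracted text to the log, while the scenario also calls for validating it. One possible way to do that is sketched below in plain .NET, with no dependency on iTextSharp. The class name `PdfTextChecks` and its methods `Normalize` and `FindMissing` are hypothetical helpers introduced for illustration; they are not part of Test Studio or iTextSharp. The sketch assumes that a case-insensitive phrase match on whitespace-normalized text is good enough for your validation, because text extracted from a PDF often contains line breaks in unexpected places.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of Test Studio or iTextSharp.
public static class PdfTextChecks
{
    // Collapse every run of whitespace (spaces, tabs, line breaks) into a
    // single space so that line breaks introduced by the PDF extraction
    // do not break phrase matching.
    public static string Normalize(string text)
    {
        return Regex.Replace(text ?? string.Empty, @"\s+", " ").Trim();
    }

    // Return the expected phrases that do NOT occur in the extracted text.
    // An empty result means every phrase was found.
    public static List<string> FindMissing(string extractedText, IEnumerable<string> expectedPhrases)
    {
        string haystack = Normalize(extractedText);
        var missing = new List<string>();

        foreach (string phrase in expectedPhrases)
        {
            // Case-insensitive, culture-independent comparison
            if (haystack.IndexOf(Normalize(phrase), StringComparison.OrdinalIgnoreCase) < 0)
            {
                missing.Add(phrase);
            }
        }

        return missing;
    }
}
```

In the coded step, you could call `PdfTextChecks.FindMissing(text.ToString(), expectedPhrases)` after the page loop and fail the step, for example by throwing an exception, when the returned list is not empty. Note also that the sample skips all work when `File.Exists(fileName)` returns false, so a missing file would not fail the test by itself; you may want to treat that case as a failure as well.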