
Removing Hyperlinks from Text with RadFlowDocument

Environment

Version: 2023.3.1106
Product: RadWordsProcessing
Author: Desislava Yordanova

Description

This article shows a sample approach for removing hyperlinks from the text of an HTML document using RadFlowDocument from RadWordsProcessing.

Before: Text with Hyperlinks
After: Text without Hyperlinks

Solution

RadFlowDocument stores hyperlinks as fields whose boundaries are marked by FieldCharacter elements. More information about the internal structure of hyperlink fields is available in the following article: Hyperlink Field.
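To see how a hyperlink is laid out inside a document, the following minimal sketch builds a small document in memory, inserts one hyperlink, and prints the field characters together with the field code and result. The class name HyperlinkFieldInspector, the sample text, and the URL are illustrative assumptions, and the sketch assumes a project that references the RadWordsProcessing assemblies.

```csharp
using System;
using System.Linq;
using Telerik.Windows.Documents.Flow.Model;
using Telerik.Windows.Documents.Flow.Model.Editing;
using Telerik.Windows.Documents.Flow.Model.Fields;

// Hypothetical helper class, used only for illustration.
public static class HyperlinkFieldInspector
{
    public static void Main()
    {
        // Build a tiny document with one hyperlink (sample text and URL are assumptions).
        RadFlowDocument document = new RadFlowDocument();
        RadFlowDocumentEditor editor = new RadFlowDocumentEditor(document);
        editor.InsertText("Visit ");
        editor.InsertHyperlink("Telerik", "https://www.telerik.com", false, "Telerik site");

        // Print the type of every field character in document order.
        foreach (FieldCharacter fieldCharacter in document.EnumerateChildrenOfType<FieldCharacter>())
        {
            Console.WriteLine(fieldCharacter.FieldCharacterType);
        }

        // For each field, print the code fragment (between start and separator)
        // and the result fragment (between separator and end).
        var fieldStarts = document.EnumerateChildrenOfType<FieldCharacter>()
            .Where(fc => fc.FieldCharacterType == FieldCharacterType.Start);
        foreach (FieldCharacter start in fieldStarts)
        {
            Console.WriteLine("Code:   " + start.FieldInfo.GetCode());
            Console.WriteLine("Result: " + start.FieldInfo.GetResult());
        }
    }
}
```

The result fragment is the visible link text, which is the part the removal approach keeps; the code fragment and the three field characters are the parts it deletes.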

To remove hyperlinks from text in an HTML document using RadFlowDocument, follow these steps:

  1. Load the HTML document into RadFlowDocument using the HtmlFormatProvider.
  2. Enumerate the FieldCharacter elements in the document and delete the content of the hyperlink fields. The DeleteContent method removes the hyperlink field elements and leaves only the text runs that store the text itself.
  3. Enumerate the Run elements in the document with the custom Hyperlink style and change their style to Normal.
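        using System.IO;
        using System.Linq;
        using Telerik.Windows.Documents.Flow.FormatProviders.Html;
        using Telerik.Windows.Documents.Flow.Model;
        using Telerik.Windows.Documents.Flow.Model.Editing;
        using Telerik.Windows.Documents.Flow.Model.Fields;

        private static RadFlowDocument RemoveHyperLinksFromHtml(string filePath = "sample.html")
        {
            RadFlowDocument document;
            using (Stream input = File.OpenRead(filePath))
            {
                // Step 1: import the HTML file into a RadFlowDocument.
                HtmlFormatProvider provider = new HtmlFormatProvider();
                document = provider.Import(input);
                RadFlowDocumentEditor editor = new RadFlowDocumentEditor(document);

                // Step 2: collect the start character of every field. ToList() takes a
                // snapshot so the document can be modified inside the loop.
                var hyperlinkElements = document.EnumerateChildrenOfType<FieldCharacter>().Where(x => x.FieldCharacterType == FieldCharacterType.Start).ToList();
                foreach (FieldCharacter hyperlink in hyperlinkElements)
                {
                    // Remove the end character, then everything from the start character
                    // through the separator. The result text between them is kept.
                    editor.DeleteContent(hyperlink.FieldInfo.End, hyperlink.FieldInfo.End);
                    editor.DeleteContent(hyperlink.FieldInfo.Start, hyperlink.FieldInfo.Separator);
                }

                // Step 3: reset runs that still carry the Hyperlink style.
                // StyleId can be null for runs without an explicit style.
                var hyperlinkRuns = document.EnumerateChildrenOfType<Run>().Where(x => x.StyleId != null && x.StyleId.Contains("Hyperlink")).ToList();
                foreach (Run r in hyperlinkRuns)
                {
                    r.StyleId = "Normal";
                }

                // Optional: the cleaned document exported back to an HTML string.
                string rawHtmlContent = provider.Export(document);
            }
            return document;
        }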
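The method above returns the cleaned RadFlowDocument but does not save it. As a hedged sketch of one way to persist the result, the helper below exports a document to an HTML file and reports how many field start characters are still present, which gives a quick check that the removal worked. The class name CleanedHtmlWriter, the method name SaveAsHtml, and the output path are assumptions made for illustration; they are not part of the RadWordsProcessing API.

```csharp
using System.IO;
using System.Linq;
using Telerik.Windows.Documents.Flow.FormatProviders.Html;
using Telerik.Windows.Documents.Flow.Model;
using Telerik.Windows.Documents.Flow.Model.Fields;

// Hypothetical helper class, used only for illustration.
public static class CleanedHtmlWriter
{
    // Writes the document to outputPath as HTML and returns the number of
    // field start characters that remain in the document.
    public static int SaveAsHtml(RadFlowDocument document, string outputPath)
    {
        int remainingFieldStarts = document
            .EnumerateChildrenOfType<FieldCharacter>()
            .Count(fc => fc.FieldCharacterType == FieldCharacterType.Start);

        // Export to an HTML string and write it to disk.
        HtmlFormatProvider provider = new HtmlFormatProvider();
        File.WriteAllText(outputPath, provider.Export(document));

        return remainingFieldStarts;
    }
}
```

From code in the same class as RemoveHyperLinksFromHtml (it is private), a call could look like CleanedHtmlWriter.SaveAsHtml(RemoveHyperLinksFromHtml("sample.html"), "sample-no-links.html"). A return value of 0 would indicate that no fields are left in the document.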