Open and Compare the XML Contents of Word Documents (.docx)
Environment
| Product Version | 2024.2.524 |
| Product | Telerik UI for WPF |
Description
How to get the XML of Word (.docx) documents and compare their contents using Visual Studio.
Solution
The .docx file format is a ZIP archive that can be unzipped. Unzipping it yields several folders and files containing the document contents, styles, and other settings. To access the content, find and open the word/document.xml file. You can then compare the document.xml files of the two documents.
This article describes how to do this with the Open XML Package Editor for Modern Visual Studios extension, which gives access to the files inside the .docx file without the need to unzip it. The built-in Visual Studio diff tool can then be used to compare the extracted XML contents.
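As an aside, the archive layout described above can be checked outside Visual Studio. The following is a minimal Python sketch, not part of the Visual Studio workflow in this article, that uses only the standard library `zipfile` module. The function names `list_parts` and `read_document_xml` are hypothetical, and the code assumes the main document part is stored at `word/document.xml` and encoded as UTF-8.

```python
import zipfile


def list_parts(docx_path):
    """Return the names of all files stored inside the .docx archive."""
    with zipfile.ZipFile(docx_path) as archive:
        return archive.namelist()


def read_document_xml(docx_path):
    """Return the text of word/document.xml without unzipping to disk.

    Assumption: the part is UTF-8 encoded, which is typical for files
    saved by Word but is not verified here.
    """
    with zipfile.ZipFile(docx_path) as archive:
        return archive.read("word/document.xml").decode("utf-8")
```

Calling `list_parts("first.docx")` (with a hypothetical file name) should show entries such as `word/document.xml` next to the style and settings parts.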
1. Install the Open XML Package Editor for Modern Visual Studios extension for Visual Studio. This allows you to open the contents of .docx files directly in Visual Studio.
2. Create a new Blank Solution in Visual Studio. This is required in order to use Visual Studio's Compare Selection feature.
3. Include the two .docx files that will be compared in the solution.
4. Create two new empty .xml files in the solution.
5. Double-click the first .docx file in the solution and open the /word/document.xml file.
   If the file is not formatted, you can format it using the Edit > Advanced > Format Selection option. The default shortcut for this action is Ctrl+K, Ctrl+F.
6. Copy the content of the document.xml file to one of the empty .xml files and save it.
7. Repeat steps 5 and 6 for the other .docx file, copying its XML contents to the other empty .xml file, and then save it.
8. Select both .xml files in the Solution Explorer and right-click either of them to display the context menu. From the menu, select the Compare Selection option.
   This opens a new diff view showing the comparison between the two .xml files.
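If Visual Studio is not available, the same extract, format, and compare sequence can be approximated with a script. The sketch below is an alternative to the steps above rather than a description of what the extension does. It uses Python's standard `zipfile`, `xml.dom.minidom`, and `difflib` modules; the function names `formatted_document_xml` and `diff_documents` are hypothetical, and the pretty-printing is only meant to make the diff readable, not to produce XML that is safe to write back into the document.

```python
import difflib
import zipfile
from xml.dom import minidom


def formatted_document_xml(docx_path):
    """Read word/document.xml from the archive and return it indented."""
    with zipfile.ZipFile(docx_path) as archive:
        raw = archive.read("word/document.xml")
    # Indentation is added for comparison purposes only.
    return minidom.parseString(raw).toprettyxml(indent="  ")


def diff_documents(first_docx, second_docx):
    """Return unified-diff lines between the two documents' formatted XML."""
    first = formatted_document_xml(first_docx).splitlines()
    second = formatted_document_xml(second_docx).splitlines()
    return list(
        difflib.unified_diff(
            first,
            second,
            fromfile=str(first_docx),
            tofile=str(second_docx),
            lineterm="",
        )
    )
```

For example, `print("\n".join(diff_documents("first.docx", "second.docx")))` (with hypothetical file names) prints the changed lines prefixed with `-` and `+`. An empty result means the two document.xml parts are identical after formatting.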