The Word Unmunger is a small Python program that removes much of the HTML cruft produced by Microsoft Word 2002 (Word version 10), making the files far easier to edit by hand. It removes XML namespace declarations, smart tags, meta tags, HTML comments, style sheets, DIVs, the Microsoft Office file list, CSS classes, and Microsoft Office grammar and spelling error markers.
| Tags | Text Processing |
|---|---|
| Licenses | MIT/X |
| Implementation | Python |
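
To illustrate the kind of cleanup the description lists, here is a minimal sketch of rule-based regular expression filtering. It is not taken from Word Unmunger's source: the `RULES` table, the `unmunge` function, and the exact patterns are assumptions made for illustration, and the `st1:` prefix for smart tags is assumed from typical Word HTML output. Only a handful of the listed cruft types are covered.

```python
import re

# Hypothetical rule table: (compiled pattern, replacement).
# These patterns are illustrative assumptions, not Word Unmunger's own rules.
RULES = [
    # Style sheets, including their contents
    (re.compile(r"<style\b.*?</style>", re.DOTALL | re.IGNORECASE), ""),
    # HTML comments
    (re.compile(r"<!--.*?-->", re.DOTALL), ""),
    # XML namespace declarations such as xmlns:o="..."
    (re.compile(r'\s+xmlns(:\w+)?="[^"]*"', re.IGNORECASE), ""),
    # CSS class attributes, quoted or unquoted
    (re.compile(r'\s+class=(?:"[^"]*"|\'[^\']*\'|[^\s>]+)', re.IGNORECASE), ""),
    # Smart tags (assumed to use the st1: prefix); the enclosed text is kept
    (re.compile(r"</?st1:[^>]*>", re.IGNORECASE), ""),
]


def unmunge(html):
    """Apply each rule in order and return the cleaned markup."""
    for pattern, replacement in RULES:
        html = pattern.sub(replacement, html)
    return html


sample = (
    '<html xmlns:o="urn:schemas-microsoft-com:office:office">'
    "<p class=MsoNormal>Hi <st1:place>Oslo</st1:place><!-- note --></p></html>"
)
print(unmunge(sample))
```

Keeping the rules in a plain list means a new filter is one more tuple, which is one plausible way to make rules easy to add.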
Recent releases


Changes: The program no longer crashes on larger documents due to limitations in Python's default regular expression implementation (sre). The pre implementation is now used instead. A debug mode that prints regular expressions as they're used was added, along with more robust handling of command line arguments.


Changes: Based on a request from a user, Word Unmunger now features a batch mode for automatic processing of several files at once. The code has also been cleaned up to allow new unmunging rules to be added more easily.


Changes: This release adds a new filter for files exported from Word X for Macintosh. Word X puts in a large number of <![ ... ]> tags for conditionals. These are now removed.
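
A filter for the `<![ ... ]>` conditional tags mentioned above might look roughly like the following. This is a sketch under assumptions: the `CONDITIONAL` pattern and the `strip_conditionals` name are hypothetical, and the sample input is modeled on typical Word output rather than on a file from Word X.

```python
import re

# Matches a conditional marker such as <![if !supportEmptyParas]> or <![endif]>.
# Assumes the marker itself contains no "]" character.
CONDITIONAL = re.compile(r"<!\[[^\]]*\]>")


def strip_conditionals(html):
    """Remove the conditional markers but keep the content between them."""
    return CONDITIONAL.sub("", html)


print(strip_conditionals("<p><![if !supportEmptyParas]>&nbsp;<![endif]></p>"))
```

Only the markers are removed here; whether the enclosed content should also be dropped is a design choice the release note does not specify.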