Wordup
Wordup is a single-purpose online tool I built to convert content from Word documents into HTML or Markdown.
It works in three steps. First, it uses the built-in paste tools of CKEditor 4 to take in the Word content. Second, it passes the result through vanilla JS that cleans up spacing and replaces some strings. Finally, it spits out clean HTML. Check a box and Turndown.js converts that HTML to Markdown.
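To give a feel for the cleanup step, here is a minimal sketch of a spacing and string-replacement pass in vanilla JS. The function name `cleanHtml` and every rule in it are assumptions made up for illustration; they are not Wordup's actual rules, which live in the repo.

```javascript
// Hypothetical cleanup pass: illustrative rules only, not Wordup's real list.
function cleanHtml(html) {
  return html
    .replace(/&nbsp;/g, " ")           // non-breaking space entities -> plain spaces
    .replace(/[\u201C\u201D]/g, '"')   // curly double quotes -> straight quotes
    .replace(/<p>\s*<\/p>/g, "")       // drop paragraphs that are empty or whitespace-only
    .replace(/[ \t]{2,}/g, " ")        // collapse runs of spaces and tabs
    .replace(/\n{3,}/g, "\n\n")        // collapse runs of blank lines
    .trim();                           // strip leading and trailing whitespace
}
```

The order matters a little here: converting `&nbsp;` first means a paragraph that held nothing but a non-breaking space gets caught by the empty-paragraph rule that follows.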
But converting Word documents to HTML is a solved problem, right?
- Search for "Word to HTML conversion"
- Search for "Word to clean HTML conversion"
- Consider pasting Word document contents into mystery text boxes on several online conversion tools
- Wonder how these tools actually work
- Wonder about the privacy policies of these tools
- Close browser, open Word document
- Copy content, paste into text editor
- Begin wrapping text in HTML tags
- Give up
- Open Word document, save as HTML
- Open HTML in text editor
- Cry a little
- Start using find and replace to remove extra markup
- Graduate to regex searches
- Eventually arrive at relatively clean HTML
- Realize that someone sent an updated version of the Word document while working through steps 1-15
- Cry a little
WYSIWYG editors in most CMS platforms deal with pasting Word documents, right?
- Search for JavaScript-based WYSIWYG editors
- Pick one
- Create an HTML page with two `<textarea>` fields
- Hook WYSIWYG editor into first `<textarea>`
- Read documentation
- Figure out how to get converted text out of WYSIWYG `<textarea>` and into second `<textarea>` as HTML
- Notice converted HTML still needs some love
- Write extra white space and string replacement rules to send converted text through
- End up with really clean HTML in the second `<textarea>`
- Cry a little
- Wonder what else you can do
- Add Markdown conversion and link helpers
- Tell people about it
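The wiring described in the list above can be sketched roughly like this. It is a sketch under assumptions, not the code from the repo: the function name `wireUp`, its options, and the element ids in the usage comment are hypothetical. The only library calls it relies on are CKEditor 4's `CKEDITOR.replace()`, `editor.getData()`, and the editor `change` event, plus Turndown's `new TurndownService()` and `turndown()`.

```javascript
// Hypothetical helper: copies the editor's HTML into an output field on every change.
// `editor` needs getData() and on(); `output` needs a writable `value` property.
function wireUp(editor, output, options = {}) {
  const {
    clean = (html) => html,        // optional cleanup pass (identity by default)
    turndown = null,               // optional Turndown service instance
    wantsMarkdown = () => false,   // e.g. reads the state of a checkbox
  } = options;

  const update = () => {
    const html = clean(editor.getData());
    output.value = turndown && wantsMarkdown() ? turndown.turndown(html) : html;
  };

  editor.on("change", update);
  return update; // returned so the caller can also trigger a refresh by hand
}

// Browser usage, assuming CKEditor 4 and Turndown are loaded via <script> tags
// and the page has <textarea id="source">, <textarea id="result">,
// and <input type="checkbox" id="markdown"> (all hypothetical ids):
//
// const editor = CKEDITOR.replace("source");
// const box = document.getElementById("markdown");
// const refresh = wireUp(editor, document.getElementById("result"), {
//   clean: (html) => html.trim(),
//   turndown: new TurndownService(),
//   wantsMarkdown: () => box.checked,
// });
// box.addEventListener("change", refresh);
```

Passing the editor and output in as arguments keeps the helper free of globals, so it can be poked at with simple stand-in objects before a real editor is attached.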
Take a look at the Wordup code over on GitHub.