Easy way to clean up HTML

10 Mar 2008

Posted by acrollet

While working on a website migration at work, I wanted a way to preserve the pages' formatting without keeping their really poorly formatted HTML (probably generated by Word: lots of blockquotes instead of list tags, etc.).

Anyway, I had good luck running the files through links (or lynx) with -dump, and then using txt2html to re-htmlize the resulting text. This produced nice, clean, well-formatted HTML. I still had to strip out the titles with sed, but it was a quick way of doing a clean job on lots of files with a minimum of effort.
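
For anyone who wants to try this, here is a rough sketch of how the steps might be strung together. It is not the exact script I ran. The function names, the -nolist flag (which keeps lynx from appending its list of links), the temp file, and the sed expression are all assumptions for illustration. In particular it assumes "titles" means a `<title>` element sitting on a single line, and that txt2html writes its result to standard output when given a file name.

```shell
#!/bin/sh
# Sketch of the dump-and-rebuild approach; names and flags are assumptions.

# Remove a single-line <title>...</title> element from HTML on stdin.
strip_title() {
    sed 's|<title>[^<]*</title>||'
}

# Render one HTML file to plain text with lynx, rebuild HTML with txt2html,
# then strip the title. Assumes txt2html prints to stdout by default.
clean_html() {
    tmp=$(mktemp) || return 1
    lynx -dump -nolist "$1" > "$tmp"
    txt2html "$tmp" | strip_title
    rm -f "$tmp"
}

# Example use (hypothetical directory layout):
#   mkdir -p cleaned
#   for f in *.html; do clean_html "$f" > "cleaned/$f"; done
```

Check the output of a few files by hand before running it over a whole site, since the text dump throws away anything lynx cannot render as plain text.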

Comments

tidy

