PBworks allows the user to download a .zip file of all of the pages from a wiki. My downloaded backup contained 44 .html files, many of which were nested in subfolders. Instead of figuring out how to loop through the subfolders recursively, I used a find command, which searches subfolders by default. In my script below, the find command is inserted using command substitution. The converted files are saved to the original subdirectory, keeping .html in the filename but adding .md as the file extension.
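To preview that naming scheme before converting anything, a dry run along these lines may help. It is only a sketch: the plan_outputs function name is made up for illustration, and it uses find's -exec option rather than the command substitution used in my script, so it is a different technique for the same walk through the subfolders.

```shell
#!/bin/bash
# Dry run: list every .html file under a directory together with the
# name its Markdown copy would get. Nothing is converted or written.
# plan_outputs is a hypothetical helper name, not part of any tool.
plan_outputs() {
    # find descends into subfolders on its own; -exec ... {} + passes
    # each path as a separate argument, so spaces in names are preserved
    find "$1" -type f -name '*.html' -exec sh -c '
        for page in "$@"; do
            printf "%s -> %s.md\n" "$page" "$page"
        done' sh {} +
}

# Example (path is a placeholder):
#   plan_outputs /path/to/backup
```

Each output line pairs a source file with its target, so a page named FrontPage.html would be listed with FrontPage.html.md next to it.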
I tried out two tools to do the text conversion. First, I tried html2text, which worked great. Out of curiosity, I also tried Pandoc, and I ended up preferring how Pandoc formatted the final Markdown text. One feature of html2text I did like was the --ignore-links option, since most of the links were relative to the PBworks domain and would be broken when used offline. In the end, I decided it might be useful to see where each original link pointed, so I skipped the --ignore-links option.
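For anyone who wants to compare the two converters on a single page, a rough sketch follows. It assumes the Python html2text command-line tool (the one that accepts --ignore-links) and pandoc are both installed. The to_markdown function name and page.html are placeholders I made up for illustration.

```shell
#!/bin/bash
# Convert one HTML file with the named tool and print Markdown to stdout.
# to_markdown is a hypothetical wrapper; it does not check that the
# tools are installed.
to_markdown() {
    case "$1" in
        html2text)
            # --ignore-links leaves out the link targets
            html2text --ignore-links "$2"
            ;;
        pandoc)
            # Without -o, pandoc writes the result to stdout
            pandoc -f html -t markdown "$2"
            ;;
        *)
            echo "unknown tool: $1" >&2
            return 2
            ;;
    esac
}

# Example comparison (page.html is a placeholder):
#   to_markdown html2text page.html > page.html2text.md
#   to_markdown pandoc page.html > page.pandoc.md
```

Writing each result to its own file makes it easy to open the two versions side by side.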
Here is the script I created:
#!/bin/bash
# Usage: html2md /path/to/file

# Set $IFS so that filenames with spaces don't break the loop
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")

# Loop through path provided as argument
for x in $(find "$@" -name '*.html')
do
    pandoc -f html -t markdown -o "$x.md" "$x"
done

# Restore original $IFS
IFS=$SAVEIFS
The IFS assignment near the top is necessary so that the script will work with filenames that contain spaces. The trick, as suggested in a Linux forum, is to set the internal field separator so that it no longer includes the space character.
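The effect is easy to see in isolation. The sketch below counts how many words the shell produces from the same two filenames, first with the default separator and then with newline plus backspace. The count_words helper, the variable names and the sample filenames are all made up for illustration. It uses printf to build the separator, which gives the same value as the echo -en form in bash.

```shell
#!/bin/bash
# Count the words the shell produces when it splits a list of names.
# count_words is a hypothetical helper for this demonstration only.
count_words() {
    # $1 is deliberately left unquoted so that word splitting applies
    set -- $1
    echo $#
}

# Two sample filenames, one per line; the first contains a space
names=$(printf '%s\n' 'Front Page.html' 'notes.html')

# Default IFS (space, tab, newline): 'Front Page.html' breaks in two
default_count=$(count_words "$names")

# Newline plus backspace: names are split at line breaks only.
# The trailing backspace keeps command substitution from stripping
# the newline.
SAVEIFS=$IFS
IFS=$(printf '\n\b')
newline_count=$(count_words "$names")
IFS=$SAVEIFS

echo "default IFS: $default_count words"   # prints "default IFS: 3 words"
echo "newline IFS: $newline_count words"   # prints "newline IFS: 2 words"
```

With the default separator the loop would see three items, including the nonexistent files Front and Page.html. With the modified separator it sees the two real filenames.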
For a paid account, PBworks allows the user to download all pages, past revisions and files, but I was using a free account.
Recently, my wife had students in her library create Photostory projects. This wasn’t her first choice of applications for a student project, but the Mac lab was in use for testing. Photostory outputs .wmv files, but my wife wanted to be able to merge the files using iMovie so that teachers could cue up one movie on their classroom presentation stations, which are Macs.
My wife thought she would need to use a service such as Zamzar to convert the files from .wmv into a format that iMovie could import, which seemed like a tedious, impractical task. I thought that perhaps ffmpeg, a command line tool, could help.
Steven Frank, the co-founder of Panic Software, creators of Transmit, my favorite FTP program, recently published an ebook titled How to Count: Programming for Mere Mortals, Vol. 1. I highly recommend buying a copy as an unprotected PDF or EPUB for ¢299. While there are many introductory articles available for free that explain how to count in binary, this book quickly moves on to more advanced topics such as hexadecimal, signed integers and floats.