After migrating the site to a new host, I had to move all the blog articles over as well. Unfortunately, the original Jekyll source files were lost, leaving me with only the generated HTML files to work from. To speed up the transition, I wrote a shell script that runs on the host server and uses Pandoc to do the heavy lifting. The script iterates through all the .html files in the current directory and converts each one to Markdown.

If you haven’t already, you’ll need to install Pandoc. Fortunately, it’s available through most package managers. Here’s how you can do it on different platforms:

  1. Windows:
    • Download the latest installer from the Pandoc website.
    • Run the installer and follow the setup wizard.
  2. macOS:
    • Visit the Pandoc download page and choose the macOS package installer.
    • Alternatively, you can use Homebrew: brew install pandoc.
  3. Linux:
    • Pandoc is packaged for most distributions (e.g., Debian, Ubuntu, Fedora), so check your package manager first.
    • For example, on Ubuntu/Debian: sudo apt update && sudo apt install pandoc.
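Whichever route you take, it is worth confirming that Pandoc actually ended up on your PATH before running any conversion. Here is a minimal sketch of such a check; check_tool is a hypothetical helper name of my own, not something Pandoc provides.

```shell
#!/bin/bash

# Report whether a command is available on PATH.
# check_tool is a hypothetical helper name, not part of Pandoc.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
        return 1
    fi
}

# Print the first line of Pandoc's version banner if it is installed.
if check_tool pandoc; then
    pandoc --version | head -n 1
fi
```

If the check reports "not found", revisit the install steps above or open a new terminal so the updated PATH is picked up.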

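Once Pandoc is installed, the script below should work as is. Run it from the directory that contains the .html files, and each file gets a .md counterpart next to it. The output format turns off the raw_html, native_divs, and native_spans extensions so that leftover HTML tags, divs, and spans are dropped instead of being copied into the Markdown. Happy converting! 😊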

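#!/bin/bash

# Loop through all HTML files in the current directory
for html_file in *.html; do
    # Skip the literal "*.html" pattern left behind when nothing matches
    [ -e "$html_file" ] || continue
    # Generate the corresponding Markdown filename
    md_file="${html_file%.html}.md"
    # Convert HTML to Markdown using pandoc; report success only if it worked.
    # Note: markdown_github is deprecated since Pandoc 2.0 in favor of gfm.
    if pandoc -f html -t markdown_github-raw_html-native_divs-native_spans "$html_file" -o "$md_file"; then
        echo "Converted $html_file to $md_file"
    else
        echo "Failed to convert $html_file" >&2
    fi
done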
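
The script above only looks at the current directory. If your HTML files are spread across subdirectories, a recursive variant along the following lines might help. This is a sketch under my own assumptions rather than what I actually ran: md_name and convert_tree are hypothetical helper names, and the pandoc options are simply copied from the script above.

```shell
#!/bin/bash

# Hypothetical helper: map an .html path to its .md counterpart.
md_name() {
    printf '%s\n' "${1%.html}.md"
}

# Hypothetical helper: convert every .html file under a directory
# (defaults to the current one), writing each .md next to its source.
convert_tree() {
    local root="${1:-.}"
    local html_file md_file
    # -print0 / read -d '' keeps filenames with spaces or newlines intact
    find "$root" -type f -name '*.html' -print0 |
        while IFS= read -r -d '' html_file; do
            md_file="$(md_name "$html_file")"
            # Same pandoc options as the single-directory script;
            # stdin is redirected so pandoc cannot consume find's output.
            if pandoc -f html -t markdown_github-raw_html-native_divs-native_spans \
                "$html_file" -o "$md_file" </dev/null; then
                echo "Converted $html_file to $md_file"
            else
                echo "Failed to convert $html_file" >&2
            fi
        done
}
```

To use it, source the file (or paste the functions into your script) and call convert_tree with the site's root directory, for example convert_tree ./site, where ./site is a placeholder path.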