Tuesday, February 12, 2013

How to save a web page as PDF preserving the hyperlinks in Ubuntu

When you save a web page as a PDF using the 'Print to File' printer, the hyperlinks are not preserved, so you can't click on them to navigate to the respective URLs.

While searching for a solution, I came across an answer [1] on Ask Ubuntu that mentioned wkhtmltopdf [2]. It preserves the URLs in the exported PDF!

The steps below explain how to install it. They are taken from the Ask Ubuntu answer mentioned above.

The Ubuntu repos contain the wkhtmltopdf package, but that version doesn't provide the hyperlink-preserving functionality we need. Nevertheless, we need to install it to pull in the required dependencies.
sudo apt-get install wkhtmltopdf
Then, we can download the wkhtmltopdf static binary from [3], extract it, and move it into the /usr/bin directory under the name wkhtmltopdf-static, so that it doesn't clash with the packaged version.
sudo mv wkhtmltopdf-i386 /usr/bin/wkhtmltopdf-static
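
The command above uses the i386 file name. On a 64-bit machine the extracted file will likely have a different name, so here is a rough sketch of picking the name from the output of `uname -m`. The function name static_binary_name is my own, and the wkhtmltopdf-amd64 file name is an assumption (only wkhtmltopdf-i386 appears in the steps above), so check what the archive actually extracts to.

```shell
# Hypothetical helper: map a machine type (as printed by `uname -m`)
# to the file name of the extracted static binary.
# ASSUMPTION: the 64-bit build is named wkhtmltopdf-amd64.
static_binary_name() {
    case "$1" in
        x86_64|amd64) echo "wkhtmltopdf-amd64" ;;
        i?86)         echo "wkhtmltopdf-i386" ;;
        *)            echo "unsupported machine type: $1" >&2
                      return 1 ;;
    esac
}

# Example usage, run from the directory where the archive was extracted:
#   bin=$(static_binary_name "$(uname -m)") &&
#       sudo mv "$bin" /usr/bin/wkhtmltopdf-static
#   command -v wkhtmltopdf-static   # prints the path if the move worked
```

If `command -v` prints nothing, the binary is not on the PATH (or is not executable), and the conversion step will fail with "command not found".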
Now we can convert a web page to pdf preserving the hyperlinks!
wkhtmltopdf-static http://en.wikipedia.org/ wikipedia.pdf
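
To convert several pages in one go, something like the following sketch might help. It only wraps the same command shown above in a loop. The helper names pdf_name and convert_all are made up for this example, and the way the output file name is derived from the URL is just one possible convention.

```shell
# Hypothetical helper: derive an output file name from a URL by dropping
# the scheme, trimming trailing slashes, and turning the remaining
# slashes into underscores.
pdf_name() {
    printf '%s\n' "$1" | sed -e 's|^[a-zA-Z]*://||' \
                             -e 's|/*$||' \
                             -e 's|/|_|g' \
                             -e 's|$|.pdf|'
}

# Convert every URL given as an argument, one PDF per URL.
# Assumes wkhtmltopdf-static was installed as described above.
convert_all() {
    for url in "$@"; do
        wkhtmltopdf-static "$url" "$(pdf_name "$url")"
    done
}

# Example usage:
#   convert_all http://en.wikipedia.org/ http://askubuntu.com/
```

With this naming scheme, http://en.wikipedia.org/ ends up in en.wikipedia.org.pdf. Note that query strings and other special characters in the URL are left as they are, so for anything beyond simple URLs it is probably safer to pass the output name by hand.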
