How To Find Broken Links In PDF Files

Editor Ratings:
User Ratings:
[Total: 2 Average: 4]




This tutorial explains how to find broken links in PDF files. To do this, I will use a free command-line tool called PDFx. Although it is not primarily meant for finding broken links, it has a feature that checks the links in a PDF file. Using this simple utility, you can list all the links in a PDF file and also download them if they are direct links. It shows all the broken links, along with the page number on which each broken link appears.

This PDF link checker requires Python on your PC, which is very easy to install. After that, you simply run the software from the command line and provide the path of the PDF file you want to check, and it shows which links are broken. You can even save the broken links report to a text file.

Find Broken Links In PDF Files

Even though there is plenty of software to check broken links on websites, or even in bookmarks, checking broken links in PDF files is a whole different ball game. I spent a lot of time looking for GUI-based software that could do the same, but found only this command-line tool. Despite being command-line software, it is pretty easy to use. Also, since it only requires Python, you can use it not only on Windows but on Mac and Linux as well.

How To Find Broken Links In PDF Files?

PDFx was originally meant to download all the references given in a PDF file. Apart from that, it comes with a link checker module that helps find broken links inside a PDF file. You can easily find broken links in a PDF file along with the error code and page number.

To use PDFx, you will need to install Python on your PC. Once it is installed, follow the steps below to find broken links in a PDF file.

Step 1: Open the Windows Command Prompt and type the following command. Running it installs PDFx on your PC (the installation is handled by Python's package tools, so it is important to install Python first). If the installation goes well, a message at the end confirms that it finished successfully. On recent Python setups where easy_install is no longer available, "pip install pdfx" should do the same job.

easy_install pdfx

PDFX installing

PDFx is now installed on your PC. You can run it from any location using the Command Prompt.
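If you want to confirm from a script that the install worked, one simple option is to check whether a pdfx executable can be found on the PATH. This is only a minimal sketch; the helper name pdfx_is_installed is my own and not part of PDFx.

```python
import shutil


def pdfx_is_installed(command="pdfx"):
    """Return True if an executable with this name is found on the PATH."""
    return shutil.which(command) is not None


if pdfx_is_installed():
    print("pdfx found on PATH")
else:
    print("pdfx not found; check the Python installation and repeat Step 1")
```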

Step 2: Navigate to the folder that contains the PDF file you want to check for broken links. Hold Shift and right-click inside the folder to open a Command Prompt there.

Opening command window

Step 3: To check for broken links, type the following command in the Command Prompt window and hit Enter. PDFx will start scanning the links and list the broken ones with the error code (such as 404 or 403) and page number. See the screenshot below.

pdfx [PDF_filename] -c

PDFX finding broken links
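The same check can also be started from a Python script, which is handy if you want to run it regularly. The sketch below simply calls the Step 3 command through subprocess. The function names are placeholders of mine, and the output is returned untouched because I am not assuming anything about its exact format.

```python
import subprocess


def build_check_command(pdf_path):
    """Build the same command as in Step 3: pdfx [PDF_filename] -c"""
    return ["pdfx", pdf_path, "-c"]


def check_links(pdf_path):
    """Run PDFx on one file and return its exit code and raw text output.

    Assumes pdfx is installed and on the PATH; otherwise
    subprocess.run raises FileNotFoundError.
    """
    result = subprocess.run(
        build_check_command(pdf_path),
        capture_output=True,
        text=True,
    )
    # Keep stdout and stderr together so no message is lost.
    return result.returncode, result.stdout + result.stderr
```

Calling check_links("sample.pdf"), where sample.pdf stands for your own file, returns the exit code and the text that PDFx would otherwise print in the Command Prompt.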

If you want to store the broken link results in a text file, just append “> filename.txt” to the end of the command, so that it reads pdfx [PDF_filename] -c > filename.txt.

PDFX finding broken links with details in text file
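If you run the check from a script instead of the Command Prompt, the redirect can be imitated by pointing the command's standard output at a file. This is a rough sketch under that assumption; run_and_save is a hypothetical helper, and like the ">" redirect it captures standard output only.

```python
import subprocess


def run_and_save(command, report_path):
    """Run a command and write everything it prints on stdout to report_path."""
    with open(report_path, "wb") as report_file:
        completed = subprocess.run(command, stdout=report_file)
    return completed.returncode
```

For the command from Step 3, the call would look like run_and_save(["pdfx", "sample.pdf", "-c"], "filename.txt"), with sample.pdf replaced by your own file name.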

So, in this way you can easily find broken links in PDF files using PDFx. The software does what it promises, listing all the broken links along with the error code and page number.

Do note that it only checks web links, and not links to other files.
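To make that distinction concrete, here is a small sketch of how one might separate web links from other targets by looking at the URL scheme. It only illustrates the idea. I have not checked how PDFx itself decides this, and is_web_link is a hypothetical helper.

```python
from urllib.parse import urlparse


def is_web_link(target):
    """Return True for http/https URLs, False for anything else."""
    return urlparse(target).scheme.lower() in ("http", "https")


links = [
    "https://example.com/page",
    "file:///C:/docs/report.pdf",
    "chapter2.pdf",
]
for link in links:
    label = "web link" if is_web_link(link) else "not a web link"
    print(link, "->", label)
```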

Conclusion

PDFx makes it pretty easy to find broken links in PDF files. I really like that it gives the error code as well as the page number for each broken link. I wish the output were formatted a bit better, so that I could open it in Excel or as a CSV file. Nevertheless, this is the only software I was able to find that could find broken links in PDF files at all, so I will take what it gives. If you know of other software that can find 404 errors in PDF files, or can check multiple PDF files for broken links together, do let me know in the comments below.
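Until such a tool turns up, a short script can get part of the way there. The sketch below runs the Step 3 command on every PDF in a folder and writes the raw output lines to a CSV file that Excel can open. It does not try to split out the error code or page number, because I am not assuming anything about the exact output format. The function names and the link_report.csv file name are my own.

```python
import csv
import glob
import subprocess


def output_to_rows(pdf_name, output_text):
    """Turn raw tool output into (pdf file, output line) rows, skipping blanks."""
    return [
        (pdf_name, line.strip())
        for line in output_text.splitlines()
        if line.strip()
    ]


def check_folder(pattern="*.pdf", csv_path="link_report.csv"):
    """Run `pdfx <file> -c` on every matching PDF and collect the output in a CSV.

    Assumes pdfx is installed and on the PATH.
    """
    with open(csv_path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(["pdf_file", "pdfx_output_line"])
        for pdf_name in sorted(glob.glob(pattern)):
            result = subprocess.run(
                ["pdfx", pdf_name, "-c"], capture_output=True, text=True
            )
            writer.writerows(output_to_rows(pdf_name, result.stdout))
```

Running check_folder() from the folder that holds your PDF files would produce one CSV row per line of PDFx output, with the file name in the first column.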

Works With: Windows, Linux, Mac
Free/Paid: Free
