10 best broken link checker tools to check your entire website

PERFORMED CHECKS

HTTP links (http:, https:)
After connecting to the given HTTP server the given path
or query is requested. All redirections are followed, and
if user/password is given it will be used as authorization
when necessary.
Permanently moved pages issue a warning.
All final HTTP status codes other than 2xx are errors.
HTML page contents are checked for recursion.
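
A minimal sketch of this kind of check in Python, using only the
standard library (it omits authentication, robots.txt handling and
recursion; the URL below is a placeholder):

import urllib.request
import urllib.error

def check_http(url, timeout=10):
    """Return (ok, detail) for an HTTP/HTTPS link."""
    try:
        # urlopen follows redirects automatically and raises
        # HTTPError for any final status code >= 400.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return True, "status %d" % response.status
    except urllib.error.HTTPError as err:
        return False, "HTTP error %d" % err.code
    except urllib.error.URLError as err:
        return False, "connection failed: %s" % err.reason

print(check_http("http://www.example.com/"))
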
Local files (file:)
A regular, readable file that can be opened is valid. A readable
directory is also valid. All other files, for example device files,
unreadable or non-existing files are errors.
HTML or other parseable file contents are checked for recursion.
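
A rough Python equivalent of the file: check (the paths below are just
examples; device files count as invalid because they are not regular
files, and recursion into parseable content is omitted):

import os

def check_file(path):
    """Valid if it is a readable regular file or a readable directory."""
    if os.path.isdir(path):
        return os.access(path, os.R_OK)
    return os.path.isfile(path) and os.access(path, os.R_OK)

print(check_file("/etc/hosts"))      # readable regular file -> True on most Unix systems
print(check_file("/no/such/file"))   # non-existing file -> False
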
Mail links (mailto:)
A mailto: link eventually resolves to a list of email addresses.
If one address fails, the whole list will fail.
For each mail address we check the following things (a sketch
follows the list):
  1) Check the address syntax, both of the part before and after
     the @ sign.
  2) Look up the MX DNS records. If we find no MX record,
     print an error.
  3) Check if one of the mail hosts accepts an SMTP connection.
     Check hosts with higher priority first.
     If no host accepts SMTP, we print a warning.
  4) Try to verify the address with the VRFY command. If we get
     an answer, print the verified address as an info.
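
Steps 2) to 4) can be sketched in Python. The MX lookup assumes the
third-party dnspython package (the standard library cannot query MX
records); smtplib is standard. Step 1) is omitted, the address below is
a placeholder, and many mail servers disable VRFY, so treat the result
as informational only:

import smtplib
import dns.resolver  # third-party: pip install dnspython

def check_mail_address(address):
    domain = address.rsplit("@", 1)[-1]
    try:
        records = dns.resolver.resolve(domain, "MX")   # step 2
    except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
        return "error: no MX record for %s" % domain
    # Lower preference value means higher priority; try those first.
    for mx in sorted(records, key=lambda r: r.preference):
        host = str(mx.exchange).rstrip(".")
        try:
            with smtplib.SMTP(host, 25, timeout=10) as server:   # step 3
                code, message = server.verify(address)           # step 4: VRFY
                if code == 250:
                    return "info: verified: %s" % message.decode(errors="replace")
                return "connected to %s; address not verified (code %d)" % (host, code)
        except (OSError, smtplib.SMTPException):
            continue
    return "warning: no MX host accepted an SMTP connection"

print(check_mail_address("postmaster@example.org"))
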
FTP links (ftp:)

For FTP links we do the following (a sketch follows the list):

  1) connect to the specified host
  2) try to log in with the given user and password. The default
     user is "anonymous", the default password is "anonymous@".
  3) try to change to the given directory
  4) list the file with the NLST command
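
These steps map almost one-to-one onto Python's ftplib. The sketch below
lists the target directory rather than a single file, and the host and
path are placeholders:

from ftplib import FTP, error_perm

def check_ftp(host, path, user="anonymous", password="anonymous@"):
    try:
        with FTP(host, timeout=15) as ftp:   # 1) connect to the host
            ftp.login(user, password)        # 2) log in
            ftp.cwd(path)                    # 3) change to the directory
            return ftp.nlst()                # 4) NLST listing
    except error_perm as err:
        return "FTP error: %s" % err

print(check_ftp("ftp.gnu.org", "/gnu"))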

Telnet links (telnet:)

We try to connect and, if user/password are given, log in to the
given telnet server.

NNTP links (news:, snews:, nntp)

We try to connect to the given NNTP server. If a news group or
article is specified, try to request it from the server.
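
A similar check could be sketched with Python's nntplib, bearing in
mind that nntplib was deprecated in Python 3.11 and removed in 3.13, so
this only runs on older interpreters; the server and group names are
placeholders:

import nntplib  # deprecated in Python 3.11, removed in 3.13

def check_news(server, group=None):
    with nntplib.NNTP(server, timeout=15) as news:
        if group:
            resp, count, first, last, name = news.group(group)
            return "group %s has %d articles" % (name, count)
        return news.getwelcome()

# Example (replace with a reachable NNTP server before running):
# print(check_news("news.example.com", "comp.lang.python"))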

Unsupported links (javascript:, etc.)

An unsupported link will only print a warning. No further checking
will be made.

The complete list of recognized, but unsupported links can be found
in the linkcheck/checker/unknownurl.py source file.
The most prominent of them should be JavaScript links.

Work with the found links

Depending on the settings you select, the add-in will find good, suspicious, and broken links in the document.

  1. You can choose to work with broken, suspicious, or valid links by clicking the Filter icon and selecting the type of links you need:
  2. Click on the link to see it in your Word document. If the link is suspicious, broken, or external, there will be a comment about it in the Description section:
  3. If you need to see where a link points, you can right-click it in the pane and pick the Go to destination option from the list:
  4. To edit a link, right-click it and select Edit hyperlink. You will see the standard Word dialog box that allows you to make the necessary changes.

    Note. It is not possible to edit cross-references.

  5. You can also remove a link. Right-click it in the list and choose the Remove link(s) option.

    Note. This will completely remove the selected link(s) from your document.

    Tip. You can delete several links at once by holding down the Ctrl key on your keyboard and selecting the links you want to remove.

  6. Sometimes you may need to see the field codes of all the links instead of their values. You can do this by right-clicking a link and selecting the Show field codes instead of their values option in the context menu. It can be useful if the add-in detects a broken link but clicking it in the pane doesn't take you anywhere. If you choose this option, you will see the codes of these "invisible" links so that you can edit or delete them.

Find and fix link problems the easy way!

This free online checker not only tells you which of your hyperlinks are dead, it also shows you exactly where those stale references are located in your HTML code! This unique feature makes our checking service stand out among other available problem detection tools by making it easy for webmasters to find bad URLs and clean them up in no time. It's never been so easy to locate and fix dead weblinks!

This free website validation tool reports only things that are really broken — unlike other popular solutions that list both good and bad hyperlinks all mixed together, making it very hard to comprehend and work with such "noisy" information. Also, our linking problem finder analyzes the entire site as a whole, keeps track of issues already reported, and doesn't repeat the same invalid URL unless you ask about it specifically: the URL Checker tool is very flexible!

Link checking

After a click on the Start button the validation process begins.

As you see the bookmark icons have changed.

       The bookmark has been placed in the bookmark validation queue and awaits validation.

       Linkman is validating this bookmark at the moment.

       All ok: Linkman could successfully get the web page.

       Page moved: The web page has moved, the bookmark path has been updated, the old path can be found in the bookmark Properties dialog.

       Page not found: The request for the web page was NOT successful, the page is probably offline or you have no authorization for it. You can look into bookmark Properties to see the exact status code.

       Connection timed out: The server has NOT responded at all within the desired time. The timeout value can be set in bookmark validation dialog.

       Transaction timed out: The server has responded to the request, but the connection has timed out during the transfer of the web page.

While the validation proceeds you can still manipulate the database.

Link Checking Tips

Best use of the Check for page changes feature

Linkman will indicate that many pages have changed after every bookmark validation (except for the first validation, where no changes can be reported), but if you visit some of those sites you will see no significant change. What is the reason for this?

There are many pages that change on every access — not the content of the page, but some statistical data, time stamps, visit counters, advert links, etc. Linkman removes most of this information by default, but it is impossible to be 100% exact in this process. It is best to exclude the sites where Linkman reports a change on every bookmark validation from the page change check (but not from the validation itself).

You can perform this by turning on the Ignore page changes flag in the Properties dialog of every bookmark. This would be a pain for more than 10 bookmarks, but there is a trick to do it the easy way:

1. Validate the whole database

2. Validate the whole database immediately thereafter

Linkman will underline some bookmarks — they most probably belong to the described category of always-changing sites (unless the page has really changed within the last few minutes).

Select all Query items (CTRL + A hotkey or shortcut menu item) and use the Set status function (Tools menu) to turn the Ignore page changes flag on.

Now, after a bookmark validation you can be sure that the underlined webpages have new content.

In the lmd file included with Linkman, some bookmarks already have the Ignore page changes flag turned on.

WARNINGS

file-missing-slash
The file: URL is missing a trailing slash.
file-system-path
The file: path is not the same as the system specific path.
ftp-missing-slash
The ftp: URL is missing a trailing slash.
http-auth-unknonwn
Unsupported HTTP authentication method.
http-cookie-store-error
An error occurred while storing a cookie.
http-decompress-error
An error occurred while decompressing the URL content.
http-empty-content
The URL had no content.
http-moved-permanent
The URL has moved permanently.
http-robots-denied
The http: URL checking has been denied.
http-unsupported-encoding
The URL content is encoded with an unknown encoding.
http-wrong-redirect
The URL has been redirected to an URL of a different type.
https-certificate-error
The SSL certificate is invalid or expired.
ignore-url
The URL has been ignored.
mail-no-connection
No connection to a MX host could be established.
mail-no-mx-host
The mail MX host could not be found.
mail-unverified-address
The mailto: address could not be verified.
nntp-no-newsgroup
The NNTP newsgroup could not be found.
nntp-no-server
No NNTP server was found.
url-anchor-not-found
URL anchor was not found.
url-content-size-unequal
The URL content size and download size are unequal.
url-content-size-zero
The URL content size is zero.
url-content-too-large
The URL content size is too large.
url-effective-url
The effective URL is different from the original.
url-error-getting-content
Could not get the content of the URL.
url-obfuscated-ip
The IP is obfuscated.
url-warnregex-found
The warning regular expression was found in the URL contents.
url-whitespace

The URL contains leading or trailing whitespace.

OUTPUT TYPES

Note that by default only errors and warnings are logged; use the --verbose option to log the complete URL list.

text
Standard text logger, logging URLs in keyword: argument fashion.
html
Log URLs in keyword: argument fashion, formatted as HTML.
Additionally has links to the referenced pages. Invalid URLs have
HTML and CSS syntax check links appended.
csv
Log check result in CSV format with one URL per line.
gml
Log parent-child relations between linked URLs as a GML sitemap graph.
dot
Log parent-child relations between linked URLs as a DOT sitemap graph.
gxml
Log check result as a GraphXML sitemap graph.
xml
Log check result as machine-readable XML.
sitemap
Log check result as an XML sitemap whose protocol is documented at
http://www.sitemaps.org/protocol.html.
sql
Log check result as SQL script with INSERT commands. An example
script to create the initial SQL table is included as create.sql.
blacklist
Suitable for cron jobs. Logs the check result into a file
~/.linkchecker/blacklist which only contains entries with invalid
URLs and the number of times they have failed.
none
Logs nothing. Suitable for debugging or checking the exit code.
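
The logger is selected on the command line with the --output option.
For example, an HTML report for a site could be produced like this
(the URL and the output file name are placeholders):

$ linkchecker --output=html http://www.example.com/ > linkreport.html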

Why you need our online Link Checker

Due to the lack of adequate problem detection tools (aka URL validators, web spiders, HTML crawlers, website health analyzers, etc.), it is very hard to identify exactly which local and external (outbound) hyperlinks have become dead, and it is even harder to fix them: to clean them up you need to know the precise location of the broken linking tag in the HTML code, and without that you would have to scan through thousands of source lines to find the exact HREF (or other subtag) that causes the problem.

This is exactly where our online Spider tool truly shines: it will crawl your entire site, check all pages for issues, and detect invalid webpage references on your website, telling you precisely where to fix them! For each bad hyperlink found (both internal and outgoing) you will see a screen that contains the page source and highlights the actual HTML tag containing the non-working URL, so you can correct the problem right away and repair your site very quickly. That way your customers won't be annoyed by 'Page Not Found' errors anymore when clicking your weblinks.

As a 100% online tool running on the Internet, our free Website Scanner & Problem Detector can be used on any computer, whether it's a Mac, PC, notebook/laptop, iPad (or even iPhone), Android or some other mobile device, and whether it runs Microsoft Windows, macOS, Apple iOS, Android, Chrome OS, Linux or good old UNIX. Our deep Dead Link Checker can be used with all popular browsers, including (but not limited to) Chrome, Firefox, Safari, MS Edge, Opera, and Internet Explorer (IE). All this makes this analyzer a true cross-platform SEO tool always ready for your use! Because of that, thousands of web developers, QA specialists, and webmasters use our validator for testing their internet projects — to quickly detect broken links and address them. Moreover, our HTTP server spider is capable of crawling and checking any website, no matter if it's coded by hand with pure HTML / XHTML, is based on PHP, ASP, JSP or Cold Fusion, or is built using Drupal, Joomla!, WordPress, DotNetNuke, Magento, Blogger, TYPO3, or any other CMS and e-Commerce platform.

We are getting a lot of positive feedback and many webmasters name this service as one of the best solutions available on the World-Wide-Web. Try it yourself to see why it’s so popular!

More cool & free features are coming soon including higher limits, MS Excel export, and for SEO experts — more configurable parameters.

There is also a no-limit commercial version of this online Link Checking Tool available that allows on-demand scanning without any limitations of this free demo edition. In addition, it’s able to:

  • validate sites of any size — large or even huge (it’s not a no-limit one, but the page limit can be set as high as you need),
  • scan individual sub-folders (URLs with slashes),
  • create advanced reporting layouts with column sorting and filtering,
  • get results exported in CSV format compatible with any spreadsheet apps,
  • have scheduled runs with delivery by email,
  • get automated reports sent to multiple recipients — daily, weekly, monthly, etc.

Analysis of freshly published articles using RSS and ATOM feeds is available as well. For details, please contact us anytime.

RECURSION

Before descending recursively into a URL, it has to fulfill several
conditions. They are checked in this order:

  1. A URL must be valid.

  2. A URL must be parseable. This currently includes HTML files, Opera
    bookmarks files, and directories. If a file type cannot be determined
    (for example it does not have a common HTML file extension, and the
    content does not look like HTML), it is assumed to be non-parseable.

  3. The maximum recursion level must not be exceeded. It is configured
    with the --recursion-level option and is unlimited per default.

  4. It must not match the ignored URL list. This is controlled with the
    --ignore-url option.

  5. The Robots Exclusion Protocol must allow links in the URL to be
    followed recursively. This is checked by searching for a “nofollow”
    directive in the HTML header data.
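
The order of these tests can be illustrated with a small, self-contained
predicate. This is not LinkChecker's actual code; the content types and
the robots handling below are simplified placeholders:

import re
from urllib.parse import urlsplit

def should_recurse(url, content_type, depth, max_depth,
                   ignore_patterns, robots_meta=""):
    """Decide whether a checked URL should be descended into."""
    # 1. The URL must be valid (here: a well-formed http, https or file URL).
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https", "file"):
        return False
    if parts.scheme != "file" and not parts.netloc:
        return False
    # 2. The content must be parseable (HTML, bookmark files, directories).
    if content_type not in ("text/html", "inode/directory"):
        return False
    # 3. The maximum recursion level must not be exceeded (None = unlimited).
    if max_depth is not None and depth >= max_depth:
        return False
    # 4. The URL must not match the ignored URL list.
    if any(re.search(pattern, url) for pattern in ignore_patterns):
        return False
    # 5. A robots "nofollow" directive blocks recursive following.
    if "nofollow" in robots_meta.lower():
        return False
    return True

print(should_recurse("http://www.example.com/", "text/html",
                     depth=1, max_depth=None, ignore_patterns=[r"/secret"]))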

How to find broken links

There are different ways to find broken links on a website. Below we describe in detail the methods that do not require paid software and guarantee the most accurate check. The overview includes 100% free tools: no trials, no limits on the number of checks, and no cut-down functionality.

Yandex.Webmaster

You can check a site for broken links in Yandex.Webmaster. To do this, open the control panel and go to Indexing, then Crawl statistics.

Then switch to the All pages tab, select 404 Not Found in the Response code filter and click Apply.

The list of problem links can be exported as XLS or CSV tables.

Google Search Console

A similar procedure can be carried out with Google Search Console. The workflow is as follows: open the control panel, go to Index, then Coverage, and on the Errors tab you get a detailed report on the current issues on the site.

Xenu Link Sleuth

A proven tool for finding broken links efficiently. Xenu Link Sleuth is a completely free program that is installed on a PC. In terms of functionality it is a full-fledged analogue of popular commercial analyzers such as Netpeak Spider or Screaming Frog SEO Spider.

Although Xenu has not been updated since 2010, it is still valued by many specialists as a multifunctional tool for a comprehensive audit of a site's internal structure. Besides finding dead links, it can be used to solve many other SEO tasks:

  • automatically generate a Sitemap;
  • find pages with long response delays;
  • filter out documents with non-unique titles;
  • find pages with a deep nesting level;
  • view internal and external link statistics for a specific page;
  • find images with a missing alt attribute, helping to optimize the images on the site.

For all its capabilities, the program is extremely easy to learn. Among its relative drawbacks: it installs only on Windows.

LinkChecker

A link validator with a free GPL license. The tool is available as a desktop program. It can check links on individual pages as well as crawl an entire site. Unlike Xenu Link Sleuth, this software can be used not only on Windows but also on Linux or macOS. Drawbacks: you will have to spend some time figuring out the installation and understanding how the program works. The interface is in English.

Screaming Frog SEO Spider

SEO Frog is a site analysis tool known to every SEO specialist. The program is installed on a computer; the free version is limited to 500 checked URLs. It is a great fit for site owners who handle the promotion of their own project themselves. For larger teams, purchasing a license is a must.

Brokenlinkcheck

Among online tools for finding broken links, the service www.brokenlinkcheck.com is probably the best choice. Note right away that it is a freemium tool, but with quite attractive terms of use. The free version allows you to check 3,000 pages of a site with no limit on the number of scanned links. For high-volume use this limit will not be enough, but it is a great fit for anyone who wants to run a one-off audit of their own project.

The service is extremely simple and clear and requires no registration. All you need to do is enter the address of the site you are interested in and start the scan. The algorithm prepares a detailed report with information not only on internal but also on external URLs. The validator pinpoints broken links with high accuracy and shows exactly where the problem elements are located in your HTML code.

Broken Link Checker: a plugin for WordPress

A discussion of link validators would be incomplete without touching on WordPress plugins. The Broken Link Checker module is unanimously considered the best free solution here. It is not the easiest to master, but it is definitely worth spending the time to figure it out if you own a WordPress site. The plugin's main purpose is finding and tracking broken links. It scans all the content on the site, including comments, blogrolls, the contents of custom fields and so on, and notifies you of detected problems in the dashboard or by email. In addition, the plugin detects missing images and faulty redirects. Broken links can be edited right from the module's panel, which speeds up the work considerably.

What is the nature of invalid hyperlinks?

With the growth of website content, it gets harder and harder to manage relations between individual webpages and keep track of weblinks within the site. Unfortunately, there are no perfect website integrity tools or services that can check and enforce a proper relationship between pages, keep track of moved content and renamed webpages, and update each corresponding URL automatically. With time this causes some of your internal links to become obsolete, stale, dangling, and simply dead, because they don't lead to valid resources anymore. Modern content management systems (CMS like Joomla) and blog software may aggravate the problem even more — by replicating the same broken links across the numerous webpages they generate dynamically, so people will be getting 404 errors much more frequently. Your visitors will get 404 error codes and see 'Page Not Found' messages (or other unsuccessful HTTP responses) each time they try to access those resources.
With outbound references, the situation is even worse: the website you are linking to can change the names and locations of its pages at any time without notice, breaking previously working links. External servers can be brought down (temporarily or forever), or their domains can expire or be sold. Alas, you don't have any control over such things, so the only good remedy is to perform regular sanity tests, probing every single outgoing reference to make sure it's still alive and NOT pointing at some non-existing content.

PROXY SUPPORT

To use a proxy on Unix or Windows set the http_proxy, https_proxy, or
ftp_proxy environment variables to the proxy URL. The URL should be of
the form
http://host.
LinkChecker also detects manual proxy settings of Internet Explorer
under Windows systems, and GNOME or KDE on Linux systems. On a Mac use
the Internet Config to select a proxy.
You can also set a comma-separated domain list in the no_proxy
environment variable to ignore any proxy settings for these domains.

Setting an HTTP proxy on Unix, for example, looks like this:

$ export http_proxy="http://proxy.example.com:8080"

Proxy authentication is also supported:

$ export http_proxy="http://user1:mypass@proxy.example.org:8081"

Setting a proxy on the Windows command prompt:
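
C:\> set http_proxy=http://proxy.example.com:8080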

Find links

Link Checker looks for valid, suspicious, and broken links in Word documents. Perform the following actions to find links in your file:

  1. Open your MS Word document with the links you want to check.
  2. Open the add-in by clicking on its icon under the AbleBits.com tab:
  3. Select the range where the add-in will search for links:
    • You can choose the Entire document option to scan through the whole Word file.
    • You can scan only a selected piece of the document if you choose Selection from the drop-down menu.
    • To check only the page you are on, select Current page.
    • If you want to scan some particular pages of the document, choose Pages and specify the initial and final page numbers in the corresponding fields.
  4. Tick the Check external links box to search both bookmarks and hyperlinks:
  5. Click the Find links button and you will see the list of all the found links.

    Note. If you cancel the search when it is still in process, some links may be left unchecked. To review them, press the Filter icon and then choose Unchecked:

RECURSION

1. A URL must be valid.

2. A URL must be parseable. This currently includes HTML files,
   Opera bookmarks files, and directories. If a file type cannot
   be determined (for example it does not have a common HTML file
   extension, and the content does not look like HTML), it is assumed
   to be non-parseable.

3. The URL content must be retrievable. This is usually the case
   except for example mailto: or unknown URL types.

4. The maximum recursion level must not be exceeded. It is configured
   with the --recursion-level option and is unlimited per default.

5. It must not match the ignored URL list. This is controlled with
   the --ignore-url option.

6. The Robots Exclusion Protocol must allow links in the URL to be
   followed recursively. This is checked by searching for a
   "nofollow" directive in the HTML header data.

Note that the directory recursion reads all files in that
directory, not just a subset like index.htm*.


COOKIE FILES

Scheme (optional)
Sets the scheme the cookies are valid for; default scheme is http.
Host (required)
Sets the domain the cookies are valid for.
Path (optional)
Gives the path the cookies are valid for; the default path is /.
Set-cookie (optional)
Set cookie name/value. Can be given more than once.

Multiple entries are separated by a blank line.
The example below will send two cookies to all URLs starting with
http://example.com/hello/ and one to all URLs starting
with https://example.org/:

 Host: example.com
 Path: /hello
 Set-cookie: ID="smee"
 Set-cookie: spam="egg"

 Scheme: https
 Host: example.org
 Set-cookie: baggage="elitist"; comment="hologram"
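
If you want to process such a cookie file yourself, the
blank-line-separated format is straightforward to parse. A rough sketch
in Python (not LinkChecker's own parser; the defaults follow the
description above):

def parse_cookie_file(text):
    """Parse blank-line-separated entries of 'Key: value' lines."""
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {"Scheme": "http", "Path": "/", "Set-cookie": []}
        for line in block.splitlines():
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            if key == "Set-cookie":
                entry["Set-cookie"].append(value)
            else:
                entry[key] = value
        entries.append(entry)
    return entries

example = 'Host: example.com\nPath: /hello\nSet-cookie: ID="smee"'
print(parse_cookie_file(example))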

Conclusion

In this article, we have discussed the top website broken link checker tools available on the market.

The tools discussed above are the most popular, and their features and pricing can suit most industry needs. Dead link checker tools are quick, easy to install, and often free of cost. You can choose any tool as per your needs.

From our research, Xenu's Link Sleuth is the best value for Windows users, with Screaming Frog SEO Spider coming after that. For Mac users, the Integrity tool is the best choice for all types of industries.

=>> Contact us to suggest a listing here.

EXAMPLES

The most common use checks the given domain recursively:

$ linkchecker http://www.example.com/

Beware that this checks the whole site, which can have thousands of
URLs. Use the --recursion-level option to restrict the recursion depth.

Don't check URLs with /secret in their name. All other links are
checked as usual:

$ linkchecker --ignore-url=/secret mysite.example.com

Checking a local HTML file on Unix:

$ linkchecker ../bla.html

Checking a local HTML file on Windows:

C:\> linkchecker c:\temp\test.html

You can skip the http:// url part if the domain starts with
www.:

$ linkchecker www.example.com

You can skip the ftp:// url part if the domain starts with ftp.:

$ linkchecker -r0 ftp.example.com

Generate a sitemap graph and convert it with the graphviz dot utility:
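
$ linkchecker --output=dot --verbose www.example.com | dot -Tps > sitemap.ps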

Reviews


Verifies just 500 links, and the professional version is very expensive.

Works, but very limited with the free version and so expensive for "pro".
=> uninstall

The lite one is useless, and the professional version doesn't have a listed price (nothing to go by); it depends on site size, I think.
Very disappointed.

Initial impressions after download and a test are very good.

Straightforward interface, quick scan and understandable results window.

While many features look promising, this can only be run on a live website.

This lends itself to "cowboy coding", or possibly breaking a live website.

Thus it is not useful when making big changes to one's website, such as if you want to change thumbnail image sizes. In this case you might need to "break" the website by first regenerating thumbnails, then "fix" it by updating the newly broken links with the updated image URLs.

However for many people, and many other use cases, this plugin could work well.
For example, if your live website already has broken links, and you just want to clean them up, it would be fine.
And if you don’t use a dev server ever, then this restriction will not impact you whatsoever.
Finally, if you do your dev work on a live dev server, accessible to other websites, then this could also be a great plugin for you!!

I created a new website with all links in https and all the plugin tells me is: The check of your website failed with the error: URL not reachable.

I heard it is maybe a server issue and not an issue with the plugin. I don't know, but I give the plugin 3 stars for the hours of programming.
