30+ parsers for collecting data from any website

Crawl Overview Report

Under the new ‘reports’ menu discussed above, we have included a little ‘crawl overview report’. This does exactly what it says on the tin and provides an overview of the crawl, including the total number of URLs encountered in the crawl, the total actually crawled, the content types, response codes etc., with proportions. This will hopefully provide another quick and easy way to analyse overall site health at a glance. Here’s a third of what the report looks like –

We have also changed the max page title length to 65 characters (although it now seems to be based on pixel width), added a few more preset mobile user agents, fixed some bugs (such as large sitemaps being created over the 10Mb limit) and made other smaller tweaks along the way.
As always, we love any feedback and thank you again for all your support. Please do update to try out all the new features.

3) View The Source Of The Broken Links By Clicking The ‘Inlinks’ Tab

Obviously you’ll want to know the source of the broken links discovered (which URLs on the website link to these broken links), so they can be fixed. To do this, simply click on a URL in the top window pane and then click on the ‘Inlinks’ tab at the bottom to populate the lower window pane.

You can click on the above to view a larger image. As you can see in this example, there is a broken link to the BrightonSEO website (https://www.brightonseo.com/people/oliver-brett/), which is linked to from this page – https://www.screamingfrog.co.uk/2018-a-year-in-review/.

Here’s a closer view of the lower window pane which details the ‘inlinks’ data –

‘From’ is the source where the 404 broken link can be found, while ‘To’ is the broken link. You can also see the anchor text, alt text (if it’s an image which is hyperlinked) and whether the link is followed (true) or nofollow (false).

It looks like the only broken links on our website are external links (sites we link out to), but obviously the SEO Spider will discover any internal broken links if you have any.
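If you prefer to work on the data outside the tool, you can also bulk export the 4xx inlinks (via ‘Bulk Export > Response Codes’) and summarise the sources from the command line. Here’s a minimal bash sketch, assuming the export was saved as client_error_4xx_inlinks.csv and that the ‘From’ column is the second column (the filename and column position are assumptions, so check your own export):

 # Tally how many broken links each source page contains.
 # Note: a simple comma split will misbehave if any field contains a quoted comma.
 tail -n +2 client_error_4xx_inlinks.csv | cut -d',' -f2 | sort | uniq -c | sort -rn | head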

macOS

Open a terminal, found in the Utilities folder in the Applications folder, or directly by using Spotlight and typing ‘Terminal’.

There are two ways to start the SEO Spider from the command line: you can use either the open command or the ScreamingFrogSEOSpiderLauncher script. The open command returns immediately, allowing you to close the Terminal afterwards. The ScreamingFrogSEOSpiderLauncher script logs to the Terminal until the SEO Spider exits; closing the Terminal kills the SEO Spider.

To start the UI using the open command:

open "/Applications/Screaming Frog SEO Spider.app"

To start the UI using the ScreamingFrogSEOSpiderLauncher script:

/Applications/Screaming\ Frog\ SEO\ Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher

To see a full list of the command line options available:

/Applications/Screaming\ Frog\ SEO\ Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher --help

The following examples show both ways of launching the SEO Spider.

To open a saved crawl file:

open "/Applications/Screaming Frog SEO Spider.app" --args /tmp/crawl.seospider
/Applications/Screaming\ Frog\ SEO\ Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher /tmp/crawl.seospider

To start the UI and immediately start crawling:

open "/Applications/Screaming Frog SEO Spider.app" --args --crawl https://www.example.com/
/Applications/Screaming\ Frog\ SEO\ Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher --crawl https://www.example.com/

To start headless, immediately start crawling, save the crawl, and export the Internal > All and Response Codes > Client Error (4xx) filters:

open "/Applications/Screaming Frog SEO Spider.app" --args --crawl https://www.example.com --headless --save-crawl --output-folder /tmp/cli --export-tabs "Internal:All,Response Codes:Client Error (4xx)"
/Applications/Screaming\ Frog\ SEO\ Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher --crawl https://www.example.com --headless --save-crawl --output-folder /tmp/cli --export-tabs "Internal:All,Response Codes:Client Error (4xx)"

Please see the full list of available command line options to supply as arguments for the SEO Spider.
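If you want to repeat a headless crawl on a schedule, the launcher can be wrapped in a small shell script and run from cron. A rough sketch using only the flags shown above (the paths, filenames and schedule are illustrative assumptions):

 #!/bin/bash
 # nightly-crawl.sh – headless crawl into a dated output folder
 OUTPUT="/tmp/cli/$(date +%Y-%m-%d)"
 mkdir -p "$OUTPUT"
 /Applications/Screaming\ Frog\ SEO\ Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher \
   --crawl https://www.example.com --headless --save-crawl \
   --output-folder "$OUTPUT" \
   --export-tabs "Internal:All,Response Codes:Client Error (4xx)"

 # Example crontab entry to run it at 2am every day:
 # 0 2 * * * /path/to/nightly-crawl.sh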

3) Multi-Select Details & Bulk Exporting

You can now select multiple URLs in the top window pane, view specific lower window details for all the selected URLs together, and export them. For example, if you click on three URLs in the top window, then click on the lower window ‘inlinks’ tab, it will display the ‘inlinks’ for those three URLs.

You can also export them via the right click or the new export button available for the lower window pane.

Obviously this scales, so you can do it for thousands, too.

This should provide a nice balance between exporting everything in bulk via the ‘Bulk Export’ menu and then filtering in spreadsheets, and the previous singular option via right click.

Configuring The Crawl

You don’t need to adjust the configuration to crawl a website, as the SEO Spider is set up by default to crawl in a similar way to Google.

However, there are a myriad of ways that you can configure the crawl to get the data you want. Check out the options under ‘Configuration‘ within the tool and refer to our user guide for detail on each setting.

Some of the most common ways to control what’s crawled are to crawl a specific subfolder, use the exclude feature (to avoid crawling URLs by URL pattern) or use the include feature.

If your website relies on JavaScript to populate content, you can also switch to JavaScript rendering mode under ‘Configuration > Spider > Rendering’.

This will mean JavaScript is executed and the SEO Spider will crawl content and links within the rendered HTML.
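If you want the same behaviour from the command line, the usual approach is to save a configuration file with JavaScript rendering enabled via the UI (typically ‘File > Configuration > Save As…’), then point a headless crawl at it. A minimal sketch for macOS using the launcher shown earlier; the --config flag and the .seospiderconfig path are assumptions, so confirm them against the --help output for your version:

 /Applications/Screaming\ Frog\ SEO\ Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher \
   --crawl https://www.example.com/ --headless \
   --config "/tmp/js-rendering.seospiderconfig" \
   --save-crawl --output-folder /tmp/cli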

Troubleshooting

If set up correctly, this process should be seamless, but occasionally Google might catch wind of what you’re up to and come down to stop your fun with an annoying anti-bot captcha test.

If this happens, just pause your crawl, load a PSI page in a browser to solve the captcha, then jump back into the tool, highlight the URLs that did not extract any data and right click > Re-Spider.

If this continues, the likelihood is that your crawl speed is set too high; lower it a little in the options mentioned above and it should put you back on track.

I’ve also noticed a number of comments reporting the PSI page not rendering properly and nothing being extracted. If this happens, it might be worth clearing to the default config (File > Configuration > Clear to default). Next, make sure the user-agent is set to ScreamingFrog. Finally, ensure you have the following configuration options ticked (Configuration > Spider):

  • Check Images
  • Check CSS
  • Check JavaScript
  • Check SWF
  • Check External Links

If, for any reason, the page is rendering correctly but some scores weren’t extracted, double check that the XPaths have been entered correctly and the dropdown is set to ‘Extract Text’. Secondly, it’s worth checking that PSI actually has that data by loading it in a browser, as much of the real-world data is only available for high-volume pages.
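A quick way to sanity-check an XPath outside the tool is to save a copy of the rendered PSI page from your browser and test the expression locally with xmllint. A minimal sketch; psi-page.html and the XPath below are placeholders rather than the real PSI markup:

 # Evaluate a hypothetical XPath against a saved HTML file.
 # The 2>/dev/null simply hides xmllint's HTML parser warnings.
 xmllint --html --xpath '//div[@class="example-score"]/text()' psi-page.html 2>/dev/null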

3) XML Sitemap & Sitemap Index Crawling

The SEO Spider already allows crawling of XML sitemaps in list mode by uploading the .xml file (number 8 in the ‘10 features in the SEO Spider you should really know‘ post). This was always a little clunky, as you had to save the file first if it was already live (though it was handy when the sitemap wasn’t uploaded yet!).

So we’ve now introduced the ability to enter a sitemap URL to crawl it (‘List Mode > Download Sitemap’).

Previously if a site had multiple sitemaps, you’d have to upload and crawl them separately as well.

Now if you have a sitemap index file to manage multiple sitemaps, you can enter the sitemap index file URL and the SEO Spider will download all sitemaps and subsequent URLs within them!

This should help save plenty of time!
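If you’d like to preview what a sitemap index references before crawling it, a quick terminal one-liner will list the <loc> entries. A rough sketch; the URL is an example and the grep is a crude text match rather than a proper XML parse:

 curl -s https://www.example.com/sitemap_index.xml | grep -o '<loc>[^<]*</loc>' | sed 's/<[^>]*>//g'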

Other Updates

Version 10.0 also includes a number of smaller updates and bug fixes, outlined below.

  • You’re now able to automatically load new URLs discovered via Google Analytics and Google Search Console into a crawl. Previously, new URLs discovered were only available via the orphan pages report; this is now configurable. This option can be found under ‘API Access > GA/GSC > General’.
  • ‘Non-200 Hreflang URLs’ have now been moved into a filter under the ‘Hreflang’ tab.
  • You can disable respecting HSTS policy under the advanced configuration (to more easily retrieve true redirect status codes, rather than an internal 307).
  • The ‘Canonical Errors’ report has been renamed to ‘Non-Indexable Canonicals’ and is available under ‘Reports’ in the top-level menu.
  • The ‘rel=”next” and rel=”prev” Errors’ report has been split into the ‘Pagination > Non-200 Pagination URLs’ and ‘Pagination > Unlinked Pagination URLs’ reports.
  • Hard disk space has been reduced by around 30% for crawls in database storage mode.
  • Re-spidering of URLs in bulk on larger crawls is faster and more reliable.
  • There are new ‘bulk exports’ for Sitemaps and AMP as you would expect.
  • The main URL address bar at the top is now much wider.
  • Donut charts and right click highlighting colours have been updated.
  • There’s a new configuration item for list mode auditing.
  • The 32k character limit for custom extraction has been removed.
  • ‘rel=”next”’ and ‘rel=”prev”’ are now available in the ‘Internal’ tab.
  • ‘Max Redirects To Follow’ configuration has been moved under the ‘Limits’ tab.
  • There’s now a ‘resources’ lower window tab, which includes (you guessed it), resources.
  • The website profiles list is now searchable.
  • The include and exclude configurations now have a ‘test’ tab to help you test your regex before a crawl.
  • There’s a new splash screen on start-up.
  • There’s a bunch of new right click options for popular checks with other tools such as PageSpeed Insights, the Mobile Testing Tool etc.

That’s everything. If you made it this far and are still reading, thank you for caring. Thank you to everyone for the many feature requests and feedback, which have helped the SEO Spider improve so much over the past 8 years.

If you experience any problems with the new version, then please do just let us know via support and we can help.

Now, go and download version 10.0 of the Screaming Frog SEO Spider.

4) Ajax Crawling #!

Some of you may remember an older version of the SEO Spider which had an iteration of Ajax crawling, which was removed in a later version. We have redeveloped this feature, so the SEO Spider can now crawl Ajax as per Google’s Ajax crawling scheme, also sometimes (annoyingly) referred to as hashbang URLs (#!).

There is also an Ajax tab in the UI, which shows both the ugly and pretty URLs, with filters for hash fragments. Some pages may not use hash fragments (such as a homepage), so the ‘fragment’ meta tag can be used to recognise an Ajax page. In the same way as Google, the SEO Spider will then fetch the ugly version of the URL.
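For reference, the scheme maps a ‘pretty’ hashbang URL onto an ‘ugly’ URL with an _escaped_fragment_ parameter, and pages without a hash fragment opt in with the fragment meta tag, for example (the URLs are illustrative):

 Pretty URL:  https://www.example.com/#!section=products
 Ugly URL:    https://www.example.com/?_escaped_fragment_=section=products

 <!-- fragment meta tag on a page without a hash fragment, e.g. a homepage -->
 <meta name="fragment" content="!">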

Discovering Errors & Issues

The right hand window ‘overview’ tab displays a summary of crawl data contained within each tab and filter. You can scroll through each of these sections to identify potential errors and issues discovered, without needing to click into each tab and filter.

The number of URLs that are affected is updated in real-time during the crawl for most filters and you can click on them to be taken directly to the relevant tab and filter.

The SEO Spider doesn’t tell you how to do SEO, it provides you with data to make more informed decisions. The ‘filters’ do however provide hints on specific issues that should be addressed or at least considered further in the context of your site.

Explore these hints and if you’re unsure about the meaning of a tab or filter, just refer to our user guide. Every tab has a corresponding section which explains every column and filter.

2) In-App Memory Allocation

First of all, apologies for making everyone manually edit a .ini file to increase memory allocation for the last 8 years. You’re now able to set memory allocation within the application itself, which is a little more user-friendly. This can be set under ‘Configuration > System > Memory’. The SEO Spider will even tell you how much physical memory is installed on the system, and allow you to configure the allocation quickly.

Increasing memory allocation will enable the SEO Spider to crawl more URLs, particularly when in RAM storage mode, but also when storing to database. The memory acts like a cache when saving to disk, which allows the SEO Spider to perform quicker actions and crawl more URLs.

ScrapeBox

A program for finding platforms where you can place links and comments.

It scrapes relevant URLs from the search results of more than 30 engines, including Google, Yahoo and Bing, collects a list of links matching the keywords in your query, performs keyword research based on data from 10 services (such as Google Suggest) and generates long-tail keywords. It lets you create and post automatic and semi-automatic comments with backlinks, both on your own site and on external ones. It checks page indexation, rankings and site authority, as well as the full set of existing backlinks and anchor links pointing to your resource. It also finds free proxies, which helps you avoid being banned while it runs.

Features:

  • the interface is in English, but the program also supports Russian, French, German, Portuguese, Chinese and Japanese,
  • in search, you can choose the platform to harvest from (Google, Bing, AOL, Yahoo),
  • there are extra features for downloading videos, generating a sitemap from a list of URLs, finding domains that are free to register, collecting email addresses from websites, and more,
  • runs on Windows and macOS, but will not work with WINE,
  • 24/7 technical support.


Viewing Crawl Data

Data from the crawl populates in real-time within the SEO Spider and is displayed in tabs. The ‘Internal’ tab includes all data discovered in a crawl for the website being crawled. You can scroll up and down, and to the right, to see all the data in various columns.

The tabs focus on different elements and each have filters that help refine data by type, and by potential issues discovered.

The ‘Response Codes’ tab and ‘Client Error (4xx)’ filter will show you any 404 pages discovered, for example.

You can click on URLs in the top window and then on the tabs at the bottom to populate the lower window pane.

These tabs provide more detail on the URL, such as their inlinks (the pages that link to them), outlinks (the pages they link out to), images, resources and more.

In the example above, we can see inlinks to a broken link discovered during the crawl.

2) Database Storage Crawl Auto Saving & Rapid Opening

Last year we introduced database storage mode, which allows users to choose to save all data to disk in a database rather than just keep it in RAM, enabling the SEO Spider to crawl very large websites.

Based upon user feedback, we’ve improved the experience further. In database storage mode, you no longer need to save crawls (as an .seospider file), they will automatically be saved in the database and can be accessed and opened via the ‘File > Crawls…’ top-level menu.

The ‘Crawls’ menu displays an overview of stored crawls, allows you to open them, rename, organise into project folders, duplicate, export, or delete in bulk.

The main benefit of this switch is that re-opening stored crawls is significantly quicker than opening an .seospider crawl file in database storage mode. You won’t need to load in .seospider files anymore, which previously could take some time for very large crawls; opening a crawl from the database is often instant.

You also don’t need to save anymore; crawls will automatically be committed to the database. It does mean you will need to delete crawls you don’t want to keep from time to time (this can be done in bulk).

You can export the database crawls to share with colleagues, or, if you prefer, export them as an .seospider file for anyone still using memory storage mode. You can obviously still open .seospider files in database storage mode as well; these will take time to convert to a database (in the same way as version 11) before they are compiled and can then be re-opened almost instantly each time.

Export and import options are available under the ‘File’ menu in database storage mode.

To avoid accidentally wiping crawls, the crawl is stored whenever you ‘clear’, start a new crawl from an existing one, or close the program. This leads us nicely onto the next enhancement.

Closing Thoughts

The guide above should help you identify JavaScript websites and crawl them efficiently using the Screaming Frog SEO Spider tool in JavaScript rendering mode.

While we have performed plenty of research internally and worked hard to mimic Google’s own rendering capabilities, a crawler is still only ever a simulation of real search engine bot behaviour.

We highly recommend using log file analysis and Google’s own URL Inspection Tool, or the relevant version of Chrome, alongside a JavaScript crawler, to fully understand what search engines are able to crawl, render and index.

Additional Reading

  • Understand the JavaScript SEO Basics – From Google.
  • Core Principles of JS SEO – From Justin Briggs.
  • Progressive Web Apps Fundamentals Guide – From Builtvisible.
  • Crawling JS Rich Sites – From Onely.

If you experience any problems when crawling JavaScript, or encounter any differences between how we render and crawl, and Google, we’d love to hear from you. Please get in touch with our support team directly.

Installation

You can install the SEO Spider in one of two ways.

GUI

– Double click on the .deb file.
– Choose “Install” and enter your password.
– The SEO Spider requires the ttf-mscorefonts-installer package to be run, so accept the licence for this when it pops up.
– Wait for the installation to complete.

Command Line

Open up a terminal and enter the following command.

 sudo apt-get install ~/Downloads/screamingfrogseospider_10.4_all.deb

You will need to enter your password, then enter Y when asked if you want to continue, and accept the ttf-mscorefonts-installer EULA.
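Once the install completes, you can check everything is in place by launching from the terminal. Recent packages install a screamingfrogseospider command on the PATH; if that assumption doesn’t hold for your version, launch it from the applications menu instead.

 screamingfrogseospider --help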

Troubleshooting

  • E: Unable to locate package screamingfrogseospider_10.4_all.deb

    Please make sure you are entering an absolute path to the .deb to install as per the example.

  •  Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/u/somepackage.deb 404 Not Found 

    Please run the following and try again.

    sudo apt-get update

3) Crawl Overview Right Hand Window Pane

We received a lot of positive feedback on our crawl overview report when it was released last year. However, we felt it was a little hidden away, so we have introduced a new right-hand window which includes the crawl overview report by default. This overview pane updates alongside the crawl, which means you can see which tabs and filters are populated at a glance during the crawl, along with their respective percentages.

This means you don’t need to click on the tabs and filters to uncover issues; you can just browse and click on these directly as they arise. The ‘Site structure’ tab provides more detail on crawl depth and the most linked-to pages, without needing to export the ‘crawl overview’ report or sort the data. The ‘response times’ tab provides a quick overview of the response times of the SEO Spider’s requests. This new window pane will be updated further in the next few weeks.

You can choose to hide this window, if you prefer the older format.

8) Export Combined Orphan URLs via ‘Reports > Orphan Pages’

Finally, use the ‘Orphan Pages’ report if you wish to export a combined list of all orphan pages discovered.

There’s a ‘Source’ column next to each orphan URL, which provides the source of discovery. These have been abbreviated to ‘GA’ for Google Analytics, ‘GSC’ for Google Search Console and ‘Sitemaps’, for, erm, XML Sitemaps.

If you have integrated Google Analytics and Search Console in a crawl, but didn’t tick the ‘Crawl New URLs Discovered In GA/GSC’ configuration option, then this report will still contain data for those URLs. They just won’t have been crawled, and won’t appear under the respective tabs and filters.
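If you want to slice the export by source of discovery from the command line, a quick filter over the CSV works. A minimal sketch, assuming the report was saved as orphan_pages.csv and that the ‘Source’ column is the first column (both are assumptions, so check your own export):

 # Count orphan URLs per source of discovery (GA, GSC, Sitemaps).
 tail -n +2 orphan_pages.csv | cut -d',' -f1 | sort | uniq -c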

5) Crawl In Sections (Subdomain or Subfolders)

If the website is very large, you can consider crawling it in sections. By default, the SEO Spider will crawl just the subdomain entered, and all other subdomains encountered will be treated as external (and appear under the ‘External’ tab). You can choose to crawl all subdomains, but obviously this will take up more memory.

The SEO Spider can also be configured to crawl a subfolder by simply entering the subfolder URI with the file path and ensuring ‘check links outside of start folder’ and ‘crawl outside of start folder’ are deselected under ‘Configuration > Spider’. For example, to crawl our blog, you’d simply enter https://www.screamingfrog.co.uk/blog/ and hit start.

Please note that if there isn’t a trailing slash on the end of the subfolder, for example ‘/blog’ instead of ‘/blog/’, the SEO Spider won’t currently recognise it as a subfolder and crawl within it. The same applies if the trailing slash version of a subfolder redirects to a non-trailing slash version.

To crawl such a subfolder, you’ll need to use the include feature and input the regex of that subfolder (.*blog.* in this example).
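Once those options are deselected (or the include regex is in place), the same subfolder crawl can also be run headless on macOS using the launcher and flags shown earlier. The output folder below is illustrative and needs to exist first:

 mkdir -p /tmp/blog-crawl
 /Applications/Screaming\ Frog\ SEO\ Spider.app/Contents/MacOS/ScreamingFrogSEOSpiderLauncher \
   --crawl https://www.screamingfrog.co.uk/blog/ --headless \
   --save-crawl --output-folder /tmp/blog-crawl --export-tabs "Internal:All"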

10) Improved Redirect & Canonical Chain Reports

The SEO Spider now reports on canonical chains and ‘mixed chains’, which can be found in the renamed ‘Redirect & Canonical Chains’ report.

For example, the SEO Spider now has the ability to report on mixed chain scenarios such as a redirect to a URL which is canonicalised to another URL, which has a meta refresh to another URL, which then JavaScript redirects back to the start URL. It will identify this entire chain and report on it.

The report has also been updated to have fixed-position columns for the start URL and final URL in the chain, and to report on the indexability and indexability status of the final URL, making it more efficient to see whether a redirect chain ends up at a ‘noindex’ or ‘error’ page etc. The full hops in the chain are still reported as previously, but in the columns that follow.

This means auditing redirects is significantly more efficient, as you can quickly identify the start and end URLs, and immediately see the chain type, the number of redirects and the indexability of the final target URL. There are also flags for chains that loop, or that have a temporary redirect somewhere in the chain.

There simply isn’t a better tool anywhere for auditing redirects at scale, and while a feature like visualisations might receive all the hype, this is significantly more useful for technical SEOs in the trenches every single day. Please read our updated guide on auditing redirects in a site migration.

3) The Crawl Path Report

We often get asked how the SEO Spider discovered a URL, or how to view the ‘in links’ to a particular URL. Well, generally the quickest way is by clicking on the URL in question in the top window and then using the ‘in links’ tab at the bottom, which populates the lower window pane (as discussed in our guide on finding broken links).

But, sometimes, it’s not always that simple. For example, there might be a relative linking issue, which is causing infinite URLs to be crawled, and you’d need to view the ‘in links’ of ‘in links’ (of ‘in links’ etc) many times to find the originating source. Or perhaps a page wasn’t discovered via an HTML anchor, but via a canonical link element.

This is where the ‘crawl path report’ is very useful. Simply right click on a URL, go to ‘export’ and ‘crawl path report’.

You can then view exactly how a URL was discovered in a crawl and its shortest path (read from bottom to top).

Simple.

Other Updates

Version 9.0 also includes a number of smaller updates and bug fixes, outlined below.

  • While we have introduced the new database storage mode to improve scalability, regular memory storage performance has also been significantly improved. The SEO Spider uses less memory, which will enable users to crawl more URLs than previous iterations of the SEO Spider.
  • The exclude configuration now works instantly, as it is applied to URLs already waiting in the queue. Previously, the exclude would only work on new URLs discovered, rather than those already found and waiting in the queue. This meant you could apply an exclude and it would be some time before the SEO Spider stopped crawling URLs that matched your exclude regex. Not anymore.
  • The ‘inlinks’ and ‘outlinks’ tabs (and exports) now include all sources of a URL, not just links (HTML anchor elements) as the source. Previously if a URL was discovered only via a canonical, hreflang, or rel next/prev attribute, the ‘inlinks’ tab would be blank and users would have to rely on the ‘crawl path report’, or various error reports to confirm the source of the crawled URL. Now these are included within ‘inlinks’ and ‘outlinks’ and the ‘type’ defines the source element (ahref, HTML canonical etc).
  • In line with Google’s plan to stop using the old AJAX crawling scheme (and to render the #! URL directly), we have adjusted the default rendering configuration. You can switch between text only, old AJAX crawling scheme and JavaScript rendering.
  • You can now choose to ‘cancel’ either loading in a crawl, exporting data or running a search or sort.
  • We’ve added some rather lovely line numbers to the feature.
  • To match Google’s rendering characteristics, we now allow blob URLs during crawl.
  • We renamed the old ‘GA & GSC Not Matched’ report to the ‘Orphan Pages’ report, so it’s a bit more obvious.
  • now applies to list mode input.
  • There’s now a handy ‘strip all parameters’ option within URL Rewriting for ease.
  • We have introduced numerous stability improvements.
  • The Chromium version used for rendering is now reported in the ‘Help > Debug’ dialog.
  • List mode now supports .gz file uploads.
  • The SEO Spider now includes Java 8 update 161, with several bug fixes.
  • Fix: The SEO Spider would incorrectly crawl all ‘outlinks’ from JavaScript redirect pages, or pages with a meta refresh, with ‘Always Follow Redirects’ ticked under the advanced configuration. Thanks to our friend Fili Wiese for spotting that one!
  • Fix: Ahrefs integration requesting domain and subdomain data multiple times.
  • Fix: Ahrefs integration not requesting information for HTTP and HTTPS on (sub)domain level.
  • Fix: The crawl path report was missing some link types, which has now been corrected.
  • Fix: Incorrect robots.txt behaviour for rules ending *$.
  • Fix: Auth browser cookie expiration date invalid for non-UK locales.

That’s everything for now. This is a big release and one which we are proud of internally, as it’s new ground for what’s achievable for a desktop application. It makes crawling at scale more accessible for the SEO community, and we hope you all like it.

As always, if you experience any problems with our latest update, then do let us know via support and we will help and resolve any issues.

We’re now starting work on version 10, where some long standing feature requests will be included. Thanks to everyone for all their patience, feedback, suggestions and continued support of Screaming Frog, it’s really appreciated.

Now, please go and download version 9.0 of the Screaming Frog SEO Spider and let us know your thoughts.

How to work with Screaming Frog SEO Spider

Crawling pages with the SEO Spider does not take much time or effort. After launching the program, you only need to adjust a few settings to fit your requirements.

The first thing to do is set the crawling mode (the ‘Mode’ menu). Depending on your needs, you can choose one of the program’s available operating modes.

  • Spider – crawls a website.
  • List – crawls a specified list of URLs.
  • SERP – checks page Titles and Descriptions, calculating character counts and the width and length in pixels before the meta tags go live on the site.

Spider mode

In this mode the entire website is crawled, including images. It’s simple: paste the site URL into the address bar and click ‘Start’.

List mode

In this mode the program will only crawl the URLs you specify.

There are several ways to add a list of URLs to Screaming Frog (a quick way to prepare such a file from the terminal is sketched after the list). To do this:

  1. Select the ‘List’ mode.
  2. Click the ‘Upload List’ button and choose how to add URLs from the drop-down menu:
    • Upload a file containing a list of URLs with ‘From a File…’
    • Enter them manually by choosing ‘Enter Manually…’
    • Copy the URLs to the clipboard and paste them into the program with ‘Paste’
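
A quick way to prepare such a file from the terminal, for use with ‘From a File…’ (the filename and URLs are just an example):

 printf '%s\n' "https://www.example.com/" "https://www.example.com/blog/" > urls.txt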

Once Screaming Frog SEO Spider has crawled your site or the specified URLs, a report with the URLs and information about them appears in the program’s main window.

The report is laid out so that each row is a separate page of the site (or simply a link), and the columns are its attributes.

Let’s move on to the main tabs, located at the top of the program’s main window. Each tab has its own table of URLs and filters by attribute.

More detail on the tabs and what they are responsible for:

  • Internal – usually open by default, it shows the main data collected for each URL, including the server response. This tab displays the largest number of parameters.
  • External – shows the external URLs the site links out to.
  • Response Codes – a tab showing the HTTP responses of pages.
  • URL – shows problematic URLs. Initially you see all the URLs the program has crawled; to see problematic links, select the issue type in the filter.
  • Page Titles – a tab where you can track pages with problematic titles. As with the previous item, to see the addresses of pages with problematic titles, select the issue type in the filter.
  • Meta Description – the same as Page Titles, but for page descriptions (the Description meta tag).
  • Meta Keywords – shows the contents of the Keywords tag for each page. Here you can see pages with duplicate keywords, or pages where the Keywords meta tag is empty.
  • The H1 and H2 tabs show the results for all H1 and H2 headings found on each page of the site.
  • Images – shows the list of images, their size and the number of links pointing to them.
  • Directives – here you can see the directive types for URLs: follow/nofollow, refresh, canonical and others.

As mentioned earlier, every tab has a filter, an export button, a table view button and a search box.

In more detail:

  • Filter – the table on each tab can be filtered by parameters, which in turn depend on the type of tab.
  • Export button – tables (reports) can be exported. They are exported with the current filter and sorting applied. (For example, if your filter shows only CSS files, then only CSS files will end up in the export.)
  • View – there are two options: a tree view and a list view (the latter is shown in the screenshot).
  • Search box – the search is global; the value you enter is looked up across all parameters of the active tab’s report.

At the bottom of Screaming Frog SEO Spider there is a window with its own tabs. It shows information for whichever URL is selected in the main window. More detail on each tab:

  • URL Info – basic information about the link
  • Inlinks – incoming links
  • Outlinks – outgoing links
  • Image Info – information about images associated with the selected URL.
  • SERP Snippet – information about the snippet of the selected URL.

The last thing we will cover in this article is the program’s right-hand panel, which, like the other panels, has its own tabs along the top.
