What Is a Sitemap?


A sitemap is a file that lists all of the pages of a website and how they’re related to each other.

Sitemaps can be lists of pages, media or files on a website. Your website may have one sitemap for your pages, one sitemap for your blog posts, one sitemap for your images and so on.

Sitemaps make it easier to find all of a website’s pages quickly and in a single location. They are normally saved in XML or HTML format.

What Is an XML Sitemap?

XML, or Extensible Markup Language, is a format for encoding information in a way that is easy for search engines to read.

An XML sitemap looks similar to a plain list of URLs, but each URL can carry some additional information. This includes the tags <changefreq> (signifying how frequently the content of a page updates) and <priority> (a hint to website crawlers about a URL’s level of importance).

The more frequently a page changes, the more often it should be crawled. The <changefreq> tag tells website crawlers, such as Google’s Googlebot, how often a page is likely to change, so that new content can be found and submitted to the index. The value is a hint rather than a command.

The <priority> tag helps to highlight which pages are the most important to your website. Priority values run from 0.0 to 1.0, usually in increments of 0.1, and a page with no <priority> tag is treated as 0.5. The closer to 1.0, the higher the priority level for a page.

When a website has thousands of pages, it’s important to highlight which pages Google should crawl first. This is because Google will only crawl a certain number of pages each time it visits a website. If an important page, such as a high-value service page, has a low priority level (0.1, for example), then Google may not revisit that page and detect any improvements made to it for hours, days or sometimes several weeks.

In principle, the higher the change frequency and priority level, the more frequently that page is crawled, although crawlers treat both values as hints rather than instructions.

Lower value pages, such as a website’s privacy policy page, may have no change frequency at all and may instead list a <lastmod> (last modified) tag which will include the date (in YYYY-MM-DD format).

This way, website crawlers like Googlebot can check a website’s sitemap, compare each page’s last modified date against the date stored in the index and decide whether the page should be re-crawled.
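The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Google actually works: the function name urls_to_recrawl and the last_crawled dictionary (a stand-in for the dates a crawler has stored) are hypothetical.

```python
from datetime import date
import xml.etree.ElementTree as ET

# Namespace used by the sitemap protocol
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def urls_to_recrawl(sitemap_xml, last_crawled):
    """Return URLs whose <lastmod> is newer than the date they were last crawled.

    last_crawled is a hypothetical dict mapping URL -> datetime.date.
    URLs that have never been crawled are always returned.
    """
    root = ET.fromstring(sitemap_xml)
    stale = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        seen = last_crawled.get(loc)
        if seen is None:
            stale.append(loc)
        # Only the YYYY-MM-DD part is compared, so a time suffix is ignored.
        elif lastmod and date.fromisoformat(lastmod.strip()[:10]) > seen:
            stale.append(loc)
    return stale
```

A page with no <lastmod> tag that has already been crawled is skipped here, which is a simplification. A real crawler would fall back on other signals.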

Example of an XML sitemap
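To make the structure concrete, here is a minimal Python sketch that writes a small XML sitemap using the tags described above. The example.com URLs, the dates and the build_sitemap function are placeholders invented for illustration. Only the tag names and the namespace come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Return an XML sitemap as a string for a list of page dictionaries."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # <lastmod>, <changefreq> and <priority> are optional,
        # so only the ones supplied for a page are written.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    ET.indent(urlset)  # pretty-printing; requires Python 3.9+
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Hypothetical pages on a placeholder domain
pages = [
    {"loc": "https://www.example.com/services/", "lastmod": "2021-03-01",
     "changefreq": "weekly", "priority": 0.9},
    {"loc": "https://www.example.com/privacy-policy/", "lastmod": "2020-05-25"},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Each page becomes one <url> entry inside the <urlset> element, with <loc> holding the page address and the optional tags following it.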

What Is an HTML Sitemap?

An HTML sitemap is a visual representation of a website’s structure. Like an XML sitemap, it lists the most important pages of a website, but in a more human-friendly way.

HTML sitemaps make it easier to find a page that is difficult to locate through a website’s main navigation menu or internal links.

HTML sitemaps are not as common as XML sitemaps, as they aren’t well-known by casual internet users, and for websites with thousands of pages, they can be near impossible to maintain.
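For a small website, an HTML sitemap can be as simple as a list of links. The Python sketch below shows one possible way to generate that list. The html_sitemap function, page titles and URLs are made up for illustration, and a real page would also need a surrounding template.

```python
from html import escape

def html_sitemap(pages):
    """Build a simple HTML list of links from (title, url) pairs."""
    items = "\n".join(
        f'  <li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for title, url in pages
    )
    return f"<ul>\n{items}\n</ul>"

# Hypothetical pages on a placeholder domain
pages = [
    ("About Us", "https://www.example.com/about/"),
    ("Terms & Conditions", "https://www.example.com/terms/"),
]
print(html_sitemap(pages))
```

Generating the list from the same data as the XML sitemap is one way to keep the two in step as pages are added or removed.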

Screenshot of an HTML Sitemap

Do I Need a Sitemap?

Every website should have a sitemap. Sitemaps are a practical format for handling and maintaining URLs, especially when you’re handling tens of thousands of them, as with an eCommerce store that has thousands of products.

Because you can create multiple sitemaps for a website, you can segment and manage sections of your website individually.

The most common split between sitemaps is a page-sitemap.xml file and a post-sitemap.xml file (due to the number of websites built using WordPress and the Yoast plugin). This allows you to review your website’s pages and posts separately, which is useful because blog posts are typically updated more frequently than pages and there are usually more of them (10 pages versus 100 blog posts, for example).

If you’re managing an eCommerce website, having sitemaps per product category can make product page management even easier. You could have a sitemap for your menswear products, your womenswear products, your accessories and so on. When the time comes that you need to audit and update your accessory pages, you will have a list of these pages ready and easy to find.
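As a rough sketch of how this segmentation could be automated, the Python snippet below groups URLs by the first part of their path so that each group can be written to its own sitemap file. The group_by_section function, the shop.example.com URLs and the file naming are assumptions for illustration, and it only works if your URLs include the category in the path.

```python
from collections import defaultdict
from urllib.parse import urlparse

def group_by_section(urls):
    """Group URLs by their first path segment, e.g. /menswear/... -> 'menswear'."""
    groups = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        # URLs with fewer than two segments (the home page, /about/)
        # are treated as general pages rather than a category.
        section = segments[0] if len(segments) >= 2 else "pages"
        groups[section].append(url)
    return dict(groups)

# Hypothetical URLs on a placeholder store
urls = [
    "https://shop.example.com/",
    "https://shop.example.com/about/",
    "https://shop.example.com/menswear/oxford-shirt/",
    "https://shop.example.com/menswear/chinos/",
    "https://shop.example.com/accessories/leather-belt/",
]
for section, members in group_by_section(urls).items():
    print(f"{section}-sitemap.xml: {len(members)} URLs")
```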

The largest benefit is that you can use your sitemap files to notify Google when you have made changes to your website, rather than waiting for Google to find your pages when it next crawls your website.

Using Google’s Search Console tool, you can submit your sitemap directly to Google so it knows that this list of pages should be crawled. Pages are then typically crawled sooner than if you waited for Google to find your sitemap and notice any <lastmod> tags a page may have.

How to Create a Sitemap

Some websites come with sitemaps automatically built as part of their system. The biggest example of this is the WordPress Content Management System (CMS), which has an automatically updating sitemap file built-in.

To make the WordPress sitemap easier to handle, some people use the Yoast SEO plugin for WordPress, which segments pages, posts and so on.

Each CMS has its own sitemap management system, although some systems may need a plugin.

There are rare occasions when a CMS may not have an in-built sitemap system in place or you may want to create your own sitemap by hand — you can then upload that sitemap to Google so it can crawl any new pages or bulk changes you may have made.

Before you create your sitemap, it can help to make a visual diagram of your website’s structure using a planning tool like Slickplan. You can then see how well structured your website is and how much segmentation your website has (or lacks).

To create your sitemap, you can use an online tool, like XML-Sitemaps.com, which will crawl your website (up to a limit of 500 URLs) and give you an XML sitemap to download at the end.

Personally, we prefer to use the SEO Spider software by Screaming Frog — which also has a 500 URL limit for free accounts — as it is a multifunctional tool that can help with both website crawling and sitemap validation.

Follow these instructions to create a sitemap using Screaming Frog.

Step 1 – Crawl Your Website Using Screaming Frog

Enter your website’s domain address into the toolbar and press “Start”. Depending on the size of your website, this may take some time as every page will need to be discovered and crawled.

Example of a completed Screaming Frog crawl

Step 2 – Remove Any Unwanted URLs

Once the crawl is complete, review the full list of URLs and look for any you do not want to include in your sitemap. To exclude a URL, right-click it and select “Remove”. If you hold the Shift or Ctrl key (on Windows), you can select multiple URLs to remove at once.

Example of removing URLs from Screaming Frog

Step 3 – Open the Sitemap Menu

Open the Sitemap menu on the toolbar and select “XML Sitemap”. A menu will open with several options. The default options will be to include only Status 200 URLs within the sitemap, but you can choose to include pages with noindex tags, paginated pages or those with 301 redirects.
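If you are assembling a sitemap by hand rather than in Screaming Frog, the same filtering from Steps 2 and 3 can be approximated in a short script. The Python sketch below is an assumption-laden illustration, not Screaming Frog’s own logic: select_for_sitemap, the (url, status) input format and the wildcard patterns are all hypothetical.

```python
from fnmatch import fnmatchcase

def select_for_sitemap(crawl_results, exclude_patterns=()):
    """Keep URLs that returned status 200 and match no exclusion pattern.

    crawl_results is a list of (url, status_code) pairs.
    exclude_patterns uses shell-style wildcards, e.g. "*/cart/*".
    """
    kept = []
    for url, status in crawl_results:
        if status != 200:
            continue  # skip redirects, errors and missing pages
        if any(fnmatchcase(url, pattern) for pattern in exclude_patterns):
            continue  # skip URLs you chose to leave out
        kept.append(url)
    return kept

# Hypothetical crawl output on a placeholder domain
results = [
    ("https://www.example.com/", 200),
    ("https://www.example.com/old-page/", 301),
    ("https://www.example.com/cart/", 200),
    ("https://www.example.com/blog/post-1/", 200),
]
print(select_for_sitemap(results, exclude_patterns=["*/cart/*"]))
```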

The other sub-menus (Last Modified, Priority, Change Frequency, Images, Hreflang) give you the option to edit the <lastmod>, <priority> and <changefreq> tags, and to include images and hreflang annotations, to suit the needs of your website.

Example of Screaming Frog's Sitemap menu

Step 4 – Save Your Sitemap

Once you have finished making your edits, click the “Next” button and a Save menu will open. The default file type will be XML.
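Before uploading, it can be worth running a quick sanity check on the saved file. The Python sketch below is one possible check and not a full validator: the check_sitemap function is hypothetical, and it only tests a few rules from the sitemap protocol (well-formed XML, at most 50,000 URLs per file, absolute URLs, valid <changefreq> and <priority> values).

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly",
                    "monthly", "yearly", "never"}

def check_sitemap(xml_data):
    """Return a list of problems found in a sitemap (empty if none)."""
    try:
        root = ET.fromstring(xml_data)
    except ET.ParseError as exc:
        return [f"could not parse XML: {exc}"]
    problems = []
    urls = root.findall("sm:url", NS)
    if not urls:
        problems.append("no <url> entries found")
    if len(urls) > 50000:
        problems.append("more than 50,000 URLs in one file")
    for url in urls:
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        if not loc.startswith(("http://", "https://")):
            problems.append(f"missing or relative <loc>: {loc!r}")
        freq = url.findtext("sm:changefreq", namespaces=NS)
        if freq is not None and freq.strip() not in VALID_CHANGEFREQ:
            problems.append(f"{loc}: invalid <changefreq> {freq!r}")
        priority = url.findtext("sm:priority", namespaces=NS)
        if priority is not None:
            try:
                in_range = 0.0 <= float(priority) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                problems.append(f"{loc}: invalid <priority> {priority!r}")
    return problems
```

To use it, read the saved file in binary mode, for example `check_sitemap(open("sitemap.xml", "rb").read())`, and review anything it reports.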

Step 5 – Upload Your Sitemap

Now that you have your new sitemap, you’ll need to upload this to your website.

Unfortunately, we’re unable to advise on the best way to do this, as every website’s CMS will differ.

If you’re unsure how to do this, speak to your website’s developer or get in touch with our Website Development team — we may be able to help on an ad-hoc basis.

For more details on crawling and the sitemap settings of Screaming Frog, you can refer to its guide.

How to Submit Your Sitemap to Google’s Search Console

Now that you have a complete sitemap, you can submit it to Google’s Search Console.

To do this, log into the Search Console account for your website and navigate to the “Sitemap” page, found in the menu on the left under “Index”.

In the box titled “Add a new sitemap”, write in the URL of the sitemap you’ve just uploaded (minus your domain name).
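If you have the full sitemap URL to hand, the part to enter is simply the path after the domain. As a small illustration in Python (the sitemap_path function and the example.com address are placeholders):

```python
from urllib.parse import urlparse

def sitemap_path(sitemap_url):
    """Return the sitemap URL without the scheme and domain name."""
    return urlparse(sitemap_url).path.lstrip("/")

print(sitemap_path("https://www.example.com/sitemaps/post-sitemap.xml"))
```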

Google will then schedule a crawl of these URLs and show the progress in the “Status” column of the “Submitted sitemaps” box.

Screenshot of Google Search Console's Sitemap menu
