How To Create & Submit A Sitemap: The Definitive Guide

by Brian Gorman | Posted @ Dec 13 2017

XML sitemaps are a great way to ensure your site is crawled and indexed properly. Learn how to take control and build your own!

When it comes to creating an XML sitemap, a car analogy works best. Sure, automatic is great. It’s convenient and affords you an extra hand to turn up that Adele song you love to sing along to terribly. But any driving enthusiast will tell you that a manual shift gives you a closer connection to the vehicle and to the road, and that’s exactly what we’re after – more connection. More control.

These days, there are many options for automating the creation of XML sitemaps, whether through a plugin or an online sitemap generator. Some are better than others (the Yoast plugin for WordPress does a pretty good job), but the machines haven’t replaced us just yet. Automation still does not measure up to a sitemap carefully constructed by hand. So roll up your sleeves and follow these steps to create and submit custom XML sitemaps that represent your site better than any plugin or tool can.

 

Step 1: Know What You’re Looking For

An XML sitemap is essentially just a list of the pages that make up your website. But the key thing to remember is that we are only concerned with pages that should be in Google’s index. You don’t want to put a login page or a post-purchase “thank you” page in your sitemap, for instance. Before you set out to gather the URLs of the pages on a site, ask a simple question of each one:

“Is this a page that should be in Google’s index?”

 

If you’re a bit more versed in SEO, you can also ask:

“Does the page return a 200 status code?”

and

“Does the page self-canonicalize?”
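A page self-canonicalizes when the canonical tag in its <head> points to that page’s own URL. On a site’s “About” page, for example, the tag might look like this (the URL here is illustrative):

<link rel="canonical" href="https://yoursite.com/about-us/" />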

 

Doing this exercise will give meaning to everything we encounter in Step 2.

 

Step 2: Collect Your Pages

Now that we know exactly what we’re looking for, let’s go find it! In the first part of this step, we’re going to gather up all of the website’s URLs. The easiest way to do this is with a crawler like Screaming Frog, which can quickly crawl the pages of your site and spit out a list of URLs.

Alternatively, you can simply follow each of the site’s main navigation options down to its deepest level (also known as a human crawl). This is actually the method I prefer. If the site isn’t too big, it’s a great way to learn about the navigational logic and user-friendliness of your site.

Let’s use Go Fish Digital’s site as an example. Before I toss it into a crawler, I’m going to browse it manually and gain some insights. My first takeaway, as is often the case, is from the main navigation.

On the far left, we have a logo and branding, which links to the home page. You guessed it – the home page URL is going in the sitemap.

 

On the right, we have About, Services, Blog, and Contact.

 

Right away, I’m going to begin grouping. The About and Contact pages are more general pages, like the home page, so I consider those three URLs as a “General” section of the site.

General pages

https://gofishdigital.com/

https://gofishdigital.com/about-us/

https://gofishdigital.com/contact-us/

 

Next, we have Services and Blog.

 

Services has a drop-down menu – this is a perfect reason to group these pages together!

Service pages

https://gofishdigital.com/services/

https://gofishdigital.com/search-engine-optimization/

https://gofishdigital.com/online-reputation-management/

https://gofishdigital.com/website-design-and-development/

https://gofishdigital.com/content-marketing/

https://gofishdigital.com/search-engine-marketing/

https://gofishdigital.com/conversion-rate-optimization/

 

Then, the blog. I’ve only displayed 3 posts here, but there are a lot more blog posts on GFD’s site. This is where a crawler would come into play.

Blog posts

https://gofishdigital.com/blog/

https://gofishdigital.com/google-shows-us-context-is-king/

https://gofishdigital.com/mobile-geofences/

https://gofishdigital.com/google-searching-tv/

 

Would you look at that? We now have the site sectioned out nicely. With our URLs grouped together like this, we can make a beautifully organized sitemap!

In the last part of this step, we’re going to take out any pages that don’t hold up to the question(s) we asked in Step 1. I did find a privacy policy page in the footer, and I’ve decided not to include it. It’s not a keyword-focused page that is going to perform well in search. Never forget that you can include or exclude whatever pages you want when creating a sitemap!

https://gofishdigital.com/privacy-policy/

 

Step 3: Code Your URLs

If you’ve applied Step 2 carefully to your website’s pages, you now have a list of URLs that need to be formatted with the appropriate tags. XML is a lot like HTML – in fact, the “ML” in both stands for “markup language.”

For this step, you’ll need a text editor so you can create an XML file. I highly recommend Sublime Text. They offer a lifetime license key, and it will serve your SEO and text-editing future better than the finest hound.

a.) Let’s begin with the XML declaration, which states the XML version and character encoding, followed by the opening <urlset> tag:

<?xml version="1.0" encoding="UTF-8"?>

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">

 

b.) Next, add your first URL with the appropriate <url> and <loc> tags:

<url>

<loc>https://gofishdigital.com/</loc>

</url>

 

c.) When you’ve entered your last URL, simply close the <urlset> tag:

</urlset>

 

Now that you know the different tags, get your eyes used to looking at a simple XML sitemap. Here is what the finished product would look like:
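Assembled from the URLs we gathered in Step 2 (and abbreviated with comments where the pattern simply repeats), the finished file would look like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- General pages -->
  <url>
    <loc>https://gofishdigital.com/</loc>
  </url>
  <url>
    <loc>https://gofishdigital.com/about-us/</loc>
  </url>
  <url>
    <loc>https://gofishdigital.com/contact-us/</loc>
  </url>
  <!-- Service pages -->
  <url>
    <loc>https://gofishdigital.com/services/</loc>
  </url>
  <url>
    <loc>https://gofishdigital.com/search-engine-optimization/</loc>
  </url>
  <!-- ...and so on for the remaining service pages... -->
  <!-- Blog posts -->
  <url>
    <loc>https://gofishdigital.com/blog/</loc>
  </url>
  <url>
    <loc>https://gofishdigital.com/google-shows-us-context-is-king/</loc>
  </url>
  <!-- ...and so on for the remaining blog posts... -->
</urlset>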

 

Step 4: Validate Your Sitemap

Now it’s time to run your sitemap through a validator to make sure all the syntax is correct. Go ahead and save your file and name it sitemap.xml. Then, visit https://validator.w3.org/#validate_by_upload and upload your XML file. Hopefully, the validator reports that your document checks out with no errors or warnings.

If there are any errors, the validator will quote the line that contains the error so you can go back into Sublime Text and easily locate it.

 

Step 5: Add It To The Root

Next, you’ll want to add your sitemap file (sitemap.xml) to the root folder of your site. This can be done locally, through FTP, or (ideally) by a developer. Adding your sitemap file to the root folder means that it will be located at yoursite.com/sitemap.xml. This is true for a lot of sites! Try picking a couple of sites you regularly visit and typing “/sitemap.xml” after the TLD (the “.com,” “.net,” etc.).

ex: https://www.apple.com/sitemap.xml

 

Step 6: Add It To The Robots(.txt)

A robots.txt file is a simple text file with instructions for the crawler that is visiting your site. The file exists in the root folder, so you can probably guess where it’s located – yoursite.com/robots.txt. One of the lines you can add to your robots.txt file is the “Sitemap:” line. This will ensure that the crawler goes and checks out your perdy, custom XML sitemap. Here’s how the line would look, assuming your site is secure (HTTPS):

Sitemap: https://yoursite.com/sitemap.xml

 

Apple.com has a number of “Sitemap:” lines in their robots.txt file (https://www.apple.com/robots.txt):
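The exact entries in Apple’s file change over time, but each one follows the same simple pattern (this is an illustration, not a verbatim copy of the live file):

Sitemap: https://www.apple.com/sitemap.xml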

 

Whether a “Sitemap:” line in robots.txt is truly effective is somewhat debated, but the purpose of this guide is to be thorough, and it is still a best practice I see utilized by many top SEOs and successful websites.

 

Step 7: Submit Your Sitemap

We gathered, we grouped, we tagged, we validated, and we added to the root. Now we’ll discuss how to submit your sitemap to Google and Bing. Doing so can improve the indexation of your site! Please note that I’m assuming you have Google Search Console and Bing Webmaster Tools accounts set up.

How to submit a sitemap to Google

a.) Sign into your GSC account.

b.) Click Crawl > Sitemaps > Add/Test Sitemap

c.) Enter “/sitemap.xml” into the available field and submit your sitemap!

How to submit a sitemap to Bing

a.) Sign into your BWT account.

b.) Click Configure My Site > Sitemaps

c.) Enter the full URL of your sitemap and submit your sitemap!
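Both engines also accept a sitemap “ping”: requesting a URL like the one below (with your sitemap’s full URL swapped in) prompts Google to re-fetch your sitemap, and Bing offers the equivalent at bing.com/ping.

http://www.google.com/ping?sitemap=https://yoursite.com/sitemap.xml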

Check in periodically (but not obsessively) to ensure your sitemap URLs are being crawled. It is NOT uncommon for only part of your sitemap to be crawled. In fact, we rarely see a sitemap crawled in its entirety. That’s asking a lot, and the major search engines love to be coy.

 

(Bonus) Next-Level Sitemapping: Creating an Index

The whole point of a sitemap is to make the pages of your site as crawler-accessible as possible. To do this, we present them in a simple, organized list. If you want to take order to the next level, you’ll want to create a sitemap index.

A sitemap index is an XML file that refers to a number of individual XML sitemaps. For Go Fish Digital’s site, we could make an individual sitemap for each grouping we created in Step 2:

general_sitemap.xml

services_sitemap.xml

blog_sitemap.xml

 

We would add each of these files to the root folder of the site and point to them within a sitemap index, which uses its own XML tags:
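Here is a minimal sketch of that index, using the <sitemapindex> and <sitemap> tags defined by the sitemaps.org protocol:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://gofishdigital.com/general_sitemap.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://gofishdigital.com/services_sitemap.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://gofishdigital.com/blog_sitemap.xml</loc>
  </sitemap>
</sitemapindex>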

We would then name the sitemap index file, validate it, add it to the root folder, and submit it within the search engine consoles for Google and Bing – no need to submit each individual sitemap! The index will take care of everything. Additionally, you can add a “Sitemap:” line to your robots.txt file that points to the index, rather than pointing to each individual sitemap (looking at you, Apple).

A sitemap index with individual sitemaps represents the highest level of organization and is a superb way to present the indexable pages of your site to the major search engines.

 

Make Your Map(s)!

Whether you’re looking at your own site, a friend’s site, or a client’s site, you now have some great guidelines for creating a meaningful XML sitemap or sitemap index. So build your own custom sitemap and take charge of your SEO, learn more about your website, and cut the fat caused by automation.

Happy mapping!


13 Comments

  1. Klaus

    December 15th, 2017 at 3:18 pm

    Outstanding guide, Bill!

Now, on the sitemap-index tip: by which aspects do you like to split up the different files? Do you think that aspects like “contains images” and “has text content” would be suitable characteristics for further analyzing the indexation rate in GSC?

    Cheers

    Klaus


    • Brian Gorman

      December 15th, 2017 at 3:37 pm

      Hey Klaus!

      Thank you for commenting.

      For splitting things up, I follow the main navigation of the site as closely as I can. Creating sitemaps for the different categories or sections of a site will quickly reveal if the site’s navigation is good or if it needs work.

      You might begin this exercise and realize there is a much better way to organize the site. Then you’ll have two wins – a great sitemap index, and suggestions to make the site more intuitive for human users!

      I like to give images their own sitemap. If the number of images is approaching 50,000, I would then start to split them up into groups, just like the post mentions.

      Hope that’s helpful!

  2. neeraj pandey

    December 16th, 2017 at 6:28 am

One thing I sometimes get confused about: if one URL in a sitemap is updated and I set the last modified date for that URL, should I also update the last modified date for that sitemap in the index file?


    • Brian Gorman

      December 16th, 2017 at 2:19 pm

      Hello and thanks for commenting!

      If you update a URL, you could update your lastmod tag in the appropriate sitemap and in the index. That said, I have heard for a couple of years now that lastmod is largely ignored by Google.

  3. George

    December 16th, 2017 at 3:29 pm

    Great article Brian!

For websites with hundreds or thousands of pages, is it possible to perform the above process and keep up with future URL updates without going crazy?

Also, regarding images, how do you approach the creation of custom sitemaps?

    Thanks,
    George


    • Brian Gorman

      December 16th, 2017 at 3:52 pm

      Hey George –

      Thanks for stopping by!

For larger websites, you can definitely perform this process. What will happen, though, is that as the site size increases, so will your reliance on a crawler.

      I always crawl the site with Screaming Frog first, then I go about making my sitemap(s) manually. If I run into a portion of the site that is too large to click through, I refer to my crawl. A lot of times, the URLs you’re looking for will have a naming convention – for instance, all blog posts may be housed in a /blog/ folder, which makes gathering them out of Screaming Frog really easy.

      As far as future updates – if you know the pages on a particular sitemap are going to update regularly, that is where a proper developer comes in. They can connect a sitemap to your database so when something changes on the site, the sitemap updates. I’m just a lowly SEO, so I don’t currently have the expertise to do that – but I know enough to know it CAN be done, through PHP, for instance.

      I normally don’t put as much effort into image sitemaps unless the site relies on Google image search quite a bit (which my clients rarely do). Images get more of the “general” sitemap treatment – I just toss everything into 1 🙂

  4. Consulenza

    December 18th, 2017 at 4:43 am

I was finally able to understand some tricks about sitemaps. The best article I’ve read about them so far (especially the bonus). Thanks, Brian!


    • Brian Gorman

      December 18th, 2017 at 9:32 am

      Consulenza –

      I am thrilled to hear that the post was helpful to you! Thanks so much for taking the time to comment.

  5. Soumya Roy

    December 20th, 2017 at 9:46 am

Brian, why do I need to create sitemap files manually? We have plenty of online tools that create sitemap files automatically. We just have to upload them to the server, submit them in GSC, and additionally add a line to the robots file.
Isn’t that going to be easier?


    • Brian Gorman

      December 24th, 2017 at 10:55 am

      Thanks for commenting!

Yes, sitemap generators are always going to be easier. But with that convenience, you lose some things as well. Automated tools will not take the time to parse out unnecessary pages; they just crawl everything that’s there and put it on your sitemap. Also, I haven’t found any online tool that can make an index with custom, categorized sitemaps (not counting plugins here – looking at you, Yoast).

The real thing we miss when we use an automated sitemap creation tool is the chance to explore our own (or a client’s) site. To learn the site better than anyone else, to be the expert, to experience the navigation as visitors will, etc. Whenever I spend more time away from tools and on a site, I learn important lessons I would have otherwise missed.

      That’s just how I feel about the subject, and how I like to absorb a website and create my sitemaps. Whatever works best for you IS the best method, as long as you get the result you’re after!

  6. Etela

    December 26th, 2017 at 9:15 pm

    Thanks for the great post Brian. I love working hands-on because of the same reasons; better understanding of the site and navigation, the URL structure, and control.

I’m curious what your approach is to setting crawl priorities. How do you determine the priority level, and how important do you think it is? I haven’t really been using it, but I wonder if there is a good reason not to leave it out.
Do the search engines actually follow or care about this setting, in your experience?

    Thanks so much and I’m looking forward to your next article.


    • Brian Gorman

      December 28th, 2017 at 2:28 pm

      Hi there!!

      Google has said publicly that the priority and changefreq tags have been deprecated. In other words, they mostly ignore them. Yoast even removed one or both from their automatic sitemap generation. I’ve even heard of (and tested myself) putting more important URLs at the top, cascading down to lesser importance, but I have not observed any meaningful, measurable effect there.

      Because of these factors, I tend toward keeping my sitemaps very organized, ordered, and simple.

      Thanks for stopping by!

  7. sandeep

    April 08th, 2018 at 4:19 am

    Awesome information….Great

