XML Sitemap Validator

Validate your XML sitemap for syntax errors, broken URLs, and missing pages that could prevent proper Google indexing.

Sitemap Spec Cheatsheet

The rules this validator checks against, taken from sitemaps.org and Google's guidelines.

| Rule | Value | Notes |
| --- | --- | --- |
| URL limit per sitemap | 50,000 | Split into multiple sitemaps + index if over. |
| File size (uncompressed) | 50 MB | The limit applies to the uncompressed file. Gzip (.gz) saves bandwidth but does not raise the limit, so split the sitemap if it is over. |
| Encoding | UTF-8 | All content must be UTF-8 encoded. |
| loc format | Absolute URL | Must start with http:// or https://. Same domain as sitemap. |
| lastmod format | W3C datetime | YYYY, YYYY-MM, YYYY-MM-DD, or a full ISO-8601 date-time with timezone. |
| changefreq values | always, hourly, daily, weekly, monthly, yearly, never | Ignored by Google. Optional and safe to omit. |
| priority range | 0.0–1.0 | Ignored by Google. Optional and safe to omit. |
| Protocol | Match canonical | Do not mix http:// and https:// URLs in one sitemap. |
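
To make the cheatsheet concrete, here is a minimal Python sketch of how a few of these rules could be checked. It is not this validator's actual implementation. The function name check_sitemap, the regular expression, and the message wording are assumptions made for illustration, and the sketch only handles a plain urlset file (no index files, no size or encoding checks).

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_URLS = 50_000
CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}
# W3C Datetime: YYYY, YYYY-MM, YYYY-MM-DD, or a full timestamp with a timezone.
LASTMOD_RE = re.compile(
    r"\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?"
)


def check_sitemap(xml_text: str, sitemap_url: str) -> list[str]:
    """Return human-readable problems found in a urlset sitemap (hypothetical helper)."""
    problems = []
    root = ET.fromstring(xml_text)  # raises ParseError on malformed XML
    site = urlparse(sitemap_url)
    urls = root.findall(f"{NS}url")

    if len(urls) > MAX_URLS:
        problems.append(f"{len(urls)} URLs exceeds the limit of {MAX_URLS}")

    for url in urls:
        loc = (url.findtext(f"{NS}loc") or "").strip()
        parsed = urlparse(loc)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"loc is not an absolute URL: {loc!r}")
        elif (parsed.scheme, parsed.netloc) != (site.scheme, site.netloc):
            problems.append(f"loc differs from the sitemap's protocol or host: {loc}")

        lastmod = url.findtext(f"{NS}lastmod")
        if lastmod is not None and not LASTMOD_RE.fullmatch(lastmod.strip()):
            problems.append(f"lastmod is not W3C datetime: {lastmod!r}")

        changefreq = url.findtext(f"{NS}changefreq")
        if changefreq is not None and changefreq.strip() not in CHANGEFREQ:
            problems.append(f"unknown changefreq: {changefreq!r}")

        priority = url.findtext(f"{NS}priority")
        if priority is not None:
            try:
                in_range = 0.0 <= float(priority) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                problems.append(f"priority outside 0.0-1.0: {priority!r}")

    return problems
```

An empty list means none of the checked rules were violated. It does not mean the URLs themselves resolve or are indexable.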

What Is an XML Sitemap?

An XML sitemap is a machine-readable list of the URLs on your site that you want search engines to discover and index. It uses the sitemaps.org XML protocol (adopted by Google, Bing, and Yandex) and is served at a stable URL, most commonly /sitemap.xml. For large sites, a sitemap index file can reference up to 50,000 child sitemaps, each of which can list up to 50,000 URLs.
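
As a rough illustration of the split described above, the following Python sketch chunks a URL list into groups of at most 50,000 and builds a sitemap index that points at the child files. The helper names and the child file naming scheme (sitemap-1.xml, sitemap-2.xml, ...) are hypothetical. Adapt them to however your site actually serves its sitemaps.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit from the sitemaps.org protocol


def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_urlset(urls) -> bytes:
    """Serialize one child sitemap as UTF-8 XML."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for u in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = u
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_index(child_sitemap_urls) -> bytes:
    """Serialize a sitemap index that references each child sitemap."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for child in child_sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = child
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

With 120,001 URLs, for example, chunk_urls yields three chunks, so the index would reference three child sitemaps.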

A well-formed sitemap does two things at once: it surfaces pages that your internal linking would otherwise bury, and it gives search engines an accurate lastmod hint so they know which pages are worth re-crawling. A malformed sitemap does neither. Worse, a sitemap that lists broken, noindexed, or redirecting URLs wastes your crawl budget. This validator surfaces the kinds of issues that Google logs in Search Console without alerting you, so you can fix them before your next index refresh.

Keep lastmod accurate

Only update lastmod when the content meaningfully changed. Lying here trains Google to ignore the signal across your entire site.
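
One way to keep lastmod honest is to change it only when a fingerprint of the page content changes. The Python sketch below assumes you store a small record per URL. The record shape and the whitespace-only normalization are illustrative choices, and a real site would likely fingerprint just the main content so that template or footer tweaks do not count as a change.

```python
import hashlib
from datetime import date
from typing import Optional


def content_fingerprint(html: str) -> str:
    """Hash the content after collapsing whitespace (a crude proxy for 'meaningful')."""
    normalized = " ".join(html.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def next_lastmod(previous: Optional[dict], html: str, today: date) -> dict:
    """Return the record to store: keep the old lastmod unless the content changed."""
    fingerprint = content_fingerprint(html)
    if previous is not None and previous["fingerprint"] == fingerprint:
        return previous  # unchanged content keeps its original lastmod
    return {"fingerprint": fingerprint, "lastmod": today.isoformat()}
```

Re-publishing a page with only whitespace differences leaves its lastmod untouched, while an edited paragraph moves it to the current date.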

List only indexable URLs

Your sitemap should never include noindex URLs, URLs whose canonical tag points to a different URL, or URLs that redirect. If a URL shouldn't be in Google's index, it shouldn't be in your sitemap either.
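
The filter can be sketched in Python as a simple predicate over crawl results. The Page record below is a hypothetical shape for data you have already collected (HTTP status, robots meta content, canonical link), not the output of any particular crawler.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical crawl record for one URL."""
    url: str
    status: int
    robots_meta: str = ""            # content of <meta name="robots">, if any
    canonical: Optional[str] = None  # href of <link rel="canonical">, if any


def is_indexable(page: Page) -> bool:
    """True only for 200 pages that are not noindexed and are their own canonical."""
    if page.status != 200:  # drops redirects (3xx) and errors (4xx/5xx)
        return False
    directives = {d.strip().lower() for d in page.robots_meta.split(",")}
    if "noindex" in directives or "none" in directives:
        return False
    if page.canonical and page.canonical != page.url:
        return False
    return True


def sitemap_urls(pages) -> list[str]:
    """Keep only the URLs that belong in the sitemap."""
    return [p.url for p in pages if is_indexable(p)]
```

This compares canonical URLs as exact strings. If your site mixes trailing slashes or letter case, normalize both sides before comparing.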

Reference it from robots.txt

Add a 'Sitemap: https://yourdomain.com/sitemap.xml' line to robots.txt. It's the primary way crawlers discover your sitemap without manual submission.
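
To confirm the reference is actually picked up, you can parse robots.txt the way a crawler might. This short Python sketch uses the standard library's urllib.robotparser (its site_maps() method requires Python 3.8 or later). The wrapper name sitemaps_from_robots is an assumption for illustration.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return every URL declared on a 'Sitemap:' line in robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []
```

Pass in the text of your robots.txt file. An empty list means crawlers reading the file will not find a sitemap reference there.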

Submit to Search Console

Even with the robots.txt reference, submit the sitemap directly in Google Search Console. It unlocks the coverage and indexing reports that surface why specific URLs fail to index.
