After moving my blog from WordPress to a static site on Bitbucket.org, I found that the new site was barely being indexed by Google — only two pages were showing up regularly. The situation continued after I set up Google Analytics; I was now able to see traffic statistics for my blog, but indexing was still minimal.
It turns out that Google has a procedure for requesting inclusion in its search results, using its Webmaster Tools site:
- Site verification. There are a number of ways of proving that you own or control the site, such as adding a file or a meta tag containing a specified hash value. In my case, with Google Analytics already enabled for the site, nothing further was needed.
- At this point, Google can crawl the site systematically, but crawling and re-crawling can be made more efficient if the site has a `sitemap.xml` file. A sitemap contains one block per page, giving its link, its time of last modification, a hint at how often the page changes, and its priority relative to the site's other pages:

      <url>
        <loc>https://dpb.bitbucket.io/experimenting_with_analytic_trackers.html</loc>
        <lastmod>2014-05-12T05:10:00-00:00</lastmod>
        <changefreq>always</changefreq>
        <priority>1</priority>
      </url>
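For context, the `<url>` blocks in a sitemap sit inside a single `<urlset>` root element that declares the sitemaps.org namespace. The following is a minimal sketch, using only Python's standard library, of how such a file could be assembled by hand. The helper name `build_sitemap` and the layout of the entry dictionaries are illustrative assumptions, not part of Pelican or its plugin.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol, version 0.9.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML (as a string) for a list of entry dicts.

    Hypothetical helper: each entry needs 'loc'; 'lastmod',
    'changefreq' and 'priority' are optional.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit the child tags in the order used in the block above.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


example = build_sitemap([
    {
        "loc": "https://dpb.bitbucket.io/experimenting_with_analytic_trackers.html",
        "lastmod": "2014-05-12T05:10:00-00:00",
        "changefreq": "always",
        "priority": 1,
    },
])
print(example)
```

In practice there is no need to write this by hand, since a generator can produce the file from the site's own content.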
My site is currently built using Pelican, which supplies a sitemap-generator among its plugins. The sitemap was easy to set up:
- Clone the plugin repository from https://github.com/getpelican/pelican-plugins. I placed its contents in a directory at the same level as `pelicanconf.py`.
- To the file `pelicanconf.py` add

      PLUGIN_PATH = u'pelican-plugins'
      PLUGINS = [u'sitemap',]
      SITEMAP = {
          'format': 'xml',
          'priorities': {
              'articles': 1,
              'indexes': 0.5,
              'pages': 0.5,
          },
          'changefreqs': {
              'articles': 'always',
              'indexes': 'hourly',
              'pages': 'never'
          }
      }

  The documentation for `sitemap` currently specifies the line `PLUGINS=['pelican.plugins.sitemap',]`, but that leads to an error here (I am currently using Python 3.4).
- After running `make devserver`, the `sitemap.xml` file is generated. Once pushed to the server, it can be tested with a tool in Webmaster Tools and then officially added to the site's account there.
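Before pushing, the generated file can also be sanity-checked locally. This is a minimal sketch, assuming the file follows the sitemaps.org 0.9 schema; the function name `check_sitemap` and the sample data are illustrative, and this is not the check that Google's own tester performs.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Values permitted for <changefreq> by the sitemaps.org protocol.
VALID_CHANGEFREQS = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def check_sitemap(xml_text):
    """Return a list of problems found in sitemap XML text (hypothetical helper)."""
    problems = []
    root = ET.fromstring(xml_text)
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS)
        if not loc:
            problems.append("entry without <loc>")
        freq = url.findtext("sm:changefreq", namespaces=NS)
        if freq is not None and freq not in VALID_CHANGEFREQS:
            problems.append(f"{loc}: bad changefreq {freq!r}")
        prio = url.findtext("sm:priority", namespaces=NS)
        if prio is not None:
            try:
                in_range = 0.0 <= float(prio) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                problems.append(f"{loc}: priority {prio!r} not in 0.0-1.0")
    return problems


# Made-up sample: the first entry is valid, the second has two mistakes.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/a.html</loc>
    <changefreq>always</changefreq>
    <priority>1</priority>
  </url>
  <url>
    <loc>https://example.com/b.html</loc>
    <changefreq>sometimes</changefreq>
    <priority>2</priority>
  </url>
</urlset>"""

for problem in check_sitemap(sample):
    print(problem)
```

To check a real site, read the generated `sitemap.xml` from wherever your build writes its output and pass its text to the function.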
Webmaster Tools also includes some of the search traffic statistics that I had been using Google Analytics for.
We shall see if doing all this makes any change in how thoroughly Google indexes my site. The old WordPress site appeared prominently in search results.
[end]