Generate a search engine friendly sitemap.xml using a Gulp stream
Easily generate a search engine friendly sitemap.xml from your project.
Search engines love the sitemap.xml and it helps SEO as well.
For information about sitemap properties and structure, see the wiki for sitemaps
Install with npm
$ npm install --save-dev gulp-sitemap
var gulp = require('gulp');
var sitemap = require('gulp-sitemap');
gulp.task('sitemap', function () {
    // Return the stream so gulp knows when the task has finished
    return gulp.src('build/**/*.html', {
            read: false
        })
        .pipe(sitemap({
            siteUrl: 'http://www.amazon.com'
        }))
        .pipe(gulp.dest('./build'));
});
siteUrl is required.
index.html will be turned into the directory path /.
404.html will be skipped automatically. No need to unglob it.
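The two rules above can be illustrated with a small standalone sketch. The helper name `toLoc` and its regular expressions are assumptions made for this example; they are not taken from the plugin's source, and the plugin may handle edge cases (such as nested 404 pages) differently.

```javascript
// Hypothetical helper, not part of gulp-sitemap's API.
// Maps a file path relative to the site root to a sitemap location.
function toLoc(siteUrl, relativePath) {
    // Assumption: any file named 404.html is skipped
    if (/(^|\/)404\.html?$/.test(relativePath)) {
        return null;
    }
    // Turn "index.html" (at any depth) into its directory path
    var pagePath = relativePath.replace(/(^|\/)index\.html?$/, '$1');
    // Join base URL and page path with exactly one slash
    return siteUrl.replace(/\/+$/, '') + '/' + pagePath;
}

console.log(toLoc('http://www.amazon.com', 'index.html'));      // http://www.amazon.com/
console.log(toLoc('http://www.amazon.com', 'blog/index.html')); // http://www.amazon.com/blog/
console.log(toLoc('http://www.amazon.com', 'about.html'));      // http://www.amazon.com/about.html
console.log(toLoc('http://www.amazon.com', '404.html'));        // null
```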
Let's see an example of how we can create and output a sitemap, and then return to the original stream files:
var gulp = require('gulp');
var sitemap = require('gulp-sitemap');
var save = require('gulp-save');
gulp.task('html', function() {
    return gulp.src('*.html', {
            read: false
        })
        .pipe(save('before-sitemap'))
        .pipe(sitemap({
            siteUrl: 'http://www.amazon.com'
        })) // Returns sitemap.xml
        .pipe(gulp.dest('./dist'))
        .pipe(save.restore('before-sitemap')) // Restore all files to the state when we cached them
        // -> continue stream with original html files
        // ...
});
Your website's base URL. It is prepended to every document location.
Type: string
Required: true
Determines the output filename for the sitemap.
Type: string
Default: sitemap.xml
Required: false
Fills the <changefreq> tag in the sitemap. Not added by default.
Type: string
Default: undefined
Valid Values: ['always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly', 'never']
Required: false
Note: any falsy value is also valid and skips this XML tag.
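As a rough illustration of the rule above, the following sketch checks a value before it is passed to the plugin. The function `checkChangefreq` is hypothetical and is not provided by gulp-sitemap; how the plugin itself reacts to an invalid value is not covered here.

```javascript
// Values listed as valid for <changefreq>
var VALID_CHANGEFREQ = ['always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly', 'never'];

// Hypothetical pre-flight check for a changefreq setting.
function checkChangefreq(value) {
    // Falsy values are allowed: the tag is simply left out
    if (!value) {
        return null;
    }
    if (VALID_CHANGEFREQ.indexOf(value) === -1) {
        throw new Error('Invalid changefreq: ' + value);
    }
    return value;
}

console.log(checkChangefreq('daily'));   // daily
console.log(checkChangefreq(undefined)); // null
```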
Fills the <priority> tag in the sitemap. Not added by default.
Type: string|function
Default: undefined
Valid Values: 0.0 to 1.0
Required: false
Note: any falsy value other than 0 is also valid and skips this XML tag.
Example using a function as priority:
priority: function(siteUrl, loc, entry) {
    // Give pages inside root path (i.e. no slashes) a higher priority.
    // split() always returns at least one element, so "no slashes" means length 1.
    return loc.split('/').length === 1 ? 1 : 0.5;
}
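To make the behaviour of such a function concrete, here is a minimal sketch that calls a priority function directly. It assumes `loc` arrives as a site-relative path such as `about.html`; if the plugin passes a full URL instead, the slash test would need to be applied to the path portion only.

```javascript
// Sketch of a priority callback, assuming loc is a site-relative path.
function priority(siteUrl, loc, entry) {
    // A path without slashes sits directly in the root
    return loc.split('/').length === 1 ? 1 : 0.5;
}

console.log(priority('http://www.amazon.com', 'about.html', {}));     // 1
console.log(priority('http://www.amazon.com', 'blog/post.html', {})); // 0.5
```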
The file's last modified time.
If null, the plugin tries to get the last modified time from the stream's vinyl file, or falls back to Date.now() as lastmod.
If the value is not null, it is used as lastmod.
If lastmod is a function, it is executed with the current file as its parameter. (Note: the function is expected to be synchronous.)
Type: string|datetime|function
Default: null
Required: false
Example that uses git to get lastmod from the latest commit of a file:
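// Requires: var execSync = require('child_process').execSync;
lastmod: function(file) {
    // Ask git for the ISO 8601 commit date of the file's latest commit
    var cmd = 'git log -1 --format=%cI "' + file.relative + '"';
    return execSync(cmd, {
        cwd: file.base
    }).toString().trim();
}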
Note: any falsy value other than null is also valid and skips this XML tag.
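For projects without git history, a lastmod function can read the modification time from the file itself. The sketch below assumes the file is a vinyl-like object exposing `stat.mtime`; the name `lastmodFromStat` and the fallback to the current time are choices made for this example, not plugin behaviour.

```javascript
// Hypothetical lastmod callback: use the file's mtime when available.
function lastmodFromStat(file) {
    if (file.stat && file.stat.mtime instanceof Date) {
        return file.stat.mtime.toISOString();
    }
    // Assumption: fall back to "now" when no stat information exists
    return new Date().toISOString();
}

// Stand-in for a vinyl file, for demonstration only
var fakeFile = {
    relative: 'about.html',
    stat: { mtime: new Date('2020-01-02T03:04:05Z') }
};

console.log(lastmodFromStat(fakeFile)); // 2020-01-02T03:04:05.000Z
```

It would be passed as `lastmod: lastmodFromStat` in the plugin options.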
How lines are joined in the target sitemap file.
Type: string
Default: Your OS's newline, usually \n
Required: false
How the sitemap XML file should be indented. You can use \t for tabs, or a string of spaces.
Type: string
Default:
Required: false
Exclude pages from the sitemap when the robots meta tag is set to noindex. The plugin needs to be able to read the contents of the files for this to have an effect.
Type: boolean
Default: false
Required: false
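Because this check depends on file contents, it cannot work when the files are loaded with `read: false` as in the usage examples, since the vinyl files then carry no contents. The following sketch shows one way such a check could look; the function `hasNoindex` and its regular expressions are assumptions for illustration and may differ from what the plugin does internally.

```javascript
// Hypothetical check: does the HTML contain <meta name="robots" content="...noindex...">?
function hasNoindex(html) {
    // Collect every <meta ...> tag in the document
    var metaTags = html.match(/<meta\b[^>]*>/gi) || [];
    return metaTags.some(function (tag) {
        var isRobots = /\bname\s*=\s*["']?robots["']?/i.test(tag);
        var saysNoindex = /\bcontent\s*=\s*["'][^"']*\bnoindex\b/i.test(tag);
        return isRobots && saysNoindex;
    });
}

console.log(hasNoindex('<meta name="robots" content="noindex, nofollow">')); // true
console.log(hasNoindex('<meta name="robots" content="index">'));             // false
console.log(hasNoindex('<meta name="description" content="noindex">'));      // false
```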
To generate sitemap entries for the images on each page, set the images flag to true.
Type: boolean
Default: undefined
Required: false
Maps pages to their own custom configuration.
This should be an array of objects with the following structure:
Type: array
Default: []
Required: false
Example:
mappings: [{
        pages: [ 'minimatch pattern' ],
        changefreq: 'hourly',
        priority: 0.5,
        lastmod: Date.now(),
        getLoc(siteUrl, loc, entry) {
            // Removes the file extension if it exists
            return loc.replace(/\.\w+$/, '');
        },
        hreflang: [{
            lang: 'ru',
            getHref(siteUrl, file, lang, loc) {
                return 'http://www.amazon.ru/' + file;
            }
        }]
    },
    //....
]
Attributes that can be set for matching pages: hreflang, changefreq, priority, loc and lastmod.
Type: array
Required: true
This is an array with minimatch patterns to match the relevant pages to override. Every file will be matched against the supplied patterns.
Uses multimatch to match patterns against filenames.
Example:
pages: ['home/index.html', 'home/see-*.html', '!home/see-admin.html']
Matching pages can get their hreflang tags set using this option.
The input is an array like so:
hreflang: [{
    lang: 'ru',
    getHref: function(siteUrl, file, lang, loc) {
        // return href src for the hreflang. For example:
        return 'http://www.amazon.ru/' + file;
    }
}]
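To show what the getHref callback produces, the sketch below runs an hreflang array against a sample file and renders each result as an alternate-language link element, in the form commonly used for hreflang annotations in sitemaps. The helper `alternateLinks` is hypothetical, and the exact markup the plugin writes is an assumption here.

```javascript
var hreflang = [{
    lang: 'ru',
    getHref: function (siteUrl, file, lang, loc) {
        return 'http://www.amazon.ru/' + file;
    }
}];

// Hypothetical renderer: one <xhtml:link> per hreflang entry.
function alternateLinks(siteUrl, file, loc, hreflangItems) {
    return hreflangItems.map(function (item) {
        var href = item.getHref(siteUrl, file, item.lang, loc);
        return '<xhtml:link rel="alternate" hreflang="' + item.lang + '" href="' + href + '" />';
    });
}

var links = alternateLinks(
    'http://www.amazon.com',
    'about.html',
    'http://www.amazon.com/about.html',
    hreflang
);
console.log(links[0]);
// <xhtml:link rel="alternate" hreflang="ru" href="http://www.amazon.ru/about.html" />
```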
Matching pages can get their loc tag modified by using a function.
getLoc: function(siteUrl, loc, entry) {
    return loc.replace(/\.\w+$/, '');
}
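One detail of the pattern `/\.\w+$/` is worth checking before relying on it: it removes whatever dot-suffix ends the string, which includes a top-level domain when the string is a bare site URL. The sketch below compares it with a narrower variant; `stripHtmlExtension` is a hypothetical alternative, and whether `loc` is ever passed without a trailing path is an assumption to verify against your own setup.

```javascript
// The pattern from the example above: strips any trailing ".something"
function stripExtension(siteUrl, loc, entry) {
    return loc.replace(/\.\w+$/, '');
}

// Hypothetical narrower variant: only strips .html or .htm
function stripHtmlExtension(siteUrl, loc, entry) {
    return loc.replace(/\.html?$/, '');
}

var site = 'http://www.amazon.com';

console.log(stripExtension(site, site + '/about.html'));     // http://www.amazon.com/about
console.log(stripExtension(site, site));                     // http://www.amazon
console.log(stripHtmlExtension(site, site + '/about.html')); // http://www.amazon.com/about
console.log(stripHtmlExtension(site, site));                 // http://www.amazon.com
```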
Type: boolean
Required: false
Default: false
If true, logs the number of files that were handled.
Thanks to grunt-sitemap for the inspiration in writing this.
MIT © Gilad Peleg