Date: 2022-01-03
AssetGraph is an extensible, node.js-based framework for manipulating and optimizing web pages and web applications. At its core is a dependency graph model of your entire website, where all assets are treated as first-class citizens. It can automatically discover assets based on your declarative code, reducing the need for configuration to a minimum.
If you just want to get started with the basics, read Peter Müller - Getting started with Assetgraph.
If you are looking for a prepackaged build system take a look at Assetgraph-builder.
All web build tools, even those that target very specific problems, have to get a bunch of boring stuff right just to get started, such as loading files from disc, parsing and serializing them, charsets, inlining, finding references to other files, resolution of and updating urls, etc.
The observation that inspired the project is that most of these tasks can be viewed as graph problems, where the nodes are the assets (HTML, CSS, images, JavaScript...) and the edges are the relations between them, e.g. anchor tags, image tags, favorite icons, css background-image properties and so on.
An AssetGraph object is a collection of assets (nodes) and the relations (edges) between them. It's a basic data model that allows you to populate, query, and manipulate the graph at a high level of abstraction. For instance, if you change the url of an asset, all relations pointing at it are automatically updated.
Additionally, each individual asset can be inspected and massaged using a relevant API: jsdom for HTML, PostCSS for CSS, and an Esprima AST for JavaScript.
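To make the graph model more concrete, here is a minimal sketch of why changing an asset's url can update every relation pointing at it. The ToyAsset, ToyRelation and ToyGraph classes are hypothetical and only illustrate the idea; they are not AssetGraph's actual implementation, which also rewrites the reference inside the source asset.

```javascript
// Toy model only: these classes are hypothetical, not AssetGraph internals.
class ToyAsset {
  constructor(type, url) {
    this.type = type;
    this.url = url;
  }
}

class ToyRelation {
  constructor(type, from, to) {
    this.type = type;
    this.from = from; // reference to the source asset object
    this.to = to; // reference to the target asset object
  }

  // The href is derived from the target on demand, so it cannot go stale.
  get href() {
    return this.to.url;
  }
}

class ToyGraph {
  constructor() {
    this.assets = [];
    this.relations = [];
  }

  addAsset(asset) {
    this.assets.push(asset);
    return asset;
  }

  addRelation(relation) {
    this.relations.push(relation);
    return relation;
  }

  incomingRelations(asset) {
    return this.relations.filter((relation) => relation.to === asset);
  }
}

const toyGraph = new ToyGraph();
const page = toyGraph.addAsset(new ToyAsset('Html', 'file:///site/index.html'));
const logo = toyGraph.addAsset(new ToyAsset('Png', 'file:///site/logo.png'));
toyGraph.addRelation(new ToyRelation('HtmlImage', page, logo));
toyGraph.addRelation(new ToyRelation('HtmlAnchor', page, logo));

// Move the image: both relations now report the new url.
logo.url = 'file:///site/static/logo.png';
const logoHrefs = toyGraph.incomingRelations(logo).map((relation) => relation.href);
console.log(logoHrefs);
```

Because the relations hold references to asset objects rather than copies of their urls, a single assignment is enough to keep the whole toy graph consistent.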
AssetGraph represents inline assets the same way as non-inline ones, so e.g. inline scripts, stylesheets, and images specified as data: urls are also first-class nodes in the graph. This means that you don't need to dig into the HTML of the containing asset to manipulate them. An extreme example would be an Html asset with a conditional comment with an inline stylesheet with an inline image, which are modelled as 4 separate assets:
<!DOCTYPE html>
<html>
<head>
<!--[if !IE]> -->
<style type="text/css">
body {
background-image: url();
}
</style>
<!-- <![endif]-->
</head>
<body></body>
</html>
These are some of the supported assets and associated relation types:
HTML: <a>, <link rel="stylesheet|shortcut icon|fluid-icon|alternate|serviceworker">, <script>, <style>, <html manifest="...">, <img>, <video>, <audio>, <applet>, <embed>, <esi:include>, <iframe>, <svg>, <meta property="og:...">
SVG: <style>, inline style=... attributes, event handlers, <?xml-stylesheet href=...>, <font-face-src>
CSS: //# sourceMappingURL=..., background-image: url(...), @import url(...), behavior: url(...), filter: AlphaImageLoader(src='...'), @font-face { src: url(...) }
JavaScript: //# sourceMappingURL=..., homegrown 'foo/bar.png'.toString('url') syntax for referencing external files
Web manifest: Icon urls, related_applications, start_url, etc.
Cache manifest: Entries in the CACHE, NETWORK and FALLBACK sections
Other asset types: (none)
<script> and
<link rel='stylesheet'> tags in your
HTML.
-ag-sprite prefix.
Make sure you have node.js and npm installed, then run:
$ npm install assetgraph
AssetGraph supports a flexible syntax for finding assets and relations in a populated graph using the findAssets and findRelations methods. Both methods take a query object as the first argument.
The query engine uses MongoDB-like queries via the sift module. Please consult the sift documentation to learn about the advanced querying features. Below are some basic examples.
Get an array containing all assets in the graph:
var allAssets = assetGraph.findAssets();
Find assets by type:
var htmlAssets = assetGraph.findAssets({ type: 'Html' });
Find assets of different named types:
var jsAndCss = assetGraph.findAssets({ type: { $in: ['Css', 'JavaScript'] } });
Find assets by matching a regular expression against the url:
var localImageAssets = assetGraph.findAssets({
url: { $regex: /^file:.*\.(?:png|gif|jpg)$/ },
});
Find assets by predicate function:
var orphanedJavaScriptAssets = assetGraph.findAssets(function (asset) {
return (
asset.type === 'JavaScript' &&
assetGraph.findRelations({ to: asset }).length === 0
);
});
Find all HtmlScript (<script src=...> and inline <script>) relations:
var allHtmlScriptRelations = assetGraph.findRelations({ type: 'HtmlScript' });
Query objects have "and" semantics, so all conditions must be met for a multi-criteria query to match:
var textBasedAssetsOnGoogleCom = assetGraph.findAssets({
isText: true,
url: { $regex: /^https?:\/\/(?:www\.)?google\.com\// },
});
Find assets by existence of incoming relations:
var importedCssAssets = assetGraph.findAssets({
type: 'Css',
incomingRelations: { $elemMatch: { type: 'CssImport' } },
});
Relation queries can contain nested asset queries when querying the to and from properties.
Find all HtmlAnchor (<a href=...>) relations pointing at local images:
assetGraph.findRelations({
type: 'HtmlAnchor',
to: { isImage: true, url: { $regex: /^file:/ } },
});
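The query semantics used in the examples above ("and" across keys, $in, $regex, nested queries on to and from) can be illustrated with a tiny matcher. This is a hypothetical sketch covering only those few features; it is not the sift module, and the function names matchesQuery and matchesCondition are made up for this example.

```javascript
// Hypothetical helpers: a tiny subset of MongoDB-style matching, not sift.
function matchesCondition(value, condition) {
  if (condition instanceof RegExp) {
    return typeof value === 'string' && condition.test(value);
  }
  if (condition !== null && typeof condition === 'object') {
    const keys = Object.keys(condition);
    const isOperatorObject =
      keys.length > 0 && keys.every((key) => key.startsWith('$'));
    if (isOperatorObject) {
      return keys.every((operator) => {
        if (operator === '$in') {
          return condition.$in.includes(value);
        }
        if (operator === '$regex') {
          return typeof value === 'string' && condition.$regex.test(value);
        }
        throw new Error(`Unsupported operator in this sketch: ${operator}`);
      });
    }
    // Plain object: treat it as a nested query against the value.
    return value !== null && typeof value === 'object' && matchesQuery(value, condition);
  }
  return value === condition;
}

function matchesQuery(obj, query) {
  // "And" semantics: every key in the query has to match.
  return Object.keys(query).every((key) => matchesCondition(obj[key], query[key]));
}

const sampleAssets = [
  { type: 'Html', isText: true, url: 'file:///site/index.html' },
  { type: 'Css', isText: true, url: 'file:///site/main.css' },
  { type: 'Png', isText: false, url: 'file:///site/logo.png' },
  { type: 'JavaScript', isText: true, url: 'http://example.com/app.js' },
];

const sampleJsAndCss = sampleAssets.filter((asset) =>
  matchesQuery(asset, { type: { $in: ['Css', 'JavaScript'] } })
);
const sampleLocalText = sampleAssets.filter((asset) =>
  matchesQuery(asset, { isText: true, url: { $regex: /^file:/ } })
);
// A relation query with a nested asset query on the "to" property.
const anchorToLocalImage = matchesQuery(
  { type: 'HtmlAnchor', to: sampleAssets[2] },
  { type: 'HtmlAnchor', to: { type: 'Png', url: { $regex: /^file:/ } } }
);
console.log(sampleJsAndCss.length, sampleLocalText.length, anchorToLocalImage);
```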
AssetGraph comes with a collection of premade "transforms" that you can use as high level building blocks when putting together your build procedure. Most transforms work on a set of assets or relations and usually accept a query object so they can be scoped to work on only a specific subset of the graph.
Usually you'll start by loading some initial assets from disc or via http using the loadAssets transform, then get the related assets added using the populate transform, then do the actual processing. Eventually you'll probably write the resulting assets back to disc.
Thus the skeleton looks something like this:
const AssetGraph = require('assetgraph');
const assetGraph = new AssetGraph({ root: '/the/root/directory/' });
await assetGraph.loadAssets('*.html'); // Load all Html assets in the root dir
await assetGraph.populate({ followRelations: { type: 'HtmlAnchor' } }); // Follow <a href=...>
// More work...
await assetGraph.writeAssetsToDisc({ type: 'Html' }); // Overwrite existing files
// Done!
In the following sections the built-in transforms are documented individually:
Add a CacheManifest asset to each Html asset in the graph (or to all Html assets matched by queryObj if provided). The cache manifests will contain relations to all assets reachable by traversing the graph through relations other than HtmlAnchor.
Bundle the Css and JavaScript assets pointed to by the relations matched by queryObj.
The strategyName (string) parameter can be either:
oneBundlePerIncludingAsset (the default)
Each unique asset pointing to one or more of the assets being bundled will get its own bundle. This can lead to duplication if e.g. several Html assets point to the same sets of assets, but guarantees that the number of http requests is kept low.
sharedBundles
Create as many bundles as needed, optimizing for combined byte size of the bundles rather than http requests. Warning: Not as well tested as oneBundlePerIncludingAsset.
Note that a conditional comment within an Html asset conveniently counts as a separate including asset, so in the below example ie.css and all.css won't be bundled together:
<!--[if IE]><link rel="stylesheet" href="ie.css" /><![endif]-->
<link rel="stylesheet" href="all.css" />
The created bundles will be placed at the root of the asset graph with names derived from their unique id (for example file://root/of/graph/124.css) and will replace the original assets.
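The trade-off of the oneBundlePerIncludingAsset strategy can be seen in a small grouping sketch. The function below is hypothetical and works on plain { from, to } url pairs; it only illustrates the grouping rule described above, not how AssetGraph builds or names the bundles.

```javascript
// Hypothetical sketch: group bundled assets by the asset that includes them.
function groupByIncludingAsset(relations) {
  const bundles = new Map();
  for (const { from, to } of relations) {
    if (!bundles.has(from)) {
      bundles.set(from, []);
    }
    bundles.get(from).push(to);
  }
  return bundles;
}

// Two pages that include the same two stylesheets.
const styleRelations = [
  { from: 'index.html', to: 'a.css' },
  { from: 'index.html', to: 'b.css' },
  { from: 'about.html', to: 'a.css' },
  { from: 'about.html', to: 'b.css' },
];

const bundlesByPage = groupByIncludingAsset(styleRelations);
for (const [includingAsset, members] of bundlesByPage) {
  console.log(`${includingAsset}: one bundle containing ${members.join(', ')}`);
}
```

Each page ends up with a single bundle, so it needs only one request, but the contents of a.css and b.css are duplicated across the two bundles.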
Compresses all JavaScript assets in the graph (or those specified by queryObj).
The compressorName (string) parameter can be either:
uglifyJs (the default and the fastest)
The excellent UglifyJS compressor. If provided, the compressorOptions object will be passed to UglifyJS' ast_squeeze command.
yuicompressor
Yahoo's YUICompressor through Tim-Smart's node-yuicompressor module. If provided, the compressorOptions object will be passed as the second argument to require('yui-compressor').compile.
closurecompiler
Google's Closure Compiler through Tim-Smart's node-closure module. If provided, the compressorOptions object will be passed as the second argument to require('closure-compiler').compile.
Finds all Html assets in the graph (or those specified by queryObj), finds all CssImport relations (@import url(...)) in inline and external CSS and converts them to HtmlStyle relations directly from the Html document.
Effectively the inverse of assetGraph.convertHtmlStylesToInlineCssImports.
Example:
<style type="text/css">
@import url(print.css) print;
@import url(foo.css);
body {
color: red;
}
</style>
is turned into:
<link rel="stylesheet" href="print.css" media="print" />
<link rel="stylesheet" href="foo.css" />
<style type="text/css">
body {
color: red;
}
</style>
Finds all Html assets in the graph (or those specified by queryObj), finds all outgoing, non-inline HtmlStyle relations (<link rel='stylesheet' href='...'>) and turns them into groups of CssImport relations (@import url(...)) in inline stylesheets. A maximum of 31 CssImports will be created per inline stylesheet.
Example:
<link rel="stylesheet" href="foo.css" />
<link rel="stylesheet" href="bar.css" />
is turned into:
<style type="text/css">
@import url(foo.css);
@import url(bar.css);
</style>
This is a workaround for the limit of 31 stylesheets in Internet Explorer <= 8. This transform allows you to have up to 31*31 stylesheets in the development version of your HTML and still have it work in older Internet Explorer versions.
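The grouping rule (at most 31 @import rules per inline stylesheet) amounts to chunking the list of stylesheet urls. The following is a minimal sketch under that reading; the function name toInlineImportStylesheets is hypothetical, and media attributes are ignored for brevity.

```javascript
// Limit described above: at most 31 CssImports per inline stylesheet.
const MAX_IMPORTS_PER_STYLESHEET = 31;

// Hypothetical helper: turn a list of stylesheet hrefs into <style> elements.
function toInlineImportStylesheets(stylesheetHrefs, maxPerSheet = MAX_IMPORTS_PER_STYLESHEET) {
  const styleElements = [];
  for (let i = 0; i < stylesheetHrefs.length; i += maxPerSheet) {
    const importRules = stylesheetHrefs
      .slice(i, i + maxPerSheet)
      .map((href) => `@import url(${href});`);
    styleElements.push(['<style type="text/css">', ...importRules, '</style>'].join('\n'));
  }
  return styleElements;
}

// 70 stylesheets need three inline stylesheets: 31 + 31 + 8 imports.
const manyHrefs = Array.from({ length: 70 }, (_, i) => `sheet${i + 1}.css`);
const inlineStyleElements = toInlineImportStylesheets(manyHrefs);
console.log(inlineStyleElements.length);
```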
Uses the Graphviz dot command to render the current contents of the graph and writes the result to fileName. The image format is automatically derived from the extension and can be any of the output formats supported by Graphviz. Using .svg is recommended.
Requires Graphviz to be installed, e.g. via sudo apt-get install graphviz on Debian/Ubuntu.
Experimental: For each JavaScript asset in the graph (or those matched by queryObj), find all reachable JavaScript assets and execute them in order.
If the context parameter is specified, it will be used as the execution context. Otherwise a new context will be created using vm.createContext.
Finds all inline relations in the graph (or those matched by
queryObj) and makes them external. The file names will be derived
from the unique ids of the assets.
For example:
<script>
foo = 'bar';
</script>
<style type="text/css">
body {
color: maroon;
}
</style>
could be turned into:
<script src="4.js"></script>
<link rel="stylesheet" href="5.css" />
Finds all Html assets in the graph (or those matched by queryObj), finds all directly reachable Css assets, and converts the outgoing CssImage relations (background-image etc.) to data: urls, subject to these criteria:
If options.sizeThreshold is specified, images with a greater byte size won't be inlined.
To avoid duplication, images referenced by more than one CssImage relation won't be inlined.
A CssImage relation pointing at an image with an inline GET parameter will always be inlined (e.g. background-image: url(foo.png?inline);). This takes precedence over the first two criteria.
If options.minimumIeVersion is specified, the data: url length limitations of that version of Internet Explorer will be honored.
If any image is inlined an Internet Explorer-only version of the stylesheet will be created and referenced from the Html asset in a conditional comment.
For example:
await assetGraph.inlineCssImagesWithLegacyFallback(
{ type: 'Html' },
{ minimumIeVersion: 7, sizeThreshold: 4096 }
);
where assetGraph contains an Html asset with this fragment:
<link rel="stylesheet" href="foo.css" />
and foo.css contains:
body {
background-image: url(small.gif);
}
will be turned into:
<!--[if IE]><link rel="stylesheet" href="foo.css" /><![endif]-->
<!--[if !IE]>--><link rel="stylesheet" href="1234.css" /><!--<![endif]-->
where 1234.css is a copy of the original foo.css with the images inlined as data: urls:
body {
background-image: url();
}
The file name 1234.css is just an example. The actual asset file name will be derived from the unique id of the copy and be placed at the root of the assetgraph.
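The first three inlining criteria of this transform can be read as a small decision function. The sketch below is hypothetical: the shouldInlineCssImage name and the byteSize and incomingCssImageCount fields are assumptions made for the example, and the Internet Explorer length limits behind options.minimumIeVersion are deliberately left out.

```javascript
// Hypothetical predicate; only options.sizeThreshold comes from the text above.
function shouldInlineCssImage(image, options = {}) {
  // An explicit "inline" GET parameter overrides the size and duplication checks.
  const queryString = image.url.split('#')[0].split('?')[1] || '';
  const hasInlineParam = queryString
    .split('&')
    .some((param) => param.split('=')[0] === 'inline');
  if (hasInlineParam) {
    return true;
  }
  // Images larger than the threshold are left external.
  if (options.sizeThreshold !== undefined && image.byteSize > options.sizeThreshold) {
    return false;
  }
  // Images referenced by more than one CssImage relation would be duplicated.
  if (image.incomingCssImageCount > 1) {
    return false;
  }
  return true;
}

const inlineOptions = { sizeThreshold: 4096 };
const inlineDecisions = [
  { url: 'small.gif', byteSize: 1000, incomingCssImageCount: 1 },
  { url: 'big.png', byteSize: 10000, incomingCssImageCount: 1 },
  { url: 'big.png?inline', byteSize: 10000, incomingCssImageCount: 1 },
  { url: 'shared.gif', byteSize: 100, incomingCssImageCount: 2 },
].map((image) => shouldInlineCssImage(image, inlineOptions));
console.log(inlineDecisions);
```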
Inlines all relations in the graph (or those matched by queryObj). Only works on relation types that support inlining, for example HtmlScript, HtmlStyle, and CssImage.
Example:
await assetGraph.inlineRelations({ type: { $in: ['HtmlStyle', 'CssImage'] } });
where assetGraph contains an Html asset with this fragment:
<link rel="stylesheet" href="foo.css" />
and foo.css contains:
body {
background-image: url(small.gif);
}
will be turned into:
<style type="text/css">
body {
background-image: url();
}
</style>
Note that foo.css and the CssImage will still be modelled as separate assets after being inlined, so they can be manipulated the same way as when they were external.
Add new assets to the graph and make sure they are loaded, returning a promise that fulfills with an array of the assets that were added. Several syntaxes are supported, for example:
const [aHtml, bCss] = await assetGraph.loadAssets('a.html', 'b.css'); // Relative to assetGraph.root
await assetGraph.loadAssets({
url: 'http://example.com/index.html',
text: 'var foo = bar;', // The source is specified, won't be loaded
});
file:// urls support wildcard expansion:
await assetGraph.loadAssets('file:///foo/bar/*.html'); // Wildcard expansion
await assetGraph.loadAssets('*.html'); // assetGraph.root must be file://...
Compute the MD5 sum of every asset in the graph (or those specified by queryObj) and remove duplicates. The relations pointing at the removed assets are updated to point at the copy that is kept.
For example:
await assetGraph.mergeIdenticalAssets({ type: { $in: ['Png', 'Css'] } });
where assetGraph contains an Html asset with this fragment:
<head>
<style type="text/css">
body {
background-image: url(foo.png);
}
</style>
</head>
<body>
<img src="bar.png" />
</body>
will be turned into the following if foo.png and bar.png are identical:
<head>
<style type="text/css">
body {
background-image: url(foo.png);
}
</style>
</head>
<body>
<img src="foo.png" />
</body>
and the bar.png asset will be removed from the graph.
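The deduplication step can be sketched with Node's built-in crypto module. This is a toy version under stated assumptions: assets are plain { url, bytes } objects and relations are { from, to } pairs, both invented for this example, and the first asset seen with a given MD5 sum is the one that is kept.

```javascript
const crypto = require('crypto');

function md5Hex(buffer) {
  return crypto.createHash('md5').update(buffer).digest('hex');
}

// Toy version: returns the assets that survive and repoints relations in place.
function mergeIdenticalSketch(assets, relations) {
  const keptByHash = new Map();
  const keptAssets = [];
  for (const asset of assets) {
    const hash = md5Hex(asset.bytes);
    const kept = keptByHash.get(hash);
    if (kept === undefined) {
      keptByHash.set(hash, asset);
      keptAssets.push(asset);
      continue;
    }
    // Duplicate: point every relation at the copy that is kept.
    for (const relation of relations) {
      if (relation.to === asset) {
        relation.to = kept;
      }
    }
  }
  return keptAssets;
}

const fooPng = { url: 'foo.png', bytes: Buffer.from([1, 2, 3, 4]) };
const barPng = { url: 'bar.png', bytes: Buffer.from([1, 2, 3, 4]) };
const otherPng = { url: 'other.png', bytes: Buffer.from([9, 9, 9]) };
const imageRelations = [
  { from: 'inline stylesheet', to: fooPng },
  { from: 'index.html', to: barPng },
  { from: 'index.html', to: otherPng },
];

const survivingAssets = mergeIdenticalSketch([fooPng, barPng, otherPng], imageRelations);
console.log(survivingAssets.map((asset) => asset.url));
console.log(imageRelations.map((relation) => relation.to.url));
```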
Minify all assets in the graph, or those specified by queryObj. Only has an effect for asset types that support minification, and what actually happens also varies:
Html and Xml
Pure-whitespace text nodes are removed immediately.
Json, JavaScript, and Css
The asset gets marked as minified (isPretty is set to false), which doesn't affect the in-memory representation (asset.parseTree), but is honored when the asset is serialized. For JavaScript this only governs the amount of whitespace (escodegen's compact parameter); for how to apply variable renaming and other compression techniques see assetGraph.compressJavaScript.
Compare to assetGraph.prettyPrintAssets.
Add assets to the graph by recursively following "dangling relations". This is the preferred way to load a complete web site or web application into an AssetGraph instance after using assetGraph.loadAssets to add one or more assets to serve as the starting point for the population. The loading of the assets happens in parallel.
The options object can contain these properties:
from: queryObj
Specifies the set of assets to start populating from (defaults to all assets in the graph).
followRelations: queryObj
Limits the set of relations that are followed. The default is to follow all relations.
onError: function (err, assetGraph, asset)
If there's an error loading an asset and an onError function is specified, it will be called, and the population will continue. If not specified, the population will stop and pass on the error to its callback. (This is poorly thought out and should be removed or redesigned.)
concurrency: Number
The maximum number of assets that can be loading at once (defaults to 100).
Example:
const assetGraph = new AssetGraph();
await assetGraph.loadAssets('a.html');
await assetGraph.populate({
followRelations: {
type: 'HtmlAnchor',
to: { url: { $regex: /\/[bc]\.html$/ } },
},
});
If a.html links to b.html, and b.html links to c.html (using <a href="...">), all three assets will be in the graph after assetGraph.populate is done. If c.html happens to link to d.html, d.html won't be added.
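The a.html, b.html, c.html walk-through can be reproduced with a small breadth-first traversal over an in-memory link table. Everything in this sketch is hypothetical (the toySite table, the populateSketch function, and the extra main.css stylesheet); it also loads assets one at a time, whereas the real transform loads them in parallel.

```javascript
// Hypothetical in-memory "site": url -> outgoing relations.
const toySite = {
  'a.html': [
    { type: 'HtmlAnchor', to: 'b.html' },
    { type: 'HtmlStyle', to: 'main.css' },
  ],
  'b.html': [{ type: 'HtmlAnchor', to: 'c.html' }],
  'c.html': [{ type: 'HtmlAnchor', to: 'd.html' }],
  'd.html': [],
  'main.css': [],
};

// Follow only the relations accepted by followRelation, starting at startUrls.
function populateSketch(startUrls, followRelation) {
  const loaded = new Set(startUrls);
  const queue = [...startUrls];
  while (queue.length > 0) {
    const url = queue.shift();
    for (const relation of toySite[url] || []) {
      if (followRelation(relation) && !loaded.has(relation.to)) {
        loaded.add(relation.to);
        queue.push(relation.to);
      }
    }
  }
  return [...loaded];
}

// Same restriction as the example above: HtmlAnchor relations to b.html or c.html.
const populatedUrls = populateSketch(
  ['a.html'],
  (relation) => relation.type === 'HtmlAnchor' && /(?:^|\/)[bc]\.html$/.test(relation.to)
);
console.log(populatedUrls);
```

The stylesheet is skipped because its relation type does not match, and d.html is skipped because its url does not match the regular expression.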
Pretty-print all assets in the graph, or those specified by queryObj. Only has an effect for asset types that support pretty printing (JavaScript, Css, Html, Xml, and Json).
The asset gets marked as pretty printed (isPretty is set to true), which doesn't affect the in-memory representation (asset.parseTree), but is honored when the asset is serialized. For Xml and Html, however, the existing whitespace-only text nodes in the document are removed immediately.
Compare to assetGraph.minifyAssets.
Example:
// Pretty-print all Html and Css assets:
await assetGraph.prettyPrintAssets({ type: { $in: ['Html', 'Css'] } });
Remove all relations in the graph, or those specified by queryObj.
The options object can contain these properties:
detach: Boolean
Whether to also detach the relations (remove their nodes from the parse tree of the source asset). Only supported for some relation types. Defaults to false.
removeOrphan: Boolean
Whether to also remove assets that become "orphans" as a result of removing their last incoming relation.
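The effect of removeOrphan can be illustrated on a toy graph where assets are plain strings. The removeRelationsSketch function is hypothetical, and defaulting removeOrphan to false is an assumption of this sketch rather than something stated above.

```javascript
// Hypothetical sketch; removeOrphan defaulting to false is an assumption.
function removeRelationsSketch(graph, shouldRemove, { removeOrphan = false } = {}) {
  const removed = graph.relations.filter(shouldRemove);
  graph.relations = graph.relations.filter((relation) => !shouldRemove(relation));
  if (removeOrphan) {
    // Only assets that just lost a relation are candidates for removal.
    const formerTargets = new Set(removed.map((relation) => relation.to));
    graph.assets = graph.assets.filter(
      (asset) =>
        !formerTargets.has(asset) ||
        graph.relations.some((relation) => relation.to === asset)
    );
  }
  return graph;
}

const orphanDemoGraph = {
  assets: ['index.html', 'a.js', 'b.js'],
  relations: [
    { type: 'HtmlScript', from: 'index.html', to: 'a.js' },
    { type: 'HtmlScript', from: 'index.html', to: 'b.js' },
    { type: 'HtmlAnchor', from: 'index.html', to: 'b.js' },
  ],
};

removeRelationsSketch(
  orphanDemoGraph,
  (relation) => relation.type === 'HtmlScript',
  { removeOrphan: true }
);
console.log(orphanDemoGraph.assets);
```

Here a.js disappears because its last incoming relation was removed, while b.js stays because an HtmlAnchor relation still points at it.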
Sets the width and height attributes of the img elements underlying all HtmlImage relations, or those matching queryObj. Only works when the image pointed to by the relation is in the graph.
Example:
const AssetGraph = require('assetgraph');
const assetGraph = new AssetGraph();
await assetGraph.loadAssets('hasanimage.html');
await assetGraph.populate();
// assetGraph.findAssets({type: 'Html'})[0].text === '<body><img src="foo.png"></body>'
await assetGraph.setHtmlImageDimensions();
// assetGraph.findAssets({type: 'Html'})[0].text === '<body><img src="foo.png" width="29" height="32"></body>'
Dumps an ASCII table with some basic stats about all the assets in the graph (or those matching queryObj) in their current state.
Example:
Ico           1     1.1 KB
Png          28   196.8 KB
Gif         145   129.4 KB
Json          2    60.1 KB
Css           2   412.6 KB
JavaScript   34     1.5 MB
Html          1     1.3 KB
Total:      213     2.2 MB
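A table like this boils down to grouping assets by type and summing counts and byte sizes. The sketch below is hypothetical: the summarizeByType and formatBytes names, the byteSize field, and the use of 1024-based units are all assumptions, and the transform's real formatting may differ.

```javascript
// Hypothetical helpers; field names and unit conventions are assumptions.
function summarizeByType(assets) {
  const rows = new Map();
  for (const { type, byteSize } of assets) {
    const row = rows.get(type) || { count: 0, bytes: 0 };
    row.count += 1;
    row.bytes += byteSize;
    rows.set(type, row);
  }
  return rows;
}

function formatBytes(bytes) {
  if (bytes >= 1024 * 1024) {
    return `${(bytes / (1024 * 1024)).toFixed(1)} MB`;
  }
  if (bytes >= 1024) {
    return `${(bytes / 1024).toFixed(1)} KB`;
  }
  return `${bytes} bytes`;
}

const statsRows = summarizeByType([
  { type: 'Png', byteSize: 1024 },
  { type: 'Png', byteSize: 512 },
  { type: 'Html', byteSize: 300 },
]);
for (const [type, { count, bytes }] of statsRows) {
  console.log(type.padStart(10), String(count).padStart(4), formatBytes(bytes).padStart(10));
}
```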
Writes the assets matching queryObj to disc. The outRoot parameter must be a file:// url specifying the directory where the files should be output. The optional root parameter specifies the url that you want to correspond to the outRoot directory (defaults to the root property of the AssetGraph instance).
Directories will be created as needed.
Example:
const AssetGraph = require('assetgraph');
const assetGraph = new AssetGraph({root: 'http://example.com/'});
await assetGraph.loadAssets(
'http://example.com/bar/quux/foo.html',
'http://example.com/bar/baz.html'
);
// Write the two assets to /my/output/dir/quux/foo.html and /my/output/dir/baz.html:
await assetGraph.writeAssetsToDisc({type: 'Html'}, 'file:///my/output/dir/', 'http://example.com/bar/');
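The relationship between root and outRoot is a prefix substitution on the asset url. The helper below is a hypothetical sketch of that mapping, assuming both root and outRoot end with a slash; it is not part of AssetGraph's API and does not touch the file system.

```javascript
// Hypothetical helper: map an asset url under `root` to a url under `outRoot`.
// Assumes both `root` and `outRoot` end with a slash.
function mapToOutputUrl(assetUrl, root, outRoot) {
  if (!assetUrl.startsWith(root)) {
    throw new Error(`${assetUrl} is not located under ${root}`);
  }
  return outRoot + assetUrl.slice(root.length);
}

const exampleRoot = 'http://example.com/bar/';
const exampleOutRoot = 'file:///my/output/dir/';
const outputUrls = [
  'http://example.com/bar/quux/foo.html',
  'http://example.com/bar/baz.html',
].map((assetUrl) => mapToOutputUrl(assetUrl, exampleRoot, exampleOutRoot));
console.log(outputUrls);
```

The two results correspond to the output locations mentioned in the comment of the example above.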
Writes all assets in the graph (or those specified by queryObj) to stdout. Mostly useful for piping out a single asset.
AssetGraph is licensed under a standard 3-clause BSD license -- see the LICENSE file for details.