Date: 2017-10-12
A fast, stand-alone legal citation extractor.
Currently supports:
usc: US Code
law: US Slip Laws (public and private laws)
stat: US Statutes at Large
cfr: US Code of Federal Regulations
dc_code: DC Code
dc_register: DC Register
dc_law: DC Slip Law
With limited, opt-in support for:
judicial: US court opinions, using walverine (more below)
As you can see, Citation is currently US-only, but we'd love for that to change. There are lots more citation types out there, and it's easy to contribute, so please help us grow!
Compatible in-browser with modern browsers, including IE 9+.
Citation can be used in the browser, as a Node.js library, over an HTTP API, or from the command line.
But one way or another, you pass in text:
Citation.find("pursuant to 5 U.S.C. 552(a)(1)(E) and");
And you get back data about matched citations:
[{
  "match": "5 U.S.C. 552(a)(1)(E)",
  "citation": "5 U.S.C. 552(a)(1)(E)",
  "type": "usc",
  "index": 12,
  "usc": {
    "title": "5",
    "section": "552",
    "subsections": ["a", "1", "E"],
    "id": "usc/5/552/a/1/E",
    "section_id": "usc/5/552"
  }
}]
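As an illustration of working with the returned data, here is a minimal sketch that turns a usc result into a readable label. describeUsc is a hypothetical helper, not part of Citation; it assumes only the field names shown in the output above.

```javascript
// Hypothetical helper (not part of Citation): builds a readable label
// from a "usc" result shaped like the example output.
function describeUsc(cite) {
  // Anything that is not a US Code cite is returned as matched.
  if (cite.type !== "usc") return cite.match;
  var label = "Title " + cite.usc.title + ", Section " + cite.usc.section;
  var subs = cite.usc.subsections || [];
  if (subs.length > 0) label += " (" + subs.join(")(") + ")";
  return label;
}

// Sample result, trimmed from the example output.
var sample = {
  match: "5 U.S.C. 552(a)(1)(E)",
  type: "usc",
  usc: { title: "5", section: "552", subsections: ["a", "1", "E"] }
};

console.log(describeUsc(sample)); // Title 5, Section 552 (a)(1)(E)
```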
Check out /browser for browser-ready compressed and uncompressed versions of the library.
Loading any of them with a <script> tag will result in a global Citation object being available for immediate use.
Install Node.js and NPM, then install Citation globally (may require sudo):
npm install -g citation
Or install it locally to a node_modules directory with npm install citation.
Citation.find(text, options)
Check a block of text for citations of a given type, returning an array of matches with citations broken out into fields.
options can include:
types: (string | string array) Limit citation types to those given. e.g. ["usc", "law"]
excerpt: (integer) Return an excerpt of the surrounding text for each detected cite, with the given number of characters on either side.
parents: (boolean) For any cite, return any "parent" cites alongside it. For example, matching "5 USC 552(b)(3)" would return 3 results - one for the parent section, one for (b), and one for (b)(3).
filter: (string) Enable Filtering.
replace: (function | object) Enable Replacement.
links: (boolean) Include Links.
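To make the parents option more concrete, the sketch below lists the ids that a cite and its parents would carry. It is illustrative only: idsWithParents is a hypothetical helper, and it assumes ids follow the usc/title/section/subsections shape shown in the example output, not that Citation computes parents this way.

```javascript
// Illustrative only (not Citation's implementation): enumerates the id of
// the parent section and of each progressively deeper subsection.
function idsWithParents(usc) {
  var current = "usc/" + usc.title + "/" + usc.section;
  var ids = [current];
  (usc.subsections || []).forEach(function (sub) {
    current += "/" + sub;
    ids.push(current);
  });
  return ids;
}

// "5 USC 552(b)(3)" gives three ids: the section, (b), and (b)(3).
console.log(idsWithParents({ title: "5", section: "552", subsections: ["b", "3"] }));
// [ 'usc/5/552', 'usc/5/552/b', 'usc/5/552/b/3' ]
```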
Some examples:
Citation.find("pursuant to 5 U.S.C. 552(a)(1)(E) and");
// Yields:
[{
  "match": "5 U.S.C. 552(a)(1)(E)",
  "citation": "5 U.S.C. 552(a)(1)(E)",
  "type": "usc",
  "index": 12,
  "usc": {
    "title": "5",
    "section": "552",
    "subsections": ["a", "1", "E"],
    "id": "usc/5/552/a/1/E",
    "section_id": "usc/5/552"
  }
}]
Citation.find("that term in section 5362(5) of title 31, United States Code.", {
  excerpt: 10
})
// Yields:
[{
  "match": "section 5362(5) of title 31",
  "citation": "31 U.S.C. 5362(5)",
  "excerpt": "t term in section 5362(5) of title 31, United S",
  // ... more details ...
}]
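The excerpt in that output can be reproduced by hand. The following sketch assumes the excerpt is simply the match plus the requested number of characters on either side, clipped to the bounds of the text; excerptAround is a hypothetical helper, not a Citation function.

```javascript
// Hypothetical helper: returns the match with `size` characters of context
// on each side, clipped to the start and end of the text.
function excerptAround(text, match, size) {
  var index = text.indexOf(match);
  if (index === -1) return null;
  var start = Math.max(0, index - size);
  var end = Math.min(text.length, index + match.length + size);
  return text.slice(start, end);
}

var text = "that term in section 5362(5) of title 31, United States Code.";
console.log(excerptAround(text, "section 5362(5) of title 31", 10));
// t term in section 5362(5) of title 31, United S
```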
Start the API on a given port (defaults to 3000):
cite-server [port]
GET or POST to /citation/find with a text parameter:
curl "http://localhost:3000/citation/find?text=5+U.S.C.+552%28a%29%281%29%28E%29"
curl -XPOST "http://localhost:3000/citation/find" -d "text=5 U.S.C. 552(a)(1)(E)"
Will return the results of running Citation.find() on the block of text, under a results key:
{
  "results": [
    {
      "match": "5 U.S.C. 552(a)(1)(E)",
      "citation": "5 U.S.C. 552(a)(1)(E)",
      "type": "usc",
      "index": "0",
      "usc": {
        "title": "5",
        "section": "552",
        "subsections": ["a", "1", "E"],
        "id": "usc/5/552/a/1/E",
        "section_id": "usc/5/552"
      }
    }
  ]
}
Some HTTP-specific parameters:
callback: a function name to use as a JSONP callback.
pretty: prettify (indent) output.
And some of the options that the JavaScript API supports:
text: required, text to extract citations from.
options[excerpt]: include an excerpt with up to this many characters on either side of each match.
options[types]: limit citation types to a comma-separated list (e.g. "usc,law")
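When calling the HTTP API from code, the query string has to be URL-encoded. Here is a minimal sketch using the standard URLSearchParams class; buildFindUrl is a hypothetical helper, and the base URL is assumed to be a locally running cite-server.

```javascript
// Hypothetical helper: builds a GET URL for /citation/find.
// Array-valued options (such as types) are joined with commas.
function buildFindUrl(base, text, options) {
  var params = new URLSearchParams();
  params.append("text", text);
  Object.keys(options || {}).forEach(function (key) {
    var value = options[key];
    params.append(
      "options[" + key + "]",
      Array.isArray(value) ? value.join(",") : String(value)
    );
  });
  return base + "/citation/find?" + params.toString();
}

console.log(buildFindUrl("http://localhost:3000", "5 U.S.C. 552(a)(1)(E)"));
// http://localhost:3000/citation/find?text=5+U.S.C.+552%28a%29%281%29%28E%29

console.log(buildFindUrl("http://localhost:3000", "5 USC 552", { excerpt: 10, types: ["usc", "law"] }));
```

The brackets in the option names are percent-encoded in the generated URL; this sketch assumes the server decodes them in the usual way.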
See etc/ for an example upstart script to keep cite-server running in production.
The shell command can accept a string to parse as an argument or through STDIN, and outputs results to STDOUT as indented JSON.
cite "section 5362(5) of title 31"
echo "section 5362(5) of title 31" | cite
cite "pursuant to 5 U.S.C. 552(a)(1)(E) and" > results.json
Pass any options the library takes, using dot operators to pass nested options.
For example, searching among types:
cite --types=usc,law "section 5362(5) of title 31"
Passing nested options:
cite --dc_code.source=dc_code "and then § 3-101.01 happened"
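To show how a dotted flag lines up with the nested options object the library takes, here is a minimal sketch. setNested is a hypothetical helper, not the CLI's actual argument parser, and it leaves all values as plain strings.

```javascript
// Hypothetical helper: writes a value into a nested object using a
// dot-separated key, creating intermediate objects as needed.
function setNested(target, dottedKey, value) {
  var parts = dottedKey.split(".");
  var node = target;
  parts.slice(0, -1).forEach(function (part) {
    if (typeof node[part] !== "object" || node[part] === null) node[part] = {};
    node = node[part];
  });
  node[parts[parts.length - 1]] = value;
  return target;
}

var options = {};
setNested(options, "types", "usc,law");          // --types=usc,law
setNested(options, "dc_code.source", "dc_code"); // --dc_code.source=dc_code
console.log(JSON.stringify(options));
// {"types":"usc,law","dc_code":{"source":"dc_code"}}
```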
Opt in to using walverine to search judicial cites with --judicial:
cite --judicial "Smith v. Hardibble, 111 Cal.2d 222, 555, 558, 333 Cal.3d 444 (1988)"
Add --links to include links in the output.
Instead of treating the input text as just a blob of text that matches citations at a string index, you can apply a "filter" that will parse the input text and provide more precise context.
For each citation, return the line number and the relative character index of the match inside that line.
Example:
cite --pretty --filter=lines "I once met a cite named nancy
whose 5 usc 552 was awfully fancy
and then the poem ended"
{
  "citations": [
    {
      "type": "usc",
      "match": "5 usc 552",
      "index": 6,
      "citation": "5 U.S.C. 552",
      "usc": {
        "title": "5",
        "section": "552",
        "subsections": [],
        "id": "usc/5/552"
      },
      "line": 2
    }
  ]
}
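The line and index values in that output relate to a plain string offset as in the sketch below. It is an illustration under the assumption that lines are separated by "\n" and numbered from 1; toLinePosition is a hypothetical helper, not the filter's actual code.

```javascript
// Hypothetical helper: converts an absolute character offset into a
// 1-based line number and a 0-based index within that line.
function toLinePosition(text, absoluteIndex) {
  var before = text.slice(0, absoluteIndex);
  var lastBreak = before.lastIndexOf("\n"); // -1 when on the first line
  return {
    line: before.split("\n").length,
    index: absoluteIndex - (lastBreak + 1)
  };
}

var poem = "I once met a cite named nancy\n" +
  "whose 5 usc 552 was awfully fancy\n" +
  "and then the poem ended";

console.log(toLinePosition(poem, poem.indexOf("5 usc 552")));
// { line: 2, index: 6 }
```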
For each citation, return an XPath statement identifying the match's specific node in the input document, and the relative character index of the match inside that node.
Example:
cite --pretty --filter=xpath_xml "
<?xml>
<document>
<title>Best Bill of 2012</title>
<bill>
<introduction>Bill to enforce happiness amongst all the children</introduction>
<closing>All information releasable through 5 U.S.C. 552 is now banned</closing>
<footer>(c) Congress</footer>
</bill>
</document>
"
{
  "citations": [
    {
      "type": "usc",
      "match": "5 U.S.C. 552",
      "index": 35,
      "citation": "5 U.S.C. 552",
      "usc": {
        "title": "5",
        "section": "552",
        "subsections": [],
        "id": "usc/5/552"
      },
      "xpath": "/document[1]/bill[1]/closing[1]/text()[1]"
    }
  ]
}
You can perform a "find-and-replace" on detected citations by providing a replace callback. It is executed on each citation and returns the string to substitute for that citation.
When you pass a replace callback, a text field containing the processed text will be included at the top of the returned object.
Citation.find("click on 5 USC 552 to read more", {
  // Called once per detected citation; returns the replacement string.
  replace: function(cite) {
    var url = "http://www.law.cornell.edu/uscode/text/" + cite.usc.title + "/" + cite.usc.section;
    return "<a href=\"" + url + "\">" + cite.match + "</a>";
  }
});
The response will have a text field containing:
click on <a href="http://www.law.cornell.edu/uscode/text/5/552">5 USC 552</a> to read more
This feature is only available in the JavaScript API.
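A replace callback that reads cite.usc will fail on other citation types, so text with mixed cites needs a guard. Here is a minimal sketch; linkCite is a hypothetical name, and returning cite.match for other types is an assumption about how to leave them unchanged.

```javascript
// Hypothetical callback: links US Code cites and leaves everything else
// as it was matched.
function linkCite(cite) {
  if (cite.type === "usc") {
    var url = "http://www.law.cornell.edu/uscode/text/" +
      cite.usc.title + "/" + cite.usc.section;
    return "<a href=\"" + url + "\">" + cite.match + "</a>";
  }
  return cite.match;
}

// Exercised here with hand-written cite objects rather than real results.
console.log(linkCite({ type: "usc", match: "5 USC 552", usc: { title: "5", section: "552" } }));
console.log(linkCite({ type: "law", match: "Public Law 111-148" }));
```

It would be passed to the library as Citation.find(text, { replace: linkCite }).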
With the links option, each matched citation will include URLs to access the content of the citation on the web. For:
Citation.find("pursuant to 5 U.S.C. 552(a)(1)(E) and", { links: true });
you will get back an extended object with permalinks:
[{
  "match": "5 U.S.C. 552(a)(1)(E)",
  "type": "usc",
  ...
  "usc": {
    "id": "usc/5/552/a/1/E",
    ...
    "links": {
      "usgpo": {
        "source": {
          "name": "U.S. Government Publishing Office",
          "abbreviation": "US GPO",
          "link": "http://www.gpo.gov",
          "authoritative": true,
          "note": "2014 edition. Sub-section citation is not reflected in the link."
        },
        "pdf": "http://api.fdsys.gov/link?collection=uscode&year=2014&title=5&section=552&type=usc",
        "html": "http://api.fdsys.gov/link?collection=uscode&year=2014&title=5&section=552&type=usc&link-type=html",
        "landing": "http://api.fdsys.gov/link?collection=uscode&year=2014&title=5&section=552&type=usc&link-type=contentdetail"
      },
      "cornell_lii": {
        "source": {
          "name": "Cornell Legal Information Institute",
          "abbreviation": "Cornell LII",
          "link": "https://www.law.cornell.edu/uscode/text",
          "authoritative": false,
          "note": "Link is to most current version of the US Code, as available at law.cornell.edu."
        },
        "landing": "https://www.law.cornell.edu/uscode/text/5/552#a_1_E"
      }
    }
  }
}]
The links object maps sources to one or more renditions. The rendition types are html (for raw HTML content), landing for a landing page (i.e. a website) about the document referred to by the citation, pdf (a PDF rendition, as in the usgpo entry above), and mods (US GPO MODS XML files).
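Since each source may offer different renditions, calling code usually has to choose one URL. The sketch below shows one possible policy: try authoritative sources first, then take the first available rendition in a preferred order. pickLink and its default order are assumptions for illustration, not part of Citation.

```javascript
// Hypothetical helper: picks one URL from a "links" object shaped like the
// example above, preferring sources flagged as authoritative.
function pickLink(links, renditions) {
  var order = renditions || ["html", "landing", "pdf"];
  var names = Object.keys(links || {});
  var isAuthoritative = function (name) {
    return Boolean(links[name].source && links[name].source.authoritative);
  };
  var sources = names.filter(isAuthoritative).concat(
    names.filter(function (name) { return !isAuthoritative(name); })
  );
  for (var i = 0; i < sources.length; i++) {
    for (var j = 0; j < order.length; j++) {
      var url = links[sources[i]][order[j]];
      if (url) return { source: sources[i], rendition: order[j], url: url };
    }
  }
  return null;
}

// Sample data trimmed from the example output.
var links = {
  usgpo: {
    source: { authoritative: true },
    html: "http://api.fdsys.gov/link?collection=uscode&year=2014&title=5&section=552&type=usc&link-type=html"
  },
  cornell_lii: {
    source: { authoritative: false },
    landing: "https://www.law.cornell.edu/uscode/text/5/552#a_1_E"
  }
};

console.log(pickLink(links).source);              // usgpo
console.log(pickLink(links, ["landing"]).source); // cornell_lii
```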
You can pass arbitrary options to individual citators, if that citator supports them.
When you pass a key that is the name of a citator, e.g. usc or dc_code, that citator's processors will get the value of that key passed in as an argument.
For example, the dc_code citator accepts a source option, to indicate what the text source is. If the value of source is itself "dc_code", then the citator will apply a looser pattern to detect internal cites.
That looks like this:
Citation.find("required under § 3-101.01(13)(e), the Commission shall perform the", {
dc_code: {source: "dc_code"}
})
That will match § 3-101.01(13)(e), because the dc_code citator assumes it's processing the text of the DC Code itself, and internal references are unambiguous.
Citation can integrate with walverine to detect and return results for US court opinions.
To use walverine, you may need to "opt-in" to including judicial-type citations.
In JavaScript:
Citation.types.judicial = require("./citations/judicial");
In CLI:
cite --judicial "Text to scan"
The HTTP server, cite-server, actually loads judicial cites by default, since the performance penalty is absorbed on start-up.
walverine's support for extra features is limited. When detecting judicial-type cites, there is no support for:
This project is tested with nodeunit.
To run tests, you'll need to install this project from source and install its node dependencies:
git clone git@github.com:unitedstates/citation.git
cd citation
npm install
npm test
Test cases are stored in the test directory. Each test case covers a subsection of the code and ensures that citations are correctly detected: for instance, see test/stat.js.
To run all tests:
nodeunit test
To run a specific test:
nodeunit test/usc.js
This project is dedicated to the public domain. As spelled out in CONTRIBUTING:
The project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication.
All contributions to this project will be released under the CC0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest.