documerica
This repository holds JSON(L) artifacts and a few scripts related to managing archival data from the EPA's DOCUMERICA program.
Contents:
Makefile
: A Makefile with some convenience rules for rebuilding files
*.sort.json.gz
: A compressed "raw" JSON collection of API results from the National Archives, pertaining to DOCUMERICA
documerica.jsonl
: A filtered JSONL collection of DOCUMERICA records, munged for use with the Twitter bot
filter.jq
: A jq filter for transforming *.sort.jsonl into documerica.jsonl
missing.txt
: A newline-delimited list of National Archive IDs (NAIDs) for DOCUMERICA records that are missing photographic scans
make_db.py
: A Python script that creates documerica.db from documerica.jsonl
bot/
: The Twitter bot
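For readers unfamiliar with the JSONL-to-SQLite step, here is a minimal sketch of how a JSONL file could be loaded into a SQLite database. It is not the actual make_db.py: the function name build_db, the table name records, and the single body column are all assumptions made for illustration, and the real script may well extract individual fields instead of storing whole records.

```python
import json
import sqlite3


def build_db(jsonl_path, db_path):
    """Load each non-empty JSONL line into a SQLite table; return the row count.

    Hypothetical helper: the table layout below is an assumption, not the
    schema used by this repository.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DROP TABLE IF EXISTS records")
            conn.execute(
                "CREATE TABLE records (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
            )
            with open(jsonl_path, encoding="utf-8") as f:
                for line in f:
                    line = line.strip()
                    if not line:
                        continue  # tolerate blank lines
                    record = json.loads(line)  # fails loudly on malformed JSON
                    conn.execute(
                        "INSERT INTO records (body) VALUES (?)",
                        (json.dumps(record),),
                    )
        (count,) = conn.execute("SELECT COUNT(*) FROM records").fetchone()
        return count
    finally:
        conn.close()
```

Under these assumptions, build_db("documerica.jsonl", "documerica.db") would rebuild the table from scratch on every run, which keeps the database a pure function of the JSONL file.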
License
The API results stored in this repository are in the public domain.
All other material is under a modified MIT License.