Data Assimilation with Python: a Package for Experimental Research














DAPPER is a set of templates for benchmarking the performance of data assimilation (DA) methods. The tests provide experimental support and guidance for new developments in DA. The typical set-up is a synthetic (twin) experiment, where you specify a dynamic model and an observational model, and use these to generate a synthetic truth (multivariate time series), and then estimate that truth given the models and noisy observations.
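The twin-experiment set-up described above can be sketched in plain NumPy. This is only an illustration and does not use DAPPER's own API: the toy model (a damped 2-variable rotation), the noise levels, the ensemble size, and all names (`M`, `H`, `Q`, `R`, `sample`) are assumptions chosen for brevity, and the estimator is a basic stochastic (perturbed-observation) EnKF without inflation.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical toy set-up (not taken from DAPPER).
Nx, K, N = 2, 200, 20            # state length, number of steps, ensemble size
theta = 0.1
M = 0.99 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])   # linear dynamic model
H = np.array([[1.0, 0.0]])       # observation model: observe 1st variable only
Q = 0.01 * np.eye(Nx)            # model noise covariance
R = 0.1 * np.eye(1)              # observation noise covariance


def sample(cov, size=None):
    """Draw zero-mean Gaussian noise with covariance `cov`."""
    return rng.multivariate_normal(np.zeros(len(cov)), cov, size)


# 1. Generate the synthetic truth and the noisy observations.
xx = np.zeros((K + 1, Nx))
yy = np.zeros((K + 1, 1))
xx[0] = rng.standard_normal(Nx)
for k in range(1, K + 1):
    xx[k] = M @ xx[k - 1] + sample(Q)
    yy[k] = H @ xx[k] + sample(R)

# 2. Estimate the truth from the models and the observations only.
E = rng.standard_normal((N, Nx))           # initial ensemble, one member per row
mu = np.zeros((K + 1, Nx))                 # ensemble-mean estimates
mu[0] = E.mean(axis=0)
for k in range(1, K + 1):
    E = E @ M.T + sample(Q, N)             # forecast step
    A = E - E.mean(axis=0)                 # state anomalies
    Y = A @ H.T                            # observed anomalies
    C_yy = Y.T @ Y / (N - 1) + R
    C_xy = A.T @ Y / (N - 1)
    gain = C_xy @ np.linalg.inv(C_yy)      # ensemble estimate of the Kalman gain
    D = yy[k] + sample(R, N)               # perturbed observations
    E = E + (D - E @ H.T) @ gain.T         # analysis step
    mu[k] = E.mean(axis=0)

# 3. Score the estimate against the (normally unknown) truth.
rmse = np.sqrt(np.mean((mu - xx) ** 2, axis=1))
print("Time-averaged RMSE (2nd half):", rmse[K // 2:].mean())
```

Because the truth is synthetic, the error of the estimate can be computed exactly, which is what makes this kind of experiment useful for benchmarking.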


Getting started

Install, then read, run and try to understand examples/basic_{1,2,3}.py. Some of the examples can also be opened in Jupyter, and thereby run in the cloud with Google Colab (i.e. without installation, but requiring a Google login). This screencast provides an introduction. The documentation includes general guidelines and the API, but for any serious use you will want to read and adapt the code yourself. If you use it in a publication, please cite, e.g., "The experiments used (inspiration from) DAPPER [ref], version 1.2.1", where [ref] points to DOI. Lastly, for an introduction to DA theory also using Python, see these tutorials.


DAPPER enables the numerical investigation of DA methods through a variety of typical test cases and statistics. It (a) reproduces numerical benchmark results reported in the literature, and (b) facilitates comparative studies, thus promoting the (a) reliability and (b) relevance of the results. For example, this figure is generated by examples/basic_3.py and is a reproduction of figure 5.7 of these lecture notes.

Comparative benchmarks with Lorenz'96 plotted as a function of the ensemble size (N)

DAPPER is (c) open source, written in Python, and (d) focuses on readability; this promotes the (c) reproduction and (d) dissemination of the underlying science, and makes it easy to adapt and extend. It also comes with a battery of diagnostics and statistics, and live plotting facilities (on-line with the assimilation), including pause/inspect options, as illustrated below.

EnKF - Lorenz'63

In summary, it is well suited for teaching and fundamental DA research. Also see its drawbacks.


Installation

Works on Linux/Windows/Mac.

Prerequisite: Python>=3.7

If you're an expert, set up a Python environment however you like. Otherwise: Install Anaconda, then open the Anaconda terminal and run the following commands:

conda create --yes --name dapper-env python=3.8
conda activate dapper-env
python -c 'import sys; print("Version:", sys.version.split()[0])'

Ensure the output at the end gives a version of 3.7 or later.
Keep using the same terminal for the commands below.


Install for development

Do you want the DAPPER code available to play around with? Then

  • Download and unzip (or git clone) DAPPER.
  • Move the resulting folder wherever you like,
    and cd into it (ensure you're in the folder with a setup.py file).
  • pip install -e .[dev]
    You can omit [dev] if you don't need to do serious development.

Or: Install as library

Do you just want to run a script that requires DAPPER? Then

  • If the script comes with a requirements.txt file, then do
    pip install -r path/to/requirements.txt.
  • If not, hopefully you know the version of DAPPER needed. Run
    pip install dapper==1.0.0 to get version 1.0.0 (as an example).

Finally: Test the installation

You should now be able to run your script with python path/to/script.py.
For example, if you are in the DAPPER dir,

python examples/basic_1.py

PS: If you closed the terminal (or shut down your computer), you'll first need to run conda activate dapper-env
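If a script fails right away, it can help to first confirm that the package is visible from the Python interpreter you are actually running. The following is a minimal sketch using only the standard library; the helper name `is_importable` is made up for this example, and `dapper` is assumed to be the importable module name matching the `pip install dapper` command above.

```python
import importlib.util


def is_importable(module_name):
    """Return True if `module_name` can be located in the active environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the package installs a top-level module called "dapper".
    print("dapper importable:", is_importable("dapper"))
```

If this prints False, the most likely cause is that the conda environment has not been activated in the current terminal.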

DA methods

| Method | Literature reproduced |
|---|---|
| EnKF [1] | Sakov08, Hoteit15, Grudzien2020 |
| EnKF-N | Bocquet12, Bocquet15 |
| EnKS, EnRTS | Raanes2016 |
| iEnKS / iEnKF / EnRML / ES-MDA [2] | Sakov12, Bocquet12, Bocquet14 |
| LETKF, local & serial EAKF | Bocquet11 |
| Sqrt. model noise methods | Raanes2014 |
| Particle filter (bootstrap) [3] | Bocquet10 |
| Optimal/implicit Particle filter [3] | Bocquet10 |
| NETF | Tödter15, Wiljes16 |
| Rank histogram filter (RHF) | Anderson10 |
| Extended KF | |
| Optimal interpolation | |

1: Stochastic, DEnKF (i.e. half-update), ETKF (i.e. sym. sqrt.). Serial forms are also available.
Tuned with inflation and "random, orthogonal rotations".
2: Also supports the bundle version, and "EnKF-N"-type inflation.
3: Resampling: multinomial (including systematic/universal and residual).
The particle filter is tuned with "effective-N monitoring", "regularization/jittering" strength, and more.
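To make footnote 3 more concrete, here is a minimal NumPy sketch of "effective-N monitoring" combined with systematic (universal) resampling. It is a generic textbook formulation, not DAPPER's implementation; the function names and the threshold of half the ensemble size are assumptions for illustration.

```python
import numpy as np


def effective_N(w):
    """Effective sample size, 1 / sum(w_i^2), for normalized weights `w`."""
    return 1.0 / np.sum(w ** 2)


def systematic_resample(w, rng):
    """Return particle indices drawn by systematic (universal) resampling."""
    N = len(w)
    # A single uniform draw, shifted to N evenly spaced points in [0, 1).
    points = (rng.random() + np.arange(N)) / N
    cumsum = np.cumsum(w)
    cumsum[-1] = 1.0  # guard against round-off in the last entry
    return np.searchsorted(cumsum, points, side="right")


rng = np.random.default_rng(0)
w = np.array([0.7, 0.1, 0.1, 0.1])        # normalized particle weights

# Hypothetical tuning choice: resample when fewer than half the particles
# are "effective".
if effective_N(w) < 0.5 * len(w):
    idx = systematic_resample(w, rng)
    w = np.full(len(w), 1.0 / len(w))      # weights are reset after resampling
    print("Resampled indices:", idx)
```

Systematic resampling uses one random number for the whole ensemble, so each particle is copied either the floor or the ceiling of N times its weight, which keeps the resampling noise low compared with independent multinomial draws.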

For a list of ready-made experiments with suitable, tuned settings for a given method (e.g. the iEnKS), use:

grep -r "xp.*iEnKS" dapper/mods
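On systems without grep (e.g. a plain Windows terminal), roughly the same search can be done from Python. This is a sketch using only the standard library; the function name `grep_recursive` is made up here, and unlike `grep -r` it only looks at `*.py` files by default.

```python
import re
from pathlib import Path


def grep_recursive(pattern, root, glob="*.py"):
    """Yield (path, line number, line) for lines under `root` matching `pattern`."""
    regex = re.compile(pattern)
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        with open(path, encoding="utf-8", errors="replace") as f:
            for lineno, line in enumerate(f, start=1):
                if regex.search(line):
                    yield path, lineno, line.rstrip("\n")


if __name__ == "__main__":
    # Run from the DAPPER directory, as with the grep command.
    for path, lineno, line in grep_recursive(r"xp.*iEnKS", "dapper/mods"):
        print(f"{path}:{lineno}: {line}")
```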

Test cases (models)

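| Model | Lin | TLM** | PDE? | Phys.dim. | State len | Lyap≥0 | Implementer |
|---|---|---|---|---|---|---|---|
| Linear Advect. (LA) | Yes | Yes | Yes | 1d | 1000 * | 51 | Evensen/Raanes |
| LotkaVolterra | No | Yes | No | 0d | 5 * | 1 | Wikipedia/Raanes |
| Lorenz96 | No | Yes | No | 1d | 40 * | 13 | Raanes |
| Lorenz96s | No | Yes | No | 1d | 10 * | 4 | Grudzien |
| LorenzUV | No | Yes | No | 2x 1d | 256 + 8 * | ≈60 | Raanes |
| LorenzIII | No | No | No | 1d | 960 * | ≈164 | Raanes |
| Vissio-Lucarini 20 | No | Yes | No | 1d | 36 * | 10 | Yumeng |
| Kuramoto-Sivashinsky | No | Yes | Yes | 1d | 128 * | 11 | Kassam/Raanes |
| Quasi-Geost (QG) | No | No | Yes | 2d | 129²≈17k | ≈140 | Sakov |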
  • *: Flexible; set as necessary
  • **: Tangent Linear Model included?
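As an illustration of what one of these test cases looks like, below is a minimal NumPy sketch of the Lorenz-96 model with the state length of 40 listed in the table. It is written from the standard textbook equations, not copied from dapper/mods; the forcing F=8, the time step dt=0.05, and the function names are conventional or hypothetical choices made for this example.

```python
import numpy as np


def lorenz96_tendency(x, F=8.0):
    """dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F, with periodic indices."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F


def rk4_step(f, x, dt):
    """Advance the state `x` by one 4th-order Runge-Kutta step of size `dt`."""
    k1 = f(x)
    k2 = f(x + dt / 2 * k1)
    k3 = f(x + dt / 2 * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


# Start from the (unstable) equilibrium x_i = F, with a small perturbation.
x = 8.0 * np.ones(40)
x[0] += 0.01
for _ in range(1000):
    x = rk4_step(lorenz96_tendency, x, dt=0.05)
print("State after 1000 steps (first 5 entries):", x[:5])
```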

The models are found as subdirectories within dapper/mods. A model should be defined in a file named __init__.py, and illustrated by a file named demo.py. Most other files within a model subdirectory are named authorYEAR.py and define an HMM object, which holds the settings of a specific twin experiment, using that model, as detailed in the corresponding author/year's paper. A list of these files can be obtained using

find dapper/mods -iname '[a-z]*[0-9]*.py'
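If the find command is not available (e.g. on Windows), the same listing can be approximated in Python. This is a sketch with the standard library only; the function name `list_experiment_files` is hypothetical, and the pattern mirrors the case-insensitive `-iname` pattern above by lower-casing the file name before matching.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def list_experiment_files(root, pattern="[a-z]*[0-9]*.py"):
    """Return files under `root` whose (lower-cased) name matches `pattern`."""
    return sorted(
        path
        for path in Path(root).rglob("*.py")
        if path.is_file() and fnmatchcase(path.name.lower(), pattern)
    )


if __name__ == "__main__":
    # Run from the DAPPER directory, as with the find command.
    for path in list_experiment_files("dapper/mods"):
        print(path)
```

Files such as __init__.py and demo.py are skipped, since the pattern requires the name to start with a letter and to contain a digit.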

Some files contain settings used by several papers. Moreover, at the bottom of each such file should be (in comments) a list of suitable, tuned settings for various DA methods, along with their expected, average rmse.a score for that experiment. As mentioned above, DAPPER reproduces literature results. You will also find results that were not reproduced by DAPPER.
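For orientation, the kind of score quoted in those comments can be sketched as follows. This assumes that rmse.a refers to the root-mean-square error of the analysis (i.e. post-update) estimate, computed over the state vector at each time and then averaged over time; the function name and the `burn_in` parameter are hypothetical and not part of DAPPER.

```python
import numpy as np


def time_averaged_rmse(estimates, truth, burn_in=0):
    """Average over time of the per-time RMSE between `estimates` and `truth`.

    Both inputs have shape (number of times, state length). The first
    `burn_in` times are excluded from the average.
    """
    estimates = np.asarray(estimates, dtype=float)
    truth = np.asarray(truth, dtype=float)
    rmse_per_time = np.sqrt(np.mean((estimates - truth) ** 2, axis=1))
    return rmse_per_time[burn_in:].mean()


truth = np.zeros((3, 4))
estimates = np.ones((3, 4))            # every component is off by exactly 1
print(time_averaged_rmse(estimates, truth))   # prints 1.0
```

Comparing such a number from your own run with the value listed in the comments is a quick check that a method is configured as intended, keeping in mind that results vary somewhat with the random seed and experiment length.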

Similar projects

DAPPER is aimed at research and teaching (see discussion up top). Examples of limitations:

  • It is not suited for very big models (>60k unknowns).
  • Time-dependent error covariances and changes in lengths of state/obs are not supported (although the Dyn and Obs models may otherwise be time-dependent).
  • Non-uniform time sequences are not fully supported.

The scope of DAPPER is restricted because


Moreover, even straying beyond basic configurability appears unrewarding when already building on a high-level language such as Python. Indeed, you may freely fork and modify the code of DAPPER, which should be seen as a set of templates, and not a framework.

Also, DAPPER comes with no guarantees/support. Therefore, if you have an operational or real-world application, such as WRF, you should look into one of the alternatives below, sorted by approximate project size.

| Name | Developers | Purpose (approximately) |
|---|---|---|
| OpenDA | TU Delft | General |
| EMPIRE | Reading (Met) | General |
| ERT | Statoil | History matching (Petroleum DA) |
| PIPT | CIPR | History matching (Petroleum DA) |
| Verdandi | INRIA | Biophysical DA |
| PyOSSE | Edinburgh, Reading | Earth-observation DA |

Below is a list of projects with a purpose more similar to DAPPER's (research in DA, and not so much using DA):

| Name | Developers | Language / notes |
|---|---|---|
| DAPPER | Raanes, Chen, Grudzien | Python |
| SANGOMA | Conglomerate* | Fortran, Matlab |
| hIPPYlib | Villa, Petra, Ghattas | Python, adjoint-based PDE methods |
| FilterPy | R. Labbe | Python. Engineering oriented. |
| DASoftware | Yue Li, Stanford | Matlab. Large inverse probs. |
| Pomp | U of Michigan | R |
| EnKF-C | Sakov | C. Light-weight, off-line DA |
| DasPy | Xujun Han | Python |
| DataAssimilationBenchmarks.jl | Grudzien | Julia, Python |
| IEnKS code | Bocquet | Python |

The EnKF-Matlab and IEnKS codes have been inspirational in the development of DAPPER.

*: AWI/Liege/CNRS/NERSC/Reading/Delft


Contributors

Patrick N. Raanes, Yumeng Chen, Colin Grudzien, Maxime Tondeur, Remy Dubois

DAPPER is developed and maintained at NORCE (Norwegian Research Institute) and the Nansen Environmental and Remote Sensing Center (NERSC), in collaboration with the University of Reading, the UK National Centre for Earth Observation (NCEO), and the University of Nevada, Reno.


