5. Analysis of output

5.1. Output file structure

The output is written to three files: output_aux.hdf5, output_indata.hdf5, and output_ray.hdf5. This is a big departure from RH, which contained several more output files. In particular, RH 1.5D will not write all the information that was written by RH, due to the sheer size it would take for large 3D simulations. The files are written in the machine-independent, self-describing HDF5 format. The contents of the files are organised in groups, variables, and attributes. Groups and variables can be imagined as directories and files in a filesystem. Inside groups, different variables and dimensions can be organised. The content of the output files can vary: some runs will have more detailed information and therefore more variables written to the files.

The structure of the three files is given below.

Note

When a column fails to converge, output for that column is not written. This means that the variables that depend on (nx, ny) will have some values missing. HDF5 marks these values as missing data and uses a fill value (of 9.9692e+36). When the 15D_DEPTH_ZCUT option is used, not all heights will be used in the calculation. The code does not read the skipped parts of the atmosphere. When writing such variables of nz, only the points that were used are written to the file, and the rest will be marked as missing data (typically the z cut height varies with the column).

5.1.1. output_aux.hdf5

This file contains the level populations and radiative rates. For each active atom or molecule, it contains different groups called atom_XX or molecule_XX, where XX is the identifier for the species (e.g. MG, CO).

The dimensions of the root group are nx, ny, and nz.

Note

The atmosphere dimensions on many of the output files are not necessarily the same as in the atmosphere file. They depend on the number of columns calculated, which are a function of X/Y_START/END/STEP.

There are two global attributes:

atmosID Identifier for the atmosphere file.
rev_id Revision identifier.

Inside each of the atom/molecule groups, the following dimensions can exist:

Name Description
nlevel Number of atomic levels.
nline Number of atomic transitions
ncontinuum Number of bound-free transitions.
nlevel_vibr Number of molecule vibration levels.
nline_molecule Number of molecular lines.
nJ Number of rotational states.

The atom groups contain the following variables:

The molecule groups contain the following variables:

All units are SI.

Note

In older versions it was possible to specify the keyword 15D_WRITE_EXTRA and get additional output written to output_aux.hdf5 (e.g. a new opacity group and more rates). While the procedures are still in writeAux_p.c, the functionality is deprecated because other changes in the code were not compatible with this way of writing the output. It is possible that this functionality will return at a later version.

5.1.2. output_indata.hdf5

This file contains data and metadata related to the run. It contains three groups: input (mostly settings from keyword.input), atmos (atmospheric variables), and mpi (several variables relating to the run).

The dimensions of the root group are nx, ny, nz, and strlen (used as maximum length for string type variables).

There are two global attributes:

atmosID Identifier for the atmosphere file.
rev_id Revision identifier.

The input group contains only attributes, all options from keyword.input.

The atmos groups contains the dimensions nhydr, elements and nrays. It also contains the following variables:

Name Dimensions Units Description
temperature (nx, ny, nz) K Temperatures
velocity_z (nx, ny, nz) m s-1 Vertical velocities
height (nx, ny, nz) m Height scale used. Can be different for every column when depth refine is used.
element_weight (nelements) a.m.u. Atomic weights
element_abundance (nelements)   Log of element abundances relative to hydrogen (A(H) = 12).
element_id (nelements, strlen)   Element identifiers.
muz (nrays)   mu values for each ray.
muz (nrays)   mu weights for each ray.
x (nx) m Spatial coordinates along x axis.
y (ny) m Spatial coordinates along y axis.

Note

When 15D_DEPTH_REFINE is used, each column will have a different (optimised) height scale, but they all have the same number of depth points (nz). In these cases, it is very important to save the height variable because otherwise one does not know how to relate the height relations of quantities from different columns.

The atmos group also contains the following attributes:

moving Unsigned int, 1 if velocity fields present.
stokes Unsigned int, 1 if stokes output present.

The mpi group contains the dimensions nprocesses (number of processes) and niterations (maximum value of iterations).

Warning

niterations is currently hardcoded in the code to a maximum of 1500. If you try to run more than 1500 iterations, there will be an error writing to the output.

The mpi group also contains several variables:

Name Dimensions Description
xnum (nx) Indices of x positions calculated.
xnum (nx) Indices of x positions calculated.
task_map (nx, ny) Maps which process ran which column.
task_map_number (nx, ny) Maps which task number each column was.
iterations (nx, ny) Number of iterations used for each column.
convergence (nx, ny) Indicates if each column converged or not. Possible values are 1 (converged), 0 (non converged), or -1 (crashed).
delta_max (nx, ny) Final value for delta_max when iteration finished.
delta_max_history (nx, ny, niterations) Evolution of delta_max
ntasks (nprocesses) Number of tasks assigned to each process. Does not work in pool mode.
hostname (nprocesses, strlen) Hostname of each process.
starting_time (nprocesses, strlen) Time when each process started the calculations.
finish_time (nprocesses, strlen) Time when each process finished the calculations.
z_cut (nx, ny) Height index of the temperature cut.

The mpi group also contains the following attributes: x_start, x_end, x_step, y_start, y_end, and y_step, all of which are options from keyword.input.

5.1.3. output_ray.hdf5

This file contains the synthetic spectra and can also contain extra information such as opacities and the source function. It contains only the root group. Its dimensions are nx, ny, nwave, and eventually wavelength_selected. The latter is only present when ray.input specifies more than 0 wavelengths for detailed output, and it matches Nsource, the number of those wavelengths entered in ray.input.

It can contain the following variables:

Name Dimensions Description
wavelength (nwave) Wavelength scale.
intensity (nx, ny, nwave) Synthetic disk-centre intensity (Stokes I).
stokes_Q (nx, ny, nwave) Stokes Q. Optional.
stokes_U (nx, ny, nwave) Stokes U. Optional.
stokes_V (nx, ny, nwave) Stokes V. Optional.
tau_one_height (nx, ny, nwave) Height where optical depth reaches unity, for each column. Optional.
wavelength_indices (wavelength_selected) Indices of wavelengths selected for variables below. Optional.
chi (nx, ny, nz, wavelength_selected) Total opacity (line and continuum). Optional.
source_function (nx, ny, nz, wavelength_selected) Total opacity (line and continuum). Optional.
Jlambda (nx, ny, nz, wavelength_selected) Angle-averaged radiation field. Optional.

All units are in SI. The intensity, Stokes vector, source_function, Jlambda are all in J s-1m-2Hz-1sr-1. The wavelength is in nm, air or vacuum units, depending if VACUUM_TO_AIR is TRUE or FALSE (in keyword.input). chi is in m-1and tau_one_height in m.

Despite internally being calculated in double precision, all the output (except the wavelength scale) is written in single precision to save disk space.

The full Stokes vector is only written when in keyword.input STOKES_MODE is not NO_STOKES and the STOKES_INPUT is set.

The chi, source_function, and Jlambda variables depend on the 3D grid and on wavelength. Therefore, for even moderate grid sizes they can take huge amounts of space. If nx = ny = nz = 512 and wavelength_selected = 200, each of these variables will need 100Gb of disk space. For a simulation with a cubic grid of 10243 points and saving the full output for 1000 wavelength points, output_ray.hdf5 will occupy a whopping 12Tb per snapshot of disk space. To avoid such problems, these large arrays are only written when ray.input contains Nsource > 0, and for the wavelengths selected.

Note

As noted above, arrays of nz will have the first values missing if 15D_DEPTH_ZCUT is used. The variables chi, source_function, and Jlambda are an exception to this rule. For performance reasons these missing values are not marked with a fill value, but instead they are filled with zeros.

The output_ray.hdf5 file contains the following global attributes:

atmosID Identifier for the atmosphere file
snapshot_number Number of simulation snapshot (from atmosphere file)
rev_id Revision identifier
creation_time Local time when file was created

5.2. Reading the output files

HDF5 is an open, platform-independent format, and therefore interfaces to many programming languages are available. The main interface libraries are available in C, C++, Fortran, and Java. But there are also interfaces for Python (h5py), Julia, IDL (from version 6.2), MATLAB , Octave, Perl, and R.

The RH 1.5D output files can be read with standard HDF5 or NetCDF 4 readers: in most cases one needs to specify only the variable or group name. The HDF5 library provides useful command line tools, which can be used to gather information about the RH 1.5D files or extract data. Additionally, there is a more complete set of tools written in Python to read and analyse these files. Interfaces in other languages are also planned.

Warning

Because of the limitations of different languages, not all interfaces support all HDF5 features. IDL in particular does not support masked arrays. This means that when reading variables with missing data (see Output file structure), IDL will happily read all the data with no warning or indication of those that have special fill values.

5.2.1. Command line tools

Two useful command line tools that come with HDF5 are h5dump and h5repack.

As shown above, h5dump can be used with the -H option to look at the header of a file: see the dimensions, variables, groups. It can also be used to print a text version of any variable in an HDF5 file (e.g. this can be redirected to a text file). When printing a variable one uses the option -v variable, and the resulting output is the same as in the -H mode, with the variable printed at the end.

The h5repack program can be used to copy and modify the parameters of HDF5 files. It can convert the files between different format versions, compress variables, etc. Of particular importance is the option for rechunking a file. Chunking in HDF5 files can be used to improve performance by changing the disk structures to improve different read patterns. It is analogous to fully or partially transposing the variables along certain dimensions.

See also

h5dump guide
Detailed information about h5dump.
h5repack guide
Detailed information about h5repack.
Chunking in HDF5
Description on the advantages of chunking.

5.2.2. Python interface

Python routines to read the input, output, and more are available at rh/python/rh15d.py. They require the numpy and h5py (or netCDF4) modules.

5.2.2.1. Reading output files

The main class to read the output is called Rh15dout. It can be initiated in the following way:

>>> from rh15d import *
>>> rr = rh15d.Rh15dout()
--- Read ./output_aux.hdf5 file.
--- Read ./output_indata.hdf5 file.
--- Read ./output_ray.hdf5 file.

By default, it will look for the three files in the directory specified as main argument (defaults to current directory). Additionally, the methods read_aux(infile), read_indata(infile), and read_ray(infile) can be used to manually load a file. The variables themselves are not read into memory, but are rather a memmap object (file pointer; only read when needed).

After loading the files, the Rh15dout instance maps them into different classes. The ray class contains all the variables that were saved in the output_ray.hdf5 as attributes:

>>> [a for a in dir(rr.ray) if a[0] != '_']
['intensity',
 'params',
 'stokes_Q',
 'stokes_U',
 'stokes_V',
 'chi',
 'source_function',
 'scattering',
 'Jlambda',
 'tau_one_height',
 'wavelength',
 'wavelength_indices']
>>> rr.ray.intensity
<HDF5 dataset "intensity": shape (256, 256, 1626), type "<f4">

Several of the options from output_indata.hdf5, along with many of the dimensions used, are gathered into a method called params, saved as a dictionary. The mpi and atmos groups of output_indata.hdf5, with all their variables are like ray, added into classes with the same name that are methods of the Rh15dout instance. For example:

>>> rr.mpi.convergence
<HDF5 dataset "convergence": shape (256, 256), type "<i8">
>>> rr.atmos.temperature
<HDF5 dataset "temperature": shape (256, 256, 400), type "<f4">

Likewise, the groups of output_aux.hdf5 are also added as classes that are methods of the main instance:

>>> [a for a in dir(rr.atom_CA) if a[0] != '_']
['Rij_continuum',
 'Rij_line',
 'Rji_continuum',
 'Rji_line',
 'params',
 'populations',
 'populations_LTE']

All the groups and variables are therefore automatically added, so they will adapt to any changes in the output files.

5.2.2.2. Reading input files

The HDF5Atmos class can be used to read the input atmosphere files. It can be initiated in the following way:

>>> from rh15d import *
>>> atm = rh15d.HDF5Atmos('my_atmos.hdf5')

Inspection of the result reveals the variables, a dictionary called param with basic data and the properties of the variables:

>>> [a for a in dir(atm) if a[0] != '_']
['B_x',
 'B_y',
 'B_z',
 'close',
 'closed',
 'electron_density',
 'file',
 'hydrogen_populations',
 'params',
 'read',
 'snapshot_number',
 'temperature',
 'velocity_z',
 'write_multi',
 'write_multi_3d',
 'x',
 'y',
 'z']
>>> atm.params     # Contains basic properties
{'boundary_bottom': 1,
 'boundary_top': 0,
 'description': 'Created with make_hdf5_atmos.on 2017-04-03 17:32:06.831811',
 'has_B': 1,
 'nhydr': 6,
 'nt' : 4,
 'nx': 512,
 'ny': 512,
 'nz': 400}
>>> atm.velocity_z
<HDF5 dataset "velocity_z": shape (4, 512, 512, 400), type "<f4">

This interface is read-only, no modifications to the atmosphere files are possible.

5.2.2.3. Writing input files

In rh15d.py there is also a function to write the input atmosphere, assuming the user already has the required arrays at hand. Its function definition is:

def make_hdf5_atmos(outfile, T, vz, nH, z, x=None, y=None, Bz=None, By=None,
              Bx=None, rho=None, ne=None, vx=None, vy=None, desc=None,
              snap=None, boundary=[1, 0], comp=None, complev=None,
              append=False):
  """
  Creates HDF5 input file for RH 1.5D.

  Parameters
  ----------
  outfile : string
      Name of destination. If file exists it will be wiped.
  T : n-D array
      Temperature in K. Its shape will determine the output
      dimensions (can be 1D, 2D, or 3D).
  vz : n-D array
      Line of sight velocity in m/s. Same shape as T.
  nH : n-D array
      Hydrogen populations in m^-3. Shape is [nhydr, shape.T] where
      nydr can be 1 (total number of protons) or more (level populations).
  z : n-D array
      Height in m. Can have same shape as T (different height scale
      for each column) or be only 1D (same height for all columns).
  ne : n-D array, optional
      Electron density in m^-3. Same shape as T.
  rho : n-D array, optional
      Density in kg / m^-3. Same shape as T.
  vx : n-D array, optional
      x velocity in m/s. Same shape as T. Not in use by RH 1.5D.
  vy : n-D array, optional
      y velocity in m/s. Same shape as T. Not in use by RH 1.5D.
  Bx : n-D array, optional
      Magnetic field in x dimension, in Tesla. Same shape as T.
  By : n-D array, optional
      Magnetic field in y dimension, in Tesla. Same shape as T.
  Bz : n-D array, optional
      Magnetic field in z dimension, in Tesla. Same shape as T.
  x : 1-D array, optional
      Grid distances in m. Same shape as first index of T.
  y : 1-D array, optional
      Grid distances in m. Same shape as second index of T.
  x : 1-D array, optional
      Grid distances in m. Same shape as first index of T.
  snap : array-like, optional
      Snapshot number(s).
  desc : string, optional
      Description of file
  boundary : Tuple, optional
      Tuple with [bottom, top] boundary conditions. Options are:
      0: Zero, 1: Thermalised, 2: Reflective.
  append : boolean, optional
      If True, will append to existing file (if any).
  comp : string, optional
      Options are: None (default), 'gzip', 'szip', 'lzf'.
  complev : integer or tuple, optional
      Compression level. Integer for 'gzip', 2-tuple for szip.
  """

Both zlib and szip compression are supported but again this is not recommended.

5.2.2.4. Writing wavelength files

Another utility function in rh15d.py is make_wave_file. It creates an RH wavelength file from a given array of wavelengths. It’s usage is documented in its function call:

def make_wave_file(outfile, start=None, end=None, step=None, new_wave=None,
                   ewave=None, air=True):
   """
   Writes RH wave file (in xdr format). All wavelengths should be in nm.

   Parameters
   ----------
   start: number
       Starting wavelength.
   end: number
       Ending wavelength (non-inclusive)
   step: number
       Wavelength separation
   outfile: string
       Name of file to write.
   ewave: 1-D array, optional
       Array of existing wavelengths. Program will make discard points
       to make sure no step is enforced using these points too.
   air: boolean, optional
       If true, will at the end convert the wavelengths into vacuum
       wavelengths.
   """

5.2.3. Other languages

While many other languages have interfaces to read HDF5 files, there are no specific routines for reading the output from RH 1.5D. Support for other languages may be added later as demand requires.

Under RH 1.5D idl/ directory is routine named read_ncdf_var.pro. The function read_ncdf_var() can be used to read variables from an HDF5 or netCDF4 file, e.g.:

IDL> data = read_ncdf_var("output_ray.hdf5", "intensity")
IDL> help, data
DATA            FLOAT     = Array[902, 512, 512]
IDL> pops = read_ncdf_var("output_aux.hdf5", "populations", groupname="atom_CA")
IDL> help, pops
POPS            FLOAT     = Array[400, 512, 512, 5]

5.3. Analysis tools

Note

There is no organised package of analysis tools for the output of RH 1.5D. This should be added in the future. The IDL analysis suite of RH does not work with RH 1.5D.