5. Analysis of output¶
5.1. Output file structure¶
The output is written to three files: output_aux.hdf5
, output_indata.hdf5
, and output_ray.hdf5
. This is a big departure from RH, which contained several more output files. In particular, RH 1.5D will not write all the information that was written by RH, due to the sheer size it would take for large 3D simulations. The files are written in the machine-independent, self-describing HDF5 format. The contents of the files are organised in groups, variables, and attributes. Groups and variables can be imagined as directories and files in a filesystem. Inside groups, different variables and dimensions can be organised. The content of the output files can vary: some runs will have more detailed information and therefore more variables written to the files.
The structure of the three files is given below.
Note
When a column fails to converge, output for that column is not written. This means that the variables that depend on (nx, ny)
will have some values missing. HDF5 marks these values as missing data and uses a fill value (of 9.9692e+36). When the 15D_DEPTH_ZCUT
option is used, not all heights will be used in the calculation. The code does not read the skipped parts of the atmosphere. When writing such variables of nz
, only the points that were used are written to the file, and the rest will be marked as missing data (typically the z cut height varies with the column).
5.1.1. output_aux.hdf5¶
This file contains the level populations and radiative rates. For each active atom or molecule, it contains different groups called atom_XX
or molecule_XX
, where XX
is the identifier for the species (e.g. MG
, CO
).
The dimensions of the root group are nx
, ny
, and nz
.
Note
The atmosphere dimensions on many of the output files are not necessarily the same as in the atmosphere file. They depend on the number of columns calculated, which are a function of X/Y_START/END/STEP
.
There are two global attributes:
atmosID |
Identifier for the atmosphere file. |
rev_id |
Revision identifier. |
Inside each of the atom/molecule groups, the following dimensions can exist:
Name | Description |
---|---|
nlevel |
Number of atomic levels. |
nline |
Number of atomic transitions |
ncontinuum |
Number of bound-free transitions. |
nlevel_vibr |
Number of molecule vibration levels. |
nline_molecule |
Number of molecular lines. |
nJ |
Number of rotational states. |
The atom groups contain the following variables:
The molecule groups contain the following variables:
All units are SI.
Note
In older versions it was possible to specify the keyword 15D_WRITE_EXTRA
and get additional output written to output_aux.hdf5
(e.g. a new opacity
group and more rates). While the procedures are still in writeAux_p.c
, the functionality is deprecated because other changes in the code were not compatible with this way of writing the output. It is possible that this functionality will return at a later version.
5.1.2. output_indata.hdf5¶
This file contains data and metadata related to the run. It contains three groups: input
(mostly settings from keyword.input
), atmos
(atmospheric variables), and mpi
(several variables relating to the run).
The dimensions of the root group are nx
, ny
, nz
, and strlen
(used as maximum length for string type variables).
There are two global attributes:
atmosID |
Identifier for the atmosphere file. |
rev_id |
Revision identifier. |
The input
group contains only attributes, all options from keyword.input
.
The atmos
groups contains the dimensions nhydr
, elements
and nrays
. It also contains the following variables:
Name | Dimensions | Units | Description |
---|---|---|---|
temperature |
(nx, ny, nz) |
K | Temperatures |
velocity_z |
(nx, ny, nz) |
m s-1 | Vertical velocities |
height |
(nx, ny, nz) |
m | Height scale used. Can be different for every column when depth refine is used. |
element_weight |
(nelements) |
a.m.u. | Atomic weights |
element_abundance |
(nelements) |
Log of element abundances relative to hydrogen (A(H) = 12). | |
element_id |
(nelements, strlen) |
Element identifiers. | |
muz |
(nrays) |
mu values for each ray. | |
muz |
(nrays) |
mu weights for each ray. | |
x |
(nx) |
m | Spatial coordinates along x axis. |
y |
(ny) |
m | Spatial coordinates along y axis. |
Note
When 15D_DEPTH_REFINE
is used, each column will have a different (optimised) height scale, but they all have the same number of depth points (nz
). In these cases, it is very important to save the height
variable because otherwise one does not know how to relate the height relations of quantities from different columns.
The atmos
group also contains the following attributes:
moving |
Unsigned int, 1 if velocity fields present. |
stokes |
Unsigned int, 1 if stokes output present. |
The mpi
group contains the dimensions nprocesses
(number of processes) and niterations
(maximum value of iterations).
Warning
niterations
is currently hardcoded in the code to a maximum of 1500. If you try to run more than 1500 iterations, there will be an error writing to the output.
The mpi
group also contains several variables:
Name | Dimensions | Description |
---|---|---|
xnum |
(nx) |
Indices of x positions calculated. |
xnum |
(nx) |
Indices of x positions calculated. |
task_map |
(nx, ny) |
Maps which process ran which column. |
task_map_number |
(nx, ny) |
Maps which task number each column was. |
iterations |
(nx, ny) |
Number of iterations used for each column. |
convergence |
(nx, ny) |
Indicates if each column converged or not. Possible
values are 1 (converged), 0 (non converged),
or -1 (crashed). |
delta_max |
(nx, ny) |
Final value for delta_max when iteration finished. |
delta_max_history |
(nx, ny, niterations) |
Evolution of delta_max |
ntasks |
(nprocesses) |
Number of tasks assigned to each process. Does not work in pool mode. |
hostname |
(nprocesses, strlen) |
Hostname of each process. |
starting_time |
(nprocesses, strlen) |
Time when each process started the calculations. |
finish_time |
(nprocesses, strlen) |
Time when each process finished the calculations. |
z_cut |
(nx, ny) |
Height index of the temperature cut. |
The mpi
group also contains the following attributes: x_start
, x_end
, x_step
, y_start
, y_end
, and y_step
, all of which are options from keyword.input
.
5.1.3. output_ray.hdf5¶
This file contains the synthetic spectra and can also contain extra information such as opacities and the source function. It contains only the root group. Its dimensions are nx
, ny
, nwave
, and eventually wavelength_selected
. The latter is only present when ray.input
specifies more than 0
wavelengths for detailed output, and it matches Nsource
, the number of those wavelengths entered in ray.input
.
It can contain the following variables:
Name | Dimensions | Description |
---|---|---|
wavelength |
(nwave) |
Wavelength scale. |
intensity |
(nx, ny, nwave) |
Synthetic disk-centre intensity (Stokes I). |
stokes_Q |
(nx, ny, nwave) |
Stokes Q. Optional. |
stokes_U |
(nx, ny, nwave) |
Stokes U. Optional. |
stokes_V |
(nx, ny, nwave) |
Stokes V. Optional. |
tau_one_height |
(nx, ny, nwave) |
Height where optical depth reaches unity, for each column. Optional. |
wavelength_indices |
(wavelength_selected) |
Indices of wavelengths selected for variables below. Optional. |
chi |
(nx, ny, nz, wavelength_selected) |
Total opacity (line and continuum). Optional. |
source_function |
(nx, ny, nz, wavelength_selected) |
Total opacity (line and continuum). Optional. |
Jlambda |
(nx, ny, nz, wavelength_selected) |
Angle-averaged radiation field. Optional. |
All units are in SI. The intensity
, Stokes vector, source_function
, Jlambda
are all in J s-1m-2Hz-1sr-1. The wavelength
is in nm, air or vacuum units, depending if VACUUM_TO_AIR
is TRUE
or FALSE
(in keyword.input
). chi
is in m-1and tau_one_height
in m.
Despite internally being calculated in double precision, all the output (except the wavelength scale) is written in single precision to save disk space.
The full Stokes vector is only written when in keyword.input
STOKES_MODE
is not NO_STOKES
and the STOKES_INPUT
is set.
The chi
, source_function
, and Jlambda
variables depend on the 3D grid and on wavelength. Therefore, for even moderate grid sizes they can take huge amounts of space. If nx = ny = nz = 512
and wavelength_selected = 200
, each of these variables will need 100Gb of disk space. For a simulation with a cubic grid of 10243 points and saving the full output for 1000 wavelength points, output_ray.hdf5
will occupy a whopping 12Tb per snapshot of disk space. To avoid such problems, these large arrays are only written when ray.input
contains Nsource > 0
, and for the wavelengths selected.
Note
As noted above, arrays of nz
will have the first values missing if 15D_DEPTH_ZCUT
is used. The variables chi
, source_function
, and Jlambda
are an exception to this rule. For performance reasons these missing values are not marked with a fill value, but instead they are filled with zeros.
The output_ray.hdf5
file contains the following global attributes:
atmosID |
Identifier for the atmosphere file |
snapshot_number |
Number of simulation snapshot (from atmosphere file) |
rev_id |
Revision identifier |
creation_time |
Local time when file was created |
5.2. Reading the output files¶
HDF5 is an open, platform-independent format, and therefore interfaces to many programming languages are available. The main interface libraries are available in C, C++, Fortran, and Java. But there are also interfaces for Python (h5py), Julia, IDL (from version 6.2), MATLAB , Octave, Perl, and R.
The RH 1.5D output files can be read with standard HDF5 or NetCDF 4 readers: in most cases one needs to specify only the variable or group name. The HDF5 library provides useful command line tools, which can be used to gather information about the RH 1.5D files or extract data. Additionally, there is a more complete set of tools written in Python to read and analyse these files. Interfaces in other languages are also planned.
Warning
Because of the limitations of different languages, not all interfaces support all HDF5 features. IDL in particular does not support masked arrays. This means that when reading variables with missing data (see Output file structure), IDL will happily read all the data with no warning or indication of those that have special fill values.
5.2.1. Command line tools¶
Two useful command line tools that come with HDF5 are h5dump and h5repack.
As shown above, h5dump
can be used with the -H
option to look at the header of a file: see the dimensions, variables, groups. It can also be used to print a text version of any variable in an HDF5 file (e.g. this can be redirected to a text file). When printing a variable one uses the option -v variable
, and the resulting output is the same as in the -H
mode, with the variable printed at the end.
The h5repack
program can be used to copy and modify the parameters of HDF5 files. It can convert the files between different format versions, compress variables, etc. Of particular importance is the option for rechunking a file. Chunking in HDF5 files can be used to improve performance by changing the disk structures to improve different read patterns. It is analogous to fully or partially transposing the variables along certain dimensions.
See also
- h5dump guide
- Detailed information about
h5dump
. - h5repack guide
- Detailed information about
h5repack
. - Chunking in HDF5
- Description on the advantages of chunking.
5.2.2. Python interface¶
Python routines to read the input, output, and more are available at rh/python/rh15d.py
. They require the numpy and h5py (or netCDF4) modules.
5.2.2.1. Reading output files¶
The main class to read the output is called Rh15dout
. It can be initiated in the following way:
>>> from rh15d import *
>>> rr = rh15d.Rh15dout()
--- Read ./output_aux.hdf5 file.
--- Read ./output_indata.hdf5 file.
--- Read ./output_ray.hdf5 file.
By default, it will look for the three files in the directory specified as main argument (defaults to current directory). Additionally, the methods read_aux(infile)
, read_indata(infile)
, and read_ray(infile)
can be used to manually load a file. The variables themselves are not read into memory, but are rather a memmap object (file pointer; only read when needed).
After loading the files, the Rh15dout
instance maps them into different classes. The ray
class contains all the variables that were saved in the output_ray.hdf5
as attributes:
>>> [a for a in dir(rr.ray) if a[0] != '_']
['intensity',
'params',
'stokes_Q',
'stokes_U',
'stokes_V',
'chi',
'source_function',
'scattering',
'Jlambda',
'tau_one_height',
'wavelength',
'wavelength_indices']
>>> rr.ray.intensity
<HDF5 dataset "intensity": shape (256, 256, 1626), type "<f4">
Several of the options from output_indata.hdf5
, along with many of the dimensions used, are gathered into a method called params
, saved as a dictionary. The mpi
and atmos
groups of output_indata.hdf5
, with all their variables are like ray
, added into classes with the same name that are methods of the Rh15dout
instance. For example:
>>> rr.mpi.convergence
<HDF5 dataset "convergence": shape (256, 256), type "<i8">
>>> rr.atmos.temperature
<HDF5 dataset "temperature": shape (256, 256, 400), type "<f4">
Likewise, the groups of output_aux.hdf5
are also added as classes that are methods of the main instance:
>>> [a for a in dir(rr.atom_CA) if a[0] != '_']
['Rij_continuum',
'Rij_line',
'Rji_continuum',
'Rji_line',
'params',
'populations',
'populations_LTE']
All the groups and variables are therefore automatically added, so they will adapt to any changes in the output files.
5.2.2.2. Reading input files¶
The HDF5Atmos
class can be used to read the input atmosphere files. It can be initiated in the following way:
>>> from rh15d import *
>>> atm = rh15d.HDF5Atmos('my_atmos.hdf5')
Inspection of the result reveals the variables, a dictionary called param
with basic data and the properties of the variables:
>>> [a for a in dir(atm) if a[0] != '_']
['B_x',
'B_y',
'B_z',
'close',
'closed',
'electron_density',
'file',
'hydrogen_populations',
'params',
'read',
'snapshot_number',
'temperature',
'velocity_z',
'write_multi',
'write_multi_3d',
'x',
'y',
'z']
>>> atm.params # Contains basic properties
{'boundary_bottom': 1,
'boundary_top': 0,
'description': 'Created with make_hdf5_atmos.on 2017-04-03 17:32:06.831811',
'has_B': 1,
'nhydr': 6,
'nt' : 4,
'nx': 512,
'ny': 512,
'nz': 400}
>>> atm.velocity_z
<HDF5 dataset "velocity_z": shape (4, 512, 512, 400), type "<f4">
This interface is read-only, no modifications to the atmosphere files are possible.
5.2.2.3. Writing input files¶
In rh15d.py
there is also a function to write the input atmosphere, assuming the user already has the required arrays at hand. Its function definition is:
def make_hdf5_atmos(outfile, T, vz, nH, z, x=None, y=None, Bz=None, By=None,
Bx=None, rho=None, ne=None, vx=None, vy=None, desc=None,
snap=None, boundary=[1, 0], comp=None, complev=None,
append=False):
"""
Creates HDF5 input file for RH 1.5D.
Parameters
----------
outfile : string
Name of destination. If file exists it will be wiped.
T : n-D array
Temperature in K. Its shape will determine the output
dimensions (can be 1D, 2D, or 3D).
vz : n-D array
Line of sight velocity in m/s. Same shape as T.
nH : n-D array
Hydrogen populations in m^-3. Shape is [nhydr, shape.T] where
nydr can be 1 (total number of protons) or more (level populations).
z : n-D array
Height in m. Can have same shape as T (different height scale
for each column) or be only 1D (same height for all columns).
ne : n-D array, optional
Electron density in m^-3. Same shape as T.
rho : n-D array, optional
Density in kg / m^-3. Same shape as T.
vx : n-D array, optional
x velocity in m/s. Same shape as T. Not in use by RH 1.5D.
vy : n-D array, optional
y velocity in m/s. Same shape as T. Not in use by RH 1.5D.
Bx : n-D array, optional
Magnetic field in x dimension, in Tesla. Same shape as T.
By : n-D array, optional
Magnetic field in y dimension, in Tesla. Same shape as T.
Bz : n-D array, optional
Magnetic field in z dimension, in Tesla. Same shape as T.
x : 1-D array, optional
Grid distances in m. Same shape as first index of T.
y : 1-D array, optional
Grid distances in m. Same shape as second index of T.
x : 1-D array, optional
Grid distances in m. Same shape as first index of T.
snap : array-like, optional
Snapshot number(s).
desc : string, optional
Description of file
boundary : Tuple, optional
Tuple with [bottom, top] boundary conditions. Options are:
0: Zero, 1: Thermalised, 2: Reflective.
append : boolean, optional
If True, will append to existing file (if any).
comp : string, optional
Options are: None (default), 'gzip', 'szip', 'lzf'.
complev : integer or tuple, optional
Compression level. Integer for 'gzip', 2-tuple for szip.
"""
Both zlib and szip compression are supported but again this is not recommended.
5.2.2.4. Writing wavelength files¶
Another utility function in rh15d.py
is make_wave_file
. It creates an RH wavelength file from a given array of wavelengths. It’s usage is documented in its function call:
def make_wave_file(outfile, start=None, end=None, step=None, new_wave=None,
ewave=None, air=True):
"""
Writes RH wave file (in xdr format). All wavelengths should be in nm.
Parameters
----------
start: number
Starting wavelength.
end: number
Ending wavelength (non-inclusive)
step: number
Wavelength separation
outfile: string
Name of file to write.
ewave: 1-D array, optional
Array of existing wavelengths. Program will make discard points
to make sure no step is enforced using these points too.
air: boolean, optional
If true, will at the end convert the wavelengths into vacuum
wavelengths.
"""
5.2.3. Other languages¶
While many other languages have interfaces to read HDF5 files, there are no specific routines for reading the output from RH 1.5D. Support for other languages may be added later as demand requires.
Under RH 1.5D idl/
directory is routine named read_ncdf_var.pro
. The function read_ncdf_var()
can be used to read variables from an HDF5 or netCDF4 file, e.g.:
IDL> data = read_ncdf_var("output_ray.hdf5", "intensity")
IDL> help, data
DATA FLOAT = Array[902, 512, 512]
IDL> pops = read_ncdf_var("output_aux.hdf5", "populations", groupname="atom_CA")
IDL> help, pops
POPS FLOAT = Array[400, 512, 512, 5]
5.3. Analysis tools¶
Note
There is no organised package of analysis tools for the output of RH 1.5D. This should be added in the future. The IDL analysis suite of RH does not work with RH 1.5D.