# Configuration file and processing options

This page describes how nabu ingests a configuration file.

A configuration file describes the steps to perform in the processing pipeline.
Nabu parses this configuration file and builds an internal "pipeline description".
To do so, two things are needed: the configuration file itself, and the dataset.


## Stage 1: parsing and syntax validation

In the first stage, the configuration file is parsed and syntactically checked (e.g. key names, types of values).  
Separately, the dataset is browsed generically to extract metadata (image shape, number of images, etc.).  
These two steps are performed independently.

```
(config file) ----> [config parser | syntax validation] --> nabu_config

(dataset)   ------> [dataset analyzer] ---> dataset_info
		reduction on flats/darks
```

This produces two data structures:
  - `dataset_info`: a class containing information on the dataset to process
  - `nabu_config`: a dictionary reflecting the user configuration file. This structure is not modified as it should reflect the user input.
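
As an illustration of the "config parser | syntax validation" box, here is a minimal sketch that assumes an INI-style file and uses only the Python standard library. The schema, the section and key names, and the `parse_and_validate` helper are hypothetical and are not nabu's actual key list or API.

```python
import configparser


def parse_bool(text):
    """Convert common textual booleans to bool, raising on anything else."""
    lowered = text.strip().lower()
    if lowered in ("1", "true", "yes", "on"):
        return True
    if lowered in ("0", "false", "no", "off"):
        return False
    raise ValueError(f"not a boolean: {text!r}")


# Hypothetical schema (section -> key -> converter); not nabu's real keys.
SCHEMA = {
    "dataset": {"location": str},
    "preproc": {"flatfield": parse_bool},
    "reconstruction": {"start_z": int, "end_z": int},
}


def parse_and_validate(text):
    """Parse INI text, then check key names and value types against SCHEMA."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    nabu_config = {}
    for section in parser.sections():
        if section not in SCHEMA:
            raise ValueError(f"unknown section: {section}")
        nabu_config[section] = {}
        for key, raw in parser.items(section):
            if key not in SCHEMA[section]:
                raise ValueError(f"unknown key: {section}/{key}")
            try:
                nabu_config[section][key] = SCHEMA[section][key](raw)
            except ValueError as exc:
                raise ValueError(f"bad value for {section}/{key}: {raw!r}") from exc
    return nabu_config


EXAMPLE = """
[dataset]
location = /path/to/scan.nx

[preproc]
flatfield = yes

[reconstruction]
start_z = 0
end_z = 99
"""

nabu_config = parse_and_validate(EXAMPLE)
print(nabu_config)
```

Note that this stage only looks at the file: whether `end_z` fits the actual image height cannot be known until the dataset metadata is available.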

## Stage 2: update `dataset_info` with user configuration

The above steps (configuration parsing and dataset browsing) were done independently.
In this stage, the `dataset_info` data structure is updated with the user configuration.
For example:
  - Overwrite the dataset flats/darks with user-provided images
  - Re-define rotation angles
  - Define sample (or detector) movements to be corrected for

```
(nabu_config, dataset_info)  --> [update] --> (nabu_config, updated dataset_info)
```
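
The update step can be sketched as follows. The `DatasetInfo` dataclass, its fields, and the `rotation_angles`/`flats` configuration keys are hypothetical stand-ins chosen for illustration. Returning a modified copy rather than mutating in place is also a choice of this sketch, not a statement about nabu's implementation.

```python
from dataclasses import dataclass, field, replace


@dataclass
class DatasetInfo:
    """Hypothetical stand-in for the dataset metadata structure."""
    n_angles: int
    rotation_angles: list
    flats: dict = field(default_factory=dict)
    darks: dict = field(default_factory=dict)


def update_dataset_info(nabu_config, dataset_info):
    """Return a copy of dataset_info with user overrides applied.

    nabu_config is only read, never modified.
    """
    user = nabu_config.get("dataset", {})
    changes = {}
    angles = user.get("rotation_angles")
    if angles is not None:
        if len(angles) != dataset_info.n_angles:
            raise ValueError("user angles do not match the number of projections")
        changes["rotation_angles"] = list(angles)
    flats = user.get("flats")
    if flats is not None:
        changes["flats"] = dict(flats)
    return replace(dataset_info, **changes)


info = DatasetInfo(n_angles=3, rotation_angles=[0.0, 90.0, 180.0])
cfg = {"dataset": {"rotation_angles": [0.0, 60.0, 120.0]}}
updated = update_dataset_info(cfg, info)
print(updated.rotation_angles)
```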


## Stage 3: estimations

This stage performs estimations such as center of rotation (CoR) and camera tilt.

This stage must run before the final validation, because the final checks need actual numerical values for the estimated parameters.

```
(nabu_config, dataset_info) ---> [estimators]  --->  (nabu_config, updated dataset_info)
				CoR estimation
				tilt estimation
```
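
The dispatch logic of such a stage might look like the sketch below. Plain dictionaries stand in for both data structures, the key `rotation_axis_position` is a hypothetical name, and the estimator is a deliberately trivial placeholder (detector center), not a real CoR estimation method.

```python
def estimate_cor_placeholder(dataset_info):
    """Placeholder estimator: assume the axis is at the detector's horizontal center."""
    return (dataset_info["image_width"] - 1) / 2.0


# Hypothetical mapping: parameter name -> function estimating it from the dataset.
ESTIMATORS = {"rotation_axis_position": estimate_cor_placeholder}


def run_estimators(nabu_config, dataset_info):
    """Return a copy of dataset_info where every estimable parameter is numeric.

    A numeric value given by the user is kept; anything else (missing, "auto",
    ...) triggers the estimator. nabu_config itself is left untouched.
    """
    updated = dict(dataset_info)
    user_values = nabu_config.get("reconstruction", {})
    for key, estimator in ESTIMATORS.items():
        requested = user_values.get(key)
        if isinstance(requested, (int, float)):
            updated[key] = float(requested)
        else:
            updated[key] = estimator(dataset_info)
    return updated


dataset_info = {"image_width": 2048}
auto_result = run_estimators({"reconstruction": {"rotation_axis_position": "auto"}}, dataset_info)
user_result = run_estimators({"reconstruction": {"rotation_axis_position": 1000}}, dataset_info)
print(auto_result["rotation_axis_position"], user_result["rotation_axis_position"])
```

After this step every estimable parameter holds a number, which is what allows the validation stage to perform range checks on it.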


## Stage 4: validation

In this stage, the two data structures `dataset_info` and `nabu_config` are checked against each other.
The purpose is to check that each pipeline step (defined by the configuration file) is compatible with the current dataset. For example, flat-field normalization (requested by the user) cannot take place if the dataset contains no flats/darks.

```
(nabu_config, dataset_info)  ---> [validation]          --------->  (nabu_config, updated dataset_info)
				check that config file values
				are consistent with the dataset
				(ex. start_xyz/end_xyz, etc)
```
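
A minimal sketch of such cross-checks, covering the two examples mentioned above. Dictionaries stand in for both structures, and the key names (`flatfield`, `start_z`, `end_z`, `image_height`) are assumptions made for the example.

```python
def validate(nabu_config, dataset_info):
    """Return a list of incompatibilities between the configuration and the dataset."""
    errors = []

    # Flat-field normalization needs both flats and darks in the dataset.
    if nabu_config["preproc"]["flatfield"]:
        if not dataset_info["flats"] or not dataset_info["darks"]:
            errors.append("flatfield requested but the dataset has no flats/darks")

    # The requested vertical range must fit inside the images.
    start_z = nabu_config["reconstruction"]["start_z"]
    end_z = nabu_config["reconstruction"]["end_z"]
    if not (0 <= start_z <= end_z < dataset_info["image_height"]):
        errors.append(
            f"start_z/end_z = {start_z}/{end_z} outside image height "
            f"{dataset_info['image_height']}"
        )
    return errors


config = {
    "preproc": {"flatfield": True},
    "reconstruction": {"start_z": 0, "end_z": 99},
}
good_dataset = {"flats": {0: "flat_0"}, "darks": {0: "dark_0"}, "image_height": 100}
bad_dataset = {"flats": {}, "darks": {}, "image_height": 50}

print(validate(config, good_dataset))
print(validate(config, bad_dataset))
```

Collecting all problems in a list, rather than raising on the first one, lets the user fix the configuration file in one pass.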

## Stage 5: build pipeline description

This produces an internal object `ProcessConfig`, made of the following data structures:
  - `processing_steps`: a list of processing step names
  - `processing_options`: a dictionary describing the options of each processing step. It is derived from `nabu_config`, with updated values (e.g. from estimators)

A `ProcessConfig` instance can be fed directly into a `Pipeline` object. This means it can be saved for later reuse, without going through all the above steps again. However, keep in mind that this internal representation might change, whereas the user-exposed configuration file is stable.
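
To make the two data structures concrete, here is a sketch of how they could be assembled and saved. `ProcessConfigSketch` and `build_process_config` are hypothetical names and do not reproduce the real `ProcessConfig` class. The step names, the option keys, and the use of JSON for saving are likewise assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ProcessConfigSketch:
    """Hypothetical, simplified stand-in for the pipeline description."""
    processing_steps: list
    processing_options: dict


def build_process_config(nabu_config, estimates):
    """Derive step names and per-step options from the config and estimated values."""
    steps, options = [], {}
    if nabu_config["preproc"]["flatfield"]:
        steps.append("flatfield")
        options["flatfield"] = {}
    steps.append("reconstruction")
    # User values first, then estimated values override placeholders such as "auto".
    options["reconstruction"] = {**nabu_config["reconstruction"], **estimates}
    return ProcessConfigSketch(steps, options)


config = {
    "preproc": {"flatfield": True},
    "reconstruction": {"rotation_axis_position": "auto", "start_z": 0, "end_z": 99},
}
process_config = build_process_config(config, {"rotation_axis_position": 1023.5})

# Save and reload the description without re-running the earlier stages.
saved = json.dumps(asdict(process_config))
reloaded = ProcessConfigSketch(**json.loads(saved))
print(reloaded.processing_steps)
```

The original `config` dictionary is left unchanged: the estimated value only appears in `processing_options`.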