Falaise
4.0.1
SuperNEMO Software Toolkit
If you have just started using Falaise or the FLReconstruct application, we strongly recommend that you familiarize yourself with the basic usage of FLReconstruct covered in The FLReconstruct Application.
FLReconstruct uses a pipeline pattern to process events. You can view this as a production line with each stage on the line performing some operation on the event. Each stage in the pipeline is called a "pipeline module" (or just "module") and is implemented as a C++ class. The FLReconstruct application can load new modules at runtime using a "plugin" mechanism. Scripting, as demonstrated in the tutorial on using FLReconstruct, is used to load new modules from plugins, select the modules to use in the pipeline, and configure each module.
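In this tutorial we will see how to implement our own modules for use in the FLReconstruct pipeline. This will cover writing a minimal module class, building it into a plugin library with CMake, loading and running it in flreconstruct, making the module configurable from the pipeline script, validating that configuration, and linking the module to additional libraries.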
Getting your module to actually do something with the events that are passed to it is deferred to a later tutorial.
We begin by creating an empty directory to hold the source code for our example module, which we'll name "MyModule".
You are free to organise the source code under this directory as you see fit. In this very simple case we will just place all files in the MyModule directory without any subdirectories. We start by creating the implementation file for the C++ class, which we'll name MyModule.cpp.
Here we can see the minimal interface and infrastructure required by a module class for flreconstruct. The class must implement:
- a constructor taking two arguments:
  - falaise::config::property_set const&, the configuration supplied to the module by the pipeline script
  - datatools::service_manager&, flreconstruct's service provider
- a member function process taking a single datatools::things& input parameter and returning a falaise::processing::status enumeration.
To make the plugin we'll build from this code loadable by flreconstruct, we must also use the FALAISE_REGISTER_MODULE macro, passing it the class's typename. The typename also becomes the string used to create a module of this type in an flreconstruct pipeline script.
The non-default constructor is responsible for initializing the module using, if required, the information supplied in the falaise::config::property_set and datatools::service_manager objects. Our basic module doesn't require any configuration or service information so we simply ignore these arguments. Later tutorials will cover module configuration and use of services by modules.
The process member function performs the actual operation on the event, which is represented by a datatools::things instance. The event is passed via non-const reference so process can both read data from it and write data to it. As noted above, a later tutorial will cover the interface and use of datatools::things. We therefore don't do anything with the event, and simply write a message to standard output so that we'll be able to see the method being called in flreconstruct. process must return a processing exit code. In this case, our processing is always successful, so we return falaise::processing::status::PROCESS_OK.
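The interface described above can be sketched as follows. This is only an illustration: the types in the sketch namespace are hypothetical stand-ins so that the example compiles without Falaise installed, and the FALAISE_REGISTER_MODULE call is left out because it needs Falaise's headers. A real MyModule.cpp would use the genuine Falaise and Bayeux types instead.

```cpp
#include <iostream>

// Hypothetical stand-ins for the types named in this tutorial, so the
// sketch builds without Falaise. They carry no real behaviour.
namespace sketch {
struct property_set {};     // stands in for falaise::config::property_set
struct service_manager {};  // stands in for datatools::service_manager
struct things {};           // stands in for datatools::things
enum class status { PROCESS_OK, PROCESS_ERROR };  // stands in for falaise::processing::status
}  // namespace sketch

class MyModule {
 public:
  // Default constructor; the tutorial's reference to a "non-default
  // constructor" suggests one exists alongside it.
  MyModule() = default;

  // Receives the module configuration and the service provider.
  // This basic module needs neither, so both arguments are ignored.
  MyModule(sketch::property_set const& /*ps*/,
           sketch::service_manager& /*services*/) {}

  // Called for each event. The event arrives by non-const reference so a
  // real module could read from it and write to it; here it is untouched.
  sketch::status process(sketch::things& /*event*/) {
    std::cout << "MyModule::process called!\n";
    return sketch::status::PROCESS_OK;
  }
};

// In the real source file, FALAISE_REGISTER_MODULE(MyModule) would follow
// the class definition to make the module loadable by flreconstruct.
```

Constructing a MyModule by hand and calling process once mimics what the pipeline does for a single event.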
With the source code for MyModule in place we need to build a shared library from it that flreconstruct can load at runtime to make MyModule usable in a pipeline. As MyModule uses components from Falaise, the compilation needs to use its headers, libraries and dependencies. The simplest way to set this up is to use CMake to build the shared library and make use of Falaise's find_package support.
To do this, we add a CMake script alongside the sources:
The implementation of CMakeLists.txt is very straightforward:
Comments begin with a #. The first two commands simply set up CMake and the compiler for us. The find_package command will locate Falaise for us, and we supply the REQUIRED argument to ensure CMake will fail if a Falaise install cannot be found. The add_library command creates the actual shared library. Breaking the arguments to add_library down one by one:
- MyModule: the name of the library, which will be used to create the on-disk name. For example, on Linux this will output a library file libMyModule.so, and on Mac OS X a library file libMyModule.dylib.
- SHARED: the type of the library, in this case a dynamic library.
- MyModule.cpp: all the sources needed to build the library.
Finally, the target_link_libraries command links the shared library to Falaise's Falaise::FalaiseModule target. This ensures that compilation and linking of the MyModule target will use the correct compiler and linker flags for use of Falaise. The flreconstruct application makes a default set of libraries available, and if you require use of additional ones, CMake must be set up to find and use these. This is documented later in this tutorial.
For more detailed documentation on CMake, please refer to its online help.
To build the library, we first create a so-called build directory to hold the files generated by the compilation to isolate them from the source code. This means we can very quickly delete and recreate the build without worrying about deleting the primary sources (it also helps to avoid accidental commits of local build artifacts to Git!). This directory can be wherever you like, but it's usually most convenient to create it alongside the directory in which the sources reside. In this example we have the directory structure:
so we create the build directory under /path/to/MyWorkSpace as
The first step of the build is to change into the build directory and run cmake to configure the build of MyModule:
Here, the CMAKE_PREFIX_PATH argument should be the directory under which Falaise was installed. If you installed Falaise using brew and are using the snemo-shell environment then you will not need to set this. The last argument ../MyModule points CMake to the directory holding the CMakeLists.txt file for the project we want to build, in this case our custom module.
Running the command will produce output that is highly system dependent, but you should see something along the lines of
The exact output will depend on which compiler and platform you are using. However, the last three lines are common apart from the path, and indicate a successful configuration. Listing the contents of the directory shows that CMake has generated a Makefile for us:
To build the library for our module we simply run make:
If the build succeeds, we now have the shared library present in our build directory:
Note that the extension of the shared library is platform dependent (.dylib for Mac, .so on Linux). With the library built, we now need to make flreconstruct aware of it so we can use MyModule in a pipeline.
To use our new module in flreconstruct we need to tell the application about it before using it in a pipeline. We do this in the pipeline script we pass to flreconstruct, through the flreconstruct.plugins section, which tells flreconstruct about libraries to be loaded.
We create a script named MyModulePipeline.conf in our project directory:
This script takes the same basic form as shown in the tutorial on using flreconstruct:
The plugins key in the flreconstruct.plugins section is a list of strings naming the libraries to be loaded by flreconstruct at startup. These are taken as the "basename" of the library, from which the full physical file to be loaded, lib<basename>.{so,dylib}, is constructed. flreconstruct only searches for plugin libraries in its builtin location by default, so custom modules must set the <basename>.directory property to tell it the path under which their <basename> library is located.
In the above example, MyModule.directory : string = "." tells flreconstruct to look in the current working directory, i.e. the directory from which it was run, for the MyModule plugin. This is convenient for testing a local build of a module, as we can run flreconstruct directly from the build directory of our module and it will locate the library immediately. You can also specify absolute paths, e.g.
or paths containing environment variables which will be expanded automatically, e.g.
With the loading of the custom module in place, we can use it in the script as we did for the builtin modules. As we did in the trivial pipeline example for flreconstruct, we can simply declare the main pipeline module as being of the MyModule type, hence the line
Note that the type key value must always be the full typename of the module, as used in the FALAISE_REGISTER_MODULE macro. Remember that in MyModule.cpp we called the macro as:
thus type is just "MyModule".
We can now run flreconstruct with MyModulePipeline.conf as the pipeline script. Because we've specified the location of the MyModule library as the working directory, we first change to the directory in which this library resides, namely our build directory. We also need to have a file to process, so we run flsimulate first to create a simple file of one event (NB in the following, we assume you have flsimulate and flreconstruct in your PATH).
We can see that flreconstruct loaded the MyModule library, and the MyModule::process method was called, showing that the pipeline used our custom module! We can also add our module into a chain pipeline and other pipeline structures. For example, try the following pipeline script:
You should see each event being dumped, with the dumped info being bracketed by the "MyModule::process called!" text from each of the MyModule instances in the chain.
The minimal module presented in the section above outputs a fixed message which can only be changed by modifying the code and recompiling the module. In most use cases hard-coding like this is sufficient, but if your module has parameters that may change frequently (e.g. a threshold that requires optimization), it is easy to make them configurable at runtime through the pipeline script. To demonstrate this, we'll modify the MyModule class from earlier to have a single std::string type data member and make this configurable from the pipeline script.
To add a configurable data member to MyModule, we modify the code as follows:
The key changes are:
- the addition of a std::string data member message
- the use of message in the process member function
- the use of the ps argument passed to the user-defined constructor to extract configuration information
Here, message is our configurable parameter, and is initialized in the MyModule constructor using the falaise::config::property_set::get member function. We supply std::string as the template argument as that is the type we need, and message as the parameter ID to extract. This ID does not have to match the name of the data member, but it is useful to do so for clarity.
As configuration is always done through the constructor, you can then use configured data members just like any other. In this case we simply report the value of message to standard output in the process member function.
No special build setup is needed for a configurable module, so you can use the CMake script exactly as given for the basic module above. If you've made the changes as above, simply rebuild!
In the preceding section, we saw that module configuration is passed to a module through an instance of the falaise::config::property_set class. This instance is created by flreconstruct for the module from the properties, if any, supplied in the section of the pipeline script defining the module. To begin with, we can use the pipeline script from earlier to run the configurable module, simply adding the required string parameter message to its section:
The key name message and its type must match that looked for by MyModule's constructor in the supplied falaise::config::property_set. Allowed key/types and their mappings to C++ types are documented in a later section. The script can be run in flreconstruct as before:
We can see that the module has been run using the supplied value for the parameter. To change the message parameter, we simply update its value, e.g.
Having updated the value, we can rerun with the updated pipeline script:
and see that the parameter has been changed to the value defined in the script. Keys are bound to the section they are defined in, so we can use the same module type multiple times but with different parameters. For example, try the following pipeline script:
You should see each event being dumped, with the dumped info being bracketed by the output from each MyModule instance, each with different values of the message parameter.
Both flreconstruct and falaise::config::property_set work together to check that needed parameters are supplied and of the correct type. For example, if we did not supply the message parameter:
then flreconstruct will error out and tell us what happened:
Equally, if we supply the parameter but it has the wrong type:
then a similar error would be reported:
Additional methods for configuration and validation are covered in the following section.
Whilst the ability to make modules configurable is extremely useful, you should aim to minimize the number of parameters your module takes. This helps to make the module easier to use and less error prone. Remember that the modular structure of the pipeline means that tasks are broken down into smaller chunks, so you should consider refactoring complex modules into smaller orthogonal units.
An important restriction on configurable parameters is that they can only be of types understood by falaise::config::property_set and the underlying datatools::properties configuration language.
C++ Type | property_set accessor | properties script syntax |
---|---|---|
std::string | auto x = ps.get<std::string>("key"); | key : string = "hello" |
int | auto x = ps.get<int>("key"); | key : integer = 42 |
double | auto x = ps.get<double>("key"); | key : real = 3.14 |
bool | auto x = ps.get<bool>("key"); | key : boolean = true |
std::vector<std::string> | auto x = ps.get<std::vector<std::string>>("key"); | key : string[2] = "hello" "world" |
std::vector<int> | auto x = ps.get<std::vector<int>>("key"); | key : integer[2] = 1 2 |
std::vector<double> | auto x = ps.get<std::vector<double>>("key"); | key : real[2] = 3.14 4.13 |
std::vector<bool> | auto x = ps.get<std::vector<bool>>("key"); | key : boolean[2] = true false |
falaise::config::path | auto x = ps.get<falaise::config::path>("key"); | key : string as path = "/tmp/foo" |
falaise::config::quantity_t | auto x = ps.get<falaise::config::length_t>("key"); | key : real as length = 3.14 mm |
falaise::config::property_set | auto x = ps.get<falaise::config::property_set>("key"); | see below |
The last item handles the case of nested configurations, for example
The keys can be extracted individually from the resultant falaise::config::property_set, e.g.
However, nested configurations typically imply structured data, with periods indicating the nesting level. Each level can be extracted into its own set of properties, e.g.
with subsequent handling as required. A restriction on nesting is that it cannot support configurations such as
as the key "a" is ambiguous. You should not use this form in any case as it generally indicates bad design.
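The idea of period-separated keys forming nested levels can be illustrated with a small sketch. This is not how falaise::config::property_set is implemented: flat_config and extract_section are hypothetical names, values are kept as plain strings, and treating the ambiguous case as an error is an assumption made for the example.

```cpp
#include <map>
#include <stdexcept>
#include <string>

namespace sketch {
// Hypothetical flat configuration: full dotted key -> value as text.
using flat_config = std::map<std::string, std::string>;

// Collects every "<section>.<rest>" entry into a new map keyed by "<rest>".
// A key equal to the section name itself is rejected as ambiguous, since
// it would be both a plain value and a nested level.
inline flat_config extract_section(flat_config const& config,
                                   std::string const& section) {
  if (config.count(section) != 0) {
    throw std::logic_error("key '" + section + "' is both a value and a section");
  }
  std::string const prefix = section + ".";
  flat_config nested;
  for (auto const& entry : config) {
    // Keep only keys that start with the "<section>." prefix.
    if (entry.first.compare(0, prefix.size(), prefix) == 0) {
      nested[entry.first.substr(prefix.size())] = entry.second;
    }
  }
  return nested;
}
}  // namespace sketch
```

Applying extract_section repeatedly peels off one nesting level at a time, which is the pattern the text describes for handling each level as its own set of properties.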
When using falaise::config::property_set, you have several methods to validate the configuration supplied to your module. By validation, we mean checking that the configuration supplies all required parameters, that each parameter has the correct type, and that each value lies within its valid range.
All configuration and validation must be handled in the module's constructor, with exceptions thrown if a validation check fails. The first two checks can be handled automatically by falaise::config::property_set through its get member functions.
Parameters may be required, i.e. there is no sensible default, or optional, i.e. where we may wish to adjust the default. A required parameter is validated for existence and correct type by the single parameter get member function, e.g.
If the ps instance does not hold a parameter "message", or holds it with a type other than std::string, then an exception is thrown and will be handled automatically by flreconstruct.
An optional parameter is validated in the same way, but we use the two parameter form of get, e.g.
Here, if the ps instance does not hold a parameter "myparam" then the myparam data member will be initialized to 42. If ps holds parameter "myparam" of type int then myparam will be set to its value. If ps holds parameter "myparam" and it is not of type int, then an exception is thrown (and handled by flreconstruct as above). Both forms are particularly useful for parameters that supply physical quantities such as lengths. See the documentation on Falaise's System of Units for further information on their use to assist with dimensional and scaling correctness.
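The behaviour of the two get forms described above can be sketched with a small stand-in class. The class below is hypothetical and only models the three outcomes listed in the text (key absent, key present with the right type, key present with the wrong type); the exception types it throws are assumptions, not those used by Falaise. It needs a C++17 compiler for std::variant.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>

namespace sketch {
// Hypothetical value holder covering a few of the scalar types in the table.
using value = std::variant<bool, int, double, std::string>;

class property_set {
 public:
  explicit property_set(std::map<std::string, value> values)
      : values_(std::move(values)) {}

  // Required parameter: throws if the key is absent or has another type.
  template <typename T>
  T get(std::string const& key) const {
    auto it = values_.find(key);
    if (it == values_.end()) {
      throw std::out_of_range("missing parameter '" + key + "'");
    }
    return checked<T>(key, it->second);
  }

  // Optional parameter: the fallback is used only when the key is absent.
  // A key that is present with the wrong type is still an error.
  template <typename T>
  T get(std::string const& key, T const& fallback) const {
    auto it = values_.find(key);
    if (it == values_.end()) return fallback;
    return checked<T>(key, it->second);
  }

 private:
  template <typename T>
  static T checked(std::string const& key, value const& v) {
    if (auto const* held = std::get_if<T>(&v)) return *held;
    throw std::invalid_argument("parameter '" + key + "' has the wrong type");
  }

  std::map<std::string, value> values_;
};
}  // namespace sketch
```

With this stand-in, get<int>("myparam", 42) yields 42 when "myparam" is absent, the stored value when it is an int, and throws when it holds, say, a string.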
Additional validation tasks such as bounds checking must be handled manually, and generally within the body of the module's constructor. For example, if we have a required integer parameter that must be even, we could validate this via:
You should prefer to initialize parameter values in the constructor's initializer list, with further validation, if required, in the constructor body. Errors must be handled by throwing an exception derived from std::exception.
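A minimal sketch of this pattern for the "must be even" case is shown below. The stand-in property_set is hypothetical and holds only integers, and the parameter name myparam and the error message are chosen for illustration. What matters is the shape: the value is read in the initializer list, checked in the constructor body, and a failure is reported with an exception derived from std::exception (here std::invalid_argument).

```cpp
#include <map>
#include <stdexcept>
#include <string>

namespace sketch {
// Hypothetical stand-in for falaise::config::property_set: integers only.
struct property_set {
  std::map<std::string, int> values;

  // Mimics the required-parameter get<T>("key"); std::map::at throws
  // std::out_of_range when the key is absent.
  template <typename T>
  T get(std::string const& key) const {
    return values.at(key);
  }
};
struct service_manager {};  // stands in for datatools::service_manager
}  // namespace sketch

class MyModule {
 public:
  MyModule() = default;

  MyModule(sketch::property_set const& ps, sketch::service_manager& /*services*/)
      : myparam(ps.get<int>("myparam")) {  // existence and type checked here
    // Additional validation belongs in the constructor body.
    if (myparam % 2 != 0) {
      throw std::invalid_argument("MyModule: 'myparam' must be even, got " +
                                  std::to_string(myparam));
    }
  }

  int value() const { return myparam; }

 private:
  int myparam = 0;
};
```

Constructing the module with myparam set to 4 succeeds, while a value of 3 throws before the module can be used in a pipeline.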
The flreconstruct program provides the needed libraries to run core modules, specifically the minimal set:
Linking your module to the Falaise::FalaiseModule target in target_link_libraries ensures that your module uses the appropriate headers at compile time, and the correct symbols at runtime. If your module requires use of additional libraries, then you will need to get CMake to locate these and then link them to your module.
In the most common case, using non-core libraries from the ROOT package, the find_package step would be modified to:
The module can then be linked to the additional library by adding it in the target_link_libraries command:
For other packages, find_package followed by target_link_libraries can be used in the same fashion.
The above examples have illustrated the basic structures needed to implement a module and load it into flreconstruct.
Practical modules will access the event object passed to them, process it and then write information back into the event record. Using the event data model in modules is covered in a dedicated tutorial.
Modules may also need access to global data such as run conditions. FLReconstruct uses the concept of "Services" to provide such data, and a tutorial on using services in modules is provided.
Modules should also always be documented so that users have a clear picture of the task performed by the module and its configurable parameters. A tutorial on documenting modules using the builtin Falaise/Bayeux tools is available.
Though modules for FLReconstruct may not be directly integrated in Falaise, for consistency and maintainability their code should use the Falaise coding standards.