# Work functions

A work function is the simpler of the two types of workflows in AiiDA. It can call one or more calculation functions and return data that those calculation functions have created. Moreover, work functions can also call other work functions, allowing you to write nested workflows.

In this section, you will learn how to:

1. Add simple Python functions to the provenance.

2. Write and launch a simple workflow in AiiDA.

## Calculation functions

Calculation functions are a convenient way to record the Python steps of your scientific workflow in the AiiDA provenance. To do so, add the `calcfunction` decorator to the Python function. A simple example is the `multiply` calculation function from the AiiDA basics section:

```
from aiida.engine import calcfunction

@calcfunction
def multiply(x, y):
    return x * y
```

In a sense, this example is deceptively simple. Let’s consider a slightly more complicated example: a `rescale` function that takes an ASE `Atoms` structure and rescales the unit cell with a certain `scale` factor:

```
def rescale(structure, scale):

    new_cell = structure.get_cell() * scale
    structure.set_cell(new_cell, scale_atoms=True)

    return structure
```

Open a `verdi shell` or Jupyter notebook (with the AiiDA magic: `%aiida`) and use the code snippet above to define the `rescale` function. Next, load any `StructureData`, for example using the `QueryBuilder`:

```
In : from aiida.orm import StructureData
...: structure = QueryBuilder().append(StructureData).first()[0]
```

To test the function, we need to convert the `StructureData` into an ASE `Atoms` instance. This is easily done with the `get_ase()` method:

```
In : ase_structure = structure.get_ase()
```

Let’s have a look at what structure we found:

```
In : ase_structure
Out: Atoms(symbols='NaNbO3', pbc=True, cell=[3.9761497211, 3.9761497211, 3.9761497211], masses=...)
```

Next, use the `rescale` function to double the lattice vectors of the unit cell:

```
In : rescale(ase_structure, 2)
Out: Atoms(symbols='NaNbO3', pbc=True, cell=[7.9522994422, 7.9522994422, 7.9522994422], masses=...)
```
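The doubling can also be checked by hand. The following is a minimal sketch in plain Python, not ASE: `rescale_cell` is a hypothetical helper, the cubic cell uses the lattice constant printed above, and the two atomic positions are made up for illustration. It assumes a uniform scale factor, in which case keeping the fractional coordinates fixed (what `scale_atoms=True` asks for) amounts to multiplying the Cartesian positions by the same factor.

```python
def rescale_cell(cell, positions, scale):
    """Return uniformly rescaled copies of a cell and its Cartesian positions.

    Hypothetical helper for illustration only; it does not use ASE.
    """
    # Scale every component of every lattice vector.
    new_cell = [[component * scale for component in vector] for vector in cell]
    # With a uniform scale factor, fixed fractional coordinates mean the
    # Cartesian positions are scaled by the same factor.
    new_positions = [[coord * scale for coord in position] for position in positions]
    return new_cell, new_positions


a = 3.9761497211  # lattice constant taken from the output above
cell = [[a, 0.0, 0.0], [0.0, a, 0.0], [0.0, 0.0, a]]
positions = [[0.0, 0.0, 0.0], [a / 2, a / 2, a / 2]]  # made-up positions

new_cell, new_positions = rescale_cell(cell, positions, 2)
print(new_cell[0][0])
```

Unlike the `rescale` function above, which modifies the `Atoms` object it receives, this sketch returns new lists and leaves its inputs untouched.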

Great! That all seems to be working as expected. Now it’s time to convert our Python function into a calculation function.

### Working with nodes

Try to adapt the `rescale` function above into a calculation function by adding a `calcfunction` decorator:

```
from aiida.engine import calcfunction

@calcfunction
def rescale(structure, scale):

    new_cell = structure.get_cell() * scale
    structure.set_cell(new_cell, scale_atoms=True)

    return structure
```

Maybe you already see why just adding the `calcfunction` decorator is not sufficient. Trying to run the function again with the `ase_structure` and the scaling factor `2` will fail, since neither is a `Data` node:

```
In : rescale(ase_structure, 2)
(...)
ValueError: Error occurred validating port 'inputs.structure': value 'structure' is not of the right type.
Got '<class 'ase.atoms.Atoms'>', expected '(<class 'aiida.orm.nodes.data.data.Data'>,)'
```

However, passing the originally imported `StructureData` stored in `structure` and `Float(2)` won’t work either:

```
In : rescale(structure, Float(2))
(...)
AttributeError: 'StructureData' object has no attribute 'get_cell'
```

The reason for these failures is that we need to adjust the `rescale` function further, so that it both accepts AiiDA nodes as inputs and returns an AiiDA node:

```
from aiida.engine import calcfunction

@calcfunction
def rescale(structure, scale):
    """Calculation function to rescale a structure

    :param structure: An AiiDA `StructureData` to rescale
    :param scale: The scale factor (for the lattice constant)
    :return: The rescaled structure
    """
    from aiida.orm import StructureData

    ase_structure = structure.get_ase()
    scale_value = scale.value

    new_cell = ase_structure.get_cell() * scale_value
    ase_structure.set_cell(new_cell, scale_atoms=True)

    return StructureData(ase=ase_structure)
```

Let’s explain the required changes in more detail:

```
    from aiida.orm import StructureData
```

Here the `StructureData` class is imported, since we need it later to convert the ASE `Atoms` structure into a `StructureData` node so we can output it.

```
    ase_structure = structure.get_ase()
    scale_value = scale.value
```

These two lines simply convert the inputs, which have to be AiiDA nodes, into the corresponding ASE `Atoms` structure and the Python `float` base type that we need to scale the unit cell.

```
    return StructureData(ase=ase_structure)
```

After the `ase_structure` has been rescaled, we need to convert it back into a `StructureData` node that is then returned by the `rescale` function as an output.

So, in reality we have to do two things in order to adapt a regular Python function into a calculation function that can be tracked in the provenance:

1. Add the `calcfunction` decorator.

2. Make sure the function expects and returns AiiDA `Data` nodes. This often involves converting the input nodes into other Python objects, and converting the result of the analysis back into an AiiDA `Data` node.
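The two requirements above can be mimicked in a toy model. The sketch below does not use AiiDA and is not how `calcfunction` is implemented: `ToyData`, `toy_calcfunction` and `rescale_length` are hypothetical names introduced only to show the pattern of unwrapping the input nodes, computing with plain Python values, and wrapping the result again.

```python
import functools


class ToyData:
    """Hypothetical stand-in for a data node: it only wraps a plain value."""

    def __init__(self, value):
        self.value = value


def toy_calcfunction(func):
    """Toy decorator that only accepts and returns `ToyData` instances."""

    @functools.wraps(func)
    def wrapper(*args):
        # Reject plain Python values as inputs.
        for arg in args:
            if not isinstance(arg, ToyData):
                raise ValueError(f"value {arg!r} is not of the right type")
        result = func(*args)
        # Reject plain Python values as output.
        if not isinstance(result, ToyData):
            raise ValueError("the function must return a ToyData instance")
        return result

    return wrapper


@toy_calcfunction
def rescale_length(length, scale):
    # Convert the input "nodes" into plain Python values.
    length_value = length.value
    scale_value = scale.value
    # Do the actual work with plain Python values.
    new_length = length_value * scale_value
    # Convert the result back into a "node" before returning it.
    return ToyData(new_length)


result = rescale_length(ToyData(3.0), ToyData(2))
print(result.value)
```

Calling `rescale_length(3.0, 2)` with plain numbers raises a `ValueError` in this toy model, which is the same kind of failure as the first `rescale` attempt above.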

### Exercises

(1) Run the calculation function version of `rescale` with AiiDA nodes as inputs. Convert the output `StructureData` node back into an ASE `Atoms` structure. Is the result what you expected?

(2) Why was the `multiply` function so deceptively simple? That is, why was conversion to/from AiiDA nodes not an issue there?

(3) Since calculation functions are tracked in the provenance, you should be able to find those you have just run using the `verdi process list` command. If you’ve tried the incorrect `rescale` calculation function above, this list will contain one `Excepted` result. Use what you’ve learned in the Troubleshooting module to figure out what went wrong here.

## Writing a work function

Writing a work function whose provenance is automatically stored can be achieved by writing a Python function and decorating it with the `workfunction()` decorator:

```
"""Basic calcfunction-based workflows for demonstration purposes."""
from aiida.engine import calcfunction, workfunction

@calcfunction
def add(x, y):
    return x + y

@calcfunction
def multiply(x, y):
    return x * y

@workfunction
def add_multiply(x, y, z):
    """Add two numbers and multiply it with a third."""
    addition = add(x, y)
    product = multiply(addition, z)
    return product
```

It is important to reiterate here that the `workfunction()`-decorated `add_multiply()` function does not create any new data nodes. The `add()` and `multiply()` calculation functions create the `Int` data nodes; all the work function does is return the result of the `multiply()` calculation function. Moreover, both calculation and work functions can only accept and return data nodes, i.e. instances of classes that subclass the `Data` class.

Copy the code snippet above and put it into a Python file (e.g. `add_multiply.py`), or download it directly using the link next to it. In the terminal, navigate to the folder where you stored the script. Next, import the `add_multiply` work function in the `verdi shell`:

```
In : from add_multiply import add_multiply
```

As with a calculation function, running a work function is as simple as calling a regular Python function with the required input arguments:

```
In : result = add_multiply(Int(2), Int(3), Int(5))
```

Here, the `add_multiply` work function returns the output `Int` node and we assign it to the variable `result`. Again, note that each input argument of a work function must be an instance of `Data` or one of its subclasses. Calling the `add_multiply` function with regular integers will result in a `ValueError`, as these cannot be stored in the provenance graph.

We can now list all processes from the past day, regardless of their state:

```
$ verdi process list -a -p 1
PK  Created    Process label    Process State    Process status
----  ---------  ---------------  ---------------  ----------------
...
1859  1m ago     add_multiply     ⏹ Finished 
1860  1m ago     add              ⏹ Finished 
1862  1m ago     multiply         ⏹ Finished 
```

Copy the PK of the `add_multiply` work function and check its status with `verdi process status` (in the above example, the PK is `1859`):

```
$ verdi process status <PK>
add_multiply<1859> Finished
    ├── add<1860> Finished
    └── multiply<1862> Finished
```

Finally, you can also check the details of the inputs and outputs of the work function:

```
$ verdi process show <PK>
```

Notice that each input and output of the work function `add_multiply` is stored as a node, and that the work function has `CALLED` both the `add` and `multiply` calculation functions:

```
Property     Value
-----------  ------------------------------------
state        Finished 
pk           1859
uuid         c65df725-6065-40ec-8343-6ee9ef68ca9a
description
ctime        2021-06-07 14:48:06.342948+00:00
mtime        2021-06-07 14:48:06.835870+00:00

Inputs      PK  Type
--------  ----  ------
x         1856  Int
y         1857  Int
z         1858  Int

Outputs      PK  Type
---------  ----  ------
result     1863  Int

Called      PK  Type
--------  ----  --------
CALL      1860  add
CALL      1862  multiply
```

### Exercise

Let’s look at multiple ways to generate the provenance graph and what this can teach us.

(1) Generate the provenance graph of the `add_multiply` work function without any additional options. Does anything seem missing here?

(2) Try to generate the provenance graph again, but this time with the `-i, --process-in` option. You can use `verdi node graph generate -h` for more information about the various options of this command.

(3) Finally, try to generate the data provenance by:

1. Targeting the `multiply` calculation function instead of the `add_multiply` work function.

2. Using the `-l, --link-types` option to select the `data` links only.