# Querying for data

Important

The “Querying for data” module is written as a Jupyter notebook. This page shows a rendered version of the notebook; to work through it yourself, download the `.ipynb` file with the following command:

```
$ wget https://filedn.com/lsOzB8TTUIDz2WkFj8o6qhp/permanent/querying.ipynb
```

Note that not all cells are complete! Many are exercises for you to solve.

# Querying for data

The notebook shows how to use the `QueryBuilder` to query your database for specific data. It demonstrates concepts that you can then apply to queries on your own database. Some cells pose exercises and contain partial solutions that you will have to complete.

Important

Make sure to execute the cell below this one (it may be hidden).

```
from IPython.display import Image
from datetime import datetime, timedelta
import numpy as np
from matplotlib import gridspec, pyplot as plt
from aiida.orm import load_node, Node, Group, Computer, User, CalcJobNode, Code
from aiida.plugins import CalculationFactory, DataFactory

PwCalculation = CalculationFactory('quantumespresso.pw')
StructureData = DataFactory('structure')
KpointsData = DataFactory('array.kpoints')
Dict = DataFactory('dict')
UpfData = DataFactory('upf')

def plot_results(query_res):
    """
    :param query_res: The result of an instance of the QueryBuilder
    """
    smearing_unit_set, magnetization_unit_set, pseudo_family_set = set(), set(), set()
    # Storing results:
    results_dict = {}
    for pseudo_family, formula, smearing, smearing_units, mag, mag_units in query_res:
        if formula not in results_dict:
            results_dict[formula] = {}
        # Storing the results:
        results_dict[formula][pseudo_family] = (smearing, mag)
        # Adding to the unit sets (needed for the axis labels and checks below):
        smearing_unit_set.add(smearing_units)
        magnetization_unit_set.add(mag_units)
        pseudo_family_set.add(pseudo_family)

    # Sorting by formula:
    sorted_results = sorted(results_dict.items())
    formula_list = next(zip(*sorted_results))
    nr_of_results = len(formula_list)

    # Checks that I have not more than 3 pseudo families.
    # If more are needed, define more colors
    #pseudo_list = list(pseudo_family_set)
    if len(pseudo_family_set) > 3:
        raise Exception('I was expecting 3 or less pseudo families')

    colors = ['b', 'r', 'g']

    # Plotting:
    plt.clf()
    fig = plt.figure(figsize=(16, 9), facecolor='w', edgecolor=None)
    gs = gridspec.GridSpec(2, 1, hspace=0.01, left=0.1, right=0.94)

    # Defining barwidth
    barwidth = 1. / (len(pseudo_family_set)+1)
    offset = [-0.5+(0.5+n)*barwidth for n in range(len(pseudo_family_set))]
    # Axis labels with units:
    yaxis = ("Smearing energy [{}]".format(smearing_unit_set.pop()),
             "Total magnetization [{}]".format(magnetization_unit_set.pop()))
    # If more than one unit was specified, I will exit:
    if smearing_unit_set:
        raise ValueError('Found different units for smearing')
    if magnetization_unit_set:
        raise ValueError('Found different units for magnetization')

    # Making two plots, the top one for the smearing, the bottom one for the magnetization
    for index in range(2):
        # One subplot per quantity, stacked vertically on the grid defined above
        ax = fig.add_subplot(gs[index])
        for i, pseudo_family in enumerate(pseudo_family_set):
            X = np.arange(nr_of_results)+offset[i]
            Y = np.array([thisres[1][pseudo_family][index] for thisres in sorted_results])
            ax.bar(X, Y, width=0.2, facecolor=colors[i], edgecolor=colors[i], label=pseudo_family)
        ax.set_ylabel(yaxis[index], fontsize=14)
        ax.set_xlim(-0.5, nr_of_results-0.5)
        ax.set_xticks(np.arange(nr_of_results))
        if index == 0:
            plt.setp(ax.get_yticklabels()[0], visible=False)
            ax.xaxis.tick_top()
            ax.legend(loc=3, prop={'size': 18})
        else:
            plt.setp(ax.get_yticklabels()[-1], visible=False)
        for i in range(0, nr_of_results, 2):
            ax.axvspan(i-0.5, i+0.5, facecolor='y', alpha=0.2)
        ax.set_xticklabels(list(formula_list), rotation=90, size=14, ha='center')
    plt.show()

def generate_query_graph(qh, out_file_name):

    def draw_vertice_settings(idx, vertice, **kwargs):
        """
        Returns a string with all infos needed in a .dot file  to define a node of a graph.
        :param node:
        :param kwargs: Additional key-value pairs to be added to the returned string
        :return: a string
        """
        if vertice['entity_type'].startswith('process'):
            shape = "shape=polygon,sides=4"
        elif vertice['entity_type'].startswith('data.code'):
            shape = "shape=diamond"
        else:
            shape = "shape=ellipse"
        filters = kwargs.pop('filters', None)
        # Collect the filters (if any) as extra lines of the node label
        additional_string = ""
        if filters:
            for k, v in filters.items():
                additional_string += "\n   {} : {}".format(k, v)

        label_string = " ('{}')".format(vertice['tag'])

        labelstring = 'label="{} {}{}"'.format(
            vertice['entity_type'], #.split('.')[-2] or 'Node',
            label_string,
            additional_string)
        #~ return "N{} [{},{}{}];".format(idx, shape, labelstring,
        return "{} [{},{}];".format(vertice['tag'], shape, labelstring)

    nodes = {v['tag']: draw_vertice_settings(idx, v, filters=qh['filters'][v['tag']]) for idx, v in enumerate(qh['path'])}
    links = [(v['tag'], v['joining_value'], v['joining_keyword']) for v in qh['path'][1:]]

    with open('temp.dot', 'w') as fout:
        fout.write("digraph G {\n")
        # One edge per join in the query path
        for l in links:
            fout.write('    {} -> {} [label=" {}"];\n'.format(*l))
        for _, n_values in nodes.items():
            fout.write("    {}\n".format(n_values))

        fout.write("}\n")
    import os
    os.system('dot temp.dot -Tpng -o {}'.format(out_file_name))

def store_formula_in_extra():
    from aiida.orm import QueryBuilder
    query = QueryBuilder()
    query.append(StructureData, filters={'extras':{'!has_key':'formula'}})
    for structure, in query.iterall():
        structure.set_extra('formula', structure.get_formula(mode='count'))

store_formula_in_extra()
```

## Introduction to the QueryBuilder

As you use AiiDA to submit and manage your calculations, the database that stores all the data and the provenance will quickly grow very large. To help you find the needle you might be looking for in this big haystack, you need an efficient search tool. AiiDA provides a tool to do exactly this: the `QueryBuilder`. The `QueryBuilder` acts as the gatekeeper to your database: you ask it questions about the contents of your database (also referred to as queries) by specifying what you are looking for. In this part of the tutorial, we will focus on how to use the `QueryBuilder` to make these queries and how to understand and use the results.

In order to use the `QueryBuilder`, you first need to import it. This can be accomplished by executing the `import` statement in the following cell. Go ahead and select the next cell, and press `Shift+Enter`.

```
from aiida.orm import QueryBuilder
```

Before you can use the `QueryBuilder` to query the database, you first need to create an instance of it:

```
query = QueryBuilder()
```

Now that you have an instance of the `QueryBuilder` named `query`, you are ready to start asking about the content of your database. For example, we may want to know exactly how many nodes there are in the database. To tell the `QueryBuilder` that we are interested in all the occurrences of the `Node` class, you can `append` that class to `query`.

Note

The method is called `append` because, as we will see later, you can append multiple nodes to a `QueryBuilder` instance consecutively to search in the graph, as if you had a list. What we are doing is querying a graph, and for every vertex of the graph in our sub-query, we will use one `append` call. But this use will be demonstrated more fully in a few steps.

```
query.append(Node)
```
```
<aiida.orm.querybuilder.QueryBuilder at 0x108f70550>
```

We have now narrowed down the scope of `query` to just the nodes that are present in the database (i.e., we are ignoring computers, users, etc.). To learn how many nodes there are exactly, you can use the `count()` method:

```
query.count()
```
```
1981
```
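The output of `query.append(Node)` above is the `QueryBuilder` instance itself, which suggests that calls can be chained, for example `QueryBuilder().append(Node).count()`. The following toy sketch illustrates that pattern without needing a database. Note that `ToyQuery` and its entries are hypothetical stand-ins invented for this illustration; they are not part of AiiDA and do not reflect how the `QueryBuilder` is implemented.

```python
class ToyQuery:
    """Toy stand-in (not AiiDA code) whose append() returns the instance."""

    def __init__(self, entries):
        self._entries = entries  # pretend database contents
        self._cls = object       # match every entry until append() is called

    def append(self, cls):
        self._cls = cls
        return self              # returning self is what enables chaining

    def count(self):
        # Count the entries that are instances of the appended class
        return sum(isinstance(entry, self._cls) for entry in self._entries)


entries = [1, 2.5, "a", 3]        # hypothetical contents: two ints, a float, a str
toy = ToyQuery(entries)
same_object = toy.append(int)     # the return value is the query object itself
chained = ToyQuery(entries).append(int).count()
print(same_object is toy, chained)
```

Whether you chain the calls or write them on separate lines is a matter of style; the tutorial cells use separate lines so that each step can be inspected.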

As you may have learned in previous sections of the tutorial, nodes come in different kinds and flavors. For example, all the crystal structures stored in the database are saved in nodes of the type `StructureData`. If, instead of all the nodes, we want to count only the crystal structure nodes, we simply tell a `QueryBuilder` instance to narrow its scope to objects of type `StructureData`. Since this is a new, independent query, you must create a new instance of the `QueryBuilder`.

### Exercise

In the next cell, we have typed part of the code to count all the structure nodes. See if you can finish the line with the comment, to tell the `QueryBuilder` that you are only interested in `StructureData` nodes.

```
query = QueryBuilder()
query.append() # How do we finish this line to tell the query builder to count only the structure nodes?
query.count()
```

Instead of just counting how many crystal structure nodes exist, we may also actually want to see some of them. This is as easy as telling the `QueryBuilder` that we are not interested in the `count` but rather that we want to retrieve `all` the nodes.

```
query = QueryBuilder()
query.append(StructureData)
query.all()
```
```
[[<StructureData: uuid: 5307c08a-90c2-445b-8cdc-d3fdcb2eb47c (pk: 14)>],
[<StructureData: uuid: c44af50c-90e8-47e0-984c-7edb7eda5205 (pk: 15)>],
[<StructureData: uuid: 68d5b4c2-f43a-48ac-9cd3-e9f405f89765 (pk: 20)>],
[<StructureData: uuid: 6d0a3c87-3dac-4c1d-a09d-e19b2f908ca9 (pk: 28)>],
[<StructureData: uuid: 528d37da-e757-4ed1-b291-6c84103fa55b (pk: 50)>],
[<StructureData: uuid: c91903cf-74cd-4bce-9c63-fce6516f7bdf (pk: 73)>],
[<StructureData: uuid: fa28119f-b883-4a98-85e9-5b725760d969 (pk: 104)>],
[<StructureData: uuid: 2251fced-64d1-46bd-b0ee-0967dd0bff27 (pk: 130)>],
[<StructureData: uuid: de48e70c-776b-4295-a0aa-35bf057c35d1 (pk: 142)>],
[<StructureData: uuid: 1f917d76-c429-44f4-b391-7c8268238a89 (pk: 146)>],
[<StructureData: uuid: b30cabd3-6d73-4b18-b7fd-b9e9e73679d2 (pk: 153)>],
[<StructureData: uuid: 9a6fec90-1927-4ddb-8f1f-eb0820072b5f (pk: 168)>],
[<StructureData: uuid: 20d69fb5-ed0a-48c5-ab28-97dac620bf09 (pk: 169)>],
[<StructureData: uuid: 2b548a2c-c474-46c0-b19c-376cfeef6d7b (pk: 194)>],
[<StructureData: uuid: 0a58813e-87cf-4361-8cfa-df1d6d863389 (pk: 227)>],
[<StructureData: uuid: 4fbe7a29-c2fd-4afe-a573-d9641bd6ee01 (pk: 229)>],
[<StructureData: uuid: 42361400-bdce-4b29-9045-e33b8a7c18b1 (pk: 240)>],
[<StructureData: uuid: 4be4c540-5705-4d7a-b83a-021bd77348d9 (pk: 243)>],
[<StructureData: uuid: 44da8eb1-5fcb-497b-bc6f-65bb22cda0eb (pk: 249)>],
[<StructureData: uuid: d55e3685-072d-4b92-8d91-76f5776e692d (pk: 261)>],
[<StructureData: uuid: f31dc747-6a49-4e11-8ee8-13c576eef539 (pk: 307)>],
[<StructureData: uuid: 608fe362-435f-4836-8960-f6c69b32ce58 (pk: 308)>],
[<StructureData: uuid: c6d02247-5d2d-44cf-bf9a-04ca680df864 (pk: 313)>],
[<StructureData: uuid: 47a409db-ed75-4265-a7eb-49c04e476772 (pk: 321)>],
[<StructureData: uuid: bc682d9b-eac5-499a-91b7-fdd24090faf2 (pk: 324)>],
[<StructureData: uuid: f3ef00dd-9131-47ee-b5b9-0c4da21cfd61 (pk: 327)>],
[<StructureData: uuid: ee939714-eca5-49c6-8c9c-e263af322511 (pk: 366)>],
[<StructureData: uuid: e981fbe7-2e0e-4b2a-af4f-64cbfb63552f (pk: 387)>],
[<StructureData: uuid: 987d30aa-97fd-4bf9-a225-e6a42daa321f (pk: 401)>],
[<StructureData: uuid: d3ddc751-6b6b-4893-bed2-cc45e5e6a096 (pk: 423)>],
[<StructureData: uuid: 6a08d389-5627-41b7-9218-328e5edfb914 (pk: 452)>],
[<StructureData: uuid: a3668f1c-1444-4939-8972-93d0e218bb82 (pk: 453)>],
[<StructureData: uuid: ed5f2de7-ac62-4837-a7d6-406dd3c52790 (pk: 454)>],
[<StructureData: uuid: fafbd310-8cd0-485e-8151-28eacfbb89f7 (pk: 482)>],
[<StructureData: uuid: 41c98b52-1193-4bce-9fb5-e88d1cbeaefd (pk: 499)>],
[<StructureData: uuid: 057fc325-0e7d-4a58-bd97-91756735fdb7 (pk: 504)>],
[<StructureData: uuid: 24367c67-dfa8-43f1-8790-0b10191842c8 (pk: 508)>],
[<StructureData: uuid: ed2f1f28-5df7-4ea0-9a4e-17927d7a8a17 (pk: 510)>],
[<StructureData: uuid: 9bba277a-58c6-4a9c-b0f6-5b1f2e9b94fe (pk: 514)>],
[<StructureData: uuid: 87d07811-264d-44c3-bdcf-f645fdd4539b (pk: 517)>],
[<StructureData: uuid: d5638996-0543-47bd-bc7f-7907b6fd64ca (pk: 547)>],
[<StructureData: uuid: d59bb12e-8802-4ce2-899b-6469a791afe1 (pk: 583)>],
[<StructureData: uuid: 91ea9908-9411-4580-925d-5d1c40f2ea3f (pk: 588)>],
[<StructureData: uuid: afd4fcc4-2531-4f8c-8100-91d9ce43b6b2 (pk: 603)>],
[<StructureData: uuid: 01033dd2-81a9-4e77-861f-dbf5bea9c410 (pk: 614)>],
[<StructureData: uuid: 8973e74f-6463-4207-8c33-21bc3ec2611d (pk: 615)>],
[<StructureData: uuid: efe1b5c8-4aec-4738-ba86-707096a576af (pk: 622)>],
[<StructureData: uuid: a72d0525-6971-4d29-9c6e-4d82a93e93eb (pk: 636)>],
[<StructureData: uuid: 8d64d732-4ab4-475b-91ca-e2848157cdf5 (pk: 637)>],
[<StructureData: uuid: 8b093318-d945-467c-b983-a88007e5bd6f (pk: 671)>],
[<StructureData: uuid: 50a3f2e7-3e6f-43bf-8a60-3586ddd811a3 (pk: 672)>],
[<StructureData: uuid: 312c8afb-d459-4f3c-919a-9ed7dc1535af (pk: 680)>],
[<StructureData: uuid: 3e39897b-124f-4f3f-9e0d-7bac9f2d687c (pk: 687)>],
[<StructureData: uuid: 643097f0-7ea9-44b0-896d-4abd1576ba12 (pk: 696)>],
[<StructureData: uuid: bb1f74b7-8dd5-444f-a198-828f59390c6f (pk: 706)>],
[<StructureData: uuid: a3542ed1-e04c-4094-8db5-4940ac8c2dac (pk: 721)>],
[<StructureData: uuid: 0b41f7ca-3ac7-4275-bf34-5495f4e9b464 (pk: 728)>],
[<StructureData: uuid: ef417b01-b912-4a6c-9999-7ac68c870ab6 (pk: 729)>],
[<StructureData: uuid: 6c3f4e3d-770f-48b6-bc0e-1c7676fd6ef4 (pk: 740)>],
[<StructureData: uuid: b2152976-4f84-4822-a7ae-877d3752b2f0 (pk: 743)>],
[<StructureData: uuid: 8183a91b-4c03-4770-8d96-1249b7d7576c (pk: 763)>],
[<StructureData: uuid: a1618368-3fc5-44a8-b3cb-b2ea1be02cc6 (pk: 781)>],
[<StructureData: uuid: f9055068-9494-4c01-a568-af3d2b487b6e (pk: 782)>],
[<StructureData: uuid: f20ff114-58b4-42ef-8339-0428b613c4e2 (pk: 810)>],
[<StructureData: uuid: 58e3e9eb-9c4e-4787-a259-a42cc2ea036a (pk: 791)>],
[<StructureData: uuid: f2c257f9-66f4-4545-8609-3864c2087289 (pk: 822)>],
[<StructureData: uuid: b69afaa9-761e-4377-ae77-f952a60a8887 (pk: 840)>],
[<StructureData: uuid: 28e81f2a-ee95-4bf8-a004-6acd1f521264 (pk: 857)>],
[<StructureData: uuid: 32f9906d-b296-4df3-a428-3abae27acc4d (pk: 862)>],
[<StructureData: uuid: a0d6ae3d-0bd2-42b8-8f07-3945cce48d77 (pk: 885)>],
[<StructureData: uuid: 1603138b-f4d9-41bb-8bc0-c3cfd2af8498 (pk: 926)>],
[<StructureData: uuid: e57c4cea-a263-4f1a-8f1d-28d33c90c7c5 (pk: 935)>],
[<StructureData: uuid: 9c6807c1-5f1c-49fd-a52e-ccd3e53d3d86 (pk: 958)>],
[<StructureData: uuid: f73a2209-69d4-4648-b5a2-d3b83bdfee48 (pk: 972)>],
[<StructureData: uuid: 934fdc92-36da-413d-8b48-85042e7f68e5 (pk: 973)>],
[<StructureData: uuid: 599ef74b-ee60-40f4-816c-e58310d819a7 (pk: 981)>],
[<StructureData: uuid: d2a0bfc7-7506-493c-9393-fa69a91daae7 (pk: 982)>],
[<StructureData: uuid: a4a17738-c87a-4086-a948-f56b39df2e57 (pk: 986)>],
[<StructureData: uuid: ffae249a-4561-4978-9346-368b661d52ed (pk: 991)>],
[<StructureData: uuid: b31d9bf8-a224-4385-a135-af5872dde0f3 (pk: 1009)>],
[<StructureData: uuid: 4909f51c-e037-4ba1-9b8f-0e8f7477b5ef (pk: 1012)>],
[<StructureData: uuid: 57b92419-f024-4f93-880b-3c89cc5e1c5d (pk: 1013)>],
[<StructureData: uuid: 1077e7b0-7ef9-47d2-bc02-f5f7cfda6f96 (pk: 1025)>],
[<StructureData: uuid: 3e27691d-94b4-42f8-b2e3-4794f8fc5447 (pk: 1026)>],
[<StructureData: uuid: 1eefd0f6-d7d9-4681-bae8-d882cb3e4340 (pk: 1044)>],
[<StructureData: uuid: e1fc2c63-ba39-4c79-99c5-d36637f1c3f1 (pk: 1046)>],
[<StructureData: uuid: eba2658e-2b4d-422e-8678-4ddc7f13c589 (pk: 1052)>],
[<StructureData: uuid: a662059c-2817-4f65-a0ab-3ab11199e437 (pk: 1060)>],
[<StructureData: uuid: 89dacd1c-59db-4748-9662-04281bfb42e3 (pk: 1061)>],
[<StructureData: uuid: 06f6e872-33a2-40b1-af78-77cbfebd3c11 (pk: 1083)>],
[<StructureData: uuid: b278a1b5-53c5-4c6c-917c-a228c2e63e13 (pk: 1127)>],
[<StructureData: uuid: 3a4b1270-82bf-4d66-a51f-982294f6e1b3 (pk: 1133)>],
[<StructureData: uuid: 91fc0ea2-138c-4555-81a0-bfc1d13c788c (pk: 1143)>],
[<StructureData: uuid: 280d41a8-c8de-4e16-a8f3-59492234c819 (pk: 1151)>],
[<StructureData: uuid: 216fd0a5-cecc-442c-a248-a302a0f42b56 (pk: 1159)>],
[<StructureData: uuid: a9b6b043-4570-46f6-afdd-e74b4d5fe0c1 (pk: 1163)>],
[<StructureData: uuid: 783c5ae4-462c-44dc-a72c-15e3eef134ff (pk: 1167)>],
[<StructureData: uuid: 69312d64-7f22-49eb-beb6-5fbf21f93e06 (pk: 1177)>],
[<StructureData: uuid: df873b09-312f-45e7-88aa-0d3f3a4e7d01 (pk: 1187)>],
[<StructureData: uuid: 0aa59f4e-d029-4d55-8768-994217a17d7b (pk: 1201)>],
[<StructureData: uuid: 29093b6c-8496-467a-a480-f41dee8f642c (pk: 1202)>],
[<StructureData: uuid: 31de0572-9280-489a-bbd4-832c6d256e7e (pk: 1203)>],
[<StructureData: uuid: 933b0eb9-1068-4ee8-a600-fdc134b35438 (pk: 1224)>],
[<StructureData: uuid: 3142345a-dc05-424f-9e9a-302e71f734f1 (pk: 1225)>],
[<StructureData: uuid: 2cbc4725-ea74-457d-9e2c-2133e12a7397 (pk: 1247)>],
[<StructureData: uuid: e7cc3418-28a9-41f2-856b-e144b41778f7 (pk: 1248)>],
[<StructureData: uuid: c5ee6abf-fe28-4682-b1e5-2ca7d6c8b922 (pk: 1253)>],
[<StructureData: uuid: 12dfca56-1f2d-4ea5-bae0-0c3e712f60b1 (pk: 1305)>],
[<StructureData: uuid: 7ec43ef8-c1da-4d93-b7f1-b0faf4316a31 (pk: 1262)>],
[<StructureData: uuid: 1ea3a33e-c009-4a87-82a5-9f723031a7ca (pk: 1264)>],
[<StructureData: uuid: b49c0b5c-76e7-4d0e-bbd1-d6971a67fcc9 (pk: 1288)>],
[<StructureData: uuid: 33684f60-7891-463a-b999-bacc6efc7a80 (pk: 1282)>],
[<StructureData: uuid: 5e107ccb-f42f-4550-8968-bd49c97c04dc (pk: 1303)>],
[<StructureData: uuid: 805d7940-2ac2-423d-91b0-2c0ece31ae43 (pk: 1327)>],
[<StructureData: uuid: 54134d5c-b9b4-4f92-9cc2-fda0a9716db9 (pk: 1334)>],
[<StructureData: uuid: 5f873c48-2756-407e-942b-87ac693f0d06 (pk: 1360)>],
[<StructureData: uuid: c11b2492-2711-46c7-a939-f37acd7bac77 (pk: 1379)>],
[<StructureData: uuid: 292091b7-a364-41bb-b838-4505a80a9a48 (pk: 1405)>],
[<StructureData: uuid: 815ab53e-52da-4247-84d4-1e31133a9dc5 (pk: 1408)>],
[<StructureData: uuid: 29ae3911-a1db-44f2-b88b-68fbd365fb1d (pk: 1409)>],
[<StructureData: uuid: 8f715aee-7a9d-4151-8cce-69af82edc563 (pk: 1414)>],
[<StructureData: uuid: 951e0281-d6b8-46fd-8ce6-b99a0e8fdcef (pk: 1415)>],
[<StructureData: uuid: 282cea76-85ea-4d4c-a222-0f546e05335f (pk: 1442)>],
[<StructureData: uuid: fe85b080-02c1-440b-bf09-0730bb089d5f (pk: 1446)>],
[<StructureData: uuid: 6b1e9f7f-8655-49fd-b7b7-f2356dfa822f (pk: 1454)>],
[<StructureData: uuid: e64e5a98-1ab7-4de0-b21b-b048bf86341e (pk: 1456)>],
[<StructureData: uuid: a3bd984a-60b9-4768-a53c-d5de9e0279ec (pk: 1461)>],
[<StructureData: uuid: 34a39848-476b-402e-a502-f9759ed4d7d6 (pk: 1468)>],
[<StructureData: uuid: c786e78a-4117-469e-a421-0096b3474ddf (pk: 1502)>],
[<StructureData: uuid: 0a513ab6-435d-416b-85b4-f973df2646b0 (pk: 1512)>],
[<StructureData: uuid: 9f7003cd-d87b-43c1-be4a-ee146c29bf2e (pk: 1517)>],
[<StructureData: uuid: bd621c08-b63a-4bed-8f51-0a8b3021d0a6 (pk: 1531)>],
[<StructureData: uuid: 38ff91e9-afc5-485c-b002-e20d627c3b2d (pk: 1557)>],
[<StructureData: uuid: 88682e49-ba33-4ab3-a3d6-1d075501ab5e (pk: 1562)>],
[<StructureData: uuid: 5f5a2f62-15b9-4ef1-9cb0-8334c5a092fa (pk: 1587)>],
[<StructureData: uuid: 00afea82-862b-4dcc-a88b-a378d1bc316f (pk: 1588)>],
[<StructureData: uuid: 536bde59-2151-4d9e-a73a-5ae3b5a5195c (pk: 1606)>],
[<StructureData: uuid: dc6a3a46-600a-4dac-9109-f64746bb3dc1 (pk: 1607)>],
[<StructureData: uuid: 0984bf55-2b57-4590-bce8-ea10fbe75763 (pk: 1646)>],
[<StructureData: uuid: 095dd254-fc2a-499a-9291-7ce8b4b91dff (pk: 1657)>],
[<StructureData: uuid: 66f50237-aee5-4a13-8966-fef2c9d6fe6c (pk: 1667)>],
[<StructureData: uuid: c04f9178-4f63-4d4c-9555-99f2f5a4cc5e (pk: 1674)>],
[<StructureData: uuid: 0cd545fc-e105-47c9-af9e-12a5d364c121 (pk: 1700)>],
[<StructureData: uuid: f47ef250-0cca-4b6c-8931-f545562556fe (pk: 1717)>],
[<StructureData: uuid: 29bf77e6-1c91-4806-a662-ee74f73d1ec8 (pk: 1747)>],
[<StructureData: uuid: 2d2bddde-80e3-424d-88ef-baec49f8fa8b (pk: 1823)>],
[<StructureData: uuid: a7c956f7-e60f-4def-8846-d58e8f2e4c64 (pk: 1827)>],
[<StructureData: uuid: bc5d54cc-1759-4cb8-807b-e4de4dbcfbeb (pk: 1845)>],
[<StructureData: uuid: 07ef19d3-4077-4406-8d89-6d128d11ce2f (pk: 1856)>],
[<StructureData: uuid: 0cbdcd25-afb4-4462-83d1-e2d175edee17 (pk: 1859)>],
[<StructureData: uuid: 2386661e-f6ce-4c02-b1e7-d0db75d76469 (pk: 1870)>],
[<StructureData: uuid: c32268d0-4988-4f56-a071-057473a14aa5 (pk: 1875)>],
[<StructureData: uuid: 3ce89daa-bfe4-483f-8496-03a4e255ed32 (pk: 1889)>],
[<StructureData: uuid: cd21db6e-a429-4bbb-bd78-0605a4363729 (pk: 1913)>],
[<StructureData: uuid: b5f2d1e8-36cd-4e46-b233-97ccd0560083 (pk: 1914)>],
[<StructureData: uuid: 30bc32fa-782b-4a7d-93b8-f257830e2b66 (pk: 1917)>],
[<StructureData: uuid: b05f7c51-691a-455b-9da1-1b9fe1f47014 (pk: 1948)>],
[<StructureData: uuid: 5c7291dc-597c-4fe9-a55e-a3bd510c3ce6 (pk: 1949)>],
[<StructureData: uuid: c3285a08-f04b-48a1-8ab5-931d5eda4baf (pk: 1956)>],
[<StructureData: uuid: 08c34099-bfd7-4c7f-89ea-54c55cbe01bb (pk: 1958)>],
[<StructureData: uuid: 954bd42d-5111-4605-88a0-64ee6838e8a9 (pk: 1976)>],
[<StructureData: uuid: b96f67cb-1483-4162-99e2-268b09c1f228 (pk: 27)>],
[<StructureData: uuid: 57a14dc8-900b-48ab-910e-a2b42e269a78 (pk: 34)>],
[<StructureData: uuid: ecb560fd-93ac-43b2-9533-6f3a5fda408e (pk: 40)>],
[<StructureData: uuid: 949b4082-69c9-4c56-8e08-a9c80f5a8b08 (pk: 42)>],
[<StructureData: uuid: 76a44fc8-4c26-4de9-af4d-e8aa21eab576 (pk: 51)>],
[<StructureData: uuid: b923e51d-bfa4-4e30-a318-b72c72c3bc3e (pk: 65)>],
[<StructureData: uuid: 260fc4a4-0ea8-4136-893e-fd968c1beb88 (pk: 89)>],
[<StructureData: uuid: 3fef81b7-8330-4030-a113-cb03167026f4 (pk: 90)>],
[<StructureData: uuid: 7c47e1d5-36cb-4c4e-a44a-0a9163b2ffa1 (pk: 107)>],
[<StructureData: uuid: 46f6b355-1162-4739-9f84-7b2968409b85 (pk: 113)>],
[<StructureData: uuid: e4cdeca6-cc4e-4bf6-b005-1b050f80e0af (pk: 159)>],
[<StructureData: uuid: 65bae9e2-ed73-406b-8cbd-095b7b62634a (pk: 183)>],
[<StructureData: uuid: 9b57b385-b26b-4a97-b704-feac60605597 (pk: 205)>],
[<StructureData: uuid: cca5af82-9ae2-404f-9399-73528236ef94 (pk: 216)>],
[<StructureData: uuid: ab37cb1e-4bc2-4859-a085-ffa6ff982338 (pk: 219)>],
[<StructureData: uuid: 74a6dcb8-f22b-46db-b766-f4e1620b0e13 (pk: 244)>],
[<StructureData: uuid: 7fb1e483-78f5-4911-a7d7-f6f38c1e0a73 (pk: 246)>],
[<StructureData: uuid: 164cc810-9c0d-49de-ba9d-be075dcd972a (pk: 279)>],
[<StructureData: uuid: 7c1818be-8b41-411a-8e9f-8fa5848bbff1 (pk: 284)>],
[<StructureData: uuid: d4a99146-57eb-4f4c-8423-e650862bdd60 (pk: 287)>],
[<StructureData: uuid: 7dabdc64-4945-457a-8f3c-d34a844c9256 (pk: 295)>],
[<StructureData: uuid: b9af4d93-f56b-4d39-8528-2e4d8f5c1f09 (pk: 299)>],
[<StructureData: uuid: 4b7d32c1-9f1a-4384-bc9e-da20b8591fa4 (pk: 302)>],
[<StructureData: uuid: 14c17b15-9ffd-4c35-b000-c70007588a69 (pk: 305)>],
[<StructureData: uuid: 88bddb92-c6b5-4c24-b24e-1b927ea2cbab (pk: 310)>],
[<StructureData: uuid: c95924c3-8783-46b9-9056-8f94794e977a (pk: 316)>],
[<StructureData: uuid: 39a3fd74-8129-4000-9d79-6d36dfa75ef8 (pk: 388)>],
[<StructureData: uuid: da1aa62c-b8c3-43f7-a675-7e6af87b7674 (pk: 390)>],
[<StructureData: uuid: 251e391b-9df7-4486-b4f5-8d7ca11aa01a (pk: 405)>],
[<StructureData: uuid: ceffd783-6de3-4271-b1d1-c57cd0cf1986 (pk: 435)>],
[<StructureData: uuid: ae9b5026-22b6-4820-91c3-61230582063e (pk: 409)>],
[<StructureData: uuid: a1d621ff-9daa-4d53-af76-93163817bae7 (pk: 418)>],
[<StructureData: uuid: caf2b2b8-e355-4d4c-a648-a0bd38fd5e46 (pk: 422)>],
[<StructureData: uuid: 0d046375-8fa4-476a-9f0e-07d5db1f659e (pk: 433)>],
[<StructureData: uuid: b8186d88-5d55-4a92-bd8a-8396e72158a8 (pk: 434)>],
[<StructureData: uuid: 305b2184-3023-4c9d-8d46-83da177a53d1 (pk: 443)>],
[<StructureData: uuid: 5d98dc0c-0a25-46c9-ab1c-0e88f9983c0b (pk: 464)>],
[<StructureData: uuid: 03e5231c-a1b3-4c1e-9657-c63c6821eb3b (pk: 468)>],
[<StructureData: uuid: cecd2ca6-04e5-4a25-8a52-6da364a4c0ee (pk: 494)>],
[<StructureData: uuid: 3ab3f2ef-b4a5-4c43-87d5-a955599fced1 (pk: 511)>],
[<StructureData: uuid: 31401bbe-e5ee-43aa-bfa1-2464526e0471 (pk: 558)>],
[<StructureData: uuid: ae19590b-564e-413e-b894-49c574b76f98 (pk: 556)>],
[<StructureData: uuid: 684474eb-14d2-42a2-93a0-ce2332503eb1 (pk: 569)>],
[<StructureData: uuid: 2ab22931-c69e-4653-b422-95c2204f4e95 (pk: 572)>],
[<StructureData: uuid: f5a4596c-bbaa-40e4-bc8c-55473a8bcbf4 (pk: 584)>],
[<StructureData: uuid: 03b3ae04-5cfd-49dc-a2e7-0e7f0e5c0e94 (pk: 589)>],
[<StructureData: uuid: da914d4c-5f7f-4fa6-bac4-7016e3c6f986 (pk: 592)>],
[<StructureData: uuid: a16d800c-6482-4a73-b74e-ea9fffee489e (pk: 611)>],
[<StructureData: uuid: 15b1d607-c753-4790-ba9c-10923c845464 (pk: 621)>],
[<StructureData: uuid: e2043ee2-e483-4691-a2e3-8c97a5b14448 (pk: 633)>],
[<StructureData: uuid: 24be3c01-ee6a-4167-803e-7e3f904fc106 (pk: 643)>],
[<StructureData: uuid: 06c6596d-94ae-4f0b-a558-701cdd530fb8 (pk: 654)>],
[<StructureData: uuid: 997be6f7-fc20-4749-ba78-eff6062b2b05 (pk: 657)>],
[<StructureData: uuid: 40d670c8-cb43-4269-825b-6ccf4afba90a (pk: 686)>],
[<StructureData: uuid: db8ee37e-89e8-4e37-a5c9-92a921f2b534 (pk: 710)>],
[<StructureData: uuid: b4e020d5-d737-455c-8d30-1e2e07d20268 (pk: 714)>],
[<StructureData: uuid: d20f8cc9-cd10-4b00-b399-574268bb976c (pk: 726)>],
[<StructureData: uuid: f1a433ba-f0d1-44af-8f2c-d747618cb69e (pk: 735)>],
[<StructureData: uuid: 3f4961a2-27ba-4b4c-b049-f9610c4129c5 (pk: 736)>],
[<StructureData: uuid: 1e26f227-161c-4e0d-8a52-f134ae56c02a (pk: 742)>],
[<StructureData: uuid: 5778efc1-4457-4a07-8174-bfb515d84d00 (pk: 748)>],
[<StructureData: uuid: 737039d0-0217-4b63-975c-b4622211bcac (pk: 764)>],
[<StructureData: uuid: 8fb0b01b-f089-469c-9a39-093af3b67a1b (pk: 771)>],
[<StructureData: uuid: ed56077d-6d3c-4164-b3c3-bfdb8c2ef4e6 (pk: 784)>],
[<StructureData: uuid: 254e5a86-7478-4b91-ab2d-7e980eced9be (pk: 788)>],
[<StructureData: uuid: 109d9424-27c5-4f0c-871a-01c7678c3078 (pk: 793)>],
[<StructureData: uuid: 79a400aa-6f13-4aa2-b5ca-2e5f3a5d2fc4 (pk: 818)>],
[<StructureData: uuid: 3def34ed-b152-4062-8e95-ba52ac233bd2 (pk: 821)>],
[<StructureData: uuid: 3b6f3daa-fd8e-4961-8d85-7fda8b9123de (pk: 832)>],
[<StructureData: uuid: d6bcb57f-a14b-4fbf-bafc-a75cafef8ff1 (pk: 861)>],
[<StructureData: uuid: 60abe5ce-d5b0-49c4-9104-df47c190ce4c (pk: 869)>],
[<StructureData: uuid: ffd4611d-cb1e-41f4-9282-ab61caa2e701 (pk: 887)>],
[<StructureData: uuid: 16c5f607-0be1-4f1a-99af-35f4d93c5b01 (pk: 915)>],
[<StructureData: uuid: e732e724-a3d6-47be-af7a-8c0e615e01d5 (pk: 927)>],
[<StructureData: uuid: 6275dd5b-c7c8-4ff3-84dc-bfa226e959f8 (pk: 928)>],
[<StructureData: uuid: 2d2943c5-140a-4faf-bf9c-40bb4a90245d (pk: 940)>],
[<StructureData: uuid: 75303f8a-02e7-483c-ab3b-63e9e0fc3e4f (pk: 947)>],
[<StructureData: uuid: e87462a9-4426-427b-b013-f1b7bbc1bc7b (pk: 948)>],
[<StructureData: uuid: f1c573c8-e8c7-40fb-aa8b-7ccefa8d7269 (pk: 955)>],
[<StructureData: uuid: 5488757e-b5e1-4b62-acf9-9a08de8f684a (pk: 980)>],
[<StructureData: uuid: a9d5454f-772f-45c6-af82-b85dd4f83278 (pk: 968)>],
[<StructureData: uuid: 436ef08d-acf8-4d19-8fab-e2b7a01ac0da (pk: 1001)>],
[<StructureData: uuid: 8c8aeaef-7d19-4e62-9c6d-b5bee45bf8d1 (pk: 1005)>],
[<StructureData: uuid: aed3e548-ec4d-4320-9662-7c4643b04d64 (pk: 1053)>],
[<StructureData: uuid: 4c36e70c-fbd8-48d0-8dbc-04b2e4fce8be (pk: 1064)>],
[<StructureData: uuid: cc079fe1-37b2-491a-9f1e-9dcb75cf3630 (pk: 1076)>],
[<StructureData: uuid: 359bc31f-f9e7-48f4-87ce-347bac5d5431 (pk: 1080)>],
[<StructureData: uuid: 829d89ba-57eb-4dac-b0bf-145de3c6d5a5 (pk: 1084)>],
[<StructureData: uuid: d34523c8-0eb7-41c7-91fe-bb38f6dd150c (pk: 1140)>],
[<StructureData: uuid: c4e6a2aa-004e-4373-9f13-398ba579deea (pk: 1168)>],
[<StructureData: uuid: fe715a79-1945-45c2-86ca-c9e4438ca00f (pk: 1169)>],
[<StructureData: uuid: 6ec69457-0ae2-4d15-865f-0e62184ef8f0 (pk: 1172)>],
[<StructureData: uuid: fcfd5d23-5639-4245-8be1-35096c2d2e9a (pk: 1207)>],
[<StructureData: uuid: 78abbcb4-7bba-4805-a008-deaa6da59c6c (pk: 1211)>],
[<StructureData: uuid: b7854dd7-9fa7-4b80-8ed1-17186e0771d4 (pk: 1229)>],
[<StructureData: uuid: 96a5613d-dc4b-40a9-bb55-69842c34a830 (pk: 1233)>],
[<StructureData: uuid: 1c5e22a5-c466-4e8d-b3ba-101b4e8f407a (pk: 1238)>],
[<StructureData: uuid: 608e2e50-fa4a-4863-bbdd-d9117de58b0f (pk: 1254)>],
[<StructureData: uuid: 9a7d449f-467f-4eae-a2ae-a89545dd9803 (pk: 1259)>],
[<StructureData: uuid: 6e6f8381-4dca-40cc-ac74-08e198ded7a4 (pk: 1260)>],
[<StructureData: uuid: 8a288b5b-3060-4cbf-83f5-a1272960f55e (pk: 1266)>],
[<StructureData: uuid: ff45e62b-8b04-4ace-942e-25d8eef1f8d4 (pk: 1267)>],
[<StructureData: uuid: f606153f-8ca0-42a2-9eec-7e5f23f784d6 (pk: 1279)>],
[<StructureData: uuid: ce9b946e-d76e-41b9-8305-79e63e191348 (pk: 1323)>],
[<StructureData: uuid: b66825c7-e06d-4165-a3e4-e79e77cb2063 (pk: 1352)>],
[<StructureData: uuid: 4eaca5af-daae-4dbd-8a0c-137a18028982 (pk: 1353)>],
[<StructureData: uuid: 62b32fd2-0140-4483-9bb7-bb766a23e613 (pk: 1362)>],
[<StructureData: uuid: 0226a217-10e0-41b6-a6d4-02aa4999f71f (pk: 1363)>],
[<StructureData: uuid: 1e9a7095-47c4-4ea8-b660-f1b06be75da9 (pk: 1377)>],
[<StructureData: uuid: 42ebbb4f-ff3d-4901-b297-c7c626707b23 (pk: 1380)>],
[<StructureData: uuid: 565172e0-57fa-48e7-85bc-e7b4d4c7c5d9 (pk: 1394)>],
[<StructureData: uuid: 27b62e92-ca7a-40be-8b22-f0460fcde0b7 (pk: 1424)>],
[<StructureData: uuid: c47bfedf-2cab-4ecf-842f-3a5662f53770 (pk: 1429)>],
[<StructureData: uuid: bec83f75-f08c-4765-83cc-3cc752e47c33 (pk: 1443)>],
[<StructureData: uuid: c9af3a3e-3422-418c-b6e4-e7f835febe46 (pk: 1482)>],
[<StructureData: uuid: 8aa71cc2-9b8a-4b21-bfc1-bdae460193b8 (pk: 1499)>],
[<StructureData: uuid: 88d7d308-9e97-403a-8fa0-d43c69ed5878 (pk: 1533)>],
[<StructureData: uuid: 29d02818-0f3b-42b2-b7d7-579ae0e1ba98 (pk: 1535)>],
[<StructureData: uuid: 1461de80-fda5-4a0b-b9ce-50103ae4b452 (pk: 1548)>],
[<StructureData: uuid: a187da6b-c065-4bba-9f34-812ce2cc5e2b (pk: 1567)>],
[<StructureData: uuid: cf22ccfb-6ea5-44b4-934e-4f0d3dfa0175 (pk: 1569)>],
[<StructureData: uuid: b1d67f0a-3cf0-436f-998b-c684fcd4503f (pk: 1576)>],
[<StructureData: uuid: a61d60fd-787b-4fe2-b377-906e18d933b5 (pk: 1589)>],
[<StructureData: uuid: d5d02962-9610-42c8-8e6a-a9b7bdba6ed3 (pk: 1601)>],
[<StructureData: uuid: 6fef1191-23cf-4796-97fc-f3bb58c0cddd (pk: 1615)>],
[<StructureData: uuid: 24b7aef2-11e0-4144-b217-c69947e950e5 (pk: 1624)>],
[<StructureData: uuid: 446dffcd-10e5-407c-8d51-0f28dc70563e (pk: 1647)>],
[<StructureData: uuid: 8c39cff8-d996-4087-bcb2-61fcd206328b (pk: 1650)>],
[<StructureData: uuid: 55dd5e10-c714-41d6-80e8-2ec5150ba521 (pk: 1666)>],
[<StructureData: uuid: b98809f4-7b9b-4d5f-8304-4a88c23401d6 (pk: 1690)>],
[<StructureData: uuid: 89158754-46e0-4cca-8551-bc890c703061 (pk: 1704)>],
[<StructureData: uuid: 4aa12967-46c5-4495-89c7-38072c13feac (pk: 1706)>],
[<StructureData: uuid: 47daae4c-79cf-4612-89b2-5b9a52df573d (pk: 1711)>],
[<StructureData: uuid: 6337c5ca-a25e-4bdc-af0a-da4f492d930b (pk: 1714)>],
[<StructureData: uuid: 3fe64ac4-b082-4ce3-8f17-c5500d9d44c8 (pk: 1736)>],
[<StructureData: uuid: 8e77b5ef-658e-46e2-b803-5893d0b27590 (pk: 1739)>],
[<StructureData: uuid: 493203ea-9466-4d70-a876-bd0ac699f986 (pk: 1753)>],
[<StructureData: uuid: 330bfcb0-5ac7-4a7d-a3c8-1c7fba3d314e (pk: 1757)>],
[<StructureData: uuid: a9ebb371-8aec-4bb0-92a4-83e003794d6c (pk: 1774)>],
[<StructureData: uuid: 9ef06025-32ea-46b1-ac80-efa9a7790aa3 (pk: 1775)>],
[<StructureData: uuid: 92749b00-7eb3-456e-b0a9-571dfcf56a2d (pk: 1777)>],
[<StructureData: uuid: 264dd871-f5f0-446b-b007-d209b98b94ea (pk: 1779)>],
[<StructureData: uuid: b9b07519-4c9d-44cc-b55f-881099be2480 (pk: 1797)>],
[<StructureData: uuid: 13f11496-b1dd-4fed-8cca-8116459e4219 (pk: 1802)>],
[<StructureData: uuid: d9f53138-6c01-4ee5-a19f-1770f7b4c8c1 (pk: 1809)>],
[<StructureData: uuid: 84d91c87-ca25-4616-890d-74220e8e19e7 (pk: 1813)>],
[<StructureData: uuid: a2ae295e-c7c6-43f3-8b2c-597104a8b440 (pk: 1835)>],
[<StructureData: uuid: 429941ab-104a-4f45-a2b7-b2d681faab77 (pk: 1850)>],
[<StructureData: uuid: 64b20793-85dc-4aac-a47f-a8b498202141 (pk: 1858)>],
[<StructureData: uuid: 4d39753f-e8aa-4594-872e-c8f3fbe57547 (pk: 1876)>],
[<StructureData: uuid: 2e3a11d6-a975-44e8-b121-b592fa8dd497 (pk: 1894)>],
[<StructureData: uuid: d363b50d-c5ec-4bbf-9293-bd59456e133e (pk: 1916)>],
[<StructureData: uuid: ec745a43-95a6-4c8b-b7a2-58969b2cdce5 (pk: 1925)>],
[<StructureData: uuid: f3ea12d6-7c1d-41a2-a900-63f63434ef27 (pk: 1965)>]]
```

Note that this command is very literal and does in fact retrieve all the crystal structure nodes that are stored in the database, which may be very slow if your database becomes very large. One solution is to tell the `QueryBuilder` that we are, for example, only interested in 5 crystal structure nodes. This can be done with the `limit()` method as follows:

```query = QueryBuilder()
query.append(StructureData)
query.limit(5)
query.all()
```
```[[<StructureData: uuid: 5307c08a-90c2-445b-8cdc-d3fdcb2eb47c (pk: 14)>],
[<StructureData: uuid: c44af50c-90e8-47e0-984c-7edb7eda5205 (pk: 15)>],
[<StructureData: uuid: 68d5b4c2-f43a-48ac-9cd3-e9f405f89765 (pk: 20)>],
[<StructureData: uuid: 6d0a3c87-3dac-4c1d-a09d-e19b2f908ca9 (pk: 28)>],
[<StructureData: uuid: 528d37da-e757-4ed1-b291-6c84103fa55b (pk: 50)>]]
```

Another option is to use Python's native list slicing to return only a subset of the full result set. Note that this can be very slow for big databases, because `all()` still retrieves every result before the slice is taken. When performance matters, use the functionality native to the `QueryBuilder`, like `limit`, which limits the number of results directly at the database level!

The following will return the first 7 results.

```query.limit(None)
query.all()[:7]
```
```[[<StructureData: uuid: 5307c08a-90c2-445b-8cdc-d3fdcb2eb47c (pk: 14)>],
[<StructureData: uuid: c44af50c-90e8-47e0-984c-7edb7eda5205 (pk: 15)>],
[<StructureData: uuid: 68d5b4c2-f43a-48ac-9cd3-e9f405f89765 (pk: 20)>],
[<StructureData: uuid: 6d0a3c87-3dac-4c1d-a09d-e19b2f908ca9 (pk: 28)>],
[<StructureData: uuid: 528d37da-e757-4ed1-b291-6c84103fa55b (pk: 50)>],
[<StructureData: uuid: c91903cf-74cd-4bce-9c63-fce6516f7bdf (pk: 73)>]]
```
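
To make the difference between slicing and limiting concrete, here is a minimal pure-Python sketch. It does not use AiiDA: `fake_database_rows` is a hypothetical stand-in for a database cursor that counts how many rows it actually produces.

```python
from itertools import islice


def fake_database_rows(total, counter):
    """Hypothetical stand-in for a database cursor; counts every row it yields."""
    for pk in range(total):
        counter['fetched'] += 1
        yield [pk]


# Slicing after the fact: every row is materialised, then most are discarded.
slice_counter = {'fetched': 0}
sliced = list(fake_database_rows(10_000, slice_counter))[:5]

# Limiting at the source: only the requested rows are ever produced.
limit_counter = {'fetched': 0}
limited = list(islice(fake_database_rows(10_000, limit_counter), 5))

print('rows fetched when slicing: ', slice_counter['fetched'])
print('rows fetched when limiting:', limit_counter['fetched'])
```

Both approaches return the same five rows, but only the second avoids producing the rest. This mirrors the idea behind `limit`, not its actual implementation.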

If you want to know a little bit more about the retrieved crystal structure nodes, you can loop through the returned results by using the `iterall()` method, short for “iterate over all”. This allows you, for instance, to print the formula of the structures:

```query = QueryBuilder()
query.append(StructureData)
query.limit(5)
for structure, in query.iterall():
    print(structure.get_formula())
```
```O3TaTl
O3Sn2
NiO3Sr
O3PbZr
AlO3Y
```
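
The trailing comma in `for structure, in ...` is easy to miss. The sketch below uses plain Python lists, shaped like the results above (one inner list per match), to show what the comma does. The formulas are copied from the output above; no database is involved.

```python
# Rows shaped like QueryBuilder results: one inner list per match,
# one entry per projected item.
rows = [['O3TaTl'], ['O3Sn2'], ['NiO3Sr']]

# With the trailing comma, each single-element row is unpacked into its only entry.
unpacked = [formula for formula, in rows]

# Without the comma, the loop variable is the inner list itself.
not_unpacked = [row for row in rows]

print(unpacked)
print(not_unpacked)
```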

This is just a simple example of how we can employ the `QueryBuilder` to get details about the contents of our database. We have now seen simple queries for the `Node` and `StructureData` classes, but the same rules apply to all the AiiDA `Node` sub-classes. For example, you may want to count the number of entries for each of the `Node` sub-classes in the following list, as well as the `Node` class itself:

```class_list = [Node, StructureData, KpointsData, Dict, UpfData, Code]
```

### Exercise¶

Using the tools you have learned so far, it is possible to build a table of the number of occurrences of each of these `Node` classes that are stored in the database. You can loop over the `class_list` list and create a `QueryBuilder` instance for each `Node` (sub-)class. See if you can finish the following loop by completing the line with the comment, printing the count of each `Node` (sub-)class.

```for class_name in class_list:
    query = QueryBuilder()
    query.append(class_name)
    print() # Finish this line to print the results!
```

If all went well, you should see something like the following, where of course the numbers may differ for your database:

| Class name | Entries |
|---|---|
| Node | 10273 |
| StructureData | 271 |
| KpointsData | 953 |
| Dict | 2922 |
| UpfData | 85 |
| Code | 10 |

## Projection and filters¶

Up until now we have always asked the `QueryBuilder` instances to return complete nodes. However, we might not necessarily be interested in all of a node's properties, but rather just a selected set or even just a single property. We can tell the `QueryBuilder` which properties we would like to be returned, by asking it to project those properties in the result. For example, you may only want to get the universally unique identifiers (UUIDs) of a set of nodes, which are stored in the `uuid` property.

```query = QueryBuilder()
query.append(Node, project=['uuid'])
query.limit(5)
query.all()
```
```[['7f079d70-e361-426a-af77-4fd86de8be88'],
['fc9a76c1-f07d-41ac-98c8-d81c5118e351'],
['b66bf73b-8b26-498b-815d-28bcd5fd52df'],
['7c5371ee-632d-4f08-be55-ee30587b9561'],
['b9bf60b7-3c56-4ccc-9b24-03ddedd77f68']]
```

By using the `project` keyword in the `append` call, you tell the `QueryBuilder` that you are only interested in the `uuid` property of the `Node` class. Note that the value assigned to `project` is a list, since we may want to specify more than one property.
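
The shape of a projected result can be sketched without a database. In the toy example below, `rows` and `project` are hypothetical stand-ins (plain dictionaries and a helper function, not AiiDA code). They only illustrate that each match becomes one inner list, with one entry per projected property, in the order requested.

```python
# Hypothetical records, standing in for nodes stored in a database.
rows = [
    {'id': 1, 'uuid': 'aaaa-1111', 'label': 'first'},
    {'id': 2, 'uuid': 'bbbb-2222', 'label': 'second'},
]


def project(rows, properties):
    """Return one inner list per row, with one entry per projected property."""
    return [[row[name] for name in properties] for row in rows]


single = project(rows, ['uuid'])
double = project(rows, ['label', 'uuid'])

print(single)
print(double)
```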

### Exercise¶

See if you can get the `QueryBuilder` to return both the PK and the UUID of the first 5 nodes in the following cell.

Important

In the context of the `QueryBuilder`, the PK of a node is called `id` and the UUID is called `uuid` (as seen above).

```query = QueryBuilder()
query.append(Node, project=)#? What should the value be for the project key
query.limit(5)
query.all()
```

To give you an idea of the various properties you can project for some of the base AiiDA classes you can consult the following table. Note that this is by no means an exhaustive list:

| Class | Properties |
|---|---|
| Node | `id`, `uuid`, `node_type`, `label`, `description`, `ctime`, `mtime` |
| Computer | `id`, `uuid`, `name`, `hostname`, `description`, `transport_type`, `scheduler_type` |
| User | `id`, `email`, `first_name`, `last_name`, `institution` |
| Group | `id`, `uuid`, `label`, `type_string`, `time`, `description` |

The same properties can also be used to filter for specific nodes in your database. Indeed, up until now, you have only asked the `QueryBuilder` to return all the instances of a certain type of node, or at best a limited number of those (without specifying which ones). But in general we might be interested in a very specific node. For example, we may have the PK of a certain node and we would like to know when it was created and last modified. You can tell the `QueryBuilder` instance to select nodes that only match that criterion, by telling it to filter based on that property.

```query = QueryBuilder()
query.append(Node, project=['ctime', 'mtime'], filters={'id': {'==': 1}})
query.all()
```
```[[datetime.datetime(2014, 10, 28, 20, 18, 53, 927563, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=60, name=None)),
datetime.datetime(2014, 10, 28, 20, 18, 54, 388764, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=60, name=None))]]
```

Note the syntax of the `filters` keyword. The value is a dictionary, where the keys indicate the node property it operates on, in this case the `id` property, representing the node’s PK. The value of that key is itself a dictionary, where the key indicates the logical operator EQUAL TO via two equality signs (`==`), and the value corresponds to the desired value of the property.

You may have multiple criteria that you want to filter for, in which case you can use the logical `or` and `and` operators. Let’s say, for example, you want the `QueryBuilder` to retrieve all the crystal structure nodes (`StructureData`) that were created no more than 12 days ago and have an `a` in their UUID. You can express this criterion by making use of the `and` operator, which allows you to specify multiple filters that all have to be satisfied.

```from datetime import datetime, timedelta

query = QueryBuilder()
query.append(
    StructureData,
    filters={
        'and': [
            {'ctime': {'>': datetime.now() - timedelta(days=12)}},
            {'uuid': {'like': '%a%'}}
        ]
    }
)
query.all()
```
```[]
```
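
The creation times printed earlier carry timezone information. If you post-process projected `ctime` values in Python yourself, keep in mind that Python refuses to order timezone-aware and naive `datetime` objects against each other. The sketch below uses only the standard library and a made-up `stored_ctime`. It is about comparisons done in Python and makes no claim about how the `QueryBuilder` treats a naive value in a filter.

```python
from datetime import datetime, timedelta, timezone

# Made-up, timezone-aware creation time, similar in form to a projected `ctime`.
stored_ctime = datetime(2014, 10, 28, 20, 18, 53, tzinfo=timezone(timedelta(minutes=60)))

naive_cutoff = datetime.now() - timedelta(days=12)
aware_cutoff = datetime.now(timezone.utc) - timedelta(days=12)

try:
    stored_ctime > naive_cutoff
    comparison_failed = False
except TypeError:
    # Aware and naive datetimes cannot be ordered against each other.
    comparison_failed = True

# With an aware cutoff the comparison is well defined.
is_recent = stored_ctime > aware_cutoff

print('naive comparison failed:', comparison_failed)
print('created within the last 12 days:', is_recent)
```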

You may have noticed that the greater than (`>`) operator, and its related operators, can work with Python `datetime` objects. These are just a few of the operators that `QueryBuilder` understands. Below you find a table with some of the logical operators that you can use:

| Operator | Data type | Example | Description |
|---|---|---|---|
| `==` | All | `{'==': '12'}` | Equality operator |
| `in` | All | `{'in':['FINISHED', 'PARSING']}` | Member of a set |
| `<`, `>`, `<=`, `>=` | float, int, datetime | `{'>': 5.2}` | Size comparison operator |
| `like` | char, str | `{'like': 'calculation%'}` | String comparison, `%` is a wildcard |
| `ilike` | char, str | `{'ilike': 'caLCulAtion%'}` | String comparison, capitalization insensitive |
| `or` | | `{'or': [{'<': 5.3}, {'>': 6.3}]}` | Logical OR operator |
| `and` | | `{'and': [{'>=': 2}, {'<=': 6}]}` | Logical AND operator |
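
To get a feeling for how these operator dictionaries read, the following toy evaluator applies them to plain Python values. It is a sketch for illustration only: `matches` and `like_to_regex` are hypothetical helpers, not part of AiiDA, and the real `QueryBuilder` translates filters into database queries instead. The `like` handling assumes SQL-style patterns, where `%` matches any run of characters and `_` matches exactly one character.

```python
import operator as op
import re

COMPARISONS = {'==': op.eq, '<': op.lt, '>': op.gt, '<=': op.le, '>=': op.ge}


def like_to_regex(pattern, ignore_case=False):
    """Translate an SQL-style LIKE pattern into a compiled regular expression."""
    parts = []
    for char in pattern:
        if char == '%':
            parts.append('.*')   # any run of characters
        elif char == '_':
            parts.append('.')    # exactly one character
        else:
            parts.append(re.escape(char))
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    return re.compile(''.join(parts), flags)


def matches(value, condition):
    """Check a plain value against one operator dictionary (toy model, not AiiDA code)."""
    for name, operand in condition.items():
        if name in COMPARISONS:
            ok = COMPARISONS[name](value, operand)
        elif name == 'in':
            ok = value in operand
        elif name in ('like', 'ilike'):
            regex = like_to_regex(operand, ignore_case=(name == 'ilike'))
            ok = regex.fullmatch(value) is not None
        elif name == 'or':
            ok = any(matches(value, sub) for sub in operand)
        elif name == 'and':
            ok = all(matches(value, sub) for sub in operand)
        else:
            raise ValueError(f'unsupported operator: {name}')
        if not ok:
            return False
    return True


print(matches(5.2, {'or': [{'<': 5.3}, {'>': 6.3}]}))
print(matches(7, {'and': [{'>=': 2}, {'<=': 6}]}))
print(matches('CALCULATION.job', {'ilike': 'caLCulAtion%'}))
print(matches('tutorialXlda', {'like': 'tutorial_%'}))
```

The last call is worth a second look: because `_` is itself a single-character wildcard in SQL-style patterns, `'tutorial_%'` also matches strings that have some other character in place of the underscore.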

### Exercise¶

Try to write a query below that will retrieve all `Group` instances whose `label` property starts with the string `tutorial`.

```# Write your query here
```

## Defining relationships between query clauses¶

So far we have seen how the `QueryBuilder` can be used to search the database for entries of a specific node type, potentially projecting only specific properties and filtering for certain property values. However, our nodes do not live in a vacuum. They are part of a directed acyclic graph and are thus linked to one another. Therefore, we typically want to be able to search for nodes based on a certain relationship that they might have with other nodes. Consider for example that you have a `StructureData` node that was produced by some calculation. How would you retrieve the calculation, while only having knowledge of the `StructureData` node?

To accomplish this, you need to be able to tell the `QueryBuilder` what the relationship is between the nodes you are interested in. With the `QueryBuilder`, the following can be done to find all the crystal structure nodes that have been created as an output by a `PwCalculation` process.

Important

In the graph, we are not looking for a `PwCalculation` process (since processes do not live in the graph, as you have learned previously). We are actually looking for a `CalcJobNode` whose `process_type` property indicates it was run by a `PwCalculation` process. Since this is a very common pattern, the `QueryBuilder` allows you to directly append the `PwCalculation` process class as a shortcut, but it internally unwraps this into a query for a `CalcJobNode` with the appropriate filter on the `process_type` property.

```query = QueryBuilder()
query.append(PwCalculation, tag='calculation')
```
```<aiida.orm.querybuilder.QueryBuilder at 0x124f456d0>
```

Since we are looking for pairs of nodes, you also need to `append` the second node to the `QueryBuilder` instance, `query`. To specify the relationship between the nodes, we need to be able to refer back to the `CalcJobNode` that is matched. That is why you gave it a tag with the `tag` keyword in the second line above. This tag can now be used in the following line:

```query.append(StructureData, with_incoming='calculation')
```
```<aiida.orm.querybuilder.QueryBuilder at 0x124f456d0>
```

The goal was to find `StructureData` nodes, so we `append` that to the `query`. However, we didn’t want to find just any `StructureData` nodes; they had to be an output of `PwCalculation`.

Note how you expressed this relation by the `with_incoming` keyword, because we want a `StructureData` node having an incoming link from the `CalcJobNode` referenced by the `'calculation'` tag (i.e., the `StructureData` must be an output of the calculation).

What remains to do is execute the query:

```query.limit(5)
query.all()
```
```[[<StructureData: uuid: 949b4082-69c9-4c56-8e08-a9c80f5a8b08 (pk: 42)>],
[<StructureData: uuid: 6e6f8381-4dca-40cc-ac74-08e198ded7a4 (pk: 1260)>],
[<StructureData: uuid: 79a400aa-6f13-4aa2-b5ca-2e5f3a5d2fc4 (pk: 818)>],
[<StructureData: uuid: 15b1d607-c753-4790-ba9c-10923c845464 (pk: 621)>]]
```

What you have done can be visualized schematically, thanks to a little tool included in the very first notebook cell (i.e., if the following doesn’t work, you should re-run the very first cell and try again).

```generate_query_graph(query.get_json_compatible_queryhelp(), 'query1.png')
Image(filename='query1.png')
```

The `with_incoming` keyword is only one of many potential relationships that exist between the various AiiDA entities and that are implemented in the `QueryBuilder`. The table below gives an overview of the implemented relationships, which entities they are defined for and what relation each one implies. The full list of relations can be found on this page of the AiiDA documentation.

| Entity from | Entity to | Relationship | Explanation |
|---|---|---|---|
| Node | Node | `with_outgoing` | One node as input of another node |
| Node | Node | `with_incoming` | One node as output of another node |
| Node | Node | `with_descendants` | One node as the ancestor of another node |
| Node | Node | `with_ancestors` | One node as descendant of another node |
| Group | Node | `with_node` | The group of a node |
| Node | Group | `with_group` | The node is a member of a group |
| Computer | Node | `with_node` | The computer of a node |
| Node | Computer | `with_computer` | The node of a computer |
| User | Node | `with_node` | The creator of a node is a user |
| Node | User | `with_user` | The node was created by a user |
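
The direction of the node-to-node relationships is easy to mix up. The toy model below spells out the semantics on a hand-written list of links. Everything in it is hypothetical (the node names, the `links` list and the helper functions), and it only mimics what the keywords mean, not how the `QueryBuilder` evaluates them.

```python
from collections import deque

# Hypothetical provenance links as (source, target) pairs:
# data -> calculation -> data -> calculation -> data.
links = [
    ('structure_in', 'calc_1'),
    ('calc_1', 'structure_out'),
    ('structure_out', 'calc_2'),
    ('calc_2', 'results'),
]


def with_incoming(source):
    """Nodes that have an incoming link from `source`, i.e. its direct outputs."""
    return [target for origin, target in links if origin == source]


def with_outgoing(target):
    """Nodes that have an outgoing link to `target`, i.e. its direct inputs."""
    return [origin for origin, destination in links if destination == target]


def with_ancestors(ancestor):
    """Nodes that have `ancestor` somewhere upstream, i.e. all its descendants."""
    found, queue = [], deque([ancestor])
    while queue:
        for child in with_incoming(queue.popleft()):
            if child not in found:
                found.append(child)
                queue.append(child)
    return found


print(with_incoming('calc_1'))
print(with_outgoing('calc_1'))
print(with_ancestors('structure_in'))
```

Read each keyword from the point of view of the entity being appended: `with_incoming='calculation'` selects nodes that have an incoming link from the tagged calculation.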

### Exercise¶

See if you can write a query that will return all the `UpfData` nodes that are a member of a `Group` whose label starts with the string `SSSP`.

```query = QueryBuilder()
# Visualize what is going on:
generate_query_graph(query.get_json_compatible_queryhelp(), 'query2.png')
Image(filename='query2.png')
```

## Attributes and extras¶

In the section above, you learned how to `project` specific properties of a node and saw a list of properties that a node instance possesses. Since then, we have come across different AiiDA data nodes, such as `StructureData` and `UpfData`. As AiiDA employs the object-oriented programming paradigm, both `StructureData` and `UpfData` are examples of sub-classes of the `Node` class and therefore inherit its properties. That means that whatever property a `Node` has, both `StructureData` and `UpfData` will have too. However, there is a semantic difference between what `StructureData` and `UpfData` represent, and as such they have been explicitly defined as sub-classes to be able to add properties to one that would not make sense for the other. This would normally create issues for the type of database AiiDA uses, but this is solved through the concept of `attributes`. These are similar to properties, except that they are specific to the `Node` type that they are attached to. This allows you to add an `attribute` to a certain node, without having to change the implementation of all the others.

For example, the `Dict` nodes that are generated as output of `PwCalculation`s may have an attribute named `wfc_cutoff`. To project for this particular `attribute`, one can use exactly the same syntax as shown in the section above for the regular `Node` properties, and one has to only prepend `attributes.` to the attribute name.

Demonstration:

```query = QueryBuilder()
query.append(PwCalculation, tag='pw')
query.append(Dict, with_incoming='pw', project=["attributes.wfc_cutoff"])
query.limit(5)
query.all()
```
```[[816.341503518],
[816.341503518],
[816.341503518],
[816.341503518],
[816.341503518]]
```

Note that if any `Dict` node does not have this attribute, the `QueryBuilder` will return `None` for it. Similar to the `attributes`, nodes can also have `extras`, which work in the same way as `attributes`, except that `extras` are mutable, which means that their value can be changed even after a node instance has been stored.
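
The dotted projection key and the `None` fallback can be pictured with a small stand-alone sketch. The helper `project_key` and the two records are made up for illustration. They mimic the behaviour described above on plain dictionaries and are not AiiDA code.

```python
def project_key(record, dotted_key):
    """Resolve a dotted key such as 'attributes.wfc_cutoff'; return None if a level is missing."""
    current = record
    for part in dotted_key.split('.'):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# Made-up records, standing in for two `Dict` nodes (values are placeholders).
with_cutoff = {'id': 1, 'attributes': {'wfc_cutoff': 30.0}}
without_cutoff = {'id': 2, 'attributes': {'some_other_key': 1.5}}

print(project_key(with_cutoff, 'attributes.wfc_cutoff'))
print(project_key(without_cutoff, 'attributes.wfc_cutoff'))
print(project_key(with_cutoff, 'id'))
```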

If you are not sure which attributes a given node has, you can use the `attributes` property of the node to simply retrieve them all. It will return a dictionary with all the attributes the node has.

Note that a node also has a number of additional methods and attributes. For instance, you can do `node.attributes_keys()` to get only the attribute keys or `node.get_attribute('wfc_cutoff')` to get the value of a single attribute (these two variants are more efficient if the node has a lot of attributes and you don’t need all data). Similarly, for `extras`, you have `node.extras`, `node.extras_keys()`, and `node.get_extra('SOME_EXTRA_KEY')`.

```query = QueryBuilder()
query.append(PwCalculation)
node, = query.first()
print('Attributes dictionary:', node.attributes)
print('Extras dictionary:', node.extras)
```
```Attributes dictionary: {'job_id': '462206', 'sealed': True, 'resources': {'num_machines': 1, 'num_mpiprocs_per_machine': 8, 'default_mpiprocs_per_machine': 8}, 'exit_status': 0, 'parser_name': 'quantumespresso.basicpw', 'last_jobinfo': '{"job_id": "462206", "wallclock_time_seconds": 374, "title": "aiida-41078", "num_machines": 1, "job_state": "RUNNING", "queue_name": "normal", "num_mpiprocs": 8, "allocated_machines_raw": "nid00373", "submission_time": {"date": "2014-10-28T20:07:12.000000", "timezone": null}, "job_owner": "mounet", "detailedJobinfo": "Detailed jobinfo obtained with command \'sacct --format=AllocCPUS,Account,AssocID,AveCPU,AvePages,AveRSS,AveVMSize,Cluster,Comment,CPUTime,CPUTimeRAW,DerivedExitCode,Elapsed,Eligible,End,ExitCode,GID,Group,JobID,JobName,MaxRSS,MaxRSSNode,MaxRSSTask,MaxVMSize,MaxVMSizeNode,MaxVMSizeTask,MinCPU,MinCPUNode,MinCPUTask,NCPUS,NNodes,NodeList,NTasks,Priority,Partition,QOSRAW,ReqCPUS,Reserved,ResvCPU,ResvCPURAW,Start,State,Submit,Suspended,SystemCPU,Timelimit,TotalCPU,UID,User,UserCPU --parsable --jobs=462206\'\\nReturn Code: 
0\\n-------------------------------------------------------------\\nstdout:\\nAllocCPUS|Account|AssocID|AveCPU|AvePages|AveRSS|AveVMSize|Cluster|Comment|CPUTime|CPUTimeRAW|DerivedExitCode|Elapsed|Eligible|End|ExitCode|GID|Group|JobID|JobName|MaxRSS|MaxRSSNode|MaxRSSTask|MaxVMSize|MaxVMSizeNode|MaxVMSizeTask|MinCPU|MinCPUNode|MinCPUTask|NCPUS|NNodes|NodeList|NTasks|Priority|Partition|QOSRAW|ReqCPUS|Reserved|ResvCPU|ResvCPURAW|Start|State|Submit|Suspended|SystemCPU|Timelimit|TotalCPU|UID|User|UserCPU|\\n8|ch3|1134|||||daint||00:50:00|3000|0:0|00:06:15|20:06:59|20:13:27|0:0|31143|ch3|462206|aiida-41078||||||||||8|1|nid00373||36531|normal|1|8|00:00:13|00:01:44|104|20:07:12|COMPLETED|20:06:59|00:00:00|00:00.184|02:00:00|00:01.640|22892|mounet|00:01.456|\\n1|ch3|1134|00:00:00|0|8528K|77072K|daint||00:06:15|375||00:06:15|20:07:12|20:13:27|0:0|||462206.batch|batch|8528K|nid00373|0|77072K|nid00373|0|00:00:00|nid00373|0|1|1|nid00373|1||||1|INVALID|INVALID||20:07:12|COMPLETED|20:07:12|00:00:00|00:00.184||00:01.640|||00:01.456|\\n\\nstderr:\\n\\n", "raw_data": ["462206", "R", "None", "daint01", "mounet", "1", "8", "nid00373", "normal", "2:00:00", "6:14", "2014-10-28T20:07:12", "aiida-41078"], "annotation": "None", "requested_wallclock_time_seconds": 7200}', 'process_label': 'PwCalculation', 'process_state': 'finished', 'retrieve_list': ['aiida.out', './out/aiida.save/data-file.xml', '_scheduler-stdout.txt', '_scheduler-stderr.txt'], 'remote_workdir': '/scratch/daint/mounet/aiida_run/8b/cb/b19a-4cd0-44d2-8eba-f86901618b2e', 'scheduler_state': 'DONE', 'linkname_retrieved': 'retrieved', 'max_wallclock_seconds': 7200, 'scheduler_lastchecktime': '2014-10-28T19:14:21.439271+00:00', 'retrieve_singlefile_list': [], 'custom_scheduler_commands': '#SBATCH --account=ch3'}
Extras dictionary: {'A': 'Pb', 'B': 'Hf'}
```

The chemical element symbol of a pseudopotential represented by a `UpfData` node is stored in the `element` attribute.

### Exercise¶

Using the knowledge on how filtering on `attributes` works, see if you can write a query that will search your database for pseudopotentials for silicon.

```query = QueryBuilder()
```

## Generating a provenance graph¶

Previously we have used `verdi graph generate` on the command-line, to generate a graph of the data provenance. To do this, AiiDA uses some of the queries you have learned about above. We can also visualise sections of the provenance in a more customisable way, using the `Graph` class.

For example, let's query for a calculation, then use methods of the `Graph` class to visualise the inputs and outputs of this calculation:

```query = QueryBuilder()
query.append(PwCalculation)
node, = query.first()

from aiida.tools.visualization import Graph
graph = Graph(graph_attr={"rankdir": "LR"})

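# Add the direct inputs and outputs of the calculation to the graph
graph.add_incoming(node.uuid)
graph.add_outgoing(node.uuid)
graph.graphviz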
```

The `Graph` class also has methods for recursing up or down the provenance tree. In this example, let’s query for a pseudopotential, and visualise which processes it is used in:

```query = QueryBuilder()
query.append(UpfData, filters={'attributes.element': {'==': 'Si'}})
node, = query.first()

graph = Graph(graph_attr={"rankdir": "LR"})

graph.recurse_descendants(
    node.uuid,
    depth=1
)
graph.graphviz
```

For further information on using `Graph` to generate provenance graphs, please see this section in the documentation.

## A small high-throughput study¶

The following section assumes that a specific dataset is present in your AiiDA database. If you are not running this script on the virtual machine of the AiiDA tutorial, this script will not produce the desired output. You can download the virtual machine image from aiida-tutorials.readthedocs.io along with the tutorial text (choose the correct version of the tutorial, depending on which version of AiiDA you want to try).

Important

This section relies on a specific dataset of previously run calculations. If you have already imported the data set from the “Organising your data” module, you should be good to go! If not, you can use the following command to import the required calculations:

```verdi archive import https://object.cscs.ch/v1/AUTH_b1d80408b3d340db9f03d373bbde5c1e/marvel-vms/tutorials/aiida_tutorial_2020_07_perovskites_v0.9.aiida
```

In this part of the tutorial, we will focus on how to systematically retrieve, parse and analyze the results of multiple calculations using AiiDA. While you may be able to do this on your own, to save time a set of calculations has already been run with AiiDA for you on 57 perovskites, using three different pseudopotential families (LDA, PBE and PBESOL, all from GBRV 1.2).

These calculations are spin-polarized (without spin-orbit coupling), use a Gaussian smearing and perform a variable-cell relaxation of the full unit cell. The goal of this part of the tutorial is to have you utilize what you have learnt in the previous sections and “screen” for magnetic and metallic perovskites in a “high-throughput” way. As you learned previously in the tutorial, AiiDA allows you to organize calculations into groups. Check the list of groups in your database once more by typing:

```!verdi group list -a -A
```
```  PK  Label            Type string    User
----  ---------------  -------------  -----------------
   1  tutorial_pbesol  core           aiida@localhost
   2  tutorial_lda     core           aiida@localhost
   3  tutorial_pbe     core           aiida@localhost
   4  GBRV_lda         core.upf       aiida@localhost
   5  GBRV_pbe         core.upf       aiida@localhost
   6  GBRV_pbesol      core.upf       aiida@localhost
   7  20210706-063845  core.import    aiidateam@epfl.ch
```

The calculations needed for this task were put into three different groups whose labels start with `'tutorial'` (one for each pseudopotential family). The main task is to make a plot showing, for all perovskites and for each pseudopotential family, the total magnetization and the $-TS$ contribution from the smearing to the total energy.

### Start building the query¶

First you should instantiate a `QueryBuilder` instance. To this, you can `append` the groups of interest, which means that you should select only groups that start with the string `tutorial_`. The query can be executed after this `append` (this will not affect the final results) to check whether 3 groups are retrieved.

```query = QueryBuilder()
query.append(
    Group,
    filters={
        'label': {'like': 'tutorial_%'}
    },
    project='label',
    tag='group'
)
# Visualize:
print("Groups:", ', '.join([g for g, in query.all()]))
generate_query_graph(query.get_json_compatible_queryhelp(), 'query3.png')
Image(filename='query3.png')
```
```Groups: tutorial_lda, tutorial_pbe, tutorial_pbesol
```

Important

Most of the code cells below are incomplete, and need to be completed as an exercise. Look for the comments for more instructions.

### Append the calculations that are members of each group¶

Try to complete the incomplete lines below:

```# Retrieve every PwCalculation that is a member of the specified groups:
query.append(PwCalculation, tag='calculation', with_group=) # Complete the function call with the correct relationship-tag!
# Visualize:
generate_query_graph(query.get_json_compatible_queryhelp(), 'query4.png')
Image(filename='query4.png')
```

### Append the structures that are inputs to the calculation¶

Furthermore, we want to retrieve the crystal structures used as inputs for the calculations. This can be done by appending `StructureData` and defining its relationship with the calculations through an appropriate relationship keyword, in this case `with_outgoing`.

For simplicity the formulas have been added in the `extras` of each crystal structure node under the key `formula`. (The function that does this is called `store_formula_in_extra` and can be found in the first cell of this notebook.)

Try to finish the code block below to project the formula, stored in the `extras` under the key `formula`.

```query.append(StructureData, project=, tag='structure', with_outgoing=) # Complete the function call with the correct relationship-tag!
# Visualize:
generate_query_graph(query.get_json_compatible_queryhelp(), 'query5.png')
Image(filename='query5.png')
```

### Append the output of the calculation¶

Every successful `PwCalculation` outputs a `Dict` node that stores the parsed results as key/value-pairs. You can find these pairs among the attributes of the `Dict` node. To facilitate querying, the parser takes care of always storing the values in the same units. For convenience, the units are also added as key/value-pairs (with the same key name, but with `_units` appended). Extend the query so that also the output `Dict` of each calculation is returned. Project only the attributes relevant to your analysis.

In particular, project (in this order):

• The smearing contribution;

• The units of the smearing contribution;

• The magnetization; and

• The units of the magnetization.

(To know the projection keys, you can try to load one `CalcJobNode` from one of the groups, get its output `Dict` and inspect its `attributes` as discussed before, to see the key/value-pairs that have been parsed.)

```query.append(Dict, tag='results', project=['attributes.energy_smearing', ...], with_incoming=) # Complete the function call with the correct relationship-tag!
# Visualize:
generate_query_graph(query.get_json_compatible_queryhelp(), 'query6.png')
Image(filename='query6.png')
```

### Plot the results¶

Getting a long list is not always helpful, and a graph can be much clearer and more useful. To help you, we have already prepared a function that visualizes the results of the query. Run the following cell and you should get a graph with the results of your queries.

```results = query.all()  # Run the completed query and collect all rows
plot_results(results)
```
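
If you would rather screen the numbers directly than read them off a plot, the rows can be regrouped with a few lines of plain Python. The sketch below is not part of the tutorial code: `collect` and `screen` are hypothetical helpers, the rows are made-up placeholders in the projection order used above (group label, formula, smearing contribution, its units, magnetization, its units), and the tolerance is an arbitrary choice. It assumes that a smearing contribution noticeably different from zero hints at metallic character, and a total magnetization noticeably different from zero at magnetic character.

```python
# Made-up placeholder rows; real rows would come from `query.all()`.
rows = [
    ['tutorial_lda', 'formula_A', 0.0, 'unit', 0.0, 'unit'],
    ['tutorial_pbe', 'formula_A', 0.0, 'unit', 0.0, 'unit'],
    ['tutorial_lda', 'formula_B', -0.02, 'unit', 1.5, 'unit'],
    ['tutorial_pbe', 'formula_B', -0.03, 'unit', 0.0, 'unit'],
]


def collect(rows):
    """Nest the rows as results[formula][group_label] = (smearing, magnetization)."""
    results = {}
    for label, formula, smearing, _smearing_units, magnetization, _mag_units in rows:
        results.setdefault(formula, {})[label] = (smearing, magnetization)
    return results


def screen(results, tolerance=1e-3):
    """Return (magnetic, metallic) formula lists; the tolerance is an arbitrary assumption."""
    magnetic, metallic = [], []
    for formula in sorted(results):
        values = results[formula].values()
        if any(abs(magnetization) > tolerance for _, magnetization in values):
            magnetic.append(formula)
        if any(abs(smearing) > tolerance for smearing, _ in values):
            metallic.append(formula)
    return magnetic, metallic


results_by_formula = collect(rows)
magnetic, metallic = screen(results_by_formula)
print('magnetic candidates:', magnetic)
print('metallic candidates:', metallic)
```

A formula is flagged as soon as any one pseudopotential family exceeds the tolerance. Requiring agreement between all families would be an equally reasonable choice.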