Tutorial: CrossRegistry

The CrossRegistry allows you to conveniently get, aggregate, and label data stored on the CROSS data platform.

Packages and data

In [1]:

# To manage your .env file, you can use the python-dotenv package.
# Install it with pip if you haven't already: pip install python-dotenv
from dotenv import load_dotenv
import os

# Import the CrossRegistry class from the crosscontract package
from crosscontract import CrossRegistry

Creating the CrossRegistry

To create the registry, you simply provide your username and password. Here we assume that your credentials are stored in a .env file and we extract them from there.

Note: Do not commit your credentials to GitHub!

In [2]:

# load the environment variables from the .env file
load_dotenv(".env")
username = os.getenv("CROSSUSER")

# create the registry using the environment variables
my_registry = CrossRegistry(
    username=os.getenv("CROSSUSER"), 
    password=os.getenv("PASSWORD")
)
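
Note that os.getenv returns None when a variable is not set, so a typo in the .env file only shows up later as a failed login. The following is a minimal sketch of a guard you could put in front of the registry creation. The helper name require_env and the placeholder value are assumptions made for illustration; they are not part of crosscontract.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name` or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is missing or empty")
    return value


# Placeholder value for this demo only; in the tutorial the values
# come from load_dotenv(".env").
os.environ["CROSSUSER"] = "demo-user"
username = require_env("CROSSUSER")
print(username)
```

With such a helper, the credentials can be passed as username=require_env("CROSSUSER") and password=require_env("PASSWORD"), so a missing entry fails immediately with a readable message.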

Getting a variable

To get a variable, you need to know the name of the contract. To get an overview of your available contracts, you can use the contract_overview property.

In [3]:

my_registry.contract_overview.query("name.str.startswith('result_')")

Out[3]:

	name	title	description
7	result_electricity_consumption	Result submission - Electricity consumption	Electricity consumption as submitted from scen...
14	result_electricity_supply	Result submission - Electricity supply	Electricity supply as submitted from scenario ...
15	result_h2_fec	Result submission - Hydrogen final energy cons...	Hydrogen final energy consumption as submitted...
19	result_h2_supply	Result submission - Hydrogen supply	Hydrogen supply as submitted from scenario runs
24	result_methane_consumption	Result submission - Methane final energy consu...	Methane final energy consumption as submitted ...
27	result_methane_supply	Result submission - Methane supply	Methane supply as submitted from scenario runs
28	result_liquids_consumption	Result submission - Liquid fuels final energy ...	Liquid fuels final energy consumption as submi...
31	result_liquids_supply	Result submission - Liquid fuels supply	Liquid fuels supply as submitted from scenario...
32	result_process_heat_energy_production	Result submission - Process heat production	Useful energy production of process heat as su...
35	result_space_heat_energy_supply	Result submission - Space Heat supply	Useful energy supply of space heat as submitte...
37	result_district_heat_energy_production	Result submission - District Heat production	Useful energy production of distric heat as su...
38	result_passenger_road_private_fec	Result submission - Passenger road private tra...	Final energy consumption of passenger road pri...
39	result_passenger_road_public_fec	Result submission - Passenger road public tran...	Final energy consumption of passenger road pub...
40	result_freight_road_fec	Result submission - Freight road transport fin...	Final energy consumption of freight transport ...
41	result_storage_installed_volume	Result submission - Installed storage volume	Installed storage size
43	result_storage_output	Result submission - Storage output	Installed storage size
44	result_elec_cons_typical_day	Result sumission - Electricity consumption	Electricity consumption as submitted from scen...
46	result_elec_supply_typical_day	Result submission - Electricity supply	Electricity supply as submitted from scenario ...

Given the name, you can add the variable to the registry explicitly or simply use dot notation. With dot notation, the registry adds the variable automatically.

In [4]:

res_elec_supply = my_registry.result_electricity_supply
res_elec_supply

Out[4]:

CrossDataVariable(name=result_electricity_supply, filters=None)

As you can see, we can provide a filter for the data. This is available if you use the add_variable() method and filters the data already on the platform side, before it is delivered.

To add the variable again, we need to use overwrite=True as it is already in the registry. Here we filter to only get data for the scenario abroad-res-full. Note that this will affect all later usage, as the filter is general, i.e., applied when the registry fetches the data from the CROSS platform.
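
To make the interplay of dot notation and overwrite=True more tangible, here is a toy sketch of a registry with the behaviour described above. It is not the crosscontract implementation: the class ToyRegistry, its internals, and the assumption that re-adding without overwrite=True raises an error are all made up for illustration.

```python
class ToyRegistry:
    """Toy stand-in for illustration only; not the crosscontract implementation."""

    def __init__(self, available):
        self._available = set(available)  # contract names the platform would offer
        self._variables = {}              # registered variables by name

    def add_variable(self, name, filters=None, overwrite=False):
        if name not in self._available:
            raise KeyError(f"Unknown contract: {name}")
        if name in self._variables and not overwrite:
            # Assumption: re-adding an existing variable needs overwrite=True
            raise ValueError(f"{name} is already registered; pass overwrite=True")
        self._variables[name] = {"name": name, "filters": filters}
        return self._variables[name]

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self._variables:
            return self._variables[name]
        try:
            # Dot notation adds the variable on first access
            return self.add_variable(name)
        except KeyError as exc:
            raise AttributeError(name) from exc


registry = ToyRegistry(["result_electricity_supply"])
first = registry.result_electricity_supply  # added automatically, no filters
second = registry.add_variable(
    "result_electricity_supply",
    filters={"scenario_name": "abroad-res-full"},
    overwrite=True,
)
print(registry.result_electricity_supply)
```

In this sketch, every later access through the registry sees the filtered variable, which mirrors the remark that the filter affects all later usage.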

Note: Currently, server-side filtering is rather restricted:

Only one value per field is allowed
Only string columns can be filtered

In [5]:

res_elec_supply = my_registry.add_variable(
    "result_electricity_supply", 
    filters={"scenario_name": "abroad-res-full"}, 
    overwrite=True
)
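
Since server-side filters are restricted, it can help to check a filter dictionary locally before passing it to add_variable. The following is a rough sketch of such a check. The function check_server_filters is hypothetical, and it approximates "string column" by testing whether the filter value itself is a string.

```python
def check_server_filters(filters: dict) -> list[str]:
    """Return a list of problems; an empty list means the filters look fine."""
    problems = []
    for column, value in filters.items():
        if isinstance(value, (list, tuple, set)):
            problems.append(f"{column}: only one value per field is allowed")
        elif not isinstance(value, str):
            # Approximation: a non-string value hints at a non-string column
            problems.append(f"{column}: only string columns can be filtered")
    return problems


ok = check_server_filters({"scenario_name": "abroad-res-full"})
bad = check_server_filters({"year": 2050, "scenario_name": ["a", "b"]})
print(ok)
print(bad)
```

Filters that do not pass such a check, for example on the year, can still be applied later on the client side with get_data.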

Accessing data

Now that you have the variable, you can access the data through its data attribute. It returns the data stored on the platform as a pandas dataframe (with the filter already applied).

In [6]:

res_elec_supply.data.head()

Out[6]:

	model	scenario_group	scenario_name	scenario_variant	technology	country	year	unit
0	powercheck	cross202506	abroad-res-full	reference	methane_chp_woccs	CH	2040	TWh
1	powercheck	cross202506	abroad-res-full	reference	methane_chp_woccs	CH	2050	TWh
2	powercheck	cross202506	abroad-res-full	reference	methane_chp_ccs	CH	2040	TWh
3	powercheck	cross202506	abroad-res-full	reference	methane_chp_ccs	CH	2050	TWh
4	powercheck	cross202506	abroad-res-full	reference	methane_oc_woccs	CH	2040	TWh

While the .data property provides access to the full dataset, the get_data method allows you to specify additional filters, aggregate the data, and label items based on the information in the contract (and the references to the Cross Dimensions).

Filtering is based on a dictionary with the key being the name of the column and the value a list of the allowed values.
Aggregation is also dictionary-based. The key is the name of the column over which to aggregate and the value is an integer specifying the aggregation level. 0 is the highest aggregation level, i.e., the level with the least detail.
Labeling is based on the use_titles parameter. If set to True, all columns are relabelled automatically.
Columns allows you to narrow the list of columns in the returned dataframe. Note that filtering does not drop any columns. The column selection is always applied at the very end of the transformation.
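
The filter and column semantics described above can be sketched in plain Python. This is only an illustration of the described behaviour, not the code used by get_data; the function filter_rows and the sample rows (including their values) are made up.

```python
def filter_rows(rows, filters, columns=None):
    """Keep rows whose value is in the allowed list for every filtered column."""
    kept = [
        row for row in rows
        if all(row[col] in allowed for col, allowed in filters.items())
    ]
    # Column selection happens last, so filters may use columns
    # that are not part of the final result.
    if columns is not None:
        kept = [{col: row[col] for col in columns} for row in kept]
    return kept


# Hypothetical sample rows; the values are invented for this demo
rows = [
    {"model": "A", "year": 2040, "scenario_variant": "reference", "value": 1.0},
    {"model": "A", "year": 2050, "scenario_variant": "reference", "value": 2.0},
    {"model": "B", "year": 2050, "scenario_variant": "other", "value": 3.0},
]

result = filter_rows(
    rows,
    filters={"year": [2050], "scenario_variant": ["reference"]},
    columns=["model", "value"],
)
print(result)
```

On a pandas dataframe, the same row selection can be written with DataFrame.isin, one column at a time.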

In [7]:

res_elec_supply.get_data(
    filters={"year": [2050], "scenario_variant": ["reference"]},
    aggregation={"technology": 0},
    use_titles=True,
    columns=["model", "technology", "value"]
).pivot_table(index="model", columns="technology", values="value").round(1)

Out[7]:

technology	Electricity storage	Electrochemical	Imports of electricity	Renewables	Thermal power plants
model
EHUB	22.6	4.7	15.0	79.9	1.2
PowerCheck	6.0	0.2	10.7	64.9	3.8
SES	0.0	NaN	18.0	220.2	5.2
SES-ETH	7.0	0.0	16.4	78.5	5.0
STEM	6.1	0.2	13.6	79.9	6.2
SecMOD	15.9	0.0	47.1	80.0	4.7

Aggregation

Aggregation is more flexible than only using one aggregation level. In principle, there are three ways to aggregate:

Provide a single aggregation level (as above): aggregation={"technology": 0}
Aggregate to given set of identifiers: E.g. aggregation={"technology": ["renewable", "thermal"]}
Aggregate everything to a given level except some identifiers that should be kept: {"technology": {"level": 0, "keep": ["hydro_dam", "hydro_run"]}}

Note that the list of identifiers has to include the original identifiers and not the labels or titles of the column items as they appear after use_titles=True.
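
The three options can be illustrated with a small stand-alone sketch that maps each identifier to the identifier it is aggregated into and then sums the values. It is not the crosscontract implementation. The hierarchy, its levels, the identifiers solar and import, and the numbers are hypothetical, and the sketch assumes that sub-categories of a kept identifier are rolled up into that identifier.

```python
from collections import defaultdict

# Hypothetical hierarchy: id -> (level, parent id or None)
HIERARCHY = {
    "renewable": (0, None),
    "hydro_dam": (1, "renewable"),
    "hydro_run": (1, "renewable"),
    "solar": (1, "renewable"),
    "thermal": (0, None),
    "wood_pp": (1, "thermal"),
    "wood_chp": (2, "wood_pp"),
    "import": (0, None),
}


def resolve(identifier, spec):
    """Return the identifier that `identifier` is aggregated into."""
    if isinstance(spec, int):            # option 1: a single level
        level, keep, targets = spec, set(), None
    elif isinstance(spec, dict):         # option 3: level plus identifiers to keep
        level, keep, targets = spec["level"], set(spec.get("keep", [])), None
    else:                                # option 2: a set of target identifiers
        level, keep, targets = None, set(), set(spec)

    current = identifier
    while current is not None:           # walk up towards the root
        current_level, parent = HIERARCHY[current]
        if current in keep:
            return current
        if targets is not None and current in targets:
            return current
        if level is not None and current_level <= level:
            return current
        current = parent
    return identifier                    # no matching ancestor: leave untouched


def aggregate(values, spec):
    totals = defaultdict(float)
    for identifier, value in values.items():
        totals[resolve(identifier, spec)] += value
    return dict(totals)


values = {"hydro_dam": 18.0, "hydro_run": 17.0, "solar": 30.0,
          "wood_chp": 2.0, "import": 10.0}

by_level = aggregate(values, 0)
by_ids = aggregate(values, ["thermal"])
by_keep = aggregate(values, {"level": 0, "keep": ["hydro_dam", "hydro_run"]})
print(by_level, by_ids, by_keep, sep="\n")
```

In the third case, the kept hydro identifiers are split off and the remaining renewable total shrinks accordingly.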

For aggregation to a given set of identifiers, consider the example aggregation={"technology": ["renewable", "thermal"]}. This aggregates all sub-categories of renewable and thermal but leaves the remaining items untouched:

In [8]:

res_elec_supply.get_data(
    filters={"year": [2050], "scenario_variant": ["reference"]},
    aggregation={"technology": ["renewable", "thermal"]},
    use_titles=True,
    columns=["model", "technology", "value"]
).pivot_table(index="model", columns="technology", values="value").round(1)

Out[8]:

technology	Discharge of batteries	Discharge of pumped hydro storage	Electricity storage	Fuel cell using hydrogen	Fuel cell using methane	Imports of electricity	Renewables	Thermal power plants
model
EHUB	18.0	4.6	NaN	4.7	0.0	15.0	79.9	1.2
PowerCheck	3.8	2.2	NaN	0.2	0.0	10.7	64.9	3.8
SES	0.0	0.0	0.0	NaN	NaN	18.0	220.2	5.2
SES-ETH	1.6	5.4	NaN	0.0	0.0	16.4	78.5	5.0
STEM	2.3	3.8	NaN	0.2	0.0	13.6	79.9	6.2
SecMOD	8.4	7.5	NaN	0.0	0.0	47.1	80.0	4.7

Now suppose you want to aggregate everything to level 0 but keep the hydro technologies disaggregated: {"technology": {"level": 0, "keep": ["hydro_dam", "hydro_run"]}}

In [9]:

res_elec_supply.get_data(
    filters={"year": [2050], "scenario_variant": ["reference"]},
    aggregation={"technology": {"level": 0, "keep": ["hydro_dam", "hydro_run"]}},
    use_titles=True,
    columns=["model", "technology", "value"]
).pivot_table(index="model", columns="technology", values="value").round(1)

Out[9]:

technology	Electricity storage	Electrochemical	Hydro Dams	Imports of electricity	Renewables	Thermal power plants
model
EHUB	22.6	4.7	18.4	15.0	61.5	1.2
PowerCheck	6.0	0.2	18.1	10.7	46.8	3.8
SES	0.0	NaN	20.0	18.0	200.2	5.2
SES-ETH	7.0	0.0	19.5	16.4	59.0	5.0
STEM	6.1	0.2	20.8	13.6	59.1	6.2
SecMOD	15.9	0.0	16.4	47.1	63.6	4.7

Examine dimensions

To use the flexible aggregation, you must know the identifiers and the hierarchy within the dimensions. One way is to look them up on the CROSS webpage.

Alternatively, you can inspect the dimension associated with a column from the given variable:

In [10]:

(
    res_elec_supply
    .dimensions["technology"]
    .data
    [["id", "level", "id_parent"]]
    .pivot(index="id", values="id_parent", columns="level")
    .sort_index()
    .fillna("")
)

Out[10]:

level	0	1	2	3
id
battery_out		storage_elec
coal_cc			coal_pp
coal_cc_ccs				coal_cc
coal_cc_woccs				coal_cc
coal_chp			coal_pp
...	...	...	...	...
wood_cc_woccs				wood_cc
wood_chp			wood_pp
wood_chp_ccs				wood_chp
wood_chp_woccs				wood_chp
wood_pp		thermal

69 rows × 4 columns
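
When choosing identifiers for the aggregation, it is often useful to list everything that sits below a given identifier. The sketch below does this for the id and id_parent pairs visible in the output above. It only uses the rows shown in that excerpt, so the result is incomplete compared to the full dimension, and the helper descendants is an illustrative assumption rather than a crosscontract function.

```python
from collections import defaultdict

# (id, id_parent) pairs copied from the rows visible in the output above
PAIRS = [
    ("battery_out", "storage_elec"),
    ("coal_cc", "coal_pp"),
    ("coal_cc_ccs", "coal_cc"),
    ("coal_cc_woccs", "coal_cc"),
    ("coal_chp", "coal_pp"),
    ("wood_cc_woccs", "wood_cc"),
    ("wood_chp", "wood_pp"),
    ("wood_chp_ccs", "wood_chp"),
    ("wood_chp_woccs", "wood_chp"),
    ("wood_pp", "thermal"),
]

# Invert the child -> parent relation into parent -> children
children = defaultdict(list)
for child, parent in PAIRS:
    children[parent].append(child)


def descendants(identifier):
    """Return all identifiers below `identifier`, sorted alphabetically."""
    found = []
    stack = [identifier]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            found.append(child)
            stack.append(child)
    return sorted(found)


print(descendants("coal_pp"))
print(descendants("thermal"))
```

With the full dimension, the pairs could presumably be taken from the id and id_parent columns of the dataframe shown above instead of being typed in by hand.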
