OpenClimate Python Client Documentation

OpenClimate is a datastore for emissions data and target pledges. The OpenClimate Python Client is a Python 3.8+ package that provides a high-level interface to the OpenClimate API.

The goal of this package is to make it easier to focus on the analysis by abstracting away the details of making HTTP requests and handling responses.

Note

This is a work in progress. We strongly encourage you to open issues and contribute code.

Installation

# for latest release
pip install openclimate

# for bleeding-edge up-to-date commit
pip install -e git+https://github.com/Open-Earth-Foundation/OpenClimate-pyclient.git

Quickstart Guide

Once installed, import the package and create a Client() object.

from openclimate import Client
client = Client()

# if using Jupyter or IPython
client.jupyter

Note

You need to run client.jupyter for the client package to work properly in Jupyter or IPython.

Emissions

Retrieve all emissions data for a single actor. Here I am retrieving emissions data for Canada.

df = client.emissions(actor_id='CA')

Retrieve all emissions data for a list of actors. Here I am retrieving emissions data for the United States, Canada, and the United Kingdom (GB).

df = client.emissions(actor_id=['US','CA','GB'])

Return the different datasets available for a particular actor:

df = client.emissions_datasets(actor_id='US')

Select data from a particular dataset only:

df = client.emissions(actor_id='US', datasource_id='GCB2022:national_fossil_emissions:v1.0')

Targets

Retrieve emissions targets for a particular actor:

df = client.targets(actor_id='US')

Population

Retrieve population data.

df = client.population(actor_id=['US','CA','GB'])

GDP

Retrieve GDP data.

df = client.gdp(actor_id=['US','CA','GB'])

Searching for codes

Use the following to list the actor_ids for countries:

df = client.country_codes()
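
The API reference lists three optional arguments for country_codes(): like, case_sensitive and regex. As a rough illustration of what such a name filter does, the sketch below applies the same three options to a small hard-coded list using Python's re module. The helper filter_names and the sample list are hypothetical stand-ins and are not the package's implementation.

```python
import re

# Illustrative sample; client.country_codes() returns the real list as a DataFrame.
COUNTRIES = [
    ("CA", "Canada"),
    ("GB", "United Kingdom"),
    ("US", "United States of America"),
]


def filter_names(rows, like=None, case_sensitive=False, regex=True):
    """Keep the (code, name) pairs whose name matches `like` (hypothetical helper)."""
    if like is None:
        return list(rows)
    # With regex=False the phrase is matched literally.
    pattern = like if regex else re.escape(like)
    flags = 0 if case_sensitive else re.IGNORECASE
    return [(code, name) for code, name in rows if re.search(pattern, name, flags)]


matches = filter_names(COUNTRIES, like="^united")
print(matches)
```

With the defaults, the phrase is treated as a case-insensitive regular expression, so "^united" keeps only names that start with "United".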

Search for actor codes:

df = client.search(query='Minnesota')
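
According to the API reference, query performs a full search of names and identifiers that include the search term, while name and identifier require an exact match. The sketch below mimics that distinction on an in-memory list. The records, the EXAMPLE-1 identifier, and the case-insensitive matching are assumptions made for illustration; the real search runs on the OpenClimate server.

```python
# Hypothetical records; client.search() queries the OpenClimate API instead.
ACTORS = [
    {"actor_id": "US-MN", "name": "Minnesota"},
    {"actor_id": "EXAMPLE-1", "name": "University of Minnesota"},
    {"actor_id": "US", "name": "United States of America"},
]


def search(records, name=None, identifier=None, query=None):
    """Mimic the three search modes on an in-memory list (illustrative only)."""
    if name is not None:
        # exact name match
        return [r for r in records if r["name"] == name]
    if identifier is not None:
        # exact identifier match
        return [r for r in records if r["actor_id"] == identifier]
    if query is not None:
        q = query.lower()  # assumption: matching ignores case
        return [
            r for r in records
            if q in r["name"].lower() or q in r["actor_id"].lower()
        ]
    return []


exact = search(ACTORS, name="Minnesota")
broad = search(ACTORS, query="Minnesota")
print(len(exact), len(broad))
```

The exact-name search returns a single record, whereas the broader query also picks up every name that merely contains the term.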

Get all the parts of an actor. Here I am returning the actor_id for each US state.

df = client.parts(actor_id='US', part_type='adm1')

Autogenerated API

class openclimate.ActorOverview.ActorOverview(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]

ActorOverview API class: get overview information for an actor

Returns:

object

Methods

country_codes([like, case_sensitive, regex])

returns two-letter country codes

overview(actor_id[, ignore_warnings])

Retrieve actor overview

parts(actor_id[, part_type])

Retrieve actor parts (e.g. subnational, cities, …)

country_codes(like: Optional[str] = None, case_sensitive: bool = False, regex: bool = True, *args, **kwargs) DataFrame[source]

returns two-letter country codes

Parameters:
  • like (str, optional) – filters names. Defaults to None.

  • case_sensitive (bool, optional) – make search case-sensitive. Defaults to False.

  • regex (bool, optional) – use regular expression like phrases. Defaults to True.

Returns:

pd.DataFrame

overview(actor_id: Union[str, List[str], Tuple[str]], ignore_warnings: bool = False)[source]

Retrieve actor overview

Parameters:
  • actor_id (Union[str, List[str], Tuple[str]]) – actor identifier.

  • ignore_warnings (bool) – ignore warning messages

Returns:

dictionary with actor overview

Return type:

List[Dict]

parts(actor_id: str, part_type: Optional[str] = None, *args, **kwargs) DataFrame[source]

Retrieve actor parts (e.g. subnational, cities, …)

Parameters:
  • actor_id (str) – code for the actor you want to retrieve

  • part_type (str, optional) – administrative level

Returns:

dataframe of actor parts

Return type:

DataFrame

class openclimate.Base.Base(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]

Base API class: defines HTTP access to the API

Returns:

object

base_url: str = 'https://openclimate.openearth.dev'
server: str = 'https://openclimate.openearth.dev/api/v1'
version: str = '/api/v1'
class openclimate.Client.Client(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]

OpenClimate API Python Client

If you are using Jupyter, either run

client = Client()
client.jupyter

or manually add the following lines of code:

import nest_asyncio
nest_asyncio.apply()

Attributes:
jupyter

Methods

country_codes([like, case_sensitive, regex])

get country codes and filter using like regex phrases

emissions(actor_id[, datasource_id, ...])

retrieve actor emissions

emissions_datasets(actor_id[, ignore_warnings])

retrieve actor emissions datasets

gdp(actor_id[, ignore_warnings])

retrieve actor GDP

parts(actor_id[, part_type])

retrieve actor parts

population(actor_id[, ignore_warnings])

retrieve actor population

search([name, identifier, query, language, ...])

search actor names and identifiers

targets(actor_id[, ignore_warnings])

retrieve actor targets

country_codes(like: Optional[str] = None, case_sensitive: bool = False, regex: bool = True, *args, **kwargs) DataFrame[source]

get country codes and filter using like regex phrases

Parameters:
  • like (str) – phrase to search for in name (optional)

  • case_sensitive (bool) – case-sensitive search [default: False] (optional)

  • regex (bool) – use regex with like [default: True] (optional)

Returns:

dataframe of country codes

Return type:

DataFrame

emissions(actor_id: str, datasource_id: Optional[str] = None, ignore_warnings: bool = False) DataFrame[source]

retrieve actor emissions

Parameters:
  • actor_id (str|List[str]) – code for the actor you want to retrieve

  • datasource_id (str) – code for the emissions dataset

  • ignore_warnings (bool) – ignore warning messages

Returns:

data for each emissions dataset

Return type:

DataFrame

emissions_datasets(actor_id: str, ignore_warnings: bool = False) DataFrame[source]

retrieve actor emissions datasets

Parameters:
  • actor_id (str) – code for the actor you want to retrieve

  • ignore_warnings (bool) – ignore warning messages

Returns:

data of emission datasets

Return type:

DataFrame

gdp(actor_id: str, ignore_warnings: bool = False) DataFrame[source]

retrieve actor GDP

Parameters:
  • actor_id (str|List[str]) – code for the actor you want to retrieve

  • ignore_warnings (bool) – ignore warning messages

Returns:

dataframe of GDP

Return type:

DataFrame

property jupyter
parts(actor_id: str, part_type: Optional[str] = None, *args, **kwargs) DataFrame[source]

retrieve actor parts

returns subnational regions, cities, companies, etc. within an actor_id

Parameters:
  • actor_id (str|List[str]) – code for the actor you want to retrieve

  • part_type (str) – retrieve actors from administrative part [‘planet’, ‘country’, ‘adm1’, ‘adm2’, ‘city’, ‘organization’, ‘site’]

Returns:

dataframe of actors parts

Return type:

DataFrame

population(actor_id: str, ignore_warnings: bool = False) DataFrame[source]

retrieve actor population

Parameters:
  • actor_id (str|List[str]) – code for the actor you want to retrieve

  • ignore_warnings (bool) – ignore warning messages

Returns:

dataframe of population

Return type:

DataFrame

search(name: Optional[str] = None, identifier: Optional[str] = None, query: Optional[str] = None, language: Optional[str] = None, namespace: Optional[str] = None, *args, **kwargs) DataFrame[source]

search actor names and identifiers

Parameters:
  • query (str) – full search of identifiers and names that include the search parameter

  • name (str) – searches for actors with exact name match (e.g. “Minnesota”)

  • language (str, optional) – two letter language code [requires name to be set]

  • identifier (str) – searches for actors with exact identifier code match (e.g. “US”)

  • namespace (str, optional) – actor namespace code [requires identifier to be set]

Returns:

dataframe of search results

Return type:

DataFrame

targets(actor_id: str, ignore_warnings: bool = False) DataFrame[source]

retrieve actor targets

Parameters:
  • actor_id (str|List[str]) – code for the actor you want to retrieve

  • ignore_warnings (bool) – ignore warning messages

Returns:

dataframe of targets

Return type:

DataFrame

class openclimate.Emissions.Emissions(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]

Methods

datasets(actor_id[, ignore_warnings])

retrieve emissions datasets for an actor

emissions(actor_id[, datasource_id, ...])

retrieve actor emissions

datasets(actor_id: Union[str, List[str], Tuple[str]], ignore_warnings: bool = False, *args, **kwargs) DataFrame[source]

retrieve emissions datasets for an actor

Parameters:
  • actor_id (Union[str, List[str], Tuple[str]], optional) – actor code

  • ignore_warnings (bool, optional) – ignore warnings messages

Return type:

pd.DataFrame

emissions(actor_id: Union[str, List[str], Tuple[str]], datasource_id: Optional[str] = None, ignore_warnings: bool = False, *args, **kwargs) DataFrame[source]

retrieve actor emissions

Parameters:
  • actor_id (Union[str, List[str], Tuple[str]], optional) – actor code

  • datasource_id (str, optional) – emissions datasource. Defaults to None.

  • ignore_warnings (bool, optional) – ignore warnings messages

Returns:

data for each emissions dataset

Return type:

pd.DataFrame

class openclimate.GDP.GDP(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]

Methods

gdp(actor_id[, ignore_warnings])

retrieve actor GDP

gdp(actor_id: Union[str, List[str], Tuple[str]], ignore_warnings: bool = False, *args, **kwargs) DataFrame[source]

retrieve actor GDP

Parameters:
  • actor_id (Union[str, List[str], Tuple[str]], optional) – actor code

  • ignore_warnings (bool) – ignore warning messages

Return type:

pd.DataFrame

class openclimate.Population.Population(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]

Methods

population(actor_id[, ignore_warnings])

retrieve actor population

population(actor_id: Union[str, List[str], Tuple[str]], ignore_warnings: bool = False, *args, **kwargs) DataFrame[source]

retrieve actor population

Parameters:
  • actor_id (Union[str, List[str], Tuple[str]], optional) – actor code

  • ignore_warnings (bool) – ignore warning messages

Return type:

pd.DataFrame

class openclimate.Search.Search(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]

Methods

search([name, identifier, query, language, ...])

search actors

search(name: Optional[str] = None, identifier: Optional[str] = None, query: Optional[str] = None, language: Optional[str] = None, namespace: Optional[str] = None, *args, **kwargs) DataFrame[source]

search actors

Parameters:
  • query (str) – full search of identifiers and names that include the search parameter

  • name (str) – searches for actors with exact name match (e.g. “Minnesota”)

  • language (str, optional) – two letter language code [requires name to be set]

  • identifier (str) – searches for actors with exact identifier code match (e.g. “US”)

  • namespace (str, optional) – actor namespace code [requires identifier to be set]

Returns:

dataframe with search results

Return type:

pd.DataFrame

class openclimate.Targets.Targets(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]

Methods

targets(actor_id[, ignore_warnings])

retrieve actor targets

targets(actor_id: Union[str, List[str], Tuple[str]], ignore_warnings: bool = False, *args, **kwargs) DataFrame[source]

retrieve actor targets

Parameters:
  • actor_id (Union[str, List[str], Tuple[str]], optional) – actor code

  • ignore_warnings (bool) – ignore warning messages

Return type:

pd.DataFrame

Emissions and Emissions per capita

In this tutorial I will use openclimate to create time series of emissions and emissions per capita for countries.

[1]:
from itertools import cycle
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
import numpy as np
import pandas as pd
[2]:
from openclimate import Client
client = Client()
client.jupyter

Let’s start by getting all the country codes.

[3]:
# fetch the parts of EARTH once and reuse the result
df_names = client.parts('EARTH')[['actor_id', 'name']]
actor_ids = tuple(df_names['actor_id'])

Emissions

Let’s use fossil CO2 emissions from the Global Carbon Budget 2022. You can use client.emissions_datasets() to list all available datasets. Be patient: retrieving the data for about 250 countries takes roughly 20 seconds.

[4]:
%%time
df_emissions = client.emissions(actor_ids, 'GCB2022:national_fossil_emissions:v1.0')
CPU times: user 5.49 s, sys: 245 ms, total: 5.73 s
Wall time: 20.5 s

This returns a dataframe with total_emissions in tonnes of CO2.

[5]:
df_emissions.sample(5)
[5]:
     actor_id  year  total_emissions  datasource_id
167        AW  1965           592387  GCB2022:national_fossil_emissions:v1.0
227        TM  1993         27516396  GCB2022:national_fossil_emissions:v1.0
89         AL  1981          7339621  GCB2022:national_fossil_emissions:v1.0
165        GB  1915        489481088  GCB2022:national_fossil_emissions:v1.0
211        RU  1977       1964405077  GCB2022:national_fossil_emissions:v1.0

Let’s first rank the countries by their emissions in the most recent year and display the top 10 emitters.

[6]:
year = df_emissions.year.max()
df_recent = (
    df_emissions
    .loc[df_emissions.year == year]
    .assign(rank = lambda x: x['total_emissions'].rank(ascending=False))
    .assign(percent_of_global = lambda x: (x['total_emissions'] / x['total_emissions'].sum()) * 100)
    .sort_values(by='rank')
    .merge(df_names, on='actor_id')
    .loc[:, ['rank', 'name', 'actor_id', 'year', 'total_emissions', 'percent_of_global']]
)
df_recent.head(10)
[6]:
   rank  name                      actor_id  year  total_emissions  percent_of_global
0   1.0  China                           CN  2021      11472369170          31.777308
1   2.0  United States of America        US  2021       5007335888          13.869816
2   3.0  India                           IN  2021       2709683624           7.505551
3   4.0  Russian Federation              RU  2021       1755547389           4.862690
4   5.0  Japan                           JP  2021       1067398435           2.956586
5   6.0  Iran                            IR  2021        748878751           2.074319
6   7.0  Germany                         DE  2021        674753565           1.868999
7   8.0  Saudi Arabia                    SA  2021        672379870           1.862425
8   9.0  Indonesia                       ID  2021        619277532           1.715336
9  10.0  Korea, the Republic of          KR  2021        616074996           1.706466

China was responsible for the largest share of global CO2 emissions in 2021, at nearly 32%. This is nearly as much as the next six countries combined (about 33%)! However, this is just a snapshot in time. Let’s plot time series for each of the top 7 emitters.
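
The comparison with the next six countries can be checked directly from the percent_of_global column of the table above. The short sketch below copies those seven values by hand and sums them; nothing here is fetched from the API.

```python
# percent_of_global values copied from the top-10 table above
shares = {
    "CN": 31.777308,
    "US": 13.869816,
    "IN": 7.505551,
    "RU": 4.862690,
    "JP": 2.956586,
    "IR": 2.074319,
    "DE": 1.868999,
}

china = shares["CN"]
# combined share of the six countries ranked directly below China
next_six = sum(value for code, value in shares.items() if code != "CN")

print(f"China: {china:.2f}%, next six combined: {next_six:.2f}%")
```

The combined share of the next six countries comes out slightly above China's own share, so the two are comparable but not identical.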

[7]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)

# top 7 emitters
top_emitters = list(df_recent.head(7).actor_id)

# wong color palette (https://davidmathlogic.com/colorblind/#%23D81B60-%231E88E5-%23FFC107-%23004D40)
colors = ['#000000', '#E69F00', '#56B4E9', '#009E73', '#F0E442', '#0072B2', '#D55E00', '#CC79A7']

for actor_id, color in zip(top_emitters, cycle(colors)):

    actor_name = df_names.loc[df_names['actor_id'] == actor_id, 'name'].values[0]
    filt = df_emissions['actor_id'] == actor_id
    df_tmp = df_emissions.loc[filt]

    ax.plot(np.array(df_tmp['year']), np.array(df_tmp['total_emissions']) / 10**9,
            linewidth=4,
            label = actor_name,
            color=color)

# axis formatting only needs to run once, after all lines are drawn
ylim = [0, 12]
ax.set_ylim(ylim)
ax.set_xlim([1850, 2022])

# Turn off the display of all ticks (booleans; the string 'off' is truthy).
ax.tick_params(which='both',     # Options for both major and minor ticks
               top=False,        # turn off top ticks
               left=False,       # turn off left ticks
               right=False,      # turn off right ticks
               bottom=False)     # turn off bottom ticks

# Keep x tick labels horizontal
plt.setp(ax.get_xticklabels(), rotation=0)

# Hide all four spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)

# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')

# major/minor tick lines
ax.xaxis.set_minor_locator(AutoMinorLocator(5))
ax.grid(axis='y',
        which='major',
        color=[0.8, 0.8, 0.8], linestyle='-')

ax.set_ylabel("Emissions (GtCO$_2$e)", fontsize=12)
ax.legend(loc='upper left', frameon=False)
[Figure: time series of annual emissions for the top 7 emitters]

This tells a richer story. Now we see the US was the main annual contributor up until about the year 2000. After that, Chinese emissions skyrocketed while US emissions started declining.

An interesting feature in this graph is the dramatic drop in Russian emissions, which coincides with the fall of the Soviet Union. According to Schierhorn et al. (2019), key drivers of the emissions reductions were decreasing beef consumption in the 1990s and carbon sequestration in soils on abandoned cropland.

Emissions per capita

Let’s retrieve population data and calculate emissions per capita for the seven countries with the highest annual emissions.
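
The calculation boils down to joining emissions and population on the (actor_id, year) pair and dividing one by the other. A minimal plain-Python sketch of that join, using made-up actors and numbers rather than OpenClimate data, looks like this:

```python
# Illustrative numbers only, not values from the OpenClimate API.
emissions = {
    ("AA", 2020): 1_000_000,
    ("AA", 2021): 1_200_000,
    ("BB", 2021): 500_000,
}
population = {
    ("AA", 2021): 400_000,
    ("BB", 2021): 1_000_000,
}

# Inner join on (actor_id, year): keep only keys present in both tables.
per_capita = {
    key: emissions[key] / population[key]
    for key in emissions.keys() & population.keys()
}

for key in sorted(per_capita):
    print(key, per_capita[key])
```

Rows without a matching population record, such as ("AA", 2020) here, drop out of the result, which is also what an inner merge does.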

[8]:
# population for the seven countries with the highest annual emissions in 2021
df_pop = client.population(tuple(df_recent.head(7)['actor_id']))

# calculate emissions per capita
df_percap = pd.merge(df_emissions, df_pop, on=['actor_id', 'year'])[['actor_id', 'year', 'total_emissions', 'population']]
df_percap = df_percap.assign(total_emissions_per_capita = lambda x: x['total_emissions'] / (x['population']))
[9]:
year = df_percap.year.max()
df_recent_percap = (
    df_percap
    .loc[df_percap.year == year]
    .assign(rank = lambda x: x['total_emissions_per_capita'].rank(ascending=False))
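    # df_percap only holds the seven top emitters, so this is each country's share of their combined total
    .assign(percent_of_global = lambda x: (x['total_emissions'] / x['total_emissions'].sum()) * 100)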
    .sort_values(by='rank')
    .merge(df_names, on='actor_id')
    .loc[:, ['rank', 'name', 'actor_id', 'year', 'total_emissions_per_capita', 'percent_of_global']]
)
[10]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)

# top 7 emitters
top_emitters = list(df_recent_percap.head(7).actor_id)

# wong color palette (https://davidmathlogic.com/colorblind/#%23D81B60-%231E88E5-%23FFC107-%23004D40)
colors = ['#000000', '#E69F00', '#56B4E9', '#009E73', '#F0E442', '#0072B2', '#D55E00', '#CC79A7']

for actor_id, color in zip(top_emitters, cycle(colors)):

    actor_name = df_names.loc[df_names['actor_id'] == actor_id, 'name'].values[0]
    filt = df_percap['actor_id'] == actor_id
    df_tmp = df_percap.loc[filt]

    ax.plot(np.array(df_tmp['year']), np.array(df_tmp['total_emissions_per_capita']),
            linewidth=4,
            label = actor_name,
            color=color)

# axis formatting only needs to run once, after all lines are drawn
ylim = [0, 30]
ax.set_ylim(ylim)
ax.set_xlim([1950, 2022])

# Turn off the display of all ticks (booleans; the string 'off' is truthy).
ax.tick_params(which='both',     # Options for both major and minor ticks
               top=False,        # turn off top ticks
               left=False,       # turn off left ticks
               right=False,      # turn off right ticks
               bottom=False)     # turn off bottom ticks

# Keep x tick labels horizontal
plt.setp(ax.get_xticklabels(), rotation=0)

# Hide all four spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)

# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')

# major/minor tick lines
ax.xaxis.set_minor_locator(AutoMinorLocator(5))
ax.grid(axis='y',
        which='major',
        color=[0.8, 0.8, 0.8], linestyle='-')

ax.set_ylabel("Emissions per capita (tCO$_2$e)", fontsize=12)
ax.legend(loc='upper left', frameon=False)
[Figure: time series of emissions per capita for the top 7 emitters]

This graph shows that the average person in the US emits about twice as much CO2 annually as the average person in China, even though China has roughly four times the US population.

Cumulative emissions

This example walks through calculating and visualizing cumulative emissions.

[1]:
from itertools import cycle
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
from openclimate import Client
import numpy as np
import pandas as pd

We will first initialize a Client() object.

[2]:
client = Client()

If you are using a Jupyter environment, you first need to run client.jupyter. This patches the asyncio library to work in Jupyter environments using nest-asyncio.

[3]:
client.jupyter
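
The reason for the patch is that Jupyter already runs an asyncio event loop, and standard asyncio refuses to start a second one inside it. The sketch below reproduces that situation with the standard library alone, run as a plain script. It assumes the client starts its own event loop internally; fetch and notebook_cell are hypothetical names for illustration, and nest-asyncio itself is not used here.

```python
import asyncio


async def fetch():
    """Stand-in for a request the client would make."""
    await asyncio.sleep(0)
    return "data"


async def notebook_cell():
    """Plays the role of code running inside Jupyter's already-running loop."""
    coro = fetch()
    try:
        # Starting a second event loop from inside a running one is not allowed.
        asyncio.run(coro)
    except RuntimeError as err:
        coro.close()  # avoid a "never awaited" warning for the unused coroutine
        return str(err)
    return "no error"


message = asyncio.run(notebook_cell())
print(message)
```

The error message reports that the call was made from a running event loop, which is the restriction that nest-asyncio relaxes.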

Get country codes

OpenClimate references each country by its two-letter ISO 3166-1 code. To access these codes in openclimate, we can use the .parts() method to get all the “parts” of EARTH. Other codes we use are UN/LOCODEs for cities and LEIs for companies. As a catch-all term, we call each of these identifiers an actor_id.

[4]:
df_country = client.parts('EARTH')

Looking at the dataframe that’s returned, we have a column with each country’s actor_id.

[5]:
df_country.head()
[5]:
     actor_id  name                  type     has_data  has_children  children_have_data
5          AD  Andorra               country      True          None                None
234        AE  United Arab Emirates  country      True          None                None
0          AF  Afghanistan           country      True          None                None
9          AG  Antigua and Barbuda   country      True          None                None
7          AI  Anguilla              country      True          None                None

Let’s save each actor_id together with its name in a list of tuples.

[6]:
iso_and_name = list(zip(df_country['actor_id'], df_country['name']))

Which datasets are available?

To get a list of datasets available for an actor, you can use the .emissions_datasets() method. Here I am asking for datasets with Canadian emissions.

[7]:
client.emissions_datasets('CA')
[7]:
actor_id datasource_id name publisher published URL
0 CA BP:statistical_review_june2022 Statistical Review of World Energy all data, 1... BP 2022-06-01T00:00:00.000Z https://www.bp.com/en/global/corporate/energy-...
1 CA EDGARv7.0:ghg Emissions Database for Global Atmospheric Rese... JRC 2022-01-01T00:00:00.000Z https://edgar.jrc.ec.europa.eu/dataset_ghg70
2 CA GCB2022:national_fossil_emissions:v1.0 Data supplement to the Global Carbon Budget 20... GCP 2022-11-04T00:00:00.000Z https://www.icos-cp.eu/science-and-impact/glob...
3 CA PRIMAP:10.5281/zenodo.7179775:v2.4 PRIMAP-hist_v2.4_no_extrap (scenario=HISTCR) PRIMAP 2022-10-17T00:00:00.000Z https://zenodo.org/record/7179775
4 CA UNFCCC:GHG_ANNEX1:2019-11-08 UNFCCC GHG total without LULUCF, ANNEX I count... UNFCCC 2019-11-08T00:00:00.000Z https://di.unfccc.int/time_series
5 CA climateTRACE:country_inventory climate TRACE: country inventory climate TRACE 2022-12-02T00:00:00.000Z https://climatetrace.org/inventory
6 CA WRI:climate_watch_historical_ghg:2022 Climate Watch Historical GHG Emissions WRI 2022-01-01T00:00:00.000Z https://www.climatewatchdata.org/ghg-emissions
7 CA IEA:GHG_energy_highlights:2022 Greenhouse Gas Emissions from Energy Highlights IEA 2022-09-01T00:00:00.000Z https://www.iea.org/data-and-statistics/data-p...

You can return datasets for multiple actors at once by passing them as an iterable, such as a list or tuple. Here I am asking for Canadian and Italian emission datasets, but only returning a sample of 5 records.
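
A function that accepts either one actor_id or several usually normalises its input first, because a bare string is itself iterable. The helper below is a hypothetical sketch of that pattern, following the Union[str, List[str], Tuple[str]] type shown in the API reference; it is not taken from the package source.

```python
from typing import List, Tuple, Union


def as_id_list(actor_id: Union[str, List[str], Tuple[str, ...]]) -> List[str]:
    """Return actor ids as a list, wrapping a single string (hypothetical helper)."""
    if isinstance(actor_id, str):
        # Treat a bare string as one id, not as a sequence of characters.
        return [actor_id]
    return list(actor_id)


print(as_id_list("CA"))
print(as_id_list(("CA", "IT")))
```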

[8]:
client.emissions_datasets(['CA', 'IT']).sample(5)
[8]:
actor_id datasource_id name publisher published URL
7 CA IEA:GHG_energy_highlights:2022 Greenhouse Gas Emissions from Energy Highlights IEA 2022-09-01T00:00:00.000Z https://www.iea.org/data-and-statistics/data-p...
17 IT IEA:GHG_energy_highlights:2022 Greenhouse Gas Emissions from Energy Highlights IEA 2022-09-01T00:00:00.000Z https://www.iea.org/data-and-statistics/data-p...
0 CA BP:statistical_review_june2022 Statistical Review of World Energy all data, 1... BP 2022-06-01T00:00:00.000Z https://www.bp.com/en/global/corporate/energy-...
14 IT climateTRACE:country_inventory climate TRACE: country inventory climate TRACE 2022-12-02T00:00:00.000Z https://climatetrace.org/inventory
5 CA climateTRACE:country_inventory climate TRACE: country inventory climate TRACE 2022-12-02T00:00:00.000Z https://climatetrace.org/inventory

Get emissions

If we just pass an actor_id to the .emissions() method, emissions from every available dataset are returned.

[9]:
df_tmp = client.emissions(actor_id='US')
df_tmp.head()
[9]:
   actor_id  year  total_emissions  datasource_id
0        US  1990       5275397531  BP:statistical_review_june2022
1        US  1991       5225911642  BP:statistical_review_june2022
2        US  1992       5308410257  BP:statistical_review_june2022
3        US  1993       5412149078  BP:statistical_review_june2022
4        US  1994       5505379237  BP:statistical_review_june2022

Keep in mind that this will return all the data for that actor. Below are the datasets available.

[10]:
set(df_tmp['datasource_id'])
[10]:
{'BP:statistical_review_june2022',
 'EDGARv7.0:ghg',
 'GCB2022:national_fossil_emissions:v1.0',
 'IEA:GHG_energy_highlights:2022',
 'PRIMAP:10.5281/zenodo.7179775:v2.4',
 'UNFCCC:GHG_ANNEX1:2019-11-08',
 'WRI:climate_watch_historical_ghg:2022',
 'carbon_monitor:2022_12_14',
 'climateTRACE:country_inventory'}

In most cases, we want to filter this and use a particular dataset. We can do that with the datasource_id parameter.

[11]:
df_tmp = client.emissions(actor_id='US', datasource_id='PRIMAP:10.5281/zenodo.7179775:v2.4')

As a sanity check, let’s look at which datasets are returned

[12]:
set(df_tmp['datasource_id'])
[12]:
{'PRIMAP:10.5281/zenodo.7179775:v2.4'}

As you can see, only the PRIMAP dataset was returned.

Get emissions for all countries

Now let’s get emissions for all countries

[13]:
%%time
iso_codes = [iso_code[0] for iso_code in iso_and_name]
df_emissions = client.emissions(
    actor_id=iso_codes,
    datasource_id='PRIMAP:10.5281/zenodo.7179775:v2.4'
)
CPU times: user 5.44 s, sys: 277 ms, total: 5.71 s
Wall time: 20.1 s

Retrieving all that data takes about 20 seconds, even with asyncio working behind the scenes. The result is a large dataframe with the data from all countries concatenated together.
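
To see why concurrency helps when many countries are requested, the sketch below issues five simulated requests with asyncio.gather. Each fake request just sleeps for a fixed delay; fake_fetch and fetch_all are hypothetical names, and no call to the OpenClimate API is made.

```python
import asyncio
import time


async def fake_fetch(actor_id, delay=0.1):
    """Stand-in for one HTTP request; sleeps instead of calling the API."""
    await asyncio.sleep(delay)
    return actor_id, f"data for {actor_id}"


async def fetch_all(actor_ids):
    # gather() starts every request at once and returns results in input order.
    return await asyncio.gather(*(fake_fetch(a) for a in actor_ids))


start = time.perf_counter()
results = asyncio.run(fetch_all(["US", "CA", "GB", "IT", "FR"]))
elapsed = time.perf_counter() - start

print(f"{len(results)} responses in {elapsed:.2f} s")
```

Because the waits overlap, the total time stays close to the duration of a single request rather than the sum of all five.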

[14]:
df_emissions.sample(5)
[14]:
     actor_id  year  total_emissions  datasource_id
347        BW  1925          1330000  PRIMAP:10.5281/zenodo.7179775:v2.4
484        ID  1967        108000000  PRIMAP:10.5281/zenodo.7179775:v2.4
308        FR  1751         34100000  PRIMAP:10.5281/zenodo.7179775:v2.4
435        AW  1961          1310000  PRIMAP:10.5281/zenodo.7179775:v2.4
431        HR  1925          2540000  PRIMAP:10.5281/zenodo.7179775:v2.4

Calculate cumulative emissions

Let’s first make sure all the datasets have the same starting year.

[15]:
# all() on a list of years would only test that each year is non-zero,
# so count the distinct first years instead
df_emissions.groupby('actor_id')['year'].min().nunique() == 1
[15]:
True
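
A note on this kind of check: all() applied to a list of years only tests that each year is truthy (non-zero), so it cannot reveal a mismatch between starting years. Counting the distinct first years does. The sketch below shows the difference with two made-up mappings of actor to first year; the actor codes and years are illustrative.

```python
# Illustrative first years per actor; the second mapping has one late starter.
same_start = {"AA": 1750, "BB": 1750, "CC": 1750}
mixed_start = {"AA": 1750, "BB": 1850, "CC": 1750}


def same_first_year(first_years):
    """True when every actor's series begins in the same year."""
    return len(set(first_years.values())) == 1


# all() reports True for the mismatched case; the set-based check does not.
print(all(mixed_start.values()), same_first_year(mixed_start))
print(all(same_start.values()), same_first_year(same_start))
```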

Now we can calculate cumulative emissions
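
A cumulative sum is simply a running total of the annual values in chronological order. As a small worked example, the sketch below applies itertools.accumulate to Andorra's first five annual values, copied from the PRIMAP output shown in this tutorial.

```python
from itertools import accumulate

# Andorra's annual emissions for 1750-1754, copied from this tutorial's output
annual = [3740, 3750, 3760, 3770, 3780]

# running total: each entry is the sum of all annual values up to that year
cumulative = list(accumulate(annual))
print(cumulative)  # [3740, 7490, 11250, 15020, 18800]
```

The grouped cumsum() in the notebook performs the same running total separately for each actor_id.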

[16]:
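# sort by year within each actor so the running total accumulates chronologically
df_out = (
    df_emissions
    .sort_values(['actor_id', 'year'])
    .assign(cumulative_emissions=lambda x: x.groupby('actor_id')['total_emissions'].cumsum())
)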

Now we have a column for cumulative emissions

[17]:
df_out.head()
[17]:
    actor_id  year  total_emissions  datasource_id                       cumulative_emissions
32        AD  1750             3740  PRIMAP:10.5281/zenodo.7179775:v2.4                  3740
33        AD  1751             3750  PRIMAP:10.5281/zenodo.7179775:v2.4                  7490
34        AD  1752             3760  PRIMAP:10.5281/zenodo.7179775:v2.4                 11250
35        AD  1753             3770  PRIMAP:10.5281/zenodo.7179775:v2.4                 15020
36        AD  1754             3780  PRIMAP:10.5281/zenodo.7179775:v2.4                 18800

Rank country by cumulative emissions

Now that we know the cumulative emissions, we can rank the countries by their cumulative emissions in the most recent year.

[18]:
last_year = df_out['year'].max()
df_sorted = (
    df_out.loc[df_out['year'] == last_year, ['actor_id', 'cumulative_emissions', 'year']]
    .sort_values(by='cumulative_emissions', ascending=False)
)

df_sorted['rank'] = df_sorted['cumulative_emissions'].rank(ascending=False)

Here are the top 10 cumulative emitters

[19]:
pd.merge(df_sorted.loc[df_sorted['rank'] <= 10], df_country[['actor_id', 'name']], on='actor_id')
[19]:
   actor_id  cumulative_emissions  year  rank  name
0        US          561240060000  2021   1.0  United States of America
1        CN          375048000000  2021   2.0  China
2        RU          179731600000  2021   3.0  Russian Federation
3        IN          132717000000  2021   4.0  India
4        DE          117760000000  2021   5.0  Germany
5        GB          104375500000  2021   6.0  United Kingdom of Great Britain and Northern I...
6        JP           78204570000  2021   7.0  Japan
7        FR           64192400000  2021   8.0  France
8        UA           52563900000  2021   9.0  Ukraine
9        BR           47231630000  2021  10.0  Brazil

The United States and China are the top two cumulative emitters, with the U.S. emitting about 50% more than China over the period from 1750 to 2021.

[20]:
561240060000 / 375048000000
[20]:
1.4964486145773341

Plot cumulative emissions

Now that we know the top emitters, we can plot a time series

[21]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)

# top 8 emitters
top_emitters = list(df_sorted.head(8).actor_id)

# wong color palette (https://davidmathlogic.com/colorblind/#%23D81B60-%231E88E5-%23FFC107-%23004D40)
colors = ['#000000', '#E69F00', '#56B4E9', '#009E73', '#F0E442', '#0072B2', '#D55E00', '#CC79A7']

for actor_id, color in zip(top_emitters, cycle(colors)):
    actor_name = df_country.loc[df_country['actor_id'] == actor_id, 'name'].values[0]
    filt = df_out['actor_id'] == actor_id
    df_tmp = df_out.loc[filt]

    ax.plot(np.array(df_tmp['year']), np.array(df_tmp['cumulative_emissions']) / 10**9,
            linewidth=4,
            label = actor_name,
            color=color)

# axis formatting only needs to run once, after all lines are drawn
ylim = [0, 600]
ax.set_ylim(ylim)
ax.set_xlim([1850, 2022])

# Turn off the display of all ticks (booleans; the string 'off' is truthy).
ax.tick_params(which='both',     # Options for both major and minor ticks
               top=False,        # turn off top ticks
               left=False,       # turn off left ticks
               right=False,      # turn off right ticks
               bottom=False)     # turn off bottom ticks

# Keep x tick labels horizontal
plt.setp(ax.get_xticklabels(), rotation=0)

# Hide all four spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)

# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')

# major/minor tick lines
ax.xaxis.set_minor_locator(AutoMinorLocator(5))
ax.grid(axis='y',
        which='major',
        color=[0.8, 0.8, 0.8], linestyle='-')

ax.set_ylabel("Cumulative Emissions (GtCO$_2$e)", fontsize=12)
ax.legend(loc='upper left', frameon=False)
[Figure: time series of cumulative emissions for the top 8 emitters]

Great Britain emissions

Here we will explore Great Britain’s emissions and ask whether it is on track to meet its emissions pledges. “On track” is defined as current emissions being lower than those of a scenario with a uniform decrease in emissions.

[1]:
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
import numpy as np
import pandas as pd
from openclimate import Client
[2]:
# create an openclimate Client object
client = Client()
client.jupyter

Get data

We will use the emissions data from UNFCCC to perform this analysis.

[3]:
actor_id = 'GB'
client.emissions_datasets(actor_id)
[3]:
actor_id datasource_id name publisher published URL
0 GB BP:statistical_review_june2022 Statistical Review of World Energy all data, 1... BP 2022-06-01T00:00:00.000Z https://www.bp.com/en/global/corporate/energy-...
1 GB EDGARv7.0:ghg Emissions Database for Global Atmospheric Rese... JRC 2022-01-01T00:00:00.000Z https://edgar.jrc.ec.europa.eu/dataset_ghg70
2 GB GCB2022:national_fossil_emissions:v1.0 Data supplement to the Global Carbon Budget 20... GCP 2022-11-04T00:00:00.000Z https://www.icos-cp.eu/science-and-impact/glob...
3 GB PRIMAP:10.5281/zenodo.7179775:v2.4 PRIMAP-hist_v2.4_no_extrap (scenario=HISTCR) PRIMAP 2022-10-17T00:00:00.000Z https://zenodo.org/record/7179775
4 GB UNFCCC:GHG_ANNEX1:2019-11-08 UNFCCC GHG total without LULUCF, ANNEX I count... UNFCCC 2019-11-08T00:00:00.000Z https://di.unfccc.int/time_series
5 GB carbon_monitor:2022_12_14 Carbon Monitor country CO2 emissions by sector Carbon Monitor 2022-12-14T00:00:00.000Z https://carbonmonitor.org/
6 GB climateTRACE:country_inventory climate TRACE: country inventory climate TRACE 2022-12-02T00:00:00.000Z https://climatetrace.org/inventory
7 GB WRI:climate_watch_historical_ghg:2022 Climate Watch Historical GHG Emissions WRI 2022-01-01T00:00:00.000Z https://www.climatewatchdata.org/ghg-emissions
8 GB openGHGmap:R2021A European OpenGHGMap NTNU 2021-01-01T00:00:00.000Z https://openghgmap.net/data/
9 GB BEIS:UK_regional_GHG:2022-06-30 UK local authority and regional greenhouse gas... BEIS 2022-06-30T00:00:00.000Z https://www.gov.uk/government/statistics/uk-lo...
10 GB IEA:GHG_energy_highlights:2022 Greenhouse Gas Emissions from Energy Highlights IEA 2022-09-01T00:00:00.000Z https://www.iea.org/data-and-statistics/data-p...
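
The listing above is a DataFrame, so the `datasource_id` can be looked up by publisher rather than copied by hand. A minimal sketch of that lookup, assuming the column names shown in the listing; the three rows are copied from it, and `datasource_for` is a hypothetical helper, not part of the openclimate package:

```python
import pandas as pd

# three rows copied from the listing above; only the columns needed here
datasets = pd.DataFrame({
    "actor_id": ["GB", "GB", "GB"],
    "datasource_id": ["EDGARv7.0:ghg",
                      "UNFCCC:GHG_ANNEX1:2019-11-08",
                      "IEA:GHG_energy_highlights:2022"],
    "publisher": ["JRC", "UNFCCC", "IEA"],
})


def datasource_for(df, publisher):
    """Return the single datasource_id published by `publisher` (hypothetical helper)."""
    matches = df.loc[df["publisher"] == publisher, "datasource_id"]
    if len(matches) != 1:
        raise LookupError(f"expected one dataset from {publisher}, found {len(matches)}")
    return matches.iloc[0]


print(datasource_for(datasets, "UNFCCC"))
```

Raising when the publisher matches zero or several rows avoids silently picking the wrong dataset.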
[4]:
emissions_datasource = 'UNFCCC:GHG_ANNEX1:2019-11-08'
df_gb = client.emissions(actor_id=actor_id, datasource_id=emissions_datasource)
df_ndc = client.targets(actor_id=actor_id)

# convert tonnes to megatonnes
df_gb['total_emissions'] = df_gb['total_emissions'] / 10**6

# filter ndc by target
filt = df_ndc['datasource_id']=='IGES:NDC_db:10.57405/iges-5005'
df_ndc = df_ndc.loc[filt]

The UK has pledged to reduce their emissions by 68% from 1990 levels by 2030.

[5]:
df_ndc
[5]:
   actor_id  target_type                  baseline_year  baseline_value  target_year  target_value  target_unit  datasource_id
0  GB        Absolute emission reduction  1990.0         None            2030         68            percent      IGES:NDC_db:10.57405/iges-5005
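
The pledge is expressed as a percent cut relative to the baseline year, so the implied target level is the baseline emissions scaled by (100 - percent) / 100. A minimal sketch of that arithmetic; `target_from_pledge` is a hypothetical helper name, and the baseline value is an illustrative round number, not the UNFCCC figure:

```python
def target_from_pledge(baseline_emissions, percent_reduction):
    """Emissions level implied by an absolute percent-reduction pledge."""
    if not 0 <= percent_reduction <= 100:
        raise ValueError("percent_reduction must be between 0 and 100")
    return baseline_emissions * (100 - percent_reduction) / 100


# illustrative baseline of 800 Mt (assumed, not taken from the dataset)
print(target_from_pledge(800.0, 68))
```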

A quick look at the data shows that Great Britain’s emissions have been decreasing for the last thirty years. But this raises a couple of questions:

- Is GB “on track” to meet this goal?
- Will GB meet its goal if this long-term trend continues?

[6]:
plt.plot(np.array(df_gb['year'], dtype='float64'), np.array(df_gb['total_emissions']))
plt.xlabel('Year')
plt.ylabel('GB Annual Emissions [MtCO$_2$-eq]')
[6]:
Text(0, 0.5, 'GB Annual Emissions [MtCO$_2$-eq]')
_images/notebooks_great_britain_emissions_9_1.png

Is Great Britain on track?

To answer this question, we will simply ask whether current emissions are lower than they would be if emissions had decreased uniformly from the baseline to the goal. We will also ask whether GB will meet its goal if the long-term trend continues. Keep in mind that both of these are crude and imperfect metrics. More sophisticated approaches include integrated assessment models (IAMs) that incorporate proposed actions.
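
The first test described above can be written down directly: interpolate a straight line from the baseline to the target and compare it with the latest observed value. This is only a sketch under stated assumptions; `uniform_pathway` and `is_on_track` are hypothetical helper names, and the numbers are illustrative rather than taken from the UNFCCC data:

```python
def uniform_pathway(year, baseline_year, baseline_emissions,
                    target_year, target_emissions):
    """Emissions in `year` if they fall by the same amount every year."""
    fraction = (year - baseline_year) / (target_year - baseline_year)
    return baseline_emissions + fraction * (target_emissions - baseline_emissions)


def is_on_track(current_year, current_emissions, **pathway):
    """True when current emissions sit at or below the uniform pathway."""
    return current_emissions <= uniform_pathway(current_year, **pathway)


# illustrative numbers only (assumed, not taken from the UNFCCC data)
pathway = dict(baseline_year=1990, baseline_emissions=800.0,
               target_year=2030, target_emissions=256.0)
print(uniform_pathway(2010, **pathway))
print(is_on_track(2010, 500.0, **pathway))
```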

[7]:
# implements the normal equations for ordinary least squares regression
def linear_eq(df, start_year=None, year_var='year', emissions_var='total_emissions'):
    '''simple linear regression'''
    # keep every row when no start year is given
    if start_year is not None:
        df = df.loc[df[year_var] >= start_year]
    x = df[year_var].values
    y = df[emissions_var].values

    # least-squares linear regression
    n = len(x)
    sum_x = np.sum(x)
    sum_y = np.sum(y)
    sum_xy = np.sum(x * y)
    sum_xx = np.sum(x * x)
    mean_x = np.mean(x)
    mean_y = np.mean(y)

    # calculate coefficients
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x**2)
    a = mean_y - b * mean_x

    # Make predictions using the regression line
    pred = lambda x: a + b * x
    return {'equation':pred, 'slope':b, 'intercept':a}
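
One way to gain confidence in a hand-written regression like the one above is to compare it with `np.polyfit` on a synthetic series. The sketch below assumes the same closed-form expressions; `fit_line` is a hypothetical stand-in that takes arrays instead of a DataFrame, and the data are invented for illustration:

```python
import numpy as np


def fit_line(x, y):
    """Closed-form simple linear regression (same formulas as the cell above)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    slope = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x * x) - np.sum(x) ** 2)
    intercept = np.mean(y) - slope * np.mean(x)
    return slope, intercept


# synthetic series (assumed for illustration): a 12 Mt/yr decline plus a small wiggle
years = np.arange(1990, 2020)
emissions = 800.0 - 12.0 * (years - 1990) + 5.0 * np.sin(years)

slope, intercept = fit_line(years, emissions)
slope_np, intercept_np = np.polyfit(years, emissions, deg=1)
print(np.isclose(slope, slope_np), np.isclose(intercept, intercept_np))
```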
[8]:
baseline_year = df_ndc['baseline_year'].values[0]
current_year = df_gb['year'].max()
net_zero_year = 2050
target_year = df_ndc['target_year'].values[0]
target_value = int(df_ndc['target_value'].iloc[0])
target_percent = float(df_ndc['target_value'].squeeze())/100

pred = linear_eq(df_gb, start_year=baseline_year)
X_pred = np.arange(baseline_year, target_year + 1)
Y_pred = pred['equation'](X_pred)

# get baseline and target emissions
filt = df_gb['year'] == baseline_year
baseline_emissions = df_gb.loc[filt,'total_emissions']
target_emissions = df_gb.loc[filt,'total_emissions'] * (100 - target_value)/100
net_zero_emissions = 0

# current emissions
filt = df_gb['year'] == current_year
current_emissions = df_gb.loc[filt,'total_emissions']

# average annual reduction needed to achieve goal:
avg_rate = round(((baseline_emissions - target_emissions) / (target_year - baseline_year + 1)).values[0])
nz_rate = round(((baseline_emissions - net_zero_emissions) / (net_zero_year - baseline_year + 1)).values[0])

year_target_achieved = round((target_emissions - pred['intercept']) / pred['slope'])
print(f"To achieve goal, average rate of reduction needs to be {abs(avg_rate):.0f} MT/yr")
print(f"To achieve net-zero goal, average rate of reduction needs to be {abs(nz_rate):.0f} MT/yr")
print(f"GB reducing emissions by about {abs(pred['slope']):.0f} MT/yr")
print(f'Target emissions of {int(target_emissions.iloc[0])} MT/yr will be achieved around {int(year_target_achieved.iloc[0])}')
To achieve goal, average rate of reduction needs to be 13 MT/yr
To achieve net-zero goal, average rate of reduction needs to be 13 MT/yr
GB reducing emissions by about 12 MT/yr
Target emissions of 255 MT/yr will be achieved around 2037

From this quick analysis, we can see that GB may be slightly off track to meet its goals.
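
The two quantities compared above come from simple line arithmetic: the year at which the fitted trend crosses a given level, and the average yearly cut needed to reach the target. A sketch with illustrative numbers follows; the helper names are hypothetical, and note that the notebook cell divides by `target_year - baseline_year + 1` whereas this sketch uses the plain difference in years:

```python
def year_trend_reaches(level, slope, intercept):
    """Year at which the line intercept + slope * year equals `level`."""
    if slope == 0:
        raise ValueError("a flat trend never reaches a different level")
    return (level - intercept) / slope


def required_rate(baseline_emissions, target_emissions, baseline_year, target_year):
    """Average yearly cut needed to move from baseline to target."""
    return (baseline_emissions - target_emissions) / (target_year - baseline_year)


# illustrative numbers only (assumed): a trend losing 12 Mt each year from 800 Mt in 1990
slope = -12.0
intercept = 800.0 - slope * 1990
print(year_trend_reaches(256.0, slope, intercept))
print(required_rate(800.0, 256.0, 1990, 2030))
```

With these made-up inputs the trend reaches the target a few years after 2030, because the observed slope is shallower than the required rate.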

Plot of emissions and pledges

[9]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)

ax.plot(np.array(df_gb['year']), np.array(df_gb['total_emissions']),
        linewidth=4,
        label='Great Britain annual emissions',
       color=[0.0,0.0,0.0])

ax.plot(X_pred, Y_pred, '--',
        linewidth=2,
        color=[0.6,0.6,0.6],
       label='Linear trend')

ax.plot(np.array([baseline_year, float(target_year)]), np.array([float(baseline_emissions.iloc[0]), float(target_emissions.iloc[0])]),
        '-',
        linewidth=2,
        color=[0.6,0.6,0.6],
       label='linear decrease')


ax.plot([df_ndc['baseline_year'], df_ndc['target_year']],
        [target_emissions, target_emissions],
        '-.',
        linewidth=1,
        color=[0.5,0.5,0.5],
       label='target emissions')

ylim = [200, 850]
ax.set_ylim(ylim)
ax.set_xlim([1990, 2035])

# Turn off the display of all ticks.
ax.tick_params(which='both',     # Options for both major and minor ticks
               top=False,        # turn off top ticks
               left=False,       # turn off left ticks
               right=False,      # turn off right ticks
               bottom=False)     # turn off bottom ticks

# Remove x tick marks
plt.setp(ax.get_xticklabels(), rotation=0)

# Hide the right and top spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)

# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')

# major/minor tick lines
ax.xaxis.set_minor_locator(AutoMinorLocator(5))
ax.grid(axis='y',
        which='major',
        color=[0.8, 0.8, 0.8], linestyle='-')

bline_emissions = baseline_emissions.values[0]
ylim_achieved = [(bline_emissions - ylim[0])/ (bline_emissions*target_percent)*100,
                 (bline_emissions - ylim[1])/ (bline_emissions*target_percent)*100]
ax2 = ax.twinx()
ax2.set_ylim(ylim_achieved)

# Hide the right and top spines
ax2.spines['right'].set_visible(False)
ax2.spines['left'].set_visible(False)
ax2.spines['top'].set_visible(False)
ax2.spines['bottom'].set_visible(False)

# Only show ticks on the left and bottom spines
ax2.yaxis.set_ticks_position('right')
ax2.xaxis.set_ticks_position('bottom')
ax2.yaxis.set_tick_params(size=0)

# Fix the y-axis tick locations with a FixedLocator, then label them as percentages
vals = ax2.get_yticks()
ax2.yaxis.set_major_locator(plt.FixedLocator(vals))
ax2.set_yticklabels([f"{int(x)}%" for x in vals])

ax2.set_ylabel("Percent achieved", fontsize=12)
ax.set_ylabel("Emissions (MtCO$_2$-eq)", fontsize=12)
ax.legend(loc='upper right', frameon=False)
[9]:
<matplotlib.legend.Legend at 0x7f6bc950bee0>
_images/notebooks_great_britain_emissions_15_2.png
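
The right-hand axis in the figure converts an emissions level into the share of the pledged cut already delivered, using the same expression as `ylim_achieved` in the cell above. A minimal sketch of that mapping; `percent_achieved` is a hypothetical helper and the baseline is an illustrative value:

```python
def percent_achieved(emissions, baseline_emissions, target_percent):
    """Share of the pledged cut already delivered, in percent.

    target_percent is the pledged reduction as a fraction (0.68 for 68%).
    """
    pledged_cut = baseline_emissions * target_percent
    return (baseline_emissions - emissions) / pledged_cut * 100


# illustrative baseline of 800 Mt (assumed, not the dataset value)
for level in (800.0, 528.0, 256.0):
    print(level, round(percent_achieved(level, 800.0, 0.68), 1))
```

Emissions at the baseline map to 0%, and emissions at the target level map to 100%.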

Now let’s do the same for net zero

[10]:
pred = linear_eq(df_gb, start_year=baseline_year)
X_pred = np.arange(baseline_year, net_zero_year + 1)
Y_pred = pred['equation'](X_pred)
[11]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)

ax.plot(np.array(df_gb['year']), np.array(df_gb['total_emissions']),
        linewidth=4,
        label='Great Britain annual emissions',
       color=[0.0,0.0,0.0])

ax.plot(X_pred, Y_pred, '--',
        linewidth=2,
        color=[0.6,0.6,0.6],
       label='Linear trend')

ax.plot(np.array((baseline_year, float(net_zero_year))), np.array((float(baseline_emissions.iloc[0]), net_zero_emissions)),
        '-',
        linewidth=2,
        color=[0.6,0.6,0.6],
       label='linear decrease')

ylim = [0, 850]
ax.set_ylim(ylim)
ax.set_xlim([1990, 2055])

# Turn off the display of all ticks.
ax.tick_params(which='both',     # Options for both major and minor ticks
               top=False,        # turn off top ticks
               left=False,       # turn off left ticks
               right=False,      # turn off right ticks
               bottom=False)     # turn off bottom ticks

# Remove x tick marks
plt.setp(ax.get_xticklabels(), rotation=0)

# Hide the right and top spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)

# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')

# major/minor tick lines
ax.xaxis.set_minor_locator(AutoMinorLocator(5))
ax.grid(axis='y',
        which='major',
        color=[0.8, 0.8, 0.8], linestyle='-')

bline_emissions = baseline_emissions.values[0]
ylim_achieved = [(bline_emissions - ylim[0])/ (bline_emissions*target_percent)*100,
                 (bline_emissions - ylim[1])/ (bline_emissions*target_percent)*100]

ax.set_ylabel("Emissions (MtCO$_2$e)", fontsize=12)
ax.legend(loc='upper right', frameon=False)
[11]:
<matplotlib.legend.Legend at 0x7f6bc949b0a0>
_images/notebooks_great_britain_emissions_18_2.png

In general, the emissions have been trending in the right direction over the past 30 years, but emissions will have to decline at an increasing rate in order to achieve these goals.

Canada Emissions Breakdown

In this notebook we will explore a territorial breakdown of Canadian emissions into provinces using data submitted to UNFCCC.

[1]:
from itertools import cycle
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
import pandas as pd
[2]:
from openclimate import Client
client = Client()
client.jupyter

Let’s display all of the datasets available for Canada.

[3]:
client.emissions_datasets('CA')
[3]:
actor_id datasource_id name publisher published URL
0 CA BP:statistical_review_june2022 Statistical Review of World Energy all data, 1... BP 2022-06-01T00:00:00.000Z https://www.bp.com/en/global/corporate/energy-...
1 CA EDGARv7.0:ghg Emissions Database for Global Atmospheric Rese... JRC 2022-01-01T00:00:00.000Z https://edgar.jrc.ec.europa.eu/dataset_ghg70
2 CA GCB2022:national_fossil_emissions:v1.0 Data supplement to the Global Carbon Budget 20... GCP 2022-11-04T00:00:00.000Z https://www.icos-cp.eu/science-and-impact/glob...
3 CA PRIMAP:10.5281/zenodo.7179775:v2.4 PRIMAP-hist_v2.4_no_extrap (scenario=HISTCR) PRIMAP 2022-10-17T00:00:00.000Z https://zenodo.org/record/7179775
4 CA UNFCCC:GHG_ANNEX1:2019-11-08 UNFCCC GHG total without LULUCF, ANNEX I count... UNFCCC 2019-11-08T00:00:00.000Z https://di.unfccc.int/time_series
5 CA climateTRACE:country_inventory climate TRACE: country inventory climate TRACE 2022-12-02T00:00:00.000Z https://climatetrace.org/inventory
6 CA WRI:climate_watch_historical_ghg:2022 Climate Watch Historical GHG Emissions WRI 2022-01-01T00:00:00.000Z https://www.climatewatchdata.org/ghg-emissions
7 CA IEA:GHG_energy_highlights:2022 Greenhouse Gas Emissions from Energy Highlights IEA 2022-09-01T00:00:00.000Z https://www.iea.org/data-and-statistics/data-p...

Now let’s gather emissions data for Canada from UNFCCC as well as emissions for each province from ECCC. Finally, let’s gather population data for Canada and its provinces from 2017 (the most recent year for which we have data for all provinces).

[4]:
# canadian emissions
df_ca = client.emissions('CA', 'UNFCCC:GHG_ANNEX1:2019-11-08')

# canadian province names
df_parts = client.parts('CA', part_type='adm1')[['actor_id', 'name']]

# province emissions
df_prov = client.emissions(df_parts.actor_id, 'ECCC:GHG_inventory:2022-04-13')

# Canadian population in 2017
df_ca_pop = (
    client.population('CA')
    .loc[lambda x: x['year'] == 2017, ['actor_id', 'year', 'population']]
)

# province population in 2017
df_prov_pop = (
    client.population(df_parts.actor_id)
    .loc[lambda x: x['year'] == 2017, ['actor_id', 'year', 'population']]
)

Now let’s select Canadian emissions and provincial emissions for 2020. This is not the same year as the population data, but let’s assume population hasn’t changed significantly in three years. We are also going to convert to megatonnes of CO2 equivalents by dividing the emissions by a million.

[5]:
# national emissions in MTCO2e
national = df_ca.loc[df_ca['year'] == 2020, 'total_emissions'].values / 10**6

# province emissions and cumulative emissions in MTCO2e
df_out = (
    df_prov
    .loc[df_prov['year'] == 2020, ['total_emissions', 'actor_id']]
    .assign(total_emissions=lambda x: x['total_emissions'].div(10**6))
    .assign(percent_of_national = lambda x: (x['total_emissions'] / national) * 100)
    .sort_values(by='percent_of_national', ascending=False)
    .assign(cumulative = lambda x: x['percent_of_national'].cumsum())
    .merge(df_parts, on='actor_id')
    .merge(df_prov_pop, on='actor_id')
    .assign(percent_of_population = lambda x: (x['population'] / x['population'].sum()) * 100)
    .rename(columns={'total_emissions': 'total_emissions_[MTCO2e]'})
    .loc[:, ['actor_id', 'name', 'total_emissions_[MTCO2e]', 'percent_of_national',  'cumulative', 'population', 'percent_of_population']]
)

Now let’s display a table showing the province, emissions breakdown, and population.

[6]:
df_out
[6]:
    actor_id  name                       total_emissions_[MTCO2e]  percent_of_national  cumulative  population  percent_of_population
0   CA-AB     Alberta                    256.459542                38.143528            38.143528   4306039     11.674212
1   CA-ON     Ontario                    149.584918                22.247940            60.391467   14279196    38.712694
2   CA-QC     Quebec                     76.241175                 11.339439            71.730907   8425996     22.843933
3   CA-SK     Saskatchewan               65.894159                 9.800515             81.531422   1168057     3.166749
4   CA-BC     British Columbia           61.746788                 9.183672             90.715094   4841078     13.124770
5   CA-MB     Manitoba                   21.674064                 3.223609             93.938703   1343371     3.642047
6   CA-NS     Nova Scotia                14.596446                 2.170946             96.109649   957600      2.596174
7   CA-NB     New Brunswick              12.440907                 1.850351             97.960000   760868      2.062809
8   CA-NL     Newfoundland and Labrador  9.500844                  1.413072             99.373072   528430      1.432640
9   CA-PE     Prince Edward Island       1.609972                  0.239453             99.612525   152784      0.414217
10  CA-NT     Northwest Territories      1.401465                  0.208442             99.820966   44718       0.121236
11  CA-NU     Nunavut                    0.602920                  0.089673             99.910639   38243       0.103682
12  CA-YT     Yukon                      0.600610                  0.089329             99.999969   38669       0.104837

This shows that five provinces (Alberta, Ontario, Quebec, Saskatchewan, and British Columbia) made up over 90% of Canada’s overall emissions in 2020. Alberta, while only accounting for about 12% of Canada’s population, contributes 38% of the nation’s emissions. This is largely driven by the oil/gas and agriculture sectors.
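
The comparison between emissions share and population share can be made explicit by dividing one by the other. The sketch below copies the two percentage columns for the five largest emitters from the table above; `intensity_ratio` is a hypothetical helper, and the ratio mixes 2020 emissions with 2017 population, as the notebook does:

```python
# (emissions share %, population share %) copied from the table above
shares = {
    "CA-AB": (38.143528, 11.674212),
    "CA-ON": (22.247940, 38.712694),
    "CA-QC": (11.339439, 22.843933),
    "CA-SK": (9.800515, 3.166749),
    "CA-BC": (9.183672, 13.124770),
}


def intensity_ratio(emissions_share, population_share):
    """A ratio above 1 means a larger share of emissions than of people."""
    return emissions_share / population_share


ratios = {actor: intensity_ratio(*pair) for actor, pair in shares.items()}
for actor, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{actor}: {ratio:.2f}")
```

Alberta and Saskatchewan come out above 1, while Ontario, Quebec, and British Columbia fall below it.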

Finally, let’s make a first attempt at visualizing this breakdown. This bar graph is clunky and could be improved.

[7]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)

previous = 0
for iterator, row in df_out.iterrows():
    emissions = row['percent_of_national']
    cumulative = row['cumulative']
    actor_id = row['actor_id']
    ax.bar(1, emissions, bottom=previous, label=actor_id)
    previous = cumulative
    ax.text(1.5, previous - (emissions/2),
          actor_id,
          fontsize=12,
          color='k')

# Turn off the display of all ticks.
ax.tick_params(which='both',     # Options for both major and minor ticks
               top=False,        # turn off top ticks
               left=False,       # turn off left ticks
               right=False,      # turn off right ticks
               bottom=False)     # turn off bottom ticks

# Remove x tick marks
plt.setp(ax.get_xticklabels(), rotation=0)

# Hide the right and top spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)

# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')

# grid and tick marks
ax.set_yticks(list(range(10, 110, 10)))  # numpy is not imported in this notebook
ax.grid(axis='y',
        which='major',
        color=[0.8, 0.8, 0.8], linestyle='-')

ax.set_axisbelow(True)
ax.set_xticks([])
ax.set_title("Territorial Breakdown of Canada's 2020 emissions")
ax.set_ylabel("% of national emissions")
_images/notebooks_canada_breakdown_13_1.png

Canada Target Gap

Are provincial pledges adequate to achieve Canada’s NDC goal?

Each party to the Paris Agreement creates a nationally determined contribution (NDC) or intended nationally determined contribution (INDC). These non-binding national plans highlight climate change mitigation, including climate-related targets for greenhouse gas emission reductions. Non-state actors, on the other hand, are not formally recognized in the Paris Agreement’s global stocktake, even though actions at the subnational level are integral to the success of the Paris Agreement. Some non-state actors create climate plans with pledged emission targets. For instance, 11 of the 13 Canadian provinces and territories have pledged emissions targets.

This notebook will explore whether the provincial pledges are enough to meet Canada’s NDC goal. We will use nationally reported data from the UNFCCC and provincial data from ECCC, as well as pledged targets. We find that the provincial pledges are not adequate to achieve Canada’s NDC goal. Assuming the emissions from provinces without targets remain constant at pre-pandemic 2019 levels, Canada will be about 167 MtCO2e shy of its NDC goal. As outlined in the AR6 summary for policymakers (SPM), feasible, effective, and low-cost options for mitigation and adaptation are already available.

[1]:
from itertools import cycle
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
import numpy as np
import openclimate as oc
import pandas as pd
[2]:
def get_emissions(part, data_id=None):
    data_id = 'ECCC:GHG_inventory:2022-04-13' if data_id is None else data_id
    try:
        return client.emissions(actor_id=part, datasource_id=data_id)
    except Exception:
        return None

def get_target(part, year, data_id = None):
    data_id = 'C2ES:canadian_GHG_targets' if data_id is None else data_id
    try:
        part_targets = (
            client.targets(actor_id = part, ignore_warnings=True)
            .loc[lambda x: x['target_type'] == 'Absolute emission reduction',
                 ['actor_id', 'baseline_year', 'target_year', 'target_value', 'target_unit', 'datasource_id']]
        )

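        # NOTE: part_target (filtered by data_id) is not used below; the
        # selection that follows runs on targets from every datasource
        part_target = part_targets.loc[part_targets['datasource_id']== data_id]

        closest_target = part_targets['target_year'][part_targets['target_year'] >= year].min()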
        cols_out = ['actor_id', 'baseline_year', 'target_year','target_value', 'target_unit']
        target = part_targets.loc[part_targets['target_year'] == closest_target, cols_out]
        return target
    except Exception:
        return None

def least_squares_regression(x, y):
    # Calculate the slope and intercept using normal equations
    X = np.vstack([x, np.ones(len(x))]).T
    theta = np.linalg.inv(X.T @ X) @ X.T @ y
    slope, intercept = theta[0], theta[1]
    predict = lambda x: slope * x + intercept
    return {"slope": slope, "intercept": intercept, "equation": predict}
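
Inverting `X.T @ X` directly can lose precision when `x` holds calendar years, because the columns of the design matrix differ by several orders of magnitude. A possible alternative, offered only as a sketch, is to centre `x` and let `np.linalg.lstsq` solve the system; `least_squares_lstsq` is a hypothetical name, and the data below are synthetic:

```python
import numpy as np


def least_squares_lstsq(x, y):
    """Fit y = slope * x + intercept with np.linalg.lstsq (illustrative variant)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = x.mean()
    # centring x keeps the design matrix well conditioned for calendar years
    X = np.column_stack([x - x_mean, np.ones_like(x)])
    (slope, centred_intercept), *_ = np.linalg.lstsq(X, y, rcond=None)
    intercept = centred_intercept - slope * x_mean
    return {"slope": slope, "intercept": intercept,
            "equation": lambda x_new: slope * np.asarray(x_new) + intercept}


# synthetic, noise-free series (assumed): 100 Mt in 2005, falling 2 Mt per year
years = np.arange(2005, 2011)
fit = least_squares_lstsq(years, 100.0 - 2.0 * (years - 2005))
print(round(float(fit["slope"]), 6), round(float(fit["equation"](2030)), 6))
```

The returned dictionary has the same keys as `least_squares_regression`, so it could be swapped in without changing the calling code.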

[3]:
# Initialize the OpenClimate client
client = oc.Client()
client.jupyter

Get country emissions and targets

[4]:
iso2 = 'CA'
data_id = 'UNFCCC:GHG_ANNEX1:2019-11-08'

tonnes_to_megatonnes = 1 / 10**6

actor_parts = client.parts(actor_id = iso2, part_type = 'adm1')
df_nat = client.emissions(actor_id = iso2, datasource_id=data_id)

nat_targets = (
    client.targets(actor_id = iso2)
    .loc[lambda x: x['target_type'] == 'Absolute emission reduction',
         ['actor_id', 'baseline_year', 'target_year', 'target_value', 'target_unit']]
)

df_target = nat_targets.drop_duplicates().reset_index().iloc[-1]
baseline_year = int(df_target['baseline_year'])
baseline_emissions = float(df_nat.loc[df_nat['year'] == baseline_year, 'total_emissions'].iloc[0]) * tonnes_to_megatonnes
target_year = int(df_target['target_year'])
percent = int(df_target['target_value'])
percent_decimal = percent / 100
emissions_cut = baseline_emissions * percent_decimal
target_emissions = baseline_emissions - emissions_cut

data = {
    'actor_id': iso2,
    'baseline_year': baseline_year,
    'baseline_emissions': baseline_emissions,
    'target_year': target_year,
    'target_emissions':target_emissions,
    'emissions_reduction': emissions_cut,
    'target_percent': percent
}

national = pd.DataFrame(data, index=[0])
[5]:
national
[5]:
   actor_id  baseline_year  baseline_emissions  target_year  target_emissions  emissions_reduction  target_percent
0  CA        2005           741.182843          2030         407.650564        333.532279           45

Get province emissions and targets

[6]:
data_raw = []
data_scaled = []

tonnes_to_megatonnes = 1 / 10**6

for part in actor_parts['actor_id']:
    data_id = 'ECCC:GHG_inventory:2022-04-13'
    year = 2030
    df_part = get_emissions(part, data_id)

    try:
        df_target = get_target(part, year).drop_duplicates().reset_index().iloc[-1]
        baseline_year = int(df_target['baseline_year'])
        baseline_emissions = float(df_part.loc[df_part['year'] == baseline_year, 'total_emissions'].iloc[0]) * tonnes_to_megatonnes
        target_year = int(df_target['target_year'])
        percent = int(df_target['target_value'])
        percent_decimal = percent / 100
        emissions_cut = baseline_emissions * percent_decimal
        n_years = target_year - baseline_year
        emissions_cut_per_year = emissions_cut / n_years
        target_emissions = baseline_emissions - emissions_cut

        data_raw.append(
            {
            'actor_id': part,
            'baseline_year': baseline_year,
            'baseline_emissions': baseline_emissions,
            'target_year': target_year,
            'target_emissions':target_emissions,
            'emissions_reduction': emissions_cut,
            'avg_reduction_per_year': emissions_cut_per_year,
            'percent': percent
            }
        )

        if target_year>year:
            x = [baseline_year, target_year]
            y = [baseline_emissions, target_emissions]
            lsr_dict = least_squares_regression(x, y)
            lsr = lsr_dict['equation']
            target_year = year
            target_emissions = lsr(target_year)
            emissions_cut = baseline_emissions * percent_decimal
            emissions_cut_per_year = emissions_cut / (target_year - baseline_year)

        data_scaled.append(
            {
            'actor_id': part,
            'baseline_year': baseline_year,
            'baseline_emissions': baseline_emissions,
            'normalized_target_year': target_year,
            'target_emissions':target_emissions,
            'emissions_reduction': emissions_cut,
            'avg_reduction_per_year': emissions_cut_per_year,
            'percent_reduction': percent
            }
        )

    except Exception:
        continue

df_part_targets = pd.DataFrame(data_raw)
df_part_targets_scaled = pd.DataFrame(data_scaled)

Each province has targets with different baseline years, percent reductions, and target years.

[7]:
df_part_targets
[7]:
    actor_id  baseline_year  baseline_emissions  target_year  target_emissions  emissions_reduction  avg_reduction_per_year  percent
0   CA-AB     2005           237.093201          2050         203.900153        33.193048            0.737623                14
1   CA-BC     2007           62.658881           2030         37.595329         25.063552            1.089720                40
2   CA-MB     2005           20.530551           2030         13.755469         6.775082             0.271003                33
3   CA-NB     2005           19.781112           2030         10.681800         9.099312             0.363972                46
4   CA-NL     2001           9.899129            2050         2.474782          7.424347             0.151517                75
5   CA-NS     2005           22.963779           2030         10.792976         12.170803            0.486832                53
6   CA-NT     2005           1.725190            2030         0.862595          0.862595             0.034504                50
7   CA-ON     2005           204.370140          2030         143.059098        61.311042            2.452442                30
8   CA-PE     2005           1.899135            2030         1.329395          0.569740             0.022790                30
9   CA-QC     1990           84.508702           2030         53.240482         31.268220            0.781705                37
10  CA-YT     2010           0.647988            2030         0.453592          0.194396             0.009720                30

To compare these targets with the national goal on an equal footing, we scale all the pledges to 2030 (the target year at the national level), assuming a linear rate of reduction.

[8]:
df_part_targets_scaled
[8]:
    actor_id  baseline_year  baseline_emissions  normalized_target_year  target_emissions  emissions_reduction  avg_reduction_per_year  percent_reduction
0   CA-AB     2005           237.093201          2030                    218.652619        33.193048            1.327722                14
1   CA-BC     2007           62.658881           2030                    37.595329         25.063552            1.089720                40
2   CA-MB     2005           20.530551           2030                    13.755469         6.775082             0.271003                33
3   CA-NB     2005           19.781112           2030                    10.681800         9.099312             0.363972                46
4   CA-NL     2001           9.899129            2030                    5.505128          7.424347             0.256012                75
5   CA-NS     2005           22.963779           2030                    10.792976         12.170803            0.486832                53
6   CA-NT     2005           1.725190            2030                    0.862595          0.862595             0.034504                50
7   CA-ON     2005           204.370140          2030                    143.059098        61.311042            2.452442                30
8   CA-PE     2005           1.899135            2030                    1.329395          0.569740             0.022790                30
9   CA-QC     1990           84.508702           2030                    53.240482         31.268220            0.781705                37
10  CA-YT     2010           0.647988            2030                    0.453592          0.194396             0.009720                30
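
The scaled `target_emissions` values can be reproduced by walking along the straight line from the baseline to the pledged target and reading it off at 2030. The sketch below uses rows from the unscaled table; `scale_target` is a hypothetical helper that expresses the same idea as the regression through two points in the loop above:

```python
def scale_target(baseline_year, baseline_emissions,
                 target_year, target_emissions, common_year=2030):
    """Emissions in common_year on the straight line from baseline to target.

    Targets already due by common_year are returned unchanged, which mirrors
    the `if target_year > year` branch in the loop above.
    """
    if target_year <= common_year:
        return target_emissions
    per_year = (baseline_emissions - target_emissions) / (target_year - baseline_year)
    return baseline_emissions - per_year * (common_year - baseline_year)


# Alberta and British Columbia rows from the unscaled table
print(round(scale_target(2005, 237.093201, 2050, 203.900153), 6))
print(round(scale_target(2007, 62.658881, 2030, 37.595329), 6))
```

Only pledges with a target year after 2030 (Alberta and Newfoundland and Labrador in the table) change under this scaling.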

Calculate target gap

If the provinces are on track to meet Canada’s NDC goal, then the sum of the provincial emissions in the target year (\(E_{prov}\)) will equal the national emissions in the target year (\(E_{nat}\)). However, if the provincial pledges either fall short of or overshoot the national goal, there will be an emissions gap (\(E_{gap}\)). If this gap is positive, the provincial pledges are not enough to meet the goal; if the gap is negative, the provinces have overachieved the goal.

\(E_{nat} + E_{gap}= \sum_{prov=1}^N E_{prov}\)

In this section of the notebook, we will calculate this gap as follows:

\(E_{gap} = \big(\sum_{prov=1}^N E_{prov}\big) - E_{nat}\)

This only takes into account provinces with targets. Saskatchewan and Nunavut do not have targets. For these two, we will assume their emissions remain at pre-pandemic 2019 levels, since their future emissions trajectory is unknown. The revised gap, which takes into account emissions from Saskatchewan (\(E_{2019,SK}\)) and Nunavut (\(E_{2019,NU}\)), is:

\(E_{gap} = \big(\sum_{prov=1}^N E_{prov}\big) + E_{2019,SK} + E_{2019,NU} - E_{nat}\)
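
As a quick check of the first gap equation, the sum can be evaluated by hand from the tables above. This is a sketch only: `emissions_gap` is a hypothetical helper, the provincial values are the 2030-scaled `target_emissions` column, and the optional `untargeted_emissions` argument stands for the \(E_{2019,SK}\) and \(E_{2019,NU}\) terms, whose values are not listed in the tables:

```python
def emissions_gap(provincial_targets, national_target, untargeted_emissions=()):
    """E_gap = sum(E_prov) + sum(untargeted) - E_nat, all in the same unit."""
    return sum(provincial_targets) + sum(untargeted_emissions) - national_target


# 2030-scaled provincial targets (Mt) from the scaled table above
scaled_targets = [218.652619, 37.595329, 13.755469, 10.681800, 5.505128,
                  10.792976, 0.862595, 143.059098, 1.329395, 53.240482, 0.453592]
national_target = 407.650564  # Canada's 2030 target from the national table

print(round(emissions_gap(scaled_targets, national_target)))
```

A positive result means the provincial pledges leave emissions above the national target.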

[9]:
sum_subat_target = df_part_targets_scaled['target_emissions'].sum()
national_target = float(national['target_emissions'].iloc[0])

gap = sum_subat_target - national_target

print(f'''
If each province meets their goal (ignoring emissions from Saskatchewan and Nunavut),
there will still be an {round(gap)} MtCO2e gap in the target.
''')

If each province meets their goal (ignoring emissions from Saskatchewan and Nunavut),
there will still be an 88 MtCO2e gap in the target.

[10]:
missing_actors = list(set(actor_parts['actor_id']) - set(df_part_targets['actor_id']))
data_id = 'ECCC:GHG_inventory:2022-04-13'
df_missing = get_emissions(missing_actors, data_id)
df_missing = df_missing.assign(total_emissions = df_missing['total_emissions'] / 10**6)
missing_emissions = df_missing.loc[(df_missing['actor_id'].isin(missing_actors)) & (df_missing['year'] == 2019), 'total_emissions'].sum()
gap_revised = (sum_subat_target + missing_emissions) - national_target

print(f'''
If we assume Saskatchewan and Nunavut emissions remain constant at pre-pandemic levels,
then the emissions gap increases to {round(gap_revised)} MtCO2e.
''')

If we assume Saskatchewan and Nunavut emissions remain constant at pre-pandemic levels,
then the emissions gap increases to 167 MtCO2e.

[11]:
print(f'''
This emissions gap is similar to the total reduction from all the pledges ({round(df_part_targets_scaled['emissions_reduction'].sum())} MtCO2e),
meaning provincial commitments need to roughly double to meet the national goal.
''')

This emissions gap is similar to the total reduction from all the pledges (188 MtCO2e),
meaning provincial commitments need to roughly double to meet the national goal.

Create figure

[12]:
df_tmp = (
    df_missing
    .loc[(df_missing['actor_id'].isin(missing_actors)) & (df_missing['year'] == 2019), ['actor_id','total_emissions']]
    .rename(columns={'total_emissions':'target_emissions'})
)

df_fin = pd.concat([df_part_targets_scaled[['actor_id', 'target_emissions']], df_tmp]).reset_index(drop=True)
df_fin = df_fin.sort_values(by='target_emissions', ascending=False)
df_fin['cumulative'] = df_fin['target_emissions'].cumsum()
[13]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)

ax.bar(0, national['target_emissions'], bottom=0, label='CA')

previous = 0
for iterator, row in df_fin.iterrows():
    emissions = row['target_emissions']
    cumulative = row['cumulative']
    actor_id = row['actor_id']
    ax.bar(1, emissions, bottom=previous, label=actor_id)
    previous = cumulative
    ax.text(1.5, previous - (emissions/2),
          actor_id,
          fontsize=12,
          color='k')

# Turn off the display of all ticks.
ax.tick_params(which='both',     # Options for both major and minor ticks
               top=False,        # turn off top ticks
               left=False,       # turn off left ticks
               right=False,      # turn off right ticks
               bottom=False)     # turn off bottom ticks

# Remove x tick marks
plt.setp(ax.get_xticklabels(), rotation=0)

# Hide the right and top spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)

# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')

# grid and tick marks
ax.set_yticks(np.arange(0, 700, 100))
ax.grid(axis='y',
        which='major',
        color=[0.8, 0.8, 0.8], linestyle='-')

ax.set_axisbelow(True)
ax.set_xticks([0, 1])
ax.set_xticklabels(['National', 'Provinces'])
ax.set_title("2030 Emission targets")
ax.set_ylabel("Emissions (MtCO$_2$-eq)")
[13]:
Text(0, 0.5, 'Emissions (MtCO$_2$-eq)')
_images/notebooks_canada_target_gap_19_1.png
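
The stacked bar above relies on each segment starting where the running total of the previous segments ended. A minimal sketch of that bookkeeping, using made-up numbers (the actor names and values below are illustrative, not OpenClimate data), might look like this:

```python
import pandas as pd

# Toy data: illustrative values only, not taken from OpenClimate.
df_toy = pd.DataFrame({
    "actor_id": ["A", "B", "C"],
    "target_emissions": [10.0, 30.0, 20.0],
})

# Same preparation as df_fin: largest target first, then a running total.
df_toy = (
    df_toy
    .sort_values(by="target_emissions", ascending=False)
    .reset_index(drop=True)
)
df_toy["cumulative"] = df_toy["target_emissions"].cumsum()

# Each segment starts where the previous one ended ...
df_toy["bottom"] = df_toy["cumulative"] - df_toy["target_emissions"]
# ... and its label sits at the vertical midpoint of the segment.
df_toy["label_y"] = df_toy["cumulative"] - df_toy["target_emissions"] / 2

print(df_toy)
```

The `bottom` column matches the value of `previous` at the moment each bar is drawn in the loop, and `label_y` matches the y position passed to `ax.text`.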

Contribution Guide

Contributions are very welcome and appreciated. Every little bit helps, so do not hesitate! You can have a big impact on openclimate just by using it and reporting issues.

The following sections cover some general guidelines regarding development in openclimate for maintainers and contributors.

Nothing here is set in stone; everything can be changed. Feel free to suggest improvements or changes in the workflow.

Feature requests and feedback

We are eager to hear about your requests for new features and any suggestions about the API, infrastructure, and so on. Feel free to submit these as issues with the label “feature request.”

Please make sure to explain in detail how the feature should work and keep the scope as narrow as possible. This will make it easier to implement in small PRs.

Report bugs

Report bugs for openclimate in the issue tracker with the label “bug”.

If you can write a demonstration test that currently fails but should pass, that is a very useful commit to make as well, even if you cannot fix the bug itself.

Fix bugs

Look through the GitHub issues for bugs.

Talk to developers to find out how you can fix specific bugs.

Preparing Pull Requests

  1. Fork the OpenClimate-pyclient GitHub repository. It’s fine to use OpenClimate-pyclient as your fork repository name because it will live under your username.

  2. Clone your fork locally using git, connect your repository to the upstream (main project), and create a branch:

    $ git clone git@github.com:YOUR_GITHUB_USERNAME/OpenClimate-pyclient.git
    $ cd OpenClimate-pyclient
    $ git remote add upstream git@github.com:Open-Earth-Foundation/OpenClimate-pyclient.git
    
    # now, to fix a bug or add a feature, create your own branch off "master":
    
    $ git checkout -b your-bugfix-feature-branch-name master
    

    If you need some help with Git, follow this quick start guide: https://git.wiki.kernel.org/index.php/QuickStart

  3. Set up a conda environment with all necessary dependencies:

    $ conda env create -f ci/environment-py3.8.yml
    
  4. Activate your environment:

    $ conda activate test_env_openclimate
    
  5. Install the openclimate package:

    $ pip install -e . --no-deps
    
  6. Before you modify anything, ensure that the setup works by executing all tests:

    $ pytest
    

    You want to see an output indicating no failures, like this:

    ========================== n passed, j warnings in 17.07s ===========================
    
  7. Finally, submit a pull request through the GitHub website using this data:

    head-fork: YOUR_GITHUB_USERNAME/
    compare: your-branch-name
    
    base-fork: Open-Earth-Foundation/OpenClimate-pyclient
    base: master
    

    The merged pull request will undergo the same testing that your local branch had to pass when pushing.

Code Contributors

Indices and tables