OpenClimate Python Client Documentation
OpenClimate is a datastore for emissions data and target pledges. The OpenClimate Python Client is a Python 3.8+ package that provides a high-level interface to the OpenClimate API.
The goal of this package is to make it easier to focus on the analysis by abstracting away the details of making HTTP requests and handling responses.
Note
This is a work in progress. We strongly encourage you to open issues and contribute code.
Installation
# for latest release
pip install openclimate
# for bleeding-edge up-to-date commit
pip install -e git+https://github.com/Open-Earth-Foundation/OpenClimate-pyclient.git
Quickstart Guide
Once installed, import the package and create a Client() object.
from openclimate import Client
client = Client()
# if using Jupyter or IPython
client.jupyter
Note
You need to run client.jupyter for the client package to work properly in Jupyter or IPython.
Emissions
Retrieve all emissions data for a single actor. Here I am retrieving emissions data for Canada.
df = client.emissions(actor_id='CA')
Retrieve all emissions data for a list of actors. Here I am retrieving emissions data for the United States, Canada, and Great Britain.
df = client.emissions(actor_id=['US','CA','GB'])
Return the different datasets available for a particular actor:
df = client.emissions_datasets(actor_id='US')
Select emissions data from a single dataset only:
df = client.emissions(actor_id='US', datasource_id='GCB2022:national_fossil_emissions:v1.0')
Targets
Retrieve emissions targets for a particular actor.
df = client.targets(actor_id='US')
Population
Retrieve population data.
df = client.population(actor_id=['US','CA','GB'])
GDP
Retrieve GDP data.
df = client.gdp(actor_id=['US','CA','GB'])
Searching for codes
Use the following to list the actor_ids for countries:
df = client.country_codes()
Search for actor codes:
df = client.search(query='Minnesota')
Get all the parts of an actor. Here I am returning the actor_id for each US state.
df = client.parts(actor_id='US', part_type='adm1')
Autogenerated API
- class openclimate.ActorOverview.ActorOverview(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]
ActorOverview API class: get overview information for an actor
- Returns:
object
Methods
country_codes
([like, case_sensitive, regex])returns two-letter country codes
overview
(actor_id[, ignore_warnings])Retrieve actor overview
parts
(actor_id[, part_type])Retrieve actor parts (e.g. subnational, cities, …)
- country_codes(like: Optional[str] = None, case_sensitive: bool = False, regex: bool = True, *args, **kwargs) DataFrame [source]
returns two-letter country codes
- Parameters:
like (str, optional) – filters names. Defaults to None.
case_sensitive (bool, optional) – make search case-sensitive. Defaults to False.
regex (bool, optional) – use regular expression like phrases. Defaults to True.
- Returns:
pd.DataFrame
- overview(actor_id: Union[str, List[str], Tuple[str]], ignore_warnings: bool = False)[source]
Retrieve actor overview
- Parameters:
actor_id (Union[str, List[str], Tuple[str]]) – actor identifier. Defaults to None.
ignore_warnings (bool) – ignore warning messages
- Returns:
dictionary with actor overview
- Return type:
List[Dict]
- parts(actor_id: str, part_type: Optional[str] = None, *args, **kwargs) DataFrame [source]
Retrieve actor parts (e.g. subnational, cities, …)
- Parameters:
actor_id (str) – code for the actor you want to retrieve
part_type (str, optional) – administrative level
- Returns:
dataframe of actor parts
- Return type:
DataFrame
- class openclimate.Base.Base(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]
Base API class; defines HTTP access to the API
- Returns:
object
- base_url: str = 'https://openclimate.openearth.dev'
- server: str = 'https://openclimate.openearth.dev/api/v1'
- version: str = '/api/v1'
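The defaults listed above suggest that server is simply base_url followed by version. A minimal sketch of that relationship, using a hypothetical ApiConfig dataclass that is an illustration only and not the package's Base class:

```python
from dataclasses import dataclass


@dataclass
class ApiConfig:
    """Hypothetical stand-in for the URL settings listed above."""
    version: str = "/api/v1"
    base_url: str = "https://openclimate.openearth.dev"

    @property
    def server(self) -> str:
        # Join the host and the API version into the root URL for requests.
        return f"{self.base_url}{self.version}"


config = ApiConfig()
print(config.server)
```

With the default values this prints the same string as the documented server default.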
- class openclimate.Client.Client(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]
OpenClimate API Python Client
If you are using Jupyter, either run
client = Client()
client.jupyter
or manually add the following lines of code
import nest_asyncio
nest_asyncio.apply()
- Attributes:
- jupyter
Methods
country_codes
([like, case_sensitive, regex])get country codes and filter using like regex phrases
emissions
(actor_id[, datasource_id, ...])retrieve actor emissions
emissions_datasets
(actor_id[, ignore_warnings])retrieve actor emissions datasets
gdp
(actor_id[, ignore_warnings])retrieve actor GDP
parts
(actor_id[, part_type])retrieve actor parts
population
(actor_id[, ignore_warnings])retrieve actor population
search
([name, identifier, query, language, ...])search actor names and identifiers
targets
(actor_id[, ignore_warnings])retrieve actor targets
- country_codes(like: Optional[str] = None, case_sensitive: bool = False, regex: bool = True, *args, **kwargs) DataFrame [source]
get country codes and filter using like regex phrases
- Parameters:
like (str) – phrase to search for in name (optional)
case_sensitive (bool) – case-sensitive search [default: False] (optional)
regex (bool) – use regex with like [default: True] (optional)
- Returns:
dataframe of country codes
- Return type:
DataFrame
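To make the interaction of like, case_sensitive and regex concrete, here is a minimal sketch built on the standard re module. The filter_names helper is hypothetical and only mirrors the documented options; the package's own matching logic may differ.

```python
import re


def filter_names(names, like=None, case_sensitive=False, regex=True):
    """Hypothetical helper mirroring the like/case_sensitive/regex options."""
    if like is None:
        return list(names)
    # Treat the phrase literally unless regex matching is requested.
    pattern = like if regex else re.escape(like)
    flags = 0 if case_sensitive else re.IGNORECASE
    return [name for name in names if re.search(pattern, name, flags)]


countries = ["Canada", "China", "Chile", "United States of America"]
print(filter_names(countries, like="^ch"))                       # ['China', 'Chile']
print(filter_names(countries, like="^ch", case_sensitive=True))  # []
print(filter_names(countries, like="^ch", regex=False))          # []
```

The last call returns nothing because, with regex disabled, the caret is matched as a literal character.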
- emissions(actor_id: str, datasource_id: Optional[str] = None, ignore_warnings: bool = False) DataFrame [source]
retrieve actor emissions
- Parameters:
actor_id (str|List[str]) – code for the actor you want to retrieve
datasource_id (str) – code of the emissions dataset
ignore_warnings (bool) – ignore warning messages
- Returns:
data for each emissions dataset
- Return type:
DataFrame
- emissions_datasets(actor_id: str, ignore_warnings: bool = False) DataFrame [source]
retrieve actor emissions datasets
- Parameters:
actor_id (str) – code for the actor you want to retrieve
ignore_warnings (bool) – ignore warning messages
- Returns:
data of emission datasets
- Return type:
DataFrame
- gdp(actor_id: str, ignore_warnings: bool = False) DataFrame [source]
retrieve actor GDP
- Parameters:
actor_id (str|List[str]) – code for the actor you want to retrieve
ignore_warnings (bool) – ignore warning messages
- Returns:
dataframe of GDP
- Return type:
DataFrame
- property jupyter
- parts(actor_id: str, part_type: Optional[str] = None, *args, **kwargs) DataFrame [source]
retrieve actor parts
returns subnational actors, cities, companies, etc. within an actor_id
- Parameters:
actor_id (str|List[str]) – code for the actor you want to retrieve
part_type (str) – retrieve actors from administrative part [‘planet’, ‘country’, ‘adm1’, ‘adm2’, ‘city’, ‘organization’, ‘site’]
- Returns:
dataframe of actor parts
- Return type:
DataFrame
- population(actor_id: str, ignore_warnings: bool = False) DataFrame [source]
retrieve actor population
- Parameters:
actor_id (str|List[str]) – code for the actor you want to retrieve
ignore_warnings (bool) – ignore warning messages
- Returns:
dataframe of population
- Return type:
DataFrame
- search(name: Optional[str] = None, identifier: Optional[str] = None, query: Optional[str] = None, language: Optional[str] = None, namespace: Optional[str] = None, *args, **kwargs) DataFrame [source]
search actor names and identifiers
- Parameters:
query (str) – full search of identifiers and names that include the search parameter
name (str) – searches for actors with exact name match (e.g. “Minnesota”)
language (str, optional) – two letter language code [requires name to be set]
identifier (str) – searches for actors with exact identifier code match (e.g. “US”)
namespace (str, optional) – actor namespace code [requires identifier to be set]
- Returns:
dataframe of search results
- Return type:
DataFrame
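The parameter list above implies two dependencies: language only makes sense together with name, and namespace only together with identifier. A minimal sketch of those rules follows; build_search_params is a hypothetical helper and the returned dictionary keys are assumptions, not the package's actual request format.

```python
from urllib.parse import urlencode


def build_search_params(name=None, identifier=None, query=None,
                        language=None, namespace=None):
    """Hypothetical validator for the parameter rules listed above."""
    if language is not None and name is None:
        raise ValueError("language requires name to be set")
    if namespace is not None and identifier is None:
        raise ValueError("namespace requires identifier to be set")
    candidates = {"name": name, "identifier": identifier, "query": query,
                  "language": language, "namespace": namespace}
    # Drop unset options so only meaningful keys remain.
    return {key: value for key, value in candidates.items() if value is not None}


params = build_search_params(name="Minnesota", language="en")
print(urlencode(params))  # name=Minnesota&language=en
```

Calling build_search_params(language="en") without a name raises a ValueError in this sketch.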
- class openclimate.Emissions.Emissions(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]
Methods
datasets
(actor_id[, ignore_warnings])retrieve emissions datasets for an actor
emissions
(actor_id[, datasource_id, ...])retrieve actor emissions
- datasets(actor_id: Union[str, List[str], Tuple[str]], ignore_warnings: bool = False, *args, **kwargs) DataFrame [source]
retrieve emissions datasets for an actor
- Parameters:
actor_id (Union[str, List[str], Tuple[str]], optional) – actor code
ignore_warnings (bool, optional) – ignore warnings messages
- Return type:
pd.DataFrame
- emissions(actor_id: Union[str, List[str], Tuple[str]], datasource_id: Optional[str] = None, ignore_warnings: bool = False, *args, **kwargs) DataFrame [source]
retrieve actor emissions
- Parameters:
actor_id (Union[str, List[str], Tuple[str]], optional) – actor code
datasource_id (str, optional) – emissions datasource. Defaults to None.
ignore_warnings (bool, optional) – ignore warnings messages
- Returns:
dataframe of actor emissions
- Return type:
pd.DataFrame
- class openclimate.GDP.GDP(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]
Methods
gdp
(actor_id[, ignore_warnings])retrieve actor GDP
- class openclimate.Population.Population(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]
Methods
population
(actor_id[, ignore_warnings])retrieve actor population
- population(actor_id: Union[str, List[str], Tuple[str]], ignore_warnings: bool = False, *args, **kwargs) DataFrame [source]
retrieve actor population
- Parameters:
actor_id (Union[str, List[str], Tuple[str]], optional) – actor code
ignore_warnings (bool) – ignore warning messages
- Return type:
pd.DataFrame
- class openclimate.Search.Search(version: str = '/api/v1', base_url: str = 'https://openclimate.openearth.dev', server: str = 'https://openclimate.openearth.dev/api/v1')[source]
Methods
search
([name, identifier, query, language, ...])search actors
- search(name: Optional[str] = None, identifier: Optional[str] = None, query: Optional[str] = None, language: Optional[str] = None, namespace: Optional[str] = None, *args, **kwargs) DataFrame [source]
search actors
- Parameters:
query (str) – full search of identifiers and names that include the search parameter
name (str) – searches for actors with exact name match (e.g. “Minnesota”)
language (str, optional) – two letter language code [requires name to be set]
identifier (str) – searches for actors with exact identifier code match (e.g. “US”)
namespace (str, optional) – actor namespace code [requires identifier to be set]
- Returns:
dataframe with search results
- Return type:
pd.DataFrame
Emissions and Emissions per capita
In this tutorial I will use openclimate to create time series of emissions and emissions per capita for countries.
[1]:
from itertools import cycle
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
import numpy as np
import pandas as pd
[2]:
from openclimate import Client
client = Client()
client.jupyter
Let’s start by getting all the country codes.
[3]:
df_names = client.parts('EARTH')[['actor_id', 'name']]
# reuse the dataframe instead of requesting the parts a second time
actor_ids = tuple(df_names['actor_id'])
Emissions
Let’s use fossil CO2 emissions from the Global Carbon Budget 2022. You can use client.emissions_datasets() to list all available datasets. Be patient: retrieving the data for about 250 countries takes roughly 20 seconds.
[4]:
%%time
df_emissions = client.emissions(actor_ids, 'GCB2022:national_fossil_emissions:v1.0')
CPU times: user 5.49 s, sys: 245 ms, total: 5.73 s
Wall time: 20.5 s
This returns a dataframe with total_emissions in tonnes of CO2.
[5]:
df_emissions.sample(5)
[5]:
| | actor_id | year | total_emissions | datasource_id |
---|---|---|---|---|
167 | AW | 1965 | 592387 | GCB2022:national_fossil_emissions:v1.0 |
227 | TM | 1993 | 27516396 | GCB2022:national_fossil_emissions:v1.0 |
89 | AL | 1981 | 7339621 | GCB2022:national_fossil_emissions:v1.0 |
165 | GB | 1915 | 489481088 | GCB2022:national_fossil_emissions:v1.0 |
211 | RU | 1977 | 1964405077 | GCB2022:national_fossil_emissions:v1.0 |
Let’s first rank the countries by their emissions in the most recent year and display the top 10 emitters.
[6]:
year = df_emissions.year.max()
df_recent = (
    df_emissions
    .loc[df_emissions.year == year]
    .assign(rank = lambda x: x['total_emissions'].rank(ascending=False))
    .assign(percent_of_global = lambda x: (x['total_emissions'] / x['total_emissions'].sum()) * 100)
    .sort_values(by='rank')
    .merge(df_names, on='actor_id')
    .loc[:, ['rank', 'name', 'actor_id', 'year', 'total_emissions', 'percent_of_global']]
)
df_recent.head(10)
[6]:
| | rank | name | actor_id | year | total_emissions | percent_of_global |
---|---|---|---|---|---|---|
0 | 1.0 | China | CN | 2021 | 11472369170 | 31.777308 |
1 | 2.0 | United States of America | US | 2021 | 5007335888 | 13.869816 |
2 | 3.0 | India | IN | 2021 | 2709683624 | 7.505551 |
3 | 4.0 | Russian Federation | RU | 2021 | 1755547389 | 4.862690 |
4 | 5.0 | Japan | JP | 2021 | 1067398435 | 2.956586 |
5 | 6.0 | Iran | IR | 2021 | 748878751 | 2.074319 |
6 | 7.0 | Germany | DE | 2021 | 674753565 | 1.868999 |
7 | 8.0 | Saudi Arabia | SA | 2021 | 672379870 | 1.862425 |
8 | 9.0 | Indonesia | ID | 2021 | 619277532 | 1.715336 |
9 | 10.0 | Korea, the Republic of | KR | 2021 | 616074996 | 1.706466 |
China was responsible for the lion’s share of global CO2 emissions in 2021 at nearly 32%. This is nearly as much as the next six countries combined! However, this is just a snapshot in time. Let’s plot time series for each of the top 7 emitters.
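The comparison with the next six countries can be checked directly from the percent_of_global column in the table above. A short sketch of that arithmetic, with the values copied from the table:

```python
# Percent-of-global values copied from the 2021 ranking table.
shares = {
    "CN": 31.777308, "US": 13.869816, "IN": 7.505551, "RU": 4.862690,
    "JP": 2.956586, "IR": 2.074319, "DE": 1.868999,
}

china = shares["CN"]
next_six = sum(value for code, value in shares.items() if code != "CN")

print(f"China: {china:.2f}%")
print(f"Next six combined: {next_six:.2f}%")
print(f"Ratio: {china / next_six:.2f}")
```

China's share (about 31.8%) comes out slightly below the combined share of the next six countries (about 33.1%).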
[7]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)
# top 7 emitters
top_emitters = list(df_recent.head(7).actor_id)
# wong color palette (https://davidmathlogic.com/colorblind/#%23D81B60-%231E88E5-%23FFC107-%23004D40)
colors = ['#000000', '#E69F00', '#56B4E9', '#009E73', '#F0E442', '#0072B2', '#D55E00', '#CC79A7']
for actor_id, color in zip(top_emitters, cycle(colors)):
    actor_name = df_names.loc[df_names['actor_id'] == actor_id, 'name'].values[0]
    filt = df_emissions['actor_id'] == actor_id
    df_tmp = df_emissions.loc[filt]
    # convert tonnes to gigatonnes for plotting
    ax.plot(np.array(df_tmp['year']), np.array(df_tmp['total_emissions']) / 10**9,
            linewidth=4,
            label=actor_name,
            color=color)
ylim = [0, 12]
ax.set_ylim(ylim)
ax.set_xlim([1850, 2022])
# Turn off the display of all ticks.
ax.tick_params(which='both',  # Options for both major and minor ticks
               top=False,     # turn off top ticks
               left=False,    # turn off left ticks
               right=False,   # turn off right ticks
               bottom=False)  # turn off bottom ticks
# Keep x tick labels horizontal
plt.setp(ax.get_xticklabels(), rotation=0)
# Hide all four spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')
# major/minor tick lines
ax.xaxis.set_minor_locator(AutoMinorLocator(5))
ax.grid(axis='y',
which='major',
color=[0.8, 0.8, 0.8], linestyle='-')
ax.set_ylabel("Emissions (GtCO$_2$)", fontsize=12)
ax.legend(loc='upper left', frameon=False)

This tells a richer story. Now we see the US was the main annual contributor up until about the year 2000, after which Chinese emissions skyrocketed while US emissions started declining.
An interesting feature in this graph is the dramatic drop in Russian emissions, which corresponds to the fall of the Soviet Union. Key drivers of the emissions reductions were decreasing beef consumption in the 1990s and carbon sequestration in soils on abandoned cropland (Schierhorn et al., 2019).
Emissions per capita
Let’s retrieve population data and calculate emissions per capita for the seven countries with the highest annual emissions.
[8]:
# population for the seven countries with the highest annual emissions in 2021
df_pop = client.population(tuple(df_recent.head(7)['actor_id']))
# calculate emissions per capita
df_percap = pd.merge(df_emissions, df_pop, on=['actor_id', 'year'])[['actor_id', 'year', 'total_emissions', 'population']]
df_percap = df_percap.assign(total_emissions_per_capita = lambda x: x['total_emissions'] / (x['population']))
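The merge above is an inner join on actor_id and year, so rows without a matching population record are dropped before the division. A small self-contained sketch of the same pattern; the actors "AA" and "BB" and all numbers are illustrative placeholders, not real data:

```python
import pandas as pd

# Illustrative placeholder values, not real data.
df_emissions_demo = pd.DataFrame({
    "actor_id": ["AA", "AA", "BB"],
    "year": [2020, 2021, 2021],
    "total_emissions": [1000.0, 1200.0, 900.0],
})
df_pop_demo = pd.DataFrame({
    "actor_id": ["AA", "BB"],
    "year": [2021, 2021],
    "population": [100, 300],
})

# An inner merge keeps only (actor_id, year) pairs present in both frames.
df_demo = pd.merge(df_emissions_demo, df_pop_demo, on=["actor_id", "year"])
df_demo = df_demo.assign(
    total_emissions_per_capita=lambda x: x["total_emissions"] / x["population"]
)
print(df_demo)
```

The 2020 row for "AA" has no population entry, so it does not appear in the result.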
[9]:
year = df_percap.year.max()
df_recent_percap = (
    df_percap
    .loc[df_percap.year == year]
    .assign(rank = lambda x: x['total_emissions_per_capita'].rank(ascending=False))
    .assign(percent_of_global = lambda x: (x['total_emissions'] / x['total_emissions'].sum()) * 100)
    .sort_values(by='rank')
    .merge(df_names, on='actor_id')
    .loc[:, ['rank', 'name', 'actor_id', 'year', 'total_emissions_per_capita', 'percent_of_global']]
)
[10]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)
# top 7 emitters
top_emitters = list(df_recent_percap.head(7).actor_id)
# wong color palette (https://davidmathlogic.com/colorblind/#%23D81B60-%231E88E5-%23FFC107-%23004D40)
colors = ['#000000', '#E69F00', '#56B4E9', '#009E73', '#F0E442', '#0072B2', '#D55E00', '#CC79A7']
for actor_id, color in zip(top_emitters, cycle(colors)):
    actor_name = df_names.loc[df_names['actor_id'] == actor_id, 'name'].values[0]
    filt = df_percap['actor_id'] == actor_id
    df_tmp = df_percap.loc[filt]
    ax.plot(np.array(df_tmp['year']), np.array(df_tmp['total_emissions_per_capita']),
            linewidth=4,
            label=actor_name,
            color=color)
ylim = [0, 30]
ax.set_ylim(ylim)
ax.set_xlim([1950, 2022])
# Turn off the display of all ticks.
ax.tick_params(which='both',  # Options for both major and minor ticks
               top=False,     # turn off top ticks
               left=False,    # turn off left ticks
               right=False,   # turn off right ticks
               bottom=False)  # turn off bottom ticks
# Keep x tick labels horizontal
plt.setp(ax.get_xticklabels(), rotation=0)
# Hide all four spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')
# major/minor tick lines
ax.xaxis.set_minor_locator(AutoMinorLocator(5))
ax.grid(axis='y',
which='major',
color=[0.8, 0.8, 0.8], linestyle='-')
ax.set_ylabel("Emissions per capita (tCO$_2$)", fontsize=12)
ax.legend(loc='upper left', frameon=False)

This graph shows that the average person in the US emits about twice as much CO2 annually as the average person in China, even though China has nearly 4 times the US population.
Cumulative emissions
This example walks through calculating and visualizing cumulative emissions.
[1]:
from itertools import cycle
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
from openclimate import Client
import numpy as np
import pandas as pd
We will first initialize a Client() object.
[2]:
client = Client()
If you are using a Jupyter environment, you will first need to run client.jupyter. This patches the asyncio library to work in Jupyter environments using nest-asyncio.
[3]:
client.jupyter
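The reason for this step can be reproduced without Jupyter. In the sketch below (standard library only; fetch_value and notebook_like_cell are illustrative names, not part of openclimate), calling asyncio.run() while an event loop is already running raises a RuntimeError, which is the situation inside a notebook kernel. nest-asyncio patches asyncio so that such nested calls are allowed.

```python
import asyncio


async def fetch_value():
    """Stand-in for an asynchronous API request."""
    await asyncio.sleep(0)
    return 42


async def notebook_like_cell():
    # A loop is already running here, as it is inside a Jupyter kernel.
    coro = fetch_value()
    try:
        asyncio.run(coro)
    except RuntimeError as err:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(err)
    return "no error"


message = asyncio.run(notebook_like_cell())
print(message)
```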
Get country codes
OpenClimate references each country by its two-letter ISO 3166 code. To access these in openclimate, we can use the .parts() method to get all the “parts” of EARTH. Other codes we use are UN/LOCODEs for cities and LEIs for companies. As a catch-all term, we call each of them an actor_id.
[4]:
df_country = client.parts('EARTH')
Looking at the dataframe that’s returned, we have a column with each country’s actor_id.
[5]:
df_country.head()
[5]:
| | actor_id | name | type | has_data | has_children | children_have_data |
---|---|---|---|---|---|---|
5 | AD | Andorra | country | True | None | None |
234 | AE | United Arab Emirates | country | True | None | None |
0 | AF | Afghanistan | country | True | None | None |
9 | AG | Antigua and Barbuda | country | True | None | None |
7 | AI | Anguilla | country | True | None | None |
Let’s save each actor_id and its name to a list of tuples.
[6]:
iso_and_name = list(zip(df_country['actor_id'], df_country['name']))
Which datasets are available?
To get a list of datasets available for an actor, you can use the .emissions_datasets() method. Here I am asking for datasets with Canadian emissions.
[7]:
client.emissions_datasets('CA')
[7]:
| | actor_id | datasource_id | name | publisher | published | URL |
---|---|---|---|---|---|---|
0 | CA | BP:statistical_review_june2022 | Statistical Review of World Energy all data, 1... | BP | 2022-06-01T00:00:00.000Z | https://www.bp.com/en/global/corporate/energy-... |
1 | CA | EDGARv7.0:ghg | Emissions Database for Global Atmospheric Rese... | JRC | 2022-01-01T00:00:00.000Z | https://edgar.jrc.ec.europa.eu/dataset_ghg70 |
2 | CA | GCB2022:national_fossil_emissions:v1.0 | Data supplement to the Global Carbon Budget 20... | GCP | 2022-11-04T00:00:00.000Z | https://www.icos-cp.eu/science-and-impact/glob... |
3 | CA | PRIMAP:10.5281/zenodo.7179775:v2.4 | PRIMAP-hist_v2.4_no_extrap (scenario=HISTCR) | PRIMAP | 2022-10-17T00:00:00.000Z | https://zenodo.org/record/7179775 |
4 | CA | UNFCCC:GHG_ANNEX1:2019-11-08 | UNFCCC GHG total without LULUCF, ANNEX I count... | UNFCCC | 2019-11-08T00:00:00.000Z | https://di.unfccc.int/time_series |
5 | CA | climateTRACE:country_inventory | climate TRACE: country inventory | climate TRACE | 2022-12-02T00:00:00.000Z | https://climatetrace.org/inventory |
6 | CA | WRI:climate_watch_historical_ghg:2022 | Climate Watch Historical GHG Emissions | WRI | 2022-01-01T00:00:00.000Z | https://www.climatewatchdata.org/ghg-emissions |
7 | CA | IEA:GHG_energy_highlights:2022 | Greenhouse Gas Emissions from Energy Highlights | IEA | 2022-09-01T00:00:00.000Z | https://www.iea.org/data-and-statistics/data-p... |
You can return datasets for multiple actors at once by passing them as an iterable, such as a list or tuple. Here I am asking for Canadian and Italian emission datasets, but only returning a sample of 5 records.
[8]:
client.emissions_datasets(['CA', 'IT']).sample(5)
[8]:
| | actor_id | datasource_id | name | publisher | published | URL |
---|---|---|---|---|---|---|
7 | CA | IEA:GHG_energy_highlights:2022 | Greenhouse Gas Emissions from Energy Highlights | IEA | 2022-09-01T00:00:00.000Z | https://www.iea.org/data-and-statistics/data-p... |
17 | IT | IEA:GHG_energy_highlights:2022 | Greenhouse Gas Emissions from Energy Highlights | IEA | 2022-09-01T00:00:00.000Z | https://www.iea.org/data-and-statistics/data-p... |
0 | CA | BP:statistical_review_june2022 | Statistical Review of World Energy all data, 1... | BP | 2022-06-01T00:00:00.000Z | https://www.bp.com/en/global/corporate/energy-... |
14 | IT | climateTRACE:country_inventory | climate TRACE: country inventory | climate TRACE | 2022-12-02T00:00:00.000Z | https://climatetrace.org/inventory |
5 | CA | climateTRACE:country_inventory | climate TRACE: country inventory | climate TRACE | 2022-12-02T00:00:00.000Z | https://climatetrace.org/inventory |
Get emissions
If we just pass an actor_id to the .emissions() method, emissions from every available dataset will be returned.
[9]:
df_tmp = client.emissions(actor_id='US')
df_tmp.head()
[9]:
| | actor_id | year | total_emissions | datasource_id |
---|---|---|---|---|
0 | US | 1990 | 5275397531 | BP:statistical_review_june2022 |
1 | US | 1991 | 5225911642 | BP:statistical_review_june2022 |
2 | US | 1992 | 5308410257 | BP:statistical_review_june2022 |
3 | US | 1993 | 5412149078 | BP:statistical_review_june2022 |
4 | US | 1994 | 5505379237 | BP:statistical_review_june2022 |
Keep in mind that this will return all the data for that actor. Below are the datasets available.
[10]:
set(df_tmp['datasource_id'])
[10]:
{'BP:statistical_review_june2022',
'EDGARv7.0:ghg',
'GCB2022:national_fossil_emissions:v1.0',
'IEA:GHG_energy_highlights:2022',
'PRIMAP:10.5281/zenodo.7179775:v2.4',
'UNFCCC:GHG_ANNEX1:2019-11-08',
'WRI:climate_watch_historical_ghg:2022',
'carbon_monitor:2022_12_14',
'climateTRACE:country_inventory'}
In most cases, we want to filter this and use a particular dataset. We can do that with the datasource_id parameter.
[11]:
df_tmp = client.emissions(actor_id='US', datasource_id='PRIMAP:10.5281/zenodo.7179775:v2.4')
As a sanity check, let’s look at which datasets are returned.
[12]:
set(df_tmp['datasource_id'])
[12]:
{'PRIMAP:10.5281/zenodo.7179775:v2.4'}
As you see, only PRIMAP was returned.
Get emissions for all countries
Now let’s get emissions for all countries.
[13]:
%%time
iso_codes = [iso_code[0] for iso_code in iso_and_name]
df_emissions = client.emissions(
    actor_id=iso_codes,
    datasource_id='PRIMAP:10.5281/zenodo.7179775:v2.4'
)
CPU times: user 5.44 s, sys: 277 ms, total: 5.71 s
Wall time: 20.1 s
This takes about 20 seconds to retrieve all that data, even with asyncio working behind the scenes. The output is a large dataframe with the data from all countries concatenated together.
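How the client schedules its requests internally is not shown here, but the general pattern of issuing many requests concurrently with asyncio can be sketched as follows. The fake_fetch coroutine is a hypothetical stand-in that sleeps instead of calling the API.

```python
import asyncio


async def fake_fetch(actor_id, delay=0.05):
    """Stand-in for one HTTP request; sleeps instead of calling the API."""
    await asyncio.sleep(delay)
    return {"actor_id": actor_id, "rows": 0}


async def fetch_all(actor_ids):
    # gather() runs the coroutines concurrently and preserves input order.
    return await asyncio.gather(*(fake_fetch(a) for a in actor_ids))


results = asyncio.run(fetch_all(["US", "CA", "GB"]))
print([r["actor_id"] for r in results])  # ['US', 'CA', 'GB']
```

Because the waits overlap, the total time is close to one delay rather than the sum of all delays.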
[14]:
df_emissions.sample(5)
[14]:
| | actor_id | year | total_emissions | datasource_id |
---|---|---|---|---|
347 | BW | 1925 | 1330000 | PRIMAP:10.5281/zenodo.7179775:v2.4 |
484 | ID | 1967 | 108000000 | PRIMAP:10.5281/zenodo.7179775:v2.4 |
308 | FR | 1751 | 34100000 | PRIMAP:10.5281/zenodo.7179775:v2.4 |
435 | AW | 1961 | 1310000 | PRIMAP:10.5281/zenodo.7179775:v2.4 |
431 | HR | 1925 | 2540000 | PRIMAP:10.5281/zenodo.7179775:v2.4 |
Calculate cumulative emissions
Let’s first make sure all the datasets have the same starting year.
[15]:
df_emissions.groupby('actor_id')['year'].min().nunique() == 1
[15]:
True
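When checking start years, note that all() applied to a list of years only tests that each year is non-zero; it does not compare them. A small sketch with placeholder data (the actors "AA" and "BB" are made up) contrasting that with counting the distinct first years:

```python
import pandas as pd

# Illustrative placeholder values: "BB" starts one year later than "AA".
df_demo = pd.DataFrame({
    "actor_id": ["AA", "AA", "BB", "BB"],
    "year": [1750, 1751, 1751, 1752],
    "total_emissions": [10, 11, 20, 21],
})

first_years = df_demo.groupby("actor_id")["year"].min()
same_start = first_years.nunique() == 1

# all() only tests truthiness, so any non-zero years pass.
truthy = all(first_years)

print(same_start, truthy)  # False True
```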
Now we can calculate cumulative emissions.
[16]:
df_out = df_emissions.assign(cumulative_emissions = df_emissions.groupby('actor_id')['total_emissions'].cumsum())
Now we have a column for cumulative emissions.
[17]:
df_out.head()
[17]:
| | actor_id | year | total_emissions | datasource_id | cumulative_emissions |
---|---|---|---|---|---|
32 | AD | 1750 | 3740 | PRIMAP:10.5281/zenodo.7179775:v2.4 | 3740 |
33 | AD | 1751 | 3750 | PRIMAP:10.5281/zenodo.7179775:v2.4 | 7490 |
34 | AD | 1752 | 3760 | PRIMAP:10.5281/zenodo.7179775:v2.4 | 11250 |
35 | AD | 1753 | 3770 | PRIMAP:10.5281/zenodo.7179775:v2.4 | 15020 |
36 | AD | 1754 | 3780 | PRIMAP:10.5281/zenodo.7179775:v2.4 | 18800 |
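A grouped cumulative sum follows row order within each group, so the result is only a running total over time if the rows are sorted by year. A minimal sketch, assuming unsorted input; the "AD" values are taken from the table above and "XX" is a made-up actor:

```python
import pandas as pd

# "AD" values are taken from the table above; "XX" is a made-up actor.
df_demo = pd.DataFrame({
    "actor_id": ["AD", "XX", "AD", "XX", "AD"],
    "year": [1751, 1750, 1750, 1751, 1752],
    "total_emissions": [3750, 100, 3740, 200, 3760],
})

# cumsum() follows row order, so sort by year within each actor first.
df_demo = df_demo.sort_values(["actor_id", "year"]).reset_index(drop=True)
df_demo["cumulative_emissions"] = (
    df_demo.groupby("actor_id")["total_emissions"].cumsum()
)
print(df_demo)
```

For "AD" this reproduces the first three cumulative values shown in the table: 3740, 7490 and 11250.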
Rank country by cumulative emissions
Now that we know the cumulative emissions, we can rank the countries by their cumulative emissions in the most recent year.
[18]:
last_year = df_out['year'].max()
df_sorted = (
    df_out.loc[df_out['year'] == last_year, ['actor_id', 'cumulative_emissions', 'year']]
    .sort_values(by='cumulative_emissions', ascending=False)
)
df_sorted['rank'] = df_sorted['cumulative_emissions'].rank(ascending=False)
Here are the top 10 cumulative emitters.
[19]:
pd.merge(df_sorted.loc[df_sorted['rank'] <= 10], df_country[['actor_id', 'name']], on='actor_id')
[19]:
| | actor_id | cumulative_emissions | year | rank | name |
---|---|---|---|---|---|
0 | US | 561240060000 | 2021 | 1.0 | United States of America |
1 | CN | 375048000000 | 2021 | 2.0 | China |
2 | RU | 179731600000 | 2021 | 3.0 | Russian Federation |
3 | IN | 132717000000 | 2021 | 4.0 | India |
4 | DE | 117760000000 | 2021 | 5.0 | Germany |
5 | GB | 104375500000 | 2021 | 6.0 | United Kingdom of Great Britain and Northern I... |
6 | JP | 78204570000 | 2021 | 7.0 | Japan |
7 | FR | 64192400000 | 2021 | 8.0 | France |
8 | UA | 52563900000 | 2021 | 9.0 | Ukraine |
9 | BR | 47231630000 | 2021 | 10.0 | Brazil |
The United States and China are the top two cumulative emitters, with the U.S. having emitted about 50% more than China over the period from 1750 to 2021.
[20]:
561240060000 / 375048000000
[20]:
1.4964486145773341
Plot cumulative emissions
Now that we know the top emitters, we can plot a time series.
[21]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)
# top 8 emitters
top_emitters = list(df_sorted.head(8).actor_id)
# wong color palette (https://davidmathlogic.com/colorblind/#%23D81B60-%231E88E5-%23FFC107-%23004D40)
colors = ['#000000', '#E69F00', '#56B4E9', '#009E73', '#F0E442', '#0072B2', '#D55E00', '#CC79A7']
for actor_id, color in zip(top_emitters, cycle(colors)):
    actor_name = df_country.loc[df_country['actor_id'] == actor_id, 'name'].values[0]
    filt = df_out['actor_id'] == actor_id
    df_tmp = df_out.loc[filt]
    # convert tonnes to gigatonnes for plotting
    ax.plot(np.array(df_tmp['year']), np.array(df_tmp['cumulative_emissions']) / 10**9,
            linewidth=4,
            label=actor_name,
            color=color)
ylim = [0, 600]
ax.set_ylim(ylim)
ax.set_xlim([1850, 2022])
# Turn off the display of all ticks.
ax.tick_params(which='both',  # Options for both major and minor ticks
               top=False,     # turn off top ticks
               left=False,    # turn off left ticks
               right=False,   # turn off right ticks
               bottom=False)  # turn off bottom ticks
# Keep x tick labels horizontal
plt.setp(ax.get_xticklabels(), rotation=0)
# Hide all four spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')
# major/minor tick lines
ax.xaxis.set_minor_locator(AutoMinorLocator(5))
ax.grid(axis='y',
which='major',
color=[0.8, 0.8, 0.8], linestyle='-')
ax.set_ylabel("Cumulative Emissions (GtCO$_2$e)", fontsize=12)
ax.legend(loc='upper left', frameon=False)

Great Britain emissions
Here we will explore Great Britain’s emissions and ask whether they are on track to meet their emissions pledges. “On track” is defined as current emissions being below those of a scenario in which emissions decrease uniformly from the baseline to the target.
[1]:
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
import numpy as np
import pandas as pd
from openclimate import Client
[2]:
# create an openclimate Client object
client = Client()
client.jupyter
Get data
We will use the emissions data from UNFCCC to perform this analysis.
[3]:
actor_id = 'GB'
client.emissions_datasets(actor_id)
[3]:
| | actor_id | datasource_id | name | publisher | published | URL |
---|---|---|---|---|---|---|
0 | GB | BP:statistical_review_june2022 | Statistical Review of World Energy all data, 1... | BP | 2022-06-01T00:00:00.000Z | https://www.bp.com/en/global/corporate/energy-... |
1 | GB | EDGARv7.0:ghg | Emissions Database for Global Atmospheric Rese... | JRC | 2022-01-01T00:00:00.000Z | https://edgar.jrc.ec.europa.eu/dataset_ghg70 |
2 | GB | GCB2022:national_fossil_emissions:v1.0 | Data supplement to the Global Carbon Budget 20... | GCP | 2022-11-04T00:00:00.000Z | https://www.icos-cp.eu/science-and-impact/glob... |
3 | GB | PRIMAP:10.5281/zenodo.7179775:v2.4 | PRIMAP-hist_v2.4_no_extrap (scenario=HISTCR) | PRIMAP | 2022-10-17T00:00:00.000Z | https://zenodo.org/record/7179775 |
4 | GB | UNFCCC:GHG_ANNEX1:2019-11-08 | UNFCCC GHG total without LULUCF, ANNEX I count... | UNFCCC | 2019-11-08T00:00:00.000Z | https://di.unfccc.int/time_series |
5 | GB | carbon_monitor:2022_12_14 | Carbon Monitor country CO2 emissions by sector | Carbon Monitor | 2022-12-14T00:00:00.000Z | https://carbonmonitor.org/ |
6 | GB | climateTRACE:country_inventory | climate TRACE: country inventory | climate TRACE | 2022-12-02T00:00:00.000Z | https://climatetrace.org/inventory |
7 | GB | WRI:climate_watch_historical_ghg:2022 | Climate Watch Historical GHG Emissions | WRI | 2022-01-01T00:00:00.000Z | https://www.climatewatchdata.org/ghg-emissions |
8 | GB | openGHGmap:R2021A | European OpenGHGMap | NTNU | 2021-01-01T00:00:00.000Z | https://openghgmap.net/data/ |
9 | GB | BEIS:UK_regional_GHG:2022-06-30 | UK local authority and regional greenhouse gas... | BEIS | 2022-06-30T00:00:00.000Z | https://www.gov.uk/government/statistics/uk-lo... |
10 | GB | IEA:GHG_energy_highlights:2022 | Greenhouse Gas Emissions from Energy Highlights | IEA | 2022-09-01T00:00:00.000Z | https://www.iea.org/data-and-statistics/data-p... |
[4]:
emissions_datasource = 'UNFCCC:GHG_ANNEX1:2019-11-08'
df_gb = client.emissions(actor_id=actor_id, datasource_id=emissions_datasource)
df_ndc = client.targets(actor_id=actor_id)
# convert tonnes to megatonnes
df_gb['total_emissions'] = df_gb['total_emissions'] / 10**6
# filter ndc by target
filt = df_ndc['datasource_id']=='IGES:NDC_db:10.57405/iges-5005'
df_ndc = df_ndc.loc[filt]
The UK has pledged to reduce their emissions by 68% from 1990 levels by 2030.
[5]:
df_ndc
[5]:
actor_id | target_type | baseline_year | baseline_value | target_year | target_value | target_unit | datasource_id | |
---|---|---|---|---|---|---|---|---|
0 | GB | Absolute emission reduction | 1990.0 | None | 2030 | 68 | percent | IGES:NDC_db:10.57405/iges-5005 |
A quick look at the data shows that Great Britain’s emissions have been decreasing for the last thirty years. This raises a couple of questions:
- Is GB “on track” to meet this goal?
- Will GB meet its goal if this long-term trend continues?
[6]:
plt.plot(np.array(df_gb['year'], dtype='float64'), np.array(df_gb['total_emissions']))
plt.xlabel('Year')
plt.ylabel('GB Annual Emissions [MtCO$_2$-eq]')
[6]:
Text(0, 0.5, 'GB Annual Emissions [MtCO$_2$-eq]')

Is Great Britain on track?
To answer this question, we will simply ask whether current emissions are lower than they would be if emissions had decreased uniformly from the baseline to the goal. We will also ask whether GB will meet its goal if the long-term trend continues. Keep in mind that both of these are crude and imperfect metrics. More sophisticated approaches include using integrated assessment models (IAMs) that incorporate proposed actions.
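The first check can be sketched in a few lines. This is a minimal illustration rather than the notebook's own code: the function name `linear_pathway` and the figures (an 800 Mt baseline in 1990 and 450 Mt in 2019) are hypothetical round numbers, not values returned by the API.

```python
def linear_pathway(year, baseline_year, baseline, target_year, target):
    """Emissions on a straight line from (baseline_year, baseline) to (target_year, target)."""
    fraction = (year - baseline_year) / (target_year - baseline_year)
    return baseline + fraction * (target - baseline)

# hypothetical figures in MtCO2e, chosen only for illustration
baseline = 800.0
target = baseline * (1 - 0.68)   # a 68% cut from the baseline
current = 450.0                  # hypothetical emissions in 2019

pathway_now = linear_pathway(2019, 1990, baseline, 2030, target)
on_track = current <= pathway_now
print(f"pathway: {pathway_now:.1f}, current: {current:.1f}, on track: {on_track}")
```

With these made-up inputs the current value sits above the straight-line pathway, so the actor would count as off track under this crude metric.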
[7]:
# ordinary least squares regression via the normal equations
def linear_eq(df, start_year=None, year_var='year', emissions_var='total_emissions'):
    '''simple linear regression of emissions against year'''
    # use every available year when no start year is given
    if start_year is None:
        start_year = df[year_var].min()
    filt = df[year_var] >= start_year
    x = df.loc[filt, year_var].values
    y = df.loc[filt, emissions_var].values
    # sums needed for the least-squares solution
    n = len(x)
    sum_x = np.sum(x)
    sum_y = np.sum(y)
    sum_xy = np.sum(x * y)
    sum_xx = np.sum(x * x)
    mean_x = np.mean(x)
    mean_y = np.mean(y)
    # calculate coefficients (b is the slope, a is the intercept)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x**2)
    a = mean_y - b * mean_x
    # callable that evaluates the regression line
    pred = lambda x: a + b * x
    return {'equation': pred, 'slope': b, 'intercept': a}
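A quick sanity check for the normal equations is to fit points that lie exactly on a known line and confirm that the slope and intercept are recovered. The sketch below restates the same formulas in plain Python so it runs without pandas or the API; `fit_line` is a hypothetical name used only here, and the data are synthetic.

```python
def fit_line(x, y):
    """Least-squares slope and intercept from the normal equations."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = sum_y / n - slope * sum_x / n
    return slope, intercept

# synthetic points on the line y = 1000 - 12 * (year - 1990)
years = list(range(1990, 2000))
values = [1000 - 12 * (yr - 1990) for yr in years]

slope, intercept = fit_line(years, values)
print(slope, intercept)
```

Because the points are exactly collinear, the fitted line should pass through every one of them; real emissions data will of course scatter around the line.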
[8]:
baseline_year = df_ndc['baseline_year'].values[0]
current_year = df_gb['year'].max()
net_zero_year = 2050
target_year = df_ndc['target_year'].values[0]
# take the single element explicitly to avoid the pandas FutureWarning
target_value = int(df_ndc['target_value'].iloc[0])
target_percent = float(df_ndc['target_value'].iloc[0])/100
pred = linear_eq(df_gb, start_year=baseline_year)
X_pred = np.arange(baseline_year, target_year + 1)
Y_pred = pred['equation'](X_pred)
# get baseline and target emissions
filt = df_gb['year'] == baseline_year
baseline_emissions = df_gb.loc[filt,'total_emissions']
target_emissions = df_gb.loc[filt,'total_emissions'] * (100 - target_value)/100
net_zero_emissions = 0
# current emissions
filt = df_gb['year'] == current_year
current_emissions = df_gb.loc[filt,'total_emissions']
# average annual reduction needed to achieve goal:
avg_rate = round(((baseline_emissions - target_emissions) / (target_year - baseline_year + 1)).values[0])
nz_rate = round(((baseline_emissions - net_zero_emissions) / (net_zero_year - baseline_year + 1)).values[0])
# year in which the linear trend reaches the target emissions
year_target_achieved = round((target_emissions - pred['intercept']) / pred['slope'])
print(f"To achieve goal, average rate of reduction needs to be {abs(avg_rate):.0f} MT/yr")
print(f"To achieve net-zero goal, average rate of reduction needs to be {abs(nz_rate):.0f} MT/yr")
print(f"GB reducing emissions by about {abs(pred['slope']):.0f} MT/yr")
print(f'Target emissions of {int(target_emissions.iloc[0])} MT/yr will be achieved around {int(year_target_achieved.iloc[0])}')
To achieve goal, average rate of reduction needs to be 13 MT/yr
To achieve net-zero goal, average rate of reduction needs to be 13 MT/yr
GB reducing emissions by about 12 MT/yr
Target emissions of 255 MT/yr will be achieved around 2037
From this quick analysis, we can see that GB may be slightly off track to meet its goals.
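The year in which the trend line reaches the target follows from solving `intercept + slope * year = target` for `year`. A minimal sketch, using the rounded slope (about -12 MT/yr) and target (255 MT) printed above; the intercept is a hypothetical value chosen to be consistent with those rounded figures, since the fitted intercept is not shown.

```python
def crossing_year(slope, intercept, target):
    """Year at which the line intercept + slope * year equals target."""
    return (target - intercept) / slope

slope = -12.0        # rounded trend from the printed output, MT/yr
intercept = 24700.0  # hypothetical, not the fitted value
target = 255.0       # rounded target emissions from the printed output, MT

year = crossing_year(slope, intercept, target)
print(round(year))
```

A crossing year later than the pledge's target year of 2030 is what "slightly off track" means under this linear-trend assumption.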
Plot of emissions and pledges
[9]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)
ax.plot(np.array(df_gb['year']), np.array(df_gb['total_emissions']),
        linewidth=4,
        label='Great Britain annual emissions',
        color=[0.0,0.0,0.0])
ax.plot(X_pred, Y_pred, '--',
linewidth=2,
color=[0.6,0.6,0.6],
label='Linear trend')
# take the single element explicitly to avoid the pandas FutureWarning
ax.plot(np.array([baseline_year, float(target_year)]),
        np.array([float(baseline_emissions.iloc[0]), float(target_emissions.iloc[0])]),
        '-',
        linewidth=2,
        color=[0.6,0.6,0.6],
        label='linear decrease')
ax.plot([df_ndc['baseline_year'], df_ndc['target_year']],
[target_emissions, target_emissions],
'-.',
linewidth=1,
color=[0.5,0.5,0.5],
label='target emissions')
ylim = [200, 850]
ax.set_ylim(ylim)
ax.set_xlim([1990, 2035])
# Turn off the display of all ticks.
ax.tick_params(which='both', # Options for both major and minor ticks
               top=False, # turn off top ticks
               left=False, # turn off left ticks
               right=False, # turn off right ticks
               bottom=False) # turn off bottom ticks
# Remove x tick marks
plt.setp(ax.get_xticklabels(), rotation=0)
# Hide the right and top spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')
# major/minor tick lines
ax.xaxis.set_minor_locator(AutoMinorLocator(5))
ax.grid(axis='y',
which='major',
color=[0.8, 0.8, 0.8], linestyle='-')
bline_emissions = baseline_emissions.values[0]
ylim_achieved = [(bline_emissions - ylim[0])/ (bline_emissions*target_percent)*100,
(bline_emissions - ylim[1])/ (bline_emissions*target_percent)*100]
ax2 = ax.twinx()
ax2.set_ylim(ylim_achieved)
# Hide the right and top spines
ax2.spines['right'].set_visible(False)
ax2.spines['left'].set_visible(False)
ax2.spines['top'].set_visible(False)
ax2.spines['bottom'].set_visible(False)
# Only show ticks on the left and bottom spines
ax2.yaxis.set_ticks_position('right')
ax2.xaxis.set_ticks_position('bottom')
ax2.yaxis.set_tick_params(size=0)
# Set the y-axis tick labels using a FixedFormatter
vals = ax2.get_yticks()
ax2.yaxis.set_major_locator(plt.FixedLocator(vals))
ax2.set_yticklabels([f"{int(x)}%" for x in vals])
ax2.set_ylabel("Percent achieved", fontsize=12)
ax.set_ylabel("Emissions (MtCO$_2$-eq)", fontsize=12)
ax.legend(loc='upper right', frameon=False)
[9]:
<matplotlib.legend.Legend at 0x7f6bc950bee0>

Now let’s do the same for net zero
[10]:
pred = linear_eq(df_gb, start_year=baseline_year)
X_pred = np.arange(baseline_year, net_zero_year + 1)
Y_pred = pred['equation'](X_pred)
[11]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)
ax.plot(np.array(df_gb['year']), np.array(df_gb['total_emissions']),
        linewidth=4,
        label='Great Britain annual emissions',
        color=[0.0,0.0,0.0])
ax.plot(X_pred, Y_pred, '--',
linewidth=2,
color=[0.6,0.6,0.6],
label='Linear trend')
# take the single element explicitly to avoid the pandas FutureWarning
ax.plot(np.array((baseline_year, float(net_zero_year))),
        np.array((float(baseline_emissions.iloc[0]), net_zero_emissions)),
        '-',
        linewidth=2,
        color=[0.6,0.6,0.6],
        label='linear decrease')
ylim = [0, 850]
ax.set_ylim(ylim)
ax.set_xlim([1990, 2055])
# Turn off the display of all ticks.
ax.tick_params(which='both', # Options for both major and minor ticks
               top=False, # turn off top ticks
               left=False, # turn off left ticks
               right=False, # turn off right ticks
               bottom=False) # turn off bottom ticks
# Remove x tick marks
plt.setp(ax.get_xticklabels(), rotation=0)
# Hide the right and top spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')
# major/minor tick lines
ax.xaxis.set_minor_locator(AutoMinorLocator(5))
ax.grid(axis='y',
which='major',
color=[0.8, 0.8, 0.8], linestyle='-')
bline_emissions = baseline_emissions.values[0]
ylim_achieved = [(bline_emissions - ylim[0])/ (bline_emissions*target_percent)*100,
(bline_emissions - ylim[1])/ (bline_emissions*target_percent)*100]
ax.set_ylabel("Emissions (MtCO$_2$e)", fontsize=12)
ax.legend(loc='upper right', frameon=False)
[11]:
<matplotlib.legend.Legend at 0x7f6bc949b0a0>

In general, the emissions have been trending in the right direction over the past 30 years, but emissions will have to decline at an increasing rate in order to achieve these goals.
Canada Emissions Breakdown
In this notebook we will explore a territorial breakdown of Canadian emissions into provinces using data submitted to UNFCCC.
[1]:
from itertools import cycle
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
import pandas as pd
[2]:
from openclimate import Client
client = Client()
client.jupyter
Let’s display all of the datasets available for Canada.
[3]:
client.emissions_datasets('CA')
[3]:
actor_id | datasource_id | name | publisher | published | URL | |
---|---|---|---|---|---|---|
0 | CA | BP:statistical_review_june2022 | Statistical Review of World Energy all data, 1... | BP | 2022-06-01T00:00:00.000Z | https://www.bp.com/en/global/corporate/energy-... |
1 | CA | EDGARv7.0:ghg | Emissions Database for Global Atmospheric Rese... | JRC | 2022-01-01T00:00:00.000Z | https://edgar.jrc.ec.europa.eu/dataset_ghg70 |
2 | CA | GCB2022:national_fossil_emissions:v1.0 | Data supplement to the Global Carbon Budget 20... | GCP | 2022-11-04T00:00:00.000Z | https://www.icos-cp.eu/science-and-impact/glob... |
3 | CA | PRIMAP:10.5281/zenodo.7179775:v2.4 | PRIMAP-hist_v2.4_no_extrap (scenario=HISTCR) | PRIMAP | 2022-10-17T00:00:00.000Z | https://zenodo.org/record/7179775 |
4 | CA | UNFCCC:GHG_ANNEX1:2019-11-08 | UNFCCC GHG total without LULUCF, ANNEX I count... | UNFCCC | 2019-11-08T00:00:00.000Z | https://di.unfccc.int/time_series |
5 | CA | climateTRACE:country_inventory | climate TRACE: country inventory | climate TRACE | 2022-12-02T00:00:00.000Z | https://climatetrace.org/inventory |
6 | CA | WRI:climate_watch_historical_ghg:2022 | Climate Watch Historical GHG Emissions | WRI | 2022-01-01T00:00:00.000Z | https://www.climatewatchdata.org/ghg-emissions |
7 | CA | IEA:GHG_energy_highlights:2022 | Greenhouse Gas Emissions from Energy Highlights | IEA | 2022-09-01T00:00:00.000Z | https://www.iea.org/data-and-statistics/data-p... |
Now let’s gather emissions data for Canada from UNFCCC as well as emissions for each province from ECCC. Finally, let’s gather population data for Canada and its provinces from 2017 (this is the most recent year we have population data for all provinces).
[4]:
# canadian emissions
df_ca = client.emissions('CA', 'UNFCCC:GHG_ANNEX1:2019-11-08')
# canadian province names
df_parts = client.parts('CA', part_type='adm1')[['actor_id', 'name']]
# province emissions
df_prov = client.emissions(df_parts.actor_id, 'ECCC:GHG_inventory:2022-04-13')
# canadian population in 2017
df_ca_pop = (
client.population('CA')
.loc[lambda x: x['year'] == 2017, ['actor_id', 'year', 'population']]
)
# province population in 2017
df_prov_pop = (
client.population(df_parts.actor_id)
.loc[lambda x: x['year'] == 2017, ['actor_id', 'year', 'population']]
)
Now let’s select Canadian emissions and provincial emissions for 2020. This is not the same year as the population data, but let’s assume population hasn’t changed significantly in three years. We are also going to convert to megatonnes of CO2 equivalents by dividing the emissions by a million.
[5]:
# national emissions in MTCO2e
national = df_ca.loc[df_ca['year'] == 2020, 'total_emissions'].values / 10**6
# province emissions and cumulative emissions in MTCO2e
df_out = (
    df_prov
    .loc[df_prov['year'] == 2020, ['total_emissions', 'actor_id']]
    # assign() always returns a new DataFrame, so no inplace argument is needed
    .assign(total_emissions=lambda x: x['total_emissions'].div(10**6))
    .assign(percent_of_national=lambda x: (x['total_emissions'] / national) * 100)
    .sort_values(by='percent_of_national', ascending=False)
    .assign(cumulative=lambda x: x['percent_of_national'].cumsum())
    .merge(df_parts, on='actor_id')
    .merge(df_prov_pop, on='actor_id')
    .assign(percent_of_population=lambda x: (x['population'] / x['population'].sum()) * 100)
    .rename(columns={'total_emissions': 'total_emissions_[MTCO2e]'})
    .loc[:, ['actor_id', 'name', 'total_emissions_[MTCO2e]', 'percent_of_national', 'cumulative', 'population', 'percent_of_population']]
)
Now let’s display a table showing the province, emissions breakdown, and population.
[6]:
df_out
[6]:
actor_id | name | total_emissions_[MTCO2e] | percent_of_national | cumulative | population | percent_of_population | |
---|---|---|---|---|---|---|---|
0 | CA-AB | Alberta | 256.459542 | 38.143528 | 38.143528 | 4306039 | 11.674212 |
1 | CA-ON | Ontario | 149.584918 | 22.247940 | 60.391467 | 14279196 | 38.712694 |
2 | CA-QC | Quebec | 76.241175 | 11.339439 | 71.730907 | 8425996 | 22.843933 |
3 | CA-SK | Saskatchewan | 65.894159 | 9.800515 | 81.531422 | 1168057 | 3.166749 |
4 | CA-BC | British Columbia | 61.746788 | 9.183672 | 90.715094 | 4841078 | 13.124770 |
5 | CA-MB | Manitoba | 21.674064 | 3.223609 | 93.938703 | 1343371 | 3.642047 |
6 | CA-NS | Nova Scotia | 14.596446 | 2.170946 | 96.109649 | 957600 | 2.596174 |
7 | CA-NB | New Brunswick | 12.440907 | 1.850351 | 97.960000 | 760868 | 2.062809 |
8 | CA-NL | Newfoundland and Labrador | 9.500844 | 1.413072 | 99.373072 | 528430 | 1.432640 |
9 | CA-PE | Prince Edward Island | 1.609972 | 0.239453 | 99.612525 | 152784 | 0.414217 |
10 | CA-NT | Northwest Territories | 1.401465 | 0.208442 | 99.820966 | 44718 | 0.121236 |
11 | CA-NU | Nunavut | 0.602920 | 0.089673 | 99.910639 | 38243 | 0.103682 |
12 | CA-YT | Yukon | 0.600610 | 0.089329 | 99.999969 | 38669 | 0.104837 |
This shows that 5 provinces (Alberta, Ontario, Quebec, Saskatchewan, and British Columbia) made up over 90% of Canada’s overall emissions in 2020. Alberta, while only accounting for about 12% of Canada’s population, contributes 38% of the nation’s emissions. This is largely driven by the oil/gas and agriculture sectors.
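The cumulative column is a running sum of each province's share. As a small check that uses only the rounded values from the table above (not a fresh API call), the sketch below rebuilds the cumulative share of the five largest emitters and compares Alberta's emissions share with its population share.

```python
from itertools import accumulate

# percent_of_national for the five largest emitters, copied from the table above
shares = {
    'CA-AB': 38.143528,
    'CA-ON': 22.247940,
    'CA-QC': 11.339439,
    'CA-SK': 9.800515,
    'CA-BC': 9.183672,
}
cumulative = list(accumulate(shares.values()))
top_five_share = cumulative[-1]

# Alberta: share of emissions relative to share of population (11.674212%)
alberta_ratio = shares['CA-AB'] / 11.674212
print(f"top five: {top_five_share:.1f}%, Alberta ratio: {alberta_ratio:.1f}")
```

A ratio well above 1 means a province emits more than its population share alone would suggest.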
Finally, let’s make a first attempt at visualizing this breakdown. This bar graph is clunky and could be improved.
[7]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)
previous = 0
for iterator, row in df_out.iterrows():
    emissions = row['percent_of_national']
    cumulative = row['cumulative']
    actor_id = row['actor_id']
    ax.bar(1, emissions, bottom=previous, label=actor_id)
    previous = cumulative
    ax.text(1.5, previous - (emissions/2),
            actor_id,
            fontsize=12,
            color='k')
# Turn off the display of all ticks.
ax.tick_params(which='both', # Options for both major and minor ticks
               top=False, # turn off top ticks
               left=False, # turn off left ticks
               right=False, # turn off right ticks
               bottom=False) # turn off bottom ticks
# Remove x tick marks
plt.setp(ax.get_xticklabels(), rotation=0)
# Hide the right and top spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')
# grid and tick marks
# a plain range is used here because numpy is not imported in this notebook
ax.set_yticks(list(range(10, 110, 10)))
ax.grid(axis='y',
        which='major',
        color=[0.8, 0.8, 0.8], linestyle='-')
ax.set_axisbelow(True)
ax.set_xticks([])
ax.set_title("Territorial Breakdown of Canada's 2020 emissions")
ax.set_ylabel("% of national emissions")

Canada Target Gap
Are provincial pledges adequate to achieve Canada’s NDC goal?
Each party to the Paris Agreement creates a nationally determined contribution (NDC) or intended nationally determined contribution (INDC). These non-binding national plans highlight climate change mitigation, including climate-related targets for greenhouse gas emission reductions. Non-state actors, on the other hand, are not formally recognized in the Paris Agreement’s global stocktake. Actions at the subnational level are integral to the success of the Paris Agreement. Some non-state actors create climate plans with pledged emission targets. For instance, 11 of the 13 Canadian provinces/territories have pledged emissions targets.
This notebook will explore whether the provincial pledges are enough to meet Canada’s NDC goal. We will use nationally reported data from the UNFCCC and provincial data from ECCC, as well as pledged targets. We find that the provincial pledges are not adequate to achieve Canada’s NDC goal. Assuming the emissions from provinces without targets remain constant at pre-pandemic 2019 levels, Canada will be about 167 MtCO2e shy of its NDC goal. As outlined in the AR6 summary for policymakers (SPM), feasible, effective, and low-cost options for mitigation and adaptation are already available.
[1]:
from itertools import cycle
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
import numpy as np
import openclimate as oc
import pandas as pd
[2]:
def get_emissions(part, data_id=None):
    data_id = 'ECCC:GHG_inventory:2022-04-13' if data_id is None else data_id
    try:
        return client.emissions(actor_id=part, datasource_id=data_id)
    except Exception:
        # return None when the request fails or no data is available
        return None

def get_target(part, year, data_id=None):
    data_id = 'C2ES:canadian_GHG_targets' if data_id is None else data_id
    try:
        part_targets = (
            client.targets(actor_id=part, ignore_warnings=True)
            .loc[lambda x: x['target_type'] == 'Absolute emission reduction',
                 ['actor_id', 'baseline_year', 'target_year', 'target_value', 'target_unit', 'datasource_id']]
        )
        # filtered by data source; note that the selection below uses part_targets
        part_target = part_targets.loc[part_targets['datasource_id'] == data_id]
        # earliest target year at or after the requested year
        closest_target = part_targets['target_year'][part_targets['target_year'] >= year].min()
        cols_out = ['actor_id', 'baseline_year', 'target_year', 'target_value', 'target_unit']
        target = part_targets.loc[part_targets['target_year'] == closest_target, cols_out]
        return target
    except Exception:
        return None

def least_squares_regression(x, y):
    # Calculate the slope and intercept using normal equations
    X = np.vstack([x, np.ones(len(x))]).T
    theta = np.linalg.inv(X.T @ X) @ X.T @ y
    slope, intercept = theta[0], theta[1]
    predict = lambda x: slope * x + intercept
    return {"slope": slope, "intercept": intercept, "equation": predict}
[3]:
# Initialize the OpenClimate client
client = oc.Client()
client.jupyter
Get country emissions and targets
[4]:
iso2 = 'CA'
data_id = 'UNFCCC:GHG_ANNEX1:2019-11-08'
tonnes_to_megatonnes = 1 / 10**6
actor_parts = client.parts(actor_id = iso2, part_type = 'adm1')
df_nat = client.emissions(actor_id = iso2, datasource_id=data_id)
nat_targets = (
client.targets(actor_id = iso2)
.loc[lambda x: x['target_type'] == 'Absolute emission reduction',
['actor_id', 'baseline_year', 'target_year', 'target_value', 'target_unit']]
)
df_target = nat_targets.drop_duplicates().reset_index().iloc[-1]
baseline_year = int(df_target['baseline_year'])
baseline_emissions = float(df_nat.loc[df_nat['year'] == baseline_year, 'total_emissions'].iloc[0]) * tonnes_to_megatonnes
target_year = int(df_target['target_year'])
percent = int(df_target['target_value'])
percent_decimal = percent / 100
emissions_cut = baseline_emissions * percent_decimal
target_emissions = baseline_emissions - emissions_cut
data = {
'actor_id': iso2,
'baseline_year': baseline_year,
'baseline_emissions': baseline_emissions,
'target_year': target_year,
'target_emissions':target_emissions,
'emissions_reduction': emissions_cut,
'target_percent': percent
}
national = pd.DataFrame(data, index=[0])
[5]:
national
[5]:
actor_id | baseline_year | baseline_emissions | target_year | target_emissions | emissions_reduction | target_percent | |
---|---|---|---|---|---|---|---|
0 | CA | 2005 | 741.182843 | 2030 | 407.650564 | 333.532279 | 45 |
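The target in the table follows directly from the baseline and the pledged percentage. A short worked check, using only the values shown above:

```python
baseline_emissions = 741.182843   # MtCO2e in 2005, from the table above
target_percent = 45               # pledged reduction relative to 2005

emissions_cut = baseline_emissions * target_percent / 100
target_emissions = baseline_emissions - emissions_cut
print(f"cut: {emissions_cut:.2f} MtCO2e, target: {target_emissions:.2f} MtCO2e")
```

The two results should agree with the `emissions_reduction` and `target_emissions` columns to within rounding.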
Get province emissions and targets
[6]:
data_raw = []
data_scaled = []
tonnes_to_megatonnes = 1 / 10**6
for part in actor_parts['actor_id']:
    data_id = 'ECCC:GHG_inventory:2022-04-13'
    year = 2030
    df_part = get_emissions(part, data_id)
    try:
        df_target = get_target(part, year).drop_duplicates().reset_index().iloc[-1]
        baseline_year = int(df_target['baseline_year'])
        baseline_emissions = float(df_part.loc[df_part['year'] == baseline_year, 'total_emissions'].iloc[0]) * tonnes_to_megatonnes
        target_year = int(df_target['target_year'])
        percent = int(df_target['target_value'])
        percent_decimal = percent / 100
        emissions_cut = baseline_emissions * percent_decimal
        n_years = target_year - baseline_year
        emissions_cut_per_year = emissions_cut / n_years
        target_emissions = baseline_emissions - emissions_cut
        data_raw.append(
            {
                'actor_id': part,
                'baseline_year': baseline_year,
                'baseline_emissions': baseline_emissions,
                'target_year': target_year,
                'target_emissions': target_emissions,
                'emissions_reduction': emissions_cut,
                'avg_reduction_per_year': emissions_cut_per_year,
                'percent': percent
            }
        )
        # targets later than 2030 are scaled back to 2030 along a straight line
        if target_year > year:
            x = [baseline_year, target_year]
            y = [baseline_emissions, target_emissions]
            lsr_dict = least_squares_regression(x, y)
            lsr = lsr_dict['equation']
            target_year = year
            target_emissions = lsr(target_year)
            emissions_cut = baseline_emissions * percent_decimal
            emissions_cut_per_year = emissions_cut / (target_year - baseline_year)
        data_scaled.append(
            {
                'actor_id': part,
                'baseline_year': baseline_year,
                'baseline_emissions': baseline_emissions,
                'normalized_target_year': target_year,
                'target_emissions': target_emissions,
                'emissions_reduction': emissions_cut,
                'avg_reduction_per_year': emissions_cut_per_year,
                'percent_reduction': percent
            }
        )
    except Exception:
        # skip provinces without usable emissions or target data
        continue
df_part_targets = pd.DataFrame(data_raw)
df_part_targets_scaled = pd.DataFrame(data_scaled)
Each province has targets with different baseline years, percent reduction, and target years
[7]:
df_part_targets
[7]:
actor_id | baseline_year | baseline_emissions | target_year | target_emissions | emissions_reduction | avg_reduction_per_year | percent | |
---|---|---|---|---|---|---|---|---|
0 | CA-AB | 2005 | 237.093201 | 2050 | 203.900153 | 33.193048 | 0.737623 | 14 |
1 | CA-BC | 2007 | 62.658881 | 2030 | 37.595329 | 25.063552 | 1.089720 | 40 |
2 | CA-MB | 2005 | 20.530551 | 2030 | 13.755469 | 6.775082 | 0.271003 | 33 |
3 | CA-NB | 2005 | 19.781112 | 2030 | 10.681800 | 9.099312 | 0.363972 | 46 |
4 | CA-NL | 2001 | 9.899129 | 2050 | 2.474782 | 7.424347 | 0.151517 | 75 |
5 | CA-NS | 2005 | 22.963779 | 2030 | 10.792976 | 12.170803 | 0.486832 | 53 |
6 | CA-NT | 2005 | 1.725190 | 2030 | 0.862595 | 0.862595 | 0.034504 | 50 |
7 | CA-ON | 2005 | 204.370140 | 2030 | 143.059098 | 61.311042 | 2.452442 | 30 |
8 | CA-PE | 2005 | 1.899135 | 2030 | 1.329395 | 0.569740 | 0.022790 | 30 |
9 | CA-QC | 1990 | 84.508702 | 2030 | 53.240482 | 31.268220 | 0.781705 | 37 |
10 | CA-YT | 2010 | 0.647988 | 2030 | 0.453592 | 0.194396 | 0.009720 | 30 |
To compare these targets against the national goal on an equal footing, we scale all the pledges to 2030 (the target year at the national level), assuming a linear rate of reduction.
[8]:
df_part_targets_scaled
[8]:
actor_id | baseline_year | baseline_emissions | normalized_target_year | target_emissions | emissions_reduction | avg_reduction_per_year | percent_reduction | |
---|---|---|---|---|---|---|---|---|
0 | CA-AB | 2005 | 237.093201 | 2030 | 218.652619 | 33.193048 | 1.327722 | 14 |
1 | CA-BC | 2007 | 62.658881 | 2030 | 37.595329 | 25.063552 | 1.089720 | 40 |
2 | CA-MB | 2005 | 20.530551 | 2030 | 13.755469 | 6.775082 | 0.271003 | 33 |
3 | CA-NB | 2005 | 19.781112 | 2030 | 10.681800 | 9.099312 | 0.363972 | 46 |
4 | CA-NL | 2001 | 9.899129 | 2030 | 5.505128 | 7.424347 | 0.256012 | 75 |
5 | CA-NS | 2005 | 22.963779 | 2030 | 10.792976 | 12.170803 | 0.486832 | 53 |
6 | CA-NT | 2005 | 1.725190 | 2030 | 0.862595 | 0.862595 | 0.034504 | 50 |
7 | CA-ON | 2005 | 204.370140 | 2030 | 143.059098 | 61.311042 | 2.452442 | 30 |
8 | CA-PE | 2005 | 1.899135 | 2030 | 1.329395 | 0.569740 | 0.022790 | 30 |
9 | CA-QC | 1990 | 84.508702 | 2030 | 53.240482 | 31.268220 | 0.781705 | 37 |
10 | CA-YT | 2010 | 0.647988 | 2030 | 0.453592 | 0.194396 | 0.009720 | 30 |
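For the two provinces whose targets fall in 2050, the scaled 2030 value is a linear interpolation between the baseline and the pledged target. The sketch below reproduces the Alberta and Newfoundland and Labrador rows using values copied from the unscaled table; `interpolate_target` is a hypothetical helper, not part of the client.

```python
def interpolate_target(baseline_year, baseline, target_year, target, year=2030):
    """Emissions in `year` on a straight line between baseline and target."""
    rate = (baseline - target) / (target_year - baseline_year)  # reduction per year
    return baseline - rate * (year - baseline_year)

# values (MtCO2e) copied from the unscaled table
alberta_2030 = interpolate_target(2005, 237.093201, 2050, 203.900153)
newfoundland_2030 = interpolate_target(2001, 9.899129, 2050, 2.474782)
print(f"{alberta_2030:.3f} {newfoundland_2030:.3f}")
```

Provinces whose target year is already 2030 are unchanged by this step, which is why their rows are identical in both tables.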
Calculate target gap
If the provinces are on track to meet Canada’s NDC goal, then the sum of the provincial emissions in the target year (\(E_{prov}\)) will equal the national emissions in the target year (\(E_{nat}\)). If the provincial pledges fall short of or overshoot the national goal, there will be an emissions gap (\(E_{gap}\)). A positive gap means the provincial pledges are not enough to meet the goal; a negative gap means the provinces have overachieved it.
\(E_{nat} + E_{gap}= \sum_{prov=1}^N E_{prov}\)
In this section of the notebook, we will calculate this gap as follows:
\(E_{gap} = \big(\sum_{prov=1}^N E_{prov}\big) - E_{nat}\)
This only takes into account provinces with targets. Two provinces, Saskatchewan and Nunavut, do not have targets. Because their future emissions trajectory is uncertain, we will assume their emissions remain at pre-pandemic 2019 levels. The revised gap takes into account emissions from Saskatchewan (\(E_{2019,SK}\)) and Nunavut (\(E_{2019,NU}\)):
\(E_{gap} = \big(\sum_{prov=1}^N E_{prov}\big) + E_{2019,SK} + E_{2019,NU} - E_{nat}\)
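As a worked instance of the first gap formula, the sketch below sums the scaled 2030 provincial targets from the table above and subtracts the national target. It only restates numbers already shown; Saskatchewan and Nunavut are left out because their 2019 emissions are not listed in this section.

```python
# scaled 2030 targets in MtCO2e, copied from the scaled table
provincial_targets = {
    'CA-AB': 218.652619, 'CA-BC': 37.595329, 'CA-MB': 13.755469,
    'CA-NB': 10.681800, 'CA-NL': 5.505128, 'CA-NS': 10.792976,
    'CA-NT': 0.862595, 'CA-ON': 143.059098, 'CA-PE': 1.329395,
    'CA-QC': 53.240482, 'CA-YT': 0.453592,
}
national_target = 407.650564  # MtCO2e, Canada's 2030 target

gap = sum(provincial_targets.values()) - national_target
print(f"gap: {gap:.0f} MtCO2e")
```

A positive result means the pledged provincial emissions in 2030 still exceed the national target.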
[9]:
sum_subat_target = df_part_targets_scaled['target_emissions'].sum()
# take the single element explicitly instead of converting a one-element array
national_target = float(national['target_emissions'].iloc[0])
gap = sum_subat_target - national_target
print(f'''
If each province meets their goal (ignoring emissions from Saskatchewan and Nunavut),
there will still be an {round(gap)} MtCO2e gap in the target.
''')
If each province meets their goal (ignoring emissions from Saskatchewan and Nunavut),
there will still be an 88 MtCO2e gap in the target.
[10]:
missing_actors = list(set(actor_parts['actor_id']) - set(df_part_targets['actor_id']))
data_id = 'ECCC:GHG_inventory:2022-04-13'
df_missing = get_emissions(missing_actors, data_id)
df_missing = df_missing.assign(total_emissions = df_missing['total_emissions'] / 10**6)
missing_emissions = df_missing.loc[(df_missing['actor_id'].isin(missing_actors)) & (df_missing['year'] == 2019), 'total_emissions'].sum()
gap_revised = (sum_subat_target + missing_emissions) - national_target
print(f'''
If we assume Saskatchewan and Nunavut emissions remain constant at pre-pandemic levels,
then the emissions gap increases to {round(gap_revised)} MtCO2e.
''')
If we assume Saskatchewan and Nunavut emissions remain constant at pre-pandemic levels,
then the emissions gap increases to 167 MtCO2e.
[11]:
print(f'''
This emissions gap is similar to the total reduction from all the pledges ({round(df_part_targets_scaled['emissions_reduction'].sum())} MtCO2e),
meaning provincial commitments need to roughly double to meet the national goal.
''')
This emissions gap is similar to the total reduction from all the pledges (188 MtCO2e),
meaning provincial commitments need to roughly double to meet the national goal.
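The "roughly double" claim can be checked with the rounded figures printed above. The sketch assumes, as a simplification, that the whole remaining gap would be closed by additional provincial reductions.

```python
pledged_reduction = 188  # MtCO2e, sum of provincial pledges (printed above)
remaining_gap = 167      # MtCO2e, revised gap including Saskatchewan and Nunavut

required_reduction = pledged_reduction + remaining_gap
scale_factor = required_reduction / pledged_reduction
print(f"pledges would need to grow by a factor of about {scale_factor:.1f}")
```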
Create figure
[12]:
df_tmp = (
df_missing
.loc[(df_missing['actor_id'].isin(missing_actors)) & (df_missing['year'] == 2019), ['actor_id','total_emissions']]
.rename(columns={'total_emissions':'target_emissions'})
)
df_fin = pd.concat([df_part_targets_scaled[['actor_id', 'target_emissions']], df_tmp]).reset_index(drop=True)
df_fin = df_fin.sort_values(by='target_emissions', ascending=False)
df_fin['cumulative'] = df_fin['target_emissions'].cumsum()
[13]:
fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)
ax.bar(0, national['target_emissions'], bottom=0, label='CA')
previous = 0
for iterator, row in df_fin.iterrows():
    emissions = row['target_emissions']
    cumulative = row['cumulative']
    actor_id = row['actor_id']
    ax.bar(1, emissions, bottom=previous, label=actor_id)
    previous = cumulative
    ax.text(1.5, previous - (emissions/2),
            actor_id,
            fontsize=12,
            color='k')
# Turn off the display of all ticks.
ax.tick_params(which='both', # Options for both major and minor ticks
               top=False, # turn off top ticks
               left=False, # turn off left ticks
               right=False, # turn off right ticks
               bottom=False) # turn off bottom ticks
# Remove x tick marks
plt.setp(ax.get_xticklabels(), rotation=0)
# Hide the right and top spines
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
# Only show ticks on the left and bottom spines
ax.yaxis.set_ticks_position('left')
ax.xaxis.set_ticks_position('bottom')
# grid and tick marks
ax.set_yticks(np.arange(0, 700, 100))
ax.grid(axis='y',
which='major',
color=[0.8, 0.8, 0.8], linestyle='-')
ax.set_axisbelow(True)
ax.set_xticks([0, 1])
ax.set_xticklabels(['National', 'Provinces'])
ax.set_title("2030 Emission targets")
ax.set_ylabel("Emissions (MTCO$_2$-eq)")
[13]:
Text(0, 0.5, 'Emissions (MTCO$_2$-eq)')

Contribution Guide
Contributions are highly welcomed and appreciated. Every little bit of help counts, so do not hesitate! You can make a high impact on openclimate just by using it and reporting issues.
The following sections cover some general guidelines regarding development in openclimate for maintainers and contributors.
Nothing here is set in stone; everything can be changed. Feel free to suggest improvements or changes in the workflow.
Feature requests and feedback
We are eager to hear about your requests for new features and any suggestions about the API, infrastructure, and so on. Feel free to submit these as issues with the label “feature request.”
Please make sure to explain in detail how the feature should work and keep the scope as narrow as possible. This will make it easier to implement in small PRs.
Report bugs
Report bugs for openclimate in the issue tracker with the label “bug”.
If you can write a demonstration test that currently fails but should pass, that is a very useful commit to make as well, even if you cannot fix the bug itself.
Fix bugs
Look through the GitHub issues for bugs.
Talk to developers to find out how you can fix specific bugs.
Preparing Pull Requests
Fork the OpenClimate-pyclient GitHub repository. It’s fine to use OpenClimate-pyclient as your fork repository name because it will live under your username.
Clone your fork locally using git, connect your repository to the upstream (main project), and create a branch:
$ git clone git@github.com:YOUR_GITHUB_USERNAME/OpenClimate-pyclient.git
$ cd OpenClimate-pyclient
$ git remote add upstream git@github.com:Open-Earth-Foundation/OpenClimate-pyclient.git
# now, to fix a bug or add feature create your own branch off "master":
$ git checkout -b your-bugfix-feature-branch-name master
If you need some help with Git, follow this quick start guide: https://git.wiki.kernel.org/index.php/QuickStart
Set up a conda environment with all necessary dependencies:
$ conda env create -f ci/environment-py3.8.yml
Activate your environment:
$ conda activate test_env_openclimate
Install the openclimate package:
$ pip install -e . --no-deps
Before you modify anything, ensure that the setup works by executing all tests:
$ pytest
You want to see an output indicating no failures, like this:
$ ========================== n passed, j warnings in 17.07s ===========================
Finally, submit a pull request through the GitHub website using this data:
head-fork: YOUR_GITHUB_USERNAME/
compare: your-branch-name
base-fork: Open-Earth-Foundation/OpenClimate-pyclient
base: master
The merged pull request will undergo the same testing that your local branch had to pass when pushing.
Code Contributors
Luke Gloege - Climate Data Engineer, Open Earth Foundation. ORCID: [0000-0001-9062-6960]