# World Of Waste

The amount of waste produced every year across the globe is frightening. But there’s a glimmer of hope on the horizon.

Author: Christian Gebhard

Published: February 7, 2021

## Introduction

Waste is an international matter: everyone produces it. Each and every country has to deal with it, and “dealing” means at least two things: the actual management of the waste and coping with the costs and consequences of waste production.

(Preview image credits: Offiikart, CC0, via Wikimedia Commons)

“Wastes” are substances or objects which are disposed of or are intended to be disposed of or are required to be disposed of by the provisions of national law.[1]
— The Basel Convention, 1989

The consequences are numerous and quite severe. The Wikipedia article on waste mentions three “costs” that waste causes:

• environmental costs: ranging from the toxic effects of hazardous waste on the environment and water, through direct or indirect health issues for animals and humans (e.g. toxic gases when waste is burnt, or vermin and rodents facilitating the spread of diseases), to the production of greenhouse gases.
• social costs: waste management is often linked to social injustice, as waste is frequently moved from developed countries to developing regions, often causing even more problems for communities that are already politically, economically, and environmentally challenged.
• economic costs: the correct management of waste is an expensive endeavour. If a community wants to avoid social and environmental damage, proper handling and disposal has to be paid for by someone, be it the producers themselves or society via taxes.

I want to get an understanding of worldwide waste production and analyze whether there are trends in countries or regions.

As always, if you’re not interested in how to get there, you can jump to Conclusions.

### United Nations Data

The public data repository of the United Nations, “UNdata”, offers many datasets, including a dataset on “Total amount of municipal waste collected” for the years 1990 - 2015. It is based on official statistics provided by national offices or environmental ministries in response to the biennial UNSD/UNEP Questionnaire on Environment Statistics, complemented with data from OECD and Eurostat.

Municipal waste, collected by or on behalf of municipalities, by public or private enterprises, includes waste originating from: households, commerce and trade, small businesses, office buildings and institutions (schools, hospitals, government buildings). It also includes bulky waste (e.g., white goods, old furniture, mattresses) and waste from selected municipal services, e.g., waste from park and garden maintenance, waste from street cleaning services (street sweepings, the content of litter containers, market cleansing waste), if managed as waste. The definition excludes waste from municipal sewage network and treatment, municipal construction and demolition waste.
Municipal waste collected refers to waste collected by or on behalf of municipalities, as well as municipal waste collected by the private sector. It includes mixed waste, and fractions collected separately for recovery operations (through door-to-door collection and/or through voluntary deposits). For data sourced from OECD and Eurostat, values correspond to municipal waste generated.[2]
— United Nations Statistics Division, 28.02.2020

The waste data was downloaded as a semicolon-separated text file from http://data.un.org/Data.aspx?d=ENV&f=variableID:1814&c=2,3,4,5&s=countryName:asc,yr:desc&v=4 on January 24, 2021.

“All data and metadata provided on UNdata’s website are available free of charge and may be copied freely, duplicated and further distributed provided that UNdata is cited as the reference.”

### Packages and Sessioninfo

```r
library("tidyverse")
library("rmarkdown")
library("countrycode")
library("broom")
library("ggpubr")
library("rstatix")
library("plotly")

sessionInfo()
```
```
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
[1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
[3] LC_TIME=de_DE.UTF-8        LC_COLLATE=de_DE.UTF-8
[5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=de_DE.UTF-8
[7] LC_PAPER=de_DE.UTF-8       LC_NAME=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] plotly_4.10.0     rstatix_0.7.0     ggpubr_0.4.0      broom_1.0.0
[5] countrycode_1.4.0 rmarkdown_2.14    forcats_0.5.1     stringr_1.4.0
[13] tibble_3.1.7      ggplot2_3.3.6     tidyverse_1.3.1   r2symbols_0.1

loaded via a namespace (and not attached):
[1] lubridate_1.8.0     tufte_0.12          assertthat_0.2.1
[4] digest_0.6.29       utf8_1.2.2          R6_2.5.1
[7] cellranger_1.1.0    backports_1.4.1     reprex_2.0.1
[10] evaluate_0.15       httr_1.4.3          pillar_1.7.0
[16] data.table_1.14.2   rstudioapi_0.13     car_3.1-0
[19] xaringanExtra_0.6.0 htmlwidgets_1.5.4   munsell_0.5.0
[22] compiler_4.2.1      modelr_0.1.8        xfun_0.31
[25] pkgconfig_2.0.3     htmltools_0.5.2     tidyselect_1.1.2
[28] viridisLite_0.4.0   fansi_1.0.3         crayon_1.5.1
[31] tzdb_0.3.0          dbplyr_2.2.1        withr_2.5.0
[34] grid_4.2.1          jsonlite_1.8.0      gtable_0.3.0
[37] lifecycle_1.0.1     DBI_1.1.3           magrittr_2.0.3
[40] scales_1.2.0        carData_3.0-5       cli_3.3.0
[43] stringi_1.7.6       ggsignif_0.6.3      fs_1.5.2
[46] xml2_1.3.3          ellipsis_0.3.2      generics_0.1.3
[49] vctrs_0.4.1         tools_4.2.1         glue_1.6.2
[52] hms_1.1.1           abind_1.4-5         fastmap_1.1.0
[55] yaml_2.3.5          colorspace_2.0-3    rvest_1.0.2
[58] bspm_0.3.9          knitr_1.39          haven_2.5.0
```

## Reading and Inspecting the Data

### The Waste Data

The downloaded file contains some metadata starting from row 1588, which caused an error when reading the whole file into a single tibble. Therefore I split it up into the actual data, waste_loaded, and the metadata, waste_meta.

There are 1587 rows and 5 columns in this dataset.
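The split described above could look roughly like the following. This is a minimal sketch on a toy file, not the original loading code: the header names, the country names, and all values are invented stand-ins, and only the idea of stopping at the last data row (`n_max`) and skipping past it (`skip`) is taken from the text.

```r
library(readr)

# Toy stand-in for the UNdata export (invented values):
# data rows first, footnote metadata appended below them.
toy_file <- tempfile(fileext = ".txt")
writeLines(c(
  "Country or Area;Year;Value;Unit;Value Footnotes",
  "Country A;2015;1400.00;1000 tonnes;",
  "Country A;2014;1200.00;1000 tonnes;1",
  "Country B;2015;11000.00;1000 tonnes;",
  "footnoteSeqID;Footnote",
  "1;Illustrative footnote text"
), toy_file)

n_data <- 3  # number of data rows (1587 in the real download)

# Read only the data rows ...
waste_loaded <- read_delim(toy_file, delim = ";", n_max = n_data,
                           col_types = cols())

# ... and then the metadata, skipping the data header plus the data rows
waste_meta <- read_delim(toy_file, delim = ";", skip = n_data + 1,
                         col_types = cols())

dim(waste_loaded)
```

Hard-coding the row count is fragile if the file is downloaded again, so in practice one might locate the first metadata line programmatically instead.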

### Economic And Population Data

We’ll pull in tables on population data (pop) and on the GDP per capita (eco_GDPpc).

For this I use data from the Gapminder Foundation, as their data seemed more complete than the UN data (which didn’t provide data for each year, but only roughly every fifth year).

Gapminder uses data e.g. from the World Bank and provides this data under an open license:
FREE DATA FROM WORLD BANK VIA GAPMINDER.ORG, released under the CC-BY LICENSE

```
# A tibble: 8 × 3
  Country      Year Population
  <chr>       <int>      <dbl>
1 Afghanistan  1800    3280000
2 Afghanistan  1801    3280000
3 Afghanistan  1802    3280000
4 Afghanistan  1803    3280000
5 Afghanistan  1804    3280000
6 Afghanistan  1805    3280000
7 Afghanistan  1806    3280000
8 Afghanistan  1807    3280000
```

```
# A tibble: 14 × 3
   Country      Year GDPpc
   <chr>       <int> <dbl>
 1 Afghanistan  1800   603
 2 Afghanistan  1801   603
 3 Afghanistan  1802   603
 4 Afghanistan  1803   603
 5 Afghanistan  1804   603
 6 Afghanistan  1805   603
 7 Afghanistan  1806   603
 8 Afghanistan  1807   603
 9 Afghanistan  1808   603
10 Afghanistan  1809   603
11 Afghanistan  1810   604
12 Afghanistan  1811   604
13 Afghanistan  1812   604
14 Afghanistan  1813   604
```
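The long Country/Year layout shown above is probably not how the downloads arrive. Assuming the Gapminder files come in a wide layout (one row per country, one column per year), which the post does not state, a reshaping step along these lines would produce it. The toy values and the lowercase `country` column are invented for illustration.

```r
library(dplyr)
library(tidyr)

# Toy wide table (invented values): one column per year
pop_wide <- tibble(
  country = c("Country A", "Country B"),
  `1800`  = c(1000, 2000),
  `1801`  = c(1010, 2020)
)

# One row per country and year, with Year stored as integer
pop <- pop_wide %>%
  pivot_longer(cols = -country, names_to = "Year", values_to = "Population") %>%
  mutate(Year = as.integer(Year)) %>%
  rename(Country = country)

pop
```

The same pattern would apply to the GDP table, with `values_to = "GDPpc"`.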

In the next step we will clean the dataset.

### Cleaning the Data

We may need the footnotes later, when we’re interpreting the data, but for now I’ll drop this column. Apart from the cleaning, I did some plausibility checks. If you’re interested, you can find them “below deck”, since they did not contribute to the main story.

Story line

First, let’s have a look at the included variables. It seems that there are no missing values[3]:

```
 Country_Area            Year          Value               Unit
 Length:1587        Min.   :1990   Min.   :     4.64   Length:1587
 Class :character   1st Qu.:2001   1st Qu.:   471.50   Class :character
 Mode  :character   Median :2006   Median :  2719.00   Mode  :character
                    Mean   :2006   Mean   : 12086.31
                    3rd Qu.:2011   3rd Qu.:  6274.50
                    Max.   :2016   Max.   :234471.00
```

Is the same unit used throughout?

```
# A tibble: 1 × 2
  Unit            n
  <chr>       <int>
1 1000 tonnes  1587
```

It is: every value is given in 1000 tonnes. We can drop the Unit column and keep the unit in mind for the rest of the analysis.

Luckily, there is not much cleaning needed at this stage. We can gladly move on to the exploratory analysis.

Below deck

The UNdata website states that they collected data from 1990 to 2015.

#### Validating the most recent values

Let’s check whether the 2016 data seems valid and the description on the website is just outdated. To do this, we’ll compare the values of 2016 to each country’s values from the years before:

Looking through the data, the entries for 2016 seem plausible. Wherever there were values to compare, they fell within the same range as in the years before. Based on this validation, I think it’s reasonable to keep the 2016 data. The description on the dataset’s website was probably just not updated.
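One way such a comparison could be automated is sketched below. It is not the check that was actually run: the values are invented, the column names follow the summary shown earlier, and the strict "inside the earlier min-max range" rule is an assumption that is harsher than judging by eye.

```r
library(dplyr)

# Toy data with invented values
waste_toy <- tibble(
  Country_Area = c("A", "A", "A", "B", "B", "B"),
  Year         = c(2014, 2015, 2016, 2014, 2015, 2016),
  Value        = c(100, 110, 105, 50, 55, 400)
)

check_2016 <- waste_toy %>%
  group_by(Country_Area) %>%
  summarise(
    value_2016 = Value[Year == 2016][1],     # NA if there is no 2016 report
    min_before = min(Value[Year < 2016]),    # range of the earlier years
    max_before = max(Value[Year < 2016]),
    .groups = "drop"
  ) %>%
  filter(!is.na(value_2016)) %>%
  mutate(within_range = value_2016 >= min_before & value_2016 <= max_before)

check_2016
```

In this toy example country A passes and country B is flagged. Countries that report only in 2016 have no earlier range and would need to be handled separately.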

#### Validating the Value-Column

The Value column seems quite unevenly distributed. Let’s inspect it visually:

There are some extreme values here. We need to check these before we continue with the analysis.

Both China and the United States of America have a huge population and economy. I can believe that they produce a lot of waste. But what is Botswana doing there on the last page of the above table? Is this an outlier?

```
# A tibble: 2 × 3
  Country_Area  Year   Value
  <chr>        <dbl>   <dbl>
1 Botswana      2015  85946.
2 Botswana      2013 130999.
```

Botswana reported in only two years: 2013 and 2015. As of 2018, Botswana had an estimated total population of only 2.25 million[4]. It seems implausible that the country produces waste on the scale of China or the USA. On the other hand, the two reported values are of the same order of magnitude, which lends each of them some plausibility. But with only two entries, this is quite weak confirmation. After some consideration and plotting along the way, I decided to exclude Botswana from further analysis. The exclusion happens at a later point, when creating the tibble waste_pop_gdp.
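A rough back-of-the-envelope calculation supports the suspicion. It uses the two reported values (in 1000 tonnes) and the 2018 population estimate quoted above, so the years do not match exactly and the result is only an order-of-magnitude figure.

```r
# Reported values in 1000 tonnes, taken from the table above
botswana_value_kt <- c(`2013` = 130999, `2015` = 85946)

# Population estimate for 2018, as quoted in the text
botswana_pop <- 2.25e6

# Convert to tonnes and divide by the population
tonnes_per_capita <- botswana_value_kt * 1000 / botswana_pop
round(tonnes_per_capita, 1)
```

This gives roughly 58 and 38 tonnes per person per year, far above anything else in the dataset.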

This concludes this “Below deck” section. If you’ve already read the main story line up to here, you can just scroll on to the next section. If not, I recommend reading the main story line of this section first and continuing from there.

### Participating Countries

There are 128 countries included in this dataset:

As the data is collected from questionnaires sent to the competent authorities in each country, its quality and completeness depend on each country’s willingness to report. Let’s see how many countries replied to the questionnaires over the years:

There is a considerable jump in the number of participating countries from 1994 to 1995 and a drop in 2016. As I want a data basis that is as consistent as possible, I’ll keep only the 20-year span from 1995 to 2015 as waste_20 (marked in blue).
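Counting the reporting countries per year and cutting the window could be sketched as follows. The toy values are invented; the column names follow the summary shown earlier, and `waste_20` is the name used in the text.

```r
library(dplyr)

# Toy data with invented values
waste_toy <- tibble(
  Country_Area = c("A", "A", "B", "B", "C", "C"),
  Year         = c(1994, 1995, 1995, 2015, 2015, 2016),
  Value        = c(10, 11, 20, 22, 30, 31)
)

# Number of distinct countries that reported in each year
countries_per_year <- waste_toy %>%
  group_by(Year) %>%
  summarise(n_countries = n_distinct(Country_Area), .groups = "drop")

# Keep only the reports from 1995 through 2015
waste_20 <- waste_toy %>%
  filter(between(Year, 1995, 2015))
```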

### Combining the waste data with population and economic data

To analyse the development over the years, we need to take the population development into account. To achieve this, we can join waste_20 with pop. And while we’re at it, let’s add the GDP data as well and store everything in waste_pop_gdp. Since we only want countries for which waste, population, and economic data are all available, we’ll use inner_join.

```
     ISO              Country               Year
 Length:1414        Length:1414        Min.   :1995
 Class :character   Class :character   1st Qu.:2001
 Mode  :character   Mode  :character   Median :2006
                                       Mean   :2006
                                       3rd Qu.:2011
                                       Max.   :2015
   waste_abs           waste_rel          Population
 Min.   :     5.07   Min.   :0.000829   Min.   :3.070e+04
 1st Qu.:   620.25   1st Qu.:0.274029   1st Qu.:3.090e+06
 Median :  2888.00   Median :0.411155   Median :8.350e+06
 Mean   : 12123.54   Mean   :0.445070   Mean   :4.173e+07
 3rd Qu.:  7024.00   3rd Qu.:0.559332   3rd Qu.:2.608e+07
 Max.   :234471.00   Max.   :4.305063   Max.   :1.410e+09
     GDPpc
 Min.   :   693
 1st Qu.: 10125
 Median : 20550
 Mean   : 24717
 3rd Qu.: 35800
 Max.   :124000
```
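The join that produces this table could be sketched as follows, on invented toy data. Two things are assumptions on my part: the sketch matches on the country name for brevity, whereas the ISO column in the summary suggests the real join went via ISO codes, and it takes waste_rel to be tonnes per person per year (waste_abs in 1000 tonnes divided by the population), which fits the magnitudes in the summary.

```r
library(dplyr)

# Toy inputs (invented values); country C has no population or GDP data
waste_20 <- tibble(
  Country = c("A", "A", "B", "C"),
  Year    = c(2000L, 2001L, 2000L, 2000L),
  Value   = c(500, 520, 1200, 80)            # 1000 tonnes
)
pop <- tibble(
  Country    = c("A", "A", "B"),
  Year       = c(2000L, 2001L, 2000L),
  Population = c(1e6, 1.02e6, 4e6)
)
eco_GDPpc <- tibble(
  Country = c("A", "A", "B"),
  Year    = c(2000L, 2001L, 2000L),
  GDPpc   = c(10000, 10400, 25000)
)

waste_pop_gdp <- waste_20 %>%
  filter(Country != "Botswana") %>%                    # exclusion decided below deck
  inner_join(pop, by = c("Country", "Year")) %>%       # keeps matching rows only
  inner_join(eco_GDPpc, by = c("Country", "Year")) %>%
  rename(waste_abs = Value) %>%
  mutate(waste_rel = waste_abs * 1000 / Population)    # tonnes per person (assumed unit)
```

Because both joins are inner joins, country C silently drops out, so it is worth comparing the row counts before and after the join.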

Now that we have all the required data, let’s head on to the actual analysis.

## Development over the years

First of all, let’s see how the overall waste collection in absolute numbers developed over time. As not all countries reported their waste production every year, I’ll later set the waste in relation to the population of the reporting countries each year. This way we can perform the analysis independently of the number and size of the countries that reported each year. I will call this ‘waste per capita’.

### Absolute waste over time

As mentioned, the first aspect is the absolute amount of waste that was reported. I’ll have a look at the amounts over time and then provide a ranking of the top 10 waste collectors. To account for the fact that not every country reports every year, this will be a ranking of the average over the last five years.

Note that a higher amount of collected waste does not necessarily mean more produced waste. A country with a comprehensive waste collection and disposal infrastructure might turn out ‘worse’ in this ranking than regions where there is no proper waste management and large portions of household/municipal waste are disposed of improperly.

Absolute Waste

The jump in 2000 seems quite significant. Since these are absolute numbers, regardless of how many countries reported in 2000 and how large the countries’ populations and economies are, this could be due to more countries reporting. However, as seen in Figure 2, the number of reporting countries increases gradually rather than in a comparable step. It could be a systematic error, too, but I cannot prove this latter hypothesis.[5]

It is also unclear what happened in 2015. This might be related to the declining number of reporting countries from 2011-2015, which can be seen in Figure 2.

Who collects the most municipal waste?

```
# A tibble: 10 × 3
   Country        mean_waste_mio_tonns waste_percap_kg
   <chr>                         <dbl>           <dbl>
 1 United States                 230.              731
 2 China                         175.              126
 3 Egypt                          94.9            1098
 4 Germany                        50.5             621
 5 Japan                          45               351
 6 Mexico                         41.6             357
 7 France                         34.4             539
 8 Brazil                         33               164
 9 Turkey                         31               409
10 United Kingdom                 31               477
```
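A ranking like this could be computed along the following lines. This is a sketch on invented toy data: taking "the last five years" to mean 2011 through 2015 and averaging the yearly per-capita values are both assumptions, and the output column names simply mirror the table above.

```r
library(dplyr)

# Toy data (invented values); waste_abs is in 1000 tonnes
waste_pop_gdp <- tibble(
  Country    = c("A", "A", "B", "B", "C"),
  Year       = c(2010L, 2014L, 2012L, 2015L, 2013L),
  waste_abs  = c(1000, 3000, 4000, 6000, 500),
  Population = c(2e6, 2e6, 1e7, 1e7, 5e5)
)

top_collectors <- waste_pop_gdp %>%
  filter(Year >= 2011, Year <= 2015) %>%        # assumed five-year window
  group_by(Country) %>%
  summarise(
    mean_waste_mio_tonns = mean(waste_abs) / 1000,                     # 1000 t -> million t
    waste_percap_kg      = round(mean(waste_abs * 1e6 / Population)),  # 1000 t -> kg per person
    .groups = "drop"
  ) %>%
  arrange(desc(mean_waste_mio_tonns)) %>%
  slice_head(n = 10)

top_collectors
```

Averaging only over the years in which a country actually reported means that a single report within the window counts as much as five.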

Below deck

Is there a real jump up in 2000 and a sudden drop again in 2015? Let’s theorize…

#### Hypothesis 1: It’s due to a systematic error

As mentioned before, there is this time frame from 2000 to around 2014 in which the reported amount seems to be “elevated” but basically still follows the overall trend. If this were due to one or a few “big producers” reporting in these years, I wouldn’t expect all of those 14 years to be “elevated”. Instead I’d expect to see only isolated jumps every now and then, whenever these countries reported. The continuous elevation over these 14 years makes a systematic error plausible, especially when taking into consideration the low value of 2015. Here’s the graphical representation of what I’m saying:

#### Hypothesis 2: This is a true effect

Of course this could also be valid data and the increase in waste is due to more countries or more people included (via the reporting countries they live in).

The number of people covered by the reporting countries doubled in 2000, so this could explain the sudden rise in absolute waste collected. Since the reported waste did not double as well, but ‘only’ increased by around 25%, this would imply that the “additional people” had quite a low ‘per capita’ waste collection. As you’ll see later, this would fit the drop in the yearly average ‘per capita’, which can also be observed in the year 2000.
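To make the arithmetic explicit, here is a small worked example. It assumes, purely for illustration, that the population doubled exactly, that the waste rose by exactly 25%, and that the countries already reporting before 2000 did not change their own figures. Let $W$ and $P$ be the waste and the population before 2000, with $w = W/P$ the waste per capita. Afterwards the overall per-capita value is

$$w' = \frac{1.25\,W}{2\,P} = 0.625\,w$$

and the additional people (another $P$) account for the additional waste ($0.25\,W$) only:

$$w_{\text{new}} = \frac{0.25\,W}{P} = 0.25\,w$$

Under these assumptions the newly included population would collect about a quarter of the previous per-capita amount, and the overall average would fall by more than a third.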

One last thing is the sudden drop in reported waste in 2015, which is not reflected in the number of people. Unfortunately I couldn’t resolve this bit at the moment, since I couldn’t find any information regarding possible systematic reporting changes on the UNdata website.

While it would be very welcome to see a true downward trend, at this point I’m not yet convinced that this is the case. For now I’ll go with the second hypothesis, but we’ll have to take a deeper dive into the data on the country level, to see if there’s more to discover.

### Waste relative to population over time

To mitigate effects of irregular reporting and changing participants each year, we’ll set the waste in relation to the number of people that were represented by the reporting countries each year.

It is astounding how much waste is collected: since 2000, the average municipal waste collected per capita per year has been between 250 kg and 300 kg. Sure, “municipal waste” includes not just the mixed household waste[6], but this huge amount of waste still has to be disposed of.

Interestingly we see a steep drop in the year 2000. If you’ve read the previous “Below deck” section you might know this could be linked to a sudden increase in the populations included in the reporting. They seem to lower the average quite a bit.
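The yearly per-capita figure described in this section, total reported waste divided by the total population of the reporting countries, could be computed as sketched below. The toy values are invented and deliberately mimic a year in which the covered population doubles while the waste rises by 25%; waste_abs is assumed to be in 1000 tonnes.

```r
library(dplyr)

# Toy data (invented values): country C only starts reporting in 2000
waste_pop_gdp <- tibble(
  Country    = c("A", "B", "A", "B", "C"),
  Year       = c(1999L, 1999L, 2000L, 2000L, 2000L),
  waste_abs  = c(500, 1500, 500, 1500, 500),     # 1000 tonnes
  Population = c(1e6, 3e6, 1e6, 3e6, 4e6)
)

waste_per_capita_year <- waste_pop_gdp %>%
  group_by(Year) %>%
  summarise(
    total_waste_t = sum(waste_abs) * 1000,               # tonnes
    total_pop     = sum(Population),                     # people covered that year
    kg_per_capita = total_waste_t * 1000 / total_pop,    # kg per person
    .groups = "drop"
  )

waste_per_capita_year
```

In the toy data the value falls from 500 kg to 312.5 kg although countries A and B report unchanged figures; the drop comes entirely from the newly included population. Note that this pooled ratio weights large countries more heavily than a plain average of the country values would.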

### Comparing the countries I: distribution

How are the values of waste per capita of all countries distributed over the years?

The majority of countries collect less than 1 tonne per person per year. Accordingly, the boxplots, whiskers included, lie entirely below the 1-tonne line. Some countries, however, collect more than 1 tonne (blue) or even more than 2 tonnes (red) per person per year. These are:

```
[1] "Countries with  1 - 2 tonnes of waste per year per capita:"
# A tibble: 8 × 1
  Country
  <chr>
1 Antigua and Barbuda
2 Egypt
3 Kyrgyz Republic
4 Maldives
5 Monaco
6 Montenegro
7 Qatar
8 Singapore
[1] "Countries with  more than 2 tonnes of waste per year per capita:"
# A tibble: 1 × 1
  Country
  <chr>
1 Kuwait
```
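Lists like these could be derived as sketched below, on invented toy values. The sketch assumes that a country is placed in a band according to its highest yearly value; the post does not say whether the maximum or some other statistic was used.

```r
library(dplyr)

# Toy data (invented values), waste_rel in tonnes per person per year
waste_pop_gdp <- tibble(
  Country   = c("A", "A", "B", "B", "C"),
  waste_rel = c(0.4, 0.5, 0.9, 1.3, 2.4)
)

high_collectors <- waste_pop_gdp %>%
  group_by(Country) %>%
  summarise(max_rel = max(waste_rel), .groups = "drop") %>%
  mutate(band = case_when(
    max_rel > 2 ~ "more than 2 tonnes",
    max_rel > 1 ~ "1 - 2 tonnes",
    TRUE        ~ "up to 1 tonne"
  )) %>%
  filter(max_rel > 1)     # keep only the countries above the 1-tonne line

high_collectors
```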

These figures should be taken for what they are: numbers reported in response to a questionnaire. Please don’t see this as “shaming”. Consider the options:

• They might really produce that much waste per capita, or
• they might include waste categories in “municipal waste” that others don’t, or
• they might collect a bigger fraction of the produced waste, while others collect only a small part, with the rest discarded by the population and not accounted for in these numbers, or finally
• they might just be ‘honest’, while others report lower numbers.

### Comparing the countries II: development over time (overview)

The following plot is not very pretty, but it gives a rough visual clue for each country over time.

There is considerable heterogeneity across the countries, not just in the scales, but also in the shape of the development over time:

• One frequent pattern is a steady upward trend over the whole period.
• Another frequent pattern is a bell-shaped curve or an upside-down U, with the peak quite often at around 2005-2010.
• Only a few countries show a steadily falling curve over the whole period.
• Some countries reported only one or two values and therefore barely show a clear trend.

### Comparing the countries II: development over time (deep dive)

Due to the bell-shaped curve in some countries, a comparison of the trends is difficult. To make things easier, I will only look at the ten years from 2005 to 2015. Since 2005 was the peak in many countries with a bell-shaped curve, the development after that should be more or less steady. But this is still 10 years of development, so we should gain good insight into the major trend in these countries as well.

For better comparison between the many countries, we’ll scale the values to a baseline and express all other values as a percentage of the baseline value. This baseline is computed separately for each country as the 5-year mean around 2005[7].

I will also remove all countries that had fewer than 3 observations in the last 10 years, since there is not much sense in deriving a trend from so few data points.
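The scaling and filtering steps above could be sketched as follows, on invented toy values. The baseline window of 2003 through 2007 follows the footnote; counting the observations within 2005 to 2015 for the minimum of three is my reading of "the last 10 years".

```r
library(dplyr)

# Toy data (invented values), waste_rel in tonnes per person per year
waste_pop_gdp <- tibble(
  Country   = c(rep("A", 5), rep("B", 2)),
  Year      = c(2003L, 2005L, 2007L, 2010L, 2015L, 2005L, 2012L),
  waste_rel = c(0.40, 0.50, 0.60, 0.45, 0.40, 0.30, 0.35)
)

# Baseline per country: mean of the values available from 2003 through 2007
baseline <- waste_pop_gdp %>%
  filter(between(Year, 2003, 2007)) %>%
  group_by(Country) %>%
  summarise(baseline = mean(waste_rel), .groups = "drop")

waste_scaled <- waste_pop_gdp %>%
  filter(between(Year, 2005, 2015)) %>%
  group_by(Country) %>%
  filter(n() >= 3) %>%                       # drop countries with fewer than 3 observations
  ungroup() %>%
  inner_join(baseline, by = "Country") %>%
  mutate(pct_of_baseline = 100 * waste_rel / baseline)

waste_scaled
```

In the toy data country B is removed because it has only two observations in the window, and country A is expressed relative to its baseline of 0.5 tonnes.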

Countries with significant change

All countries

Here’s an overview of all countries with sufficient data in the time of interest. Can you find yours?

## Is the waste per capita correlated to the GDP per capita?

The hypothesis is that a “rich” society consumes more and thereby produces more waste. As we have already combined the waste data with population and GDP values for each year and country, we can easily test this.

Main Story Line

If we throw all the data into the correlation, we do in fact see a positive correlation of approx. $\rho = 0.76$.
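For readers who want to reproduce this kind of figure, here is a minimal sketch using base R. It assumes that the coefficient is Spearman’s rank correlation (the per-country analysis in this section mentions Spearman, and the data are not normally distributed); the six value pairs are invented.

```r
# Toy vectors (invented values): GDP per capita and waste per capita
gdp   <- c(1000, 5000, 12000, 20000, 35000, 50000)
waste <- c(0.10, 0.25, 0.30, 0.45, 0.40, 0.60)

# exact = FALSE uses the asymptotic p-value, which also works when there are ties
overall <- cor.test(waste, gdp, method = "spearman", exact = FALSE)

overall$estimate   # Spearman's rho
overall$p.value
```

Because only the ranks matter, the result is unaffected by the very different scales of the two variables.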

The following is an interactive visualization of all the years. Can you find your country? Feel free to zoom in and play around.

As a last part, let’s see how the two variables are correlated within each country. The following table is limited to countries that had at least 5 data points over 20 years and showed a significant correlation (Spearman) after p-value correction:

Interestingly, in four countries there is a significant negative correlation. The vast majority of countries support the hypothesis that a higher GDP comes with a higher waste production.[8]
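The per-country analysis could be sketched as follows, on invented toy data. The minimum of five data points and the Spearman method come from the text; the Holm correction and the 0.05 threshold are assumptions, since the post does not name the correction method that was used.

```r
library(dplyr)

# Toy data (invented values)
waste_pop_gdp <- tibble(
  Country   = rep(c("A", "B", "C"), times = c(6, 6, 3)),
  GDPpc     = c(1:6, 1:6, 1:3) * 1000,
  waste_rel = c(0.10, 0.20, 0.30, 0.40, 0.50, 0.60,   # A: rises with GDP
                0.60, 0.50, 0.40, 0.30, 0.20, 0.10,   # B: falls with GDP
                0.30, 0.20, 0.40)                      # C: too few data points
)

cor_by_country <- waste_pop_gdp %>%
  group_by(Country) %>%
  filter(n() >= 5) %>%                                 # at least 5 data points
  summarise(
    rho = cor(waste_rel, GDPpc, method = "spearman"),
    p   = cor.test(waste_rel, GDPpc, method = "spearman", exact = FALSE)$p.value,
    .groups = "drop"
  ) %>%
  mutate(p_adj = p.adjust(p, method = "holm")) %>%     # assumed correction method
  filter(p_adj < 0.05)

cor_by_country
```

The correction is applied across all tested countries before filtering, so the number of countries tested affects which ones remain significant.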

Below deck

To better compare the figures I analyzed the distributions of the values, first visually…

…and then with a Shapiro-Wilk test:

```r
# Shapiro-Wilk normality test for both variables (rstatix), shown as a paged table (rmarkdown)
waste_pop_gdp %>%
  shapiro_test(waste_rel, GDPpc) %>%
  paged_table()
```

Both visually and according to the Shapiro-Wilk test, we can safely reject the null hypothesis that each of the variables is normally distributed.

## Conclusions

• Waste is a global problem and the yearly collected amounts seem to increase steadily[9].
• On average (across all reporting nations), between 250 and 300 kg of municipal waste was collected per person each year, with a few countries reaching 2-3 tonnes per capita per year.
• In some countries there is a positive turnaround with decreasing numbers of reported waste in recent years. Let’s hope this is due to less waste production, not due to less waste collection.
• In most countries there is a correlation between the GDP per capita and the collected waste per capita.

## Footnotes

2. Source: “Municipal waste collected.xlsx”, retrieved on 2021-01-30

3. Actually, there are, since not every country reported every year. But since these “missing reports” do not show up in the dataset, it does not include any NAs.

4. Source: Wikipedia

5. Check out the “Below deck” section for more.

6. See the Introduction section on the UN data to learn what’s included.

7. i.e. 2003 through 2007, as far as there are values available for that period

8. Well, there is another possibility: as the reported values are waste collected, a higher GDP could just mean that the communities can afford to collect the garbage more consistently. But finding the true causality is not possible in this blog post with the available data, so in the end it remains a correlation, nothing more.

9. ignoring a probable outlier in 2015, the last of the analysed years

## Citation

BibTeX citation:
@online{gebhard2021,
author = {Christian Gebhard},
title = {World {Of} {Waste}},
date = {2021-02-07},
url = {https://jollydata.blog/world-of-waste.html},
langid = {en}
}