Waste is an international matter: everyone produces it. Each and every country has to deal with it, and “dealing” means at least two things: the actual management of the waste and coping with the costs and consequences of waste production.
(Preview image credits: Offiikart, CC0, via Wikimedia Commons)
About Waste
“Wastes” are substances or objects which are disposed of or are intended to be disposed of or are required to be disposed of by the provisions of national law. [1]
environmental costs: these range from the toxic effects of hazardous waste on the environment and water, through direct or indirect health issues for animals and humans (e.g. toxic gases when waste is burnt, or the spread of disease by vermin and rodents), to the production of greenhouse gases.
social costs: waste management is often linked to social injustice, as waste is frequently moved from developed countries to developing regions, often causing even more problems for communities that are already politically, economically and environmentally challenged.
economic costs: managing waste correctly is expensive. If a community wants to avoid social and environmental damage, proper handling and disposal has to be paid for by someone, be it the producers themselves or society via taxes.
I want to get an understanding of worldwide waste production and analyze whether there are trends in countries or regions.
As always, if you’re not interested in how to get there, you can jump to Conclusions.
United Nations Data
The public data repository of the United Nations “UNdata” offers many datasets, including a dataset on “Total amount of municipal waste collected” for the years 1990 - 2015. It is based on official statistics provided by national offices or environmental ministries in response to the biennial UNSD/UNEP Questionnaire on Environment Statistics, complemented with data from OECD and Eurostat.
Municipal waste, collected by or on behalf of municipalities, by public or private enterprises, includes waste originating from: households, commerce and trade, small businesses, office buildings and institutions (schools, hospitals, government buildings). It also includes bulky waste (e.g., white goods, old furniture, mattresses) and waste from selected municipal services, e.g., waste from park and garden maintenance, waste from street cleaning services (street sweepings, the content of litter containers, market cleansing waste), if managed as waste. The definition excludes waste from municipal sewage network and treatment, municipal construction and demolition waste. Municipal waste collected refers to waste collected by or on behalf of municipalities, as well as municipal waste collected by the private sector. It includes mixed waste, and fractions collected separately for recovery operations (through door-to-door collection and/or through voluntary deposits). For data sourced from OECD and Eurostat, values correspond to municipal waste generated. [2]
License: As I couldn’t find any specific license mentioned on the UNdata website, I refer to their “Terms of Use”:
“All data and metadata provided on UNdata’s website are available free of charge and may be copied freely, duplicated and further distributed provided that UNdata is cited as the reference.”
The downloaded CSV contains some metadata starting at row 1588, which caused an error when reading the whole file into a single tibble. I therefore split it into the actual data, waste_loaded, and the metadata, waste_meta.
There are 1587 rows and 5 columns in this dataset.
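One possible way to do such a split, sketched in base R. The file content below is a tiny made-up stand-in, the column names of the metadata block are hypothetical, and the split point is found via the first blank line, which is an assumption about the file layout (for the real file, the text gives row 1588 as the start of the metadata).

```r
# Toy stand-in for the downloaded file: a data block followed by a
# metadata block with a different number of columns (made-up values).
raw_lines <- c(
  "Country_Area,Year,Value,Unit,Footnotes",
  "Country A,2015,1413,1000 tonnes,",
  "Country A,2014,1229,1000 tonnes,1",
  "",
  "footnoteSeqID,Footnote",
  "1,Example footnote text"
)
path <- tempfile(fileext = ".csv")
writeLines(raw_lines, path)

all_lines <- readLines(path)

# Assumption: the metadata block starts after the first blank line.
split_at <- which(all_lines == "")[1]

# Parse the two blocks separately so each gets its own column layout.
waste_loaded <- read.csv(text = all_lines[seq_len(split_at - 1)],
                         stringsAsFactors = FALSE)
waste_meta   <- read.csv(text = all_lines[(split_at + 1):length(all_lines)],
                         stringsAsFactors = FALSE)
```

Reading the lines first and parsing each block on its own avoids the column-count mismatch that breaks a single read of the whole file.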
Economic And Population Data
We’ll pull in tables on population data (pop) and on the GDP per capita (eco_GDPpc).
For this I use data from the Gapminder Foundation, as their data seemed more complete than the UN data (which didn’t provide data for each year, but only roughly every fifth year).
Gapminder uses data from, e.g., the World Bank and provides it under an open license:
FREE DATA FROM WORLD BANK VIA GAPMINDER.ORG, released under the CC-BY LICENSE
We may need the footnotes later when interpreting the data, but for now I’ll drop this column. Apart from the cleaning, I did some plausibility checks. If you’re interested, you can find them “below deck”, since they do not contribute to the main story.
Story line
First, let’s have a look at the included variables. It seems that there are no missing values [3]:
 Country_Area            Year          Value              Unit
 Length:1587        Min.   :1990   Min.   :     4.64   Length:1587
 Class :character   1st Qu.:2001   1st Qu.:   471.50   Class :character
 Mode  :character   Median :2006   Median :  2719.00   Mode  :character
                    Mean   :2006   Mean   : 12086.31
                    3rd Qu.:2011   3rd Qu.:  6274.50
                    Max.   :2016   Max.   :234471.00
Is the same unit used throughout?
# A tibble: 1 × 2
Unit n
<chr> <int>
1 1000 tonnes 1587
This is indeed the case. We can drop the Unit column and keep the unit (1000 tonnes) in mind for the rest of the analysis:
Luckily, there is not much cleaning needed at this stage. We can gladly move on to the exploratory analysis.
Below deck
The UNdata website states that they collected data from 1990 to 2015.
Validating the most recent values
Let’s check whether the 2016 data seem valid and the description on the website is just outdated. To do this, we’ll compare the 2016 values to each country’s values from the years before:
Looking through the data, the entries for 2016 seem plausible. Wherever there were values to compare, they fell within the same range as in the years before. Based on this check, I think it’s reasonable to keep the 2016 data. The description on the dataset’s website probably just wasn’t updated.
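The check above was done by eye. A rough automated version could look like the following base R sketch. The data are made up, and the 20% tolerance around each country’s earlier range is a hypothetical threshold, not something taken from the analysis.

```r
# Made-up toy data: two countries, values in 1000 tonnes.
waste <- data.frame(
  Country_Area = c("A", "A", "A", "A", "B", "B", "B"),
  Year         = c(2013, 2014, 2015, 2016, 2014, 2015, 2016),
  Value        = c(100, 110, 105, 108, 50, 52, 90),
  stringsAsFactors = FALSE
)

latest  <- waste[waste$Year == 2016, ]
earlier <- waste[waste$Year < 2016, ]

# Smallest and largest earlier value per country.
min_before <- tapply(earlier$Value, earlier$Country_Area, min)
max_before <- tapply(earlier$Value, earlier$Country_Area, max)
latest$min_before <- as.numeric(min_before[latest$Country_Area])
latest$max_before <- as.numeric(max_before[latest$Country_Area])

# Hypothetical tolerance: allow 20% beyond the earlier range.
latest$plausible <- latest$Value >= 0.8 * latest$min_before &
                    latest$Value <= 1.2 * latest$max_before
```

In this toy example country A’s 2016 value passes and country B’s is flagged for a closer look.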
Validating the Value-Column
The Value column seems quite unevenly distributed. Let’s inspect it visually:
There are some extreme values here. We need to check these before we continue with the analysis.
Both China and the United States of America have a huge population and economy. I can believe that they produce a lot of waste. But what is Botswana doing there on the last page of the above table? Is this an outlier?
# A tibble: 2 × 3
Country_Area Year Value
<chr> <dbl> <dbl>
1 Botswana 2015 85946.
2 Botswana 2013 130999.
Botswana reported in two years: 2013 and 2015. As of 2018, Botswana had a total estimated population of only 2.25 million [4]. It seems impossible that it produces nearly as much waste as China or the USA. On the other hand, the two reported values are not that far apart, which makes each one somewhat more plausible. But since there are only two entries, this is quite weak confirmation. After some consideration and plotting along the way, I decided to exclude Botswana from further analysis. This exclusion will be done at a later point, when creating the tibble waste_pop_gdp.
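A quick back-of-the-envelope calculation shows how implausible the Botswana figures are. This sketch uses only the numbers quoted above; note that the population is the 2018 estimate rather than the 2013 or 2015 figure, so the result is only approximate.

```r
# Values reported for Botswana, in units of 1000 tonnes (from the table above).
value_kt <- c(`2013` = 130999, `2015` = 85946)

# Estimated population as of 2018, as quoted in the text.
population <- 2.25e6

# Convert 1000 tonnes to tonnes and divide by the number of people.
tonnes_per_capita <- value_kt * 1000 / population
round(tonnes_per_capita, 1)
```

This works out to roughly 58 and 38 tonnes per person per year, far above the per-capita values seen for other countries later in this post.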
This concludes this “Below deck” section. If you’ve already read the main story line up to here, you can just scroll on to the next section. If not, I recommend reading the main story line of this section first and continuing from there.
Participating Countries
There are 128 countries included in this dataset:
As the data are collected via questionnaires sent to the competent authorities in each country, quality and completeness depend on how motivated each country is to report. Let’s see how many countries replied to the questionnaires over the years:
There is a considerable jump in the number of participating countries from 1994 to 1995 and a drop in 2016. As I want a data basis that is as consistent as possible, I’ll keep only the 20-year span from 1995 to 2015 as waste_20 (marked in blue).
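Counting the reporting countries per year and cutting the window could be done along these lines. This is a base R sketch on made-up data; only the name waste_20 and the 1995 to 2015 window come from the text.

```r
# Made-up toy data (values in 1000 tonnes).
waste <- data.frame(
  Country_Area = c("A", "A", "B", "B", "C", "A", "B"),
  Year         = c(1994, 1995, 1995, 2015, 2015, 2016, 1996),
  Value        = c(10, 11, 20, 25, 5, 12, 21),
  stringsAsFactors = FALSE
)

# Number of distinct countries reporting in each year.
n_reporting <- tapply(waste$Country_Area, waste$Year,
                      function(x) length(unique(x)))

# Keep only the 1995-2015 window used in the text.
waste_20 <- waste[waste$Year >= 1995 & waste$Year <= 2015, ]
```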
Combining the waste data with population and economic data
To analyse the development over the years, we need to take population development into account. To achieve this, we can join waste_20 with pop. And while we’re at it, let’s add the GDP data as well and store everything in waste_pop_gdp. Since we only want to use countries for which waste, population, and economic data are all available, we’ll use inner_join.
     ISO              Country               Year
 Length:1414        Length:1414        Min.   :1995
 Class :character   Class :character   1st Qu.:2001
 Mode  :character   Mode  :character   Median :2006
                                       Mean   :2006
                                       3rd Qu.:2011
                                       Max.   :2015
   waste_abs           waste_rel          Population
 Min.   :     5.07   Min.   :0.000829   Min.   :3.070e+04
 1st Qu.:   620.25   1st Qu.:0.274029   1st Qu.:3.090e+06
 Median :  2888.00   Median :0.411155   Median :8.350e+06
 Mean   : 12123.54   Mean   :0.445070   Mean   :4.173e+07
 3rd Qu.:  7024.00   3rd Qu.:0.559332   3rd Qu.:2.608e+07
 Max.   :234471.00   Max.   :4.305063   Max.   :1.410e+09
     GDPpc
 Min.   :   693
 1st Qu.: 10125
 Median : 20550
 Mean   : 24717
 3rd Qu.: 35800
 Max.   :124000
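The join described above can be sketched in base R, where merge() performs an inner join by default (the counterpart of dplyr’s inner_join). The data are made up, the join keys are assumed to be Country and Year, and the formula for waste_rel (tonnes per person, derived from waste_abs in 1000 tonnes) is inferred from the summary above rather than taken from the original code.

```r
# Made-up toy tables; the table and column names follow the text.
waste_20 <- data.frame(Country = c("A", "A", "B", "Botswana"),
                       Year = c(2014, 2015, 2015, 2015),
                       waste_abs = c(1000, 1100, 300, 85946),
                       stringsAsFactors = FALSE)
pop <- data.frame(Country = c("A", "A", "Botswana", "C"),
                  Year = c(2014, 2015, 2015, 2015),
                  Population = c(2e6, 2.1e6, 2.2e6, 5e5),
                  stringsAsFactors = FALSE)
eco_GDPpc <- data.frame(Country = c("A", "A", "B", "Botswana"),
                        Year = c(2014, 2015, 2015, 2015),
                        GDPpc = c(30000, 31000, 12000, 15000),
                        stringsAsFactors = FALSE)

# merge() keeps only rows whose keys occur in both tables (inner join).
waste_pop_gdp <- merge(merge(waste_20, pop, by = c("Country", "Year")),
                       eco_GDPpc, by = c("Country", "Year"))

# Exclude Botswana, as decided in the plausibility check.
waste_pop_gdp <- waste_pop_gdp[waste_pop_gdp$Country != "Botswana", ]

# Assumption: waste_abs is in 1000 tonnes, so waste_rel is tonnes per person.
waste_pop_gdp$waste_rel <-
  waste_pop_gdp$waste_abs * 1000 / waste_pop_gdp$Population
```

Country B drops out because it has no population entry, and country C because it has no waste entry, which is exactly the behaviour wanted from an inner join.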
Now that we have all the required data, let’s move on to the actual analysis.
Development over the years
First of all, let’s see how overall waste collection developed over time in absolute numbers. As not all countries reported their waste production every year, I’ll later set the waste in relation to the population of the countries reporting in each year. This way we can perform the analysis independently of the number and size of the countries that reported each year. I will call this ‘waste per capita’.
Absolute waste over time
As mentioned, the first aspect is the absolute amount of waste that was reported. I’ll have a look at the amounts over time and then provide a ranking of the Top 10 waste collectors. To account for the fact that not every country reports every year, this will be a ranking by the average over the last five years.
Note that a higher amount of collected waste does not necessarily mean more produced waste. A country with a comprehensive waste collection and disposal infrastructure might turn out ‘worse’ in this ranking than regions where there is no proper waste management and large portions of household/municipal waste are disposed of improperly.
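The ranking logic described above could be sketched like this in base R. The data are made up, and reading “the last five years” as 2011 to 2015 is an assumption.

```r
# Made-up toy data; waste_abs in 1000 tonnes.
d <- data.frame(
  Country   = c("A", "A", "B", "B", "B", "C"),
  Year      = c(2011, 2015, 2009, 2012, 2014, 2013),
  waste_abs = c(200, 220, 900, 100, 120, 150),
  stringsAsFactors = FALSE
)

# Assumption: "the last five years" means 2011-2015.
recent <- d[d$Year >= 2011 & d$Year <= 2015, ]

# Average whatever reports exist per country, then sort descending.
ranking <- aggregate(waste_abs ~ Country, data = recent, FUN = mean)
ranking <- ranking[order(-ranking$waste_abs), ]
top <- head(ranking, 10)
```

Averaging over the available reports means a country that reported only once in the window still gets a rank, based on that single value.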
Absolute Waste
The jump in 2000 seems quite large. Since these are absolute numbers, regardless of how many countries reported in 2000 and how large their populations and economies are, this could be due to more countries reporting. However, as seen in Figure 2, the number of reporting countries increases gradually and shows no step comparable to the one seen here. It could be a systematic error, too. However, I cannot prove this latter hypothesis. [5]
It is also unclear what happened in 2015. This might be related to the declining number of reporting countries from 2011-2015, which can be seen in Figure 2.
Who collects the most municipal waste?
# A tibble: 10 × 3
Country mean_waste_mio_tonns waste_percap_kg
<chr> <dbl> <dbl>
1 United States 230. 731
2 China 175. 126
3 Egypt 94.9 1098
4 Germany 50.5 621
5 Japan 45 351
6 Mexico 41.6 357
7 France 34.4 539
8 Brazil 33 164
9 Turkey 31 409
10 United Kingdom 31 477
Below deck
Is there a real jump up in 2000 and a sudden drop again in 2015? Let’s theorize…
Hypothesis 1: It’s due to a systematic error
As mentioned before, there is a time frame from 2000 to around 2014 in which the reported amount seems to be “elevated” but basically still follows the overall trend. If this were due to one or a few “big producers” reporting in these years, I wouldn’t expect all of those 14 years to be “elevated”. Instead, I’d expect to see only an isolated jump every now and then, whenever these countries reported. The continuous elevation over these 14 years makes a systematic error possible, especially when taking into consideration the low value of 2015. Here’s the graphical representation of what I’m saying:
Hypothesis 2: This is a true effect
Of course this could also be valid data and the increase in waste is due to more countries or more people included (via the reporting countries they live in).
The number of people doubled in 2000, so this could explain the sudden rise in absolute waste collected. Since the reported waste did not double as well, but ‘only’ increased by around 25%, this would imply that the “additional people” had quite a low ‘per capita’ waste collection. As you’ll see later, this would fit the drop in the yearly average ‘per capita’ which can also be observed in the year 2000.
One last thing is the sudden drop in reported waste in 2015, which is not reflected in the number of people. Unfortunately, I couldn’t resolve this for now, since I couldn’t find any information regarding possible systematic reporting changes on the UNdata website.
While it would be very welcome to see a true downward trend, at this point I’m not yet convinced that this is the case. For now I’ll go with the second hypothesis, but we’ll have to take a deeper dive into the data at the country level to see if there’s more to discover.
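The arithmetic behind this argument can be made explicit. The sketch below uses only the two rough figures from the paragraph above and assumes, for simplicity, that the countries already reporting before 2000 did not change their own waste or population.

```r
pop_growth   <- 2.00   # covered population roughly doubled (from the text)
waste_growth <- 1.25   # absolute waste rose by roughly 25% (from the text)

# Overall per-capita waste relative to the year before.
overall_ratio <- waste_growth / pop_growth

# Per-capita waste of the newly covered population relative to the
# previous average: extra waste divided by extra people.
newcomer_ratio <- (waste_growth - 1) / (pop_growth - 1)
```

Under these assumptions the overall per-capita value falls to about 62.5% of the previous year’s level, and the newly covered population collects only about a quarter of the previous per-capita average.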
Waste relative to population over time
To mitigate effects of irregular reporting and changing participants each year, we’ll set the waste in relation to the number of people that were represented by the reporting countries each year.
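A minimal base R sketch of this pooled per-capita calculation, on made-up data: total reported waste per year divided by the total population of the countries that reported in that year. Whether the original analysis pooled the totals in exactly this way is an assumption.

```r
# Made-up toy data: waste_abs in 1000 tonnes, one row per country and year.
d <- data.frame(
  Year       = c(1999, 1999, 2000, 2000, 2000),
  waste_abs  = c(500, 300, 510, 310, 400),
  Population = c(1e6, 1e6, 1e6, 1e6, 4e6)
)

# Sum waste and population over all countries reporting in each year.
totals <- aggregate(cbind(waste_abs, Population) ~ Year, data = d, FUN = sum)

# Converting 1000 tonnes to kg is a factor of 1e6.
totals$kg_per_capita <- totals$waste_abs * 1e6 / totals$Population
```

In the toy data a large, low-waste country joins in 2000 and pulls the pooled per-capita value down, even though absolute waste goes up.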
It is astounding how much waste is collected: since 2000, the average collected municipal waste per capita per year has been between 250 kg and 300 kg. Sure, “municipal waste” includes more than just mixed household waste [6], but this huge amount of waste still has to be disposed of.
Interestingly, we see a steep drop in the year 2000. If you’ve read the previous “Below deck” section, you’ll know this could be linked to a sudden increase in the populations included in the reporting. They seem to lower the average quite a bit.
Comparing the countries I: distribution
How are the values of waste per capita of all countries distributed over the years?
The majority of countries collect less than 1 tonne per person per year. Accordingly, the complete boxplots, including the whiskers, lie below the 1 tonne line. Some countries, however, collect more than 1 tonne (blue) or even more than 2 tonnes (red) per person per year. These are:
[1] "Countries with 1 - 2 tonnes of waste per year per capita:"
# A tibble: 8 × 1
Country
<chr>
1 Antigua and Barbuda
2 Egypt
3 Kyrgyz Republic
4 Maldives
5 Monaco
6 Montenegro
7 Qatar
8 Singapore
[1] "Countries with more than 2 tonnes of waste per year per capita:"
# A tibble: 1 × 1
Country
<chr>
1 Kuwait
These figures should be taken for what they are: numbers reported in response to a questionnaire. Please don’t read this as “shaming”. Consider the options:
They might really produce that much waste per capita, or
they might include waste categories in “municipal waste” that others don’t, or
they might collect a bigger fraction of the produced waste, while others only collect a small part and the rest is discarded by the population and not accounted for in these numbers, or finally
they might just be ‘honest’, while others report lower numbers.
Comparing the countries II: development over time (overview)
The following plot is not very pretty, but it gives a rough visual impression of each country over time.
There is considerable heterogeneity across the countries, not just in the scales, but also in the shape of the development over time:
One frequent pattern is a steady upward trend over the whole period.
Another frequent pattern is a bell-shaped curve or an upside-down U, with the peak quite often at around 2005-2010.
Only a few countries show a steadily falling curve over the whole period.
Some countries only reported one or two values and therefore show no clear trend.
Comparing the countries II: development over time (deep dive)
The bell-shaped curve in some countries makes it difficult to compare trends. To make things easier, I will only look at the ten years from 2005 to 2015. Since 2005 was the peak in many countries with a bell-shaped curve, the development after that should be more or less steady. This is still 10 years of development, so we should gain good insight into the major trend in these countries as well.
For better comparison between the many countries, we’ll scale the values to a baseline and express all other values as a percentage of the baseline value. This baseline is computed separately for each country as the 5-year mean around 2005 [7].
I will also remove all countries that had fewer than 3 observations in the last 10 years, since there is not much sense in deriving a trend from so few data points.
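The baseline scaling and the minimum-observation filter described above could be sketched as follows in base R. The data are made up; the baseline window 2003 to 2007 follows the footnote, and treating “the last 10 years” as 2005 to 2015 is an assumption.

```r
# Made-up toy data: waste_rel in tonnes per person.
d <- data.frame(
  Country   = rep(c("A", "B"), each = 4),
  Year      = c(2004, 2005, 2010, 2015, 2005, 2006, 2010, 2015),
  waste_rel = c(0.40, 0.60, 0.45, 0.40, 0.20, 0.20, 0.25, 0.30),
  stringsAsFactors = FALSE
)

# Baseline: mean of whatever values exist in 2003-2007, per country.
in_base  <- d$Year >= 2003 & d$Year <= 2007
baseline <- tapply(d$waste_rel[in_base], d$Country[in_base], mean)

# Keep 2005-2015 and drop countries with fewer than 3 observations there.
trend <- d[d$Year >= 2005 & d$Year <= 2015, ]
n_obs <- table(trend$Country)
keep  <- as.vector(n_obs[as.character(trend$Country)]) >= 3
trend <- trend[keep, ]

# Express each value as a percentage of the country's own baseline.
trend$pct_of_baseline <- 100 * trend$waste_rel /
  as.numeric(baseline[as.character(trend$Country)])
```

A value of 80 then means the country collected 20% less per capita than around 2005, regardless of its absolute level.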
Countries with significant change
All countries
Here’s an overview of all countries with sufficient data in the time of interest. Can you find yours?
Is the waste per capita correlated with the GDP per capita?
The hypothesis is that a “rich” society consumes more and thereby produces more waste. As we have already combined the waste data with population and GDP values for each year and country, we can easily test this.
Main Story Line
If we throw all the data into the correlation, we do in fact see a positive correlation of approx. \(\rho = 0.76\).
The following is an interactive visualization of all the years. Can you find your country? Feel free to zoom in and play around.
As a last part, let’s see how the two variables are correlated within each country. The following table is limited to countries that had at least 5 data points over 20 years and showed a significant correlation (Spearman) after p-value correction:
Interestingly, in four countries there is a significant negative correlation. The vast majority of countries support the hypothesis that a higher GDP comes with higher waste production. [8]
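A per-country Spearman test with p-value correction could look like the following base R sketch. The data are made up, and the Holm method is an assumption, since the text does not say which correction was used.

```r
# Made-up toy data: one country with rising, one with falling waste_rel.
d <- data.frame(
  Country   = rep(c("A", "B"), each = 6),
  GDPpc     = c(10, 12, 15, 18, 22, 27, 30, 31, 33, 36, 40, 45) * 1000,
  waste_rel = c(0.30, 0.33, 0.35, 0.42, 0.40, 0.47,
                0.60, 0.58, 0.57, 0.52, 0.50, 0.46),
  stringsAsFactors = FALSE
)

by_country <- lapply(split(d, d$Country), function(g) {
  # exact = FALSE uses the asymptotic test and avoids warnings about ties.
  ct <- cor.test(g$GDPpc, g$waste_rel, method = "spearman", exact = FALSE)
  data.frame(Country = g$Country[1], n = nrow(g),
             rho = unname(ct$estimate), p = ct$p.value,
             stringsAsFactors = FALSE)
})
res <- do.call(rbind, by_country)

# Assumption: Holm correction; the original method is not stated.
res$p_adj <- p.adjust(res$p, method = "holm")

# Same filter as in the text: at least 5 data points and significant.
significant <- res[res$n >= 5 & res$p_adj < 0.05, ]
```

Correcting the p-values matters here because one test is run per country, so some "significant" results would otherwise be expected by chance alone.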
Below deck
To better compare the figures I analyzed the distributions of the values, first visually…
Both visually and according to the Shapiro-Wilk test, we can safely reject the null hypothesis that each of the variables is normally distributed.
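For illustration, this is how such a normality check looks in base R. The values are a made-up, strongly right-skewed sample standing in for one of the variables, not the real data.

```r
# Made-up, strongly right-skewed values (stand-in for e.g. GDP per capita).
x <- exp(seq(0, 5, length.out = 50))

# Shapiro-Wilk test: the null hypothesis is that x is normally distributed.
sw <- shapiro.test(x)
reject_normality <- sw$p.value < 0.05
```

A rejected normality assumption is a common reason to prefer a rank-based measure such as Spearman’s correlation over Pearson’s.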
Conclusions
Waste is a global problem and the yearly collected amounts seem to increase steadily [9].
On average (across all reporting nations), between 250 kg and 300 kg of municipal waste was collected per person each year, with a few countries reaching 2-3 tonnes per capita per year.
In some countries there is a positive turnaround with decreasing numbers of reported waste in recent years. Let’s hope this is due to less waste production, not due to less waste collection.
In most countries there is a correlation between the GDP per capita and the collected waste per capita.
[2] Source: “Municipal waste collected.xlsx”, retrieved on 2021-01-30.
[3] Actually, there are, since not every country reported every year. But since these “missing reports” do not show up in the dataset, it does not include any NAs.
[6] See the introductory section on the UN data to learn what’s included.
[7] I.e. 2003 through 2007, as far as values are available for that period.
[8] Well, there is another possibility: as the reported values are waste collected, a higher GDP could just mean that the communities can afford to collect the garbage more consistently. But finding the true causality is not possible in this blog post with the available data, so in the end it remains a correlation, nothing more.
[9] Ignoring a probable outlier in 2015, the last of the analysed years.