29  Pivot table

import pandas as pd
import plotly.express as px
f = '../../data/Temixco_2018_10Min.parquet'
tmx = pd.read_parquet(f)
tmx.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 52560 entries, 2018-01-01 00:00:00 to 2018-12-31 23:50:00
Data columns (total 7 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Ib      52423 non-null  float64
 1   Ig      52423 non-null  float64
 2   To      52560 non-null  float64
 3   RH      52560 non-null  float64
 4   WS      52560 non-null  float64
 5   WD      52560 non-null  float64
 6   P       52560 non-null  float64
dtypes: float64(7)
memory usage: 3.2 MB
tmx
Ib Ig To RH WS WD P
time
2018-01-01 00:00:00 NaN NaN 18.70 36.34 1.422 316.0 87864.11
2018-01-01 00:10:00 0.002 0.0 18.95 35.29 1.008 283.7 87876.37
2018-01-01 00:20:00 0.170 0.0 18.94 35.43 1.565 326.0 87888.64
2018-01-01 00:30:00 0.371 0.0 18.77 35.89 2.175 354.5 87887.21
2018-01-01 00:40:00 0.305 0.0 18.81 36.34 1.902 348.0 87886.91
... ... ... ... ... ... ... ...
2018-12-31 23:10:00 0.125 0.0 18.51 47.29 1.715 332.2 87484.32
2018-12-31 23:20:00 0.000 0.0 18.26 48.02 1.703 320.5 87470.70
2018-12-31 23:30:00 0.044 0.0 18.39 46.84 2.887 335.7 87455.03
2018-12-31 23:40:00 0.170 0.0 17.99 47.85 1.528 358.8 87470.02
2018-12-31 23:50:00 0.003 0.0 17.75 49.65 0.598 322.3 87467.29

52560 rows × 7 columns

29.1 Usando pivot_table vamos a crear un mapa de calor o heatmap de la temperatura promedio horario para cada mes.


# Extraer el mes y la hora de la columna de tiempo
tmx['month'] = tmx.index.month
tmx['hour'] = tmx.index.hour

# Crear una tabla dinámica con el promedio de temperatura por mes y hora
tabla_To = tmx.pivot_table(values='To', index='hour', columns='month', aggfunc='mean')
tabla_To
month 1 2 3 4 5 6 7 8 9 10 11 12
hour
0 17.797527 21.013452 23.430914 23.038722 24.222097 20.725278 21.910376 19.661129 20.025889 20.056183 18.505333 17.276828
1 17.038118 20.359226 22.676129 22.344833 23.548548 20.424667 21.210376 19.261613 19.807389 19.713333 18.061167 16.754785
2 16.406237 19.759821 21.900699 21.774000 22.782258 20.069056 20.591774 19.024462 19.327056 19.205269 17.666278 16.252204
3 15.751398 19.020060 21.246237 21.230556 22.294194 19.710833 19.907366 18.792688 18.959778 18.844086 17.328056 15.943118
4 15.315054 18.469524 20.674140 20.588556 21.784194 19.380722 19.424785 18.455269 18.625000 18.541075 16.964611 15.516828
5 14.731774 17.912083 20.108011 19.925944 21.264194 18.967167 18.919462 18.220645 18.321444 18.254355 16.646056 15.051774
6 14.243011 17.352143 19.515806 19.667333 21.011505 18.926611 18.629946 17.999570 18.017833 17.951774 16.259444 14.591559
7 14.144946 17.177262 19.789355 20.778000 22.228871 20.020667 20.064624 19.058387 18.998111 18.653280 16.766722 14.611183
8 16.113602 19.568750 22.644624 23.237944 24.060968 21.398222 22.405538 20.913065 20.966111 21.080484 19.197944 17.168441
9 18.582742 21.940238 24.034892 24.313667 25.332473 22.435333 23.276720 21.888333 22.225611 22.462849 20.750056 19.405215
10 19.662634 23.101071 25.281505 25.626611 26.667097 23.423889 24.267634 22.873763 23.141333 23.405806 21.768278 20.582419
11 21.047473 24.426488 26.656935 27.097667 28.177581 24.520944 25.400108 23.957204 24.182333 24.279140 22.837556 21.801398
12 22.511989 25.710119 28.073656 28.414556 29.548871 25.413778 26.383226 24.956237 25.001167 25.309516 23.931389 23.073656
13 23.698978 26.940714 29.319570 29.703389 30.806183 26.349556 27.326828 25.764624 25.927500 26.102043 24.886944 24.227366
14 24.734032 27.736548 30.375269 30.679944 31.759892 27.109611 28.186022 26.515323 26.632722 26.756613 25.510556 25.285806
15 25.394462 28.568571 31.080108 31.150833 32.473441 27.594333 28.685806 26.903333 26.946944 27.080699 25.755167 25.810376
16 25.626290 28.967262 31.513226 31.264611 32.682849 27.448889 29.004677 26.933817 26.729833 26.867634 25.694111 25.882366
17 25.271613 28.759226 31.464624 30.977500 32.501290 27.067222 28.620968 26.318441 26.234611 26.215591 24.991667 25.286183
18 23.819355 27.769583 30.556720 30.136556 31.797581 26.396333 27.632957 25.405484 25.101833 24.513548 22.842056 23.088602
19 21.203441 25.858690 29.009516 28.896667 30.441667 25.065944 25.984355 23.872527 23.471500 22.615108 20.710333 20.438978
20 20.058065 24.091726 27.188172 27.495611 28.625753 23.667278 24.360054 22.503011 22.514444 21.817903 20.161111 19.482151
21 19.655054 23.379286 26.053441 25.832722 27.127043 22.594222 23.418441 21.286989 21.853778 20.983871 19.688611 18.848763
22 19.170968 22.675655 25.015699 24.846111 25.843065 21.769444 23.231505 20.366559 21.297111 20.616505 19.376778 18.376667
23 18.374194 21.880179 24.181935 23.692333 24.967581 21.095611 22.748172 19.931344 20.678333 20.259355 18.986722 17.964355
fig = px.imshow(
    tabla_To,
    aspect="auto",
    x=['Enero', 'Febrero', 'Marzo', 'Abril', 'Mayo', 'Junio', 'Julio', 'Agosto', 'Septiembre', 'Octubre', 'Noviembre', 'Diciembre'],
    y=[str(hour) + ':00' for hour in range(24)],
    color_continuous_scale='jet',
)
fig.show()

29.2 Usando pivot_table y groupby vamos a obtener los 10 paises que m’as contaminan desde el 2000 y vamos a visualizar sus emisiones por año.

f = '../../data/owid-co2-data.csv'
co2 = pd.read_csv(f)
co2
country year iso_code population gdp cement_co2 cement_co2_per_capita co2 co2_growth_abs co2_growth_prct ... share_global_other_co2 share_of_temperature_change_from_ghg temperature_change_from_ch4 temperature_change_from_co2 temperature_change_from_ghg temperature_change_from_n2o total_ghg total_ghg_excluding_lucf trade_co2 trade_co2_share
0 Afghanistan 1850 AFG 3752993.0 NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 Afghanistan 1851 AFG 3767956.0 NaN NaN NaN NaN NaN NaN ... NaN 0.157 0.000 0.000 0.000 0.0 NaN NaN NaN NaN
2 Afghanistan 1852 AFG 3783940.0 NaN NaN NaN NaN NaN NaN ... NaN 0.156 0.000 0.000 0.000 0.0 NaN NaN NaN NaN
3 Afghanistan 1853 AFG 3800954.0 NaN NaN NaN NaN NaN NaN ... NaN 0.156 0.000 0.000 0.000 0.0 NaN NaN NaN NaN
4 Afghanistan 1854 AFG 3818038.0 NaN NaN NaN NaN NaN NaN ... NaN 0.155 0.000 0.000 0.000 0.0 NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
47410 Zimbabwe 2018 ZWE 15052191.0 2.678627e+10 0.558 0.037 10.715 1.419 15.265 ... NaN 0.114 0.001 0.001 0.002 0.0 116.76 29.37 -0.088 -0.825
47411 Zimbabwe 2019 ZWE 15354606.0 2.514642e+10 0.473 0.031 9.775 -0.939 -8.765 ... NaN 0.113 0.001 0.001 0.002 0.0 116.03 28.70 0.143 1.463
47412 Zimbabwe 2020 ZWE 15669663.0 2.317871e+10 0.496 0.032 7.850 -1.926 -19.700 ... NaN 0.112 0.001 0.001 0.002 0.0 113.20 25.99 0.818 10.421
47413 Zimbabwe 2021 ZWE 15993525.0 2.514009e+10 0.531 0.033 8.396 0.547 6.962 ... NaN 0.110 0.001 0.001 0.002 0.0 NaN NaN 1.088 12.956
47414 Zimbabwe 2022 ZWE 16320539.0 2.590159e+10 0.531 0.033 8.856 0.460 5.477 ... NaN 0.110 0.001 0.001 0.002 0.0 NaN NaN NaN NaN

47415 rows × 79 columns

co2 = co2.loc[co2.year>=2000]
co2
country year iso_code population gdp cement_co2 cement_co2_per_capita co2 co2_growth_abs co2_growth_prct ... share_global_other_co2 share_of_temperature_change_from_ghg temperature_change_from_ch4 temperature_change_from_co2 temperature_change_from_ghg temperature_change_from_n2o total_ghg total_ghg_excluding_lucf trade_co2 trade_co2_share
150 Afghanistan 2000 AFG 19542986.0 1.128379e+10 0.010 0.001 1.047 -0.045 -4.078 ... NaN 0.079 0.000 0.000 0.001 0.0 11.82 14.21 NaN NaN
151 Afghanistan 2001 AFG 19688634.0 1.102127e+10 0.007 0.000 1.069 0.022 2.098 ... NaN 0.078 0.000 0.000 0.001 0.0 12.73 12.60 NaN NaN
152 Afghanistan 2002 AFG 21000258.0 1.880487e+10 0.011 0.001 1.341 0.272 25.432 ... NaN 0.078 0.000 0.000 0.001 0.0 15.38 15.26 NaN NaN
153 Afghanistan 2003 AFG 22645136.0 2.107434e+10 0.010 0.000 1.560 0.219 16.302 ... NaN 0.077 0.000 0.000 0.001 0.0 16.08 15.96 NaN NaN
154 Afghanistan 2004 AFG 23553554.0 2.233257e+10 0.010 0.000 1.237 -0.322 -20.669 ... NaN 0.077 0.000 0.000 0.001 0.0 15.69 15.57 NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
47410 Zimbabwe 2018 ZWE 15052191.0 2.678627e+10 0.558 0.037 10.715 1.419 15.265 ... NaN 0.114 0.001 0.001 0.002 0.0 116.76 29.37 -0.088 -0.825
47411 Zimbabwe 2019 ZWE 15354606.0 2.514642e+10 0.473 0.031 9.775 -0.939 -8.765 ... NaN 0.113 0.001 0.001 0.002 0.0 116.03 28.70 0.143 1.463
47412 Zimbabwe 2020 ZWE 15669663.0 2.317871e+10 0.496 0.032 7.850 -1.926 -19.700 ... NaN 0.112 0.001 0.001 0.002 0.0 113.20 25.99 0.818 10.421
47413 Zimbabwe 2021 ZWE 15993525.0 2.514009e+10 0.531 0.033 8.396 0.547 6.962 ... NaN 0.110 0.001 0.001 0.002 0.0 NaN NaN 1.088 12.956
47414 Zimbabwe 2022 ZWE 16320539.0 2.590159e+10 0.531 0.033 8.856 0.460 5.477 ... NaN 0.110 0.001 0.001 0.002 0.0 NaN NaN NaN NaN

5888 rows × 79 columns

co2[['co2','iso_code']].groupby('iso_code').sum().sort_values(by='co2',ascending=False).head(10)
co2
iso_code
CHN 188544.441
USA 128562.231
IND 41804.773
RUS 37243.672
JPN 27995.696
DEU 18652.725
KOR 13109.904
CAN 12991.416
IRN 12860.067
SAU 11929.307
top_10 = co2[['co2','iso_code']].groupby('iso_code').sum().sort_values(by='co2',ascending=False).head(10).index
top_10
Index(['CHN', 'USA', 'IND', 'RUS', 'JPN', 'DEU', 'KOR', 'CAN', 'IRN', 'SAU'], dtype='object', name='iso_code')
co2_tabla_paises = co2.pivot_table(values='co2',index='iso_code',columns='year',aggfunc='sum')
co2_tabla_paises.loc[top_10]
year 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 ... 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022
iso_code
CHN 3649.201 3728.513 4103.042 4841.119 5217.351 5882.143 6494.338 6983.577 7501.498 7891.089 ... 9956.376 9998.674 9866.951 9765.029 10011.151 10353.935 10721.042 10914.012 11336.233 11396.777
USA 6010.136 5907.740 5946.308 6010.146 6112.655 6132.183 6052.686 6130.123 5915.119 5480.726 ... 5480.157 5528.681 5376.473 5252.932 5212.162 5377.797 5262.145 4714.628 5032.213 5057.304
IND 977.526 990.970 1021.664 1059.160 1125.096 1185.674 1292.485 1392.506 1489.437 1612.216 ... 1995.098 2148.344 2234.219 2354.658 2426.607 2593.058 2612.888 2421.552 2674.222 2829.644
RUS 1479.142 1515.647 1506.788 1537.018 1543.957 1562.577 1623.002 1623.781 1652.064 1545.034 ... 1640.437 1639.959 1638.675 1634.885 1666.121 1712.494 1705.031 1632.929 1711.993 1652.177
JPN 1263.755 1249.162 1278.787 1287.292 1282.686 1290.145 1267.118 1302.837 1232.014 1163.057 ... 1315.192 1264.072 1223.169 1202.454 1186.802 1141.669 1104.540 1039.796 1062.129 1053.798
DEU 898.938 915.242 898.835 899.858 885.633 865.471 877.498 850.230 852.858 788.286 ... 833.804 794.738 798.085 801.745 785.986 754.811 707.491 647.252 678.799 665.605
KOR 439.981 455.631 476.065 485.213 491.389 498.688 503.478 521.709 532.651 537.820 ... 634.581 629.366 634.177 637.914 654.537 670.169 646.102 597.634 616.075 600.999
CAN 567.096 559.147 564.374 581.427 579.689 574.764 568.585 593.755 576.809 544.140 ... 569.540 566.353 570.680 557.690 566.652 577.066 578.588 522.845 537.174 547.944
IRN 364.302 386.158 393.410 407.077 437.679 462.003 496.070 502.565 522.557 536.605 ... 597.952 632.673 631.581 633.827 685.401 710.215 694.667 679.007 688.076 690.635
SAU 302.328 302.703 331.980 333.991 401.013 402.848 439.117 394.103 440.344 473.215 ... 544.564 612.842 678.849 697.420 680.854 686.896 707.125 610.773 631.437 662.549

10 rows × 23 columns

fig = px.imshow(
    co2_tabla_paises.loc[top_10],
    color_continuous_scale='jet',    
    aspect='auto',
)
fig.show()

Ejercicios sugeridos:

  1. Haz una gráfica con los valores máximos horarios para la To y Radiación global (Ig)
  2. Repite la gráfica de To usando matplotlib para que compares la diferencia en el código
  3. Haz una gráfica del cambio porcentual para visualizar el co2_per_capita