45  Multi-index on the columns of time series

When analyzing data with Pandas, organizing and structuring the data well makes exploration and analysis much easier. This session introduces the multi-index, also known as a hierarchical index, which lets a DataFrame carry several index levels. Applied to the columns, a multi-index groups related variables under a shared top-level label, so complex data stays easy to access and to read. This is especially useful for time-series datasets, such as the two cases (blanco and negro) loaded below, which record the same four variables over the same dates.
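As a minimal sketch of the idea, the snippet below uses two small synthetic frames instead of the chapter's CSV files. The names `white` and `black` and all values are made up for illustration. Passing `keys` to `pd.concat` places each frame's columns under its own top-level label:

```python
import pandas as pd

# Synthetic 10-minute time index; the values are illustrative only.
idx = pd.date_range("2006-01-01 00:10", periods=3, freq="10min")
white = pd.DataFrame({"Ti_C": [19.0, 19.1, 19.2], "Ti_S": [19.3, 19.4, 19.5]}, index=idx)
black = pd.DataFrame({"Ti_C": [20.0, 20.1, 20.2], "Ti_S": [20.3, 20.4, 20.5]}, index=idx)

# keys= adds an outer column level, so every column label becomes a pair.
both = pd.concat([white, black], axis=1, keys=["white", "black"])

print(both.columns.nlevels)    # number of column levels
print(both["black"]["Ti_C"])   # select a case first, then a variable
```

The same pattern is applied to the real data in the cells that follow.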

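import pandas as pd
import matplotlib.pyplot as plt

# Load the "blanco" (white) case; the first column is parsed as a datetime index
f = '../data/Ti_blanco.csv'
Ti_b = pd.read_csv(f, index_col=0, parse_dates=True)

# Load the "negro" (black) case the same way
f = '../data/Ti_negro.csv'
Ti_n = pd.read_csv(f, index_col=0, parse_dates=True)
Ti_n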
Ti_C Ti_R1 Ti_R2 Ti_S
date
2006-01-01 00:10:00 19.038413 19.047696 19.087096 19.129994
2006-01-01 00:20:00 19.035998 19.045313 19.084383 19.128110
2006-01-01 00:30:00 19.033880 19.043235 19.082035 19.126533
2006-01-01 00:40:00 19.031973 19.041355 19.079928 19.125138
2006-01-01 00:50:00 19.030158 19.039552 19.077921 19.123787
... ... ... ... ...
2006-12-31 23:20:00 24.137526 23.391216 23.667201 24.646797
2006-12-31 23:30:00 24.064361 23.323456 23.589420 24.567420
2006-12-31 23:40:00 23.990232 23.257064 23.512465 24.488597
2006-12-31 23:50:00 23.917165 23.191858 23.436305 24.410237
2007-01-01 00:00:00 23.846178 23.127226 23.360837 24.332185

52560 rows × 4 columns

Ti_b
Ti_C Ti_R1 Ti_R2 Ti_S
date
2006-01-01 00:10:00 19.014610 19.024472 19.065057 19.106503
2006-01-01 00:20:00 19.012693 19.022505 19.062782 19.104997
2006-01-01 00:30:00 19.011030 19.020820 19.060846 19.103773
2006-01-01 00:40:00 19.009526 19.019312 19.059129 19.102714
2006-01-01 00:50:00 19.008070 19.017864 19.057494 19.101684
... ... ... ... ...
2006-12-31 23:20:00 19.589577 19.517255 19.780842 19.902851
2006-12-31 23:30:00 19.572531 19.500154 19.757842 19.882044
2006-12-31 23:40:00 19.555469 19.482987 19.734803 19.861207
2006-12-31 23:50:00 19.538364 19.465734 19.711720 19.840302
2007-01-01 00:00:00 19.521151 19.448329 19.688533 19.819254

52560 rows × 4 columns

# Side-by-side concatenation: the column names repeat, so the two cases cannot be told apart
pd.concat([Ti_b, Ti_n], axis=1)
Ti_C Ti_R1 Ti_R2 Ti_S Ti_C Ti_R1 Ti_R2 Ti_S
date
2006-01-01 00:10:00 19.014610 19.024472 19.065057 19.106503 19.038413 19.047696 19.087096 19.129994
2006-01-01 00:20:00 19.012693 19.022505 19.062782 19.104997 19.035998 19.045313 19.084383 19.128110
2006-01-01 00:30:00 19.011030 19.020820 19.060846 19.103773 19.033880 19.043235 19.082035 19.126533
2006-01-01 00:40:00 19.009526 19.019312 19.059129 19.102714 19.031973 19.041355 19.079928 19.125138
2006-01-01 00:50:00 19.008070 19.017864 19.057494 19.101684 19.030158 19.039552 19.077921 19.123787
... ... ... ... ... ... ... ... ...
2006-12-31 23:20:00 19.589577 19.517255 19.780842 19.902851 24.137526 23.391216 23.667201 24.646797
2006-12-31 23:30:00 19.572531 19.500154 19.757842 19.882044 24.064361 23.323456 23.589420 24.567420
2006-12-31 23:40:00 19.555469 19.482987 19.734803 19.861207 23.990232 23.257064 23.512465 24.488597
2006-12-31 23:50:00 19.538364 19.465734 19.711720 19.840302 23.917165 23.191858 23.436305 24.410237
2007-01-01 00:00:00 19.521151 19.448329 19.688533 19.819254 23.846178 23.127226 23.360837 24.332185

52560 rows × 8 columns

# keys adds an outer column level that labels the frame each column came from
pd.concat([Ti_b, Ti_n], axis=1, keys=['blanco', 'negro'])
                        blanco                                       negro
                          Ti_C      Ti_R1      Ti_R2       Ti_S       Ti_C      Ti_R1      Ti_R2       Ti_S
date
2006-01-01 00:10:00  19.014610  19.024472  19.065057  19.106503  19.038413  19.047696  19.087096  19.129994
2006-01-01 00:20:00  19.012693  19.022505  19.062782  19.104997  19.035998  19.045313  19.084383  19.128110
2006-01-01 00:30:00  19.011030  19.020820  19.060846  19.103773  19.033880  19.043235  19.082035  19.126533
2006-01-01 00:40:00  19.009526  19.019312  19.059129  19.102714  19.031973  19.041355  19.079928  19.125138
2006-01-01 00:50:00  19.008070  19.017864  19.057494  19.101684  19.030158  19.039552  19.077921  19.123787
...                        ...        ...        ...        ...        ...        ...        ...        ...
2006-12-31 23:20:00  19.589577  19.517255  19.780842  19.902851  24.137526  23.391216  23.667201  24.646797
2006-12-31 23:30:00  19.572531  19.500154  19.757842  19.882044  24.064361  23.323456  23.589420  24.567420
2006-12-31 23:40:00  19.555469  19.482987  19.734803  19.861207  23.990232  23.257064  23.512465  24.488597
2006-12-31 23:50:00  19.538364  19.465734  19.711720  19.840302  23.917165  23.191858  23.436305  24.410237
2007-01-01 00:00:00  19.521151  19.448329  19.688533  19.819254  23.846178  23.127226  23.360837  24.332185

52560 rows × 8 columns

casos = pd.concat([Ti_b,Ti_n],axis=1,keys=['blanco','negro'])
casos['blanco']
Ti_C Ti_R1 Ti_R2 Ti_S
date
2006-01-01 00:10:00 19.014610 19.024472 19.065057 19.106503
2006-01-01 00:20:00 19.012693 19.022505 19.062782 19.104997
2006-01-01 00:30:00 19.011030 19.020820 19.060846 19.103773
2006-01-01 00:40:00 19.009526 19.019312 19.059129 19.102714
2006-01-01 00:50:00 19.008070 19.017864 19.057494 19.101684
... ... ... ... ...
2006-12-31 23:20:00 19.589577 19.517255 19.780842 19.902851
2006-12-31 23:30:00 19.572531 19.500154 19.757842 19.882044
2006-12-31 23:40:00 19.555469 19.482987 19.734803 19.861207
2006-12-31 23:50:00 19.538364 19.465734 19.711720 19.840302
2007-01-01 00:00:00 19.521151 19.448329 19.688533 19.819254

52560 rows × 4 columns

casos['negro']
Ti_C Ti_R1 Ti_R2 Ti_S
date
2006-01-01 00:10:00 19.038413 19.047696 19.087096 19.129994
2006-01-01 00:20:00 19.035998 19.045313 19.084383 19.128110
2006-01-01 00:30:00 19.033880 19.043235 19.082035 19.126533
2006-01-01 00:40:00 19.031973 19.041355 19.079928 19.125138
2006-01-01 00:50:00 19.030158 19.039552 19.077921 19.123787
... ... ... ... ...
2006-12-31 23:20:00 24.137526 23.391216 23.667201 24.646797
2006-12-31 23:30:00 24.064361 23.323456 23.589420 24.567420
2006-12-31 23:40:00 23.990232 23.257064 23.512465 24.488597
2006-12-31 23:50:00 23.917165 23.191858 23.436305 24.410237
2007-01-01 00:00:00 23.846178 23.127226 23.360837 24.332185

52560 rows × 4 columns
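Selecting a case with `casos['blanco']` and then a variable with a second pair of brackets is one way to reach a single column. The sketch below shows two alternatives: tuple indexing for one column, and `DataFrame.xs` for taking one variable across all cases. It runs on a small synthetic frame; `demo` and its values are illustrative and are not the chapter's data.

```python
import pandas as pd

idx = pd.date_range("2006-01-01 00:10", periods=2, freq="10min")
cols = pd.MultiIndex.from_product([["blanco", "negro"], ["Ti_C", "Ti_S"]])
demo = pd.DataFrame([[19.0, 19.1, 19.2, 19.3],
                     [19.4, 19.5, 19.6, 19.7]], index=idx, columns=cols)

# A tuple addresses one column directly: (outer level, inner level).
one = demo[("negro", "Ti_C")]

# xs with level=1 keeps every case but only the Ti_C variable;
# the inner level is dropped, so the result has plain column names.
ti_c = demo.xs("Ti_C", axis=1, level=1)
print(ti_c)
```

The `xs` form is convenient when the same variable has to be compared across every case at once.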

fig, ax = plt.subplots(figsize=(10,4))

# Compare the same variable (Ti_C) in both cases
ax.plot(casos['blanco']['Ti_C'], label='blanco')
ax.plot(casos['negro']['Ti_C'], label='negro')
ax.legend()

casos.columns.get_level_values(0).unique()
Index(['blanco', 'negro'], dtype='object')
casos.columns.levels
FrozenList([['blanco', 'negro'], ['Ti_C', 'Ti_R1', 'Ti_R2', 'Ti_S']])
casos.columns.levels[0]
Index(['blanco', 'negro'], dtype='object')
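The two ways of listing the outer labels shown above agree here, but they can differ once columns are removed: `levels` stores the labels the index was defined with, while `get_level_values` reports the labels actually in use. A minimal sketch on a synthetic frame (`demo` and its values are illustrative):

```python
import pandas as pd

cols = pd.MultiIndex.from_product([["blanco", "negro"], ["Ti_C", "Ti_S"]])
demo = pd.DataFrame([[1.0, 2.0, 3.0, 4.0]], columns=cols)

# Keep only the 'blanco' columns while preserving both column levels.
sub = demo[["blanco"]]

# levels may still list 'negro', because unused labels are not pruned automatically.
print(list(sub.columns.levels[0]))

# get_level_values reports only the labels present in the columns.
present = list(sub.columns.get_level_values(0).unique())
print(present)

# remove_unused_levels rebuilds the index with only the labels in use.
pruned = list(sub.columns.remove_unused_levels().levels[0])
print(pruned)
```

When looping over cases after any filtering step, `get_level_values(0).unique()` is therefore the safer choice.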
colores = casos.columns.levels[0]

fig, ax = plt.subplots(figsize=(10,4))

# One line per case, taken from the outer column level
for color in colores:
    ax.plot(casos[color]['Ti_C'], label=color)
ax.legend()
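The loop above scales to any number of cases without changing the plotting code. The following self-contained sketch repeats the pattern on synthetic data, since the CSV files are not assumed to be available. `demo` and its values are illustrative, and the `Agg` backend line is only there so the example runs without a display (it can be dropped inside a notebook).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

idx = pd.date_range("2006-01-01 00:10", periods=3, freq="10min")
cols = pd.MultiIndex.from_product([["blanco", "negro"], ["Ti_C", "Ti_S"]])
demo = pd.DataFrame([[19.0, 19.1, 19.2, 19.3],
                     [19.1, 19.2, 19.4, 19.5],
                     [19.2, 19.3, 19.6, 19.7]], index=idx, columns=cols)

fig, ax = plt.subplots(figsize=(10, 4))

# Iterate over the outer labels actually present and plot Ti_C for each case.
for case in demo.columns.get_level_values(0).unique():
    ax.plot(demo[(case, "Ti_C")], label=case)

ax.set_ylabel("Ti_C")
ax.legend()
```

Labeling each line with its case name lets the legend identify the curves without relying on the default color order.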