44  Concatena archivos con Pandas

En esta sesión, exploraremos una herramienta para esta tarea: la función pd.concat de Pandas. Nos centraremos en su aplicación en el contexto de series temporales y la importancia de entender la opción axis y su efecto en la concatenación. Al utilizar pd.concat, podemos elegir entre concatenar a lo largo del eje de las filas (axis=0) o a lo largo del eje de las columnas (axis=1). Exploraremos cómo estas opciones son fundamentales al unir series temporales de diferentes fuentes para crear una secuencia temporal continua o para combinar series temporales paralelas manteniendo sus índices de tiempo sincronizados.

import pandas as pd
pd.concat()
SyntaxError: invalid syntax (1465400318.py, line 1)
f = '../data/esolmet/001.xls'
e1 = pd.read_excel(f,index_col=0,parse_dates=True,skiprows=[0,1,3,4])
e1.head()
Directa Global Difusa UV Temperatura Humedad Viento Vel. Presiòn Bar.
Fecha
2018-01-01 00:00:00 0.057 0.0 0.0 0.001 18.93 41.57 1.253 879.0692
2018-01-01 00:10:00 0.002 0.0 0.0 0.001 18.76 41.00 0.418 879.4363
2018-01-01 00:20:00 0.170 0.0 0.0 0.001 18.92 40.96 0.955 879.5181
2018-01-01 00:30:00 0.371 0.0 0.0 0.001 18.52 42.46 1.823 879.5826
2018-01-01 00:40:00 0.305 0.0 0.0 0.001 18.49 42.43 2.149 879.6826
f = '../data/esolmet/002.xls'
e2 = pd.read_excel(f,index_col=0,parse_dates=True,skiprows=[0,1,3,4])
e2.head()
Directa Global Difusa UV Temperatura Humedad Viento Vel. Presiòn Bar.
Fecha
2018-02-01 00:00:00 0.003 0.0 0.0 0.001 15.38 53.71 1.531 878.2341
2018-02-01 00:10:00 0.313 0.0 0.0 0.001 15.71 50.03 2.339 878.1605
2018-02-01 00:20:00 0.029 0.0 0.0 0.001 15.57 51.52 2.836 877.9796
2018-02-01 00:30:00 0.013 0.0 0.0 0.001 15.23 52.00 2.780 877.9725
2018-02-01 00:40:00 0.001 0.0 0.0 0.001 15.28 51.80 2.541 878.0463
e = pd.concat([e2,e1])
e
Directa Global Difusa UV Temperatura Humedad Viento Vel. Presiòn Bar.
Fecha
2018-02-01 00:00:00 0.003 0.0 0.0 0.001 15.38 53.71 1.531 878.2341
2018-02-01 00:10:00 0.313 0.0 0.0 0.001 15.71 50.03 2.339 878.1605
2018-02-01 00:20:00 0.029 0.0 0.0 0.001 15.57 51.52 2.836 877.9796
2018-02-01 00:30:00 0.013 0.0 0.0 0.001 15.23 52.00 2.780 877.9725
2018-02-01 00:40:00 0.001 0.0 0.0 0.001 15.28 51.80 2.541 878.0463
... ... ... ... ... ... ... ... ...
2018-01-31 23:20:00 0.000 0.0 0.0 0.001 15.82 52.51 2.070 878.3740
2018-01-31 23:30:00 0.000 0.0 0.0 0.001 15.42 53.71 1.581 878.5696
2018-01-31 23:40:00 0.081 0.0 0.0 0.001 15.28 53.26 2.083 878.6181
2018-01-31 23:50:00 0.075 0.0 0.0 0.001 15.32 52.38 2.500 878.3397
2018-02-01 00:00:00 0.003 0.0 0.0 0.001 15.38 53.71 1.531 878.2341

8498 rows × 8 columns

e.sort_index()
Directa Global Difusa UV Temperatura Humedad Viento Vel. Presiòn Bar.
Fecha
2018-01-01 00:00:00 0.057 0.0 0.0 0.001 18.93 41.57 1.253 879.0692
2018-01-01 00:10:00 0.002 0.0 0.0 0.001 18.76 41.00 0.418 879.4363
2018-01-01 00:20:00 0.170 0.0 0.0 0.001 18.92 40.96 0.955 879.5181
2018-01-01 00:30:00 0.371 0.0 0.0 0.001 18.52 42.46 1.823 879.5826
2018-01-01 00:40:00 0.305 0.0 0.0 0.001 18.49 42.43 2.149 879.6826
... ... ... ... ... ... ... ... ...
2018-02-28 23:20:00 0.000 0.0 0.0 0.000 22.49 38.44 1.811 874.8154
2018-02-28 23:30:00 0.000 0.0 0.0 0.000 22.75 39.60 0.859 874.8812
2018-02-28 23:40:00 0.003 0.0 0.0 0.000 22.34 39.43 1.244 875.0685
2018-02-28 23:50:00 0.117 0.0 0.0 0.000 22.52 39.90 1.729 874.9626
2018-03-01 00:00:00 0.004 0.0 0.0 0.000 22.35 41.03 1.508 875.0115

8498 rows × 8 columns

e
Directa Global Difusa UV Temperatura Humedad Viento Vel. Presiòn Bar.
Fecha
2018-02-01 00:00:00 0.003 0.0 0.0 0.001 15.38 53.71 1.531 878.2341
2018-02-01 00:10:00 0.313 0.0 0.0 0.001 15.71 50.03 2.339 878.1605
2018-02-01 00:20:00 0.029 0.0 0.0 0.001 15.57 51.52 2.836 877.9796
2018-02-01 00:30:00 0.013 0.0 0.0 0.001 15.23 52.00 2.780 877.9725
2018-02-01 00:40:00 0.001 0.0 0.0 0.001 15.28 51.80 2.541 878.0463
... ... ... ... ... ... ... ... ...
2018-01-31 23:20:00 0.000 0.0 0.0 0.001 15.82 52.51 2.070 878.3740
2018-01-31 23:30:00 0.000 0.0 0.0 0.001 15.42 53.71 1.581 878.5696
2018-01-31 23:40:00 0.081 0.0 0.0 0.001 15.28 53.26 2.083 878.6181
2018-01-31 23:50:00 0.075 0.0 0.0 0.001 15.32 52.38 2.500 878.3397
2018-02-01 00:00:00 0.003 0.0 0.0 0.001 15.38 53.71 1.531 878.2341

8498 rows × 8 columns

e.sort_index(inplace=True)
e
Directa Global Difusa UV Temperatura Humedad Viento Vel. Presiòn Bar.
Fecha
2018-01-01 00:00:00 0.057 0.0 0.0 0.001 18.93 41.57 1.253 879.0692
2018-01-01 00:10:00 0.002 0.0 0.0 0.001 18.76 41.00 0.418 879.4363
2018-01-01 00:20:00 0.170 0.0 0.0 0.001 18.92 40.96 0.955 879.5181
2018-01-01 00:30:00 0.371 0.0 0.0 0.001 18.52 42.46 1.823 879.5826
2018-01-01 00:40:00 0.305 0.0 0.0 0.001 18.49 42.43 2.149 879.6826
... ... ... ... ... ... ... ... ...
2018-02-28 23:20:00 0.000 0.0 0.0 0.000 22.49 38.44 1.811 874.8154
2018-02-28 23:30:00 0.000 0.0 0.0 0.000 22.75 39.60 0.859 874.8812
2018-02-28 23:40:00 0.003 0.0 0.0 0.000 22.34 39.43 1.244 875.0685
2018-02-28 23:50:00 0.117 0.0 0.0 0.000 22.52 39.90 1.729 874.9626
2018-03-01 00:00:00 0.004 0.0 0.0 0.000 22.35 41.03 1.508 875.0115

8498 rows × 8 columns

pd.concat([e2,e1],axis=1)
Directa Global Difusa UV Temperatura Humedad Viento Vel. Presiòn Bar. Directa Global Difusa UV Temperatura Humedad Viento Vel. Presiòn Bar.
Fecha
2018-01-01 00:00:00 NaN NaN NaN NaN NaN NaN NaN NaN 0.057 0.0 0.0 0.001 18.93 41.57 1.253 879.0692
2018-01-01 00:10:00 NaN NaN NaN NaN NaN NaN NaN NaN 0.002 0.0 0.0 0.001 18.76 41.00 0.418 879.4363
2018-01-01 00:20:00 NaN NaN NaN NaN NaN NaN NaN NaN 0.170 0.0 0.0 0.001 18.92 40.96 0.955 879.5181
2018-01-01 00:30:00 NaN NaN NaN NaN NaN NaN NaN NaN 0.371 0.0 0.0 0.001 18.52 42.46 1.823 879.5826
2018-01-01 00:40:00 NaN NaN NaN NaN NaN NaN NaN NaN 0.305 0.0 0.0 0.001 18.49 42.43 2.149 879.6826
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2018-02-28 23:20:00 0.000 0.0 0.0 0.0 22.49 38.44 1.811 874.8154 NaN NaN NaN NaN NaN NaN NaN NaN
2018-02-28 23:30:00 0.000 0.0 0.0 0.0 22.75 39.60 0.859 874.8812 NaN NaN NaN NaN NaN NaN NaN NaN
2018-02-28 23:40:00 0.003 0.0 0.0 0.0 22.34 39.43 1.244 875.0685 NaN NaN NaN NaN NaN NaN NaN NaN
2018-02-28 23:50:00 0.117 0.0 0.0 0.0 22.52 39.90 1.729 874.9626 NaN NaN NaN NaN NaN NaN NaN NaN
2018-03-01 00:00:00 0.004 0.0 0.0 0.0 22.35 41.03 1.508 875.0115 NaN NaN NaN NaN NaN NaN NaN NaN

8497 rows × 16 columns

Ejercicio propuesto

Hacer una lista de glob, cargar todos los archivos usando un list comprehension y definiendo una función, concatenando todos los archivos y ordenando el índice.