18  Rename and locate column names

In this session, we will explore how to rename the columns of a Pandas DataFrame and how to locate columns by name.

You will see practical strategies for keeping column names clear and consistent, which is vital to the integrity of your analyses. Descriptive, coherent column names make your DataFrames easier to read and give your projects a computational narrative.

A clear and consistent nomenclature helps make your data analysis robust, reproducible, and collaborative.

import pandas as pd
# Load one day of Cuernavaca data; the first column (timestamps) becomes the index
f = "../data/Cuernavaca_1dia_comas.csv"
cuerna = pd.read_csv(f, index_col=0, parse_dates=True)
cuerna.head()
                       To   Ws  Wd      P  Ig  Ib  Id
tiempo
2012-01-01 00:00:00  19.3  0.0  26  87415   0   0   0
2012-01-01 01:00:00  18.6  0.0  26  87602   0   0   0
2012-01-01 02:00:00  17.9  0.0  30  87788   0   0   0
2012-01-01 03:00:00  17.3  0.0  30  87554   0   0   0
2012-01-01 04:00:00  16.6  0.0  27  87321   0   0   0
# The column labels are stored in an Index object
columnas = cuerna.columns
columnas
Index(['To', 'Ws', 'Wd', 'P', 'Ig', 'Ib', 'Id'], dtype='object')
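# Map old names to new names; columns not listed keep their current name
nombres = {
    "Wd": "wind_direction",
    "Ws": "WindSpeed",
}
# rename returns a new DataFrame with the updated labels
cuerna.rename(columns=nombres)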
                       To  WindSpeed  wind_direction      P   Ig   Ib  Id
tiempo
2012-01-01 00:00:00  19.3        0.0              26  87415    0    0   0
2012-01-01 01:00:00  18.6        0.0              26  87602    0    0   0
2012-01-01 02:00:00  17.9        0.0              30  87788    0    0   0
2012-01-01 03:00:00  17.3        0.0              30  87554    0    0   0
2012-01-01 04:00:00  16.6        0.0              27  87321    0    0   0
2012-01-01 05:00:00  15.9        0.0              26  87087    0    0   0
2012-01-01 06:00:00  17.0        0.0              27  87096    0    0   0
2012-01-01 07:00:00  18.0        0.0              34  87140   20  151  11
2012-01-01 08:00:00  19.0        0.0              61  87185  164  522  37
2012-01-01 09:00:00  20.0        0.0              95  87229  369  812  58
2012-01-01 10:00:00  20.0        1.0             108  87229  568  931  68
2012-01-01 11:00:00  20.0        2.1             160  87229  717  981  75
2012-01-01 12:00:00  21.0        1.8             135  87273  800  999  79
2012-01-01 13:00:00  22.0        1.5             160  87316  810  998  80
2012-01-01 14:00:00  21.7        1.3             164  87302  747  977  79
2012-01-01 15:00:00  21.3        1.2             176  87287  617  932  74
2012-01-01 16:00:00  21.0        1.0             140  87273  433  846  65
2012-01-01 17:00:00  19.0        0.0             198  87185  219  650  46
2012-01-01 18:00:00  17.1        0.0             221  87104    0    0   0
2012-01-01 19:00:00  17.0        0.0             269  87101    0    0   0
2012-01-01 20:00:00  17.3        0.0              50  87115    0    0   0
2012-01-01 21:00:00  17.0        0.2              85  87080    0    0   0
2012-01-01 22:00:00  16.6        0.5              89  87089    0    0   0
2012-01-01 23:00:00  15.9        0.8              93  87143    0    0   0
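By default, `rename` ignores dictionary keys that do not match any column, so a typo in the mapping can go unnoticed. The sketch below uses a small made-up DataFrame (`df`, with invented values) instead of the Cuernavaca file, and a deliberately misspelled key `"WS"`, to show how the `errors="raise"` option turns that mismatch into a `KeyError`.

```python
import pandas as pd

# Toy data standing in for the Cuernavaca file (values are made up)
df = pd.DataFrame({"To": [19.3, 18.6], "Ws": [0.0, 0.5], "Wd": [26, 30]})

# Default behaviour: the misspelled key "WS" matches nothing and is ignored
silent = df.rename(columns={"WS": "WindSpeed"})
print(list(silent.columns))

# With errors="raise", the same typo is reported as a KeyError
try:
    df.rename(columns={"WS": "WindSpeed"}, errors="raise")
    typo_caught = False
except KeyError:
    typo_caught = True
print(typo_caught)
```

Whether to raise or ignore is a design choice: raising is safer in pipelines where every listed column is expected to exist.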
# cuerna itself still has the old names, because rename returned a new DataFrame
cuerna
                       To   Ws   Wd      P   Ig   Ib  Id
tiempo
2012-01-01 00:00:00  19.3  0.0   26  87415    0    0   0
2012-01-01 01:00:00  18.6  0.0   26  87602    0    0   0
2012-01-01 02:00:00  17.9  0.0   30  87788    0    0   0
2012-01-01 03:00:00  17.3  0.0   30  87554    0    0   0
2012-01-01 04:00:00  16.6  0.0   27  87321    0    0   0
2012-01-01 05:00:00  15.9  0.0   26  87087    0    0   0
2012-01-01 06:00:00  17.0  0.0   27  87096    0    0   0
2012-01-01 07:00:00  18.0  0.0   34  87140   20  151  11
2012-01-01 08:00:00  19.0  0.0   61  87185  164  522  37
2012-01-01 09:00:00  20.0  0.0   95  87229  369  812  58
2012-01-01 10:00:00  20.0  1.0  108  87229  568  931  68
2012-01-01 11:00:00  20.0  2.1  160  87229  717  981  75
2012-01-01 12:00:00  21.0  1.8  135  87273  800  999  79
2012-01-01 13:00:00  22.0  1.5  160  87316  810  998  80
2012-01-01 14:00:00  21.7  1.3  164  87302  747  977  79
2012-01-01 15:00:00  21.3  1.2  176  87287  617  932  74
2012-01-01 16:00:00  21.0  1.0  140  87273  433  846  65
2012-01-01 17:00:00  19.0  0.0  198  87185  219  650  46
2012-01-01 18:00:00  17.1  0.0  221  87104    0    0   0
2012-01-01 19:00:00  17.0  0.0  269  87101    0    0   0
2012-01-01 20:00:00  17.3  0.0   50  87115    0    0   0
2012-01-01 21:00:00  17.0  0.2   85  87080    0    0   0
2012-01-01 22:00:00  16.6  0.5   89  87089    0    0   0
2012-01-01 23:00:00  15.9  0.8   93  87143    0    0   0
# inplace=True modifies cuerna directly instead of returning a new DataFrame
cuerna.rename(columns=nombres, inplace=True)
cuerna.columns
Index(['To', 'WindSpeed', 'wind_direction', 'P', 'Ig', 'Ib', 'Id'], dtype='object')
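An alternative to `inplace=True` is to assign the result of `rename` to a variable, which keeps the original DataFrame available for comparison. A minimal sketch with a made-up DataFrame (the names `df` and `renamed` and the values are assumptions for illustration; the mapping is the same `nombres` dictionary):

```python
import pandas as pd

# Toy data standing in for the Cuernavaca file (values are made up)
df = pd.DataFrame({"To": [19.3, 18.6], "Ws": [0.0, 0.5], "Wd": [26, 30]})
nombres = {"Wd": "wind_direction", "Ws": "WindSpeed"}

# rename returns a new DataFrame; df keeps its original labels
renamed = df.rename(columns=nombres)
print(list(df.columns))
print(list(renamed.columns))
```

Writing `df = df.rename(columns=nombres)` would give the same end state as the in-place call, at the cost of losing access to the old labels.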
columnas = cuerna.columns
# Case-sensitive search: "WindSpeed" is missed because of its capital W
wind = [columna for columna in columnas if "wind" in columna]
wind
['wind_direction']
# Lowercase each name before comparing so the search ignores case
wind = [columna for columna in columnas if "wind" in columna.lower()]
wind
['WindSpeed', 'wind_direction']
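The case-insensitive search above can also be written with pandas' own string methods instead of a list comprehension. The sketch below is one possible equivalent, not the approach used in this session; it assumes a one-row toy DataFrame `df` with made-up values and the same column names.

```python
import pandas as pd

# Toy data with the renamed columns (values are made up)
df = pd.DataFrame(
    {"To": [19.3], "WindSpeed": [0.0], "wind_direction": [26], "P": [87415]}
)

# Boolean mask over the column labels, ignoring case
mask = df.columns.str.contains("wind", case=False)
wind_cols = df.columns[mask].tolist()
print(wind_cols)

# Same selection with DataFrame.filter and an inline case-insensitive regex flag
wind_df = df.filter(regex="(?i)wind")
print(list(wind_df.columns))
```

`filter` returns the matching columns directly as a DataFrame, while the mask gives the list of names, which is handy when the names are reused later.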
# Select only the wind-related columns
cuerna[wind].head()
                     WindSpeed  wind_direction
tiempo
2012-01-01 00:00:00        0.0              26
2012-01-01 01:00:00        0.0              26
2012-01-01 02:00:00        0.0              30
2012-01-01 03:00:00        0.0              30
2012-01-01 04:00:00        0.0              27
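The need for `.lower()` comes from mixing two naming styles (`WindSpeed` and `wind_direction`). One way to enforce a single convention is to pass a function to `rename`. The sketch below assumes snake_case as the target style and uses a hypothetical helper, `to_snake_case`, on a toy DataFrame; the column `"Air Temp"` and all values are made up for illustration.

```python
import re

import pandas as pd


def to_snake_case(name):
    """Hypothetical helper: turn a label such as 'WindSpeed' into 'wind_speed'."""
    # Insert an underscore at each lowercase-or-digit to uppercase boundary
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    # Collapse runs of spaces or hyphens into one underscore, then lowercase
    return re.sub(r"[\s\-]+", "_", name).lower()


# Toy data with mixed naming styles (values are made up)
df = pd.DataFrame({"WindSpeed": [0.0], "wind_direction": [26], "Air Temp": [19.3]})

# rename applies the function to every column label
df = df.rename(columns=to_snake_case)
print(list(df.columns))
```

Once every label follows the same convention, the plain case-sensitive test `"wind" in columna` finds both wind columns without extra handling.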