18 Rename and locate column names

In this session, we will explore how to rename the columns of a DataFrame in Pandas.

Discover practical strategies for keeping column names clear and consistent, which is vital to the integrity of your analyses. You will learn to improve the readability of your DataFrames with descriptive, coherent column names, so that your projects carry a computational narrative.

Clear, consistent naming can make your data analysis more robust, reproducible, and collaborative.

import pandas as pd

f = "../data/Cuernavaca_1dia_comas.csv"
# Use the first column as the index and parse it as dates
cuerna = pd.read_csv(f, index_col=0, parse_dates=True)
cuerna.head()

	To	Ws	Wd	P	Ig	Ib	Id
tiempo
2012-01-01 00:00:00	19.3	0.0	26	87415	0	0	0
2012-01-01 01:00:00	18.6	0.0	26	87602	0	0	0
2012-01-01 02:00:00	17.9	0.0	30	87788	0	0	0
2012-01-01 03:00:00	17.3	0.0	30	87554	0	0	0
2012-01-01 04:00:00	16.6	0.0	27	87321	0	0	0

columnas = cuerna.columns
columnas

Index(['To', 'Ws', 'Wd', 'P', 'Ig', 'Ib', 'Id'], dtype='object')

# Map each old column name to its new name
nombres = {
    "Wd":"wind_direction",
    "Ws":"WindSpeed"
}

# rename returns a new DataFrame with the new names; cuerna itself is not modified
cuerna.rename(columns=nombres)

	To	WindSpeed	wind_direction	P	Ig	Ib	Id
tiempo
2012-01-01 00:00:00	19.3	0.0	26	87415	0	0	0
2012-01-01 01:00:00	18.6	0.0	26	87602	0	0	0
2012-01-01 02:00:00	17.9	0.0	30	87788	0	0	0
2012-01-01 03:00:00	17.3	0.0	30	87554	0	0	0
2012-01-01 04:00:00	16.6	0.0	27	87321	0	0	0
2012-01-01 05:00:00	15.9	0.0	26	87087	0	0	0
2012-01-01 06:00:00	17.0	0.0	27	87096	0	0	0
2012-01-01 07:00:00	18.0	0.0	34	87140	20	151	11
2012-01-01 08:00:00	19.0	0.0	61	87185	164	522	37
2012-01-01 09:00:00	20.0	0.0	95	87229	369	812	58
2012-01-01 10:00:00	20.0	1.0	108	87229	568	931	68
2012-01-01 11:00:00	20.0	2.1	160	87229	717	981	75
2012-01-01 12:00:00	21.0	1.8	135	87273	800	999	79
2012-01-01 13:00:00	22.0	1.5	160	87316	810	998	80
2012-01-01 14:00:00	21.7	1.3	164	87302	747	977	79
2012-01-01 15:00:00	21.3	1.2	176	87287	617	932	74
2012-01-01 16:00:00	21.0	1.0	140	87273	433	846	65
2012-01-01 17:00:00	19.0	0.0	198	87185	219	650	46
2012-01-01 18:00:00	17.1	0.0	221	87104	0	0	0
2012-01-01 19:00:00	17.0	0.0	269	87101	0	0	0
2012-01-01 20:00:00	17.3	0.0	50	87115	0	0	0
2012-01-01 21:00:00	17.0	0.2	85	87080	0	0	0
2012-01-01 22:00:00	16.6	0.5	89	87089	0	0	0
2012-01-01 23:00:00	15.9	0.8	93	87143	0	0	0

# The original DataFrame still has the old column names
cuerna

	To	Ws	Wd	P	Ig	Ib	Id
tiempo
2012-01-01 00:00:00	19.3	0.0	26	87415	0	0	0
2012-01-01 01:00:00	18.6	0.0	26	87602	0	0	0
2012-01-01 02:00:00	17.9	0.0	30	87788	0	0	0
2012-01-01 03:00:00	17.3	0.0	30	87554	0	0	0
2012-01-01 04:00:00	16.6	0.0	27	87321	0	0	0
2012-01-01 05:00:00	15.9	0.0	26	87087	0	0	0
2012-01-01 06:00:00	17.0	0.0	27	87096	0	0	0
2012-01-01 07:00:00	18.0	0.0	34	87140	20	151	11
2012-01-01 08:00:00	19.0	0.0	61	87185	164	522	37
2012-01-01 09:00:00	20.0	0.0	95	87229	369	812	58
2012-01-01 10:00:00	20.0	1.0	108	87229	568	931	68
2012-01-01 11:00:00	20.0	2.1	160	87229	717	981	75
2012-01-01 12:00:00	21.0	1.8	135	87273	800	999	79
2012-01-01 13:00:00	22.0	1.5	160	87316	810	998	80
2012-01-01 14:00:00	21.7	1.3	164	87302	747	977	79
2012-01-01 15:00:00	21.3	1.2	176	87287	617	932	74
2012-01-01 16:00:00	21.0	1.0	140	87273	433	846	65
2012-01-01 17:00:00	19.0	0.0	198	87185	219	650	46
2012-01-01 18:00:00	17.1	0.0	221	87104	0	0	0
2012-01-01 19:00:00	17.0	0.0	269	87101	0	0	0
2012-01-01 20:00:00	17.3	0.0	50	87115	0	0	0
2012-01-01 21:00:00	17.0	0.2	85	87080	0	0	0
2012-01-01 22:00:00	16.6	0.5	89	87089	0	0	0
2012-01-01 23:00:00	15.9	0.8	93	87143	0	0	0

# inplace=True modifies cuerna directly and returns None
cuerna.rename(columns=nombres, inplace=True)

cuerna.columns

Index(['To', 'WindSpeed', 'wind_direction', 'P', 'Ig', 'Ib', 'Id'], dtype='object')
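The cells above show that `rename` leaves the DataFrame untouched unless `inplace=True` is passed. A common alternative is to reassign the result. The sketch below illustrates this on a small stand-in DataFrame (`df`, built from the first two rows of three columns shown earlier; it is not the full file). It also shows the `errors="raise"` option of `rename`, which is not used in this session and is included here only as an assumption about what may be useful: by default, keys that do not match any column are silently ignored.

```python
import pandas as pd

# Stand-in data: the first two rows of three columns shown above.
df = pd.DataFrame({"To": [19.3, 18.6], "Ws": [0.0, 0.0], "Wd": [26, 26]})
nombres = {"Wd": "wind_direction", "Ws": "WindSpeed"}

# rename returns a new DataFrame; keeping the result in a variable
# avoids the need for inplace=True.
renamed = df.rename(columns=nombres)

original_cols = list(df.columns)   # df itself is untouched
renamed_cols = list(renamed.columns)

# By default, keys missing from the columns are silently ignored.
# errors="raise" turns a typo in the mapping into a KeyError.
try:
    df.rename(columns={"WS": "WindSpeed"}, errors="raise")
    typo_detected = False
except KeyError:
    typo_detected = True
```

Reassigning (`df = df.rename(...)`) and `inplace=True` both leave you with the renamed columns. Reassignment makes it easier to keep the original around while you compare the two.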

columnas = cuerna.columns

# Case-sensitive search: "wind" does not match "WindSpeed"
wind = [columna for columna in columnas if "wind" in columna]
wind

['wind_direction']

# Case-insensitive search: lowercase each name before comparing
wind = [columna for columna in columnas if "wind" in columna.lower()]
wind

['WindSpeed', 'wind_direction']
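The same case-insensitive search can be written with the string methods of the column index instead of a list comprehension. This is a minimal sketch on a stand-in DataFrame (`df`, with made-up size but the same column names as above); `mask`, `wind_cols`, and `wind_df` are illustrative names.

```python
import pandas as pd

# Stand-in data with the renamed columns from this session.
df = pd.DataFrame(
    {"To": [19.3, 18.6], "WindSpeed": [0.0, 0.0], "wind_direction": [26, 26]}
)

# Boolean mask over the column labels; case=False ignores capitalization.
mask = df.columns.str.contains("wind", case=False)

wind_cols = list(df.columns[mask])   # matching column names
wind_df = df.loc[:, mask]            # the matching columns themselves
```

The list comprehension is arguably easier to read for a single condition. The mask is convenient when you want to combine several conditions with `&` or `|`.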

cuerna[wind].head()

	WindSpeed	wind_direction
tiempo
2012-01-01 00:00:00	0.0	26
2012-01-01 01:00:00	0.0	26
2012-01-01 02:00:00	0.0	30
2012-01-01 03:00:00	0.0	30
2012-01-01 04:00:00	0.0	27
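The search needed `.lower()` only because the two new names follow different conventions (`WindSpeed` and `wind_direction`). One way to avoid this is to normalize every name to a single convention. The sketch below assumes snake_case as the target and uses a hypothetical helper, `to_snake_case`, which is not part of Pandas. It relies on the fact that `rename` also accepts a function for `columns`.

```python
import re

import pandas as pd


def to_snake_case(name):
    """Hypothetical helper: convert 'WindSpeed' to 'wind_speed'."""
    # Insert an underscore wherever a lowercase letter or digit is
    # followed by an uppercase letter, then lowercase everything.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


# Stand-in data with a few of the column names from this session.
df = pd.DataFrame({"To": [19.3], "WindSpeed": [0.0], "wind_direction": [26]})

# The function is applied to every column label.
df = df.rename(columns=to_snake_case)
snake_cols = list(df.columns)

# With one convention, the plain case-sensitive search is enough.
wind = [columna for columna in snake_cols if "wind" in columna]
```

Note that this also lowercases short names such as `To`. If those should keep their original form, pass an explicit dictionary as in the cells above.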