๐Ÿ† ์ž๊ฒฉ์ฆ, ์–ดํ•™

[๋น…๋ฐ์ดํ„ฐ ๋ถ„์„๊ธฐ์‚ฌ] ์‹ค๊ธฐ 2ํšŒ - 1์œ ํ˜• sort_values

๋ฐ์ดํ„ฐํŒ์Šค 2024. 8. 19. 16:50

 

Problem

From the given dataset, find the 10 regions with the largest CRIM values,

then replace the CRIM values of those 10 regions with the smallest value among them.

Finally, compute the mean of the replaced CRIM values for rows where the AGE column is 80 or greater.

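import numpy as np

# Smallest CRIM among the 10 largest values (note: the name min shadows Python's built-in min)
min = df.sort_values('CRIM', ascending=False).reset_index(drop=True).iloc[:10]['CRIM'].min()
# Cap every CRIM value at or above that threshold; leave the rest unchanged
df['CRIM'] = np.where(df['CRIM'] >= min, min, df['CRIM'])
# Mean of the replaced CRIM for rows with AGE >= 80
df[df['AGE'] >= 80]['CRIM'].mean()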

My code differs from the solution in exactly one place.

I used np.where to set every CRIM value at or above min to min and leave all other CRIM values as they are,

whereas the solution here uses loc to assign min directly to rows 0 through 9 of df's CRIM column.

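# Solution's version: assumes df is already sorted by CRIM in descending order with a reset index (loc[:9] is label-based and includes row 9)
df.loc[:9, 'CRIM'] = df.loc[:9, 'CRIM'].min()
# My version: works on the unsorted df, using the min computed earlier
df['CRIM'] = np.where(df['CRIM'] >= min, min, df['CRIM'])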
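
To check that the two approaches really agree, here is a minimal sketch on a made-up toy DataFrame. The values and the names threshold, a, b, mean_where and mean_loc are my own assumptions for illustration, not part of the exam dataset or the solution.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the exam dataset (values are made up).
df = pd.DataFrame({
    "CRIM": [0.1, 9.0, 3.5, 7.2, 0.4, 8.8, 2.2, 6.1, 5.0, 4.4, 1.3, 7.9],
    "AGE":  [65, 92, 81, 88, 40, 95, 79, 85, 80, 83, 55, 90],
})

# Smallest CRIM among the 10 largest values.
threshold = df["CRIM"].sort_values(ascending=False).iloc[:10].min()

# Approach 1: np.where on the unsorted frame.
a = df.copy()
a["CRIM"] = np.where(a["CRIM"] >= threshold, threshold, a["CRIM"])
mean_where = a.loc[a["AGE"] >= 80, "CRIM"].mean()

# Approach 2: sort descending, reset the index, then overwrite rows 0..9 via loc.
b = df.sort_values("CRIM", ascending=False).reset_index(drop=True)
b.loc[:9, "CRIM"] = b.loc[:9, "CRIM"].min()
mean_loc = b.loc[b["AGE"] >= 80, "CRIM"].mean()

print(mean_where, mean_loc)
```

Both means should match, because every value at or above the threshold is exactly one of the top 10 (or a tie with the threshold, where the replacement changes nothing). The loc version only works after sorting and resetting the index, while the np.where version does not depend on row order.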