๐Ÿ† ์ž๊ฒฉ์ฆ, ์–ดํ•™

[๋น…๋ฐ์ดํ„ฐ ๋ถ„์„๊ธฐ์‚ฌ] ์‹ค๊ธฐ 6ํšŒ - 1์œ ํ˜• datetime, astype('datetime64[ns]')

๋ฐ์ดํ„ฐํŒ์Šค 2024. 8. 19. 16:44

 

First of all, you have to understand what data types you are dealing with


Problem

For each ambulance report, create a '소요시간' (elapsed time) column holding the difference between 출동시각 (dispatch time) and 신고시각 (report time) in seconds (sec),

then sort the mean 소요시간 per 소방서명 (fire station name) in ascending order and print the third-smallest mean together with its 소방서명


์ฒ˜์Œ์— ๊ทธ๋ƒฅ ๋ฌด์‹ํ•˜๊ฒŒ df['์ถœ๋™์‹œ๊ฐ']-df['์‹ ๊ณ ์‹œ๊ฐ']์œผ๋กœ ๊ณ„์‚ฐํ–ˆ๋Š”๋ฐ ๋งˆ์ด๋„ˆ์Šค๊ฐ€ ๋‚˜์˜ด..

๋ฐ์ดํ„ฐ๋ฅผ ๋‹ค์‹œ ๋ณด๋‹ˆ๊นŒ ๋งˆํฌ์†Œ๋ฐฉ์„œ ๊ธฐ์ค€ ์ถœ๋™์ผ์ž๋ž‘ ์‹ ๊ณ ์ผ์ž๋„ ๋‹ค๋ฆ„.. ์‹œ๊ฐ๋„ ๊ฐ™์€ ํฌ๊ธฐ ๋‹จ์œ„๊ฐ€ ์•„๋‹ˆ์—ˆ์Œ

์ด๊ฑธ ์–ด๋–ป๊ฒŒ pd.to_datetime์œผ๋กœ ํ•˜๋‚˜ ๊ณ ๋ฏผ์— ๋น ์ ธ์„œ ์ •๋‹ต์„ ๋ด„

 

df['์ถœ๋™์ผ์ž'].astype('str')๋กœ int(์ˆซ์žํ˜•) > str(๋ฌธ์žํ˜•)์œผ๋กœ ๋ฐ”๊ฟ”๋†“์€ ๊ฑด ์•Œ๊ฒŒ๋จ

๊ทผ๋ฐ ๋’ค์— str.zfill์€ ๋Œ€์ฒด ๋ญ”๊ฐ€ ์‹ถ์–ด์„œ ์ฐพ์•„๋ด„

 

str.zfill() : a method that pads a string with 0s on the left until it reaches the specified length

If the string is shorter than the specified length, 0s are added on the left to make up the difference.

If the string is already as long as or longer than the specified length, it is returned unchanged.
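To see the padding rule in action, here is a minimal sketch. The sample values are made up for illustration and are not taken from the exam data.

```python
import pandas as pd

# Plain Python strings: pad on the left with zeros up to the given width.
print("324".zfill(6))      # 000324
print("120324".zfill(6))   # 120324 (already 6 characters, returned unchanged)

# pandas exposes the same operation element-wise through the .str accessor.
times = pd.Series([324, 91502, 120324])
padded = times.astype("str").str.zfill(6)
print(padded.tolist())     # ['000324', '091502', '120324']
```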

 

The 신고시각 values have different numbers of digits, so the padding seems to be there to even them out

With str.zfill(6) they all become 6 digits

Only after joining the two (date and padded time) does pd.to_datetime work on them as date+time
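The original answer code was an image and is not reproduced here, so the following is only a sketch of the idea under stated assumptions: dates are stored as YYYYMMDD integers, times as HHMMSS integers with their leading zeros lost, and the rows, the helper `to_timestamp`, and the columns '신고' and '출동' are all hypothetical.

```python
import pandas as pd

# Hypothetical sample rows; the real exam data is not reproduced here.
df = pd.DataFrame({
    "소방서명": ["마포소방서", "마포소방서"],
    "신고일자": [20230101, 20230101],
    "신고시각": [235950, 1530],      # HHMMSS stored as int, leading zeros lost
    "출동일자": [20230102, 20230101],
    "출동시각": [130, 1845],
})

def to_timestamp(date_col: pd.Series, time_col: pd.Series) -> pd.Series:
    """Join a YYYYMMDD date and a zero-padded HHMMSS time, then parse."""
    text = date_col.astype("str") + time_col.astype("str").str.zfill(6)
    return pd.to_datetime(text, format="%Y%m%d%H%M%S")

df["신고"] = to_timestamp(df["신고일자"], df["신고시각"])
df["출동"] = to_timestamp(df["출동일자"], df["출동시각"])

# Elapsed time in seconds between report and dispatch.
df["소요시간"] = (df["출동"] - df["신고"]).dt.total_seconds()
print(df[["신고", "출동", "소요시간"]])
```

The first row crosses midnight, which is exactly the case where subtracting the raw time columns gives a negative number.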

 

๊ทผ๋ฐ ๋˜ dt.total_seconds()๋Š” ๋ญ๋ƒ??

total_seconds() >> ์ดˆ ๋‹จ์œ„๋กœ ๋ฐ˜ํ™˜ํ•ด์คŒ
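A small sketch of how dt.total_seconds() differs from the dt.seconds and dt.days attributes, using two made-up timestamps:

```python
import pandas as pd

# Two made-up timestamps, one day and 100 seconds apart.
start = pd.Series(pd.to_datetime(["2023-01-01 23:59:50"]))
end = pd.Series(pd.to_datetime(["2023-01-03 00:01:30"]))

diff = end - start                       # dtype: timedelta64[ns]
print(diff.dt.total_seconds().tolist())  # [86500.0]  whole span in seconds
print(diff.dt.seconds.tolist())          # [100]      seconds part only, days excluded
print(diff.dt.days.tolist())             # [1]
```

So when a difference can exceed one day, total_seconds() is the safer choice for a "seconds" answer.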

# '소요시각' must be the column created earlier (the problem statement calls it '소요시간')
result = df.groupby(['소방서명'])['소요시각'].mean().sort_values().reset_index().iloc[2].values
print(result)
 

 

Per fire station > groupby

Sort the mean elapsed time in ascending order > ['소요시각'].mean().sort_values()

Third-smallest elapsed time and its fire station name > reset_index() turns the index back into a column, and since positions run 0,1,2, take the row at position 2
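The chain above can be traced on a tiny made-up table. The station names A to D and the durations are hypothetical, and the column is named '소요시간' as in the problem statement.

```python
import pandas as pd

# Made-up durations in seconds for four hypothetical stations.
df = pd.DataFrame({
    "소방서명": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "소요시간": [100, 200, 50, 70, 300, 500, 90, 110],
})

means = df.groupby("소방서명")["소요시간"].mean().sort_values()
# means in ascending order: B 60.0, D 100.0, A 150.0, C 400.0

third = means.reset_index().iloc[2]          # position 2 is the third row
print(third["소방서명"], third["소요시간"])    # A 150.0
```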

 

A question at this point

Can't you build a datetime from the time alone????

The solution is below

 

 

Time problems seem to be the hardest part of Type 1...

 

 


๊ทผ๋ฐ ๋‹ค๋ฅธ ํ’€์ด๋ฅผ ๋˜ ๋ฐœ๊ฒฌํ•จ

df['์ถœ๋™์‹œ๊ฐ„'] = df['์ถœ๋™์‹œ๊ฐ„'].astype('datetime64[ns]')
df['์‹ ๊ณ ์‹œ๊ฐ„'] = df['์‹ ๊ณ ์‹œ๊ฐ„'].astype('datetime64[ns]')
df['์ฐจ์ด'] = df['์ถœ๋™์‹œ๊ฐ„'] - df['์‹ ๊ณ ์‹œ๊ฐ„']
value = df.groupby('์†Œ๋ฐฉ์„œ๋ช…')['์ฐจ์ด'].mean().max() #00:02:34.285714285
result1 = round(value.seconds/60) #value.days, value.seconds
print(result1)
 

astype('datetime64[ns]'), what is this now? Another thing I had never seen before, so I looked it up

df['์ถœ๋™์‹œ๊ฐ„'] = pd.to_datetime(df['์ถœ๋™์‹œ๊ฐ„'], format = '%H-%M-%S')
df['์ถœ๋™์‹œ๊ฐ„'] = df['์ถœ๋™์‹œ๊ฐ„'].astype('datetime64[ns]')
 

So it looks like either of these two approaches works
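As a quick check that the two routes agree on well-formed input, here is a sketch with made-up, consistently formatted timestamp strings (not the exam data). The variable names are hypothetical.

```python
import pandas as pd

# Made-up, consistently formatted timestamp strings (not the exam data).
s = pd.Series(["2023-01-01 12:03:24", "2023-01-01 00:03:24"])

via_to_datetime = pd.to_datetime(s, format="%Y-%m-%d %H:%M:%S")
via_astype = s.astype("datetime64[ns]")

# Compare the two results element by element.
same = bool((via_to_datetime == via_astype).all())
print(same)
```

The explicit format is more typing, but it fails loudly when a value does not match, which makes bad rows easier to spot.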

๊ทผ๋ฐ ๋˜ ์˜ค๋ฅ˜๊ฐ€ ์ƒ๊น€

 

Huh?? Reading the error, it seems to say that the values are not all in the same format

In other words, suppose one 출동시각 is 120324 and another is 324

120324 = 12:03:24 PM and 324 = 0:03:24 AM is how they should be read, but because the shapes differ it seems to raise an error

So in the end, to use pd.to_datetime you have to pad the front of 출동시각 with 0s using str.zfill(6) as above...
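A minimal sketch of that fix for a time-only column, assuming the values are HHMMSS integers whose leading zeros were lost. The two sample values are the ones from the text; the variable names are hypothetical.

```python
import pandas as pd

# HHMMSS values stored as integers, so leading zeros are gone.
raw = pd.Series([120324, 324])

padded = raw.astype("str").str.zfill(6)            # '120324', '000324'
parsed = pd.to_datetime(padded, format="%H%M%S")   # date part defaults to 1900-01-01

print(parsed.dt.strftime("%H:%M:%S").tolist())     # ['12:03:24', '00:03:24']
print((parsed[0] - parsed[1]).total_seconds())     # 43200.0
```

Because every row gets the same placeholder date, differences between two time-only columns are only meaningful when both events fall on the same day.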

 

For the exam, memorize astype('datetime64[ns]') and try it first; if it doesn't work, add the extra steps such as str.zfill(6)