🐍 Python/Online Courses 5

[Metacode] Introductory Retail/E-commerce Data Analysis Course: the str.replace Function

adidas['Price per Unit'] = adidas['Price per Unit'].str.replace('[%$,]', '', regex=True).astype('float')  # convert to float type
adidas['Units Sold'] = adidas['Units Sold'].str.replace('[%$,]', '', regex=True).astype('float')
adidas['Total Sales'] = adidas['Total Sales'].str.replace('[%$,]', '', regex=True).astype('float')
adidas['Operating Profit'] = adidas['Operating Profit'].str.replace('[%$,]', '', regex=True).astype('float')
adidas['Operating Margin..
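The cleaning step in this excerpt can be tried on a small frame. The sketch below is only an illustration: the sample rows are hypothetical (the course dataset is not shown here), and it passes `regex=True` explicitly because pandas 2.0 and later treat the pattern as a literal string by default.

```python
import pandas as pd

# Hypothetical sample rows; the real course data is not part of the excerpt.
adidas = pd.DataFrame({
    "Price per Unit": ["$50.00", "$45.50"],
    "Units Sold": ["1,200", "950"],
    "Total Sales": ["$60,000", "$43,225"],
})

for col in ["Price per Unit", "Units Sold", "Total Sales"]:
    # regex=True makes '[%$,]' a character class (%, $ or ,) rather than
    # a literal five-character string to search for.
    adidas[col] = adidas[col].str.replace("[%$,]", "", regex=True).astype("float")

print(adidas.dtypes)
```

Looping over the column names avoids repeating the same expression once per column.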

[Metacode] Introductory Data Analysis Python Bootcamp 3-06: Problem Solution

You are given a DataFrame df with 'start' and 'end' columns. Using a for loop, convert the 'start' and 'end' columns of the DataFrame to the datetime type. Compute the time difference from start to end in hours, expressing any difference smaller than an hour as a decimal fraction. Convert the dates in the 'end' column to weekly periods ('W') and represent them as strings of the form 'yyyy-mm-dd'.
df['end'].dt.to_period('W').dt.strftime('%Y-%m-%d')
df['duration'] = (df['end'] - df['start']).dt.total_seconds()/3600
cols=['start','end']
for i in cols: df[i]=pd.to_da..
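The three tasks can be run in order on a small example. This is a sketch under stated assumptions: the timestamps are made up, and `end_week` is a hypothetical column name chosen only to hold the result of the last step.

```python
import pandas as pd

# Made-up timestamps stored as strings, as they might arrive from a CSV file.
df = pd.DataFrame({
    "start": ["2023-01-02 09:00", "2023-01-05 22:30"],
    "end": ["2023-01-02 17:30", "2023-01-06 01:00"],
})

# 1. Convert both columns to datetime with a for loop.
for col in ["start", "end"]:
    df[col] = pd.to_datetime(df[col])

# 2. Elapsed time in hours; fractions of an hour remain as decimals.
df["duration"] = (df["end"] - df["start"]).dt.total_seconds() / 3600

# 3. Weekly period of each end date, formatted as a 'yyyy-mm-dd' string.
df["end_week"] = df["end"].dt.to_period("W").dt.strftime("%Y-%m-%d")

print(df)
```

The conversion has to come first, because subtracting the columns and using the `.dt` accessor both require datetime values rather than strings.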

[Metacode] Introductory Data Analysis Python Bootcamp 3-09: Problem Solution

Use the pandas melt function to convert the 'iris' dataset to 'long form'. Rename the 'variable' column to 'measure_type' and the 'value' column to 'measure_value'. Save the result as iris_long. Use iris_long to compute the mean for each species and measure_type.
iris_long=iris.melt( id_vars='species', value_vars=['sepal_length','sepal_width','petal_length','petal_width'], var_name='measure_type', value_name='measure_value')
iris_long.pivot_table(index='species', ..
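A self-contained version of this exercise might look as follows. The four-row `iris` frame is a hypothetical stand-in for the real dataset, and the `pivot_table` arguments after `index='species'` are an assumption, since the excerpt is cut off at that point.

```python
import pandas as pd

# Tiny stand-in for the iris dataset (values are invented).
iris = pd.DataFrame({
    "species": ["setosa", "setosa", "virginica", "virginica"],
    "sepal_length": [5.0, 5.5, 6.0, 6.5],
    "sepal_width": [3.0, 3.5, 2.5, 3.0],
    "petal_length": [1.5, 1.0, 5.0, 5.5],
    "petal_width": [0.25, 0.25, 2.0, 2.5],
})

# Wide to long: one row per (species, measurement) pair.
iris_long = iris.melt(
    id_vars="species",
    value_vars=["sepal_length", "sepal_width", "petal_length", "petal_width"],
    var_name="measure_type",
    value_name="measure_value",
)

# Mean per species and measure_type (assumed completion of the truncated call).
means = iris_long.pivot_table(
    index="species",
    columns="measure_type",
    values="measure_value",
    aggfunc="mean",
)

print(means)
```

Passing `var_name` and `value_name` to `melt` sets the column names directly, so no separate rename step is needed.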

[Metacode] Introductory Data Analysis Python Bootcamp 3-07: Problem Solution

Exercise: Grouping and aggregation with the Titanic dataset
Using the Titanic dataset, write Pandas code that performs the following operations. Group the data by 'Pclass' (passenger class) and compute the following aggregates for each group: the sum of the Survived column (number of survivors); the mean of the Fare column (average fare); the number of unique values in the Embarked column (number of distinct ports of embarkation). Save the result as a new DataFrame and turn the grouping key 'Pclass' back into a regular DataFrame column (that is, use reset_index()).
import seaborn as sns
df=sns.load_dataset('titanic')
import pandas as pd
df.groupby..
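One possible way to write the aggregation is shown below. The excerpt stops at `df.groupby..`, so this is not necessarily the post's own answer. The rows are invented, the lowercase column names follow the seaborn copy of the dataset, and the output names `survivors`, `mean_fare` and `n_ports` are hypothetical.

```python
import pandas as pd

# Invented rows using the lowercase column names of seaborn's titanic dataset.
df = pd.DataFrame({
    "pclass": [1, 1, 2, 3, 3, 3],
    "survived": [1, 0, 1, 0, 1, 1],
    "fare": [80.0, 60.0, 20.0, 8.0, 7.0, 9.0],
    "embarked": ["S", "C", "S", "S", "Q", "S"],
})

summary = (
    df.groupby("pclass")
    .agg(
        survivors=("survived", "sum"),    # number of survivors
        mean_fare=("fare", "mean"),       # average fare
        n_ports=("embarked", "nunique"),  # distinct ports of embarkation
    )
    .reset_index()  # turn the pclass index back into a column
)

print(summary)
```

Named aggregation keeps the result columns flat and labelled, which avoids the MultiIndex columns produced by passing a dictionary of lists to `agg`.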

[Metacode] Introductory Data Analysis Python Bootcamp 3-02: Problem Solution

Load the titanic data with sns and find the top 25% age.
import pandas as pd
import seaborn as sns
df=sns.load_dataset('titanic')
b=891*0.25
df.sort_values(by='age',ascending=False).head(int(b))  # the code I wrote
sns.load_dataset('titanic').describe()  # answer code
The correct answer comes out as 20.125, but my answer comes out as 35..
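The mismatch described here may come from two things: the "25%" row of `describe()` is the 25th percentile, not a cutoff for the oldest quarter, and counting 25% of all rows includes rows where the age is missing. The sketch below uses an invented Series rather than the real Titanic data to show how the three numbers can differ.

```python
import numpy as np
import pandas as pd

# Invented ages: 5 known values and 7 missing ones.
age = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0] + [np.nan] * 7)

q25 = age.quantile(0.25)  # same statistic as the "25%" row of describe()
q75 = age.quantile(0.75)  # value that the oldest quarter of known ages exceeds

# Row-count approach: 25% of ALL rows, missing ages included.
n_top = int(len(age) * 0.25)
top_rows = age.sort_values(ascending=False).head(n_top)

print(q25, q75, top_rows.min())
```

Both `quantile` and `describe` ignore missing values, whereas `len(age)` does not, so the row-count cutoff drifts away from the 75th percentile as the share of missing ages grows.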