
[Big Data Analysis Engineer (빅데이터분석기사)] Practical Exam - Type 3: Population Mean Test (One Population), T-test and Wilcoxon

๋ฐ์ดํ„ฐํŒ์Šค 2024. 8. 20. 17:57

 

import scipy.stats as stats
from scipy.stats import shapiro
import pandas as pd
import numpy as np
 
 

For Type 3 problems, start by importing shapiro, which is needed for the normality test.

# 1. Set up the hypotheses
# H0: the mean of the mpg column is equal to 20.
# H1: the mean of the mpg column is not equal to 20.

# 2. Check the significance level: 5%
 
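# 3. Normality test
# H0 (null hypothesis): the data follow a normal distribution.
# H1 (alternative hypothesis): the data do not follow a normal distribution.
# df: a DataFrame with an 'mpg' column, assumed to be loaded already
statistic, pvalue = stats.shapiro(df['mpg'])
print(round(statistic, 4), round(pvalue, 4))
result = stats.shapiro(df['mpg'])  # same test, kept as a single result object
print(result)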
 
# 4.1 (normality satisfied) run the t-test
# H1 states how the left value (the sample mean) compares with the right value (popmean)
statistic, pvalue = stats.ttest_1samp(df['mpg'], popmean=20, alternative='two-sided')
print(round(statistic, 4), round(pvalue, 4))
# alternative (the alternative hypothesis H1) options: 'two-sided', 'greater', 'less'
 

Population mean test - one population - normality satisfied - t-test

Use the stats.ttest_1samp function

Value to compare against = popmean
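To see what stats.ttest_1samp does with popmean, the statistic can be reproduced by hand as t = (sample mean - popmean) / (sample std / sqrt(n)) with n - 1 degrees of freedom. The sketch below is only an illustration: the mpg array is synthetic stand-in data (not the exam dataset), and the names t_manual, p_manual, and p_greater are made up for this example.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for df['mpg']; these values are made up for illustration.
rng = np.random.default_rng(0)
mpg = rng.normal(loc=21.0, scale=5.0, size=30)

popmean = 20  # the value stated in H0

# Manual one-sample t statistic with n - 1 degrees of freedom
n = mpg.size
t_manual = (mpg.mean() - popmean) / (mpg.std(ddof=1) / np.sqrt(n))
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)  # two-sided p-value
p_greater = stats.t.sf(t_manual, df=n - 1)          # one-sided, H1: mean > popmean

# The same test through scipy
t_scipy, p_scipy = stats.ttest_1samp(mpg, popmean=popmean, alternative='two-sided')

print(round(t_manual, 4), round(p_manual, 4))
print(round(t_scipy, 4), round(p_scipy, 4))
```

The two printed lines should agree. Switching alternative to 'greater' or 'less' changes only which tail of the t distribution the p-value is taken from.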

 

# 4.2 (normality not satisfied) Wilcoxon signed-rank test
statistic, pvalue = stats.wilcoxon(df['mpg'] - 20, alternative='two-sided')
print(round(statistic, 4), round(pvalue, 4))
 

Population mean test - one population - normality not satisfied - Wilcoxon

Use the stats.wilcoxon function

Pass df['column_name'] - value to compare against
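Putting the steps together, the whole Type 3 flow (normality test, then t-test or Wilcoxon) might be wrapped up roughly as follows. This is a minimal sketch, not an official solution: the helper one_sample_mean_test, its return keys, and both DataFrame columns are hypothetical, and the data are synthetic.

```python
import numpy as np
import pandas as pd
from scipy import stats


def one_sample_mean_test(values, popmean, alpha=0.05, alternative='two-sided'):
    """Hypothetical helper: Shapiro-Wilk first, then t-test or Wilcoxon."""
    values = np.asarray(values, dtype=float)
    _, p_normal = stats.shapiro(values)
    if p_normal >= alpha:
        # Normality not rejected: one-sample t-test against popmean
        name = 'ttest_1samp'
        statistic, pvalue = stats.ttest_1samp(
            values, popmean=popmean, alternative=alternative)
    else:
        # Normality rejected: Wilcoxon signed-rank test on the differences
        name = 'wilcoxon'
        statistic, pvalue = stats.wilcoxon(
            values - popmean, alternative=alternative)
    return {
        'test': name,
        'statistic': float(statistic),
        'pvalue': float(pvalue),
        'reject_h0': bool(pvalue < alpha),
    }


# Synthetic data for illustration only (not the exam dataset)
rng = np.random.default_rng(42)
df = pd.DataFrame({
    'mpg': rng.normal(loc=20.5, scale=4.0, size=200),       # roughly normal
    'skewed': rng.exponential(scale=5.0, size=200) + 15.0,  # clearly non-normal
})

normal_result = one_sample_mean_test(df['mpg'], popmean=20)
skewed_result = one_sample_mean_test(df['skewed'], popmean=20)
print(normal_result)
print(skewed_result)
```

Note that the Wilcoxon signed-rank test checks whether the differences are symmetric around zero, so it is usually read as a test about the median rather than the mean.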