๐Ÿ† ์ž๊ฒฉ์ฆ, ์–ดํ•™

[Big Data Analysis Engineer] Practical Exam - Type 3: Population Mean Test (3 Populations), F-test, ANOVA

๋ฐ์ดํ„ฐํŒ์Šค 2024. 8. 20. 17:58

 

import pandas as pd
import numpy as np
import scipy.stats as stats
from scipy.stats import shapiro
 
 

Import shapiro first!

# 1. Set up the hypotheses
# H0: The mean scores of the three groups are equal. ( mean(A) = mean(B) = mean(C) )
# H1: At least one group's mean score differs from the others. (not H0)

# 2. Significance level: 5%

# 3. Normality test (df is assumed to hold the score columns 'A', 'B', 'C')
print(stats.shapiro(df['A']))
print(stats.shapiro(df['B']))
print(stats.shapiro(df['C']))
# statistic, pvalue = stats.shapiro(df['A'])
# print(round(statistic, 4), round(pvalue, 4))
 

For ANOVA, run stats.shapiro on every column.

If even one group does not follow a normal distribution, use a nonparametric test instead (Kruskal-Wallis for three groups; the Wilcoxon test only compares two) >> but nonparametric tests are unlikely to show up on the exam.
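The per-column normality check can be written as a loop. This is only a minimal sketch: the post never shows how df is built, so the sample data, the seed, and the names `normality` and `all_normal` are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical sample data (assumption): 30 scores per group.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "A": rng.normal(70, 5, 30),
    "B": rng.normal(72, 5, 30),
    "C": rng.normal(75, 5, 30),
})

alpha = 0.05  # significance level from step 2
normality = {}
for col in ["A", "B", "C"]:
    statistic, pvalue = stats.shapiro(df[col])
    # True means normality is NOT rejected for this column.
    normality[col] = bool(pvalue >= alpha)
    print(col, round(statistic, 4), round(pvalue, 4))

# ANOVA is only appropriate when every group passes the check.
all_normal = all(normality.values())
print("all groups pass:", all_normal)
```

If `all_normal` comes out False, the flow moves to the nonparametric branch (step 5.3).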

# 4. Equal-variance (homoscedasticity) test
# H0 (null hypothesis): the variances are equal.
# H1 (alternative hypothesis): the variances are not equal.
print(stats.bartlett(df['A'], df['B'], df['C']))
 

For the equal-variance test, use stats.bartlett()
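To turn the Bartlett result into a yes/no decision at the 5% level, something like the following could work. The helper name `has_equal_variance` and the score lists are hypothetical, invented here for illustration; only `stats.bartlett` comes from the post.

```python
from scipy import stats

def has_equal_variance(*groups, alpha=0.05):
    """Return True when Bartlett's test does not reject equal variances."""
    statistic, pvalue = stats.bartlett(*groups)
    print(round(statistic, 4), round(pvalue, 4))
    return bool(pvalue >= alpha)

# Made-up scores: b and c are shifted copies of a, so all spreads match.
a = [61, 65, 68, 70, 72, 75, 79, 82]
b = [x + 2 for x in a]
c = [x + 5 for x in a]
same_spread = has_equal_variance(a, b, c)

# Made-up scores with a far wider spread in the last group.
wide = [10, 40, 70, 100, 130, 160, 190, 220]
different_spread = has_equal_variance(a, b, wide)
```

A p-value at or above 0.05 means H0 (equal variances) is kept, which is the condition for step 5.1.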

# 5.1 (normality O, equal variance O) Analysis of variance (f_oneway)
import scipy.stats as stats
statistic, pvalue = stats.f_oneway(df['A'], df['B'], df['C'])
# Note: each group's data must be passed as a separate argument
print(round(statistic, 4), round(pvalue, 4))
 

Population mean test - 3 populations - normality O - equal variance O

Use the stats.f_oneway() function
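Because stats.f_oneway() wants each group as its own argument, data that arrives in long format (one row per observation) has to be split first. A small sketch under that assumption follows; `long_df`, its column names, and the scores are made up for illustration.

```python
import pandas as pd
from scipy import stats

# Hypothetical long-format data (assumption): a group label plus a score.
long_df = pd.DataFrame({
    "group": ["A"] * 5 + ["B"] * 5 + ["C"] * 5,
    "score": [70, 72, 68, 71, 69,
              75, 77, 74, 76, 78,
              80, 82, 79, 81, 83],
})

# Split into one array per group, then unpack them as separate arguments.
samples = [g["score"].to_numpy() for _, g in long_df.groupby("group")]
statistic, pvalue = stats.f_oneway(*samples)
print(round(statistic, 4), round(pvalue, 4))

# Reject H0 (equal means) when the p-value is below the 5% level.
reject_h0 = bool(pvalue < 0.05)
print("reject H0:", reject_h0)
```

With a wide-format df like the one in the post, the split is unnecessary and the columns can be passed directly.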

# 5.3 (normality X) Kruskal-Wallis test
import scipy.stats as stats
statistic, pvalue = stats.kruskal(df['A'], df['B'], df['C'])
print(round(statistic, 4), round(pvalue, 4))
 

Population mean test - 3 populations - normality X

Use the stats.kruskal() function
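Putting steps 3 to 5 together, the choice of test could be sketched roughly as a single function. The function name `choose_test`, its return format, and the sample scores are hypothetical. The case with normal data but unequal variances (step 5.2) is not shown in the post, so the sketch leaves it unhandled rather than guessing a method.

```python
from scipy import stats

def choose_test(a, b, c, alpha=0.05):
    """Return (label, statistic, pvalue) following steps 3 to 5."""
    # Step 3: every group must pass the Shapiro-Wilk normality test.
    normal = all(stats.shapiro(g).pvalue >= alpha for g in (a, b, c))
    if not normal:
        res = stats.kruskal(a, b, c)  # step 5.3
        return "kruskal", res.statistic, res.pvalue
    # Step 4: Bartlett's test for equal variances.
    equal_var = stats.bartlett(a, b, c).pvalue >= alpha
    if equal_var:
        res = stats.f_oneway(a, b, c)  # step 5.1
        return "f_oneway", res.statistic, res.pvalue
    # Step 5.2 (normal, unequal variances) is not covered in the post.
    return "not covered (5.2)", None, None

# Made-up scores: the first group has one extreme outlier, so it fails
# the normality check and the Kruskal-Wallis branch is taken.
a = [1, 1, 1, 1, 1, 1, 1, 1, 1, 100]
b = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
c = [13, 14, 15, 16, 17, 18, 19, 20, 21, 22]
label, statistic, pvalue = choose_test(a, b, c)
print(label, round(statistic, 4), round(pvalue, 4))
```

Checking normality before equal variance mirrors the order of the steps in the post.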

 
