
[Big Data Analysis Engineer (빅데이터 분석기사)] Practical Exam - Type 3: Multiple Regression Analysis, Correlation Analysis

๋ฐ์ดํ„ฐํŒ์Šค 2024. 8. 21. 18:05

 

If the problem asks for a multiple regression analysis:

# Assign x = independent variables, y = dependent variable
x = df[['column_name_1', 'column_name_2', 'column_name_3']]
# or, equivalently, drop the target column: x = df.drop(columns=['column_name'])
y = df['column_name']
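To make the two ways of picking x concrete, here is a small sketch on made-up data. The DataFrame and its column names (`x1`, `x2`, `target`) are assumptions for illustration only, not from any actual exam dataset.

```python
import pandas as pd

# Hypothetical toy data; column names are placeholders.
df = pd.DataFrame({
    "x1": [1, 2, 3, 4, 5],
    "x2": [2, 1, 4, 3, 6],
    "target": [5, 4, 11, 10, 17],
})

# Option A: list the predictor columns explicitly.
x_a = df[["x1", "x2"]]
# Option B: drop the target column and keep everything else.
x_b = df.drop(columns=["target"])
y = df["target"]

# Both options select the same predictors for this DataFrame.
print(x_a.equals(x_b))
print(x_a.shape, y.shape)
```

Option B is handy when there are many predictors, but it only works if every remaining column really is a predictor (no ID columns and so on).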
 

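Next, you have to decide whether to solve it with sklearn or statsmodels (statsmodels if the question asks for p-values or standard errors, since its summary reports them).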

# sklearn approach
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression

model = LinearRegression()
result = model.fit(x, y)
결정계수 = model.score(x, y)   # coefficient of determination (R^2)
회귀계수 = model.coef_         # regression coefficients (the intercept is in model.intercept_)
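As a sanity check, the sklearn steps can be run end to end on toy data where the answer is known in advance. This is only a sketch: the data is invented so that target = 1 + 1*x1 + 2*x2 exactly, and the column names are assumptions.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data built so that target = 1 + 1*x1 + 2*x2 (no noise).
df = pd.DataFrame({
    "x1": [1, 2, 3, 4, 5],
    "x2": [2, 1, 4, 3, 6],
})
df["target"] = 1 + df["x1"] + 2 * df["x2"]

x = df[["x1", "x2"]]
y = df["target"]

model = LinearRegression()
model.fit(x, y)

r2 = model.score(x, y)         # R^2 on the training data
coef = model.coef_             # one coefficient per column of x, in column order
intercept = model.intercept_   # the constant term is stored separately

# Predict for a new row; the column names must match those used in fit.
new_row = pd.DataFrame({"x1": [6], "x2": [5]})
pred = model.predict(new_row)

print(r2, coef, intercept, pred)
```

Because the toy data has no noise, the fit should recover coefficients close to 1 and 2 and an intercept close to 1. Real exam data will not be this clean.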
 
# statsmodels approach
import pandas as pd
import numpy as np
import statsmodels.api as sm

x = sm.add_constant(x)        # add the constant (intercept) term
model = sm.OLS(y, x).fit()    # arguments go in (y, x) order
summary = model.summary()
 

Memorize just this much and you will have no trouble solving multiple regression problems!!

 

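# Correlation analysis
# x and y must each be a single column here (e.g. df['column_name_1']), not the multi-column x above
from scipy.stats import pearsonr
r, pvalue = pearsonr(x, y)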
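Here is a tiny worked example of the correlation step, with numbers small enough to check by hand. The two columns `a` and `b` are made up for illustration; the pandas `corr()` call is included only as a cross-check, not as something the exam requires.

```python
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical data: both columns have mean 3, so r is easy to verify by hand.
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 1, 4, 3, 5],
})

# pearsonr takes two 1-D sequences and returns the correlation and its p-value.
r, pvalue = pearsonr(df["a"], df["b"])

# Cross-check: pandas computes the same Pearson coefficient by default.
r_pandas = df["a"].corr(df["b"])

print(r, pvalue, r_pandas)
```

By hand: the sum of the products of deviations is 8 and each sum of squared deviations is 10, so r = 8 / 10 = 0.8. With only five rows the p-value stays above 0.05, which is a reminder that a large r alone does not mean significance.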
 

 

 

Now that I've written it all out, the code really isn't much..?

So why was I so confused by it? A bit embarrassing.