Encoding String Variables as Numeric Variables
1. Data Set
import seaborn as sns
DF = sns.load_dataset('mpg')
DF.head()
type(DF.origin[0])
str
DF.origin.value_counts()
usa 249
japan 79
europe 70
Name: origin, dtype: int64
X = DF[['origin']]
X[111:115]
2. With LabelEncoder
# Integer encoding
from sklearn.preprocessing import LabelEncoder
encoder1 = LabelEncoder()
LE = encoder1.fit_transform(X['origin'])  # LabelEncoder expects a 1-D array of labels, not a DataFrame
# Integer encoding result
LE[111:115]
array([1, 2, 2, 0])
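To see which integer stands for which category, the fitted encoder's `classes_` attribute can be inspected. The following is a minimal sketch on a small hand-made Series (the four values are an assumption chosen for illustration, not rows read from the mpg data set); `LabelEncoder` sorts the labels, so the position in `classes_` is the integer code.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for DF['origin'] (hypothetical values, not loaded from mpg)
origin = pd.Series(['japan', 'usa', 'usa', 'europe'], name='origin')

encoder = LabelEncoder()
codes = encoder.fit_transform(origin)  # 1-D input, as LabelEncoder expects

# classes_ holds the sorted labels; the index of each label is its code
mapping = {label: int(code) for code, label in enumerate(encoder.classes_)}
print(mapping)  # {'europe': 0, 'japan': 1, 'usa': 2}
print(codes)    # [1 2 2 0]

# inverse_transform maps the integer codes back to the original strings
decoded = encoder.inverse_transform(codes)
print(list(decoded))
```

Because the codes follow alphabetical order, they imply an ordering (europe < japan < usa) that has no real meaning for this variable, which is one reason to consider one-hot encoding for features.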
3. With OneHotEncoder
from sklearn.preprocessing import OneHotEncoder
encoder2 = OneHotEncoder()
OHE = encoder2.fit_transform(X)
# The result is a sparse matrix, so it must be converted to an array to view it
print(OHE[111:115])
OHE.toarray()[111:115]
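The column order of the one-hot result can be checked through the fitted encoder's `categories_` attribute. The sketch below again uses a small hand-made DataFrame (the values and the names `X_toy`, `dense`, and `labeled` are assumptions for illustration) and labels the dense array with the category names.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Toy stand-in for X = DF[['origin']] (hypothetical values, not loaded from mpg)
X_toy = pd.DataFrame({'origin': ['japan', 'usa', 'usa', 'europe']})

encoder = OneHotEncoder()
sparse = encoder.fit_transform(X_toy)  # 2-D input; returns a SciPy sparse matrix by default
dense = sparse.toarray()               # convert to a regular NumPy array

# categories_ has one array per input column; its order is the column order
columns = list(encoder.categories_[0])
labeled = pd.DataFrame(dense, columns=columns)
print(labeled)
```

Each row contains a single 1.0 in the column of its category, so no artificial ordering between categories is introduced.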