1. Import Packages and Load Dataset
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
iris = load_iris()
DF = pd.DataFrame(data = iris.data,
                  columns = ['sepal_length',
                             'sepal_width',
                             'petal_length',
                             'petal_width'])
DF.head(3)
2. K-means Modeling
from sklearn.cluster import KMeans
kmeans_3 = KMeans(n_clusters = 3,
                  init = 'k-means++',
                  max_iter = 15,
                  random_state = 2045)
kmeans_3.fit(DF)
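Once the model is fitted, it can help to look at what it learned before moving on to the silhouette analysis. The sketch below refits the same configuration on the raw iris features and prints the cluster centers, the cluster sizes, the inertia, and the number of iterations actually used. The explicit `n_init=10` is an assumption added here so the result does not depend on the scikit-learn version's default; the original call does not set it.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X = load_iris().data

# Same settings as above; n_init=10 is an assumption (not in the original call)
kmeans = KMeans(n_clusters=3,
                init='k-means++',
                n_init=10,
                max_iter=15,
                random_state=2045)
kmeans.fit(X)

# One row per cluster, one column per feature
centers = kmeans.cluster_centers_
# Number of samples assigned to each cluster
sizes = np.bincount(kmeans.labels_, minlength=3)

print(centers.round(2))
print(sizes)
# inertia_: sum of squared distances to the nearest center
# n_iter_: iterations run by the best initialization (bounded by max_iter)
print(kmeans.inertia_, kmeans.n_iter_)
```

If `n_iter_` comes out equal to `max_iter`, the run may have stopped before converging, and raising `max_iter` is worth trying.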
3. Silhouette Analysis
3-1. Add a 'Clustering' Column to DF
DF['Clustering'] = kmeans_3.labels_
DF.head(3)
3-2. Silhouette Coefficient Values
from sklearn.metrics import silhouette_samples
silhouette_samples(iris.data, DF['Clustering'])
DF['Silh_Coef'] = silhouette_samples(iris.data, DF['Clustering'])
DF.head(3)
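To see what `silhouette_samples` returns, the coefficient for a single sample can be reproduced by hand. For sample i, let a be the mean distance to the other members of its own cluster and b the smallest mean distance to the members of any other cluster; the coefficient is (b - a) / max(a, b). The sketch below is a minimal check of that definition using Euclidean distance. The helper name `manual_silhouette` is hypothetical, `n_init=10` is an assumption, and the helper assumes the sample's cluster has more than one member.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_samples

X = load_iris().data
labels = KMeans(n_clusters=3, init='k-means++', n_init=10,
                random_state=2045).fit_predict(X)

def manual_silhouette(X, labels, i):
    """Silhouette coefficient of sample i (hypothetical helper, Euclidean distance)."""
    dists = np.linalg.norm(X - X[i], axis=1)
    own = labels == labels[i]
    # a: mean distance to the other members of the same cluster
    # (the self-distance is 0, so divide by cluster size minus one)
    a = dists[own].sum() / (own.sum() - 1)
    # b: smallest mean distance to the members of any other cluster
    b = min(dists[labels == c].mean()
            for c in np.unique(labels) if c != labels[i])
    return (b - a) / max(a, b)

manual = manual_silhouette(X, labels, 0)
reference = silhouette_samples(X, labels)[0]
print(manual, reference)
```

The two printed values should agree up to floating-point error. A value near 1 means the sample sits well inside its cluster, a value near 0 means it lies close to a boundary, and a negative value suggests it may be assigned to the wrong cluster.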
3-3. Silhouette Score
from sklearn.metrics import silhouette_score
silhouette_score(iris.data, DF['Clustering'])
0.5528190123564091
DF.groupby('Clustering')['Silh_Coef'].mean()
Clustering
0 0.417320
1 0.798140
2 0.451105
Name: Silh_Coef, dtype: float64
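The overall score is simply the mean of the per-sample coefficients, and the per-cluster means above show that clusters 0 and 2 are less cleanly separated than cluster 1. A common next step is to compare the score across several values of k. The sketch below does this for k = 2 to 6; the range of k and `n_init=10` are assumptions chosen for illustration, not part of the original analysis.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score

X = load_iris().data

scores = {}
for k in range(2, 7):  # candidate cluster counts (assumed range)
    labels = KMeans(n_clusters=k, init='k-means++', n_init=10,
                    random_state=2045).fit_predict(X)
    # Mean silhouette coefficient over all samples for this k
    scores[k] = silhouette_score(X, labels)

for k, score in scores.items():
    print(k, round(score, 4))

# k with the highest mean silhouette coefficient
best_k = max(scores, key=scores.get)
print(best_k)
```

The highest score is only one signal. It is worth reading it together with the per-cluster means, since a high average can hide one poorly separated cluster.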