Cats and Dogs Classification Example - CNN (Keras)
This is an example of applying a convolutional neural network (CNN) to the classification of cats and dogs.
Following on from the previous example, let's apply cat-vs-dog classification in Keras through a somewhat different example.
import numpy as np
import pandas as pd
from keras.preprocessing.image import ImageDataGenerator, load_img
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import random
import os
print(os.listdir("./catndog"))
Output:
['sampleSubmission.csv', 'test1.zip', 'train.zip']
The data files are available at this link: https://www.kaggle.com/competitions/dogs-vs-cats/data
There are three files; download them and put them in the ./catndog folder next to your Jupyter notebook, as in the sketch below.
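One possible way to lay the images out so they match the paths used later (./catndog/train/train and ./catndog/test1/test1) is a small extraction step. This is only a sketch and assumes the Kaggle archives keep their usual internal train/ and test1/ folders:
import zipfile

# Sketch: extract train.zip and test1.zip into ./catndog so that the images
# end up under ./catndog/train/train and ./catndog/test1/test1.
for name in ["train.zip", "test1.zip"]:
    with zipfile.ZipFile("./catndog/" + name) as zf:
        zf.extractall("./catndog/" + name.split('.')[0])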
FAST_RUN = False
IMAGE_WIDTH=128
IMAGE_HEIGHT=128
IMAGE_SIZE=(IMAGE_WIDTH, IMAGE_HEIGHT)
IMAGE_CHANNELS=3
# The image size is 128x128, with 3 channels.
filenames = os.listdir("./catndog/train/train")
categories = []
for filename in filenames:
    category = filename.split('.')[0]
    if category == 'dog':
        categories.append(1)
    else:
        categories.append(0)
df = pd.DataFrame({
    'filename': filenames,
    'category': categories
})
# The category for dog is 1, so cat is 0.
df.head()
df.tail()
df['category'].value_counts().plot.bar()

sample = random.choice(filenames)
image = load_img("./catndog/train/train/"+sample)
plt.imshow(image)

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense, Activation, BatchNormalization
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_CHANNELS)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax')) # 2 because we have cat and dog classes
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
model.summary()
# The activation is the relu function; the optimizer is rmsprop.
Output:
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 126, 126, 32) 896
_________________________________________________________________
batch_normalization_1 (Batch (None, 126, 126, 32) 128
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 63, 63, 32) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 63, 63, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 61, 61, 64) 18496
_________________________________________________________________
batch_normalization_2 (Batch (None, 61, 61, 64) 256
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 30, 30, 64) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 30, 30, 64) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 28, 28, 128) 73856
_________________________________________________________________
batch_normalization_3 (Batch (None, 28, 28, 128) 512
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 14, 14, 128) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 14, 14, 128) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 25088) 0
_________________________________________________________________
dense_1 (Dense) (None, 512) 12845568
_________________________________________________________________
batch_normalization_4 (Batch (None, 512) 2048
_________________________________________________________________
dropout_4 (Dropout) (None, 512) 0
_________________________________________________________________
dense_2 (Dense) (None, 2) 1026
=================================================================
Total params: 12,942,786
Trainable params: 12,941,314
Non-trainable params: 1,472
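As a quick sanity check on the summary above (an addition, not part of the original post), the parameter counts follow the usual formulas: a Conv2D layer has (kernel_height * kernel_width * input_channels + 1) * filters parameters, a Dense layer has (inputs + 1) * units, and BatchNormalization keeps 4 values per channel, two of which (the moving mean and variance) are non-trainable:
# Recompute the parameter counts reported by model.summary().
def conv2d_params(k, in_ch, filters):
    return (k * k * in_ch + 1) * filters   # weights + biases

def dense_params(inputs, units):
    return (inputs + 1) * units            # weights + biases

print(conv2d_params(3, 3, 32))           # 896      (conv2d_1)
print(conv2d_params(3, 32, 64))          # 18496    (conv2d_2)
print(conv2d_params(3, 64, 128))         # 73856    (conv2d_3)
print(dense_params(14 * 14 * 128, 512))  # 12845568 (dense_1, 25088 flattened inputs)
print(dense_params(512, 2))              # 1026     (dense_2)
print(2 * (32 + 64 + 128 + 512))         # 1472 non-trainable BatchNormalization params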
from keras.callbacks import EarlyStopping, ReduceLROnPlateau
earlystop = EarlyStopping(patience=10)
learning_rate_reduction = ReduceLROnPlateau(monitor='val_acc',
                                            patience=2,
                                            verbose=1,
                                            factor=0.5,
                                            min_lr=0.00001)
callbacks = [earlystop, learning_rate_reduction]
df["category"] = df["category"].replace({0: 'cat', 1: 'dog'})
train_df, validate_df = train_test_split(df, test_size=0.20, random_state=42)
train_df = train_df.reset_index(drop=True)
validate_df = validate_df.reset_index(drop=True)
train_df['category'].value_counts().plot.bar()
# The class distribution of the training set is shown as a bar plot.
validate_df['category'].value_counts().plot.bar()
total_train = train_df.shape[0]
total_validate = validate_df.shape[0]
batch_size=15
train_datagen = ImageDataGenerator(
    rotation_range=15,
    rescale=1./255,
    shear_range=0.1,
    zoom_range=0.2,
    horizontal_flip=True,
    width_shift_range=0.1,
    height_shift_range=0.1
)
# Augmentation settings for training; rescale maps the input pixel values into the 0-1 range.
train_generator = train_datagen.flow_from_dataframe(
    train_df,
    "./catndog/train/train/",
    x_col='filename',
    y_col='category',
    target_size=IMAGE_SIZE,
    class_mode='categorical',
    batch_size=batch_size
)
# Use the filename column as x and the category column as y.
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_dataframe(
    validate_df,
    "./catndog/train/train/",
    x_col='filename',
    y_col='category',
    target_size=IMAGE_SIZE,
    class_mode='categorical',
    batch_size=batch_size
)
# Define the validation data generator.
example_df = train_df.sample(n=1).reset_index(drop=True)
example_generator = train_datagen.flow_from_dataframe(
    example_df,
    "./catndog/train/train/",
    x_col='filename',
    y_col='category',
    target_size=IMAGE_SIZE,
    class_mode='categorical'
)
plt.figure(figsize=(12, 12))
for i in range(0, 15):
    plt.subplot(5, 3, i+1)
    for X_batch, Y_batch in example_generator:
        image = X_batch[0]
        plt.imshow(image)
        break
plt.tight_layout()
plt.show()

epochs = 3 if FAST_RUN else 50  # To shorten training, change this to: epochs = 3 if FAST_RUN else 5
history = model.fit_generator(
    train_generator,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=total_validate//batch_size,
    steps_per_epoch=total_train//batch_size,
    callbacks=callbacks
)
# Trains for up to 50 epochs; the computation takes a long time.
# Reaching epoch 11 took about 2 hours (on a GTX 1050). Unless you have an RTX 3080 or better, reduce the number of epochs.
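A side note that is not in the original post: in recent Keras/TensorFlow releases, fit_generator is deprecated and model.fit accepts generators directly. If you run into a deprecation warning or an AttributeError, the roughly equivalent call (assuming such a version) is:
# Equivalent call for newer Keras versions, where model.fit accepts generators.
history = model.fit(
    train_generator,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=total_validate // batch_size,
    steps_per_epoch=total_train // batch_size,
    callbacks=callbacks
)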
model.save_weights("model.h5")
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 12))
ax1.plot(history.history['loss'], color='b', label="Training loss")
ax1.plot(history.history['val_loss'], color='r', label="validation loss")
ax1.set_xticks(np.arange(1, epochs, 1))
ax1.set_yticks(np.arange(0, 1, 0.1))
ax2.plot(history.history['acc'], color='b', label="Training accuracy")
ax2.plot(history.history['val_acc'], color='r',label="Validation accuracy")
ax2.set_xticks(np.arange(1, epochs, 1))
legend = plt.legend(loc='best', shadow=True)
plt.tight_layout()
plt.show()
# An error occurs partway through: an 'acc' error. Try to fix it yourself.
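A hint for readers who get stuck, not from the original post: in newer Keras versions the history keys are 'accuracy' and 'val_accuracy' rather than 'acc' and 'val_acc' (the same naming applies to the monitor='val_acc' argument of ReduceLROnPlateau above). A minimal sketch of the fix, assuming such a version:
# Pick whichever accuracy key this Keras version actually recorded.
acc_key = 'accuracy' if 'accuracy' in history.history else 'acc'
ax2.plot(history.history[acc_key], color='b', label="Training accuracy")
ax2.plot(history.history['val_' + acc_key], color='r', label="Validation accuracy")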
test_filenames = os.listdir("./catndog/test1/test1")
test_df = pd.DataFrame({
    'filename': test_filenames
})
nb_samples = test_df.shape[0]
test_gen = ImageDataGenerator(rescale=1./255)
test_generator = test_gen.flow_from_dataframe(
    test_df,
    "./catndog/test1/test1/",
    x_col='filename',
    y_col=None,
    class_mode=None,
    target_size=IMAGE_SIZE,
    batch_size=batch_size,
    shuffle=False
)
Output:
Found 12500 validated image filenames
predict = model.predict_generator(test_generator, steps=np.ceil(nb_samples/batch_size))
test_df['category'] = np.argmax(predict, axis=-1)
label_map = dict((v,k) for k,v in train_generator.class_indices.items())
test_df['category'] = test_df['category'].replace(label_map)
test_df['category'] = test_df['category'].replace({ 'dog': 1, 'cat': 0 })
test_df['category'].value_counts().plot.bar()
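For reference (an addition, not from the original post): flow_from_dataframe assigns class indices in alphabetical order, which is why the reverse label_map above is needed before converting back to 0/1. You can check the mapping directly:
# The generator sorts class names alphabetically, so 'cat' maps to 0 and 'dog' to 1.
print(train_generator.class_indices)  # expected: {'cat': 0, 'dog': 1}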

sample_test = test_df.head(18)
sample_test.head()
plt.figure(figsize=(12, 24))
for index, row in sample_test.iterrows():
    filename = row['filename']
    category = row['category']
    img = load_img("./catndog/test1/test1/"+filename, target_size=IMAGE_SIZE)
    plt.subplot(6, 3, index+1)
    plt.imshow(img)
    plt.xlabel(filename + '(' + "{}".format(category) + ')' )
plt.tight_layout()
plt.show()

submission_df = test_df.copy()
submission_df['id'] = submission_df['filename'].str.split('.').str[0]
submission_df['label'] = submission_df['category']
submission_df.drop(['filename', 'category'], axis=1, inplace=True)
submission_df.to_csv('submission.csv', index=False)
Source sites:
https://www.kaggle.com/code/uysimty/keras-cnn-dog-or-cat-classification
https://www.kaggle.com/code/kanncaa1/convolutional-neural-network-cnn-tutorial