Building a TensorFlow Model

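Understanding the Model Structure

Let's look at how to implement a model with Keras in TensorFlow 2.0. For a general overview of natural language processing, see the blog post linked below.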

han-py.tistory.com/281

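Implementation Order

Preprocessing => Building the model => Training

Step 1. Preprocessing

Computers cannot understand Korean directly. To apply deep learning, we first need to convert Korean text into vectors that a computer can work with. This process is called preprocessing; it is also described as transforming the input into embedded vectors. In this post, preprocessing turns each sentence into a sequence of integer indices, and the model's Embedding layer then maps those indices to vectors.

Architecture used: Deep Neural Network (DNN)
Model to build: Sentiment Analysis that predicts positive/negative

Shall we get started?

Let's go through the following preprocessing code line by line.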

import tensorflow as tf
from tensorflow.keras import preprocessing

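samples = ['날이 좋아, 기분이 너무 좋다',  # "The weather is nice, I feel great"
          '오늘 기분이 별로야',  # "I'm not feeling great today"
          '프로젝트 너무 힘들어',  # "The project is so hard"
          '문장 분류를 배워서 행복해',  # "I'm happy to be learning sentence classification"
          '한파이는 너무 즐거워']  # "Hanpy is so much fun"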

labels = [[1], [0], [0], [1], [1]]
tokenizer = preprocessing.text.Tokenizer()
tokenizer.fit_on_texts(samples)
sequences = tokenizer.texts_to_sequences(samples)

word_index = tokenizer.word_index

import tensorflow as tf

Imports the tensorflow module and makes it available as tf.

from tensorflow.keras import preprocessing

Imports the preprocessing utilities from tensorflow.keras so that we can use the text module below.

samples = ['날이 좋아, 기분이 너무 좋다',
          '오늘 기분이 별로야',
          '프로젝트 너무 힘들어',
          '문장 분류를 배워서 행복해',
          '한파이는 너무 즐거워']

These are the sentences to preprocess.

labels = [[1], [0], [0], [1], [1]]

We are classifying into two categories, positive and negative, so each label is matched to its sentence in samples: [1] for positive and [0] for negative. This kind of two-class classification is called binary classification.

tokenizer = preprocessing.text.Tokenizer()

The tf.keras.preprocessing.text module includes the Tokenizer class. Here we use it to create a tokenizer instance.

tokenizer.fit_on_texts(samples)

Calling fit_on_texts on the tokenizer instance updates its internal vocabulary from a list of texts. It must be called before using texts_to_sequences or texts_to_matrix.

sequences = tokenizer.texts_to_sequences(samples)

texts_to_sequences converts each text in samples into a sequence of integers, one integer per word.

word_index = tokenizer.word_index

The word_index attribute is a dictionary that maps each word (key) to its integer index (value).
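To make the tokenizer steps concrete, here is a minimal pure-Python sketch of the idea behind fit_on_texts, word_index, and texts_to_sequences. It is a simplified stand-in, not the Keras implementation: the helper names build_word_index and to_sequences are hypothetical, and it assumes whitespace splitting with only commas stripped, whereas the real Tokenizer applies its own filtering rules.

```python
from collections import Counter

samples = ['날이 좋아, 기분이 너무 좋다',
           '오늘 기분이 별로야',
           '프로젝트 너무 힘들어',
           '문장 분류를 배워서 행복해',
           '한파이는 너무 즐거워']


def split_words(text):
    # Assumption: treat commas as separators and split on whitespace
    return text.replace(',', ' ').split()


def build_word_index(texts):
    """Count words and assign indices starting at 1, most frequent first."""
    counts = Counter()
    for text in texts:
        counts.update(split_words(text))
    # most_common() keeps first-seen order for words with equal counts
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}


def to_sequences(texts, word_index):
    """Replace each known word with its integer index."""
    return [[word_index[w] for w in split_words(t) if w in word_index]
            for t in texts]


word_index = build_word_index(samples)
sequences = to_sequences(samples, word_index)

print(word_index['너무'])  # 1, the most frequent word
print(sequences)
# [[3, 4, 2, 1, 5], [6, 2, 7], [8, 1, 9], [10, 11, 12, 13], [14, 1, 15]]
```

Note that the resulting sequences have different lengths (5, 3, 3, 4, 3), so they need padding or truncation before they can be fed to a model that expects a fixed input length.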

More on natural language tokenizers:

https://codetorial.net/tensorflow/natural_language_processing_in_tensorflow_01.html

Related implementation in the official documentation:

https://www.tensorflow.org/tutorials/keras/text_classification?hl=ko

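Step 2. Building the Model

Two ways to build a deep neural network model are the Sequential API and the Functional API. Here we use the Sequential API. You can think of the model as a stack of filter papers: you add one layer at a time, and the input passes through them one layer at a time.

In other words, to build the model we simply stack layers one by one.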

from tensorflow.keras import layers

# vocab_size, emb_size, hidden_dimension and output_dimension are defined in the variables section below
model = tf.keras.Sequential()
model.add(layers.Embedding(vocab_size, emb_size, input_length = 4))
model.add(layers.Lambda(lambda x: tf.reduce_mean(x, axis = 1)))
model.add(layers.Dense(hidden_dimension, activation='relu'))
model.add(layers.Dense(output_dimension, activation='sigmoid'))

model = tf.keras.Sequential()

After creating a Sequential object, add each layer to it as shown below.

model.add(layers.Embedding(vocab_size, emb_size, input_length = 4))

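The Embedding layer embeds the input: it maps each integer word index to a dense vector of size emb_size. input_length = 4 declares that each input sequence has 4 tokens. Note that the sample sentences above do not all tokenize to 4 words, so they would need to be padded or truncated to that length.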

model.add(layers.Lambda(lambda x: tf.reduce_mean(x, axis = 1)))

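After embedding, each sentence is a sequence of word vectors rather than a single vector. To feed it into the next layer, we average the word vectors along the sequence axis (axis = 1) and pass the result on. That is why the Lambda layer is added: it averages the vectors.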
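The effect of averaging over axis 1 is easiest to see on shapes. The following is a small NumPy sketch, using a made-up batch of 2 sentences with 4 tokens each and an embedding size of 3 (these sizes are illustrative assumptions, not the values used in the model above).

```python
import numpy as np

# Hypothetical embedded batch: (batch=2, tokens=4, embedding=3)
embedded = np.arange(24, dtype=np.float32).reshape(2, 4, 3)

# Average over the token axis, which is what tf.reduce_mean(x, axis=1) computes
pooled = embedded.mean(axis=1)

print(embedded.shape)  # (2, 4, 3)
print(pooled.shape)    # (2, 3)
print(pooled[0])       # [4.5 5.5 6.5]
```

Each sentence is reduced from 4 word vectors to a single vector, which is the shape a Dense layer expects.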

model.add(layers.Dense(hidden_dimension, activation='relu'))

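A hidden layer is simply a layer that sits between the input and output layers. Here it is a Dense layer with hidden_dimension units and ReLU activation.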

model.add(layers.Dense(output_dimension, activation='sigmoid'))

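The activation argument specifies the activation function. The sigmoid function squashes each output value into the range between 0 and 1, so the single output here can be read as the probability of the positive class. (Making the outputs sum to 1 is what softmax does, not sigmoid.)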
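As a quick illustration of the difference, the sketch below implements sigmoid and softmax by hand in plain Python and applies both to the same made-up values. The function names and the input values are chosen only for this example.

```python
import math


def sigmoid(z):
    # Maps any real number into the open interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    # Subtract the max for numerical stability, then normalize to sum to 1
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


logits = [-2.0, 0.0, 3.0]
sig = [sigmoid(z) for z in logits]
soft = softmax(logits)

print(sig)        # each value is between 0 and 1
print(sum(sig))   # not 1 in general
print(sum(soft))  # 1 (up to rounding)
```

With a single sigmoid output, as in the model above, a value above 0.5 is usually read as "positive" and a value below as "negative".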

Required variables

batch_size = 2
num_epochs = 100
vocab_size = len(word_index) + 1
emb_size = 128
hidden_dimension = 256
output_dimension = 1

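The variables defined above are the batch size and number of epochs used during training, plus the model's hyperparameters: the sizes of the embedding, hidden, and output layers. vocab_size is len(word_index) + 1 because Tokenizer indices start at 1, which leaves index 0 reserved. These variables must be defined before the model code above is run.

The model is now complete.

Step 3. Training

To train the model, use the compile and fit methods built into Keras. First, let's define the training process with the compile method.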

model.compile(optimizer = tf.keras.optimizers.Adam(0.001),
             loss='binary_crossentropy',
             metrics=['accuracy'])

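For the optimizer we use the Adam optimization algorithm with a learning rate of 0.001. For a binary classification problem, the loss is binary cross-entropy. The metrics argument defines the evaluation metrics used to measure model performance; here we use accuracy, the most widely used metric for binary classification.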
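To show what the loss and the metric measure, here is a hand-written sketch of binary cross-entropy and accuracy in plain Python. It follows the textbook formula; the Keras implementation may differ in details such as how predictions are clipped, and the prediction values below are invented for illustration.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum(int((p > threshold) == bool(y)) for y, p in zip(y_true, y_pred))
    return hits / len(y_true)


y_true = [1, 0, 0, 1, 1]              # same labels as in the post
good = [0.9, 0.1, 0.2, 0.8, 0.7]      # hypothetical confident, correct outputs
bad = [0.4, 0.6, 0.7, 0.3, 0.2]       # hypothetical wrong outputs

print(binary_cross_entropy(y_true, good), accuracy(y_true, good))
print(binary_cross_entropy(y_true, bad), accuracy(y_true, bad))
```

The loss is low when the predicted probabilities are close to the labels and grows as they move away, while accuracy only counts which side of the threshold each prediction falls on.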

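Now run training with the fit method. Note that input_sequences is not defined in the code above; it is assumed to be the integer sequences from step 1 converted to a fixed-length array.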

model.fit(input_sequences, labels, epochs=num_epochs, batch_size=batch_size)
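Putting the three steps together, the following is one possible end-to-end sketch. Two things in it are assumptions that do not come from the original code: the sequences are padded or truncated to length 4 with pad_sequences so that they form a rectangular array, and the number of epochs is reduced to keep the run short. The Embedding layer is also created without input_length, since the padded array already fixes the length.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, preprocessing

samples = ['날이 좋아, 기분이 너무 좋다',
           '오늘 기분이 별로야',
           '프로젝트 너무 힘들어',
           '문장 분류를 배워서 행복해',
           '한파이는 너무 즐거워']
labels = np.array([[1], [0], [0], [1], [1]])

# Step 1: preprocessing
tokenizer = preprocessing.text.Tokenizer()
tokenizer.fit_on_texts(samples)
sequences = tokenizer.texts_to_sequences(samples)

# Assumption: pad/truncate every sequence to 4 tokens (0 is the padding index)
input_sequences = preprocessing.sequence.pad_sequences(
    sequences, maxlen=4, padding='post')

vocab_size = len(tokenizer.word_index) + 1
emb_size = 128
hidden_dimension = 256
output_dimension = 1

# Step 2: model
model = tf.keras.Sequential([
    layers.Embedding(vocab_size, emb_size),
    layers.Lambda(lambda x: tf.reduce_mean(x, axis=1)),
    layers.Dense(hidden_dimension, activation='relu'),
    layers.Dense(output_dimension, activation='sigmoid'),
])

# Step 3: training (fewer epochs than in the post, to keep the sketch quick)
model.compile(optimizer=tf.keras.optimizers.Adam(0.001),
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(input_sequences, labels, epochs=5, batch_size=2, verbose=0)

probs = model.predict(input_sequences, verbose=0)
print(input_sequences.shape)  # (5, 4)
print(probs.shape)            # (5, 1)
```

One caveat of this simple averaging is that padded positions are included in the mean, so short sentences are partly represented by the embedding of the padding index.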

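[Deep Learning] How to Create a Model in TensorFlow

How to Create a Model

TensorFlow offers the following three ways to build a deep learning model.

1. Sequential model
2. Functional model
3. Subclassing model

Options 1 and 2 will be familiar if you have used Keras before, and option 3 is similar to the PyTorch approach. Let's look at how to build a model with each.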

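Sequential model

This is the easiest approach. You create a Sequential model and add the layers you want in order. It lets you implement a straightforward model quickly, but it becomes hard to use once the architecture gets more complex (for example, multiple inputs, multiple outputs, or branching).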

from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Clear any previously built graph/state
keras.backend.clear_session()

# Stack layers sequentially on the model
model = Sequential()

# Add layers one by one; specify input_shape on the first layer
model.add(Dense(64,input_shape=(1,)))
model.add(Dense(1))

# Compile
model.compile(loss='mse', optimizer='sgd')

# Train (x and y are assumed to be existing arrays of inputs and targets)
model.fit(x, y, epochs=10, verbose=1)

# Predict
pred = model.predict(x)
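The snippet above leaves x and y undefined. As a sketch of how it could be run, the example below generates synthetic data for the line y = 2x + 1 (an assumption made only for this illustration) and trains the same two-layer model on it. It declares the input with keras.Input instead of input_shape on the first layer, which is a substitution on my part rather than the original author's choice.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic data (assumption): 100 points on the line y = 2x + 1
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(100, 1)).astype("float32")
y = 2.0 * x + 1.0

keras.backend.clear_session()

# Same architecture as above: Dense(64) followed by Dense(1)
model = keras.Sequential([
    keras.Input(shape=(1,)),
    layers.Dense(64),
    layers.Dense(1),
])

model.compile(loss="mse", optimizer="sgd")
history = model.fit(x, y, epochs=10, verbose=0)

pred = model.predict(x, verbose=0)
print(pred.shape)  # (100, 1)
```

history.history["loss"] holds the loss for each epoch, which is a convenient way to check whether training is converging.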

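Functional model

The second approach is the Functional model. Starting from the input layer, you call each layer like a function on the output of the previous one, following the forward order through to the output layer. Finally, you create the model by specifying the input and output layers, as in Model(inputs, outputs). It is a solid, commonly used approach.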

from tensorflow import keras
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense

# Clear any previously built graph/state
keras.backend.clear_session()

# Create the input layer
inputs = Input(shape=(64,))

# Each later layer is called on the output of the previous one
hidden = Dense(64, activation='relu')(inputs)
outputs = Dense(10, activation='softmax')(hidden)

# Pass the input layer and the output (last) layer to the Model object
model = Model(inputs, outputs)

# Compile and train (x and y are assumed to be existing arrays)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x, y, epochs=200, verbose=1)
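The Functional example takes 64 input features, ends in a 10-unit softmax, and uses categorical_crossentropy, so it expects x of shape (N, 64) and one-hot targets y of shape (N, 10). Since x and y are not defined in the original, here is a NumPy sketch that builds synthetic arrays with matching shapes; the random data and the sample count are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
num_samples, num_features, num_classes = 8, 64, 10

# Synthetic inputs with the shape the Input(shape=(64,)) layer expects
x = rng.normal(size=(num_samples, num_features)).astype("float32")

# Integer class labels in [0, 10)
int_labels = rng.integers(0, num_classes, size=num_samples)

# One-hot encode: row i of the identity matrix is the one-hot vector for class i
y = np.eye(num_classes, dtype="float32")[int_labels]

print(x.shape)  # (8, 64)
print(y.shape)  # (8, 10)
```

If you prefer to keep the integer labels as they are, the loss would need to be 'sparse_categorical_crossentropy' instead.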

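Subclassing model

The last approach is the Subclassing model. Its implementation style is similar to PyTorch, and it is my personal favorite. You define your own model class that inherits from the Model class. In __init__ you declare the layers to use, and in call you run the forward pass through the layers declared in __init__.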

from tensorflow import keras
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense
keras.backend.clear_session()

# Similar to the PyTorch approach
class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.dense1 = Dense(64, activation='relu')
        self.dense2 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.dense1(x)
        x = self.dense2(x)
        return x

    def summary(self):
        # Wrap call() in a Functional model so summary() shows per-layer output shapes
        inputs = Input((1, 10))
        Model(inputs, self.call(inputs)).summary()

# Create the model (it must be built, passing input_shape)
model = MyModel()
model.build(input_shape=(1,10)) # (1, feature)

# Model summary
model.summary()

# Compile and train (x and y are assumed to be existing arrays; y holds integer labels)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x, y, epochs=20, verbose=1, validation_split=0.2, shuffle=True, batch_size=4)

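Other Tips

Model Visualization

You can visualize the network structure of the model created above. plot_model requires pydot and Graphviz to be installed.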

from tensorflow.keras.utils import plot_model
plot_model(model, show_shapes=True)

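How to Compile a Model

To train a model with fit, you must compile it first. Below are examples of compiling with the loss and optimizer of your choice. You can pass the built-in options as strings, such as 'categorical_crossentropy', 'adam', and 'accuracy', or import the corresponding functions and classes and pass them directly.

Note that if you train with your own training loop using tf.GradientTape() instead of fit, the approach below does not apply; you create the loss and optimizer yourself and call them in the loop.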

from tensorflow import keras

# When the target is one-hot encoded
model.compile(loss='categorical_crossentropy')
model.compile(loss=keras.losses.categorical_crossentropy)

# When the target is an integer label
model.compile(loss='sparse_categorical_crossentropy')
model.compile(loss=keras.losses.sparse_categorical_crossentropy)

# The optimizer can likewise be given as a string or as an object
model.compile(optimizer='adam')
model.compile(optimizer=keras.optimizers.Adam())

# metrics is given as a list
model.compile(metrics=['accuracy'])

# Final example
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
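For comparison with compile and fit, here is a minimal sketch of a custom training loop with tf.GradientTape(), where the loss and optimizer are created and called by hand. The data is random, and the layer sizes, learning rate, and number of steps are arbitrary assumptions chosen to keep the example small; it does full-batch updates rather than iterating over mini-batches.

```python
import numpy as np
import tensorflow as tf

# Synthetic data (assumption): 32 samples, 4 features, 3 integer classes
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 4)).astype("float32")
y = rng.integers(0, 3, size=(32,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Loss and optimizer are plain objects here; compile() is not called
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam(0.01)

losses = []
for step in range(3):
    with tf.GradientTape() as tape:
        probs = model(x, training=True)   # forward pass
        loss = loss_fn(y, probs)          # scalar loss for the batch
    # Backward pass: gradients of the loss w.r.t. the model weights
    grads = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    losses.append(float(loss))

print(losses)
```

Because the integer labels are used directly, the sparse variant of the loss is the matching choice, mirroring the 'sparse_categorical_crossentropy' string shown above.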