How can you recognize the handwritten digit dataset with a convolutional neural network (CNN)?

A few days ago I used a CNN to recognize the handwritten digit set, and then I noticed that Kaggle hosts a competition on exactly this task. It has been running for over a year and currently has 1179 valid submissions, with the top score at 100%. I gave it a try using Keras. Starting with the simplest MLP, my accuracy was only 98.19%; after continuous improvement it is now 99.78%. Then I saw that first place is 100%, heartbreak = =, so I made one more improved version. I'm recording the best results here and will update if I improve further.

Everyone should be familiar with the handwritten digit set by now. This program is the "Hello World" of learning a new framework, or the "WordCount" of MapReduce :) so I won't introduce it at length; here is a quick look:

# Author: Charlotte
# Plot mnist dataset
from keras.datasets import mnist
import matplotlib.pyplot as plt
# load the MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# plot 4 images as gray scale
plt.subplot(221)
plt.imshow(X_train[0], cmap=plt.get_cmap('PuBuGn_r'))
plt.subplot(222)
plt.imshow(X_train[1], cmap=plt.get_cmap('PuBuGn_r'))
plt.subplot(223)
plt.imshow(X_train[2], cmap=plt.get_cmap('PuBuGn_r'))
plt.subplot(224)
plt.imshow(X_train[3], cmap=plt.get_cmap('PuBuGn_r'))
# show the plot
plt.show()

Figure: the first four MNIST training images (plot omitted).

1. Baseline version

At first I had not planned to use a CNN, because it is relatively time-consuming; I wanted to see whether a fairly simple algorithm could already give good results. I had previously run the classical machine-learning algorithms over this data, and the best was an SVM at 96.8% (default parameters, no tuning), so this time I decided to use a neural network.
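The post does not include that SVM script itself; purely as an illustration of what an untuned scikit-learn baseline might look like (my sketch, not the author's original code):

from keras.datasets import mnist
from sklearn import svm

# Hypothetical sketch of the default-parameter SVM baseline
# mentioned above; not the author's original script.
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# flatten each 28x28 image to a 784-dim vector and scale to [0, 1]
X_train = X_train.reshape(X_train.shape[0], -1).astype('float32') / 255
X_test = X_test.reshape(X_test.shape[0], -1).astype('float32') / 255

clf = svm.SVC()                   # default parameters, no tuning
clf.fit(X_train, y_train)         # slow: trains on all 60000 samples
print(clf.score(X_test, y_test))  # the author reports roughly 0.968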

The baseline version uses a MultiLayer Perceptron (MLP). The structure is simple: input -> hidden -> output. The hidden layer uses the rectified linear unit (ReLU) activation, and the output layer uses softmax for multi-class classification.

Network structure: (diagram omitted)

Code:

# coding:utf-8
# Baseline MLP for MNIST dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.utils import np_utils

seed = 7
numpy.random.seed(seed)

# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# flatten each 28x28 image into a 784-dim vector
num_pixels = X_train.shape[1] * X_train.shape[2]
X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')

# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255

# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

# MLP model
def baseline_model():
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, init='normal', activation='relu'))
    model.add(Dense(num_classes, init='normal', activation='softmax'))
    model.summary()
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# build the model
model = baseline_model()

# Fit
model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=10, batch_size=200, verbose=2)

# Evaluation
scores = model.evaluate(X_test, y_test, verbose=0)
print("Baseline Error: %.2f%%" % (100 - scores[1] * 100))  # print the error rate
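A small illustration of the one-hot step above (my addition): np_utils.to_categorical turns each digit label into a 10-dimensional indicator vector, which is what the softmax output layer is trained against.

import numpy
from keras.utils import np_utils

# Illustration only: labels 5, 0, 9 become rows with a single 1
# at the position of the digit.
print(np_utils.to_categorical(numpy.array([5, 0, 9])))
# [[ 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
#  [ 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
#  [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]]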

Results:

Layer (type)     Output Shape    Param #    Connected to
============================================================
dense_1 (Dense)  (None, 784)     615440     dense_input_1[0][0]
dense_2 (Dense)  (None, 10)      7850       dense_1[0][0]
============================================================
Total params: 623290

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
3s - loss: 0.2791 - acc: 0.9203 - val_loss: 0.1420 - val_acc: 0.9579
Epoch 2/10
3s - loss: 0.1122 - acc: 0.9679 - val_loss: 0.0992 - val_acc: 0.9699
Epoch 3/10
3s - loss: 0.0724 - acc: 0.9790 - val_loss: 0.0784 - val_acc: 0.9745
Epoch 4/10
3s - loss: 0.0509 - acc: 0.9853 - val_loss: 0.0774 - val_acc: 0.9773
Epoch 5/10
3s - loss: 0.0366 - acc: 0.9898 - val_loss: 0.0626 - val_acc: 0.9794
Epoch 6/10
3s - loss: 0.0265 - acc: 0.9930 - val_loss: 0.0639 - val_acc: 0.9797
Epoch 7/10
3s - loss: 0.0185 - acc: 0.9956 - val_loss: 0.0611 - val_acc: 0.9811
Epoch 8/10
3s - loss: 0.0150 - acc: 0.9967 - val_loss: 0.0616 - val_acc: 0.9816
Epoch 9/10
4s - loss: 0.0107 - acc: 0.9980 - val_loss: 0.0604 - val_acc: 0.9821
Epoch 10/10
4s - loss: 0.0073 - acc: 0.9988 - val_loss: 0.0611 - val_acc: 0.9819
Baseline Error: 1.81%

As you can see, the result is quite good: 98.19% accuracy, an error rate of only 1.81%, after just ten epochs. At that point I still had not thought of using a CNN; instead I wondered whether 100 epochs would do better. So I trained for 100 epochs, with this result:

Epoch 100/100
8s - loss: 4.6181e-07 - acc: 1.0000 - val_loss: 0.0982 - val_acc: 0.9854
Baseline Error: 1.46%

One hundred epochs improved the error by only 0.35% and still did not break 99% accuracy (note that training accuracy hit 1.0000 while validation accuracy stalled, a classic sign of overfitting), so I decided to try a CNN.

2. Simple CNN

Keras's CNN modules are fairly complete. Since the focus here is on the CNN's results, I won't expand on CNN basics.

Network structure: (diagram omitted)

Code:

# coding: utf-8
# Simple CNN
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Convolution2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils

seed = 7
numpy.random.seed(seed)

# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# reshape to be [samples][channels][width][height]
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')

# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255

# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

# define a simple CNN model
def baseline_model():
    # create model
    model = Sequential()
    model.add(Convolution2D(32, 5, 5, border_mode='valid', input_shape=(1, 28, 28), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# build the model
model = baseline_model()

# Fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=10, batch_size=128, verbose=2)

# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("CNN Error: %.2f%%" % (100 - scores[1] * 100))

Results:

Layer (type)                     Output Shape        Param #    Connected to
================================================================================
convolution2d_1 (Convolution2D)  (None, 32, 24, 24)  832        convolution2d_input_1[0][0]
maxpooling2d_1 (MaxPooling2D)    (None, 32, 12, 12)  0          convolution2d_1[0][0]
dropout_1 (Dropout)              (None, 32, 12, 12)  0          maxpooling2d_1[0][0]
flatten_1 (Flatten)              (None, 4608)        0          dropout_1[0][0]
dense_1 (Dense)                  (None, 128)         589952     flatten_1[0][0]
dense_2 (Dense)                  (None, 10)          1290       dense_1[0][0]
================================================================================
Total params: 592074

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
32s - loss: 0.2412 - acc: 0.9318 - val_loss: 0.0754 - val_acc: 0.9766
Epoch 2/10
32s - loss: 0.0726 - acc: 0.9781 - val_loss: 0.0534 - val_acc: 0.9829
Epoch 3/10
32s - loss: 0.0497 - acc: 0.9852 - val_loss: 0.0391 - val_acc: 0.9858
Epoch 4/10
32s - loss: 0.0413 - acc: 0.9870 - val_loss: 0.0432 - val_acc: 0.9854
Epoch 5/10
34s - loss: 0.0323 - acc: 0.9897 - val_loss: 0.0375 - val_acc: 0.9869
Epoch 6/10
36s - loss: 0.0281 - acc: 0.9909 - val_loss: 0.0424 - val_acc: 0.9864
Epoch 7/10
36s - loss: 0.0223 - acc: 0.9930 - val_loss: 0.0328 - val_acc: 0.9893
Epoch 8/10
36s - loss: 0.0198 - acc: 0.9939 - val_loss: 0.0381 - val_acc: 0.9880
Epoch 9/10
36s - loss: 0.0156 - acc: 0.9954 - val_loss: 0.0347 - val_acc: 0.9884
Epoch 10/10
36s - loss: 0.0141 - acc: 0.9955 - val_loss: 0.0318 - val_acc: 0.9893
CNN Error: 1.07%

In this log, loss and acc are measured on the training set, while val_loss and val_acc are measured on the validation set. The results look good: the CNN's error rate is 1.07%, an improvement of 0.39% over the 100-epoch MLP (1.46%). This CNN's structure is still quite simple; if we add a few more layers and make it a bit more complex, can the result improve further?
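Incidentally, model.fit returns a History object whose history dict records these four metrics per epoch, so the train/validation curves can be plotted directly. A minimal sketch (my addition), reusing the variables from the script above and replacing its fit call:

import matplotlib.pyplot as plt

# Sketch: visualize train vs. validation accuracy per epoch.
history = model.fit(X_train, y_train, validation_data=(X_test, y_test),
                    nb_epoch=10, batch_size=128, verbose=2)
plt.plot(history.history['acc'], label='acc (train)')
plt.plot(history.history['val_acc'], label='val_acc (validation)')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend(loc='lower right')
plt.show()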

3. Larger CNN

This time I added a few more convolutional layers. Code:

# Larger CNN
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Convolution2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils

seed = 7
numpy.random.seed(seed)
# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# reshape to be [samples][pixels][width][height]
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')
# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

# define the larger model
def larger_model():
    # create model
    model = Sequential()
    model.add(Convolution2D(30, 5, 5, border_mode='valid', input_shape=(1, 28, 28), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Convolution2D(15, 3, 3, activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(50, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    # Compile model
    model.summary()
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# build the model
model = larger_model()
# Fit the model (nb_epoch=10, matching the log below)
model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=10, batch_size=200, verbose=2)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Large CNN Error: %.2f%%" % (100 - scores[1] * 100))

Results:

Layer (type)                     Output Shape        Param #    Connected to
================================================================================
convolution2d_1 (Convolution2D)  (None, 30, 24, 24)  780        convolution2d_input_1[0][0]
maxpooling2d_1 (MaxPooling2D)    (None, 30, 12, 12)  0          convolution2d_1[0][0]
convolution2d_2 (Convolution2D)  (None, 15, 10, 10)  4065       maxpooling2d_1[0][0]
maxpooling2d_2 (MaxPooling2D)    (None, 15, 5, 5)    0          convolution2d_2[0][0]
dropout_1 (Dropout)              (None, 15, 5, 5)    0          maxpooling2d_2[0][0]
flatten_1 (Flatten)              (None, 375)         0          dropout_1[0][0]
dense_1 (Dense)                  (None, 128)         48128      flatten_1[0][0]
dense_2 (Dense)                  (None, 50)          6450       dense_1[0][0]
dense_3 (Dense)                  (None, 10)          510        dense_2[0][0]
================================================================================
Total params: 59933

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
34s - loss: 0.3789 - acc: 0.8796 - val_loss: 0.0811 - val_acc: 0.9742
Epoch 2/10
34s - loss: 0.0929 - acc: 0.9710 - val_loss: 0.0462 - val_acc: 0.9854
Epoch 3/10
35s - loss: 0.0684 - acc: 0.9786 - val_loss: 0.0376 - val_acc: 0.9869
Epoch 4/10
35s - loss: 0.0546 - acc: 0.9826 - val_loss: 0.0332 - val_acc: 0.9890
Epoch 5/10
35s - loss: 0.0467 - acc: 0.9856 - val_loss: 0.0289 - val_acc: 0.9897
Epoch 6/10
35s - loss: 0.0402 - acc: 0.9873 - val_loss: 0.0291 - val_acc: 0.9902
Epoch 7/10
34s - loss: 0.0369 - acc: 0.9880 - val_loss: 0.0233 - val_acc: 0.9924
Epoch 8/10
36s - loss: 0.0336 - acc: 0.9894 - val_loss: 0.0258 - val_acc: 0.9913
Epoch 9/10
39s - loss: 0.0317 - acc: 0.9899 - val_loss: 0.0219 - val_acc: 0.9926
Epoch 10/10
40s - loss: 0.0268 - acc: 0.9916 - val_loss: 0.0220 - val_acc: 0.9919
Large CNN Error: 0.81%

Not bad: the accuracy is now 99.19%.
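As a sanity check on the summary above (my addition): the parameter counts can be reproduced by hand. A Convolution2D layer holds out_channels × (kernel_h × kernel_w × in_channels) + out_channels weights, so convolution2d_1 has 30 × (5 × 5 × 1) + 30 = 780 and convolution2d_2 has 15 × (3 × 3 × 30) + 15 = 4065. A Dense layer holds in × out + out, e.g. dense_1 has 375 × 128 + 128 = 48128. Pooling, dropout, and flatten layers have no parameters, and the pieces sum to the 59933 total.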

4. Final version

The network structure is unchanged; I just added dropout after every layer, and to my surprise the result improved significantly. At first I trained for 500 epochs, which nearly ran my machine into the ground, and the model overfit. I noticed the results were already very good around epoch 69, so I settled on 69 epochs.

# Larger CNN for the MNIST Dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Convolution2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils
import matplotlib.pyplot as plt
from keras.constraints import maxnorm
from keras.optimizers import SGD
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# reshape to be [samples][pixels][width][height]
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 1, 28, 28).astype('float32')
# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]

# define the larger model
def larger_model():
    # create model, now with a Dropout layer after every layer
    model = Sequential()
    model.add(Convolution2D(30, 5, 5, border_mode='valid', input_shape=(1, 28, 28), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.4))
    model.add(Convolution2D(15, 3, 3, activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.4))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.4))
    model.add(Dense(50, activation='relu'))
    model.add(Dropout(0.4))
    model.add(Dense(num_classes, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# build the model
model = larger_model()
# Fit the model (69 epochs, chosen by watching where the 500-epoch run peaked)
model.fit(X_train, y_train, validation_data=(X_test, y_test), nb_epoch=69, batch_size=200, verbose=2)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Large CNN Error: %.2f%%" % (100 - scores[1] * 100))
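Rather than eyeballing the log to hand-pick 69 epochs, the stopping point could be automated with the EarlyStopping callback that Keras ships. A minimal sketch (my addition, not part of the original post), dropped in place of the fit call above:

from keras.callbacks import EarlyStopping

# Sketch: stop once val_loss has not improved for 5 epochs,
# instead of hand-picking the epoch count from the log.
early_stop = EarlyStopping(monitor='val_loss', patience=5, verbose=1)
model.fit(X_train, y_train, validation_data=(X_test, y_test),
          nb_epoch=500, batch_size=200, verbose=2,
          callbacks=[early_stop])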

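Finally, for the Kaggle leaderboard mentioned at the beginning: the competition ships its own test images as a CSV and scores an uploaded ImageId/Label file. A rough sketch of producing a submission (my addition; it assumes the standard Digit Recognizer layout, i.e. a test.csv of 784 pixel columns with no labels):

import numpy
import pandas as pd

# Sketch: assumes Kaggle's test.csv holds one 784-pixel row per image.
kaggle_X = pd.read_csv('test.csv').values.astype('float32') / 255
kaggle_X = kaggle_X.reshape(kaggle_X.shape[0], 1, 28, 28)

# predict digit classes with the trained model above
labels = model.predict_classes(kaggle_X, verbose=0)
submission = pd.DataFrame({'ImageId': numpy.arange(1, len(labels) + 1),
                           'Label': labels})
submission.to_csv('submission.csv', index=False)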