machine learning - why pretraining for convolutional neural networks
Usually there is a problem with vanishing gradients when training deep neural networks. I found out that some convolutional neural networks (CNNs) manage to avoid this vanishing-gradient problem (why?).
Besides this, some papers describe pretraining approaches for CNNs. Can anyone tell me the following?
(1) Reasons for pretraining in CNNs, (2) What are the problems / limitations of CNNs? (3) Any relevant papers discussing the limitations of CNNs?
Thanks in advance.
-
Pretraining is a regularization technique that improves the generalization accuracy of your model. Because the network is exposed to large amounts of data (in many tasks we have huge amounts of unlabeled data), the weight parameters are moved to a region of the parameter space that is more likely to represent the overall data distribution, instead of overfitting to a particular subset of the data. Neural nets, especially those with many hidden units, have high representational capacity, tend to overfit your data, and are sensitive to random parameter initialization. Also, because the early layers are initialized properly by pretraining, the vanishing-gradient problem is no longer as serious. That is why pretraining is used as an initialization step for the supervised task, which is usually trained with gradient-descent algorithms.
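To make the "pretrain, then fine-tune" idea concrete, here is a minimal sketch. It assumes PyTorch (the answer names no framework) and uses random tensors as placeholders for the unlabeled and labeled data: an encoder is first trained as part of an autoencoder on unlabeled inputs, and its weights are then reused as the initialization of a supervised classifier.

```python
# Sketch of unsupervised pretraining followed by supervised fine-tuning.
# PyTorch and the random placeholder data are assumptions for illustration.
import torch
import torch.nn as nn

# Encoder whose weights we want to pretrain.
encoder = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
)

# 1) Unsupervised pretraining: train encoder + decoder to reconstruct inputs.
decoder = nn.Sequential(
    nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2), nn.ReLU(),
    nn.ConvTranspose2d(16, 1, kernel_size=2, stride=2),
)
autoencoder = nn.Sequential(encoder, decoder)
opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
unlabeled = torch.rand(64, 1, 28, 28)   # placeholder unlabeled batch
for _ in range(5):                      # a few reconstruction steps
    recon = autoencoder(unlabeled)
    loss = nn.functional.mse_loss(recon, unlabeled)
    opt.zero_grad(); loss.backward(); opt.step()

# 2) Supervised fine-tuning: reuse the pretrained encoder as the
#    initialization of a classifier and train it with labels.
classifier = nn.Sequential(encoder, nn.Flatten(), nn.Linear(32 * 7 * 7, 10))
opt = torch.optim.SGD(classifier.parameters(), lr=1e-2)
images, labels = torch.rand(64, 1, 28, 28), torch.randint(0, 10, (64,))
for _ in range(5):
    logits = classifier(images)
    loss = nn.functional.cross_entropy(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()
```

The point of the sketch is only the weight reuse: the supervised optimizer starts from the pretrained encoder parameters rather than from a random initialization.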
-
CNNs share the same fate as other neural nets: there are too many parameters to tune; optimal input patch size, number of hidden layers, number of feature maps per layer, pooling and stride sizes, normalization window, learning rate, and others. Thus, the problem of model selection is relatively difficult compared to other ML techniques. Training large networks is done either on GPUs or on CPU clusters.
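The following sketch just illustrates the model-selection burden mentioned above; the hyperparameter names and candidate values are hypothetical, not taken from the answer.

```python
# Enumerate a small hypothetical CNN hyperparameter grid to show how quickly
# the number of candidate models grows.
from itertools import product

search_space = {
    "patch_size":      [28, 32],        # input patch size
    "num_conv_layers": [2, 3],          # hidden (convolutional) layers
    "feature_maps":    [16, 32, 64],    # feature maps per layer
    "pool_size":       [2],             # pooling window
    "stride":          [1, 2],          # convolution stride
    "learning_rate":   [1e-2, 1e-3],
}

configs = [dict(zip(search_space, values))
           for values in product(*search_space.values())]
print(len(configs), "configurations, e.g.:", configs[0])
```

Even this tiny grid yields 48 configurations, each of which would have to be trained and validated, which is why large-scale CNN training is typically pushed onto GPUs or CPU clusters.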