README.md: 8 additions & 8 deletions
@@ -4,15 +4,13 @@ RNNSharp is a toolkit of recurrent neural network which is widely used for many
This page introduces what RNNSharp is, how it works and how to use it. To get the demo package, please visit the release page and download the package.
## Overview
-RNNSharp supports many different types of recurrent neural network (aka RNN) structures. In terms of historical memory, it supports BPTT and LSTM structures, and for the output layer it supports a native output layer and recurrent CRFs [1]. In addition, RNNSharp also supports forward RNN and bi-directional RNN structures.
+RNNSharp supports many different types of deep recurrent neural network (aka DeepRNN) structures. In terms of historical memory, it supports BPTT (backpropagation through time) and LSTM (long short-term memory) structures, and for the output layer it supports a native output layer and recurrent CRFs [1]. In addition, RNNSharp also supports forward RNN and bi-directional RNN structures.
-For BPTT and LSTM, BPTT-RNN is usually called a "simple RNN", since the structure of its hidden layer nodes is very simple. It is not good at preserving long-term historical memory, but its decoding cost is lower than LSTM's.
+For BPTT and LSTM, BPTT-RNN is usually called a "simple RNN", since the structure of its hidden layer nodes is very simple. It is not good at preserving long-term historical memory. LSTM-RNN is more complex than BPTT-RNN, since its hidden layer nodes have an inner structure that helps them preserve very long-term historical memory. In general, LSTM performs better than BPTT on longer sequences.
-LSTM-RNN is more complex than BPTT-RNN, since its hidden layer nodes have an inner structure that helps them preserve very long-term historical memory. In general, LSTM performs better than BPTT on longer sequences.
+For the native RNN output, many experiments and applications have shown that it gives better results than traditional algorithms such as MEMM for online sequence labeling tasks, such as speech recognition, auto suggestion and so on.
-For the native RNN output, many experiments and applications have shown that it is an excellent algorithm for online sequence labeling tasks, such as speech recognition, auto suggestion and so on. It performs better than MEMM and other traditional algorithms.
-For recurrent CRFs (recurrent conditional random fields), this is a new type of CRF based on RNN. Compared with the native output above, a recurrent CRF can be used for many different types of offline sequence labeling tasks, such as word segmentation, named entity recognition and so on. With a similar feature set, it performs better than a linear CRF, since its feature representation is richer.
+For RNN-CRF, the CRF output for the entire sequence is computed from the native RNN outputs and their transitions. Compared with the native RNN output, RNN-CRF performs better on many different types of offline sequence labeling tasks, such as word segmentation, named entity recognition and so on. With a similar feature set, it also performs better than a linear CRF.
For bi-directional RNN, the output combines the results of both the forward RNN and the backward RNN. It usually performs better than a single-directional RNN.
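
To make the "inner structure" of an LSTM hidden node mentioned above a bit more concrete, here is a minimal, single-unit sketch of one LSTM step. It is for intuition only and is not RNNSharp's implementation; the weight names, scalar sizes and toy inputs are all assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One scalar LSTM step: gates decide what to write, keep and expose."""
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev)    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev)    # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev)    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev)  # candidate cell value
    c = f * c_prev + i * g        # cell state carries long-term memory
    h = o * math.tanh(c)          # hidden output for this time step
    return h, c

# Toy run over a tiny input sequence with arbitrary shared weights.
w = {k: 0.5 for k in ("wi", "ui", "wf", "uf", "wo", "uo", "wg", "ug")}
h, c = 0.0, 0.0
for x in (1.0, 0.2, -0.5):
    h, c = lstm_step(x, h, c, w)
print(h, c)
```

The separate cell state `c`, updated through the forget and input gates, is what lets the unit retain information over many time steps, which is the reason LSTM tends to outperform a simple BPTT-RNN on long sequences.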
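
The RNN-CRF paragraph above describes computing a sequence-level output from the per-position RNN scores together with label-transition scores. One standard way to recover the best label sequence from such scores is Viterbi decoding; the sketch below is a generic illustration with hypothetical names and toy numbers, not RNNSharp's actual decoder.

```python
def viterbi_decode(emissions, transitions):
    """emissions: T x L per-position label scores from the RNN.
    transitions: L x L scores, transitions[i][j] = score of label i -> label j.
    Returns the highest-scoring label sequence for the whole sentence."""
    T, L = len(emissions), len(emissions[0])
    score = list(emissions[0])   # best score of any path ending in each label
    back = []                    # backpointers for recovering the best path
    for t in range(1, T):
        new_score, ptrs = [], []
        for j in range(L):
            best_i = max(range(L), key=lambda i: score[i] + transitions[i][j])
            ptrs.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
        score, back = new_score, back + [ptrs]
    # Walk the backpointers from the best final label to rebuild the sequence.
    label = max(range(L), key=lambda j: score[j])
    path = [label]
    for ptrs in reversed(back):
        label = ptrs[label]
        path.append(label)
    return list(reversed(path))

# Toy example: 3 positions, 2 labels.
print(viterbi_decode(emissions=[[2.0, 0.5], [0.3, 1.5], [1.0, 1.2]],
                     transitions=[[0.5, -0.2], [-0.1, 0.4]]))
```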
-dropout <float>: hidden layer node dropout ratio, default is 0
-bptt <int>: the number of steps for back-propagation through time, default is 4
--layersize <int>: hidden layer size for training, default is 200
+-layersize <int>: the size of each hidden layer, default is 200 for a single layer. To use more than one hidden layer, separate the layer sizes with ','. For example, "-layersize = 200,100" means the neural network has two hidden layers; the first hidden layer size is 200 and the second is 100
-crf <0/1>: train the model with the standard RNN output (0) or the RNN-CRF output (1), default is 0
-maxiter <int>: maximum number of training iterations. 0 means no limit, default is 20
-savestep <int>: save a temporary model after every <int> sentences, default is 0
The above command line will train a bi-directional recurrent neural network with CRF output. The network has two BPTT hidden layers and one output layer; the first hidden layer size is 200 and the second hidden layer size is 100.
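
The command line referred to above is not shown in this excerpt, so the following is only a hedged sketch of how the documented options combine. The executable name, the -mode, -trainfile, -modelfile and -ftrfile options, and all file names are assumptions; only -layersize, -crf, -bptt, -dropout, -maxiter and -savestep come from the option list above, and the flags selecting the bi-directional structure and the BPTT layer type are not documented here, so they are omitted.

```
# Illustrative sketch only: executable name, -mode/-trainfile/-modelfile/-ftrfile and file names are assumed.
RNNSharpConsole.exe -mode train -trainfile train.txt -modelfile model.bin -ftrfile features.txt -layersize 200,100 -crf 1 -bptt 4 -dropout 0 -maxiter 20 -savestep 100000
```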