Why is CNN synchronous (parallel) and RNN asynchronous (serial)?
Personal homepage: xiaosongshine.github.io
1. The question
Both CNNs and RNNs use parameter-shared units that slide (traverse) over the input. Why, then, is a CNN synchronous (parallel) while an RNN is asynchronous (serial)?
2. Personal insights:
CNNs and RNNs do have similar shared-unit, sliding structures. The difference is that an RNN has memory: the units it traverses are causally connected through the hidden state. The hidden state from the previous time step participates in the computation at the current time step; in other words, the output of the first unit becomes part of the input to the second unit. The current unit therefore cannot be computed until the previous one has finished, and this wait repeats at every time step, forcing serial execution. In a CNN, by contrast, units in the same layer are equivalent, with no causal dependency on one another, so the shared kernel can simply be replicated across all positions and the whole layer evaluated as one parallel matrix computation, requiring only a single pass.
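The contrast above can be sketched in a few lines of NumPy. This is an illustrative toy, not any particular framework's implementation: the sizes `T`, `d`, `k` and the weight names `W_h`, `W_x`, `kernel` are made up for the example.

```python
import numpy as np

# Toy sequence: T time steps, feature dimension d (illustrative sizes).
T, d = 6, 4
rng = np.random.default_rng(0)
x = rng.standard_normal((T, d))

# --- RNN: serial. h at step t depends on h at step t-1, so the loop
# over time cannot be unrolled; step t must wait for step t-1.
W_h = rng.standard_normal((d, d)) * 0.1
W_x = rng.standard_normal((d, d)) * 0.1
h = np.zeros(d)
for t in range(T):                      # T sequential steps
    h = np.tanh(h @ W_h + x[t] @ W_x)  # previous hidden state feeds the next

# --- 1-D CNN: parallel. Each output position depends only on a local
# window of the *input*, never on other outputs, so all windows can be
# gathered and multiplied by the one shared kernel in a single matmul.
k = 3                                   # kernel width
kernel = rng.standard_normal((k * d,)) * 0.1
windows = np.stack([x[t:t + k].ravel() for t in range(T - k + 1)])
y = windows @ kernel                    # every output position at once
print(y.shape)                          # one array of T - k + 1 outputs
```

The loop in the RNN half is the serial bottleneck described above; the single `windows @ kernel` product in the CNN half is exactly the "copy the shared kernel across positions, then one matrix calculation" idea.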
3. Further thoughts:
Can an RNN be designed as a parallel model while preserving its memory function? Can a CNN add dependencies between units without giving up parallel computation? Criticism is welcome.