Is it possible to add a `batch_first=True` option for MultiheadAttention and the transformer modules, as is done for RNNs? I find Batch x Sequence x Embedding more understandable and intuitive when `batch_first` is true.
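For context, a minimal sketch of the current workaround, assuming the existing API where `nn.MultiheadAttention` expects inputs shaped (seq, batch, embed): the caller has to transpose batch-first tensors on the way in and out.

```python
import torch
import torch.nn as nn

# Batch-first input, the layout this request asks to support directly:
# (batch, seq, embed)
x = torch.randn(8, 10, 16)

mha = nn.MultiheadAttention(embed_dim=16, num_heads=4)

# nn.MultiheadAttention currently expects (seq, batch, embed),
# so we transpose before and after the call.
q = x.transpose(0, 1)        # (10, 8, 16)
out, attn_weights = mha(q, q, q)  # self-attention
out = out.transpose(0, 1)    # back to (8, 10, 16)
print(out.shape)
```

With a `batch_first=True` flag, both transposes would disappear, matching the ergonomics of `nn.LSTM(batch_first=True)`.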
Please use https://discuss.pytorch.org for questions. If this does not apply, please feel free to reopen this issue.
@izdeby This seems like a reasonable feature request to me, and AFAIK feature requests are fair game for GitHub issues.