I refer to Multi-Scale Context Aggregation by Dilated Convolutions.
In addition to the benefits you already mentioned, such as a larger receptive field, efficient computation, and lower memory consumption, dilated causal convolutions also have the following benefits:
I'd recommend reading the excellent WaveNet paper, which applies dilated causal convolutions to raw audio waveforms to generate speech and music, and even to recognize speech directly from raw audio.
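
To make the idea concrete, here is a minimal NumPy sketch of a 1-D dilated causal convolution and of how the receptive field grows when layers are stacked. This is only an illustration, not the WaveNet implementation: the function names `causal_dilated_conv1d` and `receptive_field` are my own, the filter weights are arbitrary, and I leave out the nonlinearities, gating, and residual connections a real network would have.

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation):
    """Compute y[t] = sum_k w[k] * x[t - k * dilation].

    Samples before t = 0 are treated as zeros, so y[t] never
    depends on inputs from the future (that is the "causal" part).
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    pad = (len(w) - 1) * dilation            # left padding only
    xp = np.concatenate([np.zeros(pad), x])
    y = np.zeros(len(x))
    for k, wk in enumerate(w):
        # Tap k looks k * dilation steps into the past.
        start = pad - k * dilation
        y += wk * xp[start:start + len(x)]
    return y

def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of dilated layers sharing one kernel size."""
    return 1 + (kernel_size - 1) * sum(dilations)

# Illustrative stack: kernel size 2, dilation doubling at each layer.
dilations = [1, 2, 4, 8]
signal = np.zeros(20)
signal[0] = 1.0                              # unit impulse at t = 0
out = signal
for d in dilations:
    out = causal_dilated_conv1d(out, [1.0, 1.0], d)

print(receptive_field(2, dilations))         # 16
print(np.flatnonzero(out))                   # indices 0 through 15
```

With only four layers of kernel size 2, the impulse at t = 0 influences 16 output samples, so each output sees 16 input samples. Undilated layers with the same kernel size would need 15 layers to cover the same span.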
I hope you find this answer helpful.