It has a decent API, it covers a broad range of topics, it's predominantly short-form text content, it's human moderated, it's categorized, and there's even up/down voting to help with reinforcement learning.
It's basically easy mode for ML engineers, which is why Reddit now charges serious...