First of all, I would decide what kind of chatbot I want to build.
A general-purpose chatbot would chat about any topic (weather, politics, sport, …). In this case you would need a lot of data/dialogues on each subject. For example, movie dialogues, small talk, and discussions between several people would be good sources of training data.
Another possibility is to build a domain-specific chatbot, which would chat with you only about certain topics. For example, you could build an FAQ chatbot or a question-answering system on a specific subject. A chatbot could, for instance, suggest possible causes of a disease based on previous dialogues with patients. Or you could build a simple chatbot that answers your questions on health, with its knowledge based on the Health Stack Exchange database (the dump is free).
Of course, you would have to think about how to integrate all of this knowledge into RASA. Namely, you could generate questions and answers, or whole dialogues, automatically from the raw data (the Stack Exchange raw data, for example). Most posts on SE have a title, a question body, and an accepted or top-voted answer. You could use these to generate stories.
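As a rough sketch of that idea, the snippet below pairs each question title with its accepted answer, falling back to the highest-scored answer. It assumes the layout of `Posts.xml` in the public Stack Exchange dump (`row` elements with `Id`, `PostTypeId`, `ParentId`, `AcceptedAnswerId`, `Score`, `Title` and `Body` attributes). The function names `clean_body` and `extract_qa_pairs` are mine, and converting the pairs into RASA training data is left out because it depends on your RASA version.

```python
import re
import xml.etree.ElementTree as ET
from html import unescape


def clean_body(body):
    """Strip HTML tags and entities from a post body (crude, regex-based)."""
    without_tags = re.sub(r"<[^>]+>", " ", body)
    return " ".join(unescape(without_tags).split())


def extract_qa_pairs(xml_text):
    """Return (question title, best answer text) pairs from Posts.xml content."""
    root = ET.fromstring(xml_text)
    questions = {}  # question id -> row element
    answers = {}    # question id -> list of answer row elements
    for row in root.iter("row"):
        post_type = row.get("PostTypeId")
        if post_type == "1":      # 1 = question
            questions[row.get("Id")] = row
        elif post_type == "2":    # 2 = answer
            answers.setdefault(row.get("ParentId"), []).append(row)

    pairs = []
    for question_id, question in questions.items():
        candidates = answers.get(question_id, [])
        if not candidates:
            continue  # unanswered question, nothing to learn from
        accepted_id = question.get("AcceptedAnswerId")
        best = next((a for a in candidates if a.get("Id") == accepted_id), None)
        if best is None:
            # No accepted answer: fall back to the highest-scored one.
            best = max(candidates, key=lambda a: int(a.get("Score", "0")))
        pairs.append((question.get("Title"), clean_body(best.get("Body", ""))))
    return pairs
```

For a real dump you would read the file and pass its content in, e.g. `extract_qa_pairs(open("Posts.xml", encoding="utf-8").read())`. For large sites a streaming parser such as `ET.iterparse` would be a better fit than loading everything at once.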
Personally, if I wanted to build a domain-specific chatbot, I would train it with small-talk data such as greetings, goodbyes, "How can I help?"s, "Sorry, I don’t get you"s… Maybe you could add joke and storytelling capabilities to your chatbot for fun. But for the domain-specific part, I would create my own intent/entity detection logic and my own similarity measure between sentences. Maybe I’d have to train neural networks with attention, etc. In other words, your domain may require a custom approach to reach high accuracy in answering domain-specific questions.
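To make "similarity measure between sentences" a bit more concrete, here is a minimal baseline: cosine similarity over bag-of-words counts, used to find the stored question closest to the user's input. This is only an illustration and not what a production system would use (embeddings or an attention model would likely do better). The names `tokenize`, `cosine_similarity`, `best_match` and the `threshold` value are all my own assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the word-count vectors of two sentences."""
    counts_a, counts_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence is similar to nothing
    return dot / (norm_a * norm_b)


def best_match(query, known_questions, threshold=0.3):
    """Return the most similar known question, or None if nothing is close enough."""
    scored = [(cosine_similarity(query, q), q) for q in known_questions]
    score, question = max(scored, default=(0.0, None))
    return question if score >= threshold else None
```

When `best_match` returns `None`, the bot can fall back to a "Sorry, I don’t get you" response instead of guessing.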
In my opinion, RASA is a good platform for integrating your custom solution. One useful feature is that it lets you write your own custom actions.
This may also be helpful.