Catalogue of datasets annotated for Hate Speech



On Hatespeechdata.com we have catalogued a large number of readily available datasets annotated for hate speech, online abuse, and offensive language. They may be useful, for example, for training a natural language processing system to detect such language. The page currently lists 50+ datasets in 15 languages, including Arabic, Danish, English, French, German, Hindi-English, Indonesian and Turkish.