Rachael Tatman - 5 mistakes you'll probably make with language data (and how to recover)
5 mistakes you'll probably make with language data (and how to recover)
by Rachael Tatman

Visit https://rstats.ai/nyr/ to learn more.

Abstract: Language is fundamentally different from other types of data, and it's inevitable that you'll run into some language-specific issues. This talk will cover some of the most common types of errors I've seen data analysts and machine learning engineers make with language data, from ignoring the differences between text genres to treating text as written speech to assuming that all languages work like English. We'll also talk about ways to avoid these common mistakes (and recover gracefully if you've already made them).

Twitter: https://twitter.com/rctatman

Presented at the 2021 New York R Conference (September 9, 2021)