Text Mining With R

    tidy_austen <- austen_books() %>%
      unnest_tokens(word, text)  # one word per row
    tidy_austen

Stop words (the, and, to, of) carry little meaning. tidytext provides get_stopwords().
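To make the tokenizing and stop-word steps concrete, here is a minimal sketch on a made-up two-line corpus. The `toy` tibble and the `toy_words` / `toy_clean` names are hypothetical and not from the original text; the sketch assumes dplyr and tidytext are installed, and that the stopwords package is available, since get_stopwords() relies on it.

```r
library(dplyr)
library(tidytext)

# Hypothetical two-line corpus, for illustration only
toy <- tibble(
  line = 1:2,
  text = c("The cat sat on the mat", "A dog and the cat")
)

# unnest_tokens() lowercases the text and returns one word per row
toy_words <- toy %>%
  unnest_tokens(word, text)

# get_stopwords() returns a tibble with the columns `word` and `lexicon`;
# anti_join() keeps only the rows whose word is NOT in that list
toy_clean <- toy_words %>%
  anti_join(get_stopwords(), by = "word")

toy_clean
```

Joining on the `word` column is why the token column is named `word` in unnest_tokens(): both tables then share the same key.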

with a bar chart:

    data(stop_words)
    cleaned_austen <- tidy_austen %>%
      anti_join(stop_words, by = "word")

Count the most common words:

    word_counts <- cleaned_austen %>%
      count(word, sort = TRUE)
    word_counts %>% head(10)
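The text mentions a bar chart but does not show its plotting code, so the following is only one possible sketch, assuming ggplot2 is installed alongside dplyr, tidytext and janeaustenr. The `top_words` and `p` names and the axis label are illustrative choices, not taken from the original.

```r
library(dplyr)
library(tidytext)
library(janeaustenr)
library(ggplot2)

data(stop_words)

# Rebuild the word counts so this block runs on its own
word_counts <- austen_books() %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  count(word, sort = TRUE)

# Keep the ten most frequent words (word_counts is already sorted)
top_words <- word_counts %>%
  head(10)

# Horizontal bars, ordered by count so the most frequent word is on top
p <- ggplot(top_words, aes(x = n, y = reorder(word, n))) +
  geom_col() +
  labs(x = "Occurrences", y = NULL)

p
```

Putting the words on the y-axis keeps the labels readable without rotating them.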