This post is the first in a short series in which I analyze the editorial shift of Instituto Mises Brasil using a bit of Natural Language Processing (NLP) and Latent Dirichlet Allocation (LDA).
Disclaimer: some of the information in this blog post might be incorrect, and because FastText is corrected and adjusted at a very fast pace, some parts of this post may be out of date very soon too.
This repository, curated by Sebastian Ruder, is a great source of techniques and benchmarks on the state of the art in NLP research.
In one experiment using a very large text database, training with train_supervised() in FastText left me with a serialized model of more than 1 GB.
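To see how large a serialized model actually is on disk, a small helper like the one below can be used. This is only a minimal sketch using the Python standard library: the function name `model_size_gb` and the file name `model.bin` are hypothetical, and it assumes the model was already written to disk (for example with `model.save_model("model.bin")` in the FastText Python bindings).

```python
import os


def model_size_gb(path):
    """Return the size of the file at `path` in gigabytes (1 GB = 1024**3 bytes)."""
    return os.path.getsize(path) / (1024 ** 3)


if __name__ == "__main__":
    # "model.bin" is a hypothetical path to a saved model file.
    path = "model.bin"
    if os.path.exists(path):
        print(f"{path}: {model_size_gb(path):.2f} GB")
    else:
        print(f"{path} not found; train and save a model first")
```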
A few days ago I wrote about FastText, and one thing that is not clear in the docs is how to make experiments reproducible in a deterministic way.
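One commonly reported approach, shown here only as a hedged sketch and not as documented behavior, is to train with a single thread, on the assumption that the run-to-run variation comes from FastText's multi-threaded training. The helper names `deterministic_train_kwargs` and `train_reproducibly` are hypothetical, and `train.txt` stands in for a labeled training file; `fasttext.train_supervised` and its `thread` parameter come from the FastText Python bindings.

```python
def deterministic_train_kwargs(**overrides):
    """Build training keyword arguments that always force single-threaded training.

    Assumption: multi-threaded training is the source of non-determinism,
    so `thread` is pinned to 1 even if the caller passes another value.
    """
    kwargs = dict(overrides)
    kwargs["thread"] = 1
    return kwargs


def train_reproducibly(train_path, **overrides):
    """Train a supervised model with the single-thread setting applied."""
    # Imported lazily so the helper above works without fasttext installed.
    import fasttext

    return fasttext.train_supervised(
        input=train_path, **deterministic_train_kwargs(**overrides)
    )


if __name__ == "__main__":
    # Only prints the arguments; call train_reproducibly("train.txt", epoch=5)
    # with a real labeled file (hypothetical name here) to actually train.
    print(deterministic_train_kwargs(epoch=5, thread=8))
```

Training twice with the same arguments and comparing predictions on a held-out file is a simple way to check whether this assumption holds for a given FastText version. The trade-off is speed: a single thread trains much more slowly on large corpora.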