Functions with Multiprocessing for Text Processing
2020 Apr 01
Those who followed the post "A small journey in the valley of Natural Language Processing and Text Pre-Processing for German language" (or its English version) saw some of the challenges of modeling a text classifier for the German language.
However, one thing that saved me during the preprocessing phase was using multiprocessing to parallelize the work on the text column, which cut an incredible amount of time (as a reminder: I had over 1 million text records, with an average of 250 words per record and a standard deviation of 700, all processed with an internal library).
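The original snippet is not reproduced here, so what follows is only a minimal sketch of the idea, assuming the standard library's `multiprocessing.Pool`. The function `preprocess_text` is a hypothetical stand-in for the internal library mentioned above, and the names `parallel_preprocess`, `n_workers` and the default `chunksize` are illustrative assumptions, not the actual code I used.

```python
import multiprocessing as mp
import re


def preprocess_text(text):
    """Hypothetical stand-in for the real per-record preprocessing step."""
    text = text.lower()
    # Replace anything that is not a word character or whitespace with a space.
    # In Python 3, \w is Unicode-aware, so German umlauts and ß are kept.
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse runs of whitespace into single spaces and trim the ends.
    return " ".join(text.split())


def parallel_preprocess(texts, n_workers=None, chunksize=1000):
    """Apply preprocess_text to every record using a pool of worker processes."""
    # n_workers=None lets Pool use the number of CPUs reported by the OS.
    with mp.Pool(processes=n_workers) as pool:
        # pool.map preserves the input order; chunksize batches records per
        # task to reduce inter-process communication overhead.
        return pool.map(preprocess_text, texts, chunksize=chunksize)


if __name__ == "__main__":
    # The guard is needed on platforms that start workers by re-importing
    # this module (for example Windows and macOS).
    records = ["Das ist ein Test!", "  Noch   ein Beispiel, bitte.  "]
    print(parallel_preprocess(records, n_workers=2, chunksize=1))
```

With a pandas DataFrame, the text column could be passed as a plain list (for example via `Series.tolist()`) and the result assigned back as a new column. Whether this pays off depends on how expensive the per-record function is compared with the cost of shipping the strings between processes.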
That’s it: simple and smooth.