Magidow/Belinkov on #Arabic at #FuturePhilologies:
- Average lifespan of a word in the Shamela corpus is almost 1200 years.
Work in progress:
- A hierarchical clustering algorithm comparing word embeddings in the Shamela corpus finds three main periods, though there may be a genre effect.
- After culling text reuse/quotation, more subperiods emerge, but precise distinctions are so far elusive.
- But there does seem to be a pre-classical Arabic period!
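
For readers unfamiliar with the technique, here is a rough sketch of how hierarchical clustering over embeddings can group time slices into periods. It is not the presenters' code: the `slices` vectors are invented toy data (not Shamela figures), and the choice of cosine distance with average linkage, like the names `cluster_periods` and `average_linkage`, is an assumption made for illustration.

```python
import math

# Toy data (hypothetical): one embedding-like vector per time slice, keyed by year.
slices = {
    700: [1.0, 0.1, 0.0],
    800: [0.9, 0.2, 0.0],
    1100: [0.1, 1.0, 0.1],
    1200: [0.0, 0.9, 0.2],
    1800: [0.0, 0.1, 1.0],
    1900: [0.1, 0.0, 0.9],
}


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def average_linkage(c1, c2, vectors):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster_periods(vectors, n_clusters):
    """Agglomerative clustering: merge the two closest clusters until n_clusters remain."""
    clusters = [[label] for label in vectors]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], vectors)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


print(cluster_periods(slices, 3))
```

On this toy input the slices fall into three groups of adjacent centuries. Real corpora are messier, and a genre skew in any one slice could pull it toward the wrong group, which is the caveat noted above.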