Part of the discussion will revolve around this paper:
https://culturalanalytics.org/article/17212-can-gpt-3-pass-a-writer-s-turing-test
From the GPT-3 paper https://papers.nips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
"In collecting training data for GPT-3, we used the unfiltered distribution of languages reflected in internet text datasets (primarily Common Crawl)"
For those who are interested in why web archives matter, this is very significant.
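To make the web-archive connection concrete, here is a minimal sketch (mine, not from the talk) of how one can look a page up in the Common Crawl index, the same corpus the GPT-3 quote refers to. It assumes the public CDX index API at index.commoncrawl.org; the crawl ID below is a placeholder for whichever crawl you want to query.

```python
import json
import urllib.parse
import urllib.request

# Placeholder crawl ID -- see https://index.commoncrawl.org for current crawls.
CRAWL = "CC-MAIN-2020-05"
page = "https://example.com/"

query = (
    f"https://index.commoncrawl.org/{CRAWL}-index"
    f"?url={urllib.parse.quote(page, safe='')}&output=json"
)

with urllib.request.urlopen(query) as resp:
    for line in resp:
        record = json.loads(line)
        # Each record points into a WARC file in the archive:
        # the capture timestamp, the URL, and where the raw bytes live.
        print(record["timestamp"], record["url"], record["filename"])
```

Each hit is a capture of the page held in a web archive, which is exactly the raw material a model trained on Common Crawl ends up seeing.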
@edsu "trained on their own output"; that spirals inward. amiright? </part-sarcasm>
@edsu This won't be (isn't) a problem for models that have become self-aware.
OpenAI started in 2015 as a non-profit to help ensure that there is viable open AI tech.
But in 2019, needing more compute power and staff, OpenAI restructured as a "capped-profit" company (OpenAI LP) and took a $1 billion investment from Microsoft.
Here is a screen cap of the pricing model for their closed API (shared as part of this talk).
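For context, a call against that closed API looked roughly like this around the time of the talk; a sketch using the v0.x openai Python client (the prompt and parameters are my own placeholders), with billing per token consumed:

```python
import openai

# Placeholder key -- access required an invite and a paid account.
openai.api_key = "sk-..."

# "davinci" was the largest (and most expensive per token) GPT-3 engine;
# you paid for prompt tokens plus completion tokens.
response = openai.Completion.create(
    engine="davinci",
    prompt="Can GPT-3 pass a writer's Turing test?",
    max_tokens=50,
)
print(response.choices[0].text)
```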