I spent some time this weekend putting together a small Python program that drives a browser to collect a citation network from Google Scholar, and writes it out as a Gephi file:

It was a little hairy because of all the CAPTCHAs that Google throws at you while the collection is running. But running the browser non-headless means a person can step in to identify cars and signs when necessary, after which the program resumes.
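
To make the two ideas concrete, here is a minimal sketch, not the program itself. It assumes the page fetch and the CAPTCHA check are supplied as plain callables, and that the output is GEXF, one of the formats Gephi opens. The names `fetch_with_human_help`, `looks_like_captcha`, `ask_human`, and `citations_to_gexf` are all hypothetical, and the GEXF 1.3 namespace string is my best understanding of the spec.

```python
import xml.etree.ElementTree as ET


def fetch_with_human_help(fetch, looks_like_captcha, ask_human=input):
    """Call fetch() until the result is not a CAPTCHA page.

    All three arguments are hypothetical callables: fetch() returns a page,
    looks_like_captcha(page) returns True when a person needs to step in,
    and ask_human(prompt) blocks until they are done.
    """
    page = fetch()
    while looks_like_captcha(page):
        ask_human("CAPTCHA detected; solve it in the browser, then press Enter: ")
        page = fetch()  # resume by retrying the same request
    return page


def citations_to_gexf(citations):
    """Serialize (citing_title, cited_title) pairs as a directed GEXF graph."""
    # Namespace assumed from the GEXF 1.3 specification.
    gexf = ET.Element("gexf", xmlns="http://gexf.net/1.3", version="1.3")
    graph = ET.SubElement(gexf, "graph", defaultedgetype="directed", mode="static")
    nodes = ET.SubElement(graph, "nodes")
    edges = ET.SubElement(graph, "edges")
    ids = {}

    def node_id(title):
        # Create each node once, the first time its title is seen.
        if title not in ids:
            ids[title] = str(len(ids))
            ET.SubElement(nodes, "node", id=ids[title], label=title)
        return ids[title]

    for i, (citing, cited) in enumerate(citations):
        source = node_id(citing)
        target = node_id(cited)
        ET.SubElement(edges, "edge", id=str(i), source=source, target=target)

    return ET.tostring(gexf, encoding="unicode")
```

Writing the returned string to a file ending in `.gexf` should give something Gephi can open. Passing `ask_human` as a parameter keeps the pause testable without a real browser or keyboard.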
