We finally have the whole #PUDL data pipeline running in @dagster and the visualizations make it very clear where we need to parallelize stuff. #datadon
Anybody else running particularly large or complex #OpenData / #OpenSource DAGs with these tools? We'd love to compare notes.
It would also be cool if there were some way to expose all this information to our users in a read-only form, so they can see what's happening with the nightly builds too.
@ZaneSelvans You might want to have a look at #Kedro too: https://demo.kedro.org/ (very soon with its own Mastodon account, I hope)
@astrojuanlu It looks like they're focusing on ML workflows. Do folks commonly use it for plain old data processing too? Though I doubt we're going to switch at this point now that we've finally got this all set up!
@ZaneSelvans Disclaimer: I'm the Developer Advocate for Kedro.
It's funny you say that. To me it means we have to adjust our messaging :) It's for plain and simple data processing pipelines too. I understand that you're not in the mood to migrate once more, but if you ever give it a try, let me know!