Web Archives in Repositories
inkdroid.org/2022/05/24/wacz/

I’m fortunate to be back at code4lib again this year. It gives
me hope to see this conference working in the same spirit as it started
out with, albeit with much honed mechanics. It is also refreshing to be
talking about someone else’s work, in this case the work of the
Webrecorder project, rather than my own.

One of the dire pitfalls of PhD research, and academia more generally,
is the tendency to focus so much on your own int

@edsu super interesting, I'll have to check this out!! I was talking to Ilya a while back about a possible WACZ previewer in InvenioRDM. There is so much potential to serve web archived content in repos!!

@VickyRampin thank u! If you'd like to chat some time about possible approaches please get in touch: ehs@pobox.com

@edsu Should the label "Archived website" really contain "Archived" in your Monolithic web architecture diagram? Heritrix's typical target is live (not-yet-archived) websites, not those that are already archived.

@machawk1 I guess it is 'to be archived' -- would that fit your brain better?

@edsu It is the target (in the diagram) that is of concern, i.e., Heritrix communicating with an "archived website". Heritrix typically just communicates with a live website.

@machawk1 ok, it is updated now. It's funny because I had it that way in my original sketches for those diagrams. First thought best thought I suppose.
