Quick question for those more in the know: Have these events disrupted IA's ability to archive pages? I ask because I was recently talking with a security guy about a novel malware that used a hacked webpage for command injection. One possible motive that came to mind: if archiving were disrupted, it could help cover the tracks of similar malware. Inject code, perform the malicious activity, revert, and then there's more time before the control code is discovered.
The majority of Reddit discourse on this is wild. The crowd there is going HARD to try and paint IA in the most negative light possible.
I know we don't like Reddit here, but for example: https://www.reddit.com/r/DataHoarder/comments/1g7w0rh/internet_archive_issues_continue_this_time_with/
It's almost as if the "hackers" and/or copyright holders are running that conversation.
Since it's Reddit, I would guess copyright sockpuppets are steering the narrative to help damage them further.
Hope they had a backup
We need IA full mirrors. This is too critical to leave to this one company.
Knowing the folks at IA I'm sure they would love a backup. They would love a community. I'm sure they don't want to be the only ones doing this. But dang, they've got like 99 petabytes of data. I don't know about you, but my NAS doesn't have that lying around...
I wonder if someone can come up with some kind of distributed storage that isn't insanely slow. Kinda like a CDN but on personal devices. I'm thinking like SETI@HOME did with distributed compute.
Edit: this is kinda like torrents but where the contents are changing frequently.
Something like torrents. Split the whole thing into small 5 GB torrents.
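For a rough sense of scale, here's a back-of-the-envelope sketch of how many torrents that would be. It assumes the ~99 petabyte figure mentioned above and decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes); `torrent_count` is just a made-up helper name, not part of any torrent tool.

```python
# Back-of-the-envelope: how many fixed-size torrents cover the whole archive?
# Assumptions: ~99 PB total (figure quoted in this thread), decimal units.
PB = 10**15
GB = 10**9


def torrent_count(total_bytes: int, torrent_bytes: int) -> int:
    """Number of torrents needed, rounding up so a partial last chunk counts."""
    return -(-total_bytes // torrent_bytes)  # integer ceiling division


count = torrent_count(99 * PB, 5 * GB)
print(f"{count:,} torrents")  # 19,800,000 torrents
```

So roughly 20 million torrents before any redundancy, and each one would still need enough seeders to stay alive.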
You should look up IPFS! It's trying to be kinda like that.
It'll always be slower than a CDN, though, partly because CDNs pay big money to be that fast, but also anything p2p is always going to have some overhead while the swarm tries to find something. It's just a more complicated problem that necessarily has more layers.
But that doesn't mean it's not possible for it to be "fast enough"
And there's a promising new IPFS-like system called Iroh, which should have a lot less overhead and in general just be faster than IPFS. It's not quite ready to just switch to right now, but an enterprising individual could probably make something useful with it without too much work (i.e. months, not years).
I'm using it for a distributed application project right now, but the intent is a bit different than the IA use-case.
Interesting, thanks
That is an insane amount of storage. How much does it grow every year and is it stable growth or accelerating?
Apparently, BlackMeta is behind the DDoS attack on the Internet Archive. They are supposedly pro-Palestine hacktivists, though their X account also has some Russian written in it.
(Edit) Also, Internet Archive has been banned in China since 2012 and in Russia since 2015.
Definitely not their genocidal neighbors terrorizing as usual. /s
Yes, they are a "pro-Palestine" Russian-based hacker group... Nothing funny going on here, no sir
Reading that whole page, holy shit, it's like a twelve-year-old wrote it trying to sound very smart while also attempting to divert blame and falsify their agenda. If this ain't a Russian psyop, nothing is.
Buddy. I don't care what they say. It's plainly obvious they are lying. They are just brown hat hackers
So if white hat is ethical hackers, black hat is unethical, and red hat is Linux, then obviously brown hat is shitty!