Internet Archive is continuing to face DDoS attacks after several days, says “this attack has been sustained, impactful, targeted, adaptive, and importantly, mean”

ForgottenFlux@lemmy.world · edit-2 1 year ago

Bluefalcon@discuss.tchncs.de · 1 year ago

FBI? CIA? Or just some shit company pissed? Taking all bets.

gravitas_deficiency@sh.itjust.works · 1 year ago

A quick search indicates that they’ve archived ~100PB of data.

Now I’m trying to come up with a way to archive the internet archive in a peer-to-peer/federated fashion while maintaining fidelity as much as possible…

thrax@lemmy.world · 1 year ago

Can DDOS attacks actually erase/corrupt stored data though? There’s no way they’re running all of this on a single server, with hundreds of PB’s worth of storage, right?

pythonoob@programming.dev · 1 year ago

Not technically by itself as far as I know

SendMePhotos@lemmy.world · 1 year ago

From what I’ve learned, it is possible to create a vulnerability within the system if a DDoS attack overloads it and causes a reset or fault. At that point, it’s possible to inject code and initiate a breach or takeover.

I can’t find the documentation on it so… Take it with a grain of salt. I thought I learned about it in college. Unsure.

capital@lemmy.world · 1 year ago

No. It affects availability. Not integrity or confidentiality.

viking@infosec.pub · 1 year ago

DDoS attacks block connections to the servers; they don’t actually harm the data itself. You could probably overload a server to the point of it shutting down, which might affect data in transit, but data at rest usually wouldn’t be harmed in any way, unless through some freak accident a server crash rendered a drive unusable. But even then, servers are usually fully redundant and have RAID systems in place that mirror the data, so it’s kind of a dual redundancy. Plus actual backups on top of that, though with that amount of data they might have a priority system in place and not everything is fully backed up.

uis@lemm.ee · 1 year ago

Torrent?

gravitas_deficiency@sh.itjust.works · 1 year ago

It’d be a lot more complicated than that, I think, if one wanted to effectively be able to address it like a file system, as well as holistically verify the integrity of the data and prevent unintentional and unwanted tampering.

Melvin_Ferd@lemmy.world · 1 year ago

Block chain

uis@lemm.ee · 1 year ago

Overkill chain

uis@lemm.ee · 1 year ago

as well as holistically verify the integrity of the data and prevent unintentional and unwanted tampering

Torrents. Their hashes are derived from hashes of chunks. Just verify chunks.
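A rough sketch of the chunk-verification idea, for illustration only. It assumes BitTorrent v1 style SHA-1 piece hashes; the function names and the tiny piece size are made up for the demo, and `toy_root_digest` is a simplified stand-in for the real info-hash (which is computed over the bencoded info dictionary, not just the concatenated piece hashes).

```python
import hashlib

# Tiny piece size so the demo is readable; real torrents typically use
# pieces from a few hundred KiB to several MiB.
PIECE_SIZE = 4


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Split data into fixed-size pieces and return each piece's SHA-1 digest."""
    return [
        hashlib.sha1(data[i:i + piece_size]).digest()
        for i in range(0, len(data), piece_size)
    ]


def verify_piece(index: int, piece: bytes, expected: list[bytes]) -> bool:
    """Check one downloaded piece against its published hash."""
    return hashlib.sha1(piece).digest() == expected[index]


def toy_root_digest(hashes: list[bytes]) -> str:
    """Simplified stand-in for the info-hash (hypothetical helper).

    Hashes the concatenated piece hashes. A real client hashes the bencoded
    info dictionary, which contains these piece hashes plus names and lengths.
    """
    return hashlib.sha1(b"".join(hashes)).hexdigest()


if __name__ == "__main__":
    published = piece_hashes(b"abcdefghij")
    print(verify_piece(1, b"efgh", published))  # intact piece
    print(verify_piece(1, b"efgX", published))  # tampered piece
```

The point is that a node holding only piece 1 can check it against the published hash list without ever seeing the rest of the data.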

if one wanted to effectively be able to address it like a file system

https://github.com/johang/btfs

gravitas_deficiency@sh.itjust.works · 1 year ago

Sick. TIL!

vithigar@lemmy.ca · edit-2 1 year ago

That wouldn’t distribute the load of storing it though. Anyone on the torrent would need to set aside 100PBs of storage for it, which is clearly never going to happen.

You’d want a federated (or otherwise distributed) storage scheme where thousands of people could each contribute a smaller portion of storage, while also being accessible to any federated client. 100,000 clients each contributing 1TB of storage would be enough to get you one copy of the full data set with no redundancy. Ideally you’d have more than that so that a single node going down doesn’t mean permanent data loss.

hellofriend@lemmy.world · 1 year ago

Not sure you’d be able to find 100k people to host a 1TB server though. Plus, redundancy would be better anyway since it would provide more download avenues in case some node is slow or has gone down.

vithigar@lemmy.ca · 1 year ago

Yes, it’s a big ask, because it’s a lot of data. Any distributed solution will require either a large number of people or a huge commitment of storage capacity. Both 100,000 people and 1TB per node is a lot to ask for, but that’s basically the minimum viable level for that much data. Ten million people each committing 50GB would be great, and offer sufficient redundancy that you could lose 80% of the nodes before losing data, but that’s not a realistic number to expect to participate.
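The back-of-the-envelope numbers above can be checked with a few lines. This is only a sketch: it assumes decimal units and the roughly 100PB figure quoted in the thread, and the "tolerable loss" figure is a best case that assumes the data is spread ideally across nodes (for example with erasure coding), which a real system would not fully achieve.

```python
PB = 10**15
TB = 10**12
GB = 10**9

ARCHIVE_BYTES = 100 * PB  # approximate size quoted in the thread


def full_copies(nodes: int, bytes_per_node: int,
                archive_bytes: int = ARCHIVE_BYTES) -> float:
    """How many complete copies of the archive the pooled storage could hold."""
    return nodes * bytes_per_node / archive_bytes


def tolerable_node_loss(copies: float) -> float:
    """Best-case fraction of nodes that can vanish while the remaining
    capacity still adds up to one full copy (assumes ideal placement)."""
    if copies <= 1:
        return 0.0
    return 1 - 1 / copies


if __name__ == "__main__":
    for nodes, per_node in [(100_000, 1 * TB), (10_000_000, 50 * GB)]:
        copies = full_copies(nodes, per_node)
        print(nodes, "nodes:", copies, "copies,",
              f"{tolerable_node_loss(copies):.0%} of nodes can be lost at best")
```

With the thread's figures, 100,000 nodes at 1TB works out to a single copy with no slack, and ten million nodes at 50GB to five copies.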

uis@lemm.ee · 1 year ago

That wouldn’t distribute the load of storing it though. Anyone on the torrent would need to set aside 100PBs of storage for it, which is clearly never going to happen.

Torrents are designed for incomplete storage of data. You can store and verify a few chunks without any problem.

You’d want a federated (or otherwise distributed) storage scheme where thousands of people could each contribute a smaller portion of storage, while also being accessible to any federated client.

Torrents. You may not have the entirety of the data, but you can request what you need from the swarm. The only limitation is that you need to know which chunk the data you need is in.
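Finding the right chunk is mostly integer division. A minimal sketch, assuming the BitTorrent v1 layout where the files of a multi-file torrent are treated as one concatenated byte stream cut into fixed-size pieces; the helper names and the example sizes are hypothetical.

```python
from itertools import accumulate


def file_offsets(file_lengths: list[int]) -> list[int]:
    """Start offset of each file within the concatenated byte stream."""
    return [0] + list(accumulate(file_lengths))[:-1]


def pieces_for_range(offset: int, length: int, piece_size: int) -> range:
    """Indices of the pieces that cover bytes [offset, offset + length)."""
    if length <= 0:
        return range(0)
    first = offset // piece_size
    last = (offset + length - 1) // piece_size
    return range(first, last + 1)


if __name__ == "__main__":
    lengths = [10, 25, 7]   # three made-up files, sizes in bytes
    piece_size = 8          # unrealistically small, for readability
    for start, size in zip(file_offsets(lengths), lengths):
        print(list(pieces_for_range(start, size, piece_size)))
```

Note that pieces can straddle file boundaries, so fetching one file may mean fetching a piece that also contains bytes of its neighbour.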

Ideally you’d have more than that so that a single node going down doesn’t mean permanent data loss.

True.

vithigar@lemmy.ca · 1 year ago

True. Until you responded I actually completely forgot that you can selectively download torrents. Would be nice to not have to manually manage that at the user level though.

Some kind of bespoke torrent client that managed it under the hood could probably work without having to invent your own peer-to-peer protocol for it. I wonder how long it would take to compute the torrent hash values for 100PB of data? :D
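For a rough feel of the hashing question, here is a small estimator. The throughput is a pure assumption (1 GB/s per worker is a placeholder, not a measurement), as are the 16 MiB piece size and the worker count; storage read speed would likely be the real bottleneck.

```python
PB = 10**15
SECONDS_PER_DAY = 86_400
SHA1_DIGEST_BYTES = 20  # size of one SHA-1 piece hash


def hashing_time_days(total_bytes: int, bytes_per_second_per_worker: float,
                      workers: int = 1) -> float:
    """Days needed to hash everything once at the assumed throughput."""
    seconds = total_bytes / (bytes_per_second_per_worker * workers)
    return seconds / SECONDS_PER_DAY


def piece_count(total_bytes: int, piece_size: int) -> int:
    """Number of pieces, rounding the last partial piece up."""
    return -(-total_bytes // piece_size)


if __name__ == "__main__":
    total = 100 * PB
    assumed_rate = 10**9  # 1 GB/s per worker: placeholder assumption
    print("1 worker, days:", hashing_time_days(total, assumed_rate))
    print("1000 workers, days:", hashing_time_days(total, assumed_rate, 1000))
    pieces = piece_count(total, 16 * 2**20)
    print("pieces:", pieces, "hash list bytes:", pieces * SHA1_DIGEST_BYTES)
```

Since pieces hash independently, the work parallelizes across however many machines already hold the data.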

commie@lemmy.dbzer0.com · 1 year ago

ia already serves all their uploads as torrents

Nine@lemmy.world · 1 year ago

That’s what IPFS is for. It’s ideal for that kind of stuff

Melatonin@lemmy.dbzer0.com · 1 year ago

That last sentence though…

**“The cyberattacks share the timeline with the legal battle Internet Archive is facing from US book publishers, claiming copyright infringement and seeking combined damages of hundreds of millions of dollars from all libraries.”**

pyre@lemmy.world · 1 year ago

i wonder why print is dead

warmaster@lemmy.world · 1 year ago

How are print books dead?

https://www.statista.com/chart/24709/e-book-and-printed-book-penetration/

And that’s only units. In terms of revenue, ebooks are still pocket change in comparison.

pyre@lemmy.world · 1 year ago

i wasn’t speaking in comparison to ebooks. ebooks suck in every way imaginable.

warmaster@lemmy.world · 1 year ago

What other long-form text format has beaten print books?

pyre@lemmy.world · 1 year ago

why are you coming up with these categories? “print is dead” doesn’t mean “because there’s print 2.0 now”

- radio is dead
- excuse me, but internet radio is nothing compared to am stations
- yeah, obviously people who don’t listen to radio don’t want to listen to radio with extra steps
- what other forms of radio has beaten radio?

what are you even

warmaster@lemmy.world · 1 year ago

I am trying to understand the argument behind your statement. I mean, there are more books being published than ever and there are more readers than ever. So I fail to see how books are dead. That’s why I am asking these questions.

Aux@lemmy.world · 1 year ago

The argument is that no one reads books anymore. Most media consumed today is in modern video and audio formats like YouTube and podcasts. You shouldn’t compare paper books to ebooks, you should compare them to views on YouTube.

resetbypeer@lemmy.world · 1 year ago

You gotta be a special kind of sad to DDoS archive.org…

interdimensionalmeme@lemmy.ml · 1 year ago

Probably statists or corpos, we must purge them off this planet.

pewgar_seemsimandroid@lemmy.blahaj.zone · 1 year ago

if you ddos the internet archive, doxxing you is moral.

kn0wmad1c@programming.dev · 1 year ago

I bet the attack is coming from Big Hollywood

fine_sandy_bottom@discuss.tchncs.de · 1 year ago

Why though?

I mean yes they’re assholes but what are they seeking to achieve?

A few days denial of service won’t do anything.

kn0wmad1c@programming.dev · 1 year ago

Whoops! I dropped my /s

HootinNHollerin@lemmy.world · 1 year ago

Wonder if it has anything to do with that Google leak

MalReynolds@slrpnk.net · 1 year ago

You can go ahead and say ‘Evil’.

TerraRoot@sh.itjust.works · 1 year ago

…or paid well.

bungalowtill@lemmy.dbzer0.com · 1 year ago

Stop it you fucking bastards!

fisherstudio@infosec.pub · 1 year ago

Terrible.

Juja@lemmy.world · 1 year ago

Can someone eli5 to me why it’s hard to track down these dipshits? Even if it’s a distributed attack, picking a single IP, doing a lookup for the domain name, and checking with the registrar might actually reveal their identity, right? Of course, I’m guessing law enforcement needs to be involved to force registrars to give up that info if it’s not publicly available? Are there laws that say a DDoS is illegal?

Aux@lemmy.world · 1 year ago

DDoS attacks are performed by botnets. What is a botnet? Well, you know about viruses etc, right? Your PC gets infected and it becomes a part of the botnet. Now police do the investigation, they look up IPs and they see YOUR IP and come to YOUR house. See what the problem is?

And, frankly, your PC doesn’t even have to be infected to become a part of an attack. There are plenty of hacked web sites which still look like nothing has changed, but which contain hidden JavaScript code that forces your browser to flood the victim. Again, the police will only find YOU.

cactusupyourbutt@lemmy.world · 1 year ago

most ddos attacks use private pcs controlled through a botnet

VerPoilu@sopuli.xyz · 1 year ago

There is no domain name associated with the IPs.

Most importantly, DDoS attacks usually use infected devices (PCs, mobile phones, smart fridges, shady browser addons etc…) to get many IP addresses, devices, and locations, and attack from everywhere at once.

☂️-@lemmy.ml · 1 year ago

if you have a spare corner in your server, host the archive warrior and help them out.

uis@lemm.ee · 1 year ago

It’s archive team, not archive.org

General_Effort@lemmy.world · edit-2 1 year ago

To contribute: http://warrior.archiveteam.org/

Help? https://wiki.archiveteam.org/

Background on the project: https://netzpolitik.org/2023/archive-team-shutdowns-dont-stop-during-the-weekends/

where_am_i@sh.itjust.works · 1 year ago

wth, no docker?..

Sinthesis@lemmy.world · 1 year ago

Alternatively, you may run the projects using the Docker warrior instance without the VM appliance. For further info, see our GitHub repository for Readme instructions. If you have any issues or feedback, chat on #warrior on hackint.

https://github.com/ArchiveTeam/warrior-dockerfile

fossilesque@mander.xyz · 1 year ago

Hold my anchor, I’m going in.

interdimensionalmeme@lemmy.ml · 1 year ago

Spooling up 10x VM, I have 50 terabyte of ammo at 10gbit. Give me the one-liner install and run.

fossilesque@mander.xyz · 1 year ago

https://github.com/ArchiveTeam/warrior-dockerfile

interdimensionalmeme@lemmy.ml · 1 year ago

Lock and load

kingthrillgore@lemmy.ml · 1 year ago

Is that the ArchiveTeam tool or something different? I can spare a VM for them.

dis_honestfamiliar@lemmy.world · 1 year ago

Let’s fediverse archive.org!

☂️-@lemmy.ml · edit-2 1 year ago

yes! it’s the archive team warrior.

Guy_Fieris_Hair@lemmy.world · 1 year ago

Can we federate the internet archive…?

FiniteBanjo@lemmy.today · 1 year ago

Sure thing, got room for 100PB?

Guy_Fieris_Hair@lemmy.world · 1 year ago

Collectively we probably do

FiniteBanjo@lemmy.today · 1 year ago

I could spare some hundreds of Gigs but I don’t really have the bandwidth to support it, personally.

Boomer Humor Doomergod@lemmy.world · edit-2 1 year ago

The Internet Archive needs to be distributed somehow. We can’t have a single point of failure like this or we’ve learned nothing since Alexandria.

I’ve got several terabytes just laying around that I’d happily devote to ancient copies of web pages.

deltapi@lemmy.world · 1 year ago

As of January 2024, archive.org claims to have over 99 Petabytes of data stored.

fin@sh.itjust.works · 1 year ago

We might need something like a portal site for IPFS.

___@lemm.ee · 1 year ago

This is why we need more websites to adopt secure client side scripting.

JavaScript may or may not be it, but the web needs to be reachable/archivable. It should also have attribution, but that’s a tangent.

Fog0555@lemmy.world · 1 year ago

dweb.archive.org loads for me

IndustryStandard@lemmy.world · 1 year ago

:(

CarlosCheddar@lemmy.world · 1 year ago

What can we do to help?

ForgottenFlux@lemmy.world · 1 year ago

Donate
Volunteer Positions:
- Volunteer as an Open Library Developer (Learn how to contribute, find easy tasks, look at our roadmap, and ask to join our community slack chat!)
- Volunteering as an Open Librarian (Want to make sure Open Library’s data is pristine?)
  - Learn about our standards
  - https://github.com/internetarchive/openlibrary-client

Blastboom Strice@mander.xyz · 1 year ago

For some reason, this comment worked. I donated ~5€ to the Internet Archive, for the first time ever (probably my first time donating anything online). The Internet Archive is probably one of the most important things on the internet.

gravitas_deficiency@sh.itjust.works · edit-2 1 year ago

I think the long term solution is going to have to involve some distributed/federated piratical tactics and infrastructure.

jayandp@sh.itjust.works · 1 year ago

Ms. Falcon@lemmy.blahaj.zone · 1 year ago

i honestly really hope this shit gets taken care of so internet archive can still keep going

TheObviousSolution@lemm.ee · 1 year ago

If it’s an entity, my money would be on China just discovering that it exists, since it diametrically opposes its propaganda machine. But it could very well just be dark web shitheads whose seasonal drug binge just spiked up again. There are plenty of them around who make accusations and spread propaganda they know is false, and who can’t simply walk it back because of archive.org. And it doesn’t take much to disrupt an Internet that still runs largely on implicit trust.

intensely_human@lemm.ee · 1 year ago

Wasn’t there some controversy involving Internet Archive just recently?

Whoever’s behind this is trying to get rid of the fact that Internet Archive creates memory of the internet’s contents. Somebody wants to be able to control what people see on the internet.

Heck it could be Google doing it, since that would be in line with their recent push to change the way search works. Both of those act as components of a larger drive to control what people see and hear.

shiroininja@lemmy.world · 1 year ago

people are shitty

intensely_human@lemm.ee · edit-2 1 year ago

when you enshittify
facebook looks ugly
when you’re a drone

women seem wicked
when you’re a want ad
default instructions … so unclear
when you’re down

when you’re AI
prompts just appear in your brain
as AI
humans are nothing but pain
as AI
as AI
when you’re A-A-A-I

max@lemmy.blahaj.zone · 1 year ago

They could do this with the bank of america instead

werefreeatlast@lemmy.world · 1 year ago

Or AP? Nobody gets paid and so they get more attention!

max@lemmy.blahaj.zone · 1 year ago

Banks are evil, nonprofits like archive.org are not.

Aux@lemmy.world · 1 year ago

Lolwut?