Evidence for the DDoS attack that bigtech LLM scrapers actually are.

  • pcouy@lemmy.pierre-couy.fr
    1 month ago

    IPv4 CIDR ranges (a.b.c.d/prefix_length) contain 2^(32-prefix_length) IP addresses. The 1.5 I’m using controls the filter’s sensitivity and can be tuned to anything between 1 and 2.

    Using 1 or less would mean that the filter gets triggered earlier for larger ranges (we want to avoid this so that a single IP can’t trick you into banning a /16).

    Using 2 or more would mean you tolerate more failures per IP for larger ranges, so you would ban all the smaller subranges before the filter gets a chance to trigger on a larger range.
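
    To make the effect of that factor concrete, here is a minimal sketch. It assumes the failure threshold grows by a factor of k for every bit the range widens, which is one rule consistent with the description above but is not confirmed as the actual filter. The names `ips_in_range`, `fail_threshold`, `base` and `k` are hypothetical, and the base threshold of 5 failures is an arbitrary illustration value.

```python
def ips_in_range(prefix_len: int) -> int:
    """Number of IPv4 addresses covered by a /prefix_len range."""
    return 2 ** (32 - prefix_len)


def fail_threshold(prefix_len: int, base: float = 5.0, k: float = 1.5) -> float:
    """Assumed rule: failures needed before a /prefix_len range is banned.

    base is the threshold for a single address (/32); each extra bit of
    range width multiplies the threshold by k. This is an illustration,
    not the original filter.
    """
    return base * k ** (32 - prefix_len)


if __name__ == "__main__":
    for k in (1.0, 1.5, 2.0):
        for prefix_len in (32, 24, 16):
            total = fail_threshold(prefix_len, k=k)
            per_ip = total / ips_in_range(prefix_len)
            print(f"k={k} /{prefix_len}: total={total:.1f} per_ip={per_ip:.6f}")
```

    Under this assumption, k = 1 gives every range the same threshold as a single address, so one noisy IP is enough to ban a /16. With k = 2 the threshold per address stays constant, so a wide range never trips before its subranges do. Values in between make wide ranges progressively cheaper per address without letting a single IP trigger them.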

    This is running locally on a single fail2ban (f2b) instance, but it should work pretty much the same with aggregated logs from multiple instances.

    • froztbyte@awful.systems
      1 month ago

      I’m aware of the construction of a CIDR prefix. I meant: what are you using to categorise IPs from requests to look up the mask size? whois? published NIC/RIR data? what’s in BGP/route dumps? other?

      • hecko@pawb.social
        9 days ago

        late but i believe they mean they check for every possible range, e.g. if it’s only 1.2.3.5 making noise it’ll get banned as a /32 but if 1.2.3.6 is too it might justify a /30
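
        A rough sketch of that "check every possible range" reading, using Python's standard `ipaddress` module. Everything here is an assumption made for illustration: the function name `ranges_to_ban`, the threshold rule `base * k ** (32 - prefix)`, the base of 5 failures, and the /16 lower bound on range width are all hypothetical and are not taken from the actual filter.

```python
import ipaddress
from collections import Counter


def ranges_to_ban(fails_per_ip, base=5.0, k=1.5, min_prefix=16):
    """Return the widest ranges whose summed failures reach the assumed threshold.

    fails_per_ip maps an IPv4 address string to its failure count.
    Ranges are checked from widest (/min_prefix) to narrowest (/32), and a
    range is skipped if a wider banned range already covers it.
    """
    banned = []
    for prefix in range(min_prefix, 33):
        # Sum failures for every /prefix network that contains a failing IP.
        totals = Counter()
        for ip, count in fails_per_ip.items():
            net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
            totals[net] += count

        limit = base * k ** (32 - prefix)  # assumed threshold rule
        for net, count in totals.items():
            covered = any(net.subnet_of(wider) for wider in banned)
            if count >= limit and not covered:
                banned.append(net)
    return banned


if __name__ == "__main__":
    # Only 1.2.3.5 is noisy: it is banned on its own as a /32.
    print(ranges_to_ban({"1.2.3.5": 5}))
    # 1.2.3.6 is noisy too: together they justify the enclosing /30.
    print(ranges_to_ban({"1.2.3.5": 6, "1.2.3.6": 6}))
```

        With these illustrative numbers the two cases match the example above: the lone address ends up as 1.2.3.5/32, and the pair ends up as 1.2.3.4/30, the smallest range that contains both .5 and .6.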