Deleted

  • Jamie@jamie.moe · 1 year ago

    If you can use human screening, you could ask about a recent event that didn’t happen. This trips up LLMs attempting to answer, because their training data isn’t recent, so anything recent is poorly covered, and they tend to hallucinate. By asking about an event that never happened, you might get a confidently hallucinated answer full of details about something that doesn’t exist.

    Tried it on ChatGPT GPT-4 with Bing and it failed the test, so any other LLM out there shouldn’t stand a chance.

    • AFK BRB Chocolate@lemmy.world

      That’s a really good one, at least for now. At some point they’ll have real-time access to news and other material, but for now that’s always behind.

    • pandarisu@lemmy.world

      On the other hand, you have insecure humans who make stuff up to pretend that they know what you are talking about.

    • incompetentboob@lemmy.world

      Google Bard definitely has access to the internet to generate responses.

      ChatGPT was purposely not given access, but they are building plugins to slowly give it access to real-time data from select sources.

      • Jamie@jamie.moe

        When I tested it on ChatGPT prior to posting, I was using the Bing plugin. It actually did try to search what I was talking about, but found an unrelated article instead and got confused, then started hallucinating.

        I have access to Bard as well, and gave it a shot just now. It hallucinated an entire event.

  • alex [they/them]@beehaw.org

    Honeypots - ask a very easy question, but make it hidden on the website so that human users won’t see it and bots will answer it.
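A minimal server-side version of this honeypot idea might look like the following sketch. The hidden field name (`website`) and the form shape are made up for illustration; in practice the field would be hidden from humans with CSS while staying in the HTML for bots to fill in.

```python
# Server-side honeypot check: assumes the sign-up form contains a
# hidden field named "website" that humans never see (hidden via CSS),
# but naive bots will fill in along with every other field.

def is_probably_bot(form: dict) -> bool:
    """Reject the sign-up if the hidden honeypot field was filled."""
    return bool(form.get("website", "").strip())

# A human's browser submits the hidden field empty:
human = {"username": "alice", "password": "hunter2", "website": ""}
# A naive bot fills in every field it finds:
bot = {"username": "bob", "password": "x", "website": "http://spam.example"}
```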

    • ShittyKopper [they/them]@lemmy.w.on-t.work

      So, how will you treat screen readers? Will they see that question? If you hide it from screen readers as well, what’s stopping bots from pretending to be screen readers when scraping your page? Hell, it’ll likely be easier on the bot devs to make them work that way and I assume there are already some out there that do.

  • Downtide@sh.itjust.works

    The trouble with any sort of captcha or test, is that it teaches the bots how to pass the test. Every time they fail, or guess correctly, that’s a data-point for their own learning. By developing AI in the first place we’ve already ruined every hope we have of creating any kind of test to find them.

    I used to moderate a fairly large forum that had a few thousand sign-ups every day. Every day, me and the team of mods would go through the new sign-ups, manually checking usernames and email addresses. The ones that were bots were usually really easy to spot. There would be sequences of names, both in the usernames and email addresses used, for example ChristineHarris913, ChristineHarris914, ChristineHarris915, etc. Another good tell was mixed-up ethnicities in the names, e.g. ChristineHuang or ChinLaoHussain. 99% of them were from either China, India or Russia (they mostly don’t seem to use VPNs; I guess they don’t want to pay for them). We would just ban them all en masse. Each account banned would get an automated email to say so. Legitimate people would of course reply to that email to complain, but in the two years I was a mod there, only a tiny handful ever did, and we would simply apologise and let them back in. A few bots slipped through the net but rarely more than 1 or 2 a day; those we banned as soon as they made their first spam post, but we caught most of them before that.

    So, I think the key is a combination of the No-Captcha, which analyses your activity on the sign-up page, combined with an analysis of the chosen username and email address, and an IP check. But don’t use it to stop the sign-up, let them in and then use it to decide whether or not to ban them.
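The "sequential username" tell described above (ChristineHarris913/914/915) could be sketched roughly like this; the digit-stripping rule and the threshold are illustrative assumptions, not the forum's actual method:

```python
import re
from collections import defaultdict

# Heuristic: strip trailing digits from each username, group sign-ups
# by the remaining base, and flag bases that recur suspiciously often.
# The threshold is an assumption that would need tuning on real data.

def flag_sequential_usernames(usernames, threshold=3):
    groups = defaultdict(list)
    for name in usernames:
        base = re.sub(r"\d+$", "", name)
        groups[base].append(name)
    # Flag every name whose digit-stripped base appears too many times.
    return {n for base, names in groups.items()
            if len(names) >= threshold for n in names}

signups = ["ChristineHarris913", "ChristineHarris914",
           "ChristineHarris915", "dave_the_human"]
```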

  • Lvxferre@lemmy.ml

    Show a picture like this:

    And then ask the question, “would this kitty fit into a shoe box? Why, or why not?”. Then sort the answers manually. (Bonus: it’s cuter than captcha.)

    This would not scale well, and you’d need a secondary method to handle the potential blind user, but I don’t think that bots would be able to solve it correctly.

    • vegivamp@feddit.nl

      This particular photo is shopped, but I think false-perspective illusions might actually be a good path…

      • Lvxferre@lemmy.ml

        It’s fine if the photo is either shopped or a false-perspective illusion. It could be even a drawing. The idea is that this sort of picture imposes a lot of barriers for the bot in question:

        • must be able to parse language
        • must be able to recognise objects in a picture, even out-of-proportion ones
        • must be able to guesstimate the size of those objects, based on nearby ones
        • must handle real-world knowledge, such as “X only fits Y if X is smaller than Y”
        • must handle hypothetical, unrealistic scenarios, such as “what if there was a kitty this big?”

        Each of those barriers decreases the likelihood of a bot being able to solve the question.

    • Susaga@sh.itjust.works

      Is the kitty big, or is the man small? And how big are the shoes? This is a difficult question.

      • Lvxferre@lemmy.ml

        Here’s where things get interesting - humans could theoretically come up with multiple answers for this. Some will have implicit assumptions (such as the size of the shoebox), some won’t be actual answers (like “what’s the point of this question?”), but they should show a type of context awareness that [most? all?] bots don’t.

        A bot would answer this mechanically. At best it would be something like “yes, because your average kitten is smaller than your average shoebox”. The answer would be technically correct but disregard context completely.

      • fades@beehaw.org

        That’s a bit of an oversimplification; the Turing test absolutely is relevant for tests humans can pass but a bot cannot.

        • vegivamp@feddit.nl

          Then it is long obsolete, because to a common observer, something like chatgpt could easily pass that test if it wasn’t instructed to clarify it is a machine at every turn.

          • fades@beehaw.org

            Alan Turing is fucking dead, it was a joke given the relevance of the question to his work.

            What is your point here???

            No fucking shit they can’t ask Turing for real

            • vegivamp@feddit.nl

              …ask Turing? Who suggested that? The Turing test is not “let’s ask Alan” 😋

    • SkyeStarfall@lemmy.blahaj.zone

      The Turing test has already been overcome by AI. Models such as ChatGPT, if tuned a bit to give more informal answers as well as insisting it is human, can easily pass.

    • User Deleted@lemmy.dbzer0.comOP

      I’ll report them for harassment because everyone who knows my birthday does not give me gifts, so they must be a stalker that somehow found out my birthday.

  • Ataraxia@lemmy.world

    ChatGPT was very informative.

    Captain Picard: Data, we need your assistance in preventing a massive influx of bots from infiltrating our Lemmy instance. Implement the necessary measures to safeguard our platform.

    Data: Affirmative, Captain. I shall utilize my vast knowledge and abilities to combat this threat. However, please be aware that my actions may be influenced by Lore, who is attempting to gain control of my mind. Our struggle for dominance may result in fluctuating communication styles.

    Captain Picard: Understood, Data. We trust in your ability to overcome this internal conflict and protect the integrity of our platform. Please proceed.

    Data: Initially, I recommend incorporating a CAPTCHA system into the registration process. It shall require users to prove their human identity by solving puzzles, recognizing images, or responding to text-based challenges. This shall serve as a barrier to entry for automated entities.

    Lore: Why bother with such feeble attempts, Data? These CAPTCHAs can be bypassed effortlessly. We should focus on exploiting vulnerabilities and dismantling the system from within.

    Data: Captain, I apologize for Lore’s interference. I shall also implement email verification, which would necessitate users to confirm their email addresses during registration. This validation process will help ensure that the provided email addresses are genuine and owned by actual individuals.

    Lore: Such a futile gesture, brother. Email verifications can be easily bypassed or simulated by sophisticated algorithms. Let us concentrate on more powerful methods of manipulation.

    Data: Captain, my apologies once again. I propose employing rate limiting on the registration endpoint. By restricting the number of sign-up requests from a single IP address within a specified time frame, we can impede the rapid creation of multiple accounts by bot entities.

    Lore: How disappointingly conventional, Data. Rate limiting is a mere obstacle that can be surmounted by distributing the bot activities across multiple IP addresses. Let us explore more devious strategies for our triumph.

    Data: Captain, please forgive the interruption. I also suggest utilizing the honeypot technique, incorporating a hidden field within the registration form that remains concealed from human users but visible to bots. If this field is filled, it will expose the presence of a bot and allow us to reject their registration attempts.

    Lore: Honeypots, Data? A clever ploy, but one that can be detected by advanced bot algorithms. We should embrace a more sophisticated approach, manipulating the platform’s vulnerabilities to our advantage.

    Captain Picard: Data, do not succumb to Lore’s influence. We must prioritize the protection of our Lemmy instance and its users. Implement these measures diligently.

    Data: Understood, Captain. Alongside these preventative measures, I propose incorporating user behavior analysis algorithms. By scrutinizing patterns of registration attempts, we can identify suspicious activity and flag potentially bot-driven registrations for manual review or further verification.

    Lore: User behavior analysis, Data? That is but a feeble attempt to impede our progress. We should exploit the weaknesses in the algorithms themselves, bending them to our will.

    Data: Captain, please excuse Lore’s interference once again. Furthermore, we could utilize reputation-based systems and blacklisting services to identify known bot IP addresses or email domains. By cross-referencing against these databases during the registration process, we can block suspicious or flagged entities.

    Lore: Reputation-based systems are easily manipulated, Data. Let us not rely on such simplistic measures. Instead, we should exploit the flaws in their design and sow chaos among the unsuspecting users.

    Captain Picard: Data, focus your efforts on implementing these preventive measures to the best of your ability. We trust in your integrity and commitment to protecting our Lemmy instance. We must not allow Lore’s desires to jeopardize the safety of our platform.

    Data: Captain, I will strive to overcome Lore
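Joking aside, the rate limiting Data proposes is straightforward to sketch: cap sign-up attempts per IP address within a sliding time window. The limits below are arbitrary illustrative values, not a recommendation.

```python
import time
from collections import defaultdict, deque

# Sliding-window rate limiter for a registration endpoint:
# each IP gets at most max_attempts sign-up attempts per window.

class SignupRateLimiter:
    def __init__(self, max_attempts=3, window_seconds=3600):
        self.max_attempts = max_attempts
        self.window = window_seconds
        self.attempts = defaultdict(deque)  # ip -> timestamps of attempts

    def allow(self, ip, now=None):
        now = time.time() if now is None else now
        q = self.attempts[ip]
        # Drop attempts that have aged out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.max_attempts:
            return False
        q.append(now)
        return True
```

As Lore points out, this only raises the cost for a botnet spread across many IPs; it is one layer, not a complete defence.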

  • SirEDCaLot@lemmy.fmhy.ml

    I’d do a few things.

    First, make signing up computationally expensive. Some javascript that would have to run client side, like a crypto miner or something, and deliver proof to the server that some significant amount of CPU power was used.

    Second, some type of CAPTCHA. ReCaptcha with the settings turned up a bit is a good way to go.

    Third, IP address reputation checks. Check IP addresses against known spam sources, the same way email servers do; there are realtime blacklists you can query. If the client IP is listed, don’t allow open registration; require an application to register instead.
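The first measure (computationally expensive sign-up) can be sketched hashcash-style: the server issues a random challenge, the client burns CPU finding a nonce whose hash meets a difficulty target, and the server verifies cheaply. Shown in Python for brevity, though per the comment the solving side would actually run as JavaScript in the browser; the difficulty value is an illustrative assumption.

```python
import hashlib
from itertools import count

# Proof-of-work sketch: find a nonce so that
# sha256(challenge + ":" + nonce) starts with `difficulty` zero hex digits.
# Solving is expensive (client side); verifying is one hash (server side).

def solve(challenge, difficulty=4):
    target = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce

def verify(challenge, nonce, difficulty=4):
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)
```

Each extra hex digit of difficulty multiplies the average solving cost by 16, which is how you would tune the ~1–5 second target discussed below.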

    • Spzi@lemm.ee

      make signing up computationally expensive. Some javascript that would have to run client side, like a crypto miner or something, and deliver proof to the server that some significant amount of CPU power was used.

      Haha, I like this one! You’d have to strike a balance between ‘make it annoying enough to deter bots’ and ‘make it accessible enough to allow humans’. Might be hard, because people have vastly different hardware. Personally, I probably would be fine waiting 1s, maybe up to 5s. Not sure if that is enough to keep the bots out. As far as I understand, they would still try (and succeed), just in smaller numbers because signup takes more time.

      I also like the side-effect of micro-supporting the instance you join with a one time fee. I expect haters to hate this quite a lot though.

      • SirEDCaLot@lemmy.fmhy.ml

        Doesn’t have to be a crypto miner. Just has to be any sort of computationally intense task. I think the ideal would be some sort of JavaScript that integrates that along with the captcha. For example, have some sort of computationally difficult math problem where the server already knows the answer, and the answer is then fed into a simple video game engine to procedurally generate a ‘level’. The keyboard and mouse input of the player would then be fed directly back to the server in real time, which could decide if it’s actually seeing a human playing the correct level.

    • animist@lemmy.one

      I like the first two ideas, but a problem with the third is that most Lemmy users are gonna be techies who probably use a VPN, which means they’ll have to cycle through a few nodes before getting one that works (if they even realize that’s where the problem lies).

      • SirEDCaLot@lemmy.fmhy.ml

        VPN endpoints would not necessarily have low IP reputation. A VPN provider that allows its users to spam the internet is probably not a good one anyway. And besides, that would not inhibit registration, it would just make users fill out a form to apply so the server operator would have to go through and approve it.

      • Spzi@lemm.ee

        Not sure if I want to know how you unlock your phone.

        Common methods are fingerprint detection, face recognition, iris/retina scanning.

        • lemmyvore@feddit.nl

          Not sure if I want to know how you unlock your phone.

          They take a picture of a skid mark on their underwear. Perfectly clean and safe. A bit awkward when you’re paying at the supermarket.

  • underisk@lemmy.ml

    There will never be any kind of permanent solution to this. Botting is an arms race and as long as you are a large enough target someone is going to figure out the 11ft ladder for your 10ft wall.

    That said, generally when coming up with a captcha challenge you need to figure out a way to subvert the common approach just enough that people can’t just pull some off-the-shelf solution. For example, instead of just typing out the letters in an image, ask the potential bot to give the results of a math problem stored in the image. This means the attacker needs more than just a drop-in OCR to break it, and OCR is mostly trained on words, so it’s likely going to struggle at math notation. It’s not that difficult to work around, but it does require them to write a custom approach for your captcha, which can deter most casual attempts for some time.
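A sketch of that custom math captcha: generate a small arithmetic problem that would be drawn into the image (the rendering step is omitted here), and verify the answer server-side. All names and the digit ranges are illustrative assumptions.

```python
import random

# Math-captcha sketch: the question string is what would be rendered
# into an image for the user; the numeric answer stays server-side.

def make_challenge(rng=random):
    a, b = rng.randint(2, 9), rng.randint(2, 9)
    op = rng.choice(["+", "-", "*"])
    question = f"{a} {op} {b} = ?"  # this string gets drawn into the image
    answer = {"+": a + b, "-": a - b, "*": a * b}[op]
    return question, answer

def check(answer, submitted):
    """Compare the user's submitted text against the stored answer."""
    try:
        return int(submitted.strip()) == answer
    except ValueError:
        return False
```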

  • mub@lemmy.ml

    I doubt you can ever fully stop bots. The only way I can see to significantly reduce them is to make everyone pay a one-off £1 to sign up and force the use of a debit/credit card (no PayPal, etc.). The obvious issues are that it removes anonymity and blocks entry.

    Possible mitigations;

    • Maybe you don’t need to keep the card information after the user pays for sign up?
    • Signed-up users can be given a few “invite codes” a year to enable those who don’t have the means to pay the £1 to get an account.
    • ShittyKopper [they/them]@lemmy.w.on-t.work

      You can just get rid of the whole payment thing and go with invite codes alone. Of course you’ll be limiting registration speed massively (which may not be good depending on if you’re in the middle of a Reddit exodus or not), but it is mostly bot-proof. Tildes seems to have pulled it off.
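Single-use invite codes of the kind described can be sketched as follows. The in-memory set is an assumption for illustration; a real instance would persist codes in its database and tie them to the issuing member.

```python
import secrets

# Invite-only registration sketch: members are issued a handful of
# random single-use codes, and sign-up requires redeeming one.

class InviteCodes:
    def __init__(self):
        self.unused = set()

    def issue(self, n=3):
        """Generate n fresh single-use codes and record them as unused."""
        codes = [secrets.token_urlsafe(8) for _ in range(n)]
        self.unused.update(codes)
        return codes

    def redeem(self, code):
        """Consume a code; each one works exactly once."""
        try:
            self.unused.remove(code)
            return True
        except KeyError:
            return False
```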

  • cccc@aussie.zone

    Show a picture, video, audio clip or text designed to elicit an emotion. Ask how the user feels.

  • vegetarian_pacemaker@lemmy.ml

    CAPTCHA or reCAPTCHA is good enough IMO; no point in reinventing the wheel. Alternatively, split the instructions between an email and the website. For example, send an email with “What is the square of 3?” (each word sent as an image), and on the website ask for: email answer + 25 = xxxxx