• uis@lemm.ee
    28 days ago

    LLMs can’t cite. They don’t know what a citation is other than a collection of text of a specific style

    LLMs can cite. It’s called Retrieval-Augmented Generation (RAG). Basically, it’s an LLM that can do information retrieval, which is just the academic term for what search engines do.

    You’d be lucky if the number of references equalled the number of referenced items even if you were lucky enough to get real sources out of an LLM

    You can just print the retrieval logs as the references. Well, that’s kinda stretching the definition of “just”.
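
    To make the "print the retrieval logs as references" idea concrete, here is a minimal sketch, not how any particular RAG product works. It assumes a toy in-memory corpus and a crude word-overlap ranking standing in for a real search engine; `Document`, `retrieve`, `build_prompt`, and `format_references` are hypothetical names made up for illustration, and the actual LLM call is left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str   # hypothetical identifier within the searched dataset
    title: str
    text: str


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, corpus, k=2):
    """Toy retrieval: rank documents by word overlap with the query.

    A real system would use a search index or embeddings; this is only a stand-in.
    """
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc.text)), doc) for doc in corpus]
    scored = [(score, doc) for score, doc in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_prompt(query, retrieved):
    """Number the retrieved passages so the model can refer to them as [1], [2], ..."""
    context = "\n".join(f"[{i}] {doc.text}" for i, doc in enumerate(retrieved, start=1))
    return f"Answer using only the numbered sources.\n{context}\nQuestion: {query}"


def format_references(retrieved):
    """The 'retrieval log' turned into a reference list."""
    return [f"[{i}] {doc.title} ({doc.doc_id})" for i, doc in enumerate(retrieved, start=1)]


CORPUS = [
    Document("doc-a", "Intro to Retrieval", "Information retrieval ranks documents for a search query."),
    Document("doc-b", "Cooking Basics", "Boil pasta in salted water."),
    Document("doc-c", "RAG Overview", "Retrieval-augmented generation feeds retrieved documents to a language model."),
]

query = "How does retrieval-augmented generation use documents?"
hits = retrieve(query, CORPUS)
references = format_references(hits)
print(build_prompt(query, hits))
print("\n".join(references))
```

    Every reference here points at something that really is in the searched dataset, because the list is built from what was retrieved rather than from what the model wrote. Whether the generated answer actually matches those sources is a separate problem.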

    • notthebees@reddthat.com
      27 days ago

      My question is whether the thing they are citing actually exists and, if it does exist, whether it contains the information it claims.

      • FutileRecipe@lemmy.world
        27 days ago

        Depends. In my experience, it usually does exist. Now, there are hallucinations where GPT makes stuff up or just misinterprets what it read. But it’s super easy to read the GPT output, look at the cited works, skim them for relevance, then tweak the wording and citations to match.

        If you just copy/paste and take GPT’s word for it without the minimal amount of checking, you’re digging your own grave.

      • uis@lemm.ee
        27 days ago

        the thing they are citing actually exists

        In the case of RAG, it exists in the searched dataset.

        and if it does exist, contains the information it claims.

        Not guaranteed.
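
        Since the second part is not guaranteed, something still has to check each claim against its cited source. Below is a rough sketch of a first-pass filter, assuming a plain word-overlap heuristic; `support_score`, `needs_manual_check`, and the 0.6 threshold are made up for illustration. It can only flag claims that look unsupported. A high score does not prove the source says what the claim says (a negated sentence shares almost all of its words with the original).

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(claim, source_text):
    """Fraction of the claim's words that also appear in the cited source."""
    claim_words = tokenize(claim)
    if not claim_words:
        return 0.0
    return len(claim_words & tokenize(source_text)) / len(claim_words)


def needs_manual_check(claim, source_text, threshold=0.6):
    """Flag a claim for a human to read against the source.

    The threshold is an arbitrary assumption, not a calibrated value.
    """
    return support_score(claim, source_text) < threshold


source = "Retrieval-augmented generation feeds retrieved documents to a language model."
supported = "Retrieved documents are fed to a language model."
unsupported = "The model never hallucinates citations."

for claim in (supported, unsupported):
    verdict = "check by hand" if needs_manual_check(claim, source) else "looks plausible"
    print(f"{claim!r}: {verdict}")
```

        Anything flagged still needs someone to open the cited work and read it, and the unflagged claims are only less suspicious, not verified.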