• rozodru@piefed.world
    English · 10 upvotes · 14 hours ago

    I tried it, though I had to VPN in to do so. I gave it 5 tasks; it succeeded at 2 of them, and the rest were hallucinations. So… yeah… guess it’s much better than Opus.

    • Hackworth@piefed.ca
      English · 3 upvotes, 2 downvotes · 14 hours ago

      rest were hallucinations

      I’m having trouble parsing whatcha mean here if they were coding tasks. The code didn’t run? It ran but had 0 functionality? If they were non-coding tasks, then agreed, I didn’t notice it being significantly more accurate, though I did appreciate the larger vocab. I wasn’t gonna be able to afford to keep using it once it went to API pricing anyway.

      • rozodru@piefed.world
        English · 5 upvotes · 13 hours ago

        Sorry, should have been more specific. It was a mix of coding and non-coding. One coding task ran fine, another just didn’t work at all. One was a basic walkthrough/tutorial-type task that was accurate; the others were hallucinations.