•  d3Xt3r   ( @d3Xt3r@beehaw.org ) 
    12
    1 year ago

    Looks interesting, but it doesn’t seem better than GPT-4. GPT-4 scored 67% on the HumanEval benchmark, whereas Code Llama scored only 53.7%, a gap of 13.3 percentage points, which isn’t trivial. It’s a bit disingenuous of Meta to claim it’s “on par” with ChatGPT.