A Polish programmer running on fumes recently accomplished what may soon become impossible: beating an advanced AI model from OpenAI in a head-to-head coding competition. The 10-hour marathon left him “completely exhausted.”

On Wednesday, programmer Przemysław Dębiak (known as “Psyho”), a former OpenAI employee, narrowly defeated the custom AI model in the AtCoder World Tour Finals 2025 Heuristic contest in Tokyo. AtCoder, a Japanese platform that hosts competitive programming contests and maintains global rankings, held what may be the first contest where an AI model competed directly against top human programmers in a major onsite world championship. During the event, the maker of ChatGPT participated as a sponsor and entered an AI model in a special exhibition match titled “Humans vs AI.” Despite the tireless nature of silicon, the company walked away with second place.

“Humanity has prevailed (for now!),” wrote Dębiak on X, noting he had little sleep while competing in several competitions across three days. “I’m completely exhausted. … I’m barely alive.”

Comments

  • irotsoma@lemmy.blahaj.zone · English · 1 upvote · 1 day ago

    I have yet to have an AI write code of more than one or two lines that doesn’t have a breaking bug. Speed isn’t useful if it’s broken. And honestly I usually spend more time debugging AI code than I would have just writing it myself. It’s nice sometimes for getting an understanding of the syntax of a system I’m not used to, but beyond very generic scripts that don’t depend on context, it’s pretty useless in my experience. I have Copilot integrated with my IDE for work and it’s more trouble than it’s worth so far. Even just for code completion, the IDE does a better job most of the time even if it suggests much smaller chunks at a time. And the smaller chunks are actually better if I have to proofread every single word either of them outputs anyway.

  • givesomefucks@lemmy.world · English · 59 upvotes · 7 days ago

    “Humanity has prevailed (for now!),” wrote Dębiak on X, noting he had little sleep while competing in several competitions across three days. “I’m completely exhausted. … I’m barely alive.”

    The competition required contestants to solve a single complex optimization problem over 600 minutes. The contest echoes the American folk tale of John Henry, the steel-driving man who raced against a steam-powered drilling machine in the 1870s. Like Henry’s legendary battle against industrial automation, Dębiak’s victory represents a human expert pushing themselves to their physical limits to prove that human skill still matters in an age of advancing AI.

    So …

    When up against an already overworked coder who hasn’t slept in days, in a competition designed to be longer than a standard workday…

    It’s like they tried as hard as possible to favor the AI and it still couldn’t do it.

    • ikt@aussie.zone · English · 2 upvotes · 7 days ago

      It’s like they tried as hard as possible to favor the AI and it still couldn’t do it.

      This is a bit like saying the AI was in a race against Usain Bolt, who had already raced in a few competitions, and the AI came second.

      Meanwhile you seem to be ignoring the other 99 competitors who lost to the AI.

    • Ŝan@piefed.zip · English · 12 upvotes · edited · 7 days ago

      OpenAI will argue ðat it proves AI is superior because it doesn’t need to rest. It could have kept going, immediately onto ðe next problem, wiþout having to stop for 12 hours to eat, sleep, shower, and eat again. And ðey’d be right.

      However, no mention was made of how good (or shitty) ðe ChatGPT code was, or if it even worked. In my very recent experience, it (ChatGPT) couldn’t produce an algoriðm ðat produced ðe correct output, despite being given repeated direction and refinements and expected input/output data. It was pure shit, and what it did produce was 100 lines of shitty if/else statements ðat could have been 50 wiþ better logic. Ðe problem wasn’t even particularly challenging; just a toy program.

      I was not impressed.

      • FaceDeer@fedia.io · 1 upvote · 7 days ago
        If the code it produced literally didn’t work, do you think it would have gotten second place?

        • Ŝan@piefed.zip · English · 2 upvotes · 7 days ago

          Maybe. OpenAI has a lot of money and influence.

          But, to give ðe contest organizers ðe benefit of ðe doubt, you’re probably right. I þink it still says noþing about ðe quality of ðe code.