• nymnympseudonym@piefed.social
    2 days ago

    Thing is, DeepSeek didn’t have any new technological insights or “special sauce.”

    They just took all the best practices of the time (high-quality, machine-curated datasets, MoE architecture, etc.) and applied them as fully and rigorously as possible.

    It’s not like they invented chain of thought, large reasoning models, state-space models, or anything new at all.