Researchers from Peking University say their resistive random-access memory chip may be capable of speeds 1,000 times faster than the Nvidia H100 and AMD Vega 20 GPUs.
Thing is, DeepSeek didn’t have any new technology insights or “special sauce.”
They just took all the best practices current at the time (high-quality machine-curated datasets, MoE architecture, etc.) and applied them as fully and rigorously as possible.
It’s not like they invented chain of thought, large reasoning models, state-space models, or anything new at all.