A user on the online forum 4chan has leaked a massive 270GB of data belonging to The New York Times. This leak includes the source code for the newspaper’s digital operations.

Here are some other findings we can confirm:

  • The leak includes the original source code of the game Wordle, which the NY Times acquired in 2022.
  • The leak includes a dated WordPress database of 1,500 NY Times Education site users. The database contains first and last names, email addresses, and hashed passwords. You should expect it to be added to Have I Been Pwned (HIBP) shortly.
  • Several folders contain internal communications from NY Times Slack channels.
  • The Times uses various machine learning algorithms and NLP techniques/scripts for its services.
  • Many authentication credentials are exposed, including authentication URLs with their respective passwords, secret keys, and API tokens. The majority are well protected, but plenty of these secrets need immediate attention. We have also seen private user keys used for authentication.
  • There are a lot of details about internal NY Times architecture from a software development point of view.

So far, it is difficult to say whether the NY Times will need to reset the passwords for everyone who is a member of its site.

It’s worth pointing out that this leak appears to involve data from The New York Times’s IT/infrastructure/website organization rather than the news organization composed of reporters. In media companies, these two entities are largely separate. The IT/infrastructure team handles the technical aspects of the website and digital operations, while the news organization manages reporting and editorial content.

  • came_apart_at_Kmart [he/him, comrade/them]
    15 months ago

    the source code that serves up news content is 500 MB.

    the rest is for interactive pop ups and mobile layout breaking, random spontaneous, invisible click boxes to make it so you accidentally activate an ad when trying to watch, pause, close or otherwise interact with a video.

    it is some of the most cutting edge website complicating code ever written.

    • dch82
      15 months ago

      500MB? What the heck? That’s literally 350 or so floppies or so many win95 installs