In an age of LLMs, is it time to reconsider human-edited web directories?

Back in the early-to-mid '90s, one of the main ways of finding anything on the web was to browse through a web directory.

These directories generally had a list of categories on their front page. News/Sport/Entertainment/Arts/Technology/Fashion/etc.

Each of those categories had subcategories, and sub-subcategories that you clicked through until you got to a list of websites. These lists were maintained by actual humans.

Typically, these directories also had a limited web search that would crawl through the pages of websites listed in the directory.

Lycos, Excite, and of course Yahoo all offered web directories of this sort.

(EDIT: I initially also mentioned AltaVista. It did offer a web directory by the late '90s, but this was something it tacked on much later.)

By the late '90s, the standard narrative goes, the web got too big to index websites manually.

Google promised the world its algorithms would weed out the spam automatically.

And for a time, it worked.

But then SEO and SEM became a multi-billion-dollar industry. The spambots proliferated. Google itself began promoting its own content and advertisers above search results.

And now with LLMs, the industrial-scale spamming of the web is likely to grow exponentially.

My question is, if a lot of the web is turning to crap, do we even want to search the entire web anymore?

Do we really want to search every single website on the web?

Or just those that aren’t filled with LLM-generated SEO spam?

Or just those that don’t feature 200 tracking scripts, and passive-aggressive privacy warnings, and paywalls, and popovers, and newsletters, and increasingly obnoxious banner ads, and dark patterns to prevent you cancelling your “free trial” subscription?

At some point, does it become more desirable to go back to search engines that only crawl pages on human-curated lists of trustworthy, quality websites?

And is it time to begin considering what a modern version of those early web directories might look like?

@degoogle #tech #google #web #internet #LLM #LLMs #enshittification #technology #search #SearchEngines #SEO #SEM

  • AJ Sadauskas@aus.socialOP
    link
    fedilink
    arrow-up
    5
    ·
    9 months ago

    @bsammon And this Archive.org capture of Lycos.com from 1998 contradicts your memory: https://web.archive.org/web/19980109165410/http://lycos.com/

    See those links under “WEB GUIDES: Pick a guide, then explore the Web!”?

    See the links below that say Autos/Business/Money/Careers/News/Computers/People/Education /Shopping/Entertainment /Space/Sci-Fi/Fashion /Sports/Games/Government/Travel/Health/Kids

    That’s exactly what I’m referring to.

    Here’s the page where you submitted your website to Lycos: https://web.archive.org/web/19980131124504/http://lycos.com/addasite.html

    As far as the early search engines went, some were more sophisticated than others, and they improved over time. Some simply crawled the webpages on the sites in the directory, others

    But yes, Lycos definitely was definitely an example of the type of web directory I described.

    • bsammon@lemmy.sdf.org
      link
      fedilink
      arrow-up
      7
      ·
      9 months ago

      1998 isn’t “originally” when Lycos started in 1994. That 1998 snapshot would be their “portal” era, I’d imagine.

      And the page where you submitted your website to Lycos – that’s no different than what Google used to have. It just submitted your website to the spider. There’s no indication in that snapshot that suggests that it would get your site added to a curated web-directory.

      Those late 90’s web-portal sites were a pale imitation of the web indices that Yahoo, and later DMoz/ODP were at their peak. I imagine that the Lycos portal, for example, was only managed/edited by a small handful of Lycos employees, and they were moving as fast as they could in the direction of charging websites for being listed in their portal/directory. The portal fad may have died out before they got many companies to pony up for listings.

      I think in the Lycos and AltaVista cases, they were both search engines originally (mid 90s) and than jumped on the “portal” bandwagon in the late 90s with half-assed efforts that don’t deserve to be held up as examples of something we might want to recreate.

      Yahoo and DMoz/ODP are the only two instances I am aware of that had a significant (like, numbered in the thousands) number of websites listed, and a good level of depth.