I had been having trouble getting meaningful results from the fediverse on Google, and after seeing this post, it seems I’m not the only one. So, I created a site that helps search the fediverse in your search engine of choice (it currently supports Google, Bing, Yahoo, DuckDuckGo, and Dogpile).
Due to query limitations with most search engines, it currently only searches the top 15 Lemmy/kbin instances, but I’ve tested it and it seems to provide access to a good chunk of fediverse content. The exception is Google, which should be far more reliable overall and can also search Mastodon and PeerTube.
If you have contributions or ideas for improvement, feel free to check out the project here or shoot me a message. Hope this helps people! :)
Edit: Update in progress including improved search queries and support for Mastodon/PeerTube (Google only, unfortunately)
Edit 2: Update is live, along with a dedicated domain name. If the website doesn’t look any different for you, try Ctrl+F5 or clearing site data - it seems some browsers are caching the old page.
In all seriousness, Google needs to get on providing an easier way to specify that a search should hit the Fediverse.
site:reddit.com
works for Reddit, but there is presently no analogous operator on Google’s search for a distributed system that spans many domains. I mean, it’s great that you’ve made this, don’t get me wrong, but they really should do that as well.
Search Term intext:“Powered by kbin”
maybe?
but there is presently no analogous operator on Google’s search for a distributed system that spans many domains.
Because that’s just a basic search. A search engine searches across multiple domains by default. If you’re specifically looking for only results from ActivityPub enabled services, that’s pretty much an impossibility since there’s no way to know (from a web crawl) if a page is served by a server that supports ActivityPub. Another problem is that a lot of fediverse instances purposefully block search engine crawlers because they don’t want to appear in search results.
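To illustrate the crawler-blocking point, here is a minimal sketch of how a well-behaved crawler might check a site's robots.txt before indexing a page. It uses Python's standard `urllib.robotparser`; the helper name `allows_crawling`, the sample rules, and the `example.social` URLs are made up for illustration and are not taken from any real instance.

```python
from urllib.robotparser import RobotFileParser


def allows_crawling(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file contents as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt of an instance that blocks all crawlers.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Hypothetical robots.txt that only hides one path.
BLOCK_PRIVATE = "User-agent: *\nDisallow: /private/\n"

if __name__ == "__main__":
    print(allows_crawling(BLOCK_ALL, "Googlebot", "https://example.social/post/1"))
    print(allows_crawling(BLOCK_PRIVATE, "Googlebot", "https://example.social/post/1"))
```

An instance serving something like `BLOCK_ALL` would simply never show up in a crawler-based index, no matter how the search query is written.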
I like the idea of scrapping Google altogether, and just having “better” search engines here that account for federated decouplings/distributions
Not entirely the same, but I switched over to Presearch a year or so ago, just to get away from Google and the “big tech” corporations
Hm, I find it somewhat annoying that right now, this is not really searching the Fediverse, but rather what we’ve come to call “the Threadiverse”, which is all about Reddit-like content aggregators.
In other words, I’d love an option to search other kinds of content, for example the most popular Mastodon, Misskey, or Pleroma instances (just to name a few) instead of Threadiverse stuff.
Searching Mastodon is a bit of a… contentious issue. A lot of smaller Mastodon-based sites are full of traumatized vulnerable people who really just want to do their own thing, and they’ll rattle cages if they find out someone’s indexing their sites or posts. If anyone’s making third party search tools, it’s best to be careful to respect discoverability and indexing flags.
I find this to be incredibly fair, but it also makes it much harder to dive into the fediverse. Where do you think the middle ground is?
Mastodon has flags for opting in to discoverability features (being featured in the profile directory, and having posts be searchable via Mastodon’s search bar) and for search engine indexing (for Google, Bing, etc.).
Just don’t return posts from users that have opted out of those, and things should be mostly ok.
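As a rough sketch of that filtering idea, assuming the search tool already has post data with the author's account attached (for example from an instance API rather than a plain web crawl): the field names `noindex` and `discoverable` below are modeled on Mastodon's account data and should be treated as assumptions, and the helper names are hypothetical. Missing flags are treated as an opt-out, which is the cautious default.

```python
def is_searchable(account: dict) -> bool:
    """Return True only if the account has clearly opted in to being found.

    Assumed fields (modeled on Mastodon account data):
      noindex      - True if the user opted out of search engine indexing
      discoverable - True if the user opted in to discovery features
    """
    if account.get("noindex"):
        return False
    # A missing or null flag counts as "not opted in".
    return bool(account.get("discoverable"))


def filter_posts(posts: list) -> list:
    """Keep only posts whose author is searchable."""
    return [post for post in posts if is_searchable(post.get("account", {}))]


if __name__ == "__main__":
    sample = [
        {"id": "1", "account": {"discoverable": True, "noindex": False}},
        {"id": "2", "account": {"discoverable": True, "noindex": True}},
        {"id": "3", "account": {"discoverable": None}},
    ]
    print([post["id"] for post in filter_posts(sample)])
```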
Just don’t return posts from users that have opted out of those, and things should be mostly ok.
This is the main problem I see. User settings are part of the Mastodon API. If you’re building a general-purpose search engine, you use a crawler to index pages and your crawler has no idea those flags even exist.
I’m hoping to expand the project to be a bit more robust. I’ll definitely keep this on my radar.
Thank you so much for the consideration! <3
Good work bruv
Pretty cool, thank you!
Won’t let me upvote so I’m commenting to show love instead.
Simple and handy, thanks!
dig it, works great
Awesome. Though I notice very little shows up from kbin.social; content I know is there is missing when I search for it. That may have more to do with the recency of the site growth or the Cloudflare protection that was up a few days ago.
I would guess that it is the Cloudflare protection, since that will have prevented crawlers from indexing the site while it was enabled.
Nice UI, thx.
Very cool! Thanks!
Suggestion: add Brave Search (search.brave.com) as an option as well. It’s a smaller search engine, but it has its own index and does not track users.
Will do o7
Edit: It seems Brave doesn’t support chaining site specifiers, so my current method won’t work with their search
add kbin-link to your browser to make this even better!
Simple and to the point. Nice work!
Amazing! Great job! And thanks!
Seems like you could probably use this strategy and get rid of the limits by turning this into an extension that would tack the site list onto the search directly (though I’m unsure if there are such limits directly via the search box on Google or whomever).
I’d also, just from a code quality perspective, bust the list out into its own property (which could later become smarter), and build the query string out at runtime.
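Something along these lines, as a rough sketch rather than the project's actual code (the site itself presumably runs JavaScript in the browser; Python is used here only to show the shape). The instance list, the engine table, and the function names are all hypothetical, and the `site:` / `OR` chaining is an assumption about what each engine accepts. As noted above, Brave reportedly doesn't support it.

```python
from urllib.parse import urlencode

# Hypothetical instance list kept in one place; the real list may differ.
INSTANCES = ["lemmy.world", "lemmy.ml", "kbin.social"]

# Search endpoints that take the query in a "q" parameter.
ENGINES = {
    "google": "https://www.google.com/search",
    "duckduckgo": "https://duckduckgo.com/",
}


def build_query(term: str, instances=INSTANCES) -> str:
    """Append an OR-chained group of site: filters to the search term."""
    sites = " OR ".join(f"site:{host}" for host in instances)
    return f"{term} ({sites})"


def build_url(engine: str, term: str, instances=INSTANCES) -> str:
    """Build the full search URL at runtime for the chosen engine."""
    params = urlencode({"q": build_query(term, instances)})
    return f"{ENGINES[engine]}?{params}"


if __name__ == "__main__":
    print(build_url("google", "selfhosting"))
```

With the list in its own property, swapping in a longer or per-engine list later only touches `INSTANCES`, not the query-building code.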