Cloudflare now MitMs more than *1/3rd* of the world's websites, controlling a staggering 34%. And yet privacy seekers naively continue to use so-called "privacy focused" search engines like , which neglects to filter out Cloudflare sites from the results.

@resist1984 A search engine should not filter ANY sites from the results...

@datenschutzratgeber If you know of a client-side app that can push junk results down to page 20 so the server can send /all/, I'd be keen to know about it. Until then, we count on search engines not just to find shit but also to organize it, putting the best results in view and hiding the garbage. Research has shown that a link in search results is *twice* as likely to be clicked as the link below it.
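
The "push junk down instead of dropping it" idea from the post above could be sketched roughly as below. This is only an illustration: `demote` and the `is_junk` rule are hypothetical names, not part of any existing app, and what counts as junk is left to the caller.

```python
def demote(results, is_junk):
    """Return results with junk entries moved to the end.

    sorted() is stable and False sorts before True, so the original
    ranking is preserved inside both the "good" and the "junk" group.
    """
    return sorted(results, key=lambda r: bool(is_junk(r)))


# Hypothetical rule: treat anything on "junk.example" as junk.
ranked = ["https://a.example", "https://junk.example/1",
          "https://b.example", "https://junk.example/2"]
reordered = demote(ranked, lambda url: "junk.example" in url)
```

Nothing is removed here, so the server can still send everything and the client only decides what ends up in view.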

@datenschutzratgeber To say "A search engine should not filter ANY sites" grossly misunderstands the purpose of a search engine. If a search were to return the whole index known to the engine, you would learn very quickly the value of filtering search results. Filtering is the core of what search engines do.

@resist1984 @datenschutzratgeber I remember in the 90s we used to have applications that filtered results from several search engines. It was considered better to check several search engines depending on what you were searching for.

@datenschutzratgeber @onepict Just as DDG is a metasearch engine, so would be any app that harvests results from other engines. Analogous to a search app would be using perhaps in combination with searx, but those tools don't have a "filter out Cloudflare" switch; it would still need to be created. It's feasible, but in the end it doesn't solve the problem of getting loyal users off false-privacy.

@resist1984 I was talking about filtering sites from the search results, not from the search index. Of course, a search engine is supposed to only display relevant results but removing sites from the list because they use Cloudflare is just paternalism.

Also, the search engine would have to maintain an ever-changing second index of Cloudflare sites.

@datenschutzratgeber Cloudflare results are irrelevant to privacy seekers. There is plenty of choice for privacy-ambivalent users (Google, DuckDuckGo, Bing, Yahoo [ is falsely positioned]). Filtering CF sites doesn't need a site index, just CIDRs. Ss filters out CF sites just fine, and that's basically a garage operation on a shoestring budget.
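
To illustrate the "just CIDRs" point, here is a minimal sketch of checking a resolved IP address against a list of network ranges. The three ranges are a small sample of ranges Cloudflare has published and may be incomplete or outdated; `SAMPLE_CF_CIDRS` and `is_cloudflare_ip` are names made up for this example, and this is not a description of how Ss actually does it.

```python
import ipaddress

# Sample only: a few ranges Cloudflare has published. Assumed for
# illustration; a real filter would load the full, current list.
SAMPLE_CF_CIDRS = ["104.16.0.0/13", "172.64.0.0/13", "173.245.48.0/20"]
CF_NETWORKS = [ipaddress.ip_network(cidr) for cidr in SAMPLE_CF_CIDRS]


def is_cloudflare_ip(ip: str) -> bool:
    """True if the address falls inside any of the sample ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in CF_NETWORKS)
```

A metasearch engine would still have to resolve each result's hostname to an address first, but it would not need to maintain a per-site index.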

@resist1984

But if you filter Cloudflare sites, you should also filter the ones that use tracking etc. Which basically reduces the result list by 60+ %.

What's "Ss"? Does that work on a large scale?

@datenschutzratgeber Luckily we're not on a slippery slope here. We can easily nix the worst ~34% and still function. At large scales, nixing the 34% big offenders would have the effect of shrinking the 34% (sites want to be found).

@datenschutzratgeber Ss is handling the current scale just fine but it has a breaking point, which is why other search engines calling themselves "privacy focused" need to actually become privacy focused.

@resist1984 I don't think that would have any significant effect because major search engines like Google, Bing etc. would still list them on top 🤔

@datenschutzratgeber Yes, they would. But the idea @resist1984 offers is to filter client side. You can manipulate the downloaded content in the browser as you like. I delete cookie consent dialogs, quite often successfully, or filter certain content by type or origin domain in uMatrix. Why not change the order or suppress paid results based on filter rules?

@datenschutzratgeber The rankings of the majors can be discarded since we don't use the majors directly. A metasearch engine (or user app) controls its own ranking. Some searx instances will alternate round-robin: #1 from google, #1 from bing (if different), #1 from giga, #2 from google, etc. No reason bad sites can't be filtered.
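
The round-robin ranking described above, with a filter hook, might look something like this. It is a rough sketch and not searx's actual code; `round_robin_merge` and `is_blocked` are hypothetical names, and each source is assumed to be a plain list of URLs in that engine's own rank order.

```python
from itertools import zip_longest


def round_robin_merge(sources, is_blocked=lambda url: False):
    """Interleave ranked lists: #1 of each source, then #2 of each, etc.

    Duplicates are skipped ("if different"), and any URL for which
    is_blocked(url) is true is dropped from the merged ranking.
    """
    merged, seen = [], set()
    for same_rank in zip_longest(*sources):  # pads short lists with None
        for url in same_rank:
            if url is None or url in seen or is_blocked(url):
                continue
            seen.add(url)
            merged.append(url)
    return merged


# Made-up result lists standing in for three upstream engines.
engine_a = ["a", "b", "c"]
engine_b = ["a", "d"]
engine_c = ["e"]
merged = round_robin_merge([engine_a, engine_b, engine_c])
filtered = round_robin_merge([engine_a, engine_b, engine_c],
                             is_blocked=lambda url: url == "d")
```

The upstream engines' rankings only determine the order within each source; the merge decides the final order and what gets left out.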

@resist1984 Well, filtering on client-side is something completely different. I was talking about (mandatory) server-side filtering by the search engine itself.

@datenschutzratgeber Ss already demonstrates that it's possible. Why wouldn't it be? If a source decides to give 100% Cloudflare results, they can simply be dropped as a source.
