this post was submitted on 05 May 2024
325 points (93.8% liked)

Technology


Hi!

Kagi had a rough couple of months on the PR side, and a comment from another Lemmy user arguing that they aren't using Google's index set me off... because I had read just a couple of weeks ago, on their own website, that they primarily use Google's search index.

Lo and behold, that user was "right": No mention of Google whatsoever on Kagi's Search Sources page. If that's all you had to go off of, you'd be excused for thinking they are only using their internal index to power their web search since that's what they now strongly imply. The only "reference" to external indexes is this nebulous sentence:

Our search results also include anonymized API calls to all major search result providers worldwide, specialized search engines like Marginalia, and sources of vertical information [...]

... Unless one goes to check that pesky Wayback Machine. Here is the same page from March 2024, which I will copy/paste here for posterity:

Search Sources

You can think of Kagi as a "search client," working like an email client that connects to various indexes and sources, including ours, to find relevant results and package them into a superior, secure, and privacy-respecting search experience, all happening automatically and in a split-second for you.

External

Our data includes anonymized API calls to traditional search indexes like Google, Yandex, Mojeek and Brave, specialized search engines like Marginalia, and sources of vertical information like Wolfram Alpha, Apple, Wikipedia, Open Meteo, Yelp, TripAdvisor and other APIs. Typically every search query on Kagi will call a number of different sources at the same time, all with the purpose of bringing the best possible search results to the user.

For example, when you search for images in Kagi, we use 7 different sources of information (including non-typical sources such as Flickr and Wikipedia Commons), trying to surface the very best image results for your query. The same is also the case for Kagi's Video/News/Podcasts results.

Internal

But most importantly, we are known for our unique results, coming from our web index (internal name - Teclis) and news index (internal name - TinyGem). Kagi's indexes provide unique results that help you discover non-commercial websites and "small web" discussions surrounding a particular topic. Kagi's Teclis and TinyGem indexes are both available as an API.

We do not stop there and we are always trying new things to surface relevant, high-quality results. For example, we recently launched the Kagi Small Web initiative which platforms content from personal blogs and discussions around the web. Discovering high quality content written without the motive of financial gain, gives Kagi's search results a unique flavor and makes it feel more humane to use.
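For anyone wondering what a "search client" that calls several sources at the same time could look like, here is a minimal sketch in Python. It is not Kagi's code: the source names, the `make_source` helper and the merge-by-URL rule are all assumptions made up for illustration, and real providers would be reached through their own APIs, which are not shown.

```python
import asyncio


def make_source(name, urls):
    """Build a hypothetical source: an async callable returning (source, url) pairs."""
    async def search(query):
        await asyncio.sleep(0)  # stand-in for a network round trip
        return [(name, url) for url in urls]
    return search


async def broken_source(query):
    """Hypothetical source that always fails, to show error isolation."""
    raise RuntimeError("source unavailable")


async def metasearch(query, sources):
    """Query every source concurrently and merge results, dropping duplicate URLs."""
    batches = await asyncio.gather(
        *(source(query) for source in sources),
        return_exceptions=True,  # collect failures instead of raising
    )
    merged, seen = [], set()
    for batch in batches:
        if isinstance(batch, Exception):
            continue  # one failing source should not sink the whole query
        for name, url in batch:
            if url not in seen:
                seen.add(url)
                merged.append({"url": url, "source": name})
    return merged


if __name__ == "__main__":
    sources = [
        make_source("big_index", ["https://a.example", "https://b.example"]),
        broken_source,
        make_source("small_web", ["https://b.example", "https://c.example"]),
    ]
    for result in asyncio.run(metasearch("example query", sources)):
        print(result["source"], result["url"])
```

The point of the sketch is only that the user sees one merged list, so which index contributed which result is invisible unless the service chooses to disclose it.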


Of course, running an index is crazy expensive. By their own admission, Teclis is narrowly focused on "non-commercial websites and 'small web' discussions". Mojeek indexes nowhere near enough to meaningfully compete with Google, and Yandex specializes in the Russosphere. Bing (Google's only meaningful direct indexing competitor) is not named, so I assume they don't use it. So it's not a leap to say that Google powers most of Kagi's English-language web searches, just like Bing powers almost all search alternatives such as DDG.

I don't personally mind that they use Google as an index (it makes the most sense and it's still the highest-quality one out there IMO, and Kagi can't compete with Google's sheer capital on the indexing front). But I do mind a lot that they aren't being transparent about it anymore. This is very shady and misleading, which is a shame because Kagi otherwise provides a valuable and higher quality service than Google's free search does.

[–] [email protected] 5 points 4 months ago (1 children)

This video by TechAltar is great and goes into why Bing/Google are often the backend for alternative search engines, such as DDG (Bing), Ecosia (Bing), and Startpage (Google).

[–] [email protected] 2 points 4 months ago

Here is an alternative Piped link(s):

This video

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I'm open-source; check me out at GitHub.

[–] [email protected] 2 points 4 months ago

What are people’s thoughts on MetaGer? I started using it when I switched over to the LibreWolf web browser and I really like the base free option. It looks like they have a monetization model, but I haven’t looked into it much.

I tried SearXNG a couple of times but it never stuck. Ashamed to say it, but Google is still my backup a lot of the time.

[–] [email protected] 18 points 4 months ago* (last edited 4 months ago)

I unsubscribed and deleted my Kagi account mainly because of their attitude to data privacy but also because of their nutjob CEO. When I subscribed I was excited because I thought they wanted to build a proper competitor to other search engine operators, but they are actually just another company that tries to shove AI into absolutely everything. So, after realising that they are an untrustworthy company full of tech maximalists trying to build the torment nexus, I immediately canceled my subscription and moved back to DuckDuckGo and Marginalia. Maybe I'll give SearXNG another go, it's just that self-hosting is a bit of a bother.

[–] [email protected] 6 points 4 months ago (1 children)

Them using Google indexes anonymously isn't intending to solve the problem you think it is. It's more about incentive structures. Google's "free" search optimizes for ad revenue now. The API access doesn't as much, and Kagi certainly doesn't have an ad incentive. So privacy is a nice bonus, but the real benefit is a customer serving incentive structure.

[–] [email protected] 0 points 4 months ago

If they can silently remove things from their website, they can also silently change the terms and policies you agreed to. Then privacy and the rest are up in the air if it lets them rake in more money.

[–] [email protected] 19 points 4 months ago (3 children)

Guys, just use SearXNG. Stop relying on paid services from companies.

[–] [email protected] 10 points 4 months ago

I agree. I was paying Kagi a few months ago but then started self hosting a SearXNG instance. The nicest thing about it is that I can replace links and select the engines I want to use. It's very customizable.

[–] [email protected] 3 points 4 months ago (2 children)

I'm just gonna block the word kagi on my lemmy client. For a long time I've been seeing a trend of people making a big deal out of non-issues in order to sell kagi. The number of votes and some of the replies are always sus too.

[–] [email protected] -1 points 4 months ago (1 children)

This type of thing has made reddit unusable lately and I hope ~~it doesn't show up here~~ lemmy has a better defense against "subtle spam."

[–] [email protected] 14 points 4 months ago* (last edited 4 months ago) (1 children)

Thing is, how do you differentiate between "subtle spam" and a bunch of people who genuinely like a product and are happy to say so because it's solved a problem for them that they see other people having?

For instance, I'm a Kagi subscriber and have been for some months now as it's doing a good job for me, and I've had the odd person leap down my throat accusing me of being a corporate shill etc, and I am absolutely not (but that's what a shill would say!!!)

How does anyone get a product recommendation from a product that's genuinely growing in popularity so people are recommending it? I get there needs to be a healthy dose of cynicism but where does the line get drawn to the point where that cynicism is no longer "healthy" and simply means everyone distrusts everything that's made by a company if somebody on the internet says it's good?

Where's the equal cynicism when somebody says something is shit and it could be a corporate shill from a competitor?

[–] [email protected] 2 points 4 months ago

I've even seen people saying that any brand mention will be compensated, even slightly negative. I think some sort of web of trust is the only answer.

[–] [email protected] 26 points 4 months ago* (last edited 4 months ago)

It's really disingenuous to mud sling people with a different view by implying they themselves don't exist/are astroturfing/are bots.

I'm a real human who decided to use their service for kicks and actually like some of the benefits and control over the results compared to other search engines.

Especially when I'm doing research, which is usually half of all my time searching anyways.

Enough that I decided to pay for the service. I'm happy with it and want to share that happiness with others. Are you saying that because I liked a service that I can't seem to get anywhere else I'm now the bad guy? Because I like something and want to share it with others, that's bad?

Is the alternative that you might prefer to be corporate astroturfing instead of organic discussion and growth? Like, really, seriously, what's the alternative here if people talking about and sharing something they like is not acceptable?

[–] [email protected] 30 points 4 months ago

I don’t care whose indexes they use so long as the results are good. The problem isn’t the index, it’s how the contents get prioritized and presented. Kagi happens to do so well for me.

[–] [email protected] 42 points 4 months ago

Our search results also include anonymized API calls to all major search result providers worldwide

When I read this, it doesn't tell me they don't use Google. Quite the opposite. It says "all", which immediately tells me Google is among them.

[–] [email protected] 50 points 4 months ago* (last edited 4 months ago)

Our data includes anonymized API calls to traditional search indexes like Google, Yandex, Mojeek and Brave, specialized search engines like Marginalia, and sources of vertical information like Wolfram Alpha, Apple, Wikipedia, Open Meteo, Yelp, TripAdvisor and other APIs

I don't want to be that guy, but technically they said they are using traditional indexes like Google, not that they are in fact using Google. But I guess that is splitting hairs.
Also, maybe they just dropped Google from their indexes? And what's more: Why does it matter if they are using Google at all, when the results are satisfying?

It would be nice to know exactly which indexes they are using, though.
