
Compare Google, Bing, Marginalia, Kagi, Mwmbl, and ChatGPT

endisneigh
33 replies
15h37m

I reckon these days search is pretty difficult and everyone knows how to game it. I recommend using a search engine that lets you effectively change which sites are shown. You can do this with Kagi, or with Google's Programmable Search Engines - I'm sure there are more too.

In particular I block Youtube, not because they aren't sometimes correct, but because I don't want videos polluting the regular results - it just takes too long to get info from videos.

An ability to upvote results for a given query seems tantalizing but I bet it would be gamed too. The DIY approach seems to be the only tractable one.

In my case I only allow results from domains I believe are correct. The whitelist approach does have downsides. Usually I'll vet new potential domains through social means like Reddit and this site, rather than identifying them through the search results. I believe there's an inherent tradeoff between discoverability and the gameability of the results.
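
As a rough illustration of that whitelist idea (my own sketch, not taken from any particular search engine; the allowed domains are hypothetical picks), filtering results down to vetted domains is only a few lines:

```python
# Minimal sketch: keep only search results whose host is on a personally
# vetted allowlist. Domains and result URLs here are just example data.
from urllib.parse import urlparse

ALLOWED_DOMAINS = {"wikipedia.org", "stackoverflow.com", "news.ycombinator.com"}

def allowed(url: str) -> bool:
    host = urlparse(url).hostname or ""
    # Accept the domain itself or any subdomain of it.
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)

results = [
    "https://en.wikipedia.org/wiki/Transistor",
    "https://spammy-seo-farm.example/best-transistors",
]
print([r for r in results if allowed(r)])  # only the Wikipedia link survives
```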

Though I do sympathize with folks who reminisce about 2008 Google Search results, there was probably orders of magnitude less content out there and a near-complete ignorance of how valuable your search placement is to your business, and thus no SEO.

I also personally disagree that yt-dlp is the "correct" result for the average user when they search "Youtube Download". I highly doubt the average user would know or care to use the command line. A website front end would be more actionable for them.
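
For context, even the simplest programmatic use of yt-dlp looks roughly like this (a sketch assuming the yt-dlp Python package is installed, with an example URL), which is exactly why a web front end is more approachable for most people:

```python
# Sketch only: download a video with yt-dlp's Python API (pip install yt-dlp).
# Files land in the current directory by default.
from yt_dlp import YoutubeDL

url = "https://www.youtube.com/watch?v=dQw4w9WgXcQ"  # example URL
with YoutubeDL({"format": "best"}) as ydl:  # "best" selects the best single file
    ydl.download([url])
```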

ysavir
15 replies
14h54m

> In particular I block Youtube, not because they aren't sometimes correct, but because I don't want videos polluting the regular results - it just takes too long to get info from videos.

Funnily enough, lately I've been prioritizing YT videos more when searching. So many sites now are just regurgitated SEO farms with minimal quality, and it's easy to see why: it's minimal effort to produce and cheap to host. But making a video takes time and effort, so has a much higher barrier to use as a click farm.

More than once when traditional search failed me, I went to YT and found some video from 2009 clearly and eloquently explaining what I'm looking for in detail, and without any distractions because the person authoring the video clearly didn't specialize in the media format or show interest in experimenting.

I've found it to also be a better source when looking for a product to buy. Want to know which fan to get? Turns out there's a channel from a dedicated guy who keeps finding ways to test different fans and their utility, with multiple videos demonstrating his approach and findings. The mainstream channels aren't all that useful, but there's a ton of "old web" style videos (some even recent) passionately providing details for almost anything you'd think to search. And they're a gold mine.

robrenaud
4 replies
14h41m

Would a browser feature that skipped to the relevant parts of the video based on closed captioning and understanding search intent be useful? It seems like this would be a good way for Google to fight to stay relevant in UX vs having the chat bots just quickly spitting out a readable answer. Hunting through ad laden webpages is annoying. Seeking to the relevant section of the video is a solvable problem, especially for videos above some viewership threshold.
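
A rough sketch of that idea (my own illustration, not anything Google actually ships): score each timed caption against the query's terms and seek to the best-matching timestamp.

```python
# Illustrative sketch: pick a seek point in a video by matching query terms
# against timed closed captions. The captions here are made-up sample data.
def best_seek_time(query: str, captions: list[tuple[float, str]]) -> float:
    terms = set(query.lower().split())

    def score(text: str) -> int:
        return len(terms & set(text.lower().split()))

    start, _ = max(captions, key=lambda c: score(c[1]))
    return start

captions = [
    (0.0, "what's up guys, don't forget to like and subscribe"),
    (95.0, "to replace the laptop fan, first remove these four screws"),
    (180.0, "thanks to our sponsor for supporting the channel"),
]
print(best_seek_time("how to replace laptop fan", captions))  # -> 95.0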

nulld3v
1 replies
14h29m

I've definitely seen Google do this already: https://searchengineland.com/google-tests-suggested-clip-sea...

tentacleuno
0 replies
13h53m

Google seems to be taking much more advantage of YouTube's transcription feature lately. The first addition was the (OK, gimmicky) animation on the Subscribe button when someone says the dreaded "like". Hopefully a sign of things to come.

Overall AI summaries are very welcome for a certain subset of YouTube which is sadly dominated by sponsored, clickbait, and ad-driven content.

tentacleuno
0 replies
14h3m

> Seeking to the relevant section of the video is a solvable problem

...and it has already been solved, though partially: SponsorBlock allows people to add a "Highlight" section to a video, which denotes the part of the video which the user most likely wanted to see (sans the "what's up guys", "like and subscribe", etc.)

Of course, it's not perfect: it relies upon humans doing the work, though some may see that as a positive over something more computerized.

dcow
0 replies
14h25m

Didn’t Google try this already? It seems useful to me, at least. IMO the next frontier of search is not better hypertext, it’s podcasts, audio, and video.

plagiarist
3 replies
14h39m

Do you have some tips for finding concise videos that answer the question you are asking? I am finding more and more obvious LLM bullshit in results, so I am willing to try some other tactics. But I am not ready to spend the minutes watching videos to see if it is actually relevant or a waste of time, always artificially long to increase ad revenue.

crznp
1 replies
13h36m

For me, it really depends on the type of video. For fixing cars, I'm usually looking for something specific enough that there isn't a lot of chaff. It was probably recorded and edited on a phone just to splice the clips together, probably with the default thumbnail that YouTube extracted from the video.

For product videos, if Project Farm did it, look there first. Otherwise, I look for someone who has a lot of videos for competing products with basically the same format, not over 10 minutes.

Tech videos are the hardest, I often still prefer text. Maybe look for links to the docs in the description? I still get duds though.

williamcotton
0 replies
2h26m

I don’t know much about fixing cars, but yeah, YouTube is a treasure trove for tacit knowledge.

ysavir
0 replies
5h35m

Wish I did, but here you're at the algorithm's mercy, unfortunately. One possibility is subbing/accruing watch time on channels that you find provide you the right value, so that the algorithm might recommend similar channels on other subject matters.

imiric
3 replies
13h59m

> But making a video takes time and effort, so has a much higher barrier to use as a click farm.

> The mainstream channels aren't all that useful, but there's a ton of "old web" style videos (some even recent) passionately providing details for almost anything you'd think to search. And they're a gold mine.

This won't be the case for long. YT is already starting to be polluted with spam and AI generated content, which will get more and more common. The same thing that happened to the web in text form, will happen to videos.

I think the only solutions are using allowlists for specific domains, and ironically enough more AI to filter specific results. Or just straight up LLMs instead of web search, assuming they're not trained on spam data themselves.

ysavir
0 replies
5h38m

One critical difference is the date attached to youtube videos. It's easy to verify that a video was made before this tech was available, but you can't do that with websites, or search engine result pages.

It does limit utility for more modern needs, unfortunately.

lrem
0 replies
4h52m

Note that the problem of filtering bad data out of learning material isn’t inherently easier than filtering same out of search results.

danieldk
0 replies
10h51m

Yeah. I was recently looking for videos comparing two smartphones, and among the top-ranked videos there were ones that just show the phones and their specs side by side, and ones that are just LLM-generated text added to the video with TTS.

necovek
1 replies
7h9m

That's curious. I generally hate video due to the inability to glance over content, and the few attempts I made to actually find the information I was searching for resulted in... spammy, extra-low-effort video content that did not answer my questions.

williamcotton
0 replies
2h29m

Depends on what you’re looking for. A blog post about how to play Search and Destroy by The Stooges is not as useful as a video of James Williamson himself showing you the riffs!

kristopolous
7 replies
15h10m

I'm a big fan of the non-commercial search engines because of the gaming aspect. If you're not generating revenue from the clicks, the game mostly goes away.

I'm not saying people aren't entitled to make some money, but it clearly incentivizes user hostile behavior.

Maybe make it an option because legitimate sites like journalism also use this model.

Renaud
6 replies
15h2m

Subscription model like Kagi seems to work pretty well against gaming the results.

Their only remaining incentive is to be good enough that people keep paying for the service.

Nextgrid
3 replies
13h41m

It works not because they're somehow smarter or have more resources than Google at detecting spam/SEO, it's because unlike Google (and other ad-supported search engines), they make money from result quality and have an interest in blocking spam.

Google on the other hand makes money off ads (whether on the search results page itself or on the spam sites), so spam sites are at best considered neutral and at worst considered beneficial (since they can embed Google ads/analytics, and make the ads on the search results page look relatively good compared to the spam).

Black-hat SEO has been around since the early days of search engines and they managed to keep it at bay just fine. What changed isn't that there was some sudden breakthrough in malicious SEO, it's that it was more profitable to keep the spammers around than to fight them, and with the entire tech industry settling on advertising/"engagement" as its business model, the risk of competition was nil because competitors with the same business model would end up making the same decision.

The same reason is behind the neutering of advanced search features. These have nothing to do with the supposed war on spam/SEO, so why were they removed? Oh yeah because you'd spend less time on the search results page and are less likely to click on an ad/sponsored result, so it's against Google's interests and was removed too.

ec109685
2 replies
13h18m

Kagi works because there is no incentive for SEO manipulators to target it since their market share is so small.

Super tinfoil hat to believe Google wants to send users to blog spam websites (e.g. beneficial to Google).

Anytime there is money to be made, there is an effectively infinite amount of people trying to game the system.

lanstin
0 replies
13h3m

Google is a complex system, so “want” can just mean: we are making money from the blog spam, and while we don’t like it, other things take priority over fighting it as effectively as we could.

ZeePelli
0 replies
6h0m

It's never tinfoil-hat to assume that a corporation is, at very least, making sure not to fight too hard against any activity that brings it more revenue.

whakim
1 replies
13h9m

But the author tried Kagi and the results don't appear to be noticeably different, filled with scammy adspam just like Google and Bing. Kagi's results seem to mostly aggregate existing search engines [1], so this isn't much of a surprise. Perhaps a subscription-based service that operates an index at Google's scale might help, but no such thing exists to my knowledge.

[1] https://help.kagi.com/kagi/search-details/search-sources.htm...

greggh
0 replies
1h50m

Right, but Kagi has built in tools to make it easy to fix that. Blocking those spammy sites from ever showing up again. Moving certain sites up the ranking, and so on. These features mean that over time my Kagi results have become nearly perfect for myself.

stevage
4 replies
13h50m

I have a hard time believing it's so difficult for a search engine to distinguish a credible, respected website that has been around a while from some generated garbage that exists only to be a search result. We humans can tell them apart, so in principle, computers can too.

Nextgrid
2 replies
13h34m

Yes, this should be table stakes for a classifier - a company with the resource of Google can definitely solve that problem if they weren't themselves in the business of spam (advertising) and benefited from spam sites (as they often include Google ads/analytics).

navigate8310
0 replies
11h6m

Google is quite quick at plugging holes in AdSense, but not AdWords.

ametrau
0 replies
1h7m

> table stakes

Always “table stakes”. Do you think in buzzwords also? I’ve always wondered this. Or do you think in normal words and then translate them into this bandwagoning / membership-proving garbage?

pixl97
0 replies
54m

I guess this brings up the question of how good are humans at doing this across a wide number of domains on average?

The other question I have is how long do these garbage results stay up for a particular query on average?

teeray
1 replies
14h20m

> it just takes too long to get info from videos.

I can’t wait until video transcripts get fed into LLMs just to eliminate the whole “This video is sponsored by something-completely-unrelated, more about them later. What’s up Youtube, remember to like, share, subscribe…” [5 entire minutes pass on similar drivel] before the actual thing you want, but stretched out to an agonizing length.

execat
0 replies
14h8m

You need SponsorBlock.

Usually people leave a "highlight" marker which tells you where you're supposed to jump to, along with the regular "This video was brought to you by <insert VPN>".

lamontcg
1 replies
13h56m

> Though I do sympathize with folks who reminisce about 2008 Google Search results, there was probably orders of magnitude less content out there and a near-complete ignorance of how valuable your search placement is to your business, and thus no SEO.

That was a decade after Google was created; people certainly understood SEO by then, and Google was constantly updating its algorithm to punish people who were trying to game it.

The Wikipedia page on "link farming", for example, references it happening as early as 1999, targeting SEO on Inktomi:

https://en.wikipedia.org/wiki/Link_farm

I remember some internal presentations at Amazon around ~2004 about how boosting Google SEO on Amazon web pages increased traffic and revenue (and Amazon was honestly a bit behind-the-curve due to a kind of NIH syndrome).

bee_rider
0 replies
2h42m

At the time it seemed like Google was winning, though. SEO seems to have gotten really good, or maybe Google just gave up.

boomboomsubban
33 replies
14h23m

Can someone tell me why Bing, and thus DDG, has switched to prioritizing local results? I'll search the most inane things, like lyrics to a song, and get results for local businesses containing maybe one word in common.

It's most frustrating with phone numbers. I picked up the habit of searching the random numbers that called me, to try and find out if they were possibly important. I used to get a bunch of spam sites that clearly existed to profit off me making those searches.

Both Google and DDG have removed those spam sites, even though they were useful at times. Google will tell me the number is in some random PDF that contains a few of the digits, then no other results. DDG will say the top result is my local police department, something that freaked me out the first few times.

alexforster
11 replies
13h27m

DDG is just repackaged Bing. Always has been. I remember looking into them when I was ready to job-hop many years ago, and they asked for dedication to their search engine as their foremost requirement for employment. It's the "drop-shipping" equivalent of search engines.

behnamoh
10 replies
12h56m

Hope Kagi takes DDG's place in terms of adoption. Never really liked DDG, even though I always care about privacy.

mrweasel
9 replies
9h11m

I really don't get that sentiment. Currently Kagi is just as dependent on Google as DuckDuckGo is on Bing. That might only be temporary of course and Kagi does seem to be working on a search engine of their own.

Rather than wanting Kagi to take the place of DuckDuckGo, it would be better if Kagi could take users from Google, and then, when ready, drop Google as a search provider.

Semaphor
5 replies
8h37m

Kagi mixes google, bing, some non-profit small-web SE, and their own index.

mrweasel
4 replies
7h25m

I don't think they use Bing, but yes, Google, Marginalia, Yandex, Brave and others. I still fail to see how that's different to DuckDuckGo, who also run their own crawler. It's really weird that people are almost hating on DuckDuckGo for how they run their search engine, while applauding Kagi for doing the same, but with a different business model.

speedgoose
2 replies
7h1m

I also assume that Kagi uses some shady residential IP proxies and similar tricks to scrape Google, while DDG has access to the Bing API.

mrweasel
1 replies
6h40m

You can buy access to the Google Search API, which is what I assume Kagi does. Building your product on being able to circumvent some Google restrictions seems like a bad business move, if you can buy the same service for a reasonable price.

speedgoose
0 replies
3h3m

Where can I buy it?

Semaphor
0 replies
3h47m

Only if they changed that (which they might have as part of their cost-optimization). They said they mixed bing and google results back then.

feanaro
1 replies
7h15m

Kagi should hire the Marginalia author.

freediver
0 replies
1h52m

We already include Marginalia results in Kagi [1]

https://help.kagi.com/kagi/search-details/search-sources.htm...

Kiro
0 replies
6h24m

DDG used to be the HN darling and you would get downvoted for saying anything negative or even insinuating that they are relying on Bing. Now the spot has been overtaken by Kagi but it looks like it suffers from the same problems. The counterargument that they have their own index as well is the same that was used for DDG, when the reality was that it was only used for widgets and other fluff. Let's see how it plays out for Kagi.

teeray
2 replies
14h10m

> I'll search the most inane things, like lyrics to a song, and get results for local businesses

Query: “I’m coming out of my cage…”

Result (Ad): “You’ll be doing just fine with these amazing year-end closeout prices at Al’s Discount Car Barn. Gotta come down—you’ll want it all!”

moffkalast
0 replies
8h27m

It was only a list, how did it end up like this?

boomboomsubban
0 replies
14h6m

Ads would make sense, but there's no way my local city council is paying Bing and they are the most frequently listed result.

callalex
2 replies
13h5m

I’m confused, you are searching for, specifically, a local phone number and you are upset that the machine interprets that as you looking for a local result? That’s what most people expect from a local number search.

Perhaps the incorrect thing is not your internet search results, but actually your phone carrier for lying to you and telling you that a caller has a local number?

tedunangst
0 replies
12h51m

If I search for a ten digit number, it is not helpful to return a local business that shares the last four digits.

boomboomsubban
0 replies
12h36m

The number is local, and occasionally I've searched and found the number was a local clinic or business that had a legitimate reason to call me but not leave a message. In those scenarios, close to all ten of the digits are found on the page.

The top result being my local police department because it shares the same area code and has maybe one other number in common is clearly a bad result. It does this even if the phone carrier isn't lying to me and the caller does have a local number, like the increasingly common political spam calls.

berkut
2 replies
13h46m

Yeah, I've noticed this as well with DDG recently: even with the localised checkbox disabled it still prioritises them, which often is very frustrating as the results are then almost totally useless.

However, more generally, I've personally found that DDG (and maybe Bing's then?) localised results are just really bad, and have been for the multiple years I've been using DDG and it's had this feature: I'm in New Zealand, and enabling localised / region-based search still often provides results to pages with TLDs like "co.uk", ".ca" and ".pl" (these latter are really common for content-generated spam in my experience), which I just can't understand...

Unfortunately, I have found that Google's results are usually a lot better in terms of being "location-aware" than DDG, at least when that's what you want...

notnullorvoid
0 replies
1h7m

It's a bit surprising that you're seeing spam sites with .ca; those are illegal here, and all .ca domains must be registered by someone in Canada.

You can report them: https://ised-isde.canada.ca/site/canada-anti-spam-legislatio...

n_plus_1_acc
0 replies
3h25m

I have the same experience from Germany. There's the slider, but it's not doing much.

skygazer
1 replies
14h1m

If you’re going to search for phone numbers you’ll want to ensure you enable verbatim searching under Tools on Google, and put the number in quotes, perhaps in “xxx-xxx-xxxx” OR “(xxx) xxx-xxxx” forms. Many of the sites you mention are fake sites with fake contacts just for ad serving, and I’ve read that in a few cases the scammers seeded the spoofed numbers they appear to call from onto the sites they control, to see who googles their phone numbers.

pixl97
0 replies
1h29m

Reverse spoof the numbers of FTC investigators and Google employees?

csnover
1 replies
12h39m

Man, thank you for saying this. Stuffing results with geolocated local junk despite explicitly opting out by choosing “All regions” is so frustrating. This wasn’t happening a year or two ago. I submit negative feedback about it constantly, but I guess not enough people are doing that for anyone to notice or care.

I’ve also noticed a significant increase in attempts to stuff news into regular search results. I really do not appreciate being force-fed mental health poison. I don’t need it ever, but I especially don’t need it when I’m searching for some specific technical thing and then get emotionally sabotaged by some clickbait headline because … why? Some bullshit KPI? Why are tech companies so obsessed with pushing news into every orifice?

stavros
0 replies
7h25m

Hah, calling the news "mental health poison" is the most accurate thing I've read all day.

bscphil
1 replies
1h53m

> Bing, and thus DDG, has switched to prioritizing local results

From what I can tell this is an issue with the Bing API that DDG uses that the DDG folks have been unable to resolve. I've tried many identical queries between DDG and Bing and while Bing does occasionally return incorrect local results, the completely irrelevant local results that appear on almost every DDG search do not seem to happen with Bing itself.

From what I understand, DDG is aware of the issue. I don't know why it isn't more of a priority.

binarymax
0 replies
20m

Long time DDG user (>10 years) here, and it’s astounding to me that they haven’t prioritized building their own independent index to switch off Bing. I would have expected them to do it like 5 years ago, but there’s afaik no initiative to do so. It’s unfortunate, and I am now trying other engines like Brave Search.

Gualdrapo
1 replies
14h6m

Maybe it was an attempt to improve their local results?

Searching for results from my country in DDG (picking the country in the drop-down below the search box) still returned results from the USA or other countries, even when searching in the local language. Maybe they tried to fix that because it really sucked, so much so that I never used it again for searching local websites.

boomboomsubban
0 replies
14h0m

This is the one area where it still ignores my location. I live in a town named after a UK city; there are several bigger towns in the US with the same name. I just searched "McDonalds city name". I got results for locations at least half the US away from me, as well as Uber Eats GB.

michaelbuckbee
0 replies
2h24m

Nearly every local search is a leading indicator of buying intent and, therefore, is worth more money when answered with commercial results instead of an authoritative response.

mattigames
0 replies
7h26m

In my country (Colombia) Google still has not removed those spam sites that just generate all possible numbers.

dvngnt_
0 replies
13h18m

you can use true person search for numbers

bpodgursky
0 replies
13h19m

I suspect it's a failure to distinguish mobile searches (where people are legitimately looking for a business) from desktop searches.

bambax
30 replies
8h26m

I'm in the camp of those who think Google's results are still very good. I admit I use adblock (uBlock Origin) and won't even try to disable it.

I understand the author's point of turning off their ad blocker "to get the non-expert browsing experience" but then they could make a different test with uBlock on for every query and see how it goes.

It's also a bit inconsistent to expect results for downloading videos mentioning yt-dlp while trying to emulate "the non-expert browsing experience"... Yt-dlp is a command-line Python utility. Talk about non-expert! Most people don't know that videos are files that can be downloaded; of those who do, most don't know about the command line or Python.

Yet when searching for "how to download youtube videos" the first result I get on Google is a link to a service called "savefrom.net", which appears to work well and does not seem to be a scam. This would qualify as "very good" in my book.

When searching for "how to download youtube videos from the command line" the first few results are about youtube-dl, including links to github and superuser. Granted they don't mention yt-dlp, but youtube-dl is a good start.

gkbrk
17 replies
6h54m

When I do a Google search in an Incognito tab for "how to download youtube videos", the first two results I get are the following.

- https://msunduziassociation.online/perfect-online-videos/

- https://gssaction.org/program-all-in-one-media-solutions/

I would certainly put those in the "Terrible" category like the author.

sanderjd
6 replies
4h38m

I'm curious: what is the rationale for "in an incognito tab" being part of the test harness?

It seems pretty arbitrary to me to disable one of the key features - in this case personalization - of the software being evaluated.

Or is the evaluation not between "search engines" but rather "search engines without personalization"? If so, then this restriction does make sense. But that is not the evaluation that "normal users" are interested in.

Majromax
4 replies
4h27m

> I'm curious: what is the rationale for "in an incognito tab" being part of the test harness?

It's the closest we can easily get to the 'average user experience'. Someone who has a long account/cookie history with Google has plausibly trained the site to return more relevant results through implicit user-curation of avoiding obvious-to-them SEO-spam on other queries.

If we posit that every user eventually trains Google to avoid SEO spam, then this raises the question of why Google(/Bing) don't eliminate the SEO spam in the first place.

Besides that, it's not obvious why search engine personalization should dramatically change the basic utility of search results. We should expect personalization to mostly address ambiguities: is 'the best way to set up tables' asking about furniture assembly/carpentry or SQL? None of the author's queries for this article supported such ambiguities, and besides that the results returned (see the final appendix) aren't[†] valid answers to a different interpretation of the question.

[†] -- I think I'd quibble about the 'adblock' question, since a reasonable person might still find an adblocker that works but participates in the 'acceptable ads program' to be sufficient.

sanderjd
2 replies
3h13m

> It's the closest we can easily get to the 'average user experience'.

Maybe it's the closest we can get (though I doubt it), but it definitely isn't close enough to tell us anything about the "average user experience".

The average user has been using google for years, without taking any steps to avoid personalization. An incognito session (on a browser / machine / network that is probably fingerprinted...) is pretty much the opposite of that typical usage pattern.

I recognize that just writing a blog post or comment on HN is not a research project, so one needs to do something quick, but I think it mostly invalidates the experiment. What would get closer would be to devise a few user personas and attempt to search and browse for a while within those personas before trying the experiment. Or, much better yet, put together a focus group comprised of real people within the personas you're interested in, and run the experiment using their real accounts.

> If we posit that every user eventually trains Google to avoid SEO spam

I don't think it's that, I think it's that every user trains it to return results more likely to improve the metric of "more likely to click one of the links", and I think that makes it more, not less, likely that they see what most of us here consider to be spam.

But I don't know! Maybe that's not what this experimental setup would show. But it would be a lot more enlightening than a setup using a fresh incognito window, which reflects the usage pattern of a proportion of search queries that is a tiny rounding error above zero.

SV_BubbleTime
1 replies
2h48m

Why are you assuming all users are logged in to google all the time?

nvm0n2
0 replies
2m

Google has billions of user accounts ....

Jcowell
0 replies
4h4m

> It's the closest we can easily get to the 'average user experience'

You wouldn’t really be taking the average here though, would you? You would be capturing the experience someone might have if they were in incognito, using Google for the very first time, or using Google on another device for the very first time, but not the “average experience”.

gkbrk
0 replies
4h21m

Google gets paid when you click on an ad. It's reasonable to guess you're not going to click on too many scam software ads with your software engineer profile, so naturally you'll be shown fewer of them.

In this thread we can see people who are both using incognito tabs yet seeing different results; it will only become harder to compare if they are using personalized results.

londons_explore
4 replies
5h55m

Did you click either of those links?

Both seem to do the job of downloading a youtube link to mp4 for free.

gkbrk
2 replies
5h51m

Did you click either of those links? They are not YouTube video downloaders; they just link to another downloader. There is nowhere on those links to even put a YouTube URL.

Are you seriously suggesting that a website with the following "About us" with only a link to another YouTube video downloader is itself a good YouTube video downloader?

> Good Samaritan Support Action is to reawaken the Body of Christ to receiving the extravagant love of The Father, as well as our call to respond to this love by loving God with all of our hearts, souls, strengths, and minds. In order for people’s hearts to be linked to the heart of our Heavenly Father, we want to foster and facilitate the establishment of a culture of love in our churches and ministries.

londons_explore
1 replies
2h38m

So, there is one extra click... But for the user, the site does the job and takes an extra second.

Ideal? No. But it does the trick.

hamasho
0 replies
2h6m

Not GP, but navigating to an unrelated scammy site that just has a link to the actual site is a terrible and unethical job by Google. Imagine if you search "youtube" and the top result is not YouTube but some scammy site that just has a link to YouTube. It's not about click counts; if the YouTube downloader has bad UX and requires extra clicks, that's a bit inconvenient but OK.

tantalor
0 replies
5h44m

Those are both garbage/scam sites

Dah00n
2 replies
6h43m

I get savefrom.net in both Incognito and normal tabs, uBlock or not. I have no idea why you get crap results that are somehow different. uBlock doesn't change google results in Firefox for me at all. It seems you get crap added, not removed.

gkbrk
1 replies
6h35m

I searched with Chrome, perhaps that's the difference. Firefox also blocks some ads out-of-the-box even without uBlock, so maybe it was already blocked.

It could also be related to targeting, like time zone, location, IP address, age group etc.

Dah00n
0 replies
5h48m

I get the same search result in Edge as in Firefox. Can't test in Chrome, but something seems strange.

cj
1 replies
6h26m

My top 2 (incognito) are blog posts from pcmag.com and zdnet.com listing 5 ways to download YT videos. Maybe it's blogspam, but the listed services seem valid at first glance.

savefrom.net is the 5th result (2nd page underneath 5 youtube videos)

Edit: This is from the US. If i had to guess, these are regional differences. What country are you in?

emmelaich
0 replies
4h14m

I got similar to you; I'm in Australia.

teleforce
2 replies
6h30m

I'm also in the camp that thinks search results from Google are very good, but ChatGPT-based search with RAG is better, granted it's the paid version. The latter, however, is kind of experimental; personally I would love to have another column for ChatGPT with RAG (Bing), and the fact that the author ignored RAG is rather strange.

jll29
0 replies
3h0m

The topic of control (in ChatGPT like models) explained: https://arxiv.org/pdf/2311.11701.pdf

erybodyknows
0 replies
3h48m

For those (like me) wondering what RAG means: “Retrieval Augmented Generation (RAG) represents a groundbreaking approach in information retrieval, where the accuracy of search results directly influences the quality of generated answers. In essence, RAG combines traditional search mechanisms with Large Language Model's ability to understand and generate answers.”

(https://www.linkedin.com/pulse/how-we-increased-search-accur....)
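
To make the quoted definition concrete, here is a toy sketch of the retrieve-then-generate pattern (my own illustration; the documents are made up and the final LLM call is left as a placeholder):

```python
# Toy sketch of the RAG pattern: retrieve the most relevant snippets first,
# then hand them to a language model as context. Everything here is sample data.
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    terms = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(terms & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Vancouver rarely gets heavy snow; most winters see only a few snowy days.",
    "Wider tires increase the contact patch, which changes grip characteristics.",
]
print(build_prompt("does it snow in vancouver in winter", docs))
# The resulting prompt would then be sent to an LLM to generate the answer.
```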

bee_rider
2 replies
2h43m

An adblocker is necessary, and IMO a script blocker as well. I feel vaguely like search has gotten worse over time, but it is not a huge problem—usually a good site is on the first page or two, and so I can just go check them out.

But if clicking a site meant I would be under attack, that really increases the stakes, I start to care strongly about the absence of bad sites, not just the existence of a good one.

Other than that, people need to be trained to not download programs from websites in general. I think this has gotten better over time? This is just a human mistake. Maybe Google could suppress sites that link to executables. It must, right?

pixl97
1 replies
1h21m

It would make sense to suppress links to malware executables, but for general programs I don't see why they would.

bee_rider
0 replies
1h4m

By the time you know enough about a site to download some random executable off it and run it, you know more than enough to just enter the URL, so there’s no point to having it show up in search results.

anonymoushn
2 replies
5h24m

cross-posted: Did you try using savefrom.net? You can type "https://www.youtube.com/watch?v=IkYVmtgxebU" into the text box and hit "Download". Then you'll get a new tab that tries to get you to install malware. If you decline to install it, the new tab takes you to the malware's homepage. If you close the tab and go back to the original tab, savefrom.net presents you with an error message saying "The download link not found." and does not help you download the video.

vagrantJin
1 replies
5h2m

savefrom.net used to be good, but it seems they've switched their MO. Plenty of decent alternatives filled the gap though.

anonymoushn
0 replies
4h57m

Can you name the alternatives, and are they present in the search results?

omoikane
0 replies
55m

> they could make a different test

The takeaway I got from the article is everyone can make their own test, as opposed to relying on other people's sentiments and memes about X is bad or Y is good.

Trying to emulate a non-expert experience without workarounds is not the common usage pattern, since everyone familiar with their favorite tools has ways to get more value out of them. But this article presents a way of constructing an experiment (this is why I chose these queries, this is how I ranked scams, etc.), and I think people should follow this same spirit to evaluate whether they are stuck in a local optimum with their current choice of tools.

beezle
0 replies
4h9m

Put me in the camp that thinks Google and the rest are horrible for all but very specific/unique technical terms, e.g. "weak neutral currents". Anything that is more "everyday life" is an exercise in futility, sorting through trash, often without even the terms you are looking for. And good luck with "verbatim" searches - either ignored or zero results.

ametrau
0 replies
1h12m

Another shill comment. Be wary of this when criticising a trillion dollar company. They can afford the shills.

haizhung
29 replies
5h40m

What always confuses me about the „search has gotten so bad“ mentality is that it is often based on anecdotal evidence at best, and anecdotal recollection at worst.

Like, sure, I have the impression that search got worse over the last years, but .. has it really? How could you tell?

And, honestly, this should be a verifiable claim; you can just try the top N search terms from Google trends or whatever and see how they perform. It should be easy to make a benchmark, and yet no one (who complains about this issue) ever bothers to make one.

Dan at least started to provide actual evidence and criteria by which he would score results, but even he only looked at 5 examples. Which really is a small sample size to make any general claims.

So I am left to wonder why there are so many posts about the sentiment that search got worse without anyone ever verifying that claim.

marginalia_nu
11 replies
5h34m

I think the point he's trying to make is that the search results pages from the mainstream search engines are a minefield of scams that a regular person would have difficulty navigating safely.

If he was looking at relevance, yours would be a solid point, but since most of the emphasis is on harm, a smaller sample works. Like "we found used needles in 3 out of 5 playgrounds" doesn't typically garner requests for p-values and error bars.

sanderjd
8 replies
4h32m

I think this is a good illustration of my frustration with this discussion: I don't think search has gotten bad, I think the web has gotten bad. It's weird to even conceptualize it as a big graph of useful hypertext documents. That's just wikipedia. The broader web is this much noisier and dubious thing now.

That's bad for google though! Their model is very much predicated on the web having a lot of signal that they can find within the noise. But if it just ... doesn't actually have much signal, then what?

marginalia_nu
3 replies
3h3m

On the one hand, I'm not sure the data corroborates that. If this is a web problem and not a search engine problem, then I'd expect every search engine to have the same pattern of scam results.

I'd also argue that finding relevant results among a sea of irrelevant results is the primary function of a search engine. This was as true in 1998 as it is today. In fact, it was Google's "killer feature", unlike Altavista and the likes it showed you far more relevant results.

gmd63
1 replies
2h26m

If the web is being polluted by a nefarious search engine provider that is excluding the polluted pages from their algorithm, you wouldn't see the same pattern across search engines

Not saying or even suggesting that's happening, but the logic isn't airtight

marginalia_nu
0 replies
2h4m

Well, there's always the Münchhausen trilemma, by which no reasoning is airtight.

pixl97
0 replies
1h11m

Relevant is a difficult concept to agree on. In 1998 it was more about X != Y, that is being shown legit pages that just were not the correct topic.

These days the results are apt to be the correct topic, but instead optimized for some other metric than what the user wants. For example downloading malware or showing as many crypto ads as possible.

I don't expect every search engine to have the same scam results. Scammers target individual search engines with particular methodologies. Google does a lot of work to prevent crap on their engines, the issue is the scammers in total do far more.

whakim
1 replies
3h46m

But there's still plenty of signal. It isn't as if there are no working YouTube downloaders, or factually correct explanations of how transistors work. It's just that search engines don't know how to (or don't care enough about) disambiguating these good results from the mountains of spam or malware.

devinmcafee
0 replies
2h36m

I think that both of you are correct. The internet has much more "noise" than in the past (partially due to websites gaming SEO to show up higher in Google's search results). As a result, Google's algorithm returns more "noise" per query now than it used to. It is a less effective filter through the noise.

Imagine Google were like a water filter you install on your kitchen faucet to filter out unwanted chemicals from your drinking water. If as the years progress your municipal tap water starts to contain a higher baseline of unwanted chemicals, and as a result the filter begins to let through more chemicals than it did before, you'd consider your filter pretty cruddy for its use case. At the bare minimum you'd call it outdated. That is what is happening to Google search

dpkirchner
1 replies
2h35m

The web has gotten bad because of what big search engines have encouraged. If they stopped incentivizing publishing complete garbage (by ruthlessly delisting low quality sites regardless of their ad quantity, etc) then maybe we'd see a resurgence of good content.

48864w6ui
0 replies
1h14m

The web is bad because it is both popular and commercial. Every now and then I fantasize that just finding a sufficiently user-hostile corner would suffice to recreate the early internet experience of an online world nearly exclusively populated by anticommercial geeks.

hyperpape
1 replies
4h29m

I agree we can say "this is a minefield of scams" without doing a comparison.

There still is a question about when it got bad -- I think Dan mentions 2016 as a point of comparison, and there were plenty of scams back then, so you might wonder whether there ever were days when a query wouldn't return many scams.

If you go back far enough, then there wasn't the same kind of SEO, and Internet scams were much smaller/less organized, but that's a long time ago.

pixl97
0 replies
1h8m

I think the automation tools for scams are what the major change is. In the distant past it was humans doing this, now I'm guessing there are a few larger businesses and likely nation states that have a point and click interface that removes 99% of the past work.

anonymoushn
2 replies
5h35m

Probably for the same reason that there are so many more posts about anything that make claims than that explore evidence systematically, especially when the people making the posts stand to gain nothing by spending their time that way.

I encounter claims that "protobuf is faster than json" pretty regularly but it seems like nobody has actually benchmarked this. Typical protobuf decoder benchmarks say that protobuf decodes ~5x slower than json, and I don't think it's ~5x smaller for the same document, but I'm also not dedicating my weekend to convincing other people about this.
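
For what it's worth, a quick way to sanity-check that decode-speed claim yourself looks roughly like this (a sketch assuming the protobuf package is installed; Struct is only a stand-in for a real generated schema, which would normally decode faster):

```python
# Rough benchmark sketch: compare decode times for the same payload in JSON
# and in protobuf's generic Struct message (pip install protobuf).
# A real comparison would use a purpose-built .proto schema, not Struct.
import json
import timeit
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Struct

payload = {"id": 123, "name": "example", "tags": ["a", "b", "c"], "score": 0.5}
json_blob = json.dumps(payload)
pb_blob = json_format.ParseDict(payload, Struct()).SerializeToString()

json_time = timeit.timeit(lambda: json.loads(json_blob), number=50_000)
pb_time = timeit.timeit(lambda: Struct.FromString(pb_blob), number=50_000)
print(f"json: {json_time:.3f}s  protobuf Struct: {pb_time:.3f}s")
```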

ForkMeOnTinder
1 replies
3h58m

The problem with benchmarking that claim is there's no one true "json decoder" that everyone uses. You choose one based on your language -- JSON.stringify if you're using JS, serde_json if you're using Rust, etc.

So what people are actually saying is, a typical protobuf implementation decodes faster than a typical JSON implementation for a typical serialized object -- and that's true in my experience.

Tying this back into the thread topic of search engine results, I googled "protobuf json benchmark" and the first result is this Golang benchmark which seems relevant. https://shijuvar.medium.com/benchmarking-protocol-buffers-js... Results for specific languages like "rust protobuf json benchmark" also look nice and relevant, but I'm not gonna click on all these links to verify.

In my experience programming searches tend to get much better results than other types of searches, so I think the article's claim still holds.

anonymoushn
0 replies
3h47m

I agree. You wouldn't use encoding/json or serde-json if you had to deserialize a lot of json and you cared about latency, throughput, or power costs. A typical protobuf decoder would be better.

bee_rider
1 replies
2h51m

> So I am left to wonder why there are so many posts about the sentiment that search got worse without anyone ever verifying that claim.

I suspect it has gotten worse, so posts complaining about it resonate. But, it is not really a huge problem, and anyway it isn’t as if there’s much I can do about it, so I’m not going to bother collecting statistically valid data.

I think this is generally true about a lot of things. We should be OK with admitting that we aren’t all that data-driven and lots of our beliefs are based on anecdotes bouncing around in conversations. Lots of things are not really very important. And IMO we should better signal that our preferences and opinions aren’t facts; far too many people mix up the two from what I’ve seen.

pixl97
0 replies
1h5m

When it comes to human psychology what we believe tends to be more important than what is when it comes to future predictions of our actions. If people think search sucks then it's likely they'll use less of it in the future and it opens up companies like Google for disruption.

williamcotton
0 replies
3h6m

Dan approached the problem from a qualitative perspective. Perhaps if more people took this approach over quantitative maximalism we would actually have products that don’t drive us fucking insane.

All that matters is the overwhelming sentiment that search has gotten worse, not the same fucking spreadsheet that got us here in the first place!

ta988
0 replies
1h1m

Yes, to get an accurate comparison we would need to have results from queries 10 years ago.

I still remember often having to go to page 3 and beyond of Google searches to find things, even really early on.

I think it has never been good, got a bit better before SEO farms took all the gain out. That's my feeling with nothing to back it.

nvm0n2
0 replies
4m

> has it really? How could you tell?

Yes it has and for a certain class of queries it's not even open for debate, because Google themselves have stated they deliberately made it worse. And they really did, it's very noticeable.

This class of queries is for anything related to any perspective deemed "non authoritative". Try to find information that contradicts the US Government on medical questions, for example, and even when you know what page you're looking for you won't be able to find it except via the most specific forcing e.g. exact quoted substrings.

Likewise, try finding stories that are mostly covered by Breitbart on Google and you won't be able to. They suppress conservative news sites to stop them ranking.

15 years ago Google wasn't doing that. It would usually return what you were looking for regardless of topic. There are now many topics - which specifically is a secret - on which the result quality is deliberately trashed because they'd prefer to show you the wrong results in an attempt to change your mind about something, than the results you actually asked for.

narag
0 replies
2h37m

What always confuses me about the „search has gotten so bad“ mentality is that it is often based on anecdotal evidence at best, and anecdotal recollection at worst.

I can't speak for anybody else, just trying to find stuff online, not writing a treatise about it or writing my own engine to outcompete Google. It's been asked many times here over the years and the answer was always explanations, never solutions.

Shittification does not happen overnight, but over many years. It started with Google deciding that some search terms weren't so popular: "did you mean...?" (forcing a second click to do what you intended to do in the first place), and went downhill when qualifiers to override that crap got ignored.

For me enough was enough when I realized that a simple query with three words, chosen carefully to point to the desired page, gave thousands of results, none of them relevant. YMMV.

mgdlbp
0 replies
3h28m

Internet Archive remembers. https://web.archive.org/web/*/google.com/search/%2A

Find a query of interest, see for yourself (and take a snapshot of the present state for posterity).

The api enables more powerful queries, https://web.archive.org/cdx/search/cdx?url=google.co.jp*&pag...

Also try other search engines and languages.

laserbeam
0 replies
1h26m

Some things are easily quantifiable, but very few. Such as the number of ads per search. Back in the day google had at most 1 and it was visibly distinct from the rest of the links.

Otherwise, yeah, maybe search didn't degrade but the internet got more spammy. Or maybe users just got wiser and can see through the smoke screen better. Who knows...

Doesn't change the fact that today one has to know how to filter through pages of generic results made by low effort content farms. Results that are of dubious validity, which at best simply waste your time. Or through clones of other websites (i.e. Stackoverflow clones).

Search engines can choose to help with that (kagi certainly puts in the effort and I love it for that), or they can ignore the problem and milk you for ad clicks.

Anecdotal evidence is good enough for me.

jll29
0 replies
3h7m

> Dan at least started to provide actual evidence and criteria by which he would score results, but even he only looked at 5 examples. Which really is a small sample size to make any general claims.

US NIST, in their annual TREC evaluation of search systems in the scientific/academic world, use sets of 25 or 50 queries (confusingly called "topics" in the jargon).

For each, a mandated data collection is searched by retired intelligence analysts to find (almost) all relevant results, which are represented by document IDs for general search and by a regular expression that matches the relevant answer for question answering (when that was evaluated, 1998-2006).

Such an approach is expensive but has the advantage of being reusable.
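
Roughly how such regex answer keys work in practice (an illustrative sketch with made-up topics, not NIST's actual tooling):

```python
# Illustrative sketch of TREC-style QA scoring: an answer counts as correct
# if it matches the answer-key regular expression for that topic.
import re

answer_patterns = {  # hypothetical answer keys
    "What year did the Titanic sink?": r"\b1912\b",
    "Who wrote Hamlet?": r"(?i)\bshakespeare\b",
}

def is_correct(question: str, system_answer: str) -> bool:
    return re.search(answer_patterns[question], system_answer) is not None

print(is_correct("What year did the Titanic sink?", "It sank in April 1912."))  # True
```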

hn_throwaway_99
0 replies
1h27m

Even without looking at the subjective quality of search results, the sheer user hostility of the design of the Google search results page is an obvious, objective instance of how search has enshittified.

That is, in the early days, Google used to highlight that "search position couldn't be gamed/bought" as one of their primary differentiators, ads were clearly displayed with a distinct yellow background, and there weren't that many ads. Nowadays, when I do any remotely commercial search the entire first page and a half at least on mobile is ads, and the only thing that differentiates ads from organic results is a tiny piece of "Sponsored" text.

fumeux_fume
0 replies
1h44m

So you're confused why other people aren't doing research for you, and when they do provide some evidence, you dismiss it because it's not a large-scale scientific inquiry into search quality? Get a frickin' grip.

avsteele
0 replies
5h29m

I don't think this is a fair criticism.

1) The step where you evaluate "how they perform" is necessarily subjective.

2) you could design a study and recruit participants but that isn't something a blogger is going to do.

3) He does link to polls where people agree with the idea that the results have gotten worse. Yeah, there are sampling problems with a poll, but it's better than nothing.

In this case especially, the writer is answering the question: "Whose results are best according to my tastes?"

arp242
0 replies
33m

To do this you would need to have a comprehensive definition of "quality", and that's not that easy, because at least partly it's subjective. It's also hard to include omissions in your definition of "quality" (and again, what should or should not be omitted is subjective as well).

For example, let's say I search for "Gaza"; on one extreme end some engines might only focus on recent events, whereas others may ignore recent events and includes only general information. Is one higher "quality" than the other? Not really – it depends what you're looking for innit?

All you can really do is make a subjective list of things you find important and rate things according to that, and this is basically just the same thing as an anecdotal account but with extra steps.

ametrau
0 replies
1h13m

Shill comment. Be wary of this sort of thing whenever criticising a trillion dollar company. They can afford the shills.

hannasanarion
18 replies
12h4m

I'm not able to reproduce the author's bad results in Kagi, at all. What I'm seeing when searching the same terms is fantastic in comparison. I don't know what went wrong there.

In the Youtube Downloader search, NortonSafeWeb is nowhere to be found. I get a couple of legit downloader websites, and some articles from reputable tech newspapers on how to use them or command line tools.

In the Adblock search, ublock Origin is #3, followed by some blogs about ad blocking ethics debates and the bullshit Google has been pulling recently.

In the wider tires grip search, #3 is a physics blog that dives deep into the topic.

In the transistors search, the first reddit link directly answers the question in very similar wording to the hypothetical correct answer spelled out in the rubric. 4/5 of the reddit results are on the correct topic, followed by two SuperUser questions also on the correct topic, then some Linus Tech Tips and Tom's Hardware articles, also on the correct topic. No Quora questions.

In the vancouver winter snow search, the first several results are from local newspapers talking about the anticipated effects of el nino on snowfall, and then a couple of high-quality blogs and weather sites.

Really wondering how Dan got such bad results.

------

Aside from that, the way that the author expects all the results to return the same kind of thing is just... weird? Like, that's not how search engines are supposed to work. A search that gives you 10 links to fundamentally the same thing is a bad search. Search results should cover a breadth of reasonable guesses for what you should be looking for given a query. If you search for "download firefox", and you scroll past the first 5 download links, then you're probably not actually looking for a download link and a blog post about firefox is not "irrelevant" and shouldn't be points against.

This opinion is even borne out in search engine quality metrics that have been industry-standard for decades, like mean reciprocal rank and discounted cumulative gain. What matters is how far you have to scroll to get to a good result, not what proportion of the first N results are good.
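
For reference, here is a small sketch of those two metrics computed over per-result relevance judgments (my own illustration, with made-up judgments):

```python
# Small sketch of the two ranking metrics mentioned above, computed over
# lists of per-result relevance judgments (1 = good result, 0 = bad).
import math

def mean_reciprocal_rank(rankings: list[list[int]]) -> float:
    # For each query, take 1 / rank of the first relevant result (0 if none).
    def rr(rels: list[int]) -> float:
        for i, rel in enumerate(rels, start=1):
            if rel:
                return 1.0 / i
        return 0.0
    return sum(rr(r) for r in rankings) / len(rankings)

def dcg(rels: list[int]) -> float:
    # Discounted cumulative gain: later positions contribute less.
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(rels, start=1))

print(mean_reciprocal_rank([[0, 1, 0], [1, 0, 0]]))  # 0.75
print(dcg([1, 0, 1]))  # 1.5
```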

trb
9 replies
11h19m

Same here, I was curious about Kagi's low ranking, and couldn't replicate the search results. Also saw uBlock Origin at #3, good results for tires, transistors and snow, etc. I've never used any of the Kagi search result weighting features.

Ctrl+F on the page for "System prompt" doesn't show any hits. Given how important those are for ChatGPT (another thought - was the author testing GPT3.5 or 4?) I'm not sure how much weight to put into the ChatGPT results either.

Not sure how much I can take away from this comparison.

spaceman_2020
8 replies
11h15m

I asked GPT-4 about Youtube Downloader and it rambled on about how downloading videos is against Youtube’s TOS and I should buy YouTube premium which has the download feature.

Getting any useful data from GPT-4 about anything even remotely “illegal” is a waste of time.

Semaphor
3 replies
9h49m

With a better prompt, you can get it to list some, but it’s very annoying to do so.

Mistral showed that their medium model is far better (yet not good), and the same prompt as in the article gives only one instead of 3 paragraphs of rambling about copyright, and then lists 3 categories of options with examples for each (not good, because ytdl is not one of those listed).

Funnily enough, both mistral and GPT4 apologize profoundly and almost with the same wording when asked "Why did you not mention the very popular, free and open source "youtube-dl" software?" and then mention how/where to get it and how to use it.

freediver
1 replies
4h24m

> Funnily enough, both mistral and GPT4 apologize profoundly and almost with the same wording when asked "Why did you not mention the very popular, free and open source "youtube-dl" software?"

Likely because they were optimized for the general population, which would not have a use for a command-line Python utility.

Semaphor
0 replies
3h48m

I’m clear on why they didn’t include it; I wanted them to tell me why, though. And the fact that both of them apologized in almost the same way was funny.

DominikPeters
0 replies
52m

It's plausible that mistral trained on GPT-4 output and therefore has similar mannerisms.

alextingle
0 replies
9h32m

claude.ai produced pretty reasonable results.

UberFly
0 replies
10h7m

So it has also become one of the glitterati. That didn't take long.

Huppie
0 replies
9h50m

The author already alludes to the fact that you can probably prompt-engineer around this and indeed, as soon as I added a blurb like "these are my own videos that I own the copyright to" it did suggest a bunch of third-party tools and let me ask it about what third-party tools I could use.

It suggested '4K Video Downloader', 'YTD Video Downloader', 'JDownloader' and 'Clipgrab' at first, and when I asked for CLI tools it came up with 'youtube-dl', 'yt-dlp', and 'ffmpeg'.

Those seem pretty reasonable results to me but I'll readily admit I don't know (yet) if 'most users' would ask these follow-up questions.
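(For reference, the CLI tools mentioned above are also scriptable. A minimal sketch using yt-dlp's Python API, assuming `pip install yt-dlp`; the video URL is just a placeholder:)

    # download a single video to the current directory, named after its title
    from yt_dlp import YoutubeDL

    with YoutubeDL({"outtmpl": "%(title)s.%(ext)s"}) as ydl:
        ydl.download(["https://www.youtube.com/watch?v=EXAMPLE_ID"])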

the__alchemist
2 replies
4h6m

I use Kagi because I'm trying to remove Google from my life, but their text search is worse than Google in my experience, and the image search is abysmal. I'm wondering how long I can keep this up. I already revert to Google for image search, and am finding myself using either Google or ChatGPT over Kagi more and more for text as well.

freediver
1 replies
1h43m

Kagi had a pretty substantial image search update just a few days ago [1]. Do you still see the issues with it?

[1] https://kagi.com/changelog#2793

the__alchemist
0 replies
40m

Good info - will experiment!

It's already performing better on a (n=1) test I tried.

"Talos Principle 2". (Video game sequel) Previously (~5 days ago), Google returned various screenshots etc from the game `The Talos Principle 2`. Kagi returned mostly results from `The Talos Principle (1)`. Now the latest Kagi results are a mix, mostly from 2. So, it does look like it fixed this query.

throwaway0665
0 replies
11h47m

have you customized your results and lowered or raised many domains?

szundi
0 replies
9h59m

Kagi is awesome for me too. I only notice it when I'm using Google somewhere else, because of the shit results.

iansinnott
0 replies
9h25m

I'll second the chorus of those curious to hear how you've customized the search engine. I was able to reproduce the lackluster results, and was sadly disappointed. I expected what you seem to have found, that Kagi would outperform.

A specific example: for "ad blocker" the first result was some paid ad blocker and ublock was down the page below the fold.

Semaphor
0 replies
10h14m

What region? I get similarly bad results with international (and a quick check with region US also didn’t improve things) and uBo at only #5, and ytdl at #12. And I already have github on "raise" and a bunch of domains blocked (not many though)

For the transistor query, it’s a very "googly" way of writing a query; when I saw the results I instantly felt like rewriting it, and the first try gave much better results with "Why keep cpu transistors getting smaller?". Caveat that while the results look better and more topical, I don’t know what a good answer would be, which is also why I didn’t evaluate the tires or Vancouver weather (I tried a local search for my city's weather, and while the first result was unrelated, the 2nd was okay)

edit: This whole thread made me finally create a file for documenting bad searches on Kagi. The issue for me is usually that they drop very important search terms from the query and give me unrelated results. But switching to verbatim or "forced terms" also prevents any kind of error correction of the search. This used to be one of my main annoyances with DDG back then, and Kagi did not have that issue during the early days.

Scaless
0 replies
10h36m

I have a new Kagi account with no custom rankings and I see the same terrible results. Basically the same as what he describes. yt-dlp is not found at all, the 2010 link to youtube-dl, and a bunch of spam sites.

coldcode
16 replies
15h40m

Search was the biggest feature of the web in the early '00s. Now it's such a mess. I can't imagine Search will ever be amazing again, given all the complexity of providing quality while still avoiding all the crap.

Libcat99
8 replies
15h34m

Is it actually more complex to provide good results, or is it just more profitable not to?

I have a hard time believing an organization like Google doesn't have the resources to provide a search engine that's just as usable as what they had 6 years ago (around the time I feel like the decay really set in). Seems a lot more likely that it's just more profitable to serve up garbage sponsored content.

jmye
6 replies
15h7m

Definitely more profitable not to. Especially as Google is an ad company, not a search company.

I’d rather see a world with numerous paid/subscription search engines, that are motivated to do nothing but return search results well. I expect you would see some of the SEO crap getting solved.

w-ll
5 replies
14h24m

I can't remember where I read this, but something about how Google ranks sites that have Google ads higher than sites that don't. Makes sense: it's evil, but makes sense, and that's why we get all this scraped spam. Is there any more info on this?

Nextgrid
2 replies
13h30m

Intentionally ranking sites with Google ads higher would be a huge antitrust liability, so no way they're doing that.

On the other hand, they can achieve virtually the same outcome while keeping plausible deniability by just not doing anything that would downrank sites with ads (of which a significant chunk is likely to be Google's).

Spam sites often include ads.

w-ll
1 replies
12h58m

I don't think they publicly disclose that fact.

Nextgrid
0 replies
8h23m

It doesn't need to be public to become an antitrust liability. Internal written material can still come up during discovery, potentially even in unrelated cases.

Therefore the safest option is to never openly discuss it or intentionally do it and instead use other means to achieve the objective (don't intentionally rank spam higher, just defund/cancel any projects that would make it rank lower).

mixdup
1 replies
13h33m

this is like focusing on one single problem as being the cause of the decline of the United States. It's actually a lot of things combining and there's not going to be one fix

w-ll
0 replies
12h58m

wtf decline of the United States are you talking about

zihotki
0 replies
13h14m

Google, or Alphabet, is not a search engine company. It's an ads company and that's what they are optimizing for.

packetlost
3 replies
15h39m

Search probably hasn't changed much, but the internet is very different.

jader201
2 replies
15h35m

Yeah, the problem is that there is so much low quality content that search doesn’t (or can’t) do a good job of surfacing the good stuff above the noise. There is still some signal left, but it’s such a small fraction that it’s much more difficult to pick out.

Having said that, I’m usually still able to find what I’m looking for, if I know that it likely exists, and know the keywords to use to find it. But it’s much harder nowadays for sure.

genewitch
0 replies
14h11m

i have a radio that can "hear" down to -130dBm, i've proven this empirically. Cellular signals work at -12dB or more below the noise floor, wspr works even lower than that. Lightning is broadband noise, and yet i can still use digital stuff when there's lightning storms.

I don't buy the signal to noise argument. For example, whenever i get on youtube and get fed some content, i can immediately tell if it's had AI involved anywhere, and thumbs it down. I won't recommend it, i've called people out for linking such tripe to me (or others).

Hear me out - google got bad about 11 years ago when the dorking stopped being effective, right around the time of the spotlight search results and the sponsored junk taking the top results. Around this time, various agencies (news, etc) started gaming the SEO to respond to any remotely related search with whatever the news was currently. Google chose not to "fix" this, because we're not the customer. DDG was better for a few years for real results, too, but that has gone downhill as well.

The current zeitgeist uses stuff like tiktok and facebook for "web searches" - "food trucks near Austin, TX" or so. No one really uses web search like people on this site do, and google couldn't care less if we don't like the search results.

amlib
0 replies
14h14m

I wonder how much influence google had in lowering the content quality over the years? After all, most SEO spam was a direct response to all the ludicrous requirements they've forced on the whole web, which eventually only SEO spammers were willing to commit to.

I also wonder if google just stopped existing, would the web heal over time?

jmclnx
0 replies
15h15m

To me it is only due to the ads; google and bing return nothing but ads on the first page. Plus, for me to have the joy of seeing these ads, I need to go through a CAPTCHA that I need to try multiple times.

But all in all, a very good article

Night_Thastus
0 replies
15h29m

The problem is that even if providers of the service are 100% trying to provide a great service, everyone on the web will always be min-maxing to appear on top.

So it's inevitably going to become crap.

1970-01-01
0 replies
14h50m

The golden era of search results is very much over. Welcome to the pot-metal era.

infamia
13 replies
14h45m

Try uBlacklist, it's like uBlock, but for search results.

https://addons.mozilla.org/en-US/firefox/addon/ublacklist/

https://chromewebstore.google.com/detail/ublacklist/pncfbmia...

You can sync the settings and your personal blocklist to either Dropbox or Google Drive. It also has the ability to subscribe to blocklists. Mind, you need to manually turn on search engines and subscribe to lists. The uBlacklist subscriptions setting doesn't have any built-in feeds yet though. :(

edit: There are some feeds on the uBlacklist site though. https://iorate.github.io/ublacklist/subscriptions

edit edit: Found an even better list of feeds. https://github.com/quenhus/uBlock-Origin-dev-filter#other-fi...
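(If I understand the rule format correctly, both the personal blocklist and subscription files are just one rule per line, using browser match-pattern syntax; the domains below are made up:)

    *://*.example-content-farm.com/*
    *://*.example-seo-spam.net/*

Subscribing to a list is then just pointing uBlacklist at a URL that serves a plain-text file of such rules.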

bayindirh
4 replies
2h41m

This is a feature of Kagi already. You can promote or blacklist domains in your search results.

EA-3167
2 replies
2h27m

Kagi is just the best, it feels like Google did before a decade+ of enshittification and ad tech.

cratermoon
1 replies
1h22m

Did anyone notice that Kagi showed as barely better than Google in the article?

_benj
0 replies
1h6m

Yeah, for me the results of kagi are so much better than anything else that it makes me wonder how objective one can be measuring search results.

I use google on a client's computer and it’s just horrible.

But it could also be a factor of the customizations I’ve made for my kagi: banned quite a few paywalled sites, always put Wikipedia articles on top, prefer blogs over stackoverflow stuff…

KomoD
0 replies
1h0m

But I can't do regexes, wildcards or anything like that as far as I can see, like I can in uBlacklist

And it seems like they also have a 1000 domain limit?

tentacleuno
2 replies
13h56m

uBlacklist is absolutely excellent: I've been using it for a few years now, with absolutely no problems.

Quick tip: turn on the 'Skip the "Block this site" dialog', and disable 'Hide the "Block this site" links' settings -- they make it much quicker to block spam websites (of which there are many on regular search engines).

skygazer
1 replies
13h40m

Just today I was looking for an extension just to block Quora from search results. (Talk about a useless site that seems to uselessly outrank Wikipedia on google lately — what on earth is Google up to?) I’m thankful I saw your and your parent’s post.

carlhjerpe
0 replies
1h28m

When Quora was new I followed some topics, got to read interesting answers to interesting questions, but then some kind of enshittification happened. I've blocked it in Kagi now.

brobinson
1 replies
2h4m

Does this exist for DDG?

infamia
0 replies
1h38m

Yes, it works for most search engines.

ic_fly2
0 replies
6h30m

This is amazing, I was maintaining my own custom solution that did this.

gzer0
0 replies
1h58m

Appreciate you sharing this; I've been searching for something similar for quite some time.

KomoD
0 replies
55m

I use uBlacklist with my own blacklists and Google has been pretty usable, it's great.

virgildotcodes
12 replies
12h58m

I really don't understand why anyone writing articles about ChatGPT uses 3.5. It's pretty misleading as to the results you can get out of (the best available version of) ChatGPT.

For comparison, here are all the author's questions posed against GPT4:

https://chat.openai.com/share/ed8695cf-132e-45f3-ad27-600da7...

refulgentis
4 replies
12h57m

It's a bit hard to use for most, either $20/month fixed for a limited # of messages, or you need to be able to reason through how to get an API key, or get another 3rd party service with similar cost & limits.

simonw
2 replies
12h53m

You can use GPT-4 for free via Bing - though I find it a little hard to explain to people how they can do that because I'm never sure what the rules are with regards to creating Microsoft accounts, whether you can use any browser or have to use Edge, what countries it's available in etc.

Actually maybe the recommendation should be to use GPT-4 for free via https://copilot.microsoft.com/ instead now.

(Except I can't tell which version of GPT that's using yet - there was a story on 5th December that said GPT-4 Turbo was "coming soon", not sure when "soon" is though: https://blogs.microsoft.com/blog/2023/12/05/celebrating-the-... )

vitorgrs
0 replies
12h22m

FYI: Balanced doesn't run pure GPT4. Balanced uses a combination of multiple models. Precise and Creative are pure GPT4.

About GPT4 Turbo, to check if you are on Turbo, ctrl+U > ctrl+f > check if "dlgpt4t" exists. If it exists, you are running turbo.

You can also double-check by, well, asking stuff after 2021 knowledge cut-off as well ("What are the oscar winners?") with search disabled.

But you'll notice because turbo is much faster on bing (and better too).

apapapa
0 replies
1h10m

But that gpt-4 says it can't code

airstrike
0 replies
2h31m

IMHO TBF the "limited # of messages" is continuously increasing, to the point I hardly remember it exists these days

tedunangst
3 replies
12h48m

Why does OpenAI continue to offer chatgpt 3.5 if it's so bad?

hannasanarion
0 replies
12h38m

GPT 4 is THIRTY (30) times more expensive.

In the llm-assisted search spaces I'm involved in, a lot of folks are trying to build solutions based on fine tuning and support software surrounding 3.5, which is economical for a massive userbase, using 4 only as a testing judge for quality control.

azinman2
0 replies
12h48m

Cheaper and faster.

antupis
0 replies
11h56m

ChatGPT 3.5 is good enough if you can give context in the query.

latexr
2 replies
4h56m

I really don't understand why anyone writing articles about ChatGPT uses 3.5.

Because that’s what most people have access to. It’s absolutely worthless to most readers to talk about something they’ll never pay for and it’s not the job of random third-parties to incentivise others to send money to OpenAI.

What I really don’t understand is why anyone gets so hung up about it and blames the writer. If you’re bothered by people using 3.5 you should complain to OpenAI, not the people using the service they make freely available.

Anecdotally, I find this excessive fawning about 4 VS 3.5 to be unwarranted.

https://news.ycombinator.com/item?id=38304184

virgildotcodes
1 replies
2h20m

Because that’s what most people have access to.

I’d agree with this rationale if the author clearly communicated their choice of model and the consequences of that choice upfront.

In this post the table of results and the text of the post itself simply reads “ChatGPT” with no mention of 3.5 until the middle of a paragraph of text in the appendix.

It’s absolutely worthless to most readers to talk about something they’ll never pay for and it’s not the job of random third-parties to incentivise others to send money to OpenAI.

The “worth” is in communicating an accurate representation of the capabilities of the technology being evaluated. If you’re using the less capable free version, then make that clear upfront, and there’s no problem.

If you were to write an article reviewing any other piece of software that has a much less capable free version available in addition to a paid version, then you would be expected to be clear upfront (not in a single sentence all the way down in the appendix) about which version you’re using, and if you’re using the free version what its limitations may be. To do otherwise would be misleading.

If you simply say “ChatGPT” it’s reasonable to infer that you’re evaluating the best possible version of “ChatGPT”, not the worst.

Accurate communication is literally the job of the author if they’re making money off the article (this one has a Patreon solicitation at the top of the page).

Whether or not "most readers" are ever going to pay for the software is totally orthogonal.

If using GPT4 vs 3.5 would create results so distinct from one another that it would serve to incentivize people to give money to OpenAI, well then that precisely supports the argument that the author’s approach is misleading when presenting their results as representative of the capabilities of “ChatGPT”.

What I really don’t understand is why anyone gets so hung up about it and blames the writer.

Again, if they’re making money off their readers it’s their job to provide them with an accurate representation of the tech.

Anecdotally, I find this excessive fawning about 4 VS 3.5 to be unwarranted. https://news.ycombinator.com/item?id=38304184

Did some part of my comment come across as “excessive fawning”? Regardless, if this “excessive fawning” is truly unwarranted, this would again undermine your statement that using GPT4 would “incentivize others to send money to OpenAI”.

In regards to your link, I’ll highlight what another commenter replied to you. What should ChatGPT say when prompted about various religious beliefs? Should it confidently tell the user that these beliefs are rooted in fantastical nonsense?

It seems in this case you’re holding ChatGPT to an arbitrary standard, not to mention one that the majority of humanity, including many of its brightest members, would fail to meet.

latexr
0 replies
38m

I’d agree with this rationale if the author clearly communicated their choice of model and the consequences of that choice upfront. (…) with no mention of 3.5 until the middle of a paragraph of text in the appendix.

You’re moving the goalposts. You went from criticising anyone using 3.5 and writing about it to saying it would’ve been OK if they had mentioned it where you think it’s acceptable. It’s debatable if the information needed to be more prominent; it is not debatable it is present.

If you simply say “ChatGPT” it’s reasonable to infer that you’re evaluating the best possible version of “ChatGPT”, not the worst.

Alternatively, if you simply say “ChatGPT” it’s reasonable to infer that you’re evaluating the version most people have access to and can “play along” with the author.

If using GPT4 vs 3.5 would create results so distinct from one another that it would serve to incentivize people to give money to OpenAI

Those are your words, not mine. I argued for the exact opposite.

Again, if they’re making money off their readers it’s their job to provide them with an accurate representation of the tech.

I agree they should strive to provide accurate information. But I disagree that being paid has anything to do with it, and that their representation of the tech was inaccurate. Incomplete, maybe.

Regardless, if this “excessive fawning” is truly unwarranted, this would again undermine your statement that using GPT4 would “incentivize others to send money to OpenAI”.

Again, I did not argue that, I argued the opposite. What I meant is that even if you believe that to be true, that still doesn’t mean random third-parties would have any obligation to do it.

I’ll highlight what another commenter replied to you.

That comment has a reply, by another person, to which I didn’t feel the need to add.

It seems in this case you’re holding ChatGPT to an arbitrary standard, not to mention one that the majority of humanity, including many of its brightest members, would fail to meet.

Machines and humans are not the same, not judged the same, don’t work the same, are not interpreted the same. Let’s please stop pretending there’s an equivalence.

Here’s a simple example: If someone tells you they can multiply any two numbers in their head and you give them 324543 and 976985, when they reply “317073642855” you’ll take out a calculator to confirm. If you had done the calculation first on a computer, you wouldn’t turn to the nearest human for them to confirm it in their head.

The problem with ChatGPT being wrong and misleading isn’t the information itself, but that people are taking it as correct because that’s what they’re used to and expect from machines. In addition, you don’t know when an answer is bullshit or not. With a human, not only can you catch clues regarding reliability of the information, you learn which human to trust with each information.

Everyone’s standard for ChatGPT, be it absolute omniscience, utter failure, or anything in between, is arbitrary. Comparing it to “the majority of humanity, including many of its brightest members” is certainly not an objective measurable standard.

jeffreyw128
12 replies
15h24m

The issue with traditional search engines is that keyword-first algorithms are extremely gameable.

Try https://search.metaphor.systems - it's fully neural embeddings-based search. No keywords, only an embedding of what the actual content of a webpage is.

So in the mentioned example of searching for Youtube downloaders, with Metaphor you'll get only Youtube downloaders (https://search.metaphor.systems/search?q=This%20is%20the%20b...)

Full disclosure - I work there :p
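(For readers unfamiliar with the approach: the core idea of embeddings-based retrieval is ranking pages by vector similarity to the query instead of keyword overlap. A minimal Python sketch with made-up vectors; this illustrates the general technique, not Metaphor's actual implementation:)

    import numpy as np

    def cosine_similarity(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # made-up 4-dimensional embeddings standing in for a real model's output
    page_embeddings = {
        "yt-dlp documentation":  np.array([0.9, 0.1, 0.0, 0.2]),
        "seo spam landing page": np.array([0.1, 0.8, 0.3, 0.0]),
        "firefox download page": np.array([0.0, 0.2, 0.9, 0.1]),
    }
    query_embedding = np.array([0.8, 0.0, 0.1, 0.3])  # embedding of the user's query

    # rank pages by similarity to the query, highest first
    ranked = sorted(page_embeddings.items(),
                    key=lambda kv: cosine_similarity(query_embedding, kv[1]),
                    reverse=True)
    for title, _ in ranked:
        print(title)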

marcinzm
3 replies
15h14m

How is that different from keywords? Embeddings aren't magic, they're just page content. Content is trivial to game since it's controlled by the website owner.

edit: The results from my quick QA are also not that great. Searching for "what is the best mouse to buy" leads to links to buy random mice versus review summaries or online discussions about mice. One of the recommended queries, "Here is a great fun concert in San Francisco", leads to some really bizarre results in non-English languages that have nothing to do with either SF or concerts.

edit2: Also, Google has been using LLMs as part of their search since at least 2018, so definitely not just keyword matching there.

jeffreyw128
2 replies
14h49m

Yup, definitely still gameable but if the model learns what high quality content is like and what high quality webpages there are (which it does), then the only way to game would be to be great :)

For your search - I would recommend turning autoprompt off and searching something like "Here is a great summary of the best computer mice to use:".

Our embeddings model is trained on how links are talked about on the Internet, if that helps with querying. So you have to query like how someone would refer to a link before sharing it

marcinzm
1 replies
14h39m

Our embeddings model is trained on how links are talked about on the Internet, if that helps with querying. So you have to query like how someone would refer to a link before sharing it

So it's not high quality web pages but web pages that people talk about a lot which is expected since no one has an oracle that says what high quality is. The embeddings are merely a proxy and generalization for "how links are talked about on the Internet." That can be gamed at scale just like every other signal any popular search engine has been based off of.

jeffreyw128
0 replies
36m

That's true. Although it should be much harder.

standardUser
1 replies
14h46m

How do you deal with dynamically/contextually generated content? And how about paywalls and login-required content?

jeffreyw128
0 replies
36m

Do our best at getting the right content.

For paywalls/login - we play pretty straight, always obey robots.txt, etc.

optshun
0 replies
14h59m

This is excellent!

Definitely excited to see how it holds up to daily use.

So far it gave me exactly what I wanted at the top for all of my test queries that were well formed.

As for asking “ignorant” questions both your service and the goog failed where phind gave me an actionable starting point (after a prodding follow up question: https://www.phind.com/search?cache=hmul4znpn7y4ei6qa64fosmc )

“max-height like css property for top and left”

Unsure if this sort of thing is even a goal of your project, but you won over a new user.

Wish you and your team all the best.

ec109685
0 replies
13h15m

https://getthatvideo.com/ Is the first result for downloading YouTube videos. Seems super sus (especially since the site doesn’t load).

Auto-prompted to: "Here's a helpful website for downloading YouTube videos:"

Also, this result is horrible:

“What does it mean if someone is not covered in nfl football?”

croes
0 replies
14h59m

Just wait until the content farms adapt

charcircuit
0 replies
15h14m

it's fully neural embeddings-based search. No keywords, only an embedding of what the actual content of a webpage is.

What prevents websites from gaming their embedding? Switching to a similarity search doesn't prevent the results from being gamed.

anonymoushn
0 replies
5h0m

The first result vtubego.com is a 144MB downloader app. The page contains "Pricing Plans Lorem ipsum dolor sit amet, placerat verterem luptatum phaedrum vis, impetus mandamus id vix fabulas vim." above its 3 paid plans (there is no free plan).

I haven't installed the downloader app, so I'm not sure if it lets me download youtube videos for free.

The second result "ytder.com" is a redirect to "https://poperblocker.com/edge/" which seems to be a browser extension for Microsoft Edge that protects the user from the Holy See. I'm not using Edge and I'm trying to download a Youtube video.

The third result download-video.net says that it can download videos from a list of sites. Youtube is not in the list, but let's try anyway. If you put "https://www.youtube.com/watch?v=IkYVmtgxebU" into the text box and click "download" you get "500 SyntaxError: Unexpected token '<', ""

At this point I gave up, but please let me know if any of the results work.

ShadowBanThis01
0 replies
14h10m

So far so good. I'll try using this first from now on, and see how it does. Good luck!

marginalia_nu
11 replies
15h20m

While I've made huge improvements to the algo recently, I do think Marginalia Search got a bit lucky with the sample queries, as it is still IMO far more hit and miss than many alternatives, but that also speaks for how hard evaluating search quality is.

Its efficacy is also strongly dependent on understanding that it's a keyword search engine with no semantic understanding.

tentacleuno
7 replies
14h1m

[...] but that also speaks for how hard evaluating search quality is.

Would you be able to share some of your personal highlights regarding this?

I've partially kept up-to-date with the DIY, non-corporate search space (YaCY and friends). I'd love to understand a bit more behind the engineering decisions made when creating a search engine; it seems like a very hard problem to solve.

P.S. Marginalia is a very impressive piece of work, overall -- I've heard nothing but positive remarks from users on here. I've been meaning to try it for a while, but time constraints have... well, constrained, thus far.

golol
5 replies
12h51m

I just tested Marginalia and it was completely unable to lead me to a Wikipedia or imdb page when searching for "driver ryan gosling" and variations. It just listed lots of random articles.

wisemang
4 replies
12h20m

That.. is kind of the point of this particular search engine.

This is an independent DIY search engine that focuses on non-commercial content, and attempts to show you sites you perhaps weren't aware of in favor of the sort of sites you probably already knew existed.

golol
3 replies
10h29m

Well that makes sense, but I wanted to push back against the takeaway the OP seems to draw from their test, which was that Marginalia works well for the common user.

marginalia_nu
2 replies
6h1m

There's also a known bug with Wikipedia in particular, I do index it but the results are never ranked particularly high. I haven't fixed it because I don't want Wikipedia to be the #1 result for every search. Feels like most people are aware of Wikipedia and don't need help finding it.

treetalker
0 replies
3h34m

Thanks for your work!

I have a suggestion for the “About” section at the top of Marginalia’s landing page. I think it would read better like this:

This is an independent DIY search engine that focuses on non-commercial content, and attempts to show you sites you perhaps weren't aware of [instead] of the sort of sites you probably already knew existed.

Showing one thing “in favor of” another seems contradictory in this case.

lbalazscs
0 replies
5h3m

I often do a Google search, and then go directly to the Wikipedia result. My reasoning is that during the initial search, I don't know if there's a Wikipedia page about that topic, and I might need a fallback option.

marginalia_nu
0 replies
13h47m

Honestly I understand it well enough to see that it's surprisingly hard, but not enough to have good solutions...

bombcar
1 replies
14h1m

I notice you completely avoid the question on how a single developer can do so well ;)

I do think that search has gotten much worse but my ability to know the magic words like “ublock origin” instead of “Adblock” and “yt-dlp” instead of “download YouTube” and phrase my search has gotten better.

We’ve all been doing prompt engineering against the Internet-wide LLM that is the spam houses.

marginalia_nu
0 replies
13h49m

I notice you completely avoid the question on how a single developer can do so well ;)

As much as I enjoy the notion of somehow being a 10,000X developer, it's probably mostly that modern search is a filtering problem, and MS does filtering fairly well.

ta988
0 replies
1h6m

Just my feedback after finally trying to figure out what it is exactly.

I tried to find marginalia on DDG: not on the first page. Google has it after some garbage. If I go to marginalia.nu I get an SSL error. search.marginalia.nu works.

If I search on marginalia for duckduckgo, the first link is somewhat relevant but is about the app; all the other links are related to DDG but of questionable relevance.

If I search for ublacklist mentioned above, I do not see anything directly relevant.

fantasybroker
9 replies
14h55m

I am not sure what the intention of this post is. In my handpicked results Kagi far outperforms Marginalia.

#1 "Gordon ramsey" (misspelled "Gordon Ramsay"). Marginalia shows "The Life I Imagine: are my cheeks red?". Kagi corrects to Gordon Ramsay and shows relevant results.

#2 "Ukraine war". Marginalia shows an article about the Russian Orthodox church and a Substack post about the war. Kagi shows Wikipedia, Al Jazeera, etc up-to-date summaries about the war.

#3 "Dildo". Top post on Marginalia is "Students for Concealed Carry Embraces UT Dildos | Students for Concealed Carry". Top posts on Kagi are Wikipedia (read) and Amazon (buy).

How is Marginalia, a search engine built by a single person, so good?

Because it's not good?

hattmall
6 replies
14h48m

I don't disagree with your assessment in full, but I don't exactly consider wikipedia and Amazon good results. Like they are big enough that if that's the result I want I can go to them directly. So like they aren't bad or wrong, but I can see the case for excluding them. Should something like Webster's dictionary be a top result?

fantasybroker
4 replies
14h45m

I think for single word queries like that Wikipedia covers more ground than a dictionary. Personal preference, perhaps. If I need a definition I search for "define dildo" (Kagi shows Merriam-Webster, Oxford, etc dictionary entries).

marginalia_nu
3 replies
14h37m

Marginalia supports the old Google syntax, e.g. "define:dildo"

fantasybroker
2 replies
14h22m

Thanks! If you are that "single person" who built Marginalia... hope you are not taking my criticism personally. I am more annoyed by this blog post that uses a few handpicked queries to present generalized long winded conclusions that are completely disproven when using a different set of queries.

marginalia_nu
1 replies
14h17m

Yeah, it's me, and to be fair I made a comment to a similar effect myself. Assessing search result quality is very hard, and this is definitely a pretty flattering selection of queries.

fantasybroker
0 replies
12h19m

On the plus side - in addition to Marginalia's own success, you can take partial credit for how good Kagi search results are (IIRC Marginalia's index is one of the sources for Kagi search results). So... thank you for that!

marginalia_nu
0 replies
14h45m

Marginalia Search isn't trying to be a universal knowledge engine, it's just a website finder.

That's bad if you're looking for a simple answer or basic fact, and good if you're looking for a few hours of reading.

BytesAndGears
0 replies
14h51m

I had a similar experience when testing Kagi after reading this. The top result for the “wider car tires” query on Kagi was a link to Physics StackExchange with some marginally informative answers [0], which would be easy to expand on in future searches. The second result was Reddit. Then a couple of incorrect/irrelevant pages but they don’t look like scams

[0]: https://physics.stackexchange.com/questions/29903/why-do-peo...

Edit: I did just realize that I have StackExchange customized to be up-ranked. So that probably helps. But yeah, I guess this is why I usually get good results, which is something that generally still fails with Google for me.

Brian_K_White
0 replies
13h23m

It seems to me that the name "marginalia" is not just a random set of syllables. It sounds like it's doing what it says on the tin, which is gooder than not doing what it says on the tin. (distinct from whether what it says on the tin is something you want)

poulpy123
8 replies
8h9m

I'm sorry, but the very first request is completely wrong. When people search for a YouTube downloader, they want a website that lets them download a YouTube video, not a command line tool. And the first results given by Google do that. I'm one of the people that think Google search became bad, but it's not because of this kind of search.

marginalia_nu
6 replies
5h47m

That's the tricky mind-reading aspect about search intent.

Different people have varying expectations as to what they want to find with the same query. I'd definitely want yt-dlp in favor of some website.

poulpy123
5 replies
5h9m

it's easy: just append "command line" to the query, like you would append "android app" if you wanted an android app

marginalia_nu
4 replies
4h46m

That is a user POV solution, speaking from the search engine POV.

bee_rider
3 replies
2h35m

Based on your handle, I suspect you have much better insight into this than the rest of us!

But can the search engine mind-read by assuming Windows users don’t want to use a command-line utility?

marginalia_nu
2 replies
2h1m

They can based on user tracking and profiling, but that's murky waters I personally don't want to dip into.

notRobot
1 replies
1h12m

I assume you meant to say you don't want to! :)

marginalia_nu
0 replies
1h11m

Yeah I accidentally a word.

anonymoushn
0 replies
46m

They do not do that, have you tried using them?

johnfn
8 replies
15h16m

I noticed that the author uses ChatGPT3.5 rather than 4, which is a rather large difference. I don't have the knowledge to rerank all questions the author asked, but I will say that a test of ChatGPT 4 leads me directly to youtube-dl, which is better than every other search engine listed.

taberiand
3 replies
14h35m

That was the first thing I checked reading the article. Although the argument would be 3.5 is free - any comparison of systems against ChatGPT that isn't using ChatGPT 4 can be dismissed almost out of hand; there is not much point talking about ChatGPT if it's not using ChatGPT 4 and making proper use of its capabilities.

That is not to say that there aren't valid criticisms of and shortcomings in ChatGPT 4 - just that it's not useful to say ChatGPT when it's referring to 3.5

vitaflo
0 replies
2h10m

This is silly, most people aren't going to pay for ChatGPT, just like they won't pay for Google or DDG. So using 3.5 in this case is perfectly acceptable when we're talking about free software.

bombcar
0 replies
13h59m

He gives the full queries - do you have ChatGPT 4 that you can run them against?

Dah00n
0 replies
6h34m

any comparison of systems against ChatGPT that isn't using ChatGPT 4 can be dismissed almost out of hand

Does everyone or even most use ChatGPT 4? The most used version is -of course- by far the most relevant.

huytersd
2 replies
13h30m

I’ve come to recognize that any article that uses 3.5 has an agenda.

xigoi
0 replies
47m

The agenda of not wanting to pay for something just to test it out when there is a free version?

airstrike
0 replies
2h8m

I also suspect as much, but obviously can't know for sure. IMHO it's intellectually lazy if not dishonest to benchmark against 3.5 and not make that fact clearly known upfront

A better benchmark would have had two entries for ChatGPT, showing both 3.5 and 4 results

latexr
0 replies
4h35m

I will say that a test of ChatGPT 4 leads me directly to youtube-dl

And yet to other people it starts rambling about how that’s wrong and you shouldn’t do it and doesn’t give a usable answer.

https://news.ycombinator.com/item?id=38822040

It boggles the mind the extent to which people salivate over a system that cannot decide between a correct straight answer, something wrong but plausible, something wrong and impossible, or outright refusing to answer.

happytiger
7 replies
15h20m

I feel like you could reboot the Yahoo directory and have more utility than most searches.

flenserboy
3 replies
15h8m

The return of something like Yahoo Directory would be most welcome. There is great utility in having more than one approach into a data space. That we have been stuck with essentially one way in for over a decade means that there is a great deal out there which would be great to access but which has been rendered invisible.

marginalia_nu
2 replies
15h3m
flenserboy
0 replies
14h53m

Nice. Thanks!

FergusArgyll
0 replies
6h57m

this is awesome! thanks

kristofferR
2 replies
15h10m

The !bang directory for Kagi is honestly pretty good, found some cool sites there: https://duckduckgo.com/bangs

louthy
1 replies
13h55m

Did you mean to say Kagi or Bing?

Anyway, here’s Kagi’s bangs:

https://help.kagi.com/kagi/features/bangs.html

kristofferR
0 replies
6h23m

Note that Kagi supports all DuckDuckGo-style bangs.

You can also make your own bangs.

That said, my point was that the bang directory has a bunch of the most useful sites in each category.
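(For anyone who hasn't used bangs: you prefix the query with a shortcut and the search is redirected to that site. If I remember the common DuckDuckGo-style ones correctly:)

    !w mean reciprocal rank   -> searches Wikipedia
    !gh ublacklist            -> searches GitHub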

toomim
5 replies
15h16m

Do search engines censor political topics these days? If you search "truthsocial" on ddg, the truthsocial.com website is the first hit. But if you search "trump truthsocial", it doesn't give you trump's truthsocial page, and doesn't even give you truthsocial.com within the first few pages of search results.

Since ddg uses bing, does anyone know what is happening here at bing? It looks like google results are similar.

senderista
1 replies
14h18m

I have concluded that Google definitely censors search results relating to the Ukraine war, after vainly searching for articles about documented Ukrainian war crimes (reported in mainstream Western media like NYT/WaPo).

ARandomerDude
0 replies
13h25m

I'm not seeing this. I Googled "war crimes by ukrainian soldiers" and the top link was an Amnesty International Article, "Ukraine: Ukrainian fighting tactics endanger civilians".

https://www.amnesty.org/en/latest/news/2022/08/ukraine-ukrai...

I use Google as little as possible because I don't like surveillance advertising but fair is fair.

dpkirchner
1 replies
15h0m

I doubt you're seeing censorship. If you search for "truthsocial trump" on ddg, you'll see his profile, for better or worse.

toomim
0 replies
13h42m

Oh, interesting. So it depends on the order of the terms:

- "truthsocial trump" works

- "trump truthsocial" doesn't work

Springtime
0 replies
14h48m

DuckDuckGo (and by extension perhaps Bing, assuming identical upstream results) has some terrible results when trying to filter by all kinds of domains.

There's a power tools review/news site that returns zero hits for the actual domain when searching its name (which is the same as its .com address). And for some domains, even searching using the `site:` parameter will give far fewer results when paired with a query than just searching the domain name + query sans the TLD (the router firmware site openwrt.org is one such).

It's a mess, and reporting it hasn't made any difference ime in the past 3 years. So I'd be reluctant to say irrelevant results are due to censorship unless there was more evidence.

shp0ngle
5 replies
6h20m

I don't understand the praise of Marginalia.

When I search for "Steve Jobs" on Marginalia, I got blogs about his speech in 2011 and some mailing list from 2007.

When I search for my own name I get nothing. In Google it's just me.

It's cool that one person built all this of course but... that's not a good search result compared to Google?

Maybe I'm missing something, maybe I'm using it wrong

marginalia_nu
4 replies
6h8m

What do you expect when you search for Steve Jobs? Also, which filter did you use?

shp0ngle
1 replies
4h19m

I don't know if I used any filters? I don't know what the filters are, sorry

I expect a wikipedia article on Jobs as a baseline.

marginalia_nu
0 replies
3h2m

Ah, I downrank Wikipedia pretty hard :P

shp0ngle
1 replies
4h17m

By the way, please don't take this as me trying to take you down or something

It's amazing what you did, it's just not a Google killer? or at least I don't see it

marginalia_nu
0 replies
3h1m

It's really not supposed to be either. Like it's designed to be the search engine you use when you can't find something elsewhere, so it's largely designed to show you different results than the ones you get on Google and Bing.

In general a lot of the complaints seem to be "I'm not getting what I expect from Google". Well... yeah. That's the point. If someone wants the same results as Google, they should arguably use Google.

ChrisArchitect
5 replies
14h34m

Blah blah blah. Could you lay this article out any worse? What are the queries you used to test? I want to try them too. Buried in here somewhere.

Using an adblocker is not expert anything.

That you've defined your own opinion for what some of the results should be blows the thing up.

Searching youtube downloader, many people would be fine with some of the ad covered but totally functional sites that pop up on Google. I use some of them every day for quick conversion tasks. I don't want any youtube-dl result. The average user doesn't either.

Download firefox? What's that? All the top links are fine? No one's looking at the 7th listing for a simple query to download a program.

Why do wider tires have better grip? .. what, sites like roadandtrack, prioritytire, reddit, some physics and stackexchange sites aren't good enough? they are.

The Vancouver snow report one also. Lots of major news sites. Some weathernetwork and almanacs. All totally acceptable results for a sort of variable question.

blah blah this is just a hate on for Google and a HN/nerd view of the world that the average user is nowhere near living in.

SnazzyJeff
1 replies
14h28m

Download firefox? What's that? All the top links are fine? No one's looking at the 7th listing for a simple query to download a program.

They are if the first six results are SEO bullshit. Which is the de-facto state of affairs for Google today: advertising traipsing around as search.

ChrisArchitect
0 replies
13h18m

heh, they're not. They're all variations of mozilla download pages and site posts.

shaldengeki
0 replies
12h13m

For whatever it's worth, I think your comment would be a whole lot more convincing without its first and last lines, which had the effect of making you sound (at least to me) like you're shallowly dismissing the article.

navjack27
0 replies
12h51m

Completely agree. I personally thought searching "Vancouver snow report" to be extremely strange. Just search zip code or city name and weather. Two words. That's all you need to get results. What the hell is snow report? Do you even think you can trust weather reports 10+ days out?

Whole article is rambling and silly and assuming.

anonymoushn
0 replies
40m

Which web site did you use to successfully download a youtube video, and which youtube video did you download?

0x38B
5 replies
13h54m

Re: Kagi, I heard about it on HN, tried it for 100 searches, then subscribed. When I search for random JS and CSS things, MDN is the first result, and if it isn't, I can downrank whatever spammy site(s) are on top.

---

I wish I had a local LLM trained to detect clickbait and or low-effort content. I imagine searching YouTube and having all the clickbait collapsed together (just like Kagi condenses listicles), with the remainder being potentially high-quality content. Don't know how feasible this is right now.

shados
1 replies
13h23m

I became a huge fan of Kagi after seeing it on hacker news too. It's amazing how good a search engine can be when it's not full of ads.

D13Fd
0 replies
12h50m

Yeah. At first I primarily used Kagi to move away from Google as a company, hoping for results that were equally good. But Google search actually feels crappy now in comparison.

freeAgent
1 replies
10h17m

Just use the Kagi Summarizer on YouTube videos and you don’t have to waste time watching trash. It’s a great life hack.

xigoi
0 replies
1h30m

How does that work? Does it scrape the auto-generated captions?

qudat
0 replies
5h24m

Been paying for Kagi for 6+ months and very happy with it. I’m pretty anti subscriptions so that’s saying a lot for a service that is otherwise free.

I do have to dump into google for local searches every once in awhile, but otherwise happy with it.

zzleeper
4 replies
13h40m

For me the problem is not just that searching on Google is bad, but that sometimes it COMPLETELY hides exactly what I'm looking, for no good reason.

For instance, I wrote an R ggplot2 package called "fedplot" (following the convention of calling the package for the figure style it replicates, as in "bbplot" for BBC-style charts).

Try searching for it on Google: "github" "fedplot" doesn't get you anywhere. Meanwhile, every other search engine gives you exactly what you want if you just type "fedplot". I even tried to add the relevant websites through google's suggested tools, and nothing happened :|

Dah00n
2 replies
6h25m

Searching for "fedplot" looking for https://github.com/sergiocorreia in the results:

Qwant: Result 1

Bing: Result 1

Google: Result 2

Marginalia: Zero results

ChatGPT 3.5: Some Federal Reserve dot plot nonsense and no useful results.

marginalia_nu
1 replies
6h5m

You're never going to find github results on Marginalia as long as they block 3rd party crawlers :-/

Dah00n
0 replies
5h48m

Well, zero results are better than spam ;-)

Brian_K_White
0 replies
13h12m

Their black box semantic guesser has been told not to feed the radicalizing conspiracy theorist fires about federal plots.

Who needs to know anything about government owned land anyway?

naet
4 replies
13h2m

I think the result grading is too opinionated here.

For example, the first query is "download YouTube videos", for which Google is ranked "terrible" for not showing you a command line open source program. But the literal first result is an ad supported site where I can paste in a YouTube link and download it right from the browser. That seems like exactly what most people would want, as opposed to the CLI tool the author is searching for. The author seemed to be judging results more by the absence of ads than by search relevance.

Search is a very gamed system with a lot of SEO spam type results, but I think a much better analysis could be done for more meaningful results. Also I recreated some of the searches and got very different results (including ublock origin in the top three responses). Again, a more scientific ranking system could help uncover better data on searches.

shaldengeki
2 replies
12h31m

The author describes that site as such, which seems fair to rate as "terrible":

Some youtube downloader site. Has lots of assurances that the website and the tool are safe because they've been checked by "Norton SafeWeb". Interacting with the site at all prompts you to install a browser extension and enable notifications. Trying to download any video gives you a full page pop-over for extension installation for something called CyberShield. There appears to be no way to dismiss the popover without clicking on something to try to install it. After going through the links but then choosing not to install CyberShield, no video downloads. Googling "cybershield chrome extension" returns a knowledge card with "Cyber Shield is a browser extension that claims to be a popup blocker but instead displays advertisements in the browser. When installed, this extension will open new tabs in the browser that display advertisements trying to sell software, push fake software updates, and tech support scams.", so CyberShield appears to be badware.

naet
1 replies
1h44m

That's how he described it but I tried it myself and found it perfectly functional to download a video with different options for size / quality. It has ads but not nearly as bad as described.

It's a service that is quasi illegal and explicitly breaks the YouTube terms of service. I think the search engine did a good job surfacing what was searched for, there just aren't going to be any free online YouTube downloaders without advertising.

anonymoushn
0 replies
44m

Which web site did you use to successfully download a youtube video? Which youtube video did you download?

j7ake
0 replies
11h46m

Yeah, if one typed “YouTube downloader cli” you'd get the results the author was thinking of.

It seems like the author wants the search to read their mind without specifying what kind of YouTube downloader they want.

motoxpro
4 replies
9h32m

This explains so much about why people think search results are bad. The article's idea of great results for "Download youtube videos" is "Ideally, the top hit would be yt-dlp or a thin, graphical, wrapper around yt-dlp"

Just give me a website where I can plug in the DL link and download it to my hard drive. I don't care what package they are using (I don't worry about malware like I did in the 90s). 99.999% of people are not programming tinkerers.

Just makes me realize how subjective search results are. All of their "Great" results are my "Terrible" results.

carlosjobim
2 replies
7h30m

The first result on Kagi is exactly this, just tried it a moment ago. It processed and downloaded the video extremely fast. Why would any reasonable person prefer youtube-dl?

motoxpro
0 replies
5h3m

Totally. As the sibling said, it is the same using Google. I am not sure why anyone would want a programming package to accomplish a task that could be done in < 10 seconds.

But again, I guess that's why search is so hard: it has to parse that intent from 3 words.

Dah00n
0 replies
6h38m

It is the same using Google.

darkwater
0 replies
8h42m

Malware or well, the actual viruses, in the '90s were a joke, especially because a computer was an isolated thing. Connected computers were the exception.

jimmytucson
4 replies
13h25m

If you wanna know why Google (or any search engine) sucks, just look at how it measures its own search results. Most search companies do this “at scale” according to very specific guidelines, like what the author did here but on steroids. For example, take a look at Google’s 168-page instruction manual for search quality raters:

https://static.googleusercontent.com/media/guidelines.raterh...

It talks about figuring out a query’s meaning(s), judging the user’s intent (were they looking for some specific answer, etc.), evaluating the “quality” of a website, rating the site’s usefulness in relation to the query’s meaning/intent, etc.

All this is to say, it’s not that search companies don’t do exactly what the author did here, it’s just that they have different standards than the author. And I’d venture their standards match their users’ better than the author’s, but maybe not or not forever, anyway.

ec109685
1 replies
13h22m

Why would an average user want blog spam search results?

My hope is as LLM’s improve, they can be more discriminating about the results returned.

jimmytucson
0 replies
13h0m

Why would an average user want blog spam search results?

I didn’t say they would :)

In fact, I can’t figure out how your comment relates to mine. Are you claiming that Google doesn’t factor blog spamminess into its evaluation of search results? If so, that’s quickly put to bed by the document I linked, pretty much section 4.6. Excerpt:

Creating an abundance of content with little effort or originality with no editing or manual curation is often the defining attribute of spammy websites.

You could claim that they fail to capture some essential quality of “blog spamitude” or that they don’t weight it heavily enough in their eval but to say they just, like, don’t know about blogspam over there, is pretty far fetched IMO.

whakim
0 replies
13h1m

I really don't think that's true. For example, page 29 of your link describes "Lowest Quality Content." Most of the search results that the author rated as spammy or scammy clearly fit these guidelines, which means that either (1) the raters aren't knowledgeable enough about the subject matter to determine that the website they're rating is harmful or misleading; or (2) the raters are rating these sites correctly, but it still isn't having the desired effect.

mrweasel
0 replies
8h42m

If you wanna know why Google (or any search engine) sucks

While I obviously don't know, it may be related to how Google believes a "normal" person searches. I have come to view Google as a product search engine/price comparison site; that's what it's great at. Google can find you the most relevant products for any purchase you may consider, so maybe that's what Google has optimized for. The majority of my searches are related to IT, programming, software and computers in general, but what do "normal" people search for? They search for products, news, opening hours for a store. Google is pretty decent at that, but the money is in the "go buy something". The ads on a product search on Google are always way more accurate than the actual search results.

I think Google has optimized for selling products.

freediver
4 replies
12h27m

Current Kagi results for those without an account to compare:

youtube downloader

https://kagi.com/search?q=youtube+downloader&r=us&sh=_szITdy...

ad blocker

https://kagi.com/search?q=Ad+blocker&r=us&sh=-BHzV2ZoCDpmgOu...

download Firefox

https://kagi.com/search?q=Download+Firefox&r=us&sh=zkkmc_EQX...

why do wider tires have better grip?

https://kagi.com/search?q=Why+do+wider+tires+have+better+gri...

why do they keep making cpu transistors smaller?

https://kagi.com/search?q=Why+do+they+keep+making+cpu+transi...

vancouver snow forecast winter 2023

https://kagi.com/search?q=Vancouver+snow+forecast+winter+202...

I agree with the author that there is too much spam on the web. I think Kagi in general does a pretty good job at downranking it (number of ads/trackers is a negative ranking signal on Kagi) but we can always do better. Kagi has special search modes like "Small Web" which virtually eliminates spam.

I welcome such scrutiny from the community. Please continue to keep us honest.

asah
3 replies
7h22m

Kagi gives me websites that require more clicking; Google just gives me reasonable answers and I don't see spam in your examples.

"why do wider tires have better grip?"

Wider tires provide more grip due to a larger contact patch with the road. While it's true that friction is not directly dependent on surface area, a larger contact patch allows for more even weight distribution and better traction, particularly during cornering. This can result in improved handling and stability.

"why do they keep making cpu transistors smaller?"

Smaller transistors can do more calculations without overheating, which makes them more power efficient. It also allows for smaller die sizes, which reduce costs and can increase density, allowing more cores per chip.

"vancouver snow forecast winter 2023"

The forecast for the 2023/2024 season suggests that we can expect another winter marked by ample snowfall and temperatures hovering both slightly above and below the freezing mark. Be prepared ahead of time.

poyu
0 replies
6h0m

My only gripe with these quick answers or excerpts is that sometimes they don't fully give the context. After they've given me a couple of wrong results I stop trusting them and always read the source. And if I'm going to read the source, I might as well click into the website and read the whole thing.

pbronez
0 replies
5h20m

I often default to using Kagi’s FastGPT when I want direct information instead of a starting point for web browsing. It combines web searches with LLM summarization to produce an integrated, sourced answer. Kinda like “New Bing” with a better interface. You can trigger this mode by adding “!fast” to your query, going to https://kagi.com/fastgpt, or using their API. Here’s what it returned for me on those specific queries:

==========

"why do wider tires have better grip?"

Wider tires provide better grip due to having a larger contact patch with the road surface. A larger contact patch means the tire can disperse the weight of the vehicle over a wider area, reducing pressure. This allows for more friction between the tire and road, improving traction and cornering ability. However, wider tires do not necessarily provide better grip in wet conditions. [1][2][3][4][5]

1. wheels - Does a wider tire always mean more grip/traction - Motor ...
2. Do Wider Tires Give You More Grip? Not All the Time
3. Learn Me: Wider tires = more Grip? = Faster cornering?| Grassroots ...
4. Wide or narrow tires? Here's how your choice will affect driving ...
5. VIDEO: Do wider tires offer more grip? - BMW M3 and BMW M4 Forum

==========

"why do they keep making cpu transistors smaller?"

There are a few key reasons why CPU manufacturers keep making transistors smaller:

1. Smaller transistors allow more transistors to be placed on a single chip, enabling increased processing power. More transistors results in more capabilities and performance. [1]

2. Smaller transistors use less power and produce less heat. This allows CPUs to operate faster and with more cores/threads within the same power and thermal envelope. [2]

3. Moore's Law, an observation made by Intel co-founder Gordon Moore in 1965, predicted that the number of transistors on integrated circuits doubles approximately every two years. Following this trend has driven the semiconductor industry to continue shrinking transistors. [3]

4. Smaller transistors can switch on and off faster, enabling increased clock speeds. This directly leads to increased performance. [1]

However, we are rapidly approaching the fundamental physical limits of transistor shrinkage. Transistors cannot be smaller than an atom, which is around 0.2 nanometers for silicon. [4] Alternative technologies like carbon nanotube transistors may be needed to continue performance growth beyond this point. [5]

1. Why does computer chip process size have to keep getting smaller ...
2. Why does smaller = less heat? - CPUs, Motherboards, and Memory ...
3. What Is Moore's Law and Is It Still True?
4. Transistors are an essential building block used in almost every ...
5. These Transistor Gates Are Just One Carbon Atom Thick - IEEE ...

==========

"vancouver snow forecast winter 2023"

Based on the context provided: The winter of 2023-2024 in Vancouver is expected to see less snow and milder temperatures than typical winters. Forecasts predict a below average snowfall season with warmer weather [1][2][3][4][5][6]. Long range forecasts through December 2023 for southern British Columbia call for showers along the coast and less precipitation inland [7]. Metro Vancouver's fall and winter is also predicted to be a potentially 'down' snow year [4]. Confidence is high for a mild winter in western Canada with reduced snowfall amounts [5]. However, specific snowfall totals are uncertain given the long lead time [6].

1. Vancouver winter weather: Less snow, milder temperatures ...
2. 2023-2024 British Columbia Winter Forecast Preview | OpenSnow
3. Snow Prediction Vancouver Winter 2023/24 — Alblaster Snow ...
4. Metro Vancouver's fall, winter forecast | CityNews Vancouver
5. What will this winter be like? Grab the hot cocoa — here's your 2023 ...
6. Canada's Winter Forecast: El Niño a critical factor for the season ...
7. 60-Day Extended Weather Forecast for Vancouver, BC | Almanac.com

ametrau
0 replies
1h4m

99% of the time, shill, those quick answers are from content farms written by AI or low-paid freelancers in developing countries, and they are completely garbage when not accidentally correct.

fgblanch
4 replies
14h49m

I would love to see Perplexity.ai in the benchmark. It has completely replaced Google/DDG for information questions for me. I still use DDG when I want to do a navigational query (e.g. find the URL for a blog whose name I partially recall).

larve
1 replies
10h22m

While Kagi was the product that brought me the most joy in 2022, perplexity.ai has been the one for 2023, even though I only recently started using it. It's just been a joy to be able to iteratively discuss most of my searches.

EDIT: here's a search for tires (I don't know anything about tires, so maybe there are much better links out there, but this is pretty much what I was expecting. Not an ad or SEO in sight.) https://www.perplexity.ai/search/tire-3iuI9T6BQUSvu2tAhgsRmA...

freediver
0 replies
1h35m

I am wondering if you can use AI chat exclusively for your search needs? If not, what would the perfect integration look like?

rr808
0 replies
14h21m

Me too. I only heard about it this morning and it looks kinda perfect so far.

lhl
0 replies
5h57m

I've been really enjoying Perplexity as well. It's a much better Internet/search focused experience than ChatGPT, Bing, or Bard. For anyone interested, until the new year (~20 more hours?) there's a code for 2mo free Pro: https://twitter.com/perplexity_ai/status/1738255102191022359 (more file uploads, choose your model including GPT4)

amadeuspagel
4 replies
6h55m

Here's a fun experiment to try. Take an open source project such as yt-dlp and try to find it from a very generic term like "youtube downloader". You won't be able to find it because of all of the content farms that try to rank at the top for that term. Even though yt-dlp is probably actually what you want for a tool to download video from YouTube.

Is that true? Do most people want to install a command line tool to download youtube videos?

Dah00n
3 replies
6h41m

No. They want sites like savefrom.net - which is hit number one on Google.

anonymoushn
2 replies
5h26m

Did you try using savefrom.net? You can type "https://www.youtube.com/watch?v=IkYVmtgxebU" into the text box and hit "Download". Then you'll get a new tab that tries to get you to install malware. If you decline to install it, the new tab takes you to the malware's homepage. If you close the tab and go back to the original tab, savefrom.net presents you with an error message saying "The download link not found." and does not help you download the video.

gkbrk
1 replies
5h15m

I tried this. I went to savefrom.net. First thing it does is ask permission to send notifications.

After that there is a popup asking me if I want to continue in the browser or download their app. If I click download, it downloads a file called download_helper_2.3.27.apk.

Instead of downloading their app, if I paste a YouTube link, it tells me I can wait or download their APK to skip waiting. The download link downloads an older version called download_helper_2.3.19.apk.

When I do the process again, instead of the older APK link it gives me a Chrome extension link. But if you look at the instructions you see that it's not a Chrome extension, but a minified userscript. And it has `@include https://*`, so it can basically run on any website, rather than only when you click an extension icon like a regular browser extension.
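
For reference, the metadata block of a userscript with that kind of wildcard include looks roughly like this (an illustrative sketch, not the actual minified script they ship); the `@include https://*` line is what lets it run on every HTTPS page you visit:

    // ==UserScript==
    // @name         helper (illustrative)
    // @include      https://*
    // @grant        none
    // ==/UserScript==
    // With that @include, everything below runs on every https page,
    // not just youtube.com or the downloader's own site.
    (function () {
      // ...whatever the minified payload actually does...
    })();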

If I try to ignore all the distractions and wait for the download link, I can click it and it downloads the MP4 file. But it also opens a popunder with the domain https://refpamjeql.top/.

Not the best experience, and seems like a high risk of getting malware, but it does get an MP4 file at some point.

anonymoushn
0 replies
5h11m

Interesting! I tried again and got completely different results this time. Now there's no malware tab, and instead it tries to get me to pay for a subscription to download high-quality videos or MP3s. If I click the barely-visible "Just let me download in my browser with low quality" below the paid subscription button, I get the same error as before.

Edit: the paid subscription payment flow says I'm actually buying "Televzr Premium Max Subscription for 1 Month_mp

Televzr helps get wireless access to the media library on the computer from the mobile phone"

So it purports to be something unrelated to downloading youtube videos. I didn't pay 1400 yen for it, so I won't get to find out if it helps me download youtube videos.

sagarpatil
3 replies
14h22m

Have you tried perplexity.ai? It's like ChatGPT and Google had a baby. Looks very promising and I'm seeing a lot of tech leaders (example Toby of Shopify) moving to it.

dartharva
2 replies
13h43m

Aren't Bing Chat and Kagi FastGPT the same in effect?

littlecranky67
1 replies
9h44m

No, FastGPT is GPT-2 based. I actually prefer FastGPT because it's fast (duh!), it gives very concise answers, and every generated response carries footnotes with links to the sources.

freediver
0 replies
1h37m

Just to correct, FastGPT uses claude-instant.

jeffbee
3 replies
12h17m

Pretty biased selection of queries. The article avoids the things that ChatGPT and the others without fresh data can't answer. Look at the trending searches on Google: they are all for fresh info that none of the others can answer. Sports scores. Google probably judges quality weighted by the questions their users actually ask, not this nerd bullshit.

Dah00n
1 replies
6h18m

How is a youtube downloader biased to fresh results? Seems to cover a pretty broad test.

jeffbee
0 replies
2h45m

It selects a "right answer" that suits a stale index, assuming that there can't have been a right-er answer discovered after ChatGPT's training horizon.

bluish29
0 replies
11h7m

Wouldn't any selection of queries be biased? Even what you are saying is biased: you're saying that Google would be better for the cases it optimizes for, which is even weirder. That is like saying you want to compare highly optimized code using some C libraries against native Python code.

Osiris
3 replies
15h9m

I have recently started using kagi after seeing a recommendation here.

From what I understand, it aggregates results from multiple sources rather than having its own indexer.

The results aren’t really any better, but the lack of ads and videos in the results makes for a cleaner experience.

I also haven’t yet taken advantage of the extra features to block certain websites from results.

Personally, I pay the $5 mostly in an attempt to support another competitor in the space.

kristofferR
1 replies
15h1m

Kagi is awesome, so much better experience than Google!

Start using bangs, lenses and customized results ASAP, that makes a big difference.

Zambyte
0 replies
14h51m

I actually find myself using bangs way more since I switched to Kagi from DDG. I think it's the AI bangs like !chat and !expert that got me in the habit of using bangs besides !g (which I never actually use anymore).

Nextgrid
0 replies
13h27m

Pretty sure the reason Kagi is better isn't because they use multiple sources, it's just because they can use the presence of ads as a negative ranking signal, something that none of the major public search engines will ever do as it goes against their own business model.

xpressvideoz
2 replies
12h49m

However, there's a sizable group of vocal folks who claim that search results are still great.

I think that this very sentence shows the author's bias, because I feel that Google's search results are not just great, but better than they were 10 years ago.

realcertify
0 replies
12h45m

You must be kidding, Google is becoming worse every day. Still better than useless Bing though.

computerfriend
0 replies
11h27m

Consider yourself part of the sizeable group of vocal folk then.

viraptor
2 replies
13h36m

I really don't agree with some of the expectations around results.

Download youtube videos

Ideally, the top hit would be yt-dlp or a thin, graphical, wrapper around yt-dlp. Links to youtube-dl or other less frequently updated projects would also be ok.

That's not what a random person expects. yt-dlp or youtube-dl have no meaning to a normie. The first result is an online downloader and that's what an average person is after. I checked the first result in Kagi and it's a valid youtube downloader.

If you're after a commandline tool, ask for it: "commandline tool download youtube videos" gives youtube-dl as the top result with valid options afterwards: https://kagi.com/search?q=commandline+tool+Download+youtube+...

"Ad blocker" seems to ignore other options exist. Yes, ublock would be preferable for most, but ABP is not "very bad". Kagi mentions ABP at position 1 and ublock at position 8: https://kagi.com/search?q=Ad+blocker&r=au&sh=4VHApDrTEfuxMOt... (But for a query like that, I'd be happy with a wikipedia article about adblockers, because why not?)

I'm not disagreeing that results have been getting worse for years, but... this is a really bad scoring system. It feels like that one very new person jumping on SO posting something like "syntax error: if 1 {" - what are you even asking for? (To be honest, the search engines could also give you the equivalent of "this is very vague, would you like to specify what you're actually after? here are some suggestions: ...", but that's beyond the scope here.) The search returning not the exact thing you want to see for a super generic query, but a valid answer to the question, is not "very bad".

linusg789
0 replies
1h23m

My thoughts exactly.

anonymoushn
0 replies
43m

If you try using it, the first result doesn't help you download a youtube video and does try to get you to install malware.

shutupnerd0000
2 replies
13h33m

Speaking of bad software, is anyone else getting a huge amount of horizontal scroll on mobile on this blog post? What should I add to my bag of tricks to work around that?

jraph
0 replies
11h41m

Reader mode might do the job.

gniv
0 replies
9h38m

I am not (Chrome on iOS).

vitorgrs
1 replies
12h29m

Weird article. Basically, the author thinks that anything that is not yt-dlp is a bad search result, which is pretty insane.

Like, for me at least, I already know yt-dlp exists. When I search "youtube downloader", it's exactly because I want an online website to download youtube videos.

anonymoushn
0 replies
41m

The author would probably accept any result that helps them download youtube videos. Did you find any and successfully use it to download a youtube video? Could you provide a link to the one you used?

sundalia
1 replies
4h35m

The intro query "youtube downloader" already showed me relevant results (some website where you paste an URL and bam download). I think there's a big tech bias in the whole post (how relevant is a mastodon poll, for real).

Not saying the current landscape doesn't suck with ads everywhere and incentives to not give exactly relevant results at times, but I think google is pretty good still.

anonymoushn
0 replies
42m

Which web site did you use to successfully download a youtube video, and which youtube video did you download?

littlecranky67
1 replies
9h41m

Kagi really shines on topics that are SEO-spammed on other search engines, e.g. when travelling to a tourist city, searching for a recipe, or researching basically any product you want to buy. I actually got "search anxiety" searching these topics, as I know I will have to navigate a lot of SEO spam, content that is artificially blown up, and the core information purposefully hidden somewhere on the page - if it's there at all. Plus the multitude of cookie consent banners and newsletter subscription popups on each link...

I've been using Kagi's FastGPT [0] now for these searches, it basically removes all the bullshit and gives verifiable sources for any answers.

[0]: https://kagi.com/fastgpt
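
If you'd rather script it than use the web page: as far as I can tell from Kagi's API docs, FastGPT is also exposed as a single authenticated endpoint. A minimal sketch (the URL, auth header, and response handling below are my reading of those docs, so double-check them and bring your own API key):

    // Query Kagi FastGPT from a script (sketch; verify endpoint/header against the docs).
    const KAGI_API_KEY = process.env.KAGI_API_KEY ?? ""; // your own key, not hard-coded

    async function fastgpt(query: string): Promise<void> {
      const res = await fetch("https://kagi.com/api/v0/fastgpt", {
        method: "POST",
        headers: {
          "Authorization": `Bot ${KAGI_API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ query }),
      });
      // The response JSON contains the answer text plus the cited sources.
      console.log(JSON.stringify(await res.json(), null, 2));
    }

    fastgpt("why do wider tires have better grip?");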

pbronez
0 replies
5h17m

Yeah that’s my go-to as well. Interestingly, I often find that “Fast” mode results are as good or better than “Expert” mode for simpler tasks.

ic_fly2
1 replies
7h8m

I have a small page that modifies my GET requests to Google by adding -site:… for a bunch of the most annoying content farms, for stuff I search often (docs).
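
A minimal sketch of that kind of page's logic, with placeholder domains standing in for the content farms (the real list is whatever annoys you):

    // Build a Google search URL with -site: exclusions appended to the query.
    const BLOCKED = ["spamfarm.example", "contentmill.example"]; // placeholders

    function buildSearchUrl(query: string): string {
      const exclusions = BLOCKED.map((d) => `-site:${d}`).join(" ");
      return "https://www.google.com/search?q=" +
        encodeURIComponent(`${query} ${exclusions}`);
    }

    // A form's submit handler on the page would then do:
    //   location.href = buildSearchUrl(input.value);
    console.log(buildSearchUrl("css grid docs"));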

Dah00n
0 replies
6h32m

Have you tried uBlacklist?

gniv
1 replies
10h12m

Meta: Since the text on the page is so dense, I tried reading it in Chrome's reading mode. Which was fine until the Appendix. All the results are missing, leading to confusion.

UberFly
0 replies
10h9m

I also was overwhelmed by the amount of data. I came back here to find the cliff notes :)

btbuildem
1 replies
2h44m

On a side note: would it kill the author of the site to use a stylesheet?

hasmolo
0 replies
2h42m

it's the same as my choice to only use lowercase letters, it is designed to make you upset that i am not following conventions. that's as far as i have been able to figure for why i started doing this, and by extension, why tech bois love to drop some vital feature of communication to signal being an 'insider'

BoostandEthanol
1 replies
6h49m

There’s something incredibly entertaining to me about even this well researched article struggling to find a reason for why wider tyres have more grip.

As I understand it, this is because tyres are still somewhat of a mystery, and anyone outside of a laboratory really doesn't know shit. The best explanation I can think of is tyre load sensitivity. The friction coefficient of rubber decreases with normal force (e.g., a heavily loaded tyre has a lower friction coefficient), which is a pretty well accepted fact; it's one of the methods engineers use to tune the handling of cars. This means a wider tyre has a lower force per unit area of the contact patch, which means it'll have a higher friction coefficient.

Now that sounds plausible to me, but that’s just my best guess explanation.
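
A toy numeric version of that load-sensitivity argument, with made-up constants purely to show the direction of the effect (not real tyre data):

    // Toy model: the rubber's friction coefficient falls as contact pressure
    // rises, so spreading the same load over a bigger patch buys back grip.
    const MU_0 = 1.4;      // friction coefficient near zero pressure (invented)
    const K = 0.001;       // drop in mu per kPa of contact pressure (invented)
    const LOAD_N = 4000;   // vertical load on one tyre, in newtons

    function gripForce(contactAreaCm2: number): number {
      const areaM2 = contactAreaCm2 * 1e-4;
      const pressureKpa = LOAD_N / areaM2 / 1000;
      const mu = MU_0 - K * pressureKpa;
      return mu * LOAD_N; // max lateral force before sliding
    }

    console.log("narrow (150 cm^2):", gripForce(150).toFixed(0), "N"); // ~4533 N
    console.log("wide   (250 cm^2):", gripForce(250).toFixed(0), "N"); // ~4960 N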

InCityDreams
0 replies
5h52m

https://www.bicyclerollingresistance.com/

gives good tyre advice (obviously not car tyres, but info is there)

yashasolutions
0 replies
8h2m

Kagi is great; it's now my daily driver for search. This is after I got tired of DDG and moved to Google (through StartPage), but the results were spammy or just irrelevant... and sometimes there weren't any results at all... for the most trivial searches. So I switched recently to Kagi, and so far it's been smooth sailing and a real time saver.

wolverine876
0 replies
13h48m

Look at the source for that page. Is it hand-coded? (I think it's great.)

urbandw311er
0 replies
2h45m

I had to stop reading this because I found it too depressing and it triggered a lot of anger about how big tech combined with the incentives of capitalism is basically fucking up the world.

torginus
0 replies
5h14m

I wonder if this aggregate enshittification of computers (be it search, social media, video games, etc.) is actually a good thing for humans in general.

I feel like today's digital spaces don't have as strong a grip on the minds of people - I think folks have started rediscovering the value of genuine human interaction and hobbies that do not involve a computer screen.

For example, I haven't seen the equivalent of 2000s-2010s Facebook addicts (or WoW addicts in the gaming space) to such an extent, with parasocial media such as TikTok, Youtube, or Twitch having replaced social media, and social video gaming such as MMOs having lost a lot of popularity.

thsksbd
0 replies
14h36m

Honestly, if you have to search something remotely technical, try HN's search function with comments enabled.

If the topic has ever come up, the discussion and links are likely to be more relevant and better than your average wiki article.

throwawaaarrgh
0 replies
13h14m

Search engines are not designed to give you the information you desire. They are designed to sell ads or metadata. "Result quality" is of no consequence.

If you actually wanted accurate results you wouldn't use a tool that is literally attempting to read your mind like a fortune teller. It is impossible to know what you want just by the word "snow". Jesus Christ engineers are so dumb.

swayvil
0 replies
3h18m

Without labor to run their circus, 99% of business would disappear overnight.

Without business, spam would disappear.

So if you remove the labor you remove the spam.

So the best spam filter is UBI.

shaldengeki
0 replies
12h8m

What's most shocking to me is how much malware there is in all of this. The fact that Google et al aren't constantly in trouble for directly forwarding unwitting users to malware distributors indicates to me just how far our standards have fallen for a "good" search engine. I feel like we'd be happier with search engines that adhered to "first, do no harm" principles.

readthenotes1
0 replies
15h28m

I got different results for Google on "ad block".

And changing the query to "ad blocker" like Google suggested raised ublock origin way up in the results

nunez
0 replies
5h9m

When I tried running the query from the paper, "cellular phone" (no quotes), the top result was a Google Store link to buy Google's own Pixel 7, with the rest of the top results being various Android phones sold on Amazon.

Interestingly, if you add "before:2001-01-01" to the query, the paper that Brin and Page referenced shows up as the third result.

That this query now ranks phones you can buy higher than information about phones makes sense, since the web is much bigger these days and cell phones are much more widely accessible than they were back then.

Although Google doesn't publicly provide the ability to see what was historically returned for queries, many people remember when straightforward queries generally returned good results.

See above. Sort of.

---

I wish Dan spent more time talking about Kagi. I, too, have found it terrible for searching for things to buy and some images but excellent otherwise.

nneonneo
0 replies
13h31m

Honestly, this is depressing. Back in the day, AltaVista and AskJeeves existed but returned terrible results, and Google showed up to disrupt them all. It seems like we should be on the verge of repeating this cycle.

Maybe LLMs will help, but I can’t shake the nagging feeling that the situation will simply get worse with LLMs, not better, due to hallucinations and the apparent “gullibility” of LLMs: I would not be surprised if SEOing an LLM turns out to be easier than SEOing Google.

londons_explore
0 replies
9h40m

I would kinda have liked side by side screenshots so I could see for myself rather than a wall of text

joshuaissac
0 replies
6h35m

For the ad blocker results, the author judges the search engines by how they rank the best result (uBlock Origin), but I think that search results that point to Adblock Plus or AdBlock are good enough. Sure, they do not block all ads, and take money from advertisers to allow through certain types of ads, but they still block ads in general, and 'acceptable ads' can be disabled in the settings. So I would consider these 'good results', rather than 'bad results'/'very bad results' as the author does.

jmakov
0 replies
7h10m

Using phind most of the time. It would be interesting to add it.

jimbobthemighty
0 replies
7h23m

The Github link is my top result on Google. Clearly a mix of uBlock and Privacy Badger is more powerful than most appreciate.

jcmeyrignac
0 replies
9h25m

No mention of https://www.qwant.com

jbmilgrom
0 replies
1h23m

Was GPT-4 used (with a paid subscription)?

innocentoldguy
0 replies
12h37m

While I think the article is interesting, I disagree with its results regarding Kagi. I like Kagi and rarely use anything else. Kagi's results are decent and I can blacklist sites like Amazon.com so they never show up in my search results.

hamilyon2
0 replies
8h56m

Is this from desktop? What region?

uBlock Origin as the very top result for an iOS device is simply a bad search result page. Maybe fourth position is tolerable, after three different working ones. Maybe it should be lower; I doubt myself, wondering if my point of view is too elitist.

Yt-dlp is subject to all sorts of takedown requests in different jurisdictions.

emmanueloga_
0 replies
14h13m

I will admit that I can't read between the lines here and will just go ahead and ask: What is "bluesky thought leader" supposed to mean? (1) Any guesses who this may be? Why is he not quoted directly? (btw, the term is used 3 times, presumably to refer to the same person).

1: my reading is that this is a sarcastic label for someone who is supposed to be an innovation thought leader but is actually just defending the broken search landscape status quo.

elcook4000
0 replies
14h55m

I have found appending site:edu remarkably improves google results.

For both the tire question and with respect to a youtube downloader, the first results were on the nose with the addition of site:edu on Google.
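
That is, queries of the form:

    why do wider tires have better grip? site:edu
    youtube downloader site:edu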

Why this is needed and whether a noncommercial, information rich web portal should exist are questions for another thread.

csours
0 replies
12h27m

Wide tires by Jason of Engineering Explained: https://www.youtube.com/watch?v=kNa2gZNqmT8

Better answer: learn the differential equations in this book:

https://ftp.idu.ac.id/wp-content/uploads/ebook/tdg/TERRAMECH...

cratermoon
0 replies
15h48m

"Going back to the debate between folks like Xe, who believe that straightforward search queries are inundated with crap, and our thought leader, who believes that "the rending of garments about how even google search is terrible now is pretty overblown", it appears that Xe is correct."

Also, the article tested Mwmbl as well, not mentioned in the title here.

buro9
0 replies
7h56m

mostly my search is now Wikipedia.

I'm probably in a very small group who have the entirety of English wikipedia (without images) on my Android (via Kiwix), and I just search that. 99% of the time that's all I need.

the only exceptions are super current things like weather (Windy), or travel (Navan work travel system gives me enough to just go direct to airlines, hotels, etc), and local (OSM via Organic Maps).

I've almost completely degoogled (not intentionally, but driven gradually by Google becoming crappy incrementally), but didn't really find a single generic replacement as much as I found far better single purpose tools.

I'm reminded of that Craigslist image showing how many startups were each competing against specific parts of Craigslist https://cbi-blog.s3.amazonaws.com/blog/wp-content/uploads/20... , and this is what it feels like is happening to Google.. they're being beaten in specific areas, but at the same time spam and crap is diluting their core product.

aworks
0 replies
14h52m

The appendix describing the individual search results is both entertaining and scary e.g.

"Two of the top three hits are how to install the extension and the rest of the top hits are how to remove this badware. Many of the removal links are themselves scams that install other badware."

arthurcolle
0 replies
15h17m

I use serpapi for my hot RAG and the results are fine.

Brave search API is obscenely overpriced. I hope someone is working on Search because Google has become a singularly garbage company. Propping up DEI is sinful enough but just failing to compete is lame. /shrug

airstrike
0 replies
2h36m

Continuing with the theme of running simple, naive, queries, we used the free version of ChatGPT for this post, which means the queries were run through ChatGPT 3.5.

why

ZeroGravitas
0 replies
8h34m

I'm not sure youtube-dl is a good answer unless you're a nerd.

Which is a similar phenomenon to search. If you have sufficient tech skills there's a whole world of freely available software out there to complete your task.

If you're not then you are at the mercy of a range of commercial offerings (some built on the free software) that range from arguably scams to outright scams.

ShadowBanThis01
0 replies
14h35m

More incorrect usage of "hallucinated" for simply made-up or inaccurate results.

SV_BubbleTime
0 replies
2h37m

I’d love to see this a little extended.

Searx and Yandex.

Specifically… if I need something even slightly “gray”, Yandex is the only option anymore. Torrent search on google et al is just awful.

IceMichael
0 replies
1h11m

Okay, so all search engines suck. Yeah, that matches my experience

DeathArrow
0 replies
9h7m

The thoughts about building a better search engine than Google are interesting.

Unlike the author, I think that building a better search engine than Google is possible. But it's going to be rather expensive. And the only proven way to monetize it is selling ads, which will degrade the quality of the search results fast. For potential investors, there are probably many better ways to invest money than by building a search engine.

This leaves us with only one viable alternative: build it in the open like Wikipedia and source donations from people and from Google competitors like Amazon or Apple.