Rediscovering the Small Web (2020)

raytopia
13 replies
5d4h

I feel like the internet needs a giant directory of indie websites. So you can actually surf around and find them.

The big modern search engines almost have to be intentionally hiding these websites because they're nearly impossible to find without using an alternative engine like wiby.me or search.marginalia.nu.

dartos
8 replies
5d4h

I don’t think Google hides small sites as much as people are really good at SEO for Google specifically.

Like my blog has literally 0 SEO and you’ll never find it, but a friend of mine has a blog where he does not post very often, but spends a lot of effort on SEO and it’s very easy to run into his blog.

The SEO meta destroyed small blogs.

marginalia_nu
7 replies
5d1h

It's impossible to say this is something they do, but it's worth noting that Google also has an economic incentive to mostly show commercial/ad-ridden results, as leading users to blogs with no AdSense on them makes them less money; so it would at least be in their interest to let the search results look like they do.

To fully understand Google you need to look at them not as a service that brings websites to people, but as one that directs people to websites.

leephillips
2 replies
5d

Contains the quote above and “The goals of the advertising business model do not always correspond to providing quality search to users.”

So what we observe in the deterioration of Google search was predicted by its creators, who made the deliberate decision to let this happen by accepting advertising.

philistine
0 replies
4d20h

It’s so unfortunate that no one inside Google is taking any decision to clearly make things worse. It’s simply the structure of their business that is fundamentally wrong, and their founders had correctly identified the problem right at the start.

marginalia_nu
0 replies
5d

Google went public in 2004, after that I don't think any amount of founder idealism could have saved it from shareholder pressure. If anything it's remarkable they held out as long as they did.

dartos
1 replies
4d3h

I think the game of SEO works in the favor of advertisers naturally.

Google doesn’t need to downrank small sites, it just happens.

Maybe it’s just semantics

marginalia_nu
0 replies
4d2h

It's not like it's impossible to combat search engine spam. By far the most effective tool is to just go after affiliate links and ad-heavy websites. Penalize those websites and 99% of the search engine spam vanishes.

Though as noted, this is not in Google's economic interest to do.
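To make the idea concrete, here is a rough Python sketch of what "going after affiliate links and ad-heavy websites" could look like as a ranking signal. It is only an illustration under stated assumptions, not how any real search engine scores pages: the parameter names in `AFFILIATE_PARAMS`, the host list in `AD_SCRIPT_HOSTS`, the `MonetizationCounter` class and the 0.25 weight are all made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlsplit

# Assumptions for illustration only: real affiliate/ad markers vary widely.
AFFILIATE_PARAMS = {"tag", "affid"}
AD_SCRIPT_HOSTS = ("googlesyndication.com", "doubleclick.net")


class MonetizationCounter(HTMLParser):
    """Count links, affiliate-looking links and ad scripts in one page."""

    def __init__(self):
        super().__init__()
        self.links = 0
        self.affiliate_links = 0
        self.ad_scripts = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links += 1
            query = parse_qs(urlsplit(attrs["href"]).query)
            rel = (attrs.get("rel") or "").split()
            # rel="sponsored" or an affiliate-style query parameter marks the link.
            if "sponsored" in rel or AFFILIATE_PARAMS.intersection(query):
                self.affiliate_links += 1
        elif tag == "script":
            host = urlsplit(attrs.get("src") or "").hostname or ""
            if host.endswith(AD_SCRIPT_HOSTS):
                self.ad_scripts += 1


def ranking_penalty(html):
    """Return a penalty: share of affiliate links plus a hypothetical 0.25 per ad script."""
    counter = MonetizationCounter()
    counter.feed(html)
    share = counter.affiliate_links / counter.links if counter.links else 0.0
    return share + 0.25 * counter.ad_scripts


spammy = (
    '<a href="https://shop.example/item?tag=me-20">buy</a>'
    '<a href="https://shop.example/x" rel="sponsored">deal</a>'
    '<script src="https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script>'
)
plain = '<a href="https://friend.example/">a friend</a>'

spammy_penalty = ranking_penalty(spammy)
plain_penalty = ranking_penalty(plain)
```

A ranker could subtract such a penalty from a page's relevance score; how hard to penalize is a judgment call that the sketch does not try to settle.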

ramses0
1 replies
5d4h

I was just going to post a comment similar to this. We've swung towards walled gardens of piles of content instead of graphs of individually curated links.

Exactly that "surfing" or "webring" or "StumbleUpon" style of actually browsing within a larger context, rather than searching or push-promoting within that pile of content.

dudinax
0 replies
4d11h

The walled gardens are better for many of the internet's main uses.

If I need to find out what vodka to buy I Google with site:reddit.com and pick the post that's obviously written by alcoholics. The small web can't touch that.

fabianholzer
0 replies
4d9h

ooh.directory is fantastic, I particularly liked the stance that only a few sites a week are added, which allows one to "digest" these sites. Sadly, no new sites have been added for nearly three months. I assume this is just an instance of "Life happens" - it is a single person venture after all - but if there were a dozen similar attempts at handpicking and cataloguing the "good web", it would not hurt.

thehappyfellow
10 replies
5d9h

One of the best internet experiences I had in a while is reading (and writing!) posts on bearblog.dev, check out their discover feed. Wholesome place.

In similar spirit, check out https://ooh.directory

8organicbits
2 replies
5d5h

Here's another self-promotion.

https://alexsci.com/rss-blogroll-network/

This uses OPML blogrolls to crawl blog-to-blog recommendations. I seeded it with the blogs I follow and various planets (https://indieweb.org/planet) and then recursively followed recommendations to build an organic network. Lots of the content is tech-related, indieweb, and smallweb. It's grown to 17 languages and over 4000 RSS/atom feeds.

As an example, the linked blog has a page here [1] and it was discovered by a recommendation by [2].

[1] https://alexsci.com/rss-blogroll-network/discover/feed-12ac5...

[2] https://alexsci.com/rss-blogroll-network/discover/feed-8ecf9...
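For readers wondering what "recursively followed recommendations" might look like, here is a minimal Python sketch. It is not the actual crawler behind the site: `extract_feed_urls`, `crawl`, `make_opml` and the `fetch_blogroll` callable are hypothetical names, and the "web" is an in-memory dict so the example runs without network access. The only OPML detail it relies on is that feeds are listed as `<outline>` elements with an `xmlUrl` attribute.

```python
import xml.etree.ElementTree as ET
from collections import deque


def extract_feed_urls(opml_text):
    """Return the xmlUrl of every <outline> element in an OPML document."""
    root = ET.fromstring(opml_text)
    return [o.attrib["xmlUrl"] for o in root.iter("outline") if "xmlUrl" in o.attrib]


def crawl(seeds, fetch_blogroll):
    """Breadth-first walk over blog-to-blog recommendations.

    fetch_blogroll(feed_url) is a hypothetical callable that returns the OPML
    text of that blog's blogroll, or None if the blog has none.
    Returns {feed_url: feed_url_that_recommended_it}, with None for seeds.
    """
    discovered_by = {seed: None for seed in seeds}
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        opml = fetch_blogroll(current)
        if opml is None:
            continue
        for url in extract_feed_urls(opml):
            if url not in discovered_by:  # skip feeds already seen (handles cycles)
                discovered_by[url] = current
                queue.append(url)
    return discovered_by


def make_opml(urls):
    """Build a tiny OPML document for the in-memory demo."""
    outlines = "".join(f'<outline type="rss" xmlUrl="{u}"/>' for u in urls)
    return f'<opml version="2.0"><body>{outlines}</body></opml>'


A = "https://a.example/feed.xml"
B = "https://b.example/feed.xml"
C = "https://c.example/feed.xml"
FAKE_BLOGROLLS = {A: make_opml([B]), B: make_opml([A, C])}

network = crawl([A], FAKE_BLOGROLLS.get)
```

The `discovered_by` mapping is what lets a page say "this feed was discovered by a recommendation from that one".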

8organicbits
0 replies
4d19h

Yes, semi-regularly, I did a fresh update since your message.

To reduce storage size, only the title/link/metadata of the latest post from each feed is saved. I run the crawler manually, aiming for weekly, but sometimes less frequently. So this won't catch every post and it lags behind quite a bit. I'm hitting some hosting sites faster than I'd like, especially ones that support custom domain names, so I'm planning on fixing up the rate-limiting strategy before I put it in a daily cron job.

There's a plan for ArchiveTeam to use the RSS feed as another way of discovering blog posts to archive. I don't think it's generally useful to point your feed reader at it as there's quite a diverse collection of content.
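On the rate-limiting point above, one common approach is a per-host minimum delay between requests. The Python sketch below shows that idea only; it is not the crawler's actual strategy, and `HostThrottle`, `FakeClock` and the 5-second interval are assumptions for the example. Note that keying by hostname would not group custom domains served by the same hosting provider, which is why the `key` function is swappable (for instance, for one that maps a URL to a resolved IP address).

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Enforce a minimum delay between requests that share a key (default: hostname)."""

    def __init__(self, min_interval, key=lambda url: urlsplit(url).hostname,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.key = key
        self.clock = clock
        self.sleep = sleep
        self._next_allowed = {}  # key -> earliest time the next request may start

    def wait(self, url):
        """Block until `url` may be fetched; return the number of seconds waited."""
        k = self.key(url)
        now = self.clock()
        delay = max(0.0, self._next_allowed.get(k, now) - now)
        if delay:
            self.sleep(delay)
        self._next_allowed[k] = now + delay + self.min_interval
        return delay


class FakeClock:
    """Deterministic clock so the demo does not really sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


fake = FakeClock()
throttle = HostThrottle(5.0, clock=fake, sleep=fake.sleep)
waits = [throttle.wait(u) for u in (
    "https://blogs.example/alice/feed.xml",
    "https://blogs.example/bob/feed.xml",   # same host, so this one has to wait
    "https://other.example/feed.xml",       # different host, no wait
)]
```

In a real crawler the defaults (`time.monotonic` and `time.sleep`) would be used, with `throttle.wait(url)` called right before each fetch.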

fortran77
3 replies
5d9h

I just tried using google to search for sites I see in ooh.directory, and it's very hard to get them to surface. I can take exact specific phrases from them, like "Scaling a Digital Panel Voltmeter" and without quotes neither Google nor Bing will find the site, and with quotes, only Bing finds the site: (https://zzncx.top/posts/scaling-a-digital-panel-voltmeter/)

Personal blogs with real information just can't be found anymore.

marginalia_nu
0 replies
5d1h

It's kinda hit and miss with regards to these types of queries.

I've got better phrase matching in the pipe though, give it a few weeks and it should do this even better.

smetj
0 replies
5d8h

Excellent point made

Cosi1125
1 replies
5d4h

The Reddit thread [1] in which the author introduces Bearblog explains the sorry state of today's Internet a bit. "What's the point of blogging if I can't track users and harvest their email addresses?"

[1] https://www.reddit.com/r/Blogging/comments/i8fmuc/%CA%95%E1%...

philistine
0 replies
4d20h

For me, the point of my pointless blogging is to sell dozens of books across the world with my words in them. That way, I feel like I’ve achieved immortality. No joke.

nils-m-holm
9 replies
5d10h

Here's my contribution to the small web: http://t3x.org

sylware
3 replies
5d8h

Careful, if you show that noscript/basic (x)html does a good enough job, you will get attacked by big tech shadow-paid hackers (or idiotic ones), to force you to use their javascripted, grotesquely massive and complex web-engines.

...

__MatrixMan__
2 replies
5d5h

I also dislike the skinner box that is today's web, but do you really think it's somebody's job to attack you for having your site be a document?

sylware
1 replies
5d5h

Well, maybe not a static document, but as soon as you have some basic HTML forms doing a good enough job, I would not be surprised if it gets attacked by big tech shadow-paid hackers (or idiotic ones) to push forward their massive and complex javascripted web engines which they have control over.

Look at whom the crime benefits in the end.

NackerHughes
0 replies
5d4h

What? This can easily be averted by adding a captcha to the form (server-side validation so no JS needed) and/or some sort of rate limiter or firewall, e.g. blocking any IP address that sends too many requests too quickly.
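As a rough illustration of the "block any IP address that sends too many requests too quickly" idea, here is a small sliding-window limiter in Python. It is a sketch under assumptions, not a hardened defence: `IpRateLimiter`, the limit of 3 requests per 10 seconds and the example addresses are all made up, and a real deployment would usually do this in the web server or firewall instead.

```python
import time
from collections import defaultdict, deque


class IpRateLimiter:
    """Allow at most `limit` requests per `window` seconds for each IP address."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits = defaultdict(deque)  # ip -> timestamps of recently allowed requests

    def allow(self, ip):
        """Return True if the request should be served, False if it should be rejected."""
        now = self.clock()
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Demo with a controllable clock instead of real time.
current_time = [0.0]
limiter = IpRateLimiter(limit=3, window=10.0, clock=lambda: current_time[0])

burst = [limiter.allow("203.0.113.7") for _ in range(4)]  # fourth request is rejected
other = limiter.allow("198.51.100.2")                      # other addresses are unaffected
current_time[0] = 10.0
after_window = limiter.allow("203.0.113.7")                # allowed again once the window passes
```

A form handler would call `limiter.allow(client_ip)` before processing a submission and answer with an error (for example HTTP 429) when it returns False. None of this needs JavaScript on the client.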

pkphilip
1 replies
5d4h

This is such a useful set of links! Thank you!

nils-m-holm
0 replies
5d3h

Pretty much everything that is linked to is also on T3X.ORG. :)

felixyz
1 replies
5d7h

Wow, I remember discovering your page in the late 90s. Never thought I'd find it again!

nils-m-holm
0 replies
5d3h

Being found is the greatest problem that small web sites have these days. Glad you found it again! :)

swayvil
0 replies
5d3h

Hey man I'm into meditation too. Nice to meet you.

freediver
7 replies
5d4h

Our contribution to the small web: https://kagi.com/smallweb

The site and list of blogs is open source, growing steadily by about 10 each day (almost at 15,000 at this point).

Every recent post from sites in Kagi Small Web is indexed and given preference in Kagi Search results.

How it works: https://blog.kagi.com/small-web

edit: The project just had its one thousandth commit!

zzo38computer
1 replies
4d15h

How it works: https://kagi.com/small-web

I tried to access it. It displays a different web page in a frame, which has an invalid certificate (among other things, it is expired and is for the wrong domain name), and then when I bypass the certificate error, I get a 404 error.

treetalker
0 replies
4d12h

Mission accomplished: sounds just like the web when I was in high school! ;-)

philistine
1 replies
4d21h

I know the friction to add websites is the point, but might I recommend a way to add our own websites without having to promote two others. My rinkydink website qualifies, but all the other small websites I know are all already on the list.

freediver
0 replies
1d21h

We make it high effort as we want to prevent low effort submissions - we only have limited resources available to review.

sbarre
0 replies
5d

https://wiby.me/ is also excellent. Someone else linked it elsewhere in the thread but worth riding the coat tails of the top post for anyone interested.

Teckla
0 replies
4d17h

Our contribution to the small web: https://kagi.com/smallweb

After opening this web page, I pressed down arrow a few times to scroll the page. At first, I didn't understand why it only scrolled a few pixels.

It looks like there's a scrollable area within a scrollable area. The outermost scrollable area only scrolls a few pixels.

This is a badly designed web page.

Kye
0 replies
5d3h

It's StumbleUpon without the spam problem. I like it.

roschdal
4 replies
5d12h

The web was so much more fun in the 90s.

marginalia_nu
3 replies
5d10h

Fun parts of the web still exist today, they're just struggling to be noticed. Arguably the biggest change since then is in signal to noise ratio.

nicbou
0 replies
5d9h

And the algorithms we live by.

Google does not easily surface those websites. Social networks suppress posts with links.

CalRobert
0 replies
5d5h

A lot of the good stuff got sucked into walled gardens. People’s personal home pages or tacky MySpace pages were definitely more fun than the current semiprofessional content scroll. Forums like this very one were mostly subsumed into Reddit. Never mind the death of the BBS (not actually the internet, I realise).

BaculumMeumEst
0 replies
5d7h

The problem is that most of the content I'm interested in is posted in the not-so-fun parts of the web, so I feel forced to participate.

tropicalfruit
2 replies
5d11h

it was mobile devices, app-ification and social media that really started to kill the small web, kind of ironically.

and if you're a front end developer, it was apple launching the meta viewport tag in 2007 that killed the simple front end.

082349872349872
1 replies
5d4h

"Today is September 11323, 1993"

tropicalfruit
0 replies
4d7h

yep. truly eternal.

mjfl
2 replies
5d8h

In the same spirit, here is a site devoted to getting off the centralized platforms:

https://landchad.net/

sanjumsanthosh
0 replies
5d4h

Wow this looks clean !

janandonly
0 replies
5d4h

Wonderful collection of how-to’s to run your own server. Thanks for sharing.

Might I suggest (in the interest of privacy) that you give donators the option to use a Silent Payment address instead of a naked BTC address? I noticed you have a Monero address as well, so I assume you care about privacy.

Jordan_Pelt
1 replies
5d3h

Thank you for this. It has inspired me to delete my Reddit account and create an HN account. This gives me hope that the web can survive the social media era.

righthand
0 replies
5d1h

Well this is social media too. Beware of shifting complexities!

xenodium
0 replies
5d1h

My contribution to the small web is a lightweight blogging platform: https://lmno.lol My blog is at https://lmno.lol/alvaro

You can drag and drop your entire blog from a single markdown file https://indieweb.social/@xenodium/112265481282475542

You can read the blogs from anywhere, even terminal (no JS needed).

No need to sign up or log in to try it out. I haven't officially launched, but if you'd like to start blogging now, I'll be happy to share an invite code.

rambambram
0 replies
5d2h

My list of shared links is here: https://www.heyhomepage.com/?module=timeline&view=sharedlist

It's basically all the sites and feeds I follow daily with the Hey Homepage built-in RSS reader. You can browse the list and click around, or download it as an OPML file.

RSS = Really Social Sites; OPML = Other People's Meaningful Links

r85804306610
0 replies
4d22h

i've been publishing things as html2 pages, but not interconnected in any way. so each page (or sometimes group of pages) will be dedicated to an exploration of a single subject. i then send those pages to people who i think might be interested in them. that's all, they otherwise don't see the greater internet. of course people are free to add them to link aggregators, etc. but i don't police this practice. i simply don't care for my output to be consumed by the general public, or by llms, or by corporate media, or by whoever is not my friend or in my immediate circle of friends

kaeruct
0 replies
4d22h

One site in this vein that I hope never goes away is https://rpgclassics.com/

I discovered it as a young lad lost when playing some RPGs on emulators in the early 2000s

janandonly
0 replies
5d4h

This is now the 7th time someone shares this link on HN. It must be worth a read

Kovah
0 replies
5d9h

If anyone also misses StumbleUpon, there's something similar: https://cloudhiker.net

AstroJetson
0 replies
4d20h

There was a push during Covid on Gemini pages. I did that for awhile, but the lack of real formatting and not being able to cross link articles became a stopper.

You can get to some of them here

Collaborative Directory of Geminispace: gemini://cdg.thegonz.net/

But you need a Gemini reader

1vuio0pswjnm7
0 replies
4d20h

Seems like a small web deserves a small client. Why use a "big web" client to read the small web? "Big web" clients are funded by advertising or advertising companies.

Bias disclosure: I have used a text-only client for the last 30 years.