Sourcegraph went dark

sqs
59 replies
8h3m

Sourcegraph CEO here. We made our main internal codebase (for our code search product) private. We did this to focus. It added a lot of extra work and risk to have stuff be open source and public. We gotta stay focused on building a great code search/intelligence product for our customers.

That's what ultimately lets us still do plenty of things for devs and the OSS community:

(1) Our super popular public code search is at https://sourcegraph.com/search, which is the same product customers use internally on their own codebases. We spend millions of dollars annually on this public instance with almost 1M OSS repositories to help out everyone using OSS (and we love when they like it so much they bring it into their company :-).

(2) We also still have a ton of open-source code, like https://sourcegraph.com/github.com/sourcegraph/cody (our code AI tool).

BTW, if any founders out there are wondering whether they should make their own code open-source or public, happy to chat! Email in profile. I think it could make sense for a lot of companies, but more so for infrastructure products or client tools, not so much for full server-side end-user applications.

quantumwoke
26 replies
7h51m

Been a fan of sourcegraph since 2016 or so, it's been exciting to watch the pivots along the way. That being said, the loss of transparency here is pretty sad, speaking as a large FOSS repo owner. What were the main factors apart from risk that went into the decision?

sqs
25 replies
7h16m

Thanks for being a fan. And I understand it's a bummer to not have our code be public and open-source anymore. Sorry.

It's a bunch of reasons that add up. I'll give some more details for anyone curious.

(And I know that despite these reasons, lots of HNers probably wish it was not so. I agree! I too wish for a world where all companies could have their code be public and open source.)

- We have a lot of tech around large-scale code graph, indexing, etc., stuff that is very differentiated and hard to build. We were starting to put some of this in separate private repositories and link them in at build time, but that was complex. It added a lot of code complexity, risked bugs, and slowed us down, and if a lot of the awesome stuff was private anyway, what was the point?

- As we've been building Cody (https://cody.dev), our code AI tool, we've seen a LOT more abuse. That's what happens when you offer any free tier of a product with LLM inference. We had to move a lot more of our internal backend abuse logic to private repositories, and it added code complexity to incorporate that private stuff in at build time.

- It confused devs and customers to have 2 releases: an open-source release with less scaley/enterprisey features, and an enterprise release. It was a pain to migrate from one to the other (GitLab also felt this pain with their product) because the open-source build had a subset of the DB schema and other things. It was confusing to have a free tier on the enterprise release (lots of people got that mixed up with the open-source release), and it made our pricing and packaging complex so that lots of our time was spent helping customers understand what is paid and what isn't.

- There were actually very very few companies that were going to pay but then decided to use the open-source version and not pay us. A lot of people probably assume that's why we made this move, but it's not. I think this is because people like the product and see value in it, including all the large-scale code nav/search features that are in our enterprise version.

- Although very very few companies used our open-source version to avoid paying us, we did see it cause a lot of annoyance for devs who were asked by their management to try cloning our product or to research our codebase to give their procurement team ammunition to negotiate down our price. This honestly was just a waste of everyone's time.

- If we got a ton of contributions (we never really solicited any), then it might've changed the calculus. Sourcegraph is an end-user application that you use at work (and when fun-coding, but the primary revenue model is for us to charge companies). For various reasons, end-user server-side applications just don't get nearly as many contributions. Maybe it's because you'd need to redeploy your build for a bunch of other users at your company, not just yourself. Maybe it's because they necessarily entail UX, frontend, and scaling stuff, in addition to just adding new features.

- We heard from people who left GitHub that people at GitHub were frequently monitoring our repository to get wind of our upcoming features and launches. Someone from GitHub told me his "job is to clone Sourcegraph". Since then, they obviously deprioritized their code search to re-found GitHub on AI, so we're not seeing this threat anymore. But I didn't love giving Microsoft an unfair advantage, especially since GitHub products are not open source either.

- Since we made our code non-open-source, we've been able to pursue a lot more big partnerships (e.g., with cloud providers and other distribution partners and resellers). This is a valuable revenue stream that helps us make a better product overall. Again, because Sourcegraph is an end-user application with a UI that devs constantly use and care about, we never really had the MongoDB/Redis/CockroachDB risk of AWS/GCP/Azure just deploying our stuff and cutting us out. We're not protecting from downside here, but we are enjoying the upside because now those kinds of distribution partnerships are viable for us. To give a specific example, within ~2 months of making our code non-open-source last year, we signed a $1M+ ARR deal through a distribution partner that would not have happened if our code was open source. This is not our biggest annual deal, but it's still really nice!

We are totally focused on building the best code search/intelligence and appreciate all our customers and all the feedback here. Hope this helps explain a bit more where we're coming from!

cdchn
16 replies
6h25m

> Although very very few companies used our open-source version to avoid paying us, we did see it cause a lot of annoyance for devs who were asked by their management to try cloning our product or to research our codebase to give their procurement team ammunition to negotiate down our price. This honestly was just a waste of everyone's time.

Trying to spin that it was "for the devs" is really stretching the bounds of credulity. We get it, it's fine, you have investors to answer to, but come on, don't pee on our shoes and tell us it's raining.

sqs
7 replies
6h20m

Fair, I probably didn’t hear from the devs who weren’t annoyed by that. I heard from plenty of devs who were annoyed by it.

cdchn
6 replies
5h58m

This wasn't a decision made based on dev input, let's be real.

OliverGilan
5 replies
2h52m

This seems weirdly hostile. He laid out a bunch of points but you’re grabbing on to this one to make it seem like he’s using classic corporate-speak. Do you find it so unrealistic that the CEO of Sourcegraph has heard from devs that their managers asked them to try to clone or investigate the product before buying? That seems pretty likely

cdchn
3 replies
1h44m

I don't think calling out clear insincerity in the service of maintaining a public image is weirdly hostile. Maybe he did hear from devs saying it was annoying they were asked to clone or make his product work for free. But he _wasn't making these decisions "for those devs"_ as he claims; it was done to increase _sales._

tptacek
0 replies
10m

It's both hostile and, worse, boring. I know it sucks to be intrinsically less interesting than someone you disagree with passionately, but it is the case here that the CEO of the company explaining their policy shift is much more interesting than your rebuttals, which seem superficial and rote by comparison.

Someday somebody is going to be intrinsically more interesting about, like, supporting DNSSEC than me (maybe Geoff Huston will sign on and start commenting), and I'm going to want to claw my eyes out. I have empathy for where you're coming from. But can you please stop trying to shout this person down?

int3
0 replies
1h2m

people can do things for more than one reason

avianlyric
0 replies
33m

If we ignore the final sentence of his reason, then you might have a point. But his reason ends with:

> This honestly was just a waste of everyone's time.

That makes it pretty clear that the benefits to Sourcegraph (i.e., not wasting time negotiating with companies acting in bad faith) were a large part of this rationale.

Besides, if you had ever tried using the OSS version of Sourcegraph, you would realise that OSS Sourcegraph is a shadow of its enterprise version. Trust me, Sourcegraph didn’t lose any sales to people running OSS Sourcegraph, and anyone who’s willing to rip out the licensing system so they can use the enterprise features without paying obviously isn’t going to become a paying customer either.

HelloNurse
0 replies
2h22m

Investigating Sourcegraph's source code as part of procurement is not only plausible, but useful work that a software engineer should be happy to do.

Stating that making such evaluations impossible is a good thing is therefore more bullshit than other reasons to go closed source.

orochimaaru
5 replies
5h54m

Actually this one I get completely. There’s plenty of places or managers with dev orgs that will check if they can install something complex in house with open source. Nothing wrong with it. But it’s usually a huge waste of time.

Aeolun
3 replies
4h59m

> But it’s usually a huge waste of time.

Is it? I think at this point my company has probably saved millions of dollars by not paying for subscriptions, but hosting everything in-house. The price point of a lot of these services makes perfect sense when you are small, but paying 1M/year in subscription fees when you can host the same thing for 10k/year is just bonkers. I appreciate that someone has to pay for it for them to continue making the product, but there’s a point where it makes more sense for me to spend a year setting it up (and really only costs two weeks).

orochimaaru
0 replies
3h58m

My experience was with things like openstack and kubernetes. The org decided to do “cloud” in house first with openstack and then kubernetes - and run critical services on them that had very strict performance SLA.

The amount of time needed to do the whole thing wasn’t worth it. Sure, I enjoyed tinkering with the kernel and drivers and k8s, and diving into how cgroups and namespaces worked, etc. However, from a time-to-market/stability perspective the solution was nowhere near comparable to what public cloud providers offer.

Yeah - the subscription costs more. My experience has been that when things get big and hiring gets tense in house solutions just add stress on the devs maintaining it. At least with public cloud services - it’s clearer - if the budget doesn’t exist don’t run it.

I will add that I don’t use Sourcegraph nor am I connected with them in any way. So I’m not batting for their go-private strategy. Just commenting on this one point.

everforward
0 replies
46m

That math only works out nearly that cleanly if you avoid pricing out the engineer time for it.

If you’re paying $1M/year in fees, I would be shocked if you don’t have a whole team to support the open source version. Oncall, system upgrades, the usual stream of tickets about things not working right and people wanting to integrate, etc.

I do believe it can be cheaper to self-host, but I really doubt the difference in cost is 2 orders of magnitude. I’d be surprised if it was a single order of magnitude. I would wager it’s less than the seller’s profit margins because of economies of scale; I would guess in the range of 10%-20%.

avianlyric
0 replies
38m

Well, that obviously doesn’t apply to Sourcegraph because their self-host offering requires paying a subscription. You can’t use any form of Sourcegraph on private code (at least not without all the important features being nobbled) without paying a subscription. So there’s no saving to be made from self-hosting Sourcegraph.

pas
0 replies
5h19m

Why? Getting operational experience with the product that you might then pay a lot for seems very important. Especially if you end up liking the product/service but not the pricing changes that might then happen, so doing some exploratory fact finding for a backup plan doesn't seem to be a waste of time.

For example, when we used Jira on-prem it was snappy and we were happy ... and it was a rather important point of difference compared to the slow shitocumulus version.

Also, when people are using GitHub issues to ask questions the problem is usually a lack of clear documentation. (And if spending time to link FAQ answers to potential customers is a waste of time ... then maybe it's not surprising that Sourcegraph CEO is doing damage control on HN instead of focusing on focusing or whatever.)

Cthulhu_
1 replies
5h39m

Yeah while I'm sure the developers that were asked to just grab the code and make it work wasn't their favorite job, I think the bigger one is further down - Github developers being tasked with reverse-engineering an open source product to create a closed source clone.

I would've respected GH more if they just used Sourcegraph and instead spent those developers on improving the open source product itself. But, I suspect that Github / Microsoft would then need a locked down license that e.g. Sourcegraph would forever remain open source, or that GH gets free licenses if they ever went closed source, or whatever.

cdchn
0 replies
5h30m

> Yeah while I'm sure the developers that were asked to just grab the code and make it work wasn't their favorite job, I think the bigger one is further down - Github developers being tasked with reverse-engineering an open source product to create a closed source clone.

They don't want Github to clone their product. They weren't doing it for the Github devs.

jsiepkes
6 replies
7h0m

> Sourcegraph CEO here. We made our main internal codebase (for our code search product) private. We did this to focus.

> There were actually very very few companies that were going to pay but then decided to use the open-source version and not pay us. A lot of people probably assume that's why we made this move, but it's not.

> To give a specific example, within ~2 months of making our code non-open-source last year, we signed a $1M+ ARR deal through a distribution partner that would not have happened if our code was open source.

So the reason these deals are now possible is mainly because time was freed up by not having the code base opensource?

sqs
5 replies
6h56m

> So the reason these deals are now possible is mainly because time was freed up by not having the code base opensource?

No, it's that if all the code is free and open source for anyone, we would not be able to charge for it and there would be no deals. Even if, say, 60% of our product was open-source and 40% was closed source, we might still get a lot of direct customers but would struggle to do distribution partnerships because the distribution partners have outsized incentives and capacity to reimplement the subset of the 40% they think their market needs.

vundercind
4 replies
6h30m

I believe the question came up because the original rationale given was “we did this to focus”, not “we couldn’t sell the code for as much if it was open source”.

sqs
2 replies
6h24m

Both are factors, as I said in my original post (focus and risk).

vundercind
1 replies
6h15m

“We stopped giving away some of our apples due to risk.”

“Of… liability? Or… uh, what?”

“Oh—risk that we couldn’t sell the apples we gave away, obviously.”

sqs
0 replies
6h10m

I was thinking business risk. Sorry it wasn’t clear.

tptacek
0 replies
8m

When a software business makes decisions in the name of "focus", they're usually implicitly saying the "on the stuff that will make the company more money" part. Focus implies product/market fit.

chubot
0 replies
2h53m

I appreciate this answer -- it clears a lot of things up!

breck
8 replies
5h44m

I think the term the industry needs to embrace is "Early Source": https://breckyunits.com/earlySource.html

Make everything public domain, fully open source, just delayed by N years.

breck
2 replies
3h31m

Interesting! I hadn't seen that term. Thanks!

I don't like their implementation though. If one thinks from natural principles, one has to reject the idea of licenses on ideas.

Early source is in harmony with nature.

Also "Early Source" rolls off the tongue better than "Delayed Open Source Publication". ;)

yjftsjthsd-h
1 replies
3h24m

> Also "Early Source" rolls off the tongue better than "Delayed Open Source Publication".

Yeah, but nobody will know what "Early Source" means until you explain it, whereas the latter makes perfect sense on first reading.

breck
0 replies
3h6m

There was a time when no one knew what "Open Source" meant.

dudeinjapan
2 replies
4h10m

Will be lovely to have the source N years after AGI terminates humanity.

breck
1 replies
3h27m

Lol. You could write a sci-fi novel with a world of cyborgs where the age of all cyborgs is N (when they first got access to the source). And primitives are called "Pre-Ns"

dudeinjapan
0 replies
30m

Not too shabby an idea!!

mort96
0 replies
3h45m

Why?

BaculumMeumEst
5 replies
3h22m

This thread reminded me to finally try Cody, I've been bouncing on and off Copilot for a few months. I wish I knew how good this was sooner, and I had no idea there was a generous free tier.

BaculumMeumEst
2 replies
3h9m

My most popular repo is just barely under the cutoff, but I can't advertise it because I'll doxx my shitpost account! Damn! I'll try to apply anyways ;)

jdorfman
1 replies
3h6m

Submit it anyway, I'll approve it.

BaculumMeumEst
0 replies
3h2m

Submission sent! Thanks!

wesleyyue
0 replies
2h38m

If you're open to trying new AI coding assistants, would love if you can give https://double.bot a try! (note: I'm one of the creators) The main philosophical difference is that we are more expensive and are trying to build the best copilot with the technology possible at any given time. For example, we serve a larger, more accurate, and more modern autocomplete model, but it does cost more to serve. We also do a lot of somewhat novel work in getting the details right, like improving the autocomplete model to never screw up closing brackets, and always auto-close them as if you typed them.

mort96
4 replies
4h30m

Huh in what way does publishing a source tarball alongside a release introduce a lot of work, risk and distraction? Your explanation makes literally no sense

EDIT: I implore the downvoters to think about this for a second. You can, actually, publish source code for a project without also committing to providing support and documentation and testing across a variety of systems. Publishing a tarball takes very little time and effort.

collingreen
3 replies
4h22m

Doing a great job on an open source codebase requires a higher level of polish, testing, design, ux, documentation, architecture, and general forethought than internal tools just like any internal vs self serve product.

Only solving your own problems on your own hardware while being able to rely on your own well-informed team to bridge the gaps sounds much much faster and easier to me.

mort96
1 replies
3h45m

Sure but you can publish the source code while only solving your own problems on your own hardware, you're not required to provide support and documentation just to publish source code...

jandrewrogers
0 replies
3h12m

There is a significant intrinsic cost to making code "open source", through the simple act of making that source code available at all. This overhead exists without any regard for what you wish or promise. It invites myriad interactions that cost time and money for little or no offsetting benefit.

Publishing source code, if anyone uses it at all, is not "free" in any sense. I know several people that stopped open sourcing their projects (not even businesses) because the cost of making their code available isn't worth it.

yjftsjthsd-h
0 replies
3h22m

> Doing a great job on an open source codebase requires a higher level of polish, testing, design, ux, documentation, architecture, and general forethought than internal tools just like any internal vs self serve product.

Moving the goalposts to doing a great job internally also requires those things. Meanwhile, doing a perfectly fine job of FOSS requires none of them.

depr
4 replies
4h8m

I hope code search will one day be offered at a lower price, so small/medium sized companies can use the product. I'll never be able to convince someone to buy it when it's 3 or more times as expensive as source code hosting, and would in many cases be the most expensive SaaS product per developer seat that the company uses. But it's a great product.

0x1ch
1 replies
3h59m

$9 to $20 per seat seems pretty average in the grand scheme of SaaS price modelling. I work in IT, though, not software development.

hk__2
0 replies
1h21m

> $9 to $20 per seat seems pretty average in the grand scheme of SaaS price modelling.

"SaaS" is not a feature; you can’t compare products just based on the fact that they are "SaaS". Gitlab for example brings me far more value than a tool to search my codebase; I wouldn’t put the same amount of money in both.

prepend
0 replies
3h59m

I feel the same way. It’s really interesting and provides cool insights. But it seems hard to explain to myself to spend more on that than GitHub or IDEs.

I’d like to hear more about the value customers get out of it as I wonder if it’s just groups with unlimited budget.

beyang
0 replies
3h49m

This is in the cards and thank you for the feedback! (Sourcegraph CTO here)

rapnie
1 replies
6h44m

> (1) Our super popular public code search is at https://sourcegraph.com/search,

Correction: Public code on Github.

This looks to be restricted to searching Github only.. even though it had "context:global" on the querystring every hit came from Github, and none seen from Gitlab, Codeberg, Sourcehut and other self-hosted forges (e.g. Forgejo).

cqqxo4zV46cp
0 replies
3h37m

I’m sure there are 50 other ways you could categorise all the code that it searches. Nobody said that it exhaustively searches all available open-source code. I’m sure you know that that’s an impossible claim. This isn’t a correction at all. It is, at best, an elaboration. Certainly not worthy of the snark you’re giving. The reality is that GitHub hosts >99% of all open-source source code that anyone really cares about. If you have some philosophical issue with it, that’s fine, but don’t shoot the messenger by attacking individuals.

cxr
1 replies
5h42m

Yet another person equivocating the concepts of publishing code under an open source license and managing a project in public.

nearlyepic
0 replies
3h42m

It has to be disingenuous, right? These concepts aren't complicated. I wish they would just say "we want to make more money" and stop polluting open-source discourse.

cryptonector
0 replies
1h20m

For businesses, open source is a business tool. Open source can be a goal, naturally, but for-profit entities have a duty to be profitable (or grow, plowing profits into building). I think there's no shame in saying this. You should not need to be elliptical in your public statements about this move. Everyone knows that this is about protecting your ability to monetize the product, and so it should be, and everyone knows this sort of move comes eventually.

adhamsalama
0 replies
4h53m

Why not go the SQLite way? Open source but don't accept external contributions. Literally just dump the code.

a_t48
0 replies
5h24m

The open/closed decision is a weight on my mind right now. Our main competition is an open source product - it feels like it will be a tough sell to not also have the core of the product be free (robotics framework). I might shoot you an email.

iddan
10 replies
8h51m

I wonder how many of the people commenting here relied on Sourcegraph, and how many actually paid for it. Running an open-source company is hard, just like running any company is. Sometimes you understand there are things you just can't give out for free, and that's part of maturing as a company.

pjmlp
6 replies
8h31m

100% this.

Devs have to learn the hard way to behave like other professionals. Want nice things to stay around?

Pay for the tools.

sunaookami
2 replies
8h6m

Paying doesn't guarantee anything. There are tons of examples of devs selling out even though their program/SaaS is paid.

pjmlp
0 replies
7h47m

Until supermarkets and landlords start taking pull requests as payment, it guarantees more often than not.

alephnerd
0 replies
7h58m

Companies are run based on margins, not just subscriptions.

That individual account you are paying for most likely does not have the RoI needed to manage it due to a mix of larger customers abusing individual accounts to get a discount or individual account users overrepresenting themselves in support tickets, asks, and feature requests.

If you don't like the direction a tool you like is going, go build a competitor and manage it to your liking.

For most products, the revenue skew is 80-20 so if you're not part of the 20% you aren't going to be heard.

hk__2
1 replies
1h7m

This tool is $49/user/mo. That’s more than the price I pay for a single 12-core + 64GB RAM server!

Edit: Ah, and it’s 50 users minimum, so the starting price is $2450/mo.

kstrauser
0 replies
59m

Ouch. That's also well above the threshold where we'd have to get IT approval to use them as a vendor, complete with security reviews, comparison shopping with other vendors, bringing in the legal team to look at the contract, etc. It's not "just" writing a check for $30K and calling it a day.
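
For anyone checking the math, the figures quoted in this thread ($49/user/mo and a 50-seat minimum) work out roughly as in the sketch below. The constant and function names are made up for illustration, and the prices are just what commenters reported here, not an official price list.

```python
# Figures as quoted by commenters in this thread (assumptions, not an official price list).
PRICE_PER_USER_PER_MONTH_USD = 49
MINIMUM_SEATS = 50


def monthly_minimum(price_per_user: int, seats: int) -> int:
    """Smallest possible monthly bill given a per-seat price and a seat minimum."""
    return price_per_user * seats


def annual_minimum(price_per_user: int, seats: int) -> int:
    """The same minimum bill over twelve months."""
    return 12 * monthly_minimum(price_per_user, seats)


if __name__ == "__main__":
    print(monthly_minimum(PRICE_PER_USER_PER_MONTH_USD, MINIMUM_SEATS))  # 2450
    print(annual_minimum(PRICE_PER_USER_PER_MONTH_USD, MINIMUM_SEATS))   # 29400
```

That is $2,450 a month and $29,400 a year, which is where the "$30K" ballpark comes from.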

marcinzm
0 replies
1h51m

You mean pay for your own tools and then get fired for circumventing the corporate security policies on what tools you can use?

CAP_NET_ADMIN
1 replies
7h6m

My company looked into paying Sourcegraph many times in the past, but they were prohibitively expensive every time we checked.

It's 49 USD per user per month for Code Search, like what the hell man? It's more than twice as expensive as Github Enterprise. Almost twice the cost of Gitlab Premium.

At some point it was 100USD per month per dev, I also remember it being "Starts from 5k USD per year", you can find some quotes for that in old submissions regarding Sourcegraph going open, closed, open and closed again.

kstrauser
0 replies
1h2m

That's so often the case. I was recently looking at supply chain security / SBOM software. "Coincidentally", 3 different vendors with 3 very different products quoted us the exact same annual price for the features we wanted, and that price was on the order of magnitude of "hire someone to do this manually full-time".

There are IMO too many companies that have no tier between Free and Enterprise. I understand the desire to focus on a small number of whales, but can't help feeling like that's leaving money on the table from all the smaller companies who'd be willing to pay something in the middle.

josephcsible
0 replies
6m

Pretending to embrace open source while you're getting a foothold and then abandoning it as soon as you become successful isn't "maturing". It's pulling the ladder up behind you.

alin23
7 replies
11h18m

Damn, I use Sourcegraph so much for my reverse engineering efforts on macOS. They index all those private framework symbols that people extract on every macOS release, and allow searching for headers and even how they are called by other developers that were ahead of me.

A big part of https://lunar.fyi exists thanks to Sourcegraph search. Even now I'm using it to find a way to enable the second monitor on M3 MacBooks without needing to close the lid [1].

I really hope this is not a sign of them taking back the ability to search in the future.

[1] https://alinpanaitiu.com/blog/turn-off-macbook-display-clams...

sqs
5 replies
8h26m

Glad you use Sourcegraph! I remember that blog post and thought it was awesome. I am the Sourcegraph CEO, and we haven't changed anything about our public code search at https://sourcegraph.com/search. That's the same product tons of customers use for their internal code, and our public code search is a really important way for us to dogfood, iterate fast, etc.

We just made our own internal codebase private.

alin23
2 replies
8h12m

Ok, so glad to hear that from you directly! Thank you for all the value you’ve put out there for free!

About the codebase part, I don’t have any need for it so I’m not affected by this, but I wonder if it was possible to keep the current state of the code frozen in a public repository and only make private the future work.

That’s how I did it on Lunar, that’s also how the BetterDisplay dev did, it was a good compromise so as to not steal anything that was already free. But of course we don’t have the same business model or licensing needs so I’m pretty sure I’m missing something.

The way I did it is:

- freeze the public code to a new branch “lunar3”

- make a private repo LunarPro which works exactly like the previous Lunar repo

- but on every commit the private repo syncs the code in an encrypted form to the public repo

That way, permalinks remain valid, everything that was free and accessible before is still available in the future and the branch serves as a “compilable” state without any encrypted files.

But again, I’m just one and you’re many, it might get hard to maintain this structure in a team. And some people might still find things to complain about. I know it was that way for me.

sqs
1 replies
7h59m

Yes! We took a snapshot of the code and are keeping it at https://github.com/sourcegraph/sourcegraph-public-snapshot.

I see the point about keeping the old repo name so that links work. That's also mentioned in the blog post. That's a good idea. Let me chat with our team about that.

For your approach with syncing an encrypted form of the private code, why did you need/want to sync it back? Why not just keep a public snapshot as of the switchover date?

alin23
0 replies
7h45m

Oh that’s great to hear then! Good to know there’s a snapshot already.

In my case, I only encrypt the code related to Pro features. There are still plenty of free features and improvements that I add and that I know people will benefit from having searchable (for example, people learned how to use private frameworks like MonitorPanel to change resolutions and presets, how to control Night Shift from Swift, etc.)

And so I needed to still sync some public source code from the private repo. You might not need that, it could be as easy as moving every dev in the team to using sourcegraph/sourcegraph-private

jlokier
1 replies
3h2m

> I am the Sourcegraph CEO, and we haven't changed anything about our public code search at https://sourcegraph.com/search.

But in this other comment (https://news.ycombinator.com/item?id=41298516), you said you have changed public search in two significant ways:

> We did cull lots of non-GitHub repositories and repositories with fewer stars.

Removing low-star repos (and non-GitHub high-star repos) affects users who are looking for obscure or hard-to-find information that's not found anywhere in "popular" repos. I think most of my searches on GitHub (or via Google) are for things in repos with zero stars.

> If you have repositories you want us to add that are below the star threshold [..]

How would I go about finding which repos to request, if my objective is to search the "long tail" for information? That seems like I would need an automated search engine first, to discover the repos :-)

If I found the repo containing specific, obscure or hard-to-find information I was looking for, what would I gain from writing to SourceGraph asking to add that one repo? By the time I've found the right repo, I've probably found the information I'm going to get from it. Future searches will likely need a different repo, one I don't know about yet. Perhaps that's the nature of long tail searches.

mdaniel
0 replies
1h51m

I see this same problem in the search engine space: upstarts need both long-tail sites indexed, and need the ratelimit/compute to actually follow those long tail sites

I wish Presearch <https://nodes.presearch.com/> all the best because that's the world I want to live in, although their current implementation is why I'm just clapping for them and not running nodes:

> This software is currently not open source and you are relying on our assurances that nothing malicious is happening underneath the hood.

Anyway, I mention this because if (e.g.) Sourcegraph supported federated queries, you could actually run a source code indexer to help offset some of the compute, ratelimit, and storage that is presumably jamming up their board members

welder
0 replies
11h16m

I really hope this is not a sign of them taking back the ability to search in the future.

Searching repos seems to be unchanged:

https://sourcegraph.com/search

MzHN
5 replies
9h59m

They also recently(?) silently destroyed[1] their public search index at sourcegraph.com/search. Since GitHub only recently got a working search and even that is behind login, I used to search a lot using Sourcegraph. It even supported searching GitLab.

Now it seems that all GitLab repos are gone from the index and a huge number of GitHub repos as well. If I can't trust the search I'll just have no choice but to fall back to GitHub.

It's a shame since their index was at some point even better than GitHub's own, although GitHub seems to have caught up.

[1] https://community.sourcegraph.com/t/most-public-repos-no-lon...

sqs
2 replies
8h29m

We still have tons of repositories searchable at https://sourcegraph.com/search, almost a million. We did cull lots of non-GitHub repositories and repositories with fewer stars. It was very complex to keep up with millions of repositories due to GitHub rate limits and scaling. We tried to keep as many as possible while still being able to focus on making a good product for customers (our biggest customer has ~600k repositories).

We're still spending millions of dollars annually to offer public code search, so our intent is certainly not to "destroy" it! If you have repositories you want us to add that are below the star threshold, please post at https://community.sourcegraph.com/t/most-public-repos-no-lon....

elashri
0 replies
7h39m

Most of the academic open source projects except big names in scientific computing will not be searchable if you are relying on stars as a criterion.

MzHN
0 replies
6h2m

I appreciate that it is a free service and thank you for the time it worked for me.

At the same time I am a bit sad to see my use cases break. I often resort to more advanced code search when I have really obscure problems, for which the answers might be some old GitHub (or GitLab) repositories. I'm less interested in up-to-date information for those, so a stale index is better than no index for me.

But I can also feel the pain of working with GitHub and GitLab and their rate limits and such.

notpushkin
1 replies
9h47m

https://grep.app/ is another good one. Not sure how many repos they index though.

Alifatisk
0 replies
6h26m

It says half a million git repos on the main page

WesolyKubeczek
4 replies
10h54m

Can’t wait for Steve Yegge putting out a huge article about how this is a great thing and comparing it to TV shows or something.

mannycalavera42
3 replies
8h3m

oh, first time I'm hearing this sentiment against Yegge. Not a specific fan of him but curious: do you have memories / link to similar BS-like statements from this person?

WesolyKubeczek
2 replies
7h2m

Just read his blog post titled "the death of junior programmer" or something like that. Be sure to not have eaten lunch prior to that; there's so much gatekeeping of the bad kind that it's quite vomit-inducing.

dpritchett
0 replies
3h49m

I’d been a huge fan of his for a solid decade, but that post was probably the last one of his I’ll ever read.

BaculumMeumEst
0 replies
3h33m

Really? I thought the content was interesting and it aligned with a lot of my thoughts. The end has a lot of practical advice; any junior who follows it will probably be extremely successful. To that end, I think the title is a little clickbaity. I prefer the glass-half-full view: juniors have more opportunities to learn and improve rapidly than they have ever had before. They can blow past the sea of mediocre seniors who don't bother to keep up.

speedgoose
3 replies
10h26m

It's a bit sad. I forked ~~the last~~ an open-source version some time ago[0]. I removed the telemetry, disabled updates, removed the proprietary code, made a docker image, and implemented some lightweight oauth2/oauth2-proxy authentication.

I plan to keep it running behind Oauth2-Proxy for a long time. It has been very reliable software and because it's behind a supposedly secure proxy, I don't feel bad about not updating it.

[0] https://github.com/SINTEF/sourcegraph

notpushkin
1 replies
9h43m

Thank you for this!

I think 5.0.6 is the last open source version though. Have you considered updating? (Not sure how viable it would be – seems they've moved quite a few things around)

speedgoose
0 replies
9h38m

Oh, my memory failed me. I don't know when I will have time to update, but that sounds like something that could be done!

cdchn
0 replies
6h18m

This is awesome thank you for this.

sixhobbits
3 replies
10h44m

I used to always point to Sourcegraph as a company that really understood dev culture and what it took to make devs happy, so this slow transition has definitely been painful to watch.

Just yesterday someone asked for an example of a public roadmap for a technical product, so I spent some time looking for Sourcegraph's, only to find out that they've also made most of their docs private. The public handbook was an amazing resource before, now it's been moved to Notion, and most of the interesting bits are links to private Google documents (which they used to do only for financial documents and other stuff that obviously needed to stay private).

Sad!

iknownthing
2 replies
5h25m

I interviewed with them once, they strung me along for about 6 months then ghosted me.

mdaniel
1 replies
2h12m

As a counterpoint, they scheduled me within days and I left the office with an offer letter

I'm cognizant that company culture is not one fixed thing, so maybe they were way different when you interacted with them versus when I did. I'm just saying I had the opposite experience, so I doubt it's a trend

iknownthing
0 replies
1h37m

To be clear, I had many interviews with them over that 6 months (10+ I think).

yablak
2 replies
11h25m

Sourcegraph search is amazing. I can point to any hash in our repo and search by regex/path regex. Results are instant and in json format. I hacked together a 'cs' script in bash using the sg cli client and some git calls, as I missed Google's cs command since leaving. Works perfectly, faster than ctags/any local indexer.

sqs
1 replies
7h59m

Awesome to hear! What sucks about Sourcegraph for you, and how can we make it better?

PaulCarrack
0 replies
5h43m

I use Sourcegraph for personal use with my private repos.

In the past, I've found actual bugs that I reported to Justin Dorfman and worked with him to get those bugs assigned to the right engineers so they got fixed before your enterprise customers can experience them.

1. Is there still a path for reporting issues now that the repo has gone private? The Github issue tracker can no longer be accessed or searched through to report bugs or figure out if a bug is known or being worked on.

2. Do you plan to get rid of "Sourcegraph Free" for on premise personal use?

stpn
2 replies
10h34m

As much as I've cited, loved, and recommended Sourcegraph (even going so far as to help run the open source version at a previous co), I never paid a cent for the product.

I'm curious about the line of thinking in leaving open source behind, but it seems somewhat unsurprising in that lens.

sqs
0 replies
8h19m

Thanks! We appreciate you. It was really a focus thing. It added a lot of overhead, lost focus, and risk to have stuff be open source. Most customers weren't telling us it was valuable to them, and frankly we heard very little from people who were using our open-source build. (How could we have gotten your input earlier?)

We still have a lot of open-source code, but ultimately we need to focus on building a great product and making money on it. Which we are doing. :-) As Sourcegraph CEO, I obviously wish we could do all the things, but we gotta stay focused on building a great code search/intelligence product.

PaulCarrack
0 replies
5h20m

I never paid a cent for the product

I would love to contribute and pay, but, as a single personal / private onprem user, it's impossible. It's $49 per user with a 50 user minimum.

Sourcegraph doesn't make it possible to contribute in that circumstance.

reedf1
2 replies
11h23m

What happened to sourcegraph is very sad. It was a great tool, and the kind of software you wish the apache foundation was managing.

I've been looking for alternatives - any recommendations?

notpushkin
0 replies
10h45m

For code search, I've heard Hound is pretty good but I haven't personally tried it yet. The UI is a bit clunky though. I'm wondering if one can port the old Apache-licensed Sourcegraph UI? https://github.com/hound-search/hound

mdaniel
0 replies
1h59m

I've been looking for alternatives

Just out of curiosity, is code search something you need to get bleeding edge updates? What's wrong with running the pre-rug-pull release? There even seems to be a pseudo community fork that added features: https://news.ycombinator.com/item?id=41297879

zeroCalories
1 replies
2h3m

Open source? More like trojan horse. Nothing is "open" unless it has a GNU licence.

josephcsible
0 replies
4m

Those licenses themselves aren't even sufficient to protect against this, since copyright holders don't have to follow their own licenses. To be fully safe from this kind of rug pull, the project also needs to accept substantial external contributions without a CLA, like the Linux kernel does.

throwaway984393
1 replies
5h47m

Don't start your company as open source. It will attract the wrong customers (the ones who don't give you money but sure like to complain) and detract from building your product. If you can build a successful product, open source it afterwards to give you a competitive advantage without being a drag.

mobeigi
0 replies
2h21m

Agreed. I generally don't have an issue with companies private sourcing their work when they are the dominant contributor to it. I have a much bigger problem with companies that do it that have benefited dominantly from the contributions of others. For example, imagine if the Linux kernel went private.

throwaway290
1 replies
8h7m

Preparing to get bought by Microsoft?

mdaniel
0 replies
1h49m

I dunno the underlying tone to this message, but given the absolutely staggering number of MIT licensed repos under github.com/Microsoft I don't know that such a thing would be as bad as you envision

Now that GitLab is public, that's who I'd want to buy them, and there's precedent since (a) GitLab has already swallowed quite a few open source dev-tooling companies and (b) they already have a Sourcegraph integration so it would be very low drama to just fold it into the core offering and get rid of that stupid Elasticsearch dep

EMIRELADERO
1 replies
10h55m

Straight-up making all dev work private is very weird and perplexing. Why would their business model (which they've had for some time, mind you) require not only a proprietary/"open core" license, which I would understand, but complete secrecy around source code itself? What business goal couldn't be accomplished with licensing restrictions alone? And is that difference in potential income generated by this new secret-requiring business model so big that it justifies throwing away the entire "open nature" of the company that has been a core value for most of its existence?

jsiepkes
0 replies
10h7m

I've seen this multiple times with companies. Another example which went fully closed in an instant is ForgeRock (OpenAM, etc.). Usually it happens when management caves in to complaints from sales, who will complain that being open makes selling the product hard. In the end they will probably find out it's just the salespeople's "excuse du jour" and even after closing the source they still don't hit their targets.

CAP_NET_ADMIN
1 replies
7h15m

I'd like to point to the previous episode in this YA drama series:

"Sourcegraph is no longer open source" by me, from last year

https://news.ycombinator.com/item?id=36584656

PaulCarrack
0 replies
3h10m

Right when I saw this post I immediately thought of your post from last year, some great discussion there.

These days on HN, anytime I see a post about Sourcegraph my knee-jerk reaction is to wince because I know it's probably not a good thing.

It will be a sad day for tech when they get rid of the on prem free version. I feel like that's the next logical thing to cut given the direction and momentum they are heading.

solarkraft
0 replies
10h45m

That seems silly. Hope there will be an official statement (apology? lol).

sluongng
0 replies
10h10m

Kinda weird because they have already relicensed the entire repo recently. I wonder what problem they are trying to solve with a private repo.

kkukshtel
0 replies
2h27m

Build cultural capital on being loud about how you're open source, vibes, and a cool podcast. Pivot aggressively and crumble to realistic business challenges when the core business model (code search) gets eaten by GitHub (their code search), then try to act like nothing happened while pushing the new product and gaslighting existing users as you try to convince them they actually want Cody, not the code search. Which is to say, unsurprising.

fire_lake
0 replies
9h2m

They’re probably courting a buyer.

corroclaro
0 replies
2h3m

Another group of source snakes to add to the collaboration/purchase/business blacklist. Quinn S., Beyang L., etc. are individuals happy to ride on FOSS until they're big enough to cash out. OK. Just be upfront about it. "We did this to focus" - no, you did it to make more money. Jesus, be honest - you're talking to developers, not your investors, we can smell the BS from across the Atlantic.

afro88
0 replies
7h41m

All documents were public by default. Technical and product RFCs (and later PR/FAQs) were drafted, reviewed, and catalogued in a public Google Drive folder

Does this still exist somewhere?

JZL003
0 replies
2h27m

I guess I wish it was still open but want to reiterate how appreciative I am for the public free search. It's so amazingly useful while doing CS research to search through all of GitHub with regex that way

AYBABTME
0 replies
9h39m

The Sourcegraph folks are great. I think these days are a brutal period for startups. I can only guess how things are going. Just yesterday FT.com was publishing "Start-up failures rise 60% as founders face hangover from boom years"[1].

Like Cockroach's recent relicensing, I think we should be thankful for the good years and awesome stuff the last boom era brought, and not be too harsh on the principled founders who now find themselves having to make hard decisions. They're responsible to a lot of people at the end of the day - investors but also employees. Just crashing the whole thing to make a moral statement would be dumb. Employees also count on execs to care for them.

If startups have to make hard decisions to keep things afloat, it's the right thing to do.

** I'm extrapolating a lot here from this post, for all I know things may be rosy at Sourcegraph, idk!

[1]: https://www.ft.com/content/2808ad4c-783f-4475-bcda-bddc02990...