
Cyc: History's Forgotten AI Project

blueyes
28 replies
2d19h

Cyc is one of those bad ideas that won't die, and which keeps getting rediscovered on HN. Lenat wasted decades of his life on it. Knowledge graphs like Cyc are labor intensive to build and difficult to maintain. They are brittle in the face of change, and useless if they cannot represent the underlying changes of reality.

breck
10 replies
2d18h

I think before 2022 it was still an open question whether it was a good approach.

Now it's clear that knowledge graphs are far inferior to deep neural nets, but even still few people can explain the _root_ reason why.

I don't think Lenat's bet was a waste. I think it was sensible based on the information at the time.

The decision to research it largely in secret, closed source, I think was a mistake.

xpe
4 replies
2d18h

Now it's clear that knowledge graphs are far inferior to deep neural nets

No. It depends. In general, two technologies can’t be assessed independently of the application.

og_kalu
3 replies
2d16h

Give GOFAI anything other than clear definitions and unambiguous axioms (which is to say, most of the real world) and it falls apart. It can't even be done. There's a reason it was abandoned in NLP long before the likes of GPT.

There isn't any class of problems deep nets can't handle. Will they always be the most efficient or best-performing solution? No, but a solution will be possible.

mepian
2 replies
2d15h

They should handle the problem of hallucinations then.

og_kalu
0 replies
2d13h

Bigger models hallucinate less.

And while we don't call them hallucinations, GOFAI mispredicts plenty.

eru
0 replies
2d13h

They are working on it. And current large language models (e.g. transformers) aren't the only way to do AI with neural networks, nor are they the only way to do AI with statistical approaches in general.

Cyc also has the equivalent of hallucinations, when their definitions don't cleanly apply to the real world.

galaxyLogic
2 replies
2d7h

I assume the problem with symbolic inference is that from a single inconsistent premise, logic can produce any statement whatsoever.

If that is so, then symbolic AI does not easily scale, because you cannot feed inconsistent information into it. Compare this to how humans and LLMs learn: both have no problem with inconsistent information. Yet, statistically speaking, humans can easily produce "useful" information.
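
That is the principle of explosion (ex falso quodlibet). A minimal classical derivation, where Q can be any statement whatsoever:

    \begin{align*}
    &1.\ P          && \text{premise (one half of the contradiction)} \\
    &2.\ \lnot P    && \text{premise (the other half)} \\
    &3.\ P \lor Q   && \text{from 1, disjunction introduction} \\
    &4.\ Q          && \text{from 2 and 3, disjunctive syllogism}
    \end{align*}

Paraconsistent logics typically block the step from 2 and 3 to 4, which is how they avoid the explosion.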

genrilz
1 replies
2d4h

Based on the article, it seems like Cyc had ways to deal with inconsistency. I don't know the details of how they did it, but paraconsistent logics [0] provide a general way to prevent any statement from being provable from an inconsistency.

[0] https://en.wikipedia.org/wiki/Paraconsistent_logic

galaxyLogic
0 replies
1d23h

Interesting, from the article:

"Mathematical framework and rules of paraconsistent logic have been proposed as the activation function of an artificial neuron in order to build a neural network"

xpe
1 replies
2d18h

but even still few people can explain the _root_ reason why.

_The_ (one) root reason? Ok, I’ll bite.

But you need to define your claim. What application?

breck
0 replies
1d2h

_The_ (one) root reason? Ok, I’ll bite.

A "secret" hiding in plain sight.

thesz
7 replies
2d18h

Lenat was able to produce a superhuman-performing AI in the early 1980s [1].

[1] https://voidfarer.livejournal.com/623.html

You can label it "bad idea" but you can't bring LLMs back in time.

goatlover
6 replies
2d15h

Why didn't it ever have the impact that LLMs are having now? Or that DeepMind has had? Cyc didn't pass the Turing Test or become a superhuman chess or Go player. Yet it's had much more time to become successful.

eru
2 replies
2d13h

I'm very skeptical of Cyc and other symbolic approaches.

However I think they have a good excuse for 'Why didn't it ever have the impact that LLMs are having now?': lack of data and lack of compute.

And it's the same excuse that neural networks themselves have: back in those days, we just didn't have enough data, and we didn't have enough compute, even if we had the data.

(Of course, we learned in the meantime that neural networks benefit a lot from extra data and extra compute. Whether that can be brought to bear on Cyc-style symbolic approaches is another question.)

thesz
1 replies
2d11h

Usually, an LLM's output gets passed through beam search [1], which is as symbolic as one can get.

[1] https://www.width.ai/post/what-is-beam-search

Even a 3-gram model can output better text predictions if you combine it with beam search.
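
A minimal sketch of the idea in Python; toy_score is an illustrative stand-in for a real 3-gram model (or an LLM's next-token scores):

    import math

    def beam_search(score_next, vocab, start, max_len=8, beam_width=3):
        # score_next(seq, tok) -> log P(tok | seq); any model that can
        # score a next token plugs in here: a 3-gram model or an LLM.
        beams = [([start], 0.0)]  # (sequence, cumulative log-probability)
        for _ in range(max_len):
            candidates = [(seq + [tok], logp + score_next(seq, tok))
                          for seq, logp in beams for tok in vocab]
            # the symbolic part: keep only the beam_width best hypotheses
            beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        return beams[0][0]

    def toy_score(seq, tok):  # toy scorer that prefers alternating tokens
        return math.log(0.9) if tok != seq[-1] else math.log(0.1)

    print(beam_search(toy_score, vocab=["a", "b"], start="a"))
    # -> ['a', 'b', 'a', 'b', 'a', 'b', 'a', 'b', 'a']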

thesz
0 replies
2d11h

I guess it is funding. Compare the funding of Google+Meta to what was/is available to Cyc.

Cyc was able to produce an impact; I keep pointing to MathCraft [1], which, as of 2017, did not have a rival in neural AI.

[1] https://en.wikipedia.org/wiki/Cyc#MathCraft

riku_iki
0 replies
2d

Why didn't it ever have the impact that LLMs are having now?

Google has its own Knowledge Graph, with billions of daily views, which is a wider but shallower version of Cyc. It is unclear whether the user-facing impact of LLMs has surpassed that project.

Barrin92
0 replies
2d11h

Because the goal of Cyc was always to build a genuinely generally intelligent common-sense reasoning system, and that is a non-sexy, long-term research project. Being great at chess isn't a measure that's relevant for a project like this; Stockfish is superhuman at chess.

zopf
5 replies
2d19h

I wonder to what degree an LLM could now produce frames/slots/values in the knowledge graph. With so much structure already existing in the Cyc knowledge graph, could those frames act as the crystal seed upon which an LLM could crystallize its latent knowledge about the world from the trillions of tokens it was trained upon?
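
For concreteness, a minimal sketch of what that seeding might look like; the Frame class and slot names are illustrative, not actual CycL:

    from dataclasses import dataclass, field

    @dataclass
    class Frame:
        name: str
        isa: list[str]
        slots: dict[str, str] = field(default_factory=dict)

    # a seed frame whose slots were curated by hand...
    seed = Frame("Penguin", isa=["Bird"],
                 slots={"canFly": "False", "habitat": "Antarctica"})

    # ...and a prompt asking an LLM to crystallize more of its latent
    # knowledge onto that seed as new slot-value pairs
    prompt = (f"Known facts about {seed.name} (isa {', '.join(seed.isa)}): "
              f"{seed.slots}. Propose three more slot-value pairs as JSON.")
    print(prompt)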

tkgally
4 replies
2d19h

I had the same thought. Does anybody know if there have been attempts either to incorporate Cyc-like graphs into LLM training data or to extend such graphs with LLMs?

radomir_cernoch
1 replies
2d18h

From time to time, I read articles on the boundary between neural nets and knowledge graphs, like the recent [1]. Sadly, no mention of Cyc.

My bet, judging mostly from my failed attempts at playing with OpenCyc around 2009, is that Cyc has always been too closed and too complex to tinker with. That doesn't play nicely with academic work. When people finish their PhDs and start working for OpenAI, they simply don't have Cyc in their toolbox.

[1] https://www.sciencedirect.com/science/article/pii/S089360802...

viksit
0 replies
2d18h

Oh, I just commented elsewhere in the thread about our work integrating frames and slots into LSTMs a few years ago! Second this.

thesz
0 replies
2d17h

There are lattice-based RNNs applied as language models.

In fact, if you have a graph and a path-weighting model (RNN, TDCNN or Transformer), you can use beam search to evaluate paths through graphs.

mike_hearn
0 replies
2d9h

The problem is not one of KB size. The Cyc KB is huge. The problem is that the underlying inferencing algorithms don't scale whereas the transformer algorithm does.

xpe
0 replies
2d18h

The comment above misses the point in at least four ways. (1) Being aware of history is not the same as endorsing it. (2) Knowledge graphs are useful for many applications. (3) How narrow a mindset and how much hindsight bias must one have to claim that Lenat wasted decades of his life? (4) Don’t forget to think about this in the context of what was happening in the field of AI at the time.

richardatlarge
0 replies
2d8h

We are living in the future /

I'll tell you how I know /

I read it in the paper /

Fifteen years ago -

(John Prine)

mindcrime
0 replies
2d

They are brittle in the face of change, and useless if they cannot represent the underlying changes of reality.

FWIW, KGs don't have to be brittle. Or, at least, they don't have to be as brittle as they've historically been. There are approaches (like PR-OWL [1]) to making graphs probabilistic, so that they're asserting subjective beliefs about statements instead of absolute statements. The strength of those beliefs can then increase or decrease in response to new evidence (per Bayes' theorem). Probably the biggest problem with this stuff is that it tends to be crazy computationally expensive.

Still, there's always the chance of an algorithmic breakthrough, or just hardware improvements, bringing some of this stuff into the realm of the practical.

[1]: https://www.pr-owl.org/
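
Not PR-OWL itself, but a minimal sketch of the underlying Bayesian update; the numbers are made up for illustration:

    def bayes_update(prior, p_e_given_h, p_e_given_not_h):
        # P(H|E) = P(E|H)P(H) / (P(E|H)P(H) + P(E|~H)P(~H))
        numer = p_e_given_h * prior
        return numer / (numer + p_e_given_not_h * (1.0 - prior))

    # Belief in "Tweety can fly" starts high (most birds fly); observing
    # "Tweety is a penguin" -- far likelier if Tweety can't fly -- merely
    # weakens the belief instead of producing a hard contradiction.
    belief = 0.9
    belief = bayes_update(belief, p_e_given_h=0.01, p_e_given_not_h=0.5)
    print(round(belief, 3))  # 0.153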

mepian
23 replies
2d21h

I wonder what is the closest thing to Cyc we have in the open source realm right now. I know that we have some pretty large knowledge bases, like Wikidata, but what about expert system shells or inference engines?

observationist
9 replies
2d20h

OWL and SPARQL inference engines that use RDF and DSMs - there are LISPy variants like datadog still kicking around, but there are some great, high performance reasoner FOSS projects, like StarDog or Neo4j

https://github.com/orgs/stardog-union/

Looks like "Knowledge Graph" and "semantic reasoner" are the search terms du jour; I haven't tracked these things since OpenCyc stopped being active.

Humans may not be able to effectively trudge through the creation of trillions of little rules and facts needed for an explicit and coherent expert world model, but LLMs definitely can be used for this.

zozbot234
6 replies
2d20h

You can actually do "inference" or "deduction" over large amounts of data using any old-fashioned RDBMS, and get broadly equal or better performance than the newfangled "graph" based systems. Graph databases may be a clear win for very specialized network analysis, but that is rare in practice.
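
For example, a transitive-closure "deduction" over plain SQLite via a recursive CTE; the schema and facts are made up for illustration:

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE isa (sub TEXT, sup TEXT)")
    db.executemany("INSERT INTO isa VALUES (?, ?)", [
        ("penguin", "bird"), ("bird", "animal"), ("animal", "living_thing")])

    # derive every ancestor of 'penguin' by chaining isa edges
    rows = db.execute("""
        WITH RECURSIVE ancestor(x) AS (
            SELECT sup FROM isa WHERE sub = 'penguin'
            UNION
            SELECT isa.sup FROM isa JOIN ancestor ON isa.sub = ancestor.x)
        SELECT x FROM ancestor""").fetchall()
    print([r[0] for r in rows])  # ['bird', 'animal', 'living_thing']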

p_l
5 replies
2d20h

Graph databases win in flexibility and ease of writing the queries over graphs, honestly.

Of course the underlying storage can be (and often is) a bunch of specially prepared relational tables.

But the strength in graph databases comes from restating the problem in different way, with query languages targeting the specific problem space.

Similarly there are tasks where SQL will be plainly better.

totetsu
2 replies
2d19h

Every time I try to write a query for GitHub’s GraphQL API I lose a few hours and go back to REST. Maybe it’s easy if all the edges and inputs are actually implemented in ways you would expect.

p_l
1 replies
2d19h

GraphQL isn't exactly a proper graph database query language. The name IIRC comes from the Facebook Graph API, and the language isn't actually designed as a graph database interface.

totetsu
0 replies
2d10h

Thanks for the correction

zozbot234
1 replies
2d20h

ease of writing the queries over graphs, honestly

The SQL standard now includes syntactic sugar for 'Property Graph Query'. Implementations are still in the works AIUI, but can be expected in the reasonably near future.

p_l
0 replies
2d19h

Having seen PGQL, I think I'll stay with SPARQL.

And for efficient implementation the database underneath still needs extended graph support. (In fact, I find it hilarious that Oracle seems to be spearheading it, as they previously canceled their graph support around 2012 - enough that I wrote about how it was deprecated and removed from support in my thesis in 2014.)

riku_iki
0 replies
2d13h

high performance reasoner FOSS projects, like StarDog or Neo4j

StarDog is not FOSS; that GitHub repo is for various utils around their proprietary package, in my understanding. The actual engine code is not open source.

justinhj
0 replies
2d9h

Did you mean Datalog here?

breck
3 replies
2d18h

I tried to make something along these lines (https://truebase.treenotation.org/).

My approach, Cyc's, and others are fundamentally flawed for the same reason. There's a low level reason why deep nets work and symbolic engines are very bad.

eschaton
2 replies
1d12h

And what is that reason?

breck
1 replies
1d2h

The language before language.

eschaton
0 replies
22h55m

Care to expand on this, provide evidence, or even pointers to what you mean by it?

tunesmith
2 replies
2d18h

At a much lower level, I've been having fun hacking away at my Concludia side project over time. It's purely at the proposition level, and will eventually support people creating their own arguments and contesting others'. http://concludia.org/

sundarurfriend
0 replies
2d14h

Nice! I've wanted to build something like this for a long time. It requires good faith argument construction from both parties, but it's useful to make the possibility available when you do find the small segment of people who can do that.

jnwatson
0 replies
2d11h

Very cool! I've had this idea for 20 years. I'm glad I didn't get around to making it.

gumby
2 replies
2d20h

Err…OpenCyc?

mindcrime
0 replies
2d14h

Yep. And it may be just a subset, but it's pretty much the answer to

"I wonder what is the closest thing to Cyc we have in the open source realm right now?".

See:

https://github.com/therohk/opencyc-kb

https://github.com/bovlb/opencyc

https://github.com/asanchez75/opencyc

Outside of that, you have the entire world of Semantic Web projects, especially things like UMBEL[1], SUMO[2], YAMATO[3], and other "upper ontologies"[4] etc.

[1]: https://en.wikipedia.org/wiki/UMBEL

[2]: https://en.wikipedia.org/wiki/Suggested_Upper_Merged_Ontolog...

[3]: https://ceur-ws.org/Vol-2050/FOUST_paper_4.pdf

[4]: https://en.wikipedia.org/wiki/Upper_ontology

Rochus
0 replies
2d20h

Unfortunately no longer (officially) available; and only a small subset anyway.

nextos
1 replies
2d20h

There are some pretty huge ontology DBs in molecular biology, like GO or Reactome.

But they have never truly exploited logic-based inference, except for some small academic efforts.

jerven
0 replies
2d9h

I think that GO with GO-CAM is definitely going that way. Basic GO is rather simple and can't infer that much (as in, GO by itself has little classification or inference logic built in). Uberon, for anatomy, does use a lot of OWL power and shows that logic-based inference can help a lot.

Reactome is a graph, because that is the domain. But technically it does little with that fact (in my disappointed opinion).

Given that GO and Reactome are also relatively small academic efforts in general...

niviksha
0 replies
2d20h

There are a few symbolic logic entailment engines that run atop OWL, the Web Ontology Language, some flavors of which are a rough equivalent of Cyc's KBs. The challenge, though, is that the underlying approaches are computationally hard, so nobody really uses them in practice; plus the retrieval language associated with OWL is SPARQL, which also has little traction.

gumby
13 replies
2d20h

This is a pretty good article.

I was one of the first hires on the Cyc project when it started at MCC and was at first responsible for the decision to abandon the Interlisp-D implementation and replace it with one I wrote on Symbolics machines.

Yes, back then one person could write the code base, which has long since grown and been ported off those machines. The KB is what matters anyway. I built it so different people could work on the kb simultaneously, which was unusual in those days, even though cloud computing was ubiquitous at PARC (where Doug had been working, and I had too).

Neurosymbolic approaches are pretty important and there’s good work going on in that area. I was back in that field myself until I got dragged away to work on the climate. But I’m not sure that manually curated KBs will make much of a difference beyond bootstrapping.

pfdietz
7 replies
2d20h

Is Cyc still implemented in Lisp?

pfdietz
5 replies
2d19h

Interesting that they're still using Allegro Common Lisp. I would be interested in knowing what technical issues (if any) prevented them from migrating to other implementations.

guenthert
1 replies
2d11h

Out of curiosity, which implementation(s) did you have in mind and why would it be desirable to migrate a large project there?

pfdietz
0 replies
2d8h

Steel Bank Common Lisp is the most performant (by speed of compiled code) and stable open-source implementation, and arguably the most standards-compliant and bug-free of any Common Lisp implementation, free or otherwise. It is also available free of charge.

I mostly wanted to know of any technical obstacles so SBCL could be improved. If I had to wildly guess, maybe GC performance? SBCL was behind ACL on that many years ago (on both speed and physical memory requirements) the last time I made a comparison.

lispm
0 replies
2d11h

Maybe it's not "technical issues", but features and support? Allegro CL has a proven GUI toolkit, for example, and now they moved it into the web browser.

FYI: here are the release notes of the recently released Allegro CL 11.0: https://franz.com/support/documentation/current/release-note...

IIRC, Cyc gets delivered on other platforms & languages (C, JVM, ...?). Would be interesting to know what they use for deployment/delivery.

jonathankoren
0 replies
2d12h

Allegro has an amazing UI and debugger. Like, seriously, every alternative is a janky experience that should be embarrassed to exist.

dreamcompiler
0 replies
2d15h

Franz has a nice RDF triple store. No idea if Cyc uses it but if so that could be a factor.

guenthert
2 replies
2d12h

which was unusual in those days, even though cloud computing was ubiquitous at PARC

I don't want to rob you of your literary freedom, but that threw me off. Mainframes were meant, yes?

gumby
1 replies
2d6h

Mainframes were meant, yes?

No not at all. We’re talking early-mid 1980s so people in the research community (at least at the leading institutions) were by then pretty used to what’s called cloud computing these days. In fact the term “cloud” for independent resources you could call upon without knowing the underlying architecture came from the original Internet papers (talking originally about routing, and then the DNS) in the late 70s

So for example the mail or file or other services at PARC just lived in the network; you did the equivalent of an anycast to check your mail or look for a file. These had standardized APIs so it didn’t matter if you were running Smalltalk, Interlisp-D, or Cedar/Mesa you just had a local window into a general computing space, just as you do today.

Most was on the LAN, of course, as the ARPANET was pretty slow. But when we switched to TCP/IP the LAN/WAN boundaries became transparent and instead of manually bouncing through different machines I could casually check my mail at MIT from my desk at PARC.

Lispms were slightly less flexible in this regard back then, but then again Ethernet started at PARC. But even in the late 70s it wasn’t weird to have part of your computation run on a remote machine you weren’t logged into interactively.

The Unix guys at Berkeley eventually caught up with this (just look at the original sockets interface, very un-unixy) but they didn’t quite get it: I always laughed when I saw a Sun machine running sendmail rather than trusting the network to do the right thing on its behalf. By the time Sun was founded that felt paleolithic to me.

Because I didn’t start computing until the late 70s I pretty much missed the whole removable media thing and was pretty much always network connected.

p_l
0 replies
1d19h

Sockets were essentially a crash development program to deal with TOPS-20 being discontinued by DEC

stcredzero
0 replies
2d2h

I was one of the first hires on the Cyc project when it started at MCC and was at first responsible for the decision to abandon the Interlisp-D implementation and replace it with one I wrote on Symbolics machines.

Yes, back then one person could write the code base

A coworker of mine who used to work at Symbolics told me that this was endemic with Lisp development back in the day. Some customers would think there was a team of 300 doing the OS software at Symbolics. It was just 10 programmers.

m463
0 replies
1d16h

I remember reading about Cyc.

I had learned about "AI" in the '80s. The promise was that it would be achieved with Lisp and expert systems and Prolog and more.

The article said Cyc was reading the newspaper every day.

I thought, wow, any day now computers will leap forward. Japanese 5th-generation computing will be left in the dust. :)

blacklion
8 replies
2d5h

I was born in the late USSR, and my father is a software engineer. We had several books that were not available to the "general public" (they were intended for the libraries of science institutions). One of the books was, as I understand now, an abridged translation of papers from some "Western" AI conference.

And there was a description of EURISKO (with claims that it not only "won some game" but also "invented a new structure of NAND gate in silicon, used by industry now") and other expert systems.

One of the mentioned expert systems (described without technical details) was said to be 2 times better at diagnosing cancer than the best human diagnostician of some university hospital.

And after that... Silence.

I always wonder: why was this expert system not deployed in all USA hospitals, for example, if it was so good?

Now we have LLMs, but they are LANGUAGE models, not WORLD models. They predict the distribution of possible next words. Same with images — pixels, not world concepts.

Looks like such systems are good for generating marketing texts, but cannot be used as diagnosticians by definition.

Why are all these (slice-of-)world-model approaches dead? Except Cyc, I think. Why do we have good text generators and image generators, but not diagnosticians, 40 years later? What happened?

og_kalu
2 replies
2d3h

Looks like such systems are good for generating marketing texts, but cannot be used as diagnosticians by definition.

That's not true

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10425828/

Why are all these (slice-of-)world-model approaches dead?

Because they don't work

hello_computer
0 replies
2d1h

Your article is a review of one language model (GPT-4) in diagnostics, while the original comment was an inquiry about world models.

eschaton
0 replies
1d13h

How do you know?

hello_computer
0 replies
2d1h

I've read similar things about image models from 12 years ago beating the pants off most radiologists. I think the difference is that most writers, illustrators, musicians, drivers, etc. eke out a marginal living, while radiologists have enough reserves to fight back. The "move fast and break things" crowd in silicon valley isn't going to undertake that fight while there's still so much low-hanging fruit, ripe for the harvest.

chris_st
0 replies
2d3h

I started my career in 1985, building expert systems on Symbolics Lisp machines in KEE and ART.

Expert systems were so massively oversold... and it's not at all clear that any of the "super fantastic expert" systems ever did what was claimed of them.

We definitely found out that they were, in practice, extremely difficult to build and make do anything reasonable.

The original paper on Eurisko, for instance, mentioned how the author (and founder of Cyc!) Douglas Lenat, during a run, went ahead and just hand-inserted some knowledge/results of inferences (it's been a long while since I read the paper, sorry), asserting, "Well, it would have figured these things out eventually!"

Later on, he wrote a paper titled, "Why AM and Eurisko appear to work" [0].

0: https://aaai.org/papers/00236-aaai83-059-why-am-and-eurisko-...

brain_elision
0 replies
2d4h

One of the first things software engineers learn is that people are bad at manually building models/programming.

The language and image models weren't built by people but by observing an obscene number of people going about their daily lives producing text and images.

BlueTemplar
0 replies
2d1h

Cyc was used by the Cleveland Clinic for answering ad hoc questions from medical researchers; it reduced the time from as long as a month of manual back-and-forth between medical and database experts, to less than an hour.

Rochus
8 replies
2d20h

Interesting article, thanks.

Perhaps their time will come again.

That's pretty certain, once the hype about LLMs has calmed down. I hope that Cyc's data will still be available then, ideally open-source.

https://muse.jhu.edu/pub/87/article/853382/pdf

Unfortunately paywalled; does anyone have a downloadable copy?

og_kalu
5 replies
2d16h

the hype about LLMs has calmed down

The hype of LLMs is not the reason the likes of Cyc have been abandoned.

Rochus
4 replies
2d15h

It's not "abandoned"; it's just that most money today goes into curve fitting. But there are features that can be better realized with production systems, e.g. explainability or causality.

og_kalu
3 replies
2d13h

it's just that most money today goes into curve fitting

It's pretty interesting to see comments like this, as if deep nets weren't the underdog for decades. You think they were the first choice? The creator of Cyc spent decades on it, and he's dead. We use modern NNs today because they just work that much better.

GOFAI was abandoned in NLP long before the likes of GPT because the non-deep-net alternatives just sucked that much. It has nothing to do with any recent LLM hype.

If the problem space is without clear definitions and unambiguous axioms then non deep-net alternatives fall apart.

eru
1 replies
2d13h

If the problem space is without clear definitions and unambiguous axioms then non deep-net alternatives fall apart.

I'm not sure deep-nets are the key here. I see the key as being lots of data and using statistical modeling. Instead of trying to fit what's happening into nice and clean black-and-white categories.

Btw, I don't even think Gofai is all that good at domains with clear definitions and unambiguous axioms: it took neural nets to beat the best people at the very clearly defined game of Go. And neural net approaches have also soundly beaten the best traditional chess engines. (Traditional chess engines have caught up a lot since then. Competition is good for development, of course.)

I suspect part of the problem for Gofai is that all the techniques that work are re-labelled to be just 'normal algorithms', like A* or dynamic programming etc, and no longer bear the (Gof) AI label.

(Tangent: that's very similar to philosophy. Where every time we turn anything into a proper science, we relabel it from 'natural philosophy' to something like 'physics'. John von Neumann was one of these recent geniuses who liberated large swaths of knowledge from the dark kludges of the philosophy ghetto.)

radomir_cernoch
0 replies
2d12h

I very much agree about the A* idea, but this idea

Tangent: that's very similar to philosophy.

doesn't click with me. Maybe you could elaborate a bit, or provide an example, please?

Rochus
0 replies
2d8h

Curve-fitting has demonstrated impressive results, but it's definitely not the end of science.

Rochus
0 replies
2d18h

Great, thank you very much!

ragebol
4 replies
2d12h

Are there any efforts to combine a knowledge base like Cyc with LLMs and the like? Something like RAG could benefit, I suppose.

Have some vector for a concept match a KB entry, etc. IDK :).
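
A minimal sketch of that vector-matching idea; the embeddings here are random stand-ins for a real embedding model:

    import numpy as np

    rng = np.random.default_rng(0)
    kb_entries = ["Penguin", "Bird", "Antarctica"]
    kb_vecs = {e: rng.normal(size=8) for e in kb_entries}  # stand-in embeddings

    def nearest_entry(concept_vec):
        # pick the KB entry with the highest cosine similarity
        def cos(a, b):
            return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
        return max(kb_vecs, key=lambda e: cos(concept_vec, kb_vecs[e]))

    # a noisy copy of the 'Penguin' vector still matches its source entry
    print(nearest_entry(kb_vecs["Penguin"] + 0.1 * rng.normal(size=8)))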

mindcrime
2 replies
2d3h

Are there any efforts to combine a knowledge base like Cyc with LLMs and the like?

Yes. It's something I've been working on, so there's at least 1 such effort. And I'm reasonably sure there are others. The idea is too obvious for there to not be other people pursuing it.

ragebol
1 replies
2d3h

Too obvious indeed. Can I read anywhere about what you're working on? What other approaches exist?

mindcrime
0 replies
2d3h

Can I read anywhere about what you're working on?

Not yet. It's still early days.

What other approaches exist?

Loosely speaking, I'd say this entire discussion falls into the general rubric of what people are calling "neuro-symbolic AI". Now within that there are a lot of different ways to try and combine different modalities. There are things like DeepProbLog, LogicTensorNetworks, etc.

For anybody who wants to learn more, consider starting with:

https://en.wikipedia.org/wiki/Neuro-symbolic_AI

and the videos from the previous two "Neurosymbolic Summer School" events:

https://neurosymbolic.github.io/nsss2023/

https://www.neurosymbolic.org/summerschool.html (2022)

brendonwong
0 replies
1d16h

This is a significant part of my vision for Web 10! https://www.web10.ai/p/web-10-in-under-10-minutes

One of the immediate things I'm working on is a text to knowledge graph system. Yohei (creator of BabyAGI) is also working on text to knowledge graphs: https://twitter.com/yoheinakajima/status/1769019899245158648. LlamaIndex has a basic implementation.

This isn't quite connecting the system to an automated reasoner though. There is some research in this area, like: https://news.ycombinator.com/item?id=35735375

Cyc + LLMs is vaguely related to more advanced "cognitive architectures" for AI, for instance see the world model in Davidad's architecture, which LLMs can be used to help build: https://www.lesswrong.com/posts/jRf4WENQnhssCb6mJ/davidad-s-...

TrevorFSmith
4 replies
2d17h

Has Cyc been forgotten? Maybe it's unknown to tech startup hucksters who haven't studied AI in any real way, but it's a well-known project among both academic and informed industry folks.

nextos
3 replies
2d16h

Probably. IMHO there is a lot of low-hanging fruit for startups in the field of symbolic AI applied to biology and medicine.

Bonus points if that is combined with modern differentiable methods and SAT/SMT, i.e. neurosymbolic AI.

riku_iki
1 replies
2d14h

IMHO there is a lot of low-hanging fruit for startups in the field of symbolic AI applied to biology and medicine.

I think the issue in this area is mostly to convince and sell to bureaucratic institutions.

nextos
0 replies
19h36m

I was thinking more in terms of a pharma startup.

jimmySixDOF
0 replies
2d14h

There is MindAptiv, who have something about symbolics as a kind of machine-language interface. I think they went the other way, as in trying to do everything under the sun, but it's the last time I came across anything reminding me of Cyc.

https://mindaptiv.com/intro-to-wantware/

shrubble
3 replies
2d19h

Back in the mid 1990s Cyc was giving away their Symbolics machines and I waffled on spending the $1500 in shipping to get them to me in Denver. In retrospect I should have, of course!

markc
2 replies
2d17h

Probably could have driven round trip for under $500!

shrubble
1 replies
2d13h

I had a 2 door VW Golf... would have needed more space!

galaxyLogic
0 replies
2d8h

You also would have needed more money for the power bill.

chx
3 replies
2d19h

deleted.

anyfoo
2 replies
2d19h

If Susan goes shopping does her go with her?

Is there a typo in that question? Because this does not parse as a sentence in any way for me, and if that's part of the point, I don't understand how. How would a toddler answer this question (except for looking confused like an adult, or maybe making up some nonsense to go with it)?

Happy to be told how I missed the obvious!

(EDIT: By the way, I don't know why you got downvoted, I certainly didn't.)

chx
1 replies
2d18h

deleted

anyfoo
0 replies
2d18h

Oh, I think just fixing the original comment would quite likely have netted you upvotes, if there wasn't anything else wrong with your comment. I think downvoting because of an obvious typo is pretty dumb in the first place.

astrange
3 replies
2d11h

I have a vague memory in the 90s of a website that was trying to collect crowdsourced somewhat-structured facts about everything that would be used to build GOFAI.

Was trying to find it the other day and AI searches suggested Cyc; I feel like that's not it, but maybe it was? (It definitely wasn't Everything2.)

astrange
0 replies
2d

GAC! Yeah, it was Mindpixel.

I guess that's what happens when you learn too many facts.

toisanji
2 replies
2d20h

I would love to see a Cyc 2.0 modeled in the age of LLMs. I think it could be very powerful, especially to help deal with hallucinations. I would love to see a causality engine built with LLMs and Cyc. I wrote some notes on it before ChatGPT came out: https://blog.jtoy.net/understanding-cyc-the-ai-database/

ImHereToVote
1 replies
2d8h

I used to volunteer inputting data into Cyc back in the day. And I get massive déjà vu with current LLMs. I remember that the system ended up with an obsession with HVAC systems lol.

chris_st
0 replies
2d3h

When I got to go to Cycorp in the late 80's for training, I had some really interesting talks with the people there. They got funding from a lot of sources, and of course each source needed their own knowledge encoded. One person mentioned that they had a fairly large bit of the knowledge base filled with content about military vehicles.

radomir_cernoch
0 replies
2d12h

OMG, what an archeological discovery!

mietek
0 replies
2d

Wow! Thank you for mentioning this!

mtraven
1 replies
2d19h

I worked on Cyc as a visiting student for a couple of summers; built some visualization tools to help people navigate around the complex graph. But I never was quite sold on the project, some tangential learnings here: https://hyperphor.com/ammdi/alpha-ontologist

optimalsolver
0 replies
2d10h

if they could not come to a consensus, would have to take it before the Master, Doug Lenat, who would think for a bit, maybe draw some diagrams on a whiteboard, and come up with the Right Representation

So looks like Cyc did have to fall back on a neural net after all (Lenat's).

viksit
0 replies
2d18h

From 2015-2019 I was working on a bot company (Myra Labs) where we were directly inspired by Cyc to create knowledge graphs and integrate them into LSTMs.

The frames, slots, and values integrated were learned via an RNN for specific applications.

We even created a library for it called keyframe (modeled after having the programmer specify the bot's action states and letting the model figure out the dialog in a structured way) - similar to how keyframes in animation work.

It would be interesting to resurrect that in the age of LLMs!

ultra_nick
0 replies
2d17h

I interviewed with them in 2018. They're still kicking as far as I know. They asked me recursive and functional programming questions.

I wonder if they've adopted ML yet.

nikolay
0 replies
2d18h

No, Cyc has not been forgotten; Eurisko has been!

mindcrime
0 replies
2d3h

One thing the article doesn't really speak to: the future of Cyc now that Doug Lenat has passed away. Obviously a company can continue on even after the passing of a founder, but it always felt like Cyc was "Doug's baby" to a large extent. I wonder if the others that remain at Cycorp will remain as committed without him around leading the charge?

Does anybody have any insights into where things stand at Cycorp and any expected fallout from the world losing Doug?

carlsborg
0 replies
2d9h

The Cyc project proposed the idea of software "assistants": formally represented knowledge based on a shared ontology, and reasoning systems that can draw on that knowledge, handle tasks, and anticipate the need to perform them. [1]

The lead author on [1] is Kathy Panton, who has no publications after that and zero internet presence as far as I can tell.

[1] Common Sense Reasoning – From Cyc to Intelligent Assistant https://iral.cs.umbc.edu/Pubs/FromCycToIntelligentAssistant-...

bshanks
0 replies
1d22h

The first thing to do is to put LLMs to work to generate large knowledge bases of commonsense knowledge in symbolic machine-readable formats that Cyc-like projects can consume.
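
A minimal sketch of that pipeline, assuming the OpenAI Python client; the prompt, model choice, and extract_triples are illustrative, and real use would need to validate the model's output before loading it into a KB:

    import json
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def extract_triples(text):
        prompt = ('Extract commonsense facts from the text as a JSON list '
                  'of {"subject", "predicate", "object"} triples.\n'
                  'Text: ' + text)
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}])
        return json.loads(response.choices[0].message.content)

    print(extract_triples("Penguins are birds, but penguins cannot fly."))
    # e.g. [{"subject": "Penguin", "predicate": "isA", "object": "Bird"},
    #       {"subject": "Penguin", "predicate": "canFly", "object": "False"}]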

bilsbie
0 replies
2d1h

Could Cyc data be used as an anti-hallucination tool for LLMs?

Or for quality checks during training?

acutesoftware
0 replies
2d13h

Cyc seemed to be the best application for proper AI, in my opinion - all the ML and LLM tricks are statistically really good, but you need to pass their output through Cyc to check for common sense.

I am really pleased they continue to work on this. It is a lot of work, and it needs to be done and checked manually, but once done the base stuff shouldn't change much, and it will be a great common-sense check for generated content.

PeterStuer
0 replies
2d2h

Cyc was the last remaining GOFAI champion back in the day when everyone in AI was going the 'Nouvelle AI' route.

Eventually the approach would be rediscovered (but not recuperated) by the database field, desperate for 'new' research topics.

We might see a revival now that transformers can front- and back-end the hard edges of the knowledge-based tech, but it remains to be seen whether scaled monolith systems like Cyc are the right way to pair.

HarHarVeryFunny
0 replies
2d5h

Cyc was an interesting project - you might consider it the ultimate scaling experiment in expert systems. There seemed to be two ideas being explored: could you give an expert system "common sense" by laboriously hand-entering the rules for things we, and babies, learn by everyday experience, and could you make it generally intelligent by scaling it up and making the ruleset comprehensive enough?

Ultimately it failed, although people's opinions may differ. The company is still around, but from what people who've worked there have said, it seems as if the original goal is all but abandoned (although Lenat might have disagreed, and seemed eternally optimistic, at least in public). It seems they survive on private contracts for custom systems premised on the power of Cyc being brought to bear, when in reality these projects could be accomplished in simpler ways.

I can't help but see somewhat of a parallel between Cyc - an expert system scaling experiment, and today's LLMs - a language model scaling experiment. It seems that at heart LLMs are also rule-based expert systems of sorts, but with the massive convenience factor of learning the rules from data rather than needing to have the rules hand-entered. They both have/had the same promise of "scale it up and it'll achieve AGI", and "add more rules/data and it'll have common sense" and stop being brittle (having dumb failure modes, based on missing knowledge/experience).

While the underlying world model and reasoning power of LLMs might be compared to an expert system like Cyc, they do of course also have the critical ability to input and output language as a way to interface to this underlying capability (as well as perhaps fool us a bit with the ability to regurgitate human-derived surface forms of language). I wonder what Cyc would feel like in terms of intelligence and reasoning power if one somehow added an equally powerful natural language interface to it?

As LLMs continue to evolve, they are not just being scaled up; new functionality such as short-term memory is also being added, so perhaps they are going beyond expert systems in that regard, although there is/was also more to Cyc than just the massive knowledge base - a multitude of inference engines as well. Still, I can't help but wonder if the progress of LLMs won't also peter out, unless there are some fairly fundamental changes/additions to their pre-trained transformer basis. Are we just replicating the scaling experiment of Cyc, just with a fancy natural language interface?