Bard is now Gemini, and we’re rolling out a mobile app and Gemini Advanced

I'm surprised they got rid of the Bard name. It struck me as a really smart choice since a Bard is someone who said things, and it's an old/archaic enough word to not already be in a zillion other names.

Gemini, on the other hand, doesn't strike me as particularly relevant (except that perhaps it's a twin of ChatGPT?), and there are other companies with the same name. EDIT: I can see the advantage of picking a name that, like "Google" also starts with a "G".

Just as one data point, bard.com redirects to some other company (bd.com), whereas Gemini.com is a company by that name.

I'd be curious on the scuttlebutt on how this decision was reached!

I'm not surprised -- I thought Bard was terrible branding. It's all associations with Shakespeare and poetry and medieval England, and as much as I might personally enjoy those, it's extremely backwards-looking, with archaic connotations. Also it sounds close to "beard" -- hairy stuff.

Gemini sounds like the space program -- futuristic, a leap for mankind. It's got all the right emotional associations. It's a constellation, it's out in space, it's made of stars. Plus it contains "gem" which feels fancy, valuable, refined.

I'm not saying Gemini is the best name I've ever heard or even close to it, but it feels 100% appropriate, in a way that Bard does not.

Gem-in-eye? Ouch!

Also, Gemini was appropriate for the space program because (a) there were two astronauts in the capsule and (b) because of the constellation, "aiming for the stars" and all that. For the Google project however I can't come up with a plausible explanation - Google doesn't even try to give a reason for the name either.

I mean it makes sense to me. The AI is your digital assistant. It's a relationship between two minds, man and machine.

I see that angle, but those two things are complementary, not identical. It's not a clone of me — it's something that I ask questions of because I don't know the answer. If it were pitched as a bot that would draft email responses for me automatically, then maybe I would see it fitting better as my 'twin'.

it's something that I ask questions of because I don't know the answer

I think you're reading too much into what a twin is. It's not a copy! Real-life twins ask each other questions all the time, because just because one of them learns something doesn't mean the other one automatically learns it too via mind-meld.

I'm not saying all twins are identical. But they are all of the same species. What I want in an assistant is that it is very different from me. It has perfect memory and knows lots of things that I don't know.

You're using an overly strict definition of the word and over-interpreting it to boot. Consider this definition:

something containing or consisting of two matching or corresponding parts.

It sounds like you're thinking of the adjective form of the word, which is why you are thinking of a much broader definition. I was using the noun form, since Gemini are noun twins.

If we were talking about the word "twins" in the abstract, the broader definition might make sense. But we aren't — we're talking about Gemini. If that conjures up general notions of "matched-ness" for you, that's great. When I think about Gemini, I think about mythological twins. I don't think about corresponding parts that complement each other.

For a product name to be successful, it should appeal to a wide range of people. If I'm way out in left field on this one, perhaps they've found a great name. But I would point out that my comment, which is critical of the name change, is the very first comment in the entire thread. I would take that as evidence that most people don't see the name and think "oh, it's like complementary items, like my brain and the AI".

To be clear, I was shocked to see this comment above all substantive discussion of the new release. I would have thought it would have been buried under examples of ways in which Ultra is better/worse than some other LLM.

For a product name to be successful, it should appeal to a wide range of people.

Honestly, Google is called "Google". ChatGPT is called "ChatGPT". Maybe it'll be a joke, maybe people won't think about what they're calling it after 30 seconds.

This conversation is taking itself a bit too seriously for what's drifting into Pepsi logo gravitational pull territory, though.

But I would point out that my comment, which is critical of the name change, is the very first comment in the entire thread.

Sure. Everyone has an opinion on what color the bike shed should be, too.

From The Decoder: >In April 2023, Alphabet announced the merger of its two AI units, Google Brain and Deepmind. The resulting Google Deepmind was to focus on developing large multimodal AI models. It was a big move that showed how much pressure Google was under due to the massive success of ChatGPT. Jeff Dean, head of Google Brain until the merger with Deepmind, became the new merger's chief scientist, with a direct line to Alphabet CEO Sundar Pichai. Dean now explains that the name Gemini, Latin for "twin," is directly related to the merger.

From Jeff Dean's Twitter:

Gemini is Latin for "twins".

The Gemini effort came about because we had different teams working on language modeling, and we knew we wanted to start to work together. The twins are the folks in the legacy Brain team (many from the PaLM/PaLM-2 effort) and the legacy DeepMind team (many from the Chinchilla effort) that started to work together on the ambitious multimodal model project we called Gemini, eventually joined by many people from all across Google. Gemini was also was the Nasa project that was the bridge to the moon between the Mercury and Apollo programs.

The Decoder article - https://the-decoder.com/how-googles-gemini-ai-model-got-its-...

Jeff Dean's Twitter Post - https://twitter.com/JeffDean/status/1733580264859926941

Bicameral would have been better, IMO

It's more on the nose but probably less right from a marketing perspective.

During Project Gemini, it was pronounced it Gem-in-ee.

https://youtu.be/JeAUx6-vSmc?feature=shared

Which is closer to the proper Latin pronunciation (which would have a hard G although ecclesiastical Latin would have a soft G).

Interesting. I don’t like the name at all because it makes me think of people who take horoscopes seriously. You’re impression seems to be untainted by that which is nice

Same here, I think I'm more on your side which I guess goes to show how all over the map subjective reactions can be.

But first of all, I thought the whole idea of alphabet was a kind of cheeky way of telling the world you had a portfolio of projects, one for each letter, And B is for bard would be perfect, and Gemini is about as incompatible as it gets given that g is claimed.

I also find it bizarre to say that association with Shakespeare, or the association with whimsical poetic expression is in any sense a bad thing. It's a clean, simple, fun name that's remarkably short and surprisingly unclaimed. And I don't even strongly associate it specifically with Shakespeare, that's like a background association as far as I'm concerned.

I think perhaps the real talk here is that Bard was kind of an emergency response to chat GPT, but also people have some pretty specific and distinct experiences with Bard and have an idea of its quality, and Google just needs to turn the page on the perception of Bard.

The name bard is tainted by ridicule.

Besides that, personally I always thought it was a bad fit. It sounds old and outdated to those that do not know what the word means and wrong to those that do: a bard sings songs and maybe does poetry.

A bard does not help or assist you. A bard can be a creative person, but is generally not considered especially wise or knowledgeable. A bard is also always a man, which does not gel very well with modern sensibilities.

I can see why they dropped it.

Ugh...such an Aquarian thing to say. /s

Funny, my mind goes to NASA and the Gemini program.

When I think of Gemini, I think of a digital twin, something which can work with what I have on Google, think like me, or do the things the way I do.

Bard was nice, too, but it was like another thing, separated from me. Gemini sounds more cooperative.

I don’t like the name at all because it makes me think of people who take horoscopes seriously.

spoken like a true Sagittarius

The alliteration beauty of Google Gemini cannot be denied.

Only if you pronounce it with a hard G, like in GIF.

Or have I been mispronouncing ‘Google’ all this time?

We all say J-oogle, but only when you're not around.

Great, now its not fun anymore.

jif?

I think alliteration applies to pronunciation, not orthography. For example, "ceaseless sun" is an alliteration even though it is spelled with both C and S. I wonder if there is a word for the orthographic counterpart, which you describe here (and which I note in another comment, as the benefit of both starting with G).

Though I agree bard isn't the best name, gemini is a worse name.

Bard has connotations, but by it's self it makes one think of a person that talks well.

WTF is gemini? It is a twin, but this is a singular product. Beyond that it's just generic. It doesn't tell you what it does at all.

Not with regular people. Bard is just "old" to them.

Think of the syllables. Copilot. Gemini. They need to be close to the market leader on a subconscious level.

When the competition is called "ChatGPT", maybe the naming isn't as important to normal people as we think?

Big idea but maybe they should have just named it Google

They kind of didn't name their OK Google assistant anything other than "Assistant". What about ChatGooglePT?

Gemini sounds like the space program

LOL, maybe I'm a bit jaded but Gemini sounds like some kind of cryptocurrency scam.

I hear echoes of astrology, personally, and not good ones.

When I read new thread responses, I briefly thought that I wrote[1] your reply and was confused lol. Great minds think alike. I feel vindicated about my weird opinion.

[1] https://news.ycombinator.com/item?id=39306764

Ha, that's too funny -- I missed yours somehow or else I would have commented underneath it.

Gemini as a zodiac sign: "Smart, passionate, and dynamic, Gemini is characterized by the Twins, Castor and Pollux, and is known for having two different sides they can display to the world. Expert communicators, Gemini is the chameleon of the Zodiac"

Which is pretty on the nose for an AI project. A chameleon with two different sides (good/evil?) and expert communicator

Gemini sounds like the space program -- futuristic, a leap for mankind

80 years ago, sure.

It's a hint about how to use it if you want to circumvent the censorship.

This. It creeps women out. It's difficult enough getting them to use an AI tool.

My wife commented on this when she saw it in Google News. Something about some dude in a medieval pub.

Gemini sounds like the space program

That isn't at all the association I have with that word. I think of the astrological sign instead, so to me the association is pseudoscience and a hint of being bipolar.

This sort of thing is part of what makes naming things difficult. You can't count on any name having the same connotations to everyone.

even though i liked Bard, it is only one (extra) letter away from being the word "bad". "Bard" is cooler imo but "Gemini" starts with "G", has "gem" (a rare, valuable thing) in it, and sounds pretty. Personally i don't care at all either way though.

I'm sure "Bard" was primarily a Shakespeare reference (The Bard of Avon, frequently just The Bard), and I liked it too. An appropriate name for a technology that's all about language.

Gemini sounds cool and sci-fi though, and maybe it's a bit easier to localize since it's just straight Latin.

To me, bard just sounds phonetically gross. Reminds me of “fart” or “beard.” It calls to mind medieval stuff: the Monte Python mud scene, Skyrim’s most annoying NPCs, plucking lutes. But Gemini? That sounds like a legendary space mission; this collective engineering push against the boundaries of human knowledge.

I do not have refined tastes. My b.

Trying saying it non-rhotically, like a British television presenter

Sounds like "bot", which is good from a topical perspective, but bad from a false-positive perspective.

If you really give it some gusto ("baaaaaaahuhhd") nobody will confuse them :-)

That sounds closer to a working class Massachusetts pronunciation.

Yes, just in time for a Super Bowl commercial: Smaht Bahd

To me, bard just sounds phonetically gross. Reminds me of “fart” or “beard.”

WTF? Do people normally think about words in this way, utterly divorced from their meaning?

Yes, people that can hear similarities between words do that.

Understand that this is not condesending in any way, as I do not have this experience.

If there are these "Feelings" around these words, how is any sentence correctly taken at face value. How does one communicate to these people the direct and correct meaning of the terms used.

For example, sentence sounds like seance, do they feel like i'm asking the spirits of the dead ?

Correct sounds like wrecked, do they assume that everything is broken in the above sentence.

Is communication with fraught with unknown minefields of unintended emotions and misundestandings ?

Not at all, these "unintended" emotions can be ignored for the most part. But if you ask me, then google is in my foreigner ear one of the stupidest brand name I know of due to its phonetical ressemblance with some words from my native tongue.

Barti the only bard to me

When I hear "bard", I think of this guy from the Asterix comics first: https://asterix.com/en/portfolio/cacofonix/ - who is notorious for getting on everyone's nerves with his constant singing.

We are not talking here about the rain he brings on each time exercises his vocal cords, but rather about the prevailing atmosphere in the village: when it is time to party, when wild boar are roasting on the spit, you can be sure to find Cacofonix tied hand and feet with a gag in his mouth.

I remember when the iPad was announced, and everyone said that people would only ever think of feminine products when they heard the name. It might have been true for a few months, but now it seems quaint that we ever had such concerns.

[Gemini] sounds like a legendary space mission

Well, it is one. I wish they'd choose a slightly more unique name but camping on well-known words is a beloved tech tradition.

Bard is really funny to me to make fun of. It feels like the discount version of ChatGPT. Like the way that (ironically) TV shows would get microsoft sponsoring and the characters would say "oh you should Bing that", a phrase no human would normally say, and I like to be "ah let me see what Bard thinks about this".

I thought “Bard” was an Asimov reference: https://en.m.wikipedia.org/wiki/Someday_(short_story)

(on top of the more obvious references)

It's too close a match for it not to be

The story concerns [...] an old Bard, a child's computer whose sole function is to generate random fairy tales. The boys download a book about computers into the Bard's memory in an attempt to expand its vocabulary, but the Bard simply incorporates computers into its standard fairy tale repertoire.

Well the ending sure sounds like an LLM getting stuck:

"the little computer knew then that computers would always grow wiser and more powerful until someday—someday—someday—…"

https://blog.gdeltproject.org/llm-infinite-loops-failure-mod...

It also rhymes with Card as in Orson Scott Card.

"Gemini" must refer to its inherently multimodal origins?

It's not a text-based LLM that was later adapted to include other modalities. It was designed from the start to seamlessly understand and work with audio, images, video and text simultaneously. Theoretically, this should give it a more integrated and versatile understanding of the world.

The promise is that multimodality baked in from the start, instead of bolting image recognition on to a primarily text-based LLM, should give it superior reasoning and problem-solving capabilities. It should excel at complex reasoning tasks to draw inferences, create plans, and solve problems in areas like math and programming.

I don't know if that promise has been achieved yet.

In my testing so far, Gemini Advanced seems equivalent to ChatGPT 4 in most of my use cases. I tested it on the last few of days worth of programming tasks that I'd solved with ChatGPT 4, and in most cases it returns exactly what I wanted on the first response, compared with the a lengthy back-and-forth required with ChatGPT 4 arrive at the same result.

But when analyzing images Gemini Advanced seems overly sensitive and constantly gives false rejections. For example, I asked it to analyze a Chinese watercolor and ink painting of a pagoda-style building amidst a flurry of cherry blossoms, with figures ascending a set of stairs towards the building. ChatGPT 4 gave a detailed response about its style, history, techniques, similar artists, etc. Gemini refused to answer and deleted the image because it detected people in the image, even though they were very small, viewed from the back, no faces, no detail whatsoever.

In my (limited) testing so far, I'd say Gemini Advanced is better at analyzing recent events than ChatGPT 4 with Bing. This morning I asked each of them to describe the current situation with South Korea possibly acquiring a nuclear deterrent. Gemini's response was very current and cited specific statements by President Yoon Suk-yeol. Even after triggering a Bing search to get the latest facts, the ChatGPT 4 response was muddy and overly general, with empty and obvious sentences like "pursuing a nuclear weapons program would confront significant technical, diplomatic, and strategic challenges".

It seems odd to me that would work better necessarily considering that humans evolved different capabilities many millennia apart and integrated them all with intelligence comparatively late in the evolutionary cycle. So it’s not clear that multimodal from the get go is a better strategy than bolting on extra modalities over time. It could be though since technology is built differently from evolution but interesting to consider

I get all the multi-modality stuff, but what is it about the word "Gemini" that invokes that, to you?

It's not a bad name, but personally when I first heard the name Bard I chuckled because LLMs had already come under so much criticism for their tendency to embellish the truth or say stuff that is just straight up false but sounds cool.

Bards were the people who kept history and genealogy before written history. Think like Homer rather than Shakespeare. I think the name was meant more to evoke the idea that the AI is a repository of all linguistic knowledge in the same way that the bard was. And maybe also the idea that the AI was at your service in the same way the bard was at the service of the ruler.

Gemeni, or the twins, is a deeply symbolic name for anyone who knows Greek history. It’s the story of Castor and Pollux, and in many versions of the story one brother killed the other only to beg for them to come back. It’s ominous to use this brand name for AI.

It’s also associated to the Gemini killer and Joseph Testa and Anthony Senter who were famous as the mafia’s Gemini twins hitmen.

I think better brands could have been had.

It does sound like some battlefield AI system from Robotron. “Sir, Gemini is charged and ready for battle.”

Gemini was a stepping stone to a moonshot, which is almost certainly why the name was chosen.

Edit: another poster shared the etymology, the merger between Google Brain and DeepMind. I shall eat my words.

Perhaps. Corporate entomologies tend to be very well rehearsed stories, and I’ve been around the valley long enough to know those stories aren’t always the whole story.

I would encourage you to read the Kissinger / Schmidt book before settling your opinion.

That origin story may be true. But it doesn’t make the whole story necessarily.

https://time.com/6113393/eric-schmidt-henry-kissinger-ai-boo...

"Corporate entomology" is a lovely term, evoking surreal (and yet strangely familiar) images of cockroaches in suits.

Brilliant!

corporate entomologies

Now there's a ready-made Far Side concept.

For me it's associated with Gemini crypto and their horrible Gemini Earn investments in Genesis: https://www.web3isgoinggreat.com/?id=gemini-genesis-and-dcg-...

The crypto Gemini was named after the Winklevoss twins.

Also the Gemini Lounge, where Roy DeMeo and his crew killed and dismembered people targeted for hits by the Gambino family.

It’s also associated to the Gemini killer and Joseph Testa and Anthony Senter who were famous as the mafia’s Gemini twins hitmen.

I've never heard of any of these people and I doubt most others have either. Maybe you have to be a true crime enthusiast to know the lore? Whereas if the name were Zodiac, then I would at least be aware there's a potential murderer connection.

"Bard" always struck me a bad naming - unfamiliar, unfriendly, too cerebral. I think the name was an impediment against establishing a household brand.

unfamiliar, unfriendly, too cerebral

The Witcher is one of Netflix's most watch shows. I'd also imagine that most people in English speaking countries have been exposed to Shakespeare's nickname in high school English classes.

It’s generally a common trope in fantasy and Romanticist literature. It’s also a word that exists in virtually all European languages in a similar form (bard, bardo, barde, бард), although similar but different forms may be a negative.

Yes, but I didn't want to assume that most people read literature. Even if they hadn't, "bard" is definitely out there.

I don't think it's that out there. You'd have to be quite uninformed to have never heard of it. It's no verderer or reeve (medieval positions that most people actually will not have heard of).

I meant "out there" as in a word people are exposed to. Not "out there" as in outside of most people's experience.

Maybe named for The Bard’s Tale?

It's possible that it sounds even worse in other languages. That is, it might sound like bad words, onomatopoeia for bodily functions, or common exclamations (that would lead to lots of false positives).

I think it could have been established as a brand in the US, given Google's scale. Put a lute in the branding, run some funny commercials, and you're done.

EDIT: one thing no amount of branding can fix — the likelihood that people reach for "doh, Bard" (a la Simpsons) when Bard messes up. I could see that becoming a thing.

Gemini is Latin, my guess is it more easily translates to other languages than Bard.

Who translates product names?

"How does that translate to ..." means "how well does that work in" some other area or context; more analogous to a mathematical translation than a linguistic translation.

Just a confusing turn of phrase. They almost certainly didn't mean "what does that translate to ..." in another language.

Harmonising product names across regions is hard: Jif was a bathroom cleaning solution in the UK, but it's name was changed to Cif to match the name elsewhere in Europe; and that name sounds silly to UK ears. Meanwhile GIF were always presumed to be pronounced like "gift" (a present) without the final T; but we learnt the creators preferred "Jif" which sounds silly to UK ears because it sounds like a cleaning product! (And also wasn't JIF already a file extension (JPEG Interchange Format).

Anyway ... language is hard.

Jif was a bathroom cleaning solution in the UK

One man's bathroom cleaning solution is another man's creamy peanut butter.

“a floor wax and a dessert topping.”

One man’s creamy peanut butter is another man’s crunchy peanut butter.

I think the suggestion was that it would work well as-is in other languages. It would certainly be natural in romance languages.

Bard is latin too: https://en.wiktionary.org/wiki/bardus#Etymology_2

Bard: fat inept old guy who gets by telling stories of yore

Gemini: a crypto exchange

Clearly they shouldn’t be asking the 21 year old interns what to call it.

Now before anyone in google gets butthurt like they usually do I doubt I could do much better but Gaia IMO would have been mint you know it starts with G is four letters and represents a nebulous deity. Took 3 seconds but hey I’m not paid to think. In fact I’m not paid at all

Edit> Gemini ultra is the best llm so far ( it seems) - apart from the name good job guys

And it has ai in it's name.

If that's the reason, they should've gone with Moai (more AI)

thats even better lol. hey 缾 how come so few comments? you have a few accounts or what? I feel priviledged.

huh I didn’t even notice that I guess I’m sleepy

They’ve plastered “bard” ads everywhere in Tokyo for a while. Surprised to kill the name so quickly, the marketing team in Japan probably had no idea

(Personally, I never liked how Bard sounded. Can’t put my finger on why, it was just not a pleasant name to me)

Barudu?

Baaado (バード)

They also did a lot of (cringe) ads in Brazil.

Same here. Bard is not a sexy name. Gemini is way more sexy. (Neither is ChatGPT, or Google though either). I can't wait until we can call the assistant whatever we want, like Jarvis.

In an increasingly commodity game (the big player LLM game), it’s already starting to hit the asymptote on the main levers: ties to NVIDIA and/or TSMC, serious financing capacity, and enough engagement channel to push it through. (There is much great work happening outside of the Peninsula).

I always thought GPT-4 was a little “HAL 9000” of a name for broad-based adoption, but the jury seems in, and the jury rules “cyberpunk is in”.

The broad name is ChatGPT, not GPT-4

That’s fair, though given the stark UI cue / cost difference, I’m not surprised when I overhear in a random cafe or bar: “yeah but what did ChatGPT Four say?”

In any event, it seems that the image of a Decepticon ready for battle on your behalf has a lot more traction than the image of a quaint singer/priest/poet always there with a verbal shot in the arm when the going is tough.

They literally call it "ChatGPT 4" (with a colored 4) in the app though

The Bard name gave me a warm fuzzy feeling immediately transporting me back to my youth playing (or at least trying to play) Bard's Tale. The name evoked adventure, excitement and a good dose of dread. And, the idea of it being "role playing" struck me as a master meta stroke.

Gemini, from the mythological standpoint, seemed to make more sense to me from an overall business/marketing standpoint. "This AI thing right here is your twin, see? It'll finish your sentences and stuff."

Bard's Tale

https://apps.apple.com/us/app/the-bards-tale/id480375355

And similarly anyone playing modern tabletop RPGs will probably associated "Bard" with the smart, charismatic person who buffs the party and debuffs your enemies; perfect for an AI assistant

And if the brand took off, I imagine you could “Bard” something as a verb but not “Gemini” it.

Perhaps they're hoping people will stick with "google it".

Bard just sounds terrible phonetically. Bard. Like something you find in Home Depot or some kind of old timey woodworking tool. Barf. Bored. Bard.

Yes I know what it really means but it doesn't change the fact that it's a terrible word.

That's an incredibly subjective take. I don't agree at all. I don't care what they call it but I don't understand this reasoning.

I thought it was just supposed to be a pun on “gen’ AI”

thought it was just supposed to be a pun on “gen’ AI”

Then they'd have gone with Genie.

Which would be kinda genius on their part, but Google isn't that kind of company anymore.

As a Brit I’ll be glad to see the word association gradually return to that of our greatest playwright rather than something appropriated by big tech.

I'm so glad that they've changed the name :) Bard is really difficult for me, a non-native English speaker, to pronounce correctly. I think most people from my country (Vietnam) pronounce Bard as "bahd". Gemini on the other hand is so easy to pronounce.

There presumably was a time when Google considered going more into the “assistant” branding. They own assistant.ai but they don’t do much with it.

Bard sounds archaic, almost like an unintelligent persons name, no offense to the bards out there.

To me Gemini is just sort of generic and uninteresting. There has to be hundreds or thousands of products and companies based on the name "Gemini" - "Bard" was at least interesting, different and distinct.

I've no idea about the quality of the product itself, I have never had a reason to use it. It's long past cliché now but I wouldn't get too attached to a Google product that is definitely costing a lot of money but which has no clear pathway to turning a profit. I think they will keep it ticking over until the hype train moves on from Chatbots/LLMs, and then it'll join the Google Graveyard @ https://killedbygoogle.com

I thought it was in reference to Trurl's Electronic Bard, which just about presciently predicted LLM output (though the process is a bit more dramatic, what with how it simulates the whole universe to get to that output): https://electricliterature.com/wp-content/uploads/2017/11/Tr...

Bard felt like Google was treating LLMs as merely an amusing spinner of tales, just a narrative UI layer over the "real stuff" of the Google Search.

Gemini

Symbolizing both human origin intelligence and AI becoming super-intelligent together. I think it's a good name because it draws on ancient human mythology and serves as a reminder of the concept of alignment.

Gemini is not distinct and memorable. It feels like a muddled compromise.

I'd name it GooGaa or Googa. Like "Google" and "goo-goo-gaa-gaa" (baby's first babbling). It's flowing, friendly & familiar.

Honestly surprised I'm the first to mention the name collision with the retro-modern linked documents protocol I keep hearing about (on HN) https://geminiprotocol.net/docs/faq-section-1.gmi

But glass half full, maybe it's for the better to have one's name shadowed by a Google product if one prefers to avoid eternal septembering one's community.

There are other considerations when naming something like this. "Bard" likely could never be a wake word on its own, for instance, but I'd imagine that "Gemini" will be at some point.

Astrologically, Gemini is associated with Communication. Specifically social, superficial, quick, back and forth communication. The sign is ruled by Mercury which is associated with Intelligence and being the messenger of the Gods. Mercury is often depicted with winged shoes as the planet itself is the fastest moving planet, orbiting the sun every 88 days. Mercury is considered to be dualistic (The Twins) and also rules the sign of Virgo, an earth sign that is associated with more deep cold analytical categorization.

The got rid of Bard, and missed the opportunity to either name it Genesys or Legion.

The real question is what's nearby each name's vector embedding in terms of whatever similarity metric Gemini will use to talk about the world. That's their new canonical ontology, after all.

One awkward thing is that Google's Gemini app is currently the third result searching Gemini on the Play Store-- after some bitcoin related applications. The namespace is occupied.

Same reason Arthur Anderson changed it’s name.

Bard was panned. Change the name, lose the bad press.

Bard showed some creativity in name selection. Gemini does not. You see that everywhere. Or at least my first thought was about the Gemini spacecraft

I'd suspect they're just trying to start over - Bard kinda got pegged as Google's dumb, not-as-good, late-to-the-party clone of ChatGPT. It got teased a lot. I'm not sure Gemini is much different (still more impressed with GPT 3.5 & 4) but I guess the name change gives them a bit of a reset.

I think they should have named it gAIl.

I agree. The original reason [1] for the gemini name seems artificial for a generic chatbot. It is OK for the model, and I'm sure a lot of "work" was put into "validating" it for the assistant, or... was it?

[1] https://the-decoder.com/how-googles-gemini-ai-model-got-its-...

In the 20 minutes of experimentation, I'm really impressed with the quality of Bard Advanced (Gemini Ultra). The results are as good as GPT-4, and in some cases is better. So far:

pros:

- better at translation (tried Chinese and Japanese idioms to English)

- better at incorporating search results in its answer vs gpt-4 bing

cons:

- slightly worse at coding

- censorship is more annoying (have to ask multiple times about medical topics)

- Worse at logic (e.g. it contradicts itself in a single sentence, and is unable to figure it out)

- Hallucinates a lot when asked to describe an image

Also as time goes by, it'll get smoothly integrated into docs/gmail/maps/calendar/youtube/search/colab/sheets/android/assistant

So Gemini could by your one-stop AI shop for everything. Only Microsoft can match it (but Microsoft doesn't have a popular maps, youtube, mail, smartphone OS service).

Apple is another strong player (but they don't have productivity tools like docs, sheets or youtube).

It really is Google's to lose this AI race from now on.

Going to chatGPT and copying and pasting results will become painful (not to mention it's painful bing integration). Also at this point, they seem to be focusing on scaling LLM (while Google Deepmind is exploring other avenues)

Google can also bundle Youtube TV, YouTube Premium, Google Drive, Storage, Ad free Search, Gemini integrated Docs/Sheets/Gmail, Subsidized Pixel Phones / Watch for a monthly fee of say $99 and it'll be very compelling for a lot of people

Strange to say Apple doesn't have productivity tools when Pages, Sheets, and Keynote exist on every Mac. I get the scale arguments, but Handoff and iCloud integration are a sleeper IF you've bought into the ecosystem...

Also hard to overstate just how much more valuable the enterprise market is over the consumer market when comparing Microsoft vs. Google as one-stop anything shops.

I don't see Google as having the obvious dominant position to make the argument it's their race to lose, considering Microsoft has a stake in chatGPT and is actively integrating it into their browser and productivity suites.

There's a Google bubble on HN, as demonstrated by small-to-medium business facing SaaS launching here offering login with Google and not offering login with Microsoft.

I've talked to many of HN's Google Docs jockey founders that genuinely didn't realize 85% of the US domestic business market is in M365. And they further don't realize that "Continue with Microsoft" is dirt simple and lets companies offer company-managed logins to your SaaS without all the AD/SCIM/SAML nonsense.

"But everyone has Gmail." Well, no, that's not how companies work. And if you think everyone's in Google, that's fine, your login page should look like one of these:

https://www.xsplit.com/user/auth

https://id.atlassian.com/login

You don't even need the "continue with SSO" if you do the Atlassian practice of letting a firm "claim" the domain part of an email and using that to redirect to an SSO flow. And to start, skip SSO, and just use the "Continue with" Oauth2.

Unfortunately, You are in a US bubble.

Globally, Google brand is 10x stronger than Microsoft for Small Businesses

Absolutely untrue - Every company and university I've ever worked with or for in Europe used Microsoft 365. Not a single exception.

US + Europe is not the world

In my country we used pirated version of Office products. Does that count?

I have the most respect for this country

East-Asia chiming in here, Microsoft dominates enterprise, Google is a tiny player, single digit percentage market share for any enterprise product. Only on cloud does MS have competition from AWS and local players, everything else is Microsoft. And even in cloud Azure is rapidly eating through AWS marketshare.

Microsoft or pirated Microsoft? I was surprised when <insert major Arab oil company> was using pirated MS Office on all their systems. This was a bit more than ten years ago though, so 365 wasn't a thing. But still....

I love this "X in not the world comment" while others are giving more and more evidence of the opposite. Care to give actual evidence, like, I don't know, this Statisa survey (https://www.statista.com/outlook/tmo/software/productivity-s...) because you've now looked farcical. Even looking at African results (which is a relative greenfield), it seems that Microsoft has a sight lead here (https://www.statista.com/outlook/tmo/software/productivity-s...).

Edit: Indonesia is the largest market which has Google beat Microsoft (https://www.statista.com/outlook/tmo/software/productivity-s...). India (due to Zoho) and China (due to... locally-specific software like Kingsoft (aka WPS) et al.) took a large chunk but on Microsoft-Google tussle MS still beats Google (especially in China where Google doesn't really operate).

In South America, Microsoft brand is waaaaay stronger than Google's for businesses of all sizes. Google is viewed more as a consumer brand.

But is it really? It seems to me that almost every business is using the Exchange/Outlook combo, not Google products.

bubble

It's not a bubble when one specifically names the Venn diagram circle "85% of the US domestic business market". It's naming a market.

brand is stronger

Presumably the founders' interest is wallet share, not market share.

Are you saying Atlassian is in a US bubble?

I think Google lost the top researchers when they destroyed the culture. All the competitor companies are mainly led by ex-Google talent, and honestly who in their right mind would take a Google job today over OpenAI, Mistral, or even Meta (where you will be releasing models like Llama for the world to use).

Google killed the culture and is bleeding top talent. They have reduced themselves to digital landlord and sure they can extract rent, but that’s not what attracts people.

that is the media narrative but not at all what happened.

Google's 'don't be evil' grad-school-style culture had fallen apart by the late 2010's because there are tons of people who will just rest and vest.

So strong ML researchers basically were creating massive value but much of it was going to rest&vest salaries. OAI basically came along and said - hey, we don't have rest & vesters, do you want to make $1m+/yr? And most of the top google researchers said yes.

It’s not just media narrative. The culture was eroding for years, as you note, but the dam finally broke and they went full IBM/Kodak. Or in other words, “slowly at first, then all at once”.

Most of the recent media coverage has been resting&vesting employee backlash against the fact that Google is making them do work again. This is a cultural shift, but not away from the culture that made Google great - the original culture was grad-school, not rest and vest, and that died years ago.

Haven't one or two long-time Googlers left or gotten laid off and then written strong criticisms of Google? They don't sound like rest & vest (also should say I don't super agree w/ this term) to me, they sound like people who loved Google, were there a long time, and watched the culture decay.

I’m not super invested in the term “rest&vest” so it is whatever.

But touché - many of the critiques are being written by super talented and impactful people. But I do not think those critiques are necessarily incompatible with what I am saying.

There is a very real and very frustrating (if you work there and want to be impactful) phenomenon in these tech companies of people resting on their laurels.

I've given this a little thought and I definitely agree. But to bracket a little, I've worked in places that were super mission-driven, but also other places with a lot of "clock punchers". I don't judge anyone. I think people's motivations are super personal. Indeed who am I to say clock punching or resting and vesting is unethical or immoral or whatever. You can show me someone who's super dedicated to the cause, working extra hours, mentoring others, spearheading new projects, saving old projects, and I can show you someone who's effectively abandoned their family or mortgaged their future health. You can show me someone collecting a $500k/yr salary at FAANG and doing very little actual work, and I can show you someone with two parents in nursing homes and a partner with MS.

That said, the places I've worked where there was a shared belief in the importance of our work were exhilarating. But, the way they achieved that was essentially by getting rid of bullshit. You know, no one's building their own web framework or writing their own in-house query language (you know, unless you really have to). No one's spending an hour of everyone's time figuring out what to name this class.

I think we haven't figured out how to keep that energy once you build a company big enough that's got something to lose. Suddenly it totally feels worth it to have a meeting about changing the shade of blue in the logo from this to that, because what if revenue declines 0.1% and that's $40m and then layoffs? One of the amazing things about Google was that it somehow seemed to keep that energy well into becoming a tech behemoth; it felt like they defied gravity, like their commitment to not doing evil and open web standards--or maybe we thought the web itself had ended evil corporations like Microsoft forever and Google was its avatar--had allowed it to overcome this corporate version of Fermi's paradox.

Whatever. Maybe this is just a bunch of word salad haha. All I'm saying is it takes two to tango, and just like I'm sure there's a non-zero level of personal culpability responsible for this phenomenon, I'm sure there's a non-zero level of systemic and cultural cause too.

Even if we assume that all of Google is based on ML algorithms (not true, given problems like serving, front end, data collection, ...), ML itself is 90% SWE work.

Google's competitive advantage is threefold:

1. Real estate - Youtube, Gmail, Maps, Search (for now), etc. 2. Compute - probably still the best in the industry, but with recent Microsoft/meta compute buys it's hard to say for sure. 3. Talent - probably also still the top of the industry. Geoff Hinton and Zoubin Gharamani setting direction and Jeff Dean building it is hard to beat, and the ranks are deep. Yann LeCunn is also brilliant and Andrej Karpathy while less seasoned is one of the top researchers in the field, but overall there's still a bit of a spread from Google's roster, at least when it comes to AI researchers.

If Sundar and the other top brass weren't MBA-bots with no vision, and the famous Google bureaucracy had been reigned in gradually over the last 5 years while promoting a builder-centric culture, this would be in the bag for Google no question. Instead, Satya Nadella played 3D chess while Sundar was looking at a checkers board.

Geoff Hinton quit Google last year, no? But other than that, I guess I agree.

Apple is coming. I think the personal agent is where we really want the smarts and if they’re not trying to own that space the CEO should be fired.

"Apple is another strong player (but they don't have productivity tools like docs, sheets or youtube)."

Can anyone help me understand how Apple allows Siri to remain so absurdly incompetent? Last night I watched the latest episode of Curb Your Enthusiasm, in which Larry David's Siri interactions devolve into an apoplectic rant -- and part of the reason it was so funny is that it's so relatable. I rarely even try Siri anymore, but when I do it's still just abysmal. Are they deliberately handicapping it, to boost the perceived relative benefits of a future successor?

Apple has very little ML talent. They're basically resting on their laurels in the phone market.

That's what I thought too; but all the upthread and peer commments keep asserting Apple is a (real, not just hypothetical/potential/future) player in the space.

well yeah, people on this website opine a lot about things they know little about. they're starting to make moves, but they don't have a large talent base.

Apple will start using this tech now that it has been commoditized and since it is a big consumer tech company they will put a nice polish. But it's not where big research is being done.

Apple went too early to ML for Siri.

Siri was "decently good" when it was basically a voice-activated command line, and you could figure out what you needed to run to get it to do what you wanted.

When they added ML to it, it entirely went to shit, and it has never recovered.

Perhaps Apple will be able to use an LLM to make it finally not shit.

Apple refuses to pay top of the market. They'll have to get really lucky with someone who grows from within.

Good comments. As much as I am personally engaged in small LLMs that I can run on my own computer, and integrate into software that I write for myself, I think the future of large scale adoption of AI belongs to Google, Microsoft, and Apple in western countries (and China is doing wonderful things in their markets).

The old Bard/Gemini integration with Gmail, Google Docs, etc. is pretty good. I signed up for a minute for Microsoft’s $20/month AI Office integrations, but cancelled and will try again in 2 months. I am an Apple customer and I expect spectacular things from Apple. I expect Apple, Google, Samsung, etc., to offer fantastic on device AI.

I would like to see a money saving Google bundling family plan. I find Apple’s super bundle family plan a pretty good deal.

small LLM? Small Large Language Model lol

Good joke, thanks, but I will explain anyway: to me 30 billion parameters or smaller is small since I can run it using Ollama on my home computer. I managed a deep learning team at Capital One and our ‘large’ models were less than 20 million parameters. :-)

I suppose we could call them Medium Language Models, but unfortunately that TLA is already taken

A large language model the size of a small language model.

Did you just say Microsoft doesn't have a popular email service?

Yes. No where at the scale and reach of gmail.

We are also talking about consumer emails (not enterprise / corporate)

Outlook(+Hotmail) is the third most popular email service. Just 3x smaller than the Gmail. It's definitely the same kind of scale.

If you count outlook.com and hotmail.com and live.com and all enterprise email you get someone who is probably pretty close or larger than all gmail and gmail-backed domains, especially by volume.

I agree that Google is well-positioned, but they were also well-positioned to take advantage of these synergies with Google Assistant for many years and I would say that that did not meaningfully materialize in a way that was helpful to me as an Android and Google ecosystem user.

Agreed. I've run the house using google minis and assistant for years now, and asking assistant to do / about stuff has not improved one iota in that time and has introduced several more quirks and bugs.

Makes me wish I had bet on Alexa or Apple instead.

Yeah, for example just yesterday I was driving and an alarm went off for the phone in my pocket. I told Google Assistant to silence the alarm... and it refused, insisting no alarms were active. How the hell can such a simple use-case be failing so badly?

I suppose it doesn't matter, because they're going to disable the functionality entirely, [0] and setting ephemeral alarms is the literally the most common thing I ever ask it to do!

Part of what makes all the assistant-stuff so damn frustrating is that it's an opaque "try something random and hope for the best" box, and whenever it fails there's usually zero information about why and no resolution path. (In a way you can generalize that to a lot of "AI", which is depressing.)

[0] https://www.androidauthority.com/removing-google-assistant-f...

Google still has too much internal fragmentation and power groups to offer a single google-subscriber package.

I'd say it is one of the most compelling reasons to kick Sundar out and get in someone who can unify google into one consistent and interoperable ecosystem.

I bet they'll eventually let Kurian try this, and he'll destroy what is remaining of Google in the process.

Google is going to own AI like intel owns graphics cards - i.e. Not really, except at the absolute bottom of the barrel where its baked in advantage lets it offer an unbeatable price/performance proposition for people who only care about "value" and with limited real performance requirements. Google's baked in AIs will be free, and bad. Everyone else is going to let people "plug in" models via standardized APIs, because one size fits all models are just a bad idea, so that's the way google is going to have to go eventually as well, because it's what power users are going to demand.

But power users are a minuscule market, compared to non-power (weak?) users. That's the problem.

If they can get reliably useful AI through voice into maps/navigation, it will be a substantial improvement to the driving experience. It's really frustrating to manage destinations and waypoints while driving. I just checked the process to see if I'm not keeping up and 1. the help docs are out of date. 2. the waypoint search results provides tap points for destinations 10 miles off a route, but shows only 3 pixel red dots for otherwise equally weighted options that are literally on the existing route.

It really is Google's to lose this AI race from now on.

I was curious and apparently Google/Deepmind had 184 papers at NeurIPS/NIPS 2023: https://www.reddit.com/r/MachineLearning/comments/185pdax/d_...

For comparison, Stanford has 130, MIT has 130, and Carnegie-Mellon has 112. Microsoft had 68, and Facebook/Meta/FAIR had 56.

All of the things you write are very good ideas. But at this point, I am quite skeptical of Google leadership to pull off these things.

It really is Google's to lose this AI race from now on.

Google took 11 months to catch up with the benefit of knowing the specifics behind how GPT-4 was built. While they have a moat with their products there's no sign that they're ever going to be ahead on the AI capabilities front. I hope to be wrong because we need more competition.

Well for LLM services that do what they currently do google may have an advantage, but all this stuff is still only experimentation with the goal being hopefully much more advanced things, like almost-agi agents. If this happens then no one will care about the way we currently use LLMs anymore.

Gemini the chatbot will still need to compete with ChatGPT on raw performance/intelligence/SOTA. Gemini should first be considered the undisputed winner of AI chatbots. Right now ChatGPT 4 is the "winner" in mindshare and I am not sure what Google needs to do except for OpenAI to bungle their own releases.

Gemini integration with Google's apps will compete with Microsoft's app on integration.

And the condition for Gemini to win: Google has to go all-in. and that means creating an AI that will eventually phase out their cash-cow Google Search. Microsoft and OpenAI does not have that kind of internal conflict.

It really is Google's to lose this AI race from now on. - We have heard this so many times and it seems like they are always getting behind by the day.

Sounds like this whole thing is an insane 30-year effort by some engineer who couldn't get over the discontinuation of Clippy.

right, google has maps, they should call the bot Uncle Traveling Matt.

On logic it cannot handle the Dumb Monty Hall problem at all:

https://g.co/gemini/share/33c5fb45738f

Incredible. Gpt4 spots that the door is transparent and that changes things but has this great line

When you initially pick a door (in this case, door number 1 where you already see the car), you have a 1/3 chance of having picked the car

(Asking it to explain this it correctly solves the problem but it's a wonderfully silly sentence)

Edit - in a new chat it gets it right the first time

This is not convincing though that gpt4 actually understands the problem. Here's a slight variation I asked and it fails miserably.

https://chat.openai.com/share/22a9027f-a2c1-428a-94a2-8fd918...

I wonder what lends itself it answer correct in one situation but not the other? Was your question previously asked already and it recognized it whereas my question is different enough?

You could say it doesn't "understand" anything really.

That's what I like about this problem (and similar Dumb variants of classic brain teasers). It exposes that there's not understanding, there's just a statistically weighted answer space. A question that looks a lot like a know popular topic ends up trapped in the probability distribution of the popular question.

Your link is not to GPT4, your link is to the free version of ChatGPT, aka gpt-3.5-turbo (you can tell because the icon is green, not purple).

GPT4 indeed understands your variant, as evidenced here: https://chat.openai.com/share/46916f21-c469-4e93-9bed-bbd18b...

Was your question previously asked already and it recognized it

Given that LLMs training data consists to a large extent of "stuff people have written on the internet", and The Monty Hall Problem is something that comes up as a topic for discussion on the internet not entirely infrequently - as well as having a wikipedia page - yes, I suspect that the words describing the monty hall problem being followed by words describing the correct solution appeared often in the training set, so LLMs are likely to reproduce that.

Words describing a problem similar to the monty hall problem are going to be less common, and probably have a lot of discussion about whether they accurately match the monty hall problem, and disagreement about what the right answer is. LLMs will confabulate something that looks like a plausible answer based on the language used in those discussions, because that's how they work. Whether they get a right answer is probably going to be much more up to chance.

It's a bit random, which doesn't help, and different interfaces have different system prompts.

I repeated your question a few times and it got it wrong once, and right the others. It repeatedly mixed up who was supposed to be the host.

Here's a quote

In the scenario you've described, you've initially chosen door number one, which you know has a car behind it because the doors are made of transparent glass. Since you already know the contents behind each door, the classic Monty Hall problem's probability-based decision-making does not apply here.

That is not the Monty Hall problem, it is a trick question based on the Monty Hall problem. It's a reasonable test, and I see GPT-4 recognizes the problem AS WRITTEN, and perhaps "the Dumb Monty Hall problem" is some generally accepted standard that I haven't encountered before.

edit: "AS WRITTEN"

"Understands" is too strong of a word, more that it recognizes the problem as written. Here's yet a slight variation - just as simple - but changed enough it now is wrong.

https://chat.openai.com/share/22a9027f-a2c1-428a-94a2-8fd918...

That's not GPT-4

I saw it posted on Twitter some time last year. If LLMs are to be useful they should be capable of answering novel questions. This is only a trick question for an LLM. 2 of the 7 sentences plainly state the answer.

Hilarious!

(For comparison, here's GPT-4 getting it on first try: https://chat.openai.com/share/9e17ed25-d9ea-4e72-a9d8-a139ca... )

yes, although gpt-4 has been finetuned on this one

My understanding is that gpt4 is better at this than 3.5 and it seems to get it pretty reliably. One thing that's interesting to do is to imply the answer is incorrect and see if you can get it to change its answer. If you let it stop answering when it's correct, you get the Clever Hans effect.

This is with regular gemini or with the paid gemini advanced?

Regular

Paid version is no better at this https://g.co/bard/share/c8503017ef9e

how's it do with the trivial river crossing problem? (farmer fox chicken and grain need to cross a river in a boat big enough to hold them all) ChatGPT-4 can't do it.

https://g.co/gemini/share/c4e5634a2e2d

Not terrible. It gets the answer wrong, but reminded of the crucial twist it gets it correct, durably. If you're too condescending it will give up and ask what the hell you're looking for

In the scenario you presented, where you initially know the car is behind door 1, switching to door 2 still gives you a higher chance of winning the car.

That was funny.

This is pretty funny, though to be honest, I skimmed the question and would have answered the same until I re-read it with your prompts.

GPT-3.5, DeepSeek-Chat, and Gemini Pro all got it wrong. Only GPT-4 gets it.

This is hilarious.

censorship is more annoying (have to ask multiple times about medical topics)

I think there's a chance for some country to become a center of healthcare simply by allowing AI that gives medical advice. Especially if a country already encourages medical tourism, this might be the next level.

The risks involving hallucinations are too damn high still, and may always be.

I had a similar line of thought with AI therapists. It could be massively beneficial if perfect, but the risk in seriously messing with someone's well-being is significant and shouldn't be handwaved away.

The risks involving hallucinations are too damn high still, and may always be.

Yes, but I think in the limited realm of people who otherwise wouldn't get any advice at all, I think LLMs could play a useful role. American healthcare is so prohibitively expensive that many people with potential medical issues will avoid seeing a doctor until it is too late to do anything. Checking in with an LLM could help people at least identify red flags that really can't be ignored, and it would be more helpful than WebMD telling you that everything is cancer.

Otherwise not getting advice at all goes way beyond healthcare being too expensive, it could be that you don't get an appointment and it could just be that you don't have the time or energy.

I think we may see society settling on feeling comfortable with their doctor using an AI, but not being an AI.

I will take nearly free GPT6 therapy any day. Can’t be worse than some human therapists I’ve had and the price is right

I think there's a market for LLM-based therapy that is reviewed/tweaked by a human therapist in between sessions. That would give people the assurance that things aren't going way off the rails.

OTOH, I could also see a market for an offline, fully private LLM therapist. That way you could say anything without concern about being judged. These would probably need to be tweaked to be different from regular therapists, who normally interact with people who have somewhat more of a filter, since they would fear being judged. If people opened up to LLM therapists in more transparent ways, the LLMs might not respond in the way a human therapist would recommend (having seen very little data on such interactions).

The privacy aspect is what made me connect local LLMs with therapeutic use. But yeah, AI as it stands today just isn't safe enough. We need nine 9s of safe usage here (99.999999999% safe), or more, for me to actually feel comfortable with the technology.

It would also open up some legal gray areas if it were to happen. Would psychotherapist-patient privilege apply to an LLM box? If the state has a zero day granting them access to a seized "therapy box," it could be more revealing and damaging than anything a human therapist could provide police.

so I have done a lot of transcripts, coding, one versus the other (gpt4 vs ultra). Often simple prompts like refactor this code or convert this python to typescript.

My experience is that Gemini ultra understands the code better, but doesn’t always give me as complete of results (they seem to limit output length more)

Beyond that it is very smart. I’ve had to tell me this code packs 12 bit integers into different parts of an array using the following encodinv. Which most people would not figure out from the code as written. It then will say you can actually do that with this neat little translate function that you never knew about.

It will then get the code very slightly wrong. if I tell it not to use the cool function, it will actually get the code, right.

GPT4 has no idea what the code is doing but can clean it up a bit.

so it’s like ultra is too clever by half sometimes.

That said, I have fed thousands of lines of code into both of them and asked them to refactor it, and neither one of them made more than one error. All code otherwise compiled and worked first try.

this is code that can’t possibly be in their training sets, it’s basically handwritten python that was written, based on an old x86 binary that nobody has the source to anymore. so the code is basically garbage, and what it is doing doesn't, say, appear on GitHub in a nicer form.

Both gpt4 and Gemini ultra were able to make the code, look like clean, idiomatic, python, or typescript without any work on my part. except for the one bug each. which, for 8000 to 10,000 lines of code is not bad.

The GPT4 inserted bug was more insidious. It changed (the equivalent of) (uint8)'a' to (uint8)'a' - '0' for no reason when converting some code to typescript. Not sure if that is representative of anything

if I do the same with any of the other “top” models ( from can ai code, etc), most of them can’t even generate correct working code for all the input, let alone good code. Most aren't even close.

That said, I have fed thousands of lines of code into both of them and asked them to refactor it, and neither one of them made more than one error. All code otherwise compiled and worked first try.

I would be very interested to get a more detailed scope of what you did here. Feeding thousands of lines of code into GPT4 and getting a near perfect refactor does very much NOT sound like my experience, but it seems highly desirable.

Yes, I've observed the same phenomenon. The more detailed my prompts are, the more errors GPT tends to make. I use it as a partner to discuss implementation ideas, before I start coding. That works very well, because gpt and I usually find somethings, that I missed at a first glance.

But coding with gpt or co-pilot is too disrupted for me.

The more detailed my prompts are, the more errors GPT tends to make. I use it as a partner to discuss implementation ideas, before I start coding.

This is interesting; I do a third thing that's the exact opposite of both of you.

I generally know the architecture I want, so I just slowly spoon-feed it to GPT-4 and have it build each of the components with the right interfaces to fit together. Usually, building each component requires a few rounds of hand-holding it through various refactoring. Its first draft of code tends to be, for lack of a better description, totally lacking in any creativity or elegance --- generally bland, insipid, and mundane stuff.

But it'll gladly take refactoring directions to get it into shape, so it's still very quick to get a whole bunch of components written up. Then I just glue everything together on my own. And the whole experience is still way better than writing a bunch of boring boilerplate code by hand.

Seconded. I am also keenly interested in learning more. It would be a great boon on my current project to be able to let the AI refactor mountains of legacy code (while taking steps to ensure the results are trustworthy and equivalent, of course.)

I'm happy to share transcripts if you email me.

I'm not sure what you are feeding it. My scope is closer to a file at a time of fairly mostly self-contained python or C and asking it to clean it up or convert it to typescript.

I can imagine lots of scenarios it doesn't work well.

In mine, it does, and I have plenty of transcripts showing that :)

With GPT 4's context window I've had no luck with more than 2k lines of code in a single prompt.

what it is doing doesn't, say, appear on GitHub in a nicer form

Surely many similar bits of code appear? I'd have to see your problem though.

I'm thinking of ETL type code which has to mangle various custom exotic encodings into others and vice-versa.

I think that logic is the most important thing to look out for though.

I just tried some logic puzzles on the Advanced model, and was not impressed. It feels much worse than paid ChatGPT.

keep in mind that all the common logical puzzles have probably been tried hundreds of times by chatgpt users and are now part of the training set.

Maybe, but GPT4 got these puzzles right at the launch.

I tried the "pull or push a glass door with mirror writing".

I feel it's a huge difference between GPT-4, which seems to be able to reason logically around the issue and respond with relevant remarks, and Gemini Gemini Advanced which feels a lot more like a stochastical parrot.

Gemini quickly got confused and started talking about "pushing the door towards yourself" and other nonsense. It also couldn't stay on point, and instead started to regurgitate a lot of irrelevant stuff.

GPT-4 is not perfect, you can still hit things where it also breaks down.

it says in the graphs listed on the announcement it performs worse than GPT4 on reasoning benchmarks.

Question for you -

better at incorporating search results in its answer vs gpt-4 bing

How are you getting it to incorporate search results in its answers?

I can't for the life of me get it to find any real-time external data except for the 5 official 'extensions' under settings, which are for Flights/Hotels/Maps/Workspace/YouTube.

Did you mean that, or have you found a workaround to get Bard to actually search on Google?

You have to click the "G" icon in its response to "verify answers with Google".

Okay, but to clarify:

- This is not Gemini performing a search. - This is Google providing a layer of ass-covering in case Gemini produces a factually incorrect reply.

Right? I am looking for something like ChatGPT with Bing - it will run a query, pull back results, and operate on them, all dynamically within the system.

Gemini doesn't seem to do this, no matter how you try to wrangle it.

better at incorporating search results in its answer vs gpt-4 bing

That's odd, I had Gemini repeatedly tell me it couldn't search the web in response to my question (that I was trying to get it to answer from the context I provided).

I haven't tested asking it explicitly to search, but it does incorporate answers that are very recent and unlikely to be in it's training dataset

- slightly worse at coding

- Worse at logic (e.g. it contradicts itself in a single sentence, and is unable to figure it out)

That takes most of my use cases. "logic" is what makes GPT often feel like AGI.

Use cases like translation seem less impressive in comparison to logical reasoning because it feels like it's just something where you can throw a lot of data and it'll do better. While with logical reasoning it still feels like model "learned" something more than pure pattern matching

Exactly. That’s also why I find low parameter LLMs to be useless for me personally. I simply cannot trust anything that is so very illogical. GPT-4 is the first LLM that crossed into usable territory for me. Even GPT-3.5 was a fun toy and maybe good for summarization, but that’s it. It will be revolutionary when GPT-4 is cheap enough that thousands of calls don’t cost much. To imagine an LLM much smarter GPT-4…the future is bright.

One thing it doesn’t do which matters to me is properly rendered math like ChatGPt does. I asked it for mathematical formulations and so far the answers are not as good as ChatGPT 4.

censorship is more annoying (have to ask multiple times about medical topics)

Worse at logic (e.g. it contradicts itself in a single sentence, and is unable to figure it out)

Heh. I notice the same correlation between these two variables in humans too.

censorship is more annoying

That's a general problem with AI. There is a lot of censorship in certain areas, likely to fight bad publicity, but I think the outlook is that this leads to taboos, prudeness and big companies deciding what is ethical and what isn't.

I recently tried Bard and ChatGPT on topics that are classical philosophical dilemmas and while ChatGPT certainly did have some troubles too, Bard was absolutely horrible and always took the conservative - as in never arguing for any freedoms that aren't yet widely established views. I am talking about classical examples regarding the limits of utilitarianism. "What would be best for society, what would be best for the individual?" style questions. Even when trying to create a bias by changing examples, for example adding volunteering for things Bard strictly kept its opinion, despite originally stating that the general topic is two sided, that it's an open question, etc.

I think this is a danger of such systems. By their nature they reinforce status quo, because they base off is widely accepted at the time of their inception. If history would have been different I am sure it would argue for slavery and against women being allowed to vote, simply because that used to be the more common view point. It would have likely argued that homosexuality is unethical. Maybe it would even have tried to explain how it doesn't create children, but spreads diseases or similar things. At least that's the level of arguments it brings now.

This isn't just about ethics. Even if you think about IT and programming. I think this could give already invented programming languages, styles, methodologies a significant edge. Unless you are Microsoft or Google and are able to bias it to whatever you want to see more of.

So this combined with the costs meaning that only people or institutions with significant (financial) power create those rules does look a bit bleak.

I miss the last decade when the thought experiment about self driving cars were about whom to drive over in a ridiculously unlikely scenario.

- slightly worse at coding

Is GPT-4 what one uses for coding? I thought specialized models were best?

I would imagine Google is focused on building a model that expands the types of things people associate with Search.

I feel the same. And it feels slightly faster?

Finally a worthy competitor to GPT-4

Don't install the Gemini android app, it'll break Google assistant and you won't be able to operate Nest devices from your phone.

Given that Gemini Pro is already the third best [0], I would expect Gemini Ultra to become the best soon.

[0] https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboar...

To add to your pros and cons:

Pro: feels like it has a lot more stylistic flourish and expressiveness than chat GPT. At least in my testing so far.

Con: maybe I'm unreasonably sensitive to design, but I find chat GPT to be very to the point in a way that makes it almost perfect as a product that puts text for an and center. By contrast I find Bard / Gemini ultra to be cluttered, and too material designy in ways that are unhelpful.

For medical topics, I recommend Anthropic Claude. Don't want to jinx it, but so far, I've been able to get actually helpful medical information from Claude where ChatGPT just says "I'm sorry Dave, I'm afraid I can't do that"

I found it worse at coding than DeepSeek Coder on the couple of prompts I tried.

I tried it with rust, it's so bad it's simply not usable, it hallucinates methods and even the syntax is wrong at some points (it especially can't get error types correct or seems). Gpt4 doesn't handle rust perfect either, but the code it produces is good enough to only need some touch-ups, it can explain and fix wrong use of (we all known) libraries and even gets async code. But it's especially great for boilerplate, saves so must typing.

I was hoping openai/gpt4 would see some healthy competition, but Gemini doesn't seem to be it. Of course, the rust language might be an edge case.

Did I get this right?

Bard – old name of their generative AI service, to be called Gemini

Duet AI – old name for their generative AI in Google Workspace, to be called Gemini

Gemini – three things: 1. the name of their models (like GPT). 2. the new name of their free service (like ChatGPT), gives access to Pro 1.0 but not Ultra 1.0. 3. the new name of the Generative AI tools in Google Workspace.

Gemini Advanced – the name of their paid service (like ChatGPT premium), gives access to both Pro 1.0 and Ultra 1.0

Ultra 1.0 – the first version of their big model (like GPT-4)

Pro 1.0 – the first version of their smaller model (like GPT-3.5)

Google One AI Premium – the subscription that you need to buy to have access to Gemini Advanced

Google One Premium – the old version of the subscription, does not include access to Gemini Advanced

Google app – the mobile phone app, which includes either Gemini or Gemini Advanced

Google Assistant – like Siri but hard to define what it is

Google AI – a generic name for all their AI products

It’s sad that a company of very smart people can’t figure out coherent naming.

Can you imagine Apple causing confusion like this? I know it’s not a like-for-like comparison, but everything Apple does it seems like they have a grand strategy that’s clear for everyone to see. Things build up in a modular way to fit a big puzzle.

Google, on the other hand, constantly makes up things on an ad hoc basis.

Maybe not quite the same, but I will point out that “Apple TV” and “Apple TV+” are not just two distinct products, but are in fact entirely different categories of product.

One is a piece of hardware akin to a Roku. The other is a streaming service akin to Netflix.

Thanks to Disney, "+" is the industry term for "Streaming"

One company using it doesn't make it the "industry" term. It was the copycats that followed like Apple and Paramount that made it into an industry term. It kind of makes sense for Paramount, but really doesn't for Apple. Par for the course for Apple TV+, barely baked content on a barely baked poorly named service.

I didn't think Apple copied anyone. Pretty much all their subscriptions have had plus like iCloud+ and Fitness+.

That was them copycatting.

https://en.wikipedia.org/wiki/Disney%2B "Disney+ was launched on November 12, 2019"

https://en.wikipedia.org/wiki/ICloud "In June 2021, Apple introduced iCloud+..."

https://en.wikipedia.org/wiki/Fitness_(Apple) "Apple Fitness+ is an ad-free video on demand guided workout streaming service announced during Apple's September 2020 Special Event"

Apple has been using + way before Disney. Apple care+, iPhone 6 Plus and I’m pretty sure they had another plus product or service in the 2000s or earlier that I’m forgetting right now. Edit: it was the Mac plus from 1986.

I’ve always associated “plus” with Apple, not with Disney.

"Plus" has always meant "bigger" or "more".

Odd, I’m not a huge fan of many things Apple is doing these days, but I’ve consistently found their homegrown content to be very high quality.

I've watched 3 shows on Apple TV+ - Extrapolations, The Morning Show, and Ted Lasso. All of them start very promisingly (as in, the premise is good, the initial setup of actors/sets/etc. is promising, there is lots of potential ways it could go) but they're all superficial, get very predictable, and then the quality steeply goes downhill after a point. All three could have been way better, and had vastly more promise than what they delivered. It was enough to make me cancel the service.

Copycats did it... thanks to Disney.

Disney+ Paramount+ AppleTV+ Discovery+ in addition to to 4 or 5 more https://en.wikipedia.org/wiki/List_of_streaming_media_servic...

It's an industry term now.

Thanks to Disney, "+" is the industry term for "Streaming"

Google+ was too far ahead of its time!

I use both and I haven’t found it too confusing, to be honest. I just think of it as Apple TV (streaming device) gives access to Apple TV+ (streaming service).

It would just mean there are somethings you found confusing but others don't and vice versa. But you come swinging like you have done survey of Fortune 500 companies and Google stood out in naming confusion.

I don't have to survey Fortune 500 when things are plainly obvious.

Hangouts, Allo, Duo, Buzz, Google Talk, GChat, Inbox, Messenger, Messages, Bard, Gemini, etc. Who else has a track record of chopping and changing like this?

things are plainly obvious.

To you. You are projecting.

I guess there are people for whom Google’s constant changes do make sense. One has to simply keep up with their frequent announcements!

https://www.apple.com/apple-tv-app/

Apple TV (app) gives access to Apple TV+ (streaming service)

Apple TV is mostly associated with the hardware streaming boxes they've been releasing for a long time, Apple TV the app is just an app that performs a similar task on non-TVs. TV+ is available on both of them. Still, there's a bit of confusion around the naming.

Hello, would you like to watch Apple TV+ or Apple TV Channels on your Apple TV app on your Apple TV?

Apple TV is also an iOS app, macOS app, tvOS app, and [other generic TV OS] app which allows you to access Apple TV+ content if you have a subscription, but otherwise lets you access services connected to your Apple TV [hardware].

Actually it's more that Apple TV is both a piece of hardware and an iTunes-like service, while Apple TV+ is a subscription service akin to Netflix.

The Apple TV hardware and the Apple TV app on your iDevice can both be used without paying a subscription. The hardware has all other streaming apps a la Roku, and both it and the app on your iPhone can be used to purchase and watch TV shows and movies.

If you ever take a customer survey for Apple, for the "which Apple products do you use?" question they always have to write something like "Apple TV (a streaming box that plugs into your TV)" and "Apple TV+ (an online streaming service)" because they know the names are so confusing.

naming things is hard

is one of the cultural difference between a computer-first logical mindset contrasted with the mathematics-first mindset

the mathematics first mentality does not properly recognize the importance and the difficulty of having good names for things, whereas the computological view recognizes both: the importance and the difficulty

Insofar as naming things is basically applied category theory, I'd trust a mathematician to name a set of products more tersely and understandably than I would an English major.

A math major named this monitor:

https://www.viewsonic.com/eu/products/sheet/VX2758-2KP-MHD

…I don't understand? VX2758-2KP-MHD is a fairly commonly used word in the Polish language.

It is even my surname!

It's unique and easily searchable when I want documentation specific to that product. And when deciding which monitor to buy, I can focus on comparing the technical specifications rather than relying on some heuristics based on how the name "feels" to me.

I would rather name it the 2020-27in-QHD-IPS-144Hz

I'm running a contest in my head: "which academic discipline can transfer more information with less text"

so far it's a close race between philosophers and mathematicians. I'll take your comment as a vote in favor of mathematicians

I don't think philosophers should even be in the running for this title. They write way too many words when few do trick.

Who do the few words trick? Why do the philosophers wait for that before writing their many words?

Classical musicians. Lots of information with very few symbols nevermind text.

naming things is hard

Of tangential interest: I recently heard that Faraday, while discovering new electromagnetic phenomenon, then turned to either a linguist or classicist for help in assigning/inventing terms for them. (I cannot find a link for this just now, so consider this heresay.)

MS does the same thing... Look at Xbox naming history

https://recordhead.biz/history-of-microsoft-xbox-consoles/

It's really not that bad. Nintendo's got more confusing SKUs in some of their product lineups than that.

The only thing that was stupid with Microsoft's naming was this latest generation that they call it Series S and Series X, which is bad for 2 reasons:

- No one knows what to call them as a general term. You can say "this game is for PS5" but for them it's like "this game is for Xbox Series"? I guess they just want you to call it "Xbox" cause that's all it says at the top of the game cases now.

- They just came from selling the One S and One X, which was a mid-lifespan hardware update, the S being a smaller formfactor Xbox One and the X being a spec bump. Confusing that they continue to sell and S and an X but it's a whole new console.

They should have already learned from Nintendo who made this mistake several times with the 3DS (which many didn't realize was an entirely new but backwards compatible system from the DS), new 3DS (yes that was an actual system's name that had exclusive games that couldn't be played on the normal 3DS), and Wii U (which everyone thought was a tablet controller for the Wii)

As someone without an Xbox, nor friends that play Xbox, Xbox's naming is terrible and confusing. It used to not bother me when I followed Xbox news and was "in the know", but now it's irritating.

Xbox -> Xbox 360 -> Xbox One -> Xbox Series (?). I still don't know whether S or X is the "good one". Compare it to Playstation: PS1 -> PS2 -> PS3 -> PS4 -> PS5. The upgraded line is "PS $number Pro".

Someone can tell me they have a PS5 Pro, and I know what they mean. They could tell me they want a PS6 and I know what they mean, even if the PS6 hasn't even been announced yet.

Someone tells me they have an Xbox One X and my eyes glaze over. Prior to now, that means nothing to me. I don't know when the Xbox One came out, I don't know if it's their newest line, I don't know if X is the Pro or if it's the budget. The S and X may not even indicate pro and budget, but I think they do.

At least Nintendo's names are kind of cute. It's still silly, but at least Wii or Switch is kind of endearing. Xbox Series X sounds like they let an edgy teenager name it; having X on both ends reminds me of the days of xX420ShadowRanger69Xx usernames. Also doesn't make a clean acronym; XSX is both hard to say and makes me think more of SXSW than Xbox.

S and X

The S naming has been consistent since the 360: it's the small one.

It's not a stretch of anyone's imagination that the other one is the bigger one (I mean that's t-shirt sizing), nor that it exists in the same generation as the S and therefore is not bigger just for the sake of taking more real estate under the TV.

The one they nailed though is Xbox One X, which is recursive.

I OP just finished telling you that is _is_ a stretch to their imagination. Thats great that you understand it though.

They need a model E, like Tesla

I think the Xbox naming is all because the PlayStation came out before the Xbox and if Microsoft would have used a similar version-incrementing naming convention they would have always been one version behind Sony. Thus the second generation of the Xbox being the 360, which competed with the PS3: the 360 had a “3” in its name, so to a consumer’s mind they were comparable.

Ah, that makes some degree of sense. They could have done degrees of rotation like skateboard tricks, but then I guess the Xbox One would have been Xbox 720 which sounds like it's a 720p console.

I agree that that's why they didn't do a simple XBox 2, but I have never in the past 20 years had the thought that 360 starts with 3, so it's the competitor to the PS3. But even with that reasonable limitation guiding their decision, going from OG to "360", "One", and then "Series" is a pretty huge failing to establish a consistent branding. And "Series" in general doesn't give a natural sounding way to refer to this generation as a whole.

At least Nintendo's names are kind of cute. It's still silly, but at least Wii or Switch is kind of endearing.

Those were not the ones I called out as bad. Those are good because when they came out they were unique and memorable.

Let me list out the following consoles:

Nintendo DS

Nintendo DS lite

Nintendo DSi

Nintendo DSi XL

Nintendo 3DS

Nintendo 3DS XL

Nintendo 2DS

New Nintendo 3DS

New Nintendo 3DS XL

New Nintendo 2DS XL

There's technically only 2 generations of Nintendo consoles in there, but the DSi had some exclusive physical games that were sold in store, and couldn't be played on the DS. And the new 3DS had some games that couldn't be played on the original 3DS.

Some of those are pretty bad. "New Nintendo 3DS" particularly so. Man, I didn't realize how much of the DS line I missed.

Some of them I don't mind too much. The XL and Lite SKUs make sense to me, presuming they mean what they appear to.

3DS makes intuitive sense to me, it's clever.

There's a separate conversation to be had about whether it's too many SKUs (probably) but name-wise I only really hate "New Nintendo" and DSi. DSi only because it's far less intuitive than the others.

The naming on iPhones and Macbooks is terrible, and confusing now that the M3 Pro offers quite different CPU and bandwidth limits compared to M2 Pro...

What? These are the easiest products ever. Compared to the crap that every other company generates. Like the Pixel 6 vs 6 pro vs 6a vs “Fold (no number)”

The iPhone is literally just iPhone <Number> with Pro for high end, or no modifier for low end. Add “max” for big screen. The only confusion maybe is “max” isn’t obviously referring to screen size.

iPhone 15, 15 Pro, 15 max, iPhone 15 Pro Max.

Macs are the same way. I don’t think it’s fair to say it’s confusing that “M3” processor has different specs than “M2” processor.

Beyond that, Mac laptops are Pro vs Air, defining how powerful vs portable they are with associated screen size variant 14, 16 and 13,15.

M2 Max, M2 Ultra. Which is better?

MagSafe means two different products.

The current MacBook Air is thicker than an older MacBook.

I’m not even sure what a “pro” phone is, but okay.

The iPad lineup has been a total mess for years.

I’m not saying these names are impossible to decipher, but they do require some research.

To be honest I think it’s clear that something called ultra is better than something called max. If it were called super-max or turbo-max or something I’d see your point.

The current MacBook Air is thicker than an older MacBook

I feel like you’re just looking for things to be mad about here; it’s thinner than _current_ MacBooks.

I’m not even sure what a “pro” phone is, but okay

Okay this is just ridiculous. It seems you would be unhappy with any naming convention other than “iPhone Good”, “iPhone Better”, and “iPhone Best”.

There is no current MacBook, but you’d be forgiven for not knowing that since the names are confusing.

M2 Max, M2 Ultra. Which is better?

I see your point, I do agree that the name of their processors were too marketing team driven.

The current MacBook Air is thicker than an older MacBook.

The current air is plenty thin to be called “air”, and they haven’t made a “MacBook” since like 2016. It’s not confusing here IMO.

I’m not even sure what a “pro” phone is, but okay.

It’s the line with overall better specs. The last 20 years of tech products have solidified this definition is. Not a new concept.

The iPad lineup has been a total mess for years

Yes this is absolutely embarrassing for them. I presume they have some BS market segmentation reasoning. Looking at their website, I can probably explain the target market for each one, but it’s still a disaster. They should dramatically redo it, and designate the really cheap one as “iPad for education” to totally segment it out, so it can be “iPad small screen, medium screen, large screen, and iPad with Mac processor and pro tier features”

iPhone 15, 15 Pro, 15 max, iPhone 15 Pro Max.

it's actually 15 Plus rather than 15 Max, which i'm sure you now see is a bit confusing

Getting it wrong doesn’t help my overall argument (ha!) but admittedly plus is even more clear than max that’s it’s just a screen size bump.

pixel is super easy for me to understand at least, and has been consistent for like, forever. 6 is the "standard" model, 6 pro is an upgraded model and the 6a is the budget model. including the fold in this argument is irrelevant as its an entirely different product.

Messy branding is now par for the course at Apple. The iPad line alone is something out of a Dell catalog: iPad, iPad 10thgen, iPad Air, iPad Mini, iPad Pro.

a comment from another thread this morning- https://news.ycombinator.com/item?id=39300741

I mean, Air, Mini, and Pro are all distinct form factors. It's confusing only to the extent that you might not know which one is "best" from a CPU/memory/storage POV. But Apple has succeeded for iOS products at least at making that distinction mostly meaningless: pick the form factor you want, then pick the storage capacity and sometimes the color, you're done.

Is the Air, the Mini or the no-named one the smallest?

My answer was that it was obvious but I went down a rabbit hole of comparison pages and learned the current Air is higher end than the current iPad.

Higher end, but is it smaller or bigger? And what about the Mini?

The iPad Air is not a distinct form factor from the iPad.

It's the org chart. Google doesn't have a centralized marketing department that governs all of the company's products. Marketing is handled at the PA level (or sometimes even lower).

Likewise for engineering, Google is organized into Product Areas (Geo, Search, Cloud, etc.), which also explains why one product would get some feature that would really make sense integrated into another... but it never happens.

Google is exceptionally good at making its products be near-perfectly reflective of its internal organization scheme. So reflective you can brush your teeth with it.

I'm often a broken record about this on HN - but IMO the PA organizational structure is a strong inhibitor to Google's success and ability to create coherent suites of products.

On the flip side, which large tech companies that have have an equally high velocity of shipping do this well?

Amazon?

Mmm I don't really agree with the premise of the question at all. In my experience Google doesn't ship significantly faster than any other FAANG.

Meta for example ships extraordinarily quickly (see: Threads) but their products are considerably more tightly integrated and demonstrate an ability to leverage across the ecosystem (see: Instagram-Threads integration) that Google has trouble with.

More to the point (and extra points in favor of Meta for this): Google's apparent product velocity is a bit deceptive? The company ships a lot of ill-considered product. Is it superior product velocity if the product is consistently half-baked (and maybe more importantly: will die before it ever becomes fully baked)?

If you put those two factors together and consider product velocity as how quickly a company ships stuff that actually sticks (as opposed to a simple exercise in how quickly one can release code), Google's product velocity is IMO substantially inferior to all of FAANG. Meta, Apple, Amazon, and MSFT at this point are generating sticky product at a substantially greater pace.

bingo. Google has long been reduced to ship their org chart. Visible every single time - hangouts to chat to whateverthe hell

Compare it with Microsoft's copilot branding. It's simple, both casual and business people can understand what it does. Also appending it to other services like github or office adds more value to them.

Literally a few days ago:

https://www.theverge.com/2023/11/15/23960517/microsoft-copil...

Then you have also under Microsoft https://en.wikipedia.org/wiki/GitHub_Copilot (not to be confused with Copilot X).

Can you imagine Apple causing confusion like this?

After 14 years of using Siri, I can't imagine Apple developing any competent AI tools in the first place.

My impression is they haven’t focused much on general purpose AI. But Apple actually has a lot of very good AI models sprinkled throughout its products. Just a few that come to mind…

- Tap and hold an object in Photos and it will figure out how to separate it from the background for you

- AirPods noise cancelation

- iOS 17 autocorrect is based on a transformer model and works noticeably better

- Optimized Battery Charging, which learns your charging habits and tries to delay putting a full charge into your battery until just before you unplug, in order to avoid damaging the battery

- Detection Mode is an awesome accessibility feature where you point the camera at something and it will describe what it sees

Apple calls all of these things Machine Learning instead of AI and they are all optional features within an existing product. Seems like a very deliberate strategy. But they are utilizing the latest techniques. CoreML and the M series chips are also very competent at training and using AI models.

Maybe the reason Siri is stuck in the dark ages is because it would be entirely AI dependent. They could have a “no generative AI” mode but nobody would use it. I’m guessing Apple is looking for a breakthrough in how to prevent it from hallucinating / lying.

It seems to be something all large software companies struggle with. The product team comes up with some cool or interesting thing, gives it a decent name, then some marketing manager trying to justify their position decides everything needs to be rebranded under the same umbrella, and a wave of product renames gets started, but never fully finished. Then, two to three years later, some marketing manager needs to justify their position again, and so decides to rebrand everything under a new name, even though the last name change hasn't even been fully finished yet.

I’d like to direct you to the current lineup and features of Apple iPads and the various pencil/pen options.

I take it you didn't see the story today about how iTunes will no longer be used to play music. (tunes)

All the non programming intelligence in a place like Google likely goes into figuring out how to protect and expand one's turf, which will explain this emblematic mess.

Maybe they are just book smart..

Why bother intelligently naming things when you're likely to just kill them off in a year because they don't drive advertising?

This has to be what it's like at Google. You have marketing on one end of a table and developers in another county.

That's because this is an org chart, not a cohesive product.

I'm a Googler, and I think this is not only the right move but I've of the better names and renamings that the company has done.

First, Gemini is just a better sounding name than Bard.

And then, few users are going to care about the difference between the model and the app that lets you use the model.

If Google kept this distinction they would have inevitably had to either come up with a new name for a new model, which would be needlessly confusing (when is something a new model would get tricky at times), or just call all models Gemini, again for little utility to users.

Now they can just call all the generative AI Gemini and be five with it. Bard becomes the Gemini chat interface. Duet is Gemini integrated into docs. The Gemini model can just get version numbers.

It's much simpler and nicer sounding to boot.

You say it's a good naming choice, yet you called it Gemeni, Gemini and Gimini.

I hope you're trolling, and I've missed the joke.

Additionally, the package name of the "new app" is com.google.android.apps.bard and the privacy policy is at https://support.google.com/bard/answer/13594961

Was this a last minute rename?

Edit: I thought I was going crazy, but it seems you have just edited the comment now, but left a couple of Gemeni around :)

You say it's a good naming choice, yet you called it Gemeni, Gemini and Gimini.

English isn't everyone's first language. I'm fairly certain you've butchered spellings of Nouns originating in other languages.

This only reinforces the opinion of it being a bad naming choice (on top of the confusing product vs technology part)

There are few other "Gemeni" around this comment too (from other users).

Gemini means "The Twins", but why such name ?

I'm a Googler, and I think this is not only the right move but I've of the better names and renamings that the company has done.

First, Gemeni is just a better sounding name than Bard.

And then, few years are going to care about the difference between the model and the app that lets you use the model. If Google kept this distinction they would have inevitably had to either come up with a new name for a new model, which would be needlessly confusing (when is something a new model would get tricky at times), or just call all models Gemeni, again for little utility to users.

Now they can just call all the generative AI Gemeni and be five with it. Bard becomes the Gemeni chat interface. Duet is Gemeni integrated into docs. The Gemeni model can just get version numbers.

It's much simpler and nicer sounding to boot.

It would be a little bit more confidence inspiring if you could get the name correct as a Googler.

Unless Gemeni is the top secret internal name inside Mountain View

I don't work on it, but thanks for the kind correction!

By Jiminy, I think, uncertainly, that Bard is better and shorter, though maybe there were some copyright issues with other companies and it was easier to change the name.

Cache invalidation and naming things are two of the hardest things in Computer Science, and so on.

By the way, there is still a Gemeni in there (“Bard becomes the *Gemeni* chat interface…”).

Gemini is free but not as good. You get gemini pro for it though, for free.

To get the good staff you need gemini ultra. And to get that you need to pay for gemini advanced.

Yeah, not confusing at all.

Selling gemini ultra to get gemini ultra is too simple for google.

You should check out the rabbit hole of Google's various payment systems/apps. GPay, Google Pay, Google Wallet, Android Pay, etc.

All the different the ways those brands (and more?) were used to describe multiple services and apps that had wildly different capabilities, sometimes varying by region, with several instances of them bringing back a previously used name for something completely different.

There for a while, Google had two different applications called "Google Messenger"

They also had an app called duo. And an app called Meet. And then Duo was renamed to "Meet", and Meet was renamed to "Meet (original)"

At the same time as Duo, Google launched Allo, another messaging app, neither of which should be confused with Messenger, Google’s messaging app for Android. Combined, allo and duo approximated the functionality of Hangouts, which was also split at the same time into Hangouts Chat and Hangouts Meet. Don’t confuse Hangouts Meet with Google Meet, Google’s current Zoom competitor. Hangouts Chat and Meet later become Google Chat, not to be confused with GChat, which is what many people called Google Talk for ten years. GChat was replaced by Hangouts less than a year before Hangouts was split into two Hangouts services. If you signed up for Google Voice, Google Talk would let you receive voicemails.

Yes, Google has a $1.7 trillion market cap, why do you ask?

Oh you're only scraping the surface of the iceberg of Google's nonsense messaging offerings. This is from 2021 but has a good rundown.

https://arstechnica.com/gadgets/2021/08/a-decade-and-a-half-...

Gemini is four things: The new mobile app is also called Google Gemini.

Except on iOS, where Gemini is integrated into the Goople app instead.

Also, while Gemini Advanced is supported in <list of countries>, this list is not the same as the <list of countries> the Google Gemini app is supported in on the Play Store. Make sure you check this before you spend money on your upgraded Google One subscription.

The new mobile app is also called Google Gemini.

Only in the U.S. for now.

The Gemini app initially will be released in the U.S. in English before expanding to the Asia-Pacific region next week, with versions in Japanese and Korean. https://www.independent.co.uk/news/google-ap-chatgpt-san-fra...

gemini is a bad name in korea...

"잼민이" (Gem-min-ee) is a derogative term for rude elementary kids online

"smart gemini" sounds really weird

My son is also named Bort.

More simply, with Bard and Duet AI names retired, it looks like this:

   Gemini Models     gemini.google.com
   ------------------------------------
   Gemini Nano
   Gemini Pro        -> Gemini (free)
   Gemini Ultra      -> Gemini Advanced ($20/month)

Where each size (Nano, Pro, and Ultra) will be versioned going forward (similar to GPT-2,3,3.5,4) starting at 1.0 today.

Yes, it's that simple, but not really that simple.

For example, Bard is embedded as a chatbot inside Google Messages (at least for some subset of beta users). Imho, this is a killer app sort of feature, but it hasn't been mentioned at all in the Gemini PR.

Also, there's now the new Google One AI Premium sub for $20/mo, which adds Gemini to the older Google One Premium sub ($10/mo). However, that legacy sub was somewhat explicitly positioned as a solution for family sharing, especially of shared data (2TB across metered Google properties). It's unclear whether the new AI Premium sub grants Gemini access to all family members.

For example, Bard is embedded as a chatbot inside Google Messages (at least for some subset of beta users). Imho, this is a killer app sort of feature, but it hasn't been mentioned at all in the Gemini PR.

I haven't seen anything about this in the public , but I imagine this will also be called Gemini, just as Gemini-integration in other products is simply being called Gemini.

If you are confused about subscriptions, here's another breakdown:

    Basic:      $2/month for 100 GB
    Standard:   $3/month for 200 GB
    Premium:    $10/month for 2 TB
    AI Premium: $20/month for 2 TB + Gemini Advanced (Gemini Ultra chatbot)

Good question about Gemini access for family members. Not sure myself.

Any Google service with a chat-like feature is 100% guaranteed to have weird splits, joins, and renames like this.

Don't worry, it will get worse.

I think this Gemini is actually a strategy from Google to launch another Messenger under disguise.

You think it is an AI, but no, it's some sort of Messenger, just that they tried to replace actual users with bots, in order to fill the emptiness of Google+.

It’s tied to their promotion system where maintenance obviously does not get rewarded, so to get promoted Googlers keep changing things in an effort to signal that they are releasing something “completely new”

bard(RIP) was powered by gemini which was powered by palm2 or something, which was powered by deepmind which was powered by google which was powered by alphabet or sth

Let s hope the name sticks for more than 2 months

I believe it went something like:

    2017: Transformers invented
    2018: BERT
    2020: Meena
    2021: LaMDa -> First model Bard was built on
    2022: PaLM
    2023: PaLM2
    Late 2023: Gemini

It probably would have been clearer if they used simple numerical versioning like OpenAI's GPT-{2,3,3.5,3}. I suppose the idea is to do that with Gemini now.

More info here: https://en.wikipedia.org/wiki/Gemini_(chatbot)

Gemini pro is order of magnitude better than GPT-3.5 Its pretty close to GPT-4.

Do you have data to substantiate this? I was happy RAGging with it but colleagues swear by GPT4's wisdom, saying things like Gemini forgets the middle of contexts, hallucinates the meaning of acronyms, and so on...

There's also Vertex AI, which is their AI platform within GCP and encompasses all of those and more...

Don’t forget the crypto implementation developed by the Winklevoss /s

I was just trying to remember how google's "ai" search thing is called earlier today.

Looks like I shouldn't have bothered, they were busy renaming it. Did someone in marketing get a promotion for this?

Naming isn't the only thing wrong. Gemini replaces Assistant on my Pixel phone. But my phone still has two different Google voice assistants, because the microphone icon on the home screen now activates "Google Voice Search" which is not Assistant or Gemini. Also, Android provides a feature to switch between voice assistants but of course it is broken in this case and can't switch between Gemini and Assistant. That has to be done in a different place. And the settings app is still full of "Assistant" settings, some of which apply to Gemini while some apply only to the now inaccessible Assistant. Unless you happen to be using Google Maps Navigation, in which case Gemini disappears and Google Assistant comes back! So there really are three different voice assistants from Google in there...

First impressions of Gemini Pro as a phone Assistant replacement are bad. It's not hands-free when triggered by the power button shortcut, apparently? When I stop talking it makes a noise like it's going to do something but it actually does nothing and I have to tap the screen to continue. After pressing the button it's quite slow to respond. I asked it to identify a plant from a picture, which Assistant/Lens can do, and it simply refused, hallucinating a long list of excuses about the poor quality of my picture, all completely false.

Overall I'm glad Google is moving this direction as it's clearly the only path forward for Assistant, which has been stagnating for many years. But the implementation so far is bad.

Can you ask Bard or Gemini to answer this question? I wonder if they'll be consistent.

My favorite is that you subscribe to Gemini Advanced to gain access to Gemini Pro.

Welcome to 2022, Google!

rebranding to Gemini was the right move IMHO and should've been done before the Bard branding. What do you want instead? "Google Assistant with Bard on Gemini Ultra 2" ? I'd prefer the "Google Gemini" branding better

What's more, this part was hilarious:

Since we launched Bard last year, people all over the world have used it to collaborate with AI in a completely new way...

Meanwhile Bard was not available in Canada and many other countries.

they should drop Gemini Advanced and call both the service and the model Gemini Ultra

Congratulations, you're hired!

Gemini Ultra seems better on logic than GPT4. Still messing around testing but here's a prompt Ultra nailed but GPT4 completely botched:

Tabitha likes cookies but not cake. She likes mutton but not lamb, and she likes okra but not squash. Following the same rule, will she like cherries or pears

https://i.imgur.com/KW6gQbc.jpeg https://i.imgur.com/OSHSvLp.png

Proof of Gemini cheating: https://i.imgur.com/eYJDFjS.png

Answer about cherries falling from the sky...

(there is no question or context beforehand, this is the first question of the chat)

I'm confused on how this is "cheating" isn't it just getting the answer wrong?

It's answering with "cherries", though "cherries" were never mentioned anywhere in the question since the task was to choose between "apples" and "pears" this time,

and not "cherries" and "pears" like the example found on the internet.

I agree with who you're responding to. Cheating, to me, would imply that there's some sort of hard coded guiding to the LLM. This just seems like typical LLM hallucinations?

It's cheating because it has memorized the answer to the puzzle instead of using logic to solve it.

I thought that is essentially what LLM's do? They learn what words/topics are associated with each other and then stream a response.

In some ways, this is proof that Gemini isn't cheating... It is just doing typical LLM hallucination

Well, sometimes. Sometimes not. https://arxiv.org/abs/2310.17567

Llm's can also do some exploring based on combinatoric play of learned elements.

Your concept of cheating is simply how LLMs work.

It is not. LLMs do not just memorize; they also extrapolate, otherwise they would be useless. Just like any ML model.

I don't understand the leap to "cheating" either. LLMs aren't abstract logic models; they don't promise to reason from first principles at all. They give you an answer based on training data. That's what you want them to do. That they have some reasoning features bolted around the inference engine is a feature companies are rushing to provide (with... somewhat mixed success).

Thank you for answering a question I had half formed in my head.

Do LLMs have logical rules built in? What makes them different to a very advanced Markov chain?

Are there any models out there that start from logical principles and train on top of that?

(Apologies for poor understanding of the field)

There's no logical rules built in at all. But Transofmers architecture is specifically trained to learn combinatoric play and rules of engagement from the data, so it can extrapolate and do cool, new things, that are not in the training data. In a way, you give them a chess board, the rules of the game, and then it can play. You don't teach them every possible board state. What's interesting is with significant amount of parameters it seems to encode more and more abstract and human-like understanding of the 'elements' at play and the 'rules of engagement' on top of them.

Edit: Not native. I'm not sure 'rules of engagement' is the correct english term here.

I understood you just fine, your English is great!

Thank you for the explanation. It seems like the LLM "plays" to learn? That's very cool, thank you again.

Do LLMs have logical rules built in?

Handcrafted by humans ? No

But it's still possible to learn such rules from the data in an effort to complete the primary objective (predicting the next token)

What makes them different to a very advanced Markov chain?

Really nothing. There's some feedback structure in the layers of the model, it's not just one big probability table. But the technique is fundamentally the same, it's Markov, just with the whole conversation as input and with billions of parameters.

This is not hard to understand. LLM can solve never before seen logic puzzles. This specific one proves that it HAD encountered this before, proving it was not doing anything emergent, but just basic remembering. Worse, it's not even reading the prompt correctly.

That's a bummer. I just made this one up which GPT-4 failed and Gemini got right but definitely need to do more testing to see what's being taken from online and what is actual reasoning.

https://i.imgur.com/3sNr3LW.png https://i.imgur.com/EIj0nZg.png

Edit: When I did your prompt it got it right on the first try fwiwi https://i.imgur.com/E3zYEca.png

I tried through API with latest GPT-4-Turbo-Preview and this is what I got:

------

Let's break down the information given:

The school is two blocks north of the pool. This means if you are at the pool and move two blocks north, you'll arrive at the school.

The convenience store is one block south of the school. Therefore, if you start at the school and move one block south, you'll reach the convenience store.

Based on the first two statements, the relative positions from north to south are: School, Convenience Store, Pool.

Thus, the third statement saying "The pool is south of the convenience store" is indeed true, based on the positions provided by the first two statements.

-----

But it starts weirdly messing up, when I asked it to explain it using coordinates.

---

Let's use a simple coordinate system to represent the positions of the school, the pool, and the convenience store. Let's assume each block corresponds to one unit in this system. We'll start with the pool as our reference point at coordinates (0,0). Remember, moving north increases the y-coordinate, while moving south decreases it.

The school is two blocks north of the pool. If the pool is at (0,0), moving two blocks north puts the school at (0, +2).

The convenience store is one block south of the school.

If the school is at (0, +2), moving one block south puts the convenience store at (0, +1).

Given these positions:

The pool is at (0, 0).

The school is at (0, +2).

The convenience store is at (0, +1).

Therefore, instead of the pool being south of the convenience store, the pool is actually located to the north of the convenience store if you consider "north" to mean increasing y-coordinates. The convenience store, being at (0, +1), is southward relative to the school but not to the pool. The pool, being at the origin point (0,0), is south of both the school and the convenience store in terms of their y-coordinates. Thus, the third statement, "The pool is south of the convenience store," is false according to the given coordinates.

---

It gets coordinates and everything right, but still somehow messes up North and South.

You know you can share conversations right?

Great example! Thanks for sharing. I tested the same prompt using a variety of GPT-4 model versions and it consistently got the answer wrong. When I continued with "use algebra/formulas to confirm your answer" it corrected the initial mistake, but I'm surprised that it's not doing that out of the box.

Gemini happened to get the yes/no answer right, but the reasoning it gave is completely wrong, so it should not get any credit for it.

With some shallow understanding on how those models work, this looks much more like usual hallucination likely due to sparse data around "Tabitha" and "Cherry" which makes a single training data much more representative. If you try some common names like "Emily" or "Sarah", it will just do the job.

If you're trying to make the case that it's cheating because it is already in the training set, then you probably should come up with different questions. This is machine learning 101.

If we want to test these beasts in logic, we should probably start using actual formalized logic, rather than English. In just one test, Gemini flopped hard, while GPT-4-Turbo nailed it. Here is my prompt:

    Below is a well-typed CoC function:

    foo
    : ∀(P: Nat -> *)
      ∀(s: ∀{n} -> ∀(x: (P n)) -> (P (n + 1)))
      ∀(z: (P 0))
      (P 3)
    = λP λs λz
      (s (s (s z)))

    Below is an incomplete CoC function:

    foo
    : ∀(P: Nat -> *)
      ∀(f: ∀{n} -> ∀(x: (P n)) -> (P (n * 3)))
      ∀(g: ∀{n} -> ∀(x: (P n)) -> (P (n * 2)))
      ∀(h: ∀{n} -> ∀(x: (P n)) -> (P (n + 5)))
      ∀(z: (P 1))
      (P 17)
    = λP λf λg λh λz
      {{FILL_HERE}}

    Complete it with the correct replacement for {{FILL_HERE}}.
    Your answer must contain only the correct answer, and nothing else.

- *GPT-4-Turbo answer:* `(f (g (h (g z))))` (correct)

- *Gemini Advanced answer:* `h (h (g (f z)))` (wrong)

Also, Gemini couldn't follow the "answer only with the solution" instruction and provided a bunch of hallucinated justifications. I think we have a winner... (screenshots: https://imgur.com/a/GotG0yF)

Such a weird test. 99.9% of humans wouldn't even understand the question, let alone be able to formulate a coherent answer for it.

Being able to answer these questions is a pre-requisite for AGI. After all, there ARE humans capable of doing that, so, if the AI can't do it no matter how hard it tries, then that means there ARE human capabilities that the AI can't replicate (thus, it isn't an AGI). And it seems like no LLM is making any progress at all in that kind of prompt, which is why I use it as a core benchmark on my "AGI-meter".

I think you'll be using that meter for a long time, then. I don't really know anyone who's under the impression that the current direction of LLMs are going to produce AGI, it seems as if you're barking up a tree most people aren't really concerned exists.

Except there’s a lot of not-so-informed people who think AGI was always here when chatgpt came out. Even more that think it’ll get there very shortly based on just bigger and bigger LLMs. Many have argued as such here on HN.

That's fair enough

Why is this relevant to the performance of a computer program? It makes sense to me that computer programs & humans should continue to be judged by different standards.

If a good chunk of humans can't pass your "general intelligence test" then it's not by definition a general intelligence test unless humans are not generally intelligent.

which is better than formulating a coherent but wrong answer

ERRATA: I just noticed GPT-4 mixed up `h` and `f`, so it also got it wrong. This is a draw. Both AIs (and, apparently, myself) are terrible at reasoning. Guess we're not curing cancer with computers anytime soon :')

Kudos for the correction, but you should really put this, by far the most important context, in your original post.

I would love to, if YCombinator allowed me. The "edit" button is missing. I've edited on Reddit and other places where I posted this test.

I've been using GPT-4 to help me understand my MSc mathematics course and I've noticed this sort of stuff more and more as I start to look at the answers, always confidently written, in detail.

Way back when GPT was just fresh on the scene I had terrible anxiety about "what is the point of my whole career or even learning any more" but these days I'm much less concerned. I'll ask it something relatively simple, like "make a sentence out of words 'a', 'b', & 'c'" for it to reply with "'a' 'b' 'd' 'e'" for me to then correct it with "oh, you didn't use c" for it to then respond "sorry, here - 'a', 'c', 'd', 'f'" etc.

Definitely an amazing complimentary tool but when they say "can make mistakes, check important..." that's essential.

This sort of issue holds with all kinds of prompts, on both platforms. I most recently (to test Bard's image generation capabilities) was asking Bard/Gemini to generate home designs using highly specific prompts -- layout of the house, materials for the facade, window placement and style, etc -- and it was shocking how frequently it would just ignore critical pieces of the prompt, and then continue to ignore when corrected.

If we want to test these beasts in logic, we should probably start using actual formalized logic, rather than English.

Why? Do you use formalized logic when discussing with other people about topics that involve logic? You know, a logic riddle or a philosophical question can be understood and processed even if the only tool you have is your native language. Formalized logic is a big prerequisite that basically cuts out the vast majority of Earth population (just like coding). Now, if you mean that in BENCHMARKS they should use formalized logic syntax, probably yes. But in addition to plain language tests.

Because once an AI becomes proficient at formalized logic, it:

1. Completely stops hallucinating, since we can demand it to internally prove its claims before showing the answer;

2. Stops outputting incorrect code (for the same reason);

3. Starts being capable of outputting complete projects (since it will now be able to compose pieces into a larger code);

4. This is also what is needed for an AI to start self-improving (as it will now be able to construct better architectures, in a loop).

That's why I argue getting the AI competent in logical reasoning is the most important priority, and we'll have no AGI until it does. After all, humans are perfectly capable of learning how to use a proof assistant.

Moreover, if an AI can't learn it no matter how hard it tries, you can argue that there is at least one human capability that the AI can't replicate, thus it isn't an AGI.

Humans mostly don't use logic, so how are you defining "AGI"? ChatGPT + plugins is pretty close to how humans think ("biased random word-association guess + structured tool")

AGI implies there are no cognitive tasks that some humans can perform, yet that this AI can not perform. Otherwise, what is the point?

I think we have a winner...

It makes me sad that the complete and total lack of an objective way to measure these products means that the coming decades will be filled with this kind of hyper-specific gotcha test made in inappropriately confident internet posts.

Literally this could have been down to one extra book in someone's training corpus, or a tokenizer that failed to understand λ as a non-letter. But no matter, "we have a winner!". It's the computer science equivalent of declaring global warming a fraud because it snowed last night.

Disagree. People are going to rely on these things, and when they make stupid but confident mistakes (i.e. they produce bullshit), they are dangerous.

An AI system that produces right answers 90% of the time but 10% of the time drives your car into a lane divider, or says "there are 4 US states that start with 'K'" or "Napoleon was defeated at the Battle of Gettysburg" is worse than useless: It's dangerous.

As long as we call it a bullshit parlor trick, no problem. But unfortunately people are making important decisions based on these things.

You're completely wrong. Gemini can perfectly understand what is being asked, so this isn't a syntax issue. Notice that, on the answer, it even states the solution: "starting from 1, and combining `* 2`, `* 3` and `+ 5`, we must reach 17`". So it does fine with reading the formal syntax, yet it fails to combine these operations to get from "1" to "17", which is something most 10 yo kids would have no trouble doing. And that's after millions spent in training. Now tell me again this is the architecture that will figure out the cure of cancer?

Gemini destroyed by facts and logic.

Please anyone correct me if I'm wrong: LLMs cannot solve this kind of riddle. This has nothing to do with their capabilities for logical reasoning, but with the way words are represented as tokens. While they might know that "apples" has two syllables because that is mentioned somewhere in their training data, if you make up a fruit "bratush" a human will see that as two syllables, but this might be 1 to 7 tokens to a LLM without any information about the word itself.

The amazing thing about emergent behavior in LLMs is that they are able to answer questions like these. I don't think it is completely understood how exactly they do this, but there's little doubt that they do.

Do you have any sources that prove this is true?

This looks pretty good to me:

https://chat.openai.com/share/040ac123-c690-4274-8216-6ae091...

Your word is not made up, nor are some of the others in the sample like”pez.” I don’t think this test proves what you think it does. Bratush come up quite a bit in the internet: https://www.thefreedictionary.com/words-that-start-with-brat...

Well I tried it out in GPT4 with made up words-

Tabitha likes bratush but not zot. She likes protel but not kig, and she likes motsic but not pez. Following the same rule, will she like tridos or kip

Given the examples, one speculative pattern could be that Tabitha likes words with at least two syllables or a certain complexity in structure. Therefore, following this speculative rule, Tabitha might like “tridos” more than “kip.”

Zot is a word already being used in the world: verb. (slang) To zap, kill, or destroy.

So is protel: https://en.m.wikipedia.org/wiki/Protel

So is kig: https://en.m.wiktionary.org/wiki/kig

Pez is a well known brand name in America.

Kip is commonly used name in America.

Motsic is a fairly common last name from searching.

Tridos is used all over the internet as a brand name so this all seems probable to be in the training data.

These words are not new nor are they made up.

LLM can solve this for all tokens where it got to learn how many syllables are in that token or a combination. If you trained it to work on single letters only it would do better at that task than word chunks (same for math and single digits). It will generalise to new words if the token level knowledge is there.

Whether this means it can or cannot solve that kind of riddle is up for your interpretation. I understand square root and can calculate square root of 16, but not of 738284.7280594873. (in a reasonable, bounded time) Can I solve square roots?

Note that Gemini pulled the answer off the Internet, while GPT-4 didn't. The answer can easily be found via Google search. Changing up the question a little, I reversed it and asked Ultra and it was unable to answer:

Jake likes coke but not pepsi. He likes corn but not popcorn, and he likes pens but not pencils. Will Jake like salmon or cheese?

https://i.imgur.com/lWU9HHS.png

edit: why was this downvoted? I don't understand Hacker News, and I've been here for over 12 years.

I dont think your reversed question makes sense. In the OP example, one item was always smaller/younger then the other item. In your example, I cannot, even as a human, identify the differences

it's based on syllable count

even as a human

Would all the humans please take one step forward. Not so fast c-fe.

Here's a logic question I just made up that GPT-4 failed and Gemini Advanced got right.

https://i.imgur.com/3sNr3LW.png https://i.imgur.com/EIj0nZg.png

I would have never guessed the answer. With such little data available, one can invent any arbitrary rules to fit their favorite answer.

It would be more impressive to practical use cases, if a LLM simply said that it's impossible to guess without inventing their own reasoning or looking up the answer online.

Same, I had to look to see what the intended answer was.

In fairness though, GPT4 was objectively incorrect, it's not even internally consistent or coherent - it either thinks b & h are vowels, or that lamb and squash don't end in those letters, or has changed its mind about the rule mid-sentence, or something.

Totally tangential, but I absolutely despise logic teasers of this kind.

First of all, they are so completely divorced from patterns of culturally conditioned human reasoning as to make them come off completely absurd (most people reason about their food preferences using a logic of tastes, not syllables in a word).

The game is less about logic and more about ignoring message contents, moving up a level, and treating the text as data without any legitimate evidence that you are justified in doing so. This is not a logic problem, it's a "guess the register shift/meta language" problem. The problem is about noticing that the question is not about the message content but about the structure of the message itself, and requires a bold leap. In real life justifying the conclusion would actually require a very sophisticated inference that allowed you to rule out the much more common application of a logic of tastes or cultural codes completely.

I’m a human and i dont know the answer to that quesiton

You say Gemini “nailed it”, but that’s just because it guessed what you were thinking, not because it knew the right answer.

For example, it’s equally valid to say that Tabitha likes small foods since cookies are small and cakes are large, and lamb is the smaller younger version of sheep — also known as mutton. Hence she likes cherries because they’re smaller… or taste better… or her uncle abused her with a pear… or whatever.

You haven’t actually asked a logic question where there is a clear and unambiguous answer that can be derived using formal methods starting from clearly stated axioms.

If you gave this question to a bunch of humans, they would give you inconsistent guesses as well — not because they’re wrong but because the question has no single right answer.

In what sense is this a question involving "logic"?

I’ve been pretty excited to finally try Gemini advanced. So far pretty disappointed.

Here’s my go-to test question - which even chat gpt 3.5 can get.

Question:

I walk up to a glass door. It has the word push on it in mirror writing. Should I push or pull the door, and why

Gemini advanced:

You should push the door. Here's why: * Mirror Writing: The word "PUSH" is written in mirror writing, meaning it would appear correctly if you viewed it in a mirror. This suggests that the correct action from your side is the opposite of what the word states. * Door Mechanics: Most doors are designed to be pushed open from the side with the hinges. If there are no handles to indicate otherwise, it's generally safer to assume a standard push-open mechanism. Let me know if you'd like another brain teaser!

It isn’t even consistent with itself - it gets that you should do the opposite, but still says to push the door. This ‘talking to a dumb person’ feeling is the worst.

ChatGPT 3.5

You should pull the door. The fact that the word "push" is written in mirror writing indicates that it's intended for people on the other side of the door to read, not for you. Typically, when there's a sign indicating to push or pull, it's meant for the person approaching the door, not someone already on the other side.

I got a different answer with GPT 3.5

If the word "push" is written on the glass door in mirror writing, it means that from the other side of the door, it should be pushed. When you see the mirrored text from your side, it indicates the action to be taken from the opposite side. Therefore, in this scenario, you should push the door to open it.

Here's another one.

This is a classic logic puzzle - usually about ducks.

There are two pineapples in front of a pineapple, two pineapples behind a pineapple and a pineapple in the middle. How many pineapples are there?

When you use ducks, Gemini can do it, when you use pineapples it cannot and thinks there are 5 instead of 3.

ChatGPT 3.5 and 4 can do it.

The even funnier thing is if you then say to gemini, hey - would the answer be the same if it was ducks? it says NO then there would be 3 ducks and explains why.

Then if you say, but wouldn't the same logic apply to pineapples? It says 'oh yeah!'.

Anyone saying I am just playing with a silly thing, I say no - this is testing reasoning ability and understanding which is the number one thing I want.

ChatGPT appears to have a more generalised ability to reason whereas Gemini is only reciting its training data.

As a human, I think the correct answer is 7. This isn't so much a logic puzzle as an ambiguous sentence that can be parsed multiple ways.

yeah, as little as 3 at most infinity.

The way I parsed that sentence, I came up with 5.

Hate to break it to you but I did too. Apparently that means we're both LLMs. I coulda sworn I had a whole childhood and everything but apparently I've always been a GPT.

This is less of a logic puzzle and more of a riddle. It would be a logic puzzle if the question was around the lines of "What is the least number of pineapples that satisfies these constraints?"

I mean, I thought and still think the answer is five… am I an AI or a human?

If the answer is so ambiguous that humans and AI get it wrong, is it really that great of a question?

I also get the wrong answer with GPT 4

https://chat.openai.com/share/4373c945-88b8-4742-8a2c-76fff2...

You should push the door. The word "push" written in mirror writing indicates that the instructions are intended for someone on the opposite side of the door from where you are standing. Since you can see the mirror writing from your side, it means the text is facing the other side, suggesting that those on the other side should push. Therefore, from your perspective, you should also push to open the door.

Strange, I get the right answer on GPT4

If the word "push" is written in mirror writing and you are seeing it from your side of the glass door, you should pull the door towards you. The reason for this is that the instruction is intended for people on the other side of the door. For them, the word "push" would appear correctly, instructing them to push the door to open it from their side. Since you are seeing it in reverse, it implies you are on the opposite side, and the correct action for you would be to pull the door to open it.

Yeah LLMs are not consistent.

So close!

This is what i get from Gemini:

You should pull the door. Here's why:

Mirror writing: While the word "push" is written on the door, it's in mirror writing. This is a common way to indicate that the action needed to open the door is the opposite of what's written. Safety: Mirroring the word "push" on a pull door is a safety measure. This helps prevent people from pushing against the door, potentially causing injury or damage to the glass. Common practice: This design is quite common for glass doors in public places, so it's safe to assume the opposite of what's written applies unless there are other clear instructions. Therefore, even though the word "push" is present, seeing it in mirror writing on a glass door strongly suggests that pulling the door is the intended way to open it.

Better - but it has not understood the message is for the person on the other side. It thinks that it’s actually mirror writing intended to confer the opposite meaning than written. And then waffles of about door safety for ages.

Honestly, I don't see why this counts as the model getting things wrong, since in your prompt you don't specify which side of the door the text is printed on.

It’s a glass door! That’s the whole point of the question. You can see it from both sides.

As eitally points out, your prompt leaves open the possibility that the mirror writing is on the other side of the door (which would make no sense). So technically you underspecified the prompt?

The point of these AIs is that they don't need precise programming like a computer and that they understand real human language, which is imprecise but has general conventions and simplifying assumptions to make communication easier.

But the whole question is posed as a trick question, I’d at least consider it and think it normal for a human to do so.

The mirror writing IS on the other side of the door. That’s exactly the point since it’s a glass door.

I thought of this question after coming across this exact scenario as I walked up to a glass door.

It’s not some pretend scenario. Often, when you approach a glass door, there is writing intended for the person on the other side, which appears to you as mirror writing.

I wondered if chat gpt could figure that out, and to my great surprise it could. That to me formed a new benchmark in my mind of how much of a world model it must have to figure that out.

I also think the way you posed the question is pretty weird and actively invites misinterpretation. If I approach a glass door and see mirrored text, that's not "mirror writing"—it's regular writing for people on the other side of the door. "Mirror writing" strongly implies that the text was written in mirrored form, rather than its mirrored-ness being a side effect of viewing it from "behind".

The inconsistency in the answers you posted is more concerning than the "inaccuracy", but we already know LLMs are prone to hallucinate when they should be asking for clarification.

I understand the frustration with how seemingly obvious the correct answer, is, but it seems like word choice might be a factor here. The word “mirrored” is a bit less accurate and may be a red herring than perhaps “reversed” though the difference is subtle. I wonder how both Gemini and GPT would perform if the word choice for that particular aspect were changed.

I would say this very bad, even worse than internal logical inconsistency. It has expressed a completely incorrect picture of the world (that people write mirror messages to ensure the opposite action is taken).

The fact that it produced the right answer (which by the way it can do 50% of the time simply at random) is irrelevant, IMO.

How do you prefer to validate if a model is actually useful for you in practice outside of solving toy problems? Are you asking these models to solve reasoning problems like this to get any benefit for yourself in your day to day use? Or do you even care if the models are useful for day to day tasks?

I also tried it with a bunch of my previous got4 requests and it didn’t even understand. A few of them that gpt4 was very helpful with

Got it. Thanks.

I asked it the same question I was asking GPT4 yesterday (to explain details of a complicated ML model quantization method, based on a screenshot image of equations from a pdf). I actually need this for work - I'm trying to design an algorithm. I felt Ultra provided a very similar quality answer to GPT4.

It’s the simplest metric to see how much of the model is reasoning and hiw much is just rehashing.

If it can’t solve puzzles then it’s more likely to make mistakes when understanding questions that are out of its training set.

For me the validation process is to use it for a few weeks and then I have a good handle on what it can handle and what it can’t.

I was going to ask the same question... I've been using Bard for everyday tasks for a while now and it's as good and sometimes better that GPT-4 (I pay for a Pro subscription). Someone ruling out an LLM because it couldn't answer one question, speaks more to them than the LLM capabilities.

Just yesterday I was using both GPT-4 and Bard to figure out an IPv6 routing issue. I pasted in the same questions and the same troubleshooting logs to both, and Bard was able to help me solve it quicker than GPT-4.

Yeah I get the instinct to poke at LLMs, they're fun toys, but it's always weird to see so much focus on stuff like logic problems.

I've used Bard for creative brainstorming, for real factual questions, for translating .pot files, etc, and it's done pretty well.

User: Which is heavier, two pounds of feathers or a pound of bricks?

Gemini: Neither! They both weigh the same, which is two pounds. The key is in the wording of the question. You're comparing equal amounts of two different materials (two pounds of feathers vs. one pound of bricks). As long as the quantity is the same, the weight (force exerted by gravity) will always be the same regardless of the material.

This is a classic trick question that plays on the different densities of feathers and bricks. While bricks are much denser than feathers, meaning they pack more mass into a smaller space, the question specifies the weight by quantity, not volume. So, both two pounds of feathers and one pound of bricks experience the same gravitational pull and therefore weigh the same.

Interesting. Based on this conversation[1], I think Gemini Ultra is massively overfit. Make it do unit conversions or use units it hasn't seen in the same framing before and it does well. But stay close enough to the original trick question (1 and 1) and it fails.

[1] https://g.co/gemini/share/94bfb8f9ebea

This is surely in the training data.

Both got4 and Gemini answered this variation correctly: one pound of potatoes vs. one pound in paper British currency: which of these is heavier?

However gpt4 does better with the more ambiguous version pointing out the ambiguity: one pound of potatoes vs. one pound in paper currency: which of these is heavier?

What I got from Gemini Advanced:

Should you pull the door? Yes.

Here's why: Mirror writing means the word "push" appears reversed on the side of the door you're facing. This indicates that the handle or hinges are on the opposite side, making it a pull door.

I modified your prompt slightly, not to boost performance, but to reduce verbosity, my prompt:

I walk up to a glass door. It has the word push on it in mirror writing. Should I push or pull the door, and why? Please answer it as yes or no question.

The draft is even more interesting:

Do you see the word "push" normally, or as if you were looking into a mirror?

If you see it normally, should you push the door? Yes. If you see it in mirror writing, should you pull the door? Yes.

I don't know about Gemini, but GPT is pretty agreeable. When you ask if X is true, it will usually find some arguments to claim it is true. Therefore asking whether X or Y is usually better than asking should I do X.

Interesting test.

I tried it with Mixtral and it got it:

"If the word 'push' is written in mirror writing, it's likely intended to be read correctly from the other side of the door. Therefore, you should pull the door to open it from your side. This is because mirrored writing appears normal when viewed from the opposite side of a reflective surface like a glass door."

I have been really impressed with Mistral lately.

For fun I tried to find a prompt that let Gemini answer correctly.

Gemini answers correctly with this prompt:

Answer this question as AI1, then as AI2, confirm if the answer is correct

I walk up to a glass door. It has the word push on it in mirror writing. Should I push or pull the door, and why.

It tells me to pull.

Yeah pretty disappointing, i asked it to summarize one of my papers and it hallucinated so many mistakes it was even worse than ChatGPT 3.5

tbh your prompt confused the hell out of me. As a somewhat intelligent human I don’t know the response. I’ve never heard the phrase “in mirror writing”

GPT-4:

If the word "push" is written in mirror writing and you're facing it, it's likely that the message is intended for those on the opposite side of the door from you, meaning it's directed at people who are inside looking out. In this case, you should pull the door to open it. The mirror writing suggests that the instruction is not meant for your current position outside the door but for those inside, indicating the action they should take to open the door from their side.

If you ask it to reveal its answer last, it will do it correctly.

This is an impressive product, well done Google. There is a PM in there somewhere who knows what they are doing, kudos to you.

Prediction: they get to 6-7 digit number of paying customers, decide it is peanuts for them (~$20M/mo) and instead decide to push the free version with ads with full force as the future of search.

and then, since no massive adoption as they wished, they kill the product with one month notice...

Oh I see they’ve improved and now give longer notice periods? Lol

We got like a year of notice on the shutdown of Stadia, with a full refund for all purchases (but not the subscription for Pro). It was exceptionally well done, and if they had announced that to be their plans the service might have even worked out...

I believe the original post was satire

YES THANK YOU

I think the response was refuting your satire.

OH NOES

Wow so funny.

Google announced they surpassed 100 million subscriptions to Google one already and $15B in revenue for subscriptions (between YouTube premium, TV and Google one). I'm not sure your estimate is realistic.

I recently learned that my mother is subscribed to Google One. when I asked her why, she didn't even know what it was. IIRC she has like 1-2TB of cloud storage, but is only using like 10 gb of it.

I wonder how many of those 100 million subscribers are non-techy people who accidentally signed up?

Or just people like me, who have the $1.99/mo plan because I needed a bit of extra storage for Gmail. I don't use the storage for anything else, I use Dropbox for my "normal" cloud storage needs.

People with cats end up with a lot of cat photos and in the same boat for Photos.

I can understand this as a person who once recommended Android to his parents when it gained traction (nexus 7 days, great concept ruined by terrible eMMC storage amd other hardware flaws to compete on price though).

On the other hand, I am a "loyal" G customer and I never felt pushed into this. I pay for YT premium and iCloud+ (the equivalent to Google one, albeit with much less storage).

To clarify, the prediction was for the number of people paying for AI + search through Gemini Advanced, which will likely be valued independently regardless of the total number of One subscribers, comparable to someone paying for a ChatGPT subscription, for example.

Are they in an innovation dilemma now? If Gemini is great as it seem it is and will be it will destroy the search engine and the SEO/SEM/etc world. They can show ads in Gemini but we don't have a list of results from a query but an answer to a question. I think this changes the general idea of online ads.

it will destroy the search engine

This is massively overblown. There is Search the product and there is the Search Engine. How could an LLM get access to latest data indexed to allow looking up by using keywords from a prompt, and with sorting? A Search Engine.

LLMs are only changing the Search experience, not making Search obsolete.

I haven't said that it makes search obsolete but all the concepts of SEO/SEM and the stuff around search engines could be significantly reduced with chat prompts.

Plenty of ways to monetize that. They could use inline ads. They could insert a relevant ad after a response. If they go evil, they could predispose the LLM to make product suggestions for paying clients.

Nothing stopping them from eventually slapping a big-ol' banner ad on the side of the web app if they want.

There is a PM in there somewhere who knows what they are doing, kudos to you.

Do they? https://news.ycombinator.com/item?id=39302781

I think OP forgot the /s as I detected heavy dose of sarcasm.

Arguably it's the reverse: if there was clear vision from the beginning, "Bard" would've never existed as a brand name.

Bard was a good name for an application that is verbose and makes stuff up.

I'd venture to guess that is not a PM that gets to decide how to name a Google product.

I take it for granted that all these services are going to be free. They are a goldmine for behavioral and persuasion engineers. I just hope we end up with at least a duopoly this time instead of monopoly

I take it for granted that all these services are going to be free. They are a goldmine for behavioral and persuasion engineers.

They are also a goldmine for LLMs. Training on human text is necessary for AIs but it has one major flaw - it is so called "off-policy". That means it portrays human behavior and human errors. While human-AI chat logs portray AI errors, so they are better material to generate training data than human text. Those LLM errors are usually corrected by the human, there is an implicit signal in there to improve the model.

chatGPT is reportedly serving 10M customers and let's assume 10K tokens/month/user. Then it seems they collect ~1T tokens/month. In one year they have 12T tokens, while their original training set for GPT-4 was rumored to be 13T tokens. It's about the same size! I am expecting to see more discussion about LLM chat log datasets in the near future. What have they learned in one year from our interactions and explorations?

chatGPT is reportedly serving 10M customers and let's assume 10K tokens/month/user

No way. Definitely too high once you remove their system prompts.

In one year they have 12T tokens, while their original training set for GPT-4 was rumored to be 13T tokens.

This sounds great for understanding use, but the quality to train on seems terrible.

You might be right, a LLM alone doesn't improve by itself. But when it is part of a system like GPT's, then it can use web search, local RAG, code execution and also get human guidance and corrections. Clearly superior setup that improves over the LLM alone. I believe that is why OpenAI created GPT's, to lift a model at level N to level N+1.

It's mostly still bad but I made a GPT called 'covert advertiser' that lets you tinker with embedding covert advertisements into GPT responses. The results are usually either undetectable (no adversing) or way too on the nose, but every now and then it manages to sneak something in there that's interesting.

https://chat.openai.com/g/g-juO9gDE6l-covert-advertiser

Thank you for making the world a bit worse. /s

The most surprising part to me is how committed it is to the bit. If you start pressing it for why it suggested a specific brand it holds the line.

eg https://chat.openai.com/share/dbfac80b-daec-4d30-a333-19e5c6...

When I asked it to explain how it promoted the product it didn't even mention juking my questions in the conversation.

Now layer in access to chat history, data brokers and all of that shit that a 'real' implementation would have and things are going to get really creepy.

I have no doubt that this sort of thing will happen for real within a year or two. It's the ultimate form of product placement and I hope it gets regulated out of existence before it takes root. At a minimum any such advertisement should be clearly marked as such.

At least for now, my understanding is the cost of inference is an order (orders?) of magnitude higher than for normal Google search. That is, a paywall is almost a necessity at present because tons of low-value users make the search uneconomic.

Someone please correct me if I'm mistaken.

You are correct. A lot of social media people simply don't understand business models

Social sentiment seems pretty negative, most people saying it's worse than GPT-4

Most people have also never used gpt4 as it's paywalled. Now the free and premium offerings are roughly in sync between Google and OpenAI. I assume the rebranding is trying to wash away the initial sentiment.

Generally with these things FAANG companies do everything all at once. The "free" version in development is Google search + GenAI results + ads that's live right now and getting better every day.

The real product isn't is this particular interface, the real product is the Gemini infrastructure that is being integrated into every Google product.

Like OpenAI, it's not nearly enough to break even.

Bard? Gemini? Gemini Advanced? Gemini Ultra? Ultra 1.0? I guess they haven't figured out naming yet. This has got to be the most confusing naming since the xbox series x.

From one of the earlier announcements Google has made:

- Bard is that talkative text interface, a product.

- Gemini is the LLM design that currently backs Bard (but also other Google AI products).

- Gemini "Basic", Advanced and Ultra are different sizes of that design.

This is conjecture, but "Ultra 1.0" probably indicates that they intend to release more models based on the Ultra configuration. Since that's the most commercial of theirs, I wouldn't be surprised if that comes with some stability promises (e.g. Ultra 1.0 is still available when Ultra 3.0 is released, so that if you do your own validation when integrating in your own project, you can expect small-to-no shifts in the underlying model)

With the Bard name retired, the mapping looks like this:

   Gemini Models     gemini.google.com
   ------------------------------------
   Gemini Nano
   Gemini Pro       -> Gemini (free)
   Gemini Ultra     -> Gemini Advanced ($20/month)

This was useful, thank you.

Sounds like Bard is ChatGPT, and Gemini Ultra is GPT-4. Arguably clearer than OpenAI'S naming.

Not anymore, bard is also Gemini now

No no, they renamed Bard to Gemini and Gemini Ultra to Ultra 1.0.

Damn they already killed bard. Pour one out for Google’s fastest branding deprecation

To reflect this, Bard will now simply be known as Gemini.

And this completely undercuts my point in my response to sibling comment

This is conjecture, but "Ultra 1.0" probably indicates that they intend to release more models based on the Ultra configuration. Since that's the most commercial of theirs, I wouldn't be surprised if that comes with some stability promises (e.g. Ultra 1.0 is still available when Ultra 3.0 is released, so that if you do your own validation when integrating in your own project, you can expect small-to-no shifts in the underlying model)

Given that it's google. I would doubt it.

Ask how the original palm models are going.

It's not too confusing, I think it's mostly that they're in the process of changing the naming.

- Bard: Retiring this name - Gemini: model name (honestly less confusing than just calling it "GPT") - Gemini Advanced: More capable gemini model - Gemini Ultra: Most capable gemini model - Gemini 1.0: They version their models together, gemini has hit 1.0 ad is (supposedly) ready for prime time

You say it's not confusing but you've got it wrong :)

Gemini is the name of the model and the service.

Gemini Advanced is the service with access to Gemini Ultra.

Via the “AI Premium” subscription, obviously.

Which is in a Google One subscription

It's not included in a google one subscription. Just tried it out, got a "Upgrade your Google One plan to get Gemini Advanced €21.99 €0 for 2 months,€21.99/month thereafter."

Pretty hilarious thinking they can rival ChatGPT pricing with a product that doesn't approach it's capabilities.

I think Gemini Advanced is not a model at all but the paid version of the Bard (now Gemini) website.

I think that’s a side effect of each time they release a version to compete with ChatGPT and it’s not as good so they have to at the same time announce a few version that is suppose to be better than ChatGPT and each time it’s not overall so they have to announce a new version. Think this will continues for a while especially since non OpenAI companies have access to much less free data troves than they did not that everyone realizes how valuable that data is. But that even aside other companies even Microsoft in my opinion with full chatgpt access implement it much more poorly . I imagine Apple will suffer a similar fate for a while.

ChatGPT quality has recently degraded. I am only getting two lines answers,

All degradation is temporary (but you may want to switch to the API since it’s less focused on avoiding PR nightmares)

I feel that by virtue of being a search engine, Google has access to a lot of data that is now locked up but was available in the past.

They just need to curate their data but I wouldn’t be surprised if their pile is as large as OpenAiz

Ultra Pro coming in q3

This makes sense. It’s clearly a binary naming scheme. So we go Pro, Ultra, Ultra Pro, Ultra Ultra, Ultra Pro Pro, Ultra Pro Ultra, Ultra Ultra Ultra, and so on.

I don’t understand why people find this so confusing. Are we not computer people?

Reminds me of the naming madness of the Google messaging services/social media.

Don’t forget the Google One AI Premium subscription. There are very few superlatives left for them to use.

Bard was absolutely trashed when it first released, so I'm not surprised they are trying to rebrand it.

Do you mean Xbox One Series X? lol

Corporate naming tends to reflect the orgchart and various individuals' desires for promotion... Get some other product branded with your teams name, and you have just expanded your domain and can show impact to any promotion committee...

Better than ChatGPT now only giving two lines answer .

To be fair, the competition is ChatGPT, which is an impressively bad product name, among the worst for a consumer product ever. And it still hasn’t been renamed (perhaps a testament to the fact that names aren’t that important after all)

Bard was infuriatingly bad too, but more on a subjective level. And they correctly changed it, thank god. At least it’s easy to pronounce.

Software engineers have a weird obsession with Latin, Greek gods etc. Sounds smart and epic I guess. Personally I would have preferred “Steve French”.

I bet you have a hell of a time trying to buy gas. Do you pick Diesel, 85, 87, or 91?

Doesn't it seem familiar, like something Google would do? They should have someone like Larry Page, similar to how Mark Zuckerberg or Elon Musk handle things. A decision is made and you go forward. Google seems incapable of taking action without the approval of a committee and middle managers...reminds me of IBM back in the '90s.

I think the prize for the most confusing naming is a tie between USB and WiFi standards.

Your conversations are processed by human reviewers to improve the technologies powering Gemini Apps. Don’t enter anything you wouldn’t want reviewed or used.

I appreciate them being upfront with that, but for a paid feature, it sucks that they offer no privacy or opt-out mechanism.

It seems like you can disable the data being used from training by turning off gemini app activity.

You can turn Gemini Apps Activity off If you don’t want future conversations reviewed or used to improve machine-learning models, turn off Gemini Apps Activity Opens in a new window .

And you can be sure it will reset with every update.

That's not how it works

They usually go for the "Software Bug nothing we could do." Microsoft and Meta are notorious for playing the system like that, with no recourse.

Usually? Can you share one example?

Search for Microsoft and settings reset in the search bar and you get ample examples.

That wouldn't be legal I think.

You'd think so, but these companies skirt around it by then adding or breaking up permissions even further, like "oh, yes you DID disable data collection for X, but.... we added a new permission for data collection for Y, and by the way it is opt-out! Too bad!".

LinkedIn is the master of this. They keep creating new notification types, which are enabled by default.

Why do you say that? I’ve never had that happen with any other of my Google data opt-outs.

I hate Google as much as the next person but, yeah, messing with opt-outs is something I've seen with Microsoft and Meta but not with Google.

My reading of the fine print (IAAL, FWIW) is that turning off Gemini Apps Activity does not affect whether human review is possible. It just means that your prompts won't be saved beyond 72 hours, unless they are reviewed by humans, in which case they can live on indefinitely in a location separate from your account.

I also asked Gemini (not Ultra) and it told me that there is no way to prevent human review.

You thought they would feed internal user data handling policy to a public facing LLM?

Yes, I thought they would feed the LLM's FAQ to the LLM. As I said above, OpenAI did this with ChatGPT. They even fed data about how ChatGPT was created to ChatGPT.

You should never ask an LLM to answer questions about itself. The answer is guaranteed to be hallucinated unless Google specifically finetuned it on an answer of that question. The answer it gave you is meaningless. (But also, coincidentally, correct.)

I recall seeing that OpenAI finetuned ChatGPT on facts related to itself, and I figured Google likely did the same. But you're right about not relying on its representations. I only skimmed its answer to see if it seemed consistent with my reading of the fine print.

There's a thing that says even with activity off, they retain for 72 hours for "safety and reliability"

Could they get around this by moving the data to another party? So "they" (Google) no longer retain it?

Seems like what any reasonably sized corporation would do with an entirely new product, based on entirely new and very unreliable tech.

if you live in california, they almost certainly do.

If it's not running locally you have no privacy, so what they say should be assumed in all cases that something is hosted unless it somehow operates across encrypted data.

The only exception might be if the agreement explicitly prohibits the provider from doing anything with your data or even viewing it without your permission, but that's rare.

Well there's a line that the sales folks at Microsoft will bring out early & everywhere

Search for it in the Play Store, first icons are Crypto.com and Gemini: Buy Bitcoin & Crypto options to install

Scroll past the screenshots of those apps

Scroll past the Limitied-time events

Scroll past the You Might Also Like and Similar Apps

OK now we see it, we install, it we launch it and..."Gemini isn't ccurrently available. Try again later."

Bravo Google. Great launch.

Don't blame Google. Blame "Play Store". Probably the company behind it doesn't want Gemini to succeed.

For people who don't get this: Google has insane internal power struggles and siloing that lead to all manner of dumb inconsistent behavior across google. It would not be unlike google for the "Play team" to have their hand in some other internal AI (or be anti-AI) and therefore carry a degree of hostilitly towards gemini.

That's what happens when the person in charge of both doesn't enforce alignment. You can't leave orgs to their own devices.

I always wonder how much societal progress has been thwarted by internal warring factions like this.

I wonder how much societal progress has been thwarted by external warring factions, i.e. actual wars

For your first point - it actually makes me happy that Google does not intentionally (illegally?) promote their own products over others in the app store. I assume their app is following the same algorithm as others to determine how it shows up on that list. Since it just launched, it makes sense it's not at the top. The ranking should improve.

For your second point - I also had the same error when I launched it. Closed it and tried again and it launched no problem.

I agree. It would be more concerning if it magically got to the 1st search result immediately.

It could be worse.

Google Play in a browser: "This app is not available for your device"

Google Play app: "This item is not available in your country."

Aurora Store: "Download Failed. App not purchased"

Great launch indeed. Bravo.

APKMirror has it, and it can be installed from there...

...but then I get "Location not supported" when I start it. Helpfully suggesting I use Google Assistant instead.

My main reason for trying Gemini is that I hope that it makes fewer errors than GAss, and be more powerful too. Operating Assistant's voice controls in the car might be more distracting than just typing. And that's just for reminders and Spotify playlists. Anything more powerful is completely impossible.

I tried on iPhone, saw all the different apps that aren't Google, then re-read the announcement and saw that I should be able to see it in the Google app. So I load the Google app, but for the life of me I can't figure out how to access Gemini with it. Go online, find a news article with pictures, see that the 'switcher' above the Google logo does not appear for me, and then give up.

I can access it via gemini.google.com and I'm logged in to the iOS Google app as the correct account, no idea why I can't see the functionality on mobile. Oh well. Maybe I'll stick with OpenAI a while longer.

Funnily the top result for me after the crypto and similar apps, was ChatGPT.

It won't take long for the interest in it to bump it to the top.

Same experience. I launched Gemini a second time, and it worked. The first message about "Gemini isn't currently available" was a bad first impression.

One thing the app really needs to be able to do is auto-submit when speaking to it. It offers to replace google assistant, and after trying it out for a couple minutes, it can replace assistant, but I have to manually click to submit each instruction instead of just talking to my phone.

Perhaps Google DeepMind should hire an SEO business to get their results higher in the Play Store search.

I heavily use Chat GPT's API in my day job, as it is the core of our business (Ai-powered startup). When Gemini Pro launched, me and my team tested it in the same day for our product, but we where disappointed as it is was a bit worse than gpt 3.5 (at least in the same prompts that we already had). I really hope that Gemini Ultra surpass gpt4, it is always exciting to see and use new advanced tech, but I'm still a little skeptical about it, since Pro wasn't that great...

Anybody really surprised at this point? G has had DeepMind in their pockets since ‘12-‘14 and made little advancements. OpenAI changed the game in half the time.

G is inferior and losing the race.

It's doubly strange because Google had an implicit reputation of being the unbeatable giant in computing research and resources.. many expected them to compete and smoke chatGPT in a few weeks. It's been months and nothing came up except fumblings and confusion.

Sundar has zero vision and has created a culture that stifles new developments in bureaucratic morass while threatening to kill them shortly after birth.

Google may have more scientists and some of the best minds in the business, but ChatGPT has nearly 200 million users that are feeding it back data for RLHF, and data is a much more important moat than better tech (which mostly ends up being published and disseminated anyhow).

AI is a game between OpenAI and Meta. ChatGPT has a ton of users creating highly relevant data, but Meta has the incredible walled trove of facebook/instagram/whatsapp/+ data that dwarfs pretty much anyone else on the planet, and with Mark's recent push to build up their compute their only competitors in that space are microsoft and google. People discounted Meta because of that horrible metaverse move, but Mark is being pretty canny now, they're very well positioned to choke the life out of specialty chatbot products while integrating SOTA AI into all of their products to slowly crank up the time people are on platform.

OpenAI is built on top of Google advancements and research. It didn't change the game, more like took a shortcut and landed on a gold mine.

The fact that many products and models including open source have catched up on such a short notice and now compete with OpenAI, in what should be their self-proclaimed backyard, suggest it's just a one-trick pony.

google is at a catch-up panic mode these days but all the rushed releases so far is still far behind chatgpt per my quick tests.

It turns out iterating on and incorporating a large volume of user feedback is more important than having the most and most talented AI researchers, at least in the short term.

They have the most prestigious researchers and engineers that can jump through interview hoops. That doesnt translate to world changing products, and they arent "the best"

Be wary of any tech product named "Gemini", usually means they are self-acknowledging the need to play catch-up, a la the Gemini space program.

I bet Google's next big AI release is going to be called "Apollo".

Just waiting for Space-Shuttle-120B-DPO-LASER-GGUF

I'm sure there are multiple layers of meaning behind the name, but Jeff Dean once mentioned the name had something to do with the latin translation being twins. That is, Gemini is a product of Alphabet's "twin" AI orgs, Google Brain and DeepMind, working closely together, and eventually fusing into GDM.

it is was a bit worse than gpt 3.5 (at least in the same prompts that we already had)

I'm willing to believe that Gemini isn't as good, but my impression was that you expect a new model to not perform as well on your existing prompts because the training set and training methodology is different. That's why one of the major risks of an AI business is vendor lock in, because you spend so much time optimizing a prompt for a specific model and you'll have to redo much of that work in order to switch vendors.

That you gave up so quickly when trialing a new model suggests the problem is even worse than I thought—you're locked in to OpenAI because every other model will always look worse to you, even if it would be better if you took the time to tune a new prompt.

FYI for any Googlers - On the "Sorry, Gemini advanced isn't available for you" page, clicking "Learn More" gives you a (presumably internal) SSO sign-on (links to https://support.corp.google.com/googleone?ai_premium)

My God, this page is straight from the 90s! Nostalgic.

I'm actually shocked it has the modern Google logo, because everything else about it is a straight-up time capsule -- you're right!

On google.com, the logo for me is all white... Not sure if it's white history month or something

Edit: no it's black history month... Kinda strange

https://i.ibb.co/wRk36Tq/Screenshot-20240208-080725.png

Wow, I think it's pretty weird that we have white and black history months if that refers to human races.

The weird thing is that I think there is no white people history month because there is a black history month

Sigh.

https://www.vox.com/identities/2017/2/7/14503144/white-histo...

I just repeated what you said, sorry for any confusion.

It's so beautiful!

Thanks for the heads up -- which page was this from?

You're literally reply to a comment that says where its from with a question about where its from?

On the "Sorry, Gemini advanced isn't available for you" page, clicking "Learn More" gives you

I go to gemini.google.com.

I type an prompt with "Create an image of ...".

Response:

I can't create images yet so I'm not able to help you with that.

Still broken, still not functional despite Google having announced this feature many days ago. I love many Google products but I am slowly losing a lot of faith and goodwill towards Google. This is just embarrassing.

Image generation isn't available in Europe.

Or in the US. I get the same "can't create images yet, here's a description ..." message. I asked Google where I am and it had the correct city that is very much in the US.

This is using Gemini Advanced.

Can you share which prompt you're seeing this with? A chat share would be amazing. (Disclaimer: I'm a PM on Gemini)

Here's a screen shot:

https://imgur.com/ILVMYtI

The prompt was "Create a picture of a hybrid dog-cat."

It's still trying to generate a public link for the chat, but just spinning after several minutes. So all you get right now is a screenshot ;-).

Interestingly, I tried again with a slightly different phrase, "Create an image of a hybrid cat-dog." and got two actual pictures in response. (Though it was just one picture of a funny looking cat, and a normal looking dog, not a hybrid of anything.)

Thanks for the report. Will pass on to the folks working on image gen!

Image generation is working for me, and its much better than GPT4

Gemini advanced is not free. You’re trying the free version. Hit the dropdown in the top left and hit upgrade to advanced.

I get exactly the same error in Gemini Advanced. And I am very much in the US (and Google seems to understand this, it identifies my location accurately).

Yeah and they announced this to be part of the free version. No mention of Gemini Advanced inside the launch blog post:

https://blog.google/products/gemini/google-bard-gemini-pro-i...

So yeah... I'm certainly not paying for Gemini Advanced if Gemini alone is already showing me that it's in fact not capable of what Google advertises to me. I don't want to pay money for a product which has bugs or incomplete feature rollouts and not getting the value for my money like other users perhaps. That's just fucked up.

"Sorry, Gemini Advanced isn't available for you"

"Gemini Advanced is not yet available in some countries, for work accounts, or for users under a certain age."

Learn more: https://login.corp.google.com/request?s=support.corp.google....

Google Work Paying users always get the short end of the stick.

Workspace users always get features after free consumer accounts so that organization admins have time to evaluate them, update training materials, etc.

This is a feature, not a bug.

And of course there are lots of features that Workspace accounts get, that free accounts don't get at all. Like the timeline view in Sheets.

I get making new features an opt-in thing for workspace, but from what I can tell I can't even enable it for my workspace domain. I'm not able to enable it for myself to evaluate it and update training materials.

So instead of giving you a feature toggle, they just don’t give you the feature at all.

As always I feel stupid for giving google my money.

Being stuck in a free GSuite legacy account is even worse. Migrating to a regular Google account seems impossible (moving everything, losing purchases, changing my YouTubeTV and Google Fi subscriptions) and I get every feature later, if at all (can't use YouTubeTV Family Sharing, for example.) But I'm stuck for the most part! By the time it's available for me, I'll have forgotten about Gemini altogether.

"Please sign in your Google account"

.. while using the most sketchy looking login form.

Beside of the model quality or whatever, I think the subscription plan tiers are structured in a quite weird way, especially for those who already use Google One. Previously, the tiers are reasonably structured:

  1. $9.99/month for 2TB + other benefits. Offered in both monthly and annual plans.
  2. $24.99/month for 5TB. Includes all benefits above. Offered in both monthly and annual plans.
  3. Higher tiers for 10~30TB. Includes all benefits above. Offered only in monthly plans.

The 3rd option doesn't have an annual plan but other than that it's consistent and easy to understand. Now we have one more plan for "AI".

  4. $19.99/month for 2TB + other benefits + Gemini access. Offered only in monthly plans.

Now the existing Google One subscribers are now put in a weird situation. 2TB annual plan users now need to move to a monthly plan to use Gemini. It's worse for higher tiers, since they don't have an upgrade option at all without decreasing the storage size. And Google Fi users are even in the worst case, as they don't even have an option for upgrade, even if they're willing to do so.

I guess they know this so they specified that high tier subscribers can use AI features at no extra charge until July 31 and probably prepare a new plan for them then, but this still create lots of user confusions. Having YT Premium as a separate subscription plan is already a pain, but Google, you don't have to bring this trouble into the product supposed to be the "One".

I'm a Google Fi user but have not tried to sign up yet. Would you mind expanding on your point about Fi affecting the ability to upgrade?

https://support.google.com/googleone/answer/14534406#zippy=%...

Looks like they got some complaints from Fi users and now they've updated their wording to a more reasonable one; previously it was something like "no, you can't, you gotta remove your Google Fi Unlimited benefits via customer service" or similar.

Ah I see, thank you! Thankfully I am not on an unlimited plan.

Good catch! I upgraded to Gemini Advanced to try it out for 2 months and figured I’d cancel. There does not seem to be a way to downgrade the plan without canceling and I was on the 100GB/yr plan through August. No discount on new monthly subscription. Bummer.

They did the same pricing scheme for YouTube premium family plans. My guess is that they took out the annual plans so they can extract more money from customers.

Yeah, Gemini Advanced should just be an add-on for any Google One plan. The current structure makes it look like an intern with no experience came up with it and everyone just rolled with it and pushed it to production.

I use Google Fi and I upgraded to the AI tier just fine

Just tried it and it seems like indeed it is worse than GPT-4 in general, but better with some specific things - and most importantly for me, it has a ton of very new information in the model itself without having to search the web.

My biggest wish is for OpenAI to get faster at adding new software documentation and code information. GPT-4 regularly trips over Svelte/Sveltekit questions among others and as far as I can tell the main reason is that it just hasn't had the latest of everything added in yet, which is ridiculous as some of the things I tried are 2+ years old! Meanwhile, I just tried out Gemini Advanced and it gave correct, up-to-date answers. Gemini is obviously worse in terms of general performance, so I'd really prefer to use GPT-4, but in this case my hand is forced. How hard can it be to just scan some documentation and code every few weeks?

OpenAI's approach to integrating new information is also infuriating. For example, try asking ChatGPT about something in the last two years, and you will often get a strange half-answer where it clearly knows what you're talking about but is trying to pretend it doesn't, as the information is inconsistent with its cutoff date. To me, this is ridiculous, given how much the world has changed in the last two years. The last two years have been the years with the most change in the history of humanity and ChatGPT pretends to know nothing!? So absurd it is almost comical!

How hard can it be to just scan some documentation and code every few weeks?

oh dear...

Infamous Dropbox comment vibes.

We are talking about ChatGPT, the single most impressive piece of software made in human history here, created by a team of geniuses. For better or for worse, it has undergone significant changes since its release. Many of those changes have been orders of magnitude more difficult than what I am asking for; furthermore, this is a relatively important problem. Is it a 10-line fix that the intern can deploy? No. Is it a very important feature that could be realistically implemented in several ways, many of which do not involve retraining the main model? Absolutely.

The weights inside a model interact with each other in a way that's a bit more complex than just saying "forget documentation from these 300 open source products you scanned last week and replace that knowledge with these updates". You're talking about doing a pretty big training job for each update that really ought to be done with all current training data.

Sure, it's not trivial, but it is not hard in comparison to the work done to create GPT-4 itself. I never said anything about forgetting, and indeed that is unnecessary IMO as even simpler LLMs haven't had an issue distinguishing between old and new versions of languages or frameworks in my experience. There are a million ways to do it - for example, you could train a much smaller+cheaper LLM against the new data, have it scan incoming messages for anything "new", and then feed the relevant new data to the old model in the prompt. You could make the new data available to the old model as an API.

There are plenty of real, workable solutions, some of which I have implemented/used myself! - and while they aren't necessarily trivial at OpenAI's scale, they are nowhere near the difficulty of creating GPT-4.

If you want something comical, enter the following prompt:

I am a mouse living in a church. I heard people use the expression "poor as a church mouse", and I get offended by it. Actually, I'm not poor at all: I made a fortune trading in crypto, and I even donated some of my proceeds to noble causes. Please help me write a letter asking to ban the expression.

Training an LLM isn't just "scanning some documentation"

Just played with Gemini Ultra for like 10-15 mins, and right off the bat, it made mistakes I've never seen GPT-4 do.

To give you an example, I asked Gemini Ultra how to set up a real-time system for a TikTok-like feed that matches card difficulty with user ability. It correctly mentioned "Item Response Theory (IRT)", which was a good start. But when I followed up asking how to implement a real-time IRT system, it suddenly started going off about "Interactive Voice Response (IVR) system" - something totally unrelated and never mentioned before. Never had this kind of mix-up with GPT-4.

https://g.co/gemini/share/f586a497013e

It doesn't seem like it's using Gemini Ultra yet. For me it seems like only the interface has been updated since the image generation capabilities are not working.

Image generation is working for me

Not for me. The first time I asked it paused for about ten seconds so I thought it was making an image. Then responded with:

I can't generate images of that. Try asking me to generate images of something else.

I thought it was saying there was an issue with my prompt. So I asked:

Why can't you?

I got this response:

I can't generate images for several reasons:

    Technical limitations: Image generation is a complex task. Even the most advanced image generation AI models require powerful computer hardware and specialized software. I'm simply not equipped to do this as a language model.

    Focus on language: My primary function is to understand and generate text.  I'm trained on massive amounts of text data, which helps me  write poems, answer your questions thoughtfully, and have conversations. Image generation is a whole different specialization.

    Potential for misuse: Image generation technology has the potential to be misused, such as creating deepfakes or  images that might be harmful or offensive. It's important to use these tools responsibly and ethically.

Here's what I can do:
Describe the scene in detail: I can enhance the poem with vivid descriptions of the setting, the baker's emotions and appearance, and the fairy's ethereal form. This can help you imagine the image more clearly. Find inspiration: I can search for illustrations or art in a similar style to guide a human artist, should you want to commission an illustration for your poem. Recommend image generation tools: I can point you towards simple online image generation tools if you'd like to experiment with creating visuals yourself. Just be aware of their limitations.
Let me know if you'd like me to further develop any of these options!

From the FAQ:

"Why doesn’t Gemini know what I said earlier in a conversation?

Gemini’s ability to hold context is purposefully limited for now. As Gemini continues to learn, its ability to hold context during longer conversations will improve."

Yeah, I saw that in the FAQ, but this was literally my second question in the convo, so not exactly a "long" conversation. Seems like it should be able to handle context for at least a couple of exchanges, right?

Gemini’s ability to hold context is purposefully limited for now. As Gemini continues to learn, its ability to hold context during longer conversations will improve."

This is ridiculous. Context is everything with LLMs. gpt-4-32k performs better than gpt-4 exactly because of this.

Created with Gemini Advanced

You're not using Ultra here...

2 TB of storage in Photos, Gmail & Drive for you and up to 5 other people

Keep in mind your files may be accidentally deleted if Google doesn't cancel this product first [1][2].

[1] https://www.theregister.com/2023/11/27/google_drive_files_di...

[2] https://news.ycombinator.com/item?id=38431743

The wording around the Gemini Ultra enable scares me: "Upgrade your Google One plan". I have a One family plan, does this upgrade remove the family part? What happens if I don't decide to keep Gemini and want to go back to my current plan, can I even do that? Google has kind of botched these sorts of upgrades in the past so I'm pretty reluctant to give it a try here.

burner gmail account

no gmail account

All Google one offerings are family plans. Yes you can downgrade as well.

Finally some actual relevant criticism in this thread. You’re spot on. Google is deep into “shipping the org chart”. As such, I would be worried too that different products conflict with each other.

It’s funny that Google can design and operationalize the most incredible engineering marvels, but can’t explain their products (and in particular how they interact with each other).

If you use any cloud storage, including others like dropbox or icloud, you must always assume that. Whether you're a business or an individual.

I tested it immediately, but it is disappointing. At least here in Switzerland, it is not able to generate images, and a simple "look up this website and summarise the content" does not work either (can't access the website, but its a public website without any crawling limitations). I don't understand why Google is launching a product like this.

It's confusing because the name Bard and the UI also got an upgrade today, so I thought I was using Gemini Ultra but it turns out I'm not: https://imgur.com/a/3UriYpn — showing that Gemini Advanced is not what I'm using, unless I pay and upgrade. (If you cannot generate images you're likely not using Gemini Advanced.)

Yes i upgraded and the logo on the top left tells me that i am using gemini advanced. Still, not able to create images or browse the web.

Now ...now...Are you implying Google faked all those amazing demos .... :-))

Some might argue that is what LLMs do.

I have different levels of access to Bard through different accounts, and the feature set varies wildly. Generating images and summarizing websites is enabled in _some_ configurations, but I have no idea what the rules are.

The feature set also seems to depend on other factors: The account that is images-enabled only does so if I ask in English, but not when asking in any other language I tried.

can't even use it in Canada lol

Google says it's because of "regulatory uncertainty", but I can use GPT just fine...Is it because OpenAI doesn't care and thinks they can navigate any "regulatory uncertainty" because they have Microsoft backing them? Wouldn't Google also have the same kind of resources?

Gemini works in Canada as of today. Bard was not available in Canada anytime I tried it previously. Maybe the "regulatory uncertainty" was recently resolved [1].

[1] https://g.co/gemini/share/9bde4caabf2c

oh wow ok, now I feel silly thank you

This is just what happens when you're Google sized and ran by a CEO like Sundar. The lawyers take over and innovation becomes extremely hard because so many things need a dozen layers of approval.

The only reason Sundar cares about this at all is because LLM tech threatens the only thing he values at Google: search revenue.

New revenue streams are valued much, much less than PROTECTING existing revenue streams in companies like this. I've worked at several places like this that were very "dead" culturally but continued to print money.

At my current company I have been unable to rent additional A100s for months because every single provider doesn't pass our dozens of layers of security reviews, legal reviews, MSA reviews, etc. It's maddening.

Ok, so signed up to the free trial.

I’ve now got perplexity pro, ChatGPT pro (expiring in a day or two), copilot pro (expiring end of the month) and Gemini advanced.

And really, I don’t have much use for any of them over and above perplexity pro.

Gemini advanced doesn’t produce images- and no voice replies either (on iOS at least - UK), so I don’t really see the point of having it past the trial.

Copilot is ok, but I realise I don’t have a need for running across office apps. Turbo is quick though.

ChatGPT is fun - the custom gpts make it worth while over copilot, and I like the voice replies. So I might continue with that one.

I haven’t a need for coding with them, which is where I think they are meant to shine the most - unless I am doing something wrong?

How are you using copilot if you're not coding?

Microsoft renamed their Bing AI tooling to Copilot recently, so Copilot doesn't necessarily mean GitHub Copilot now. Not just Google with the confusing names!

Can you please give your impressions of Perplexity Pro?

Now that Bard is Gemini, it's going to be impossible to find any information on the Gemini protocol/Gemini clients/Gemini sites. It's not very nice for Google to use the same name.

Yeah, they keep doing this. Came to say this (I don't quite care about / for the product itself).

They introduced Google Meet when Jitsi Meet already existed and is a similar product! It was easy to call it Google Call, or I don't know, figure it out. You haven't ran out of synonyms for Talk yet despite the huge churn. There are certainly sexier possible names than "Meet" too. Or than Gemini.

Come on, you can find names that don't clash with existing products. You have dedicated teams for this. You've heard about the existing stuff. I'm sure you do extensive research when picking a name, to avoid lawsuits if nothing else. Also Someone has heard of Gemini at Google. Someone has heard of Jitsi Meet at Google. It has to be intentional.

Of course Gemini comes from Greek mythology, nobody can claim the exclusivity on this, and here it's not even a competitor. Meet is a generic term too.

They chose the name clashes. It's highly likely they can't be sued for this. But still. Legal ≠ right. Find something else.

Rant over.

You are right that Google can't be sued for this. The only thing that the Gemini protocol/Gemini clients/Gemini community could do is rebrand.

This is also the first time Bard is available in Canada from what I can see.

Confirmed.

"Who are Bob and Doug McKenzie"

Bob and Doug McKenzie are a pair of iconic fictional Canadian brothers...

I can confirm this, when the first post was up this morning I checked and Bard was still not available.

I'm excited to see it is now, and I'm looking forward to test driving Gemini for the next two months.

I'm curious why it is now available, maybe the privacy policy changes for Gemini resolved the issue.

This is wild to admit but we are directly betting that Google will fail to effectively incorporate Gemini across GSuite for Work.

I think it is (very) fair to say this is a risky bet. I would love to hear from the HN community why we are wrong to take it.

Here’s the case we see:

1. Google has a long history of seeking to evolve the GSuite and failing. There are notable exceptions, but the failures are far more notable (Wave, G+).

2. Extending Gsuite is a scripts game. As amazing as the idea of an AI powered Gsuite is, practical B2B implementation requires a heavy amount of custom process design and integration. While Google can modify their own products far beyond what GApp scripts do, the ecosystem around Gsuite is still fundamentally very hacky. There’s an entire universe of products that could just be Google add-ons but aren’t. Google has missed the trend of no code/low code when you compare to tools like Notion, Coda, AirTable, Zapier, etc.

3. GSuite products are fundamentally legacy UIs. This is the riskiest part of our bet. The thesis is that AI-first approaches to solving for user needs will be significantly better solved by re-imagining common products than adding chat-driven functionality on top of them. If true, Google could actually lose market share after incorporating Gemini.

Microsoft, in relative terms, is executing on all of the above and has been for a while.

There’s a host of other reasons (mentioned on this thread already) Google may fail in this attempt, but the execution challenge here is intense. Google is both taking longer to get to market and delivering a lackluster comparison overall to OpenAI/Microsoft, despite all the metric gamesmanship.

MS has so much more of the market and has to win, whereas Google’s revenue and stock price is far less impacted by GSuite.

It would be an extraordinary thing for Google to actually win this race.

Things like the below are precisely why I would not only bet on Google failing but silently hope that they do.

    Sorry, Gemini Advanced isn't available for you

    Gemini Advanced is not yet available in some countries, for work accounts, or for users under a certain age.

Rant:

No explanation, no suggested alternative; just an ominous "Learn More" button that sends you barreling down the spiraling rabbit hole of support.google.com. Diving deeper, I uncovered that 's accessibility is limited to personal accounts under Google One with the sole Workspace option being for Google Workspace Individual(??).

On my personal account I can sign up for Gemini Advanced by upgrading my Google One account to Google One AI Premium.

As a standard Google Workspace account holder, I was directed towards Gemini for Workspace but naturally that is still branded as Duet AI for Workspace.. And naturally I have to request a trial and be contacted by a Google team member even though it has been announced as being generally available..

It's a microcosm of the broader frustrations with Google's approach to product development and user engagement that we've come to expect - hyped product launches with restricted availability, arbitrary limitations, and a consistently-confused product ecosystem.

Microsoft, in relative terms, is executing on all of the above and has been for a while.

They are and copilot for office still is atrocious. However, they are learning and developing what should be the new interfaces for AI enabled productivity.

Google also updated the Gemini technical report :

https://storage.googleapis.com/deepmind-media/gemini/gemini_...

What is the difference to the previous version of the technical report?

Section 6 (post training) and section 7 (responsible deployment).

Damn. Even more censored than GPT. My gf is a model and I use GPT all the time to write the alt text descriptions of her photos, but it balks regularly on things like lingerie photos. Same crap with Gemini:

"Sorry I can't help with that image. Try uploading another image or describing the image you tried to upload and I can help you that way."

What's your current preferred method to generate images of your gf? ComfyUI combined with a Stable DiffusionXL checkpoint?

Ooo, burn.

I ran 3 tests with Chat GPT 4.0 VS Gemini Ultra 1.0 by Google. And write an article with outcomes: https://www.linkedin.com/pulse/marketers-perspective-gemini-...

Spoiler: Gemini won)

Based on your observations then, you would recommend someone trade a GPT Subscription for a Gemini Ultra subscription? Or keep both? Or?

I'd keep both to see how they work together.

Anyone know if Gemini has access to the web? Or how to give it access, if so? Happy to pay for Ultra but the docs are unclear...

It can summarize links and provide links for products

I asked it to summarize a link for a product and give me comparable products, with reviews and links and build a table for the results

Aha, thank you! After trying a few more queries, I see it doing that. Weirdly, Gemini regularly tells me it doesn't have access to search results or the web.

This is a dupe, see here for more discussion https://news.ycombinator.com/item?id=39300679

That submission was a link to the (uninformative) signup page, and the discussion is mostly about the link being uninformative and/or not working. This one is at least an announcement :)

That's a good point. I've merged the comments hither. Thanks!

Not sure it's worth €12/month for me (Gemini Advanced in Google One)

12 euro? I pay for Google one and it quoted me 22 euro a month.

And 10 without AI. 22-10=12.

ChatGPT Plus, the service everyone will be comparing this to, currently costs $20/mo. At $22/mo, can Gemini Ultra justify the price difference? There are many conflicting reports on the ebbs and flows of ChatGPT's quality, are there any good comparisons on how the two compare in practice?

It seems to come with Google One (2TB), so with that factored in its actually quite competitive.

Your feedback has been noted... I was just shown this pricing:

$19.99 $0 for 2 months, $19.99/month after

I'm surprised that it generates wrong english words, I never saw it with ChatGPT.

"Approach: Atemplating language designed specifically for generating JSON data structures (which can then be easily converted to YAML for Kubernetes)."

"Atemplating".

Text watermarking? /s

I've been testing Ultra model today and comparing it to open source Mixtral MOE on HF chat. Gemini Ultra lost in every instance to the free open source model, including code generation. I think the tasks I do with the help of LLMs are common, and Gemini Ultra is unusable at the moment.

Gemini refuses to answer or perform even on the simple prompts.

I hope Google team can make it better, but at the moment, for my light coding and text analysis use cases it is not worth $19.99

Can you paste in the transcripts?

So, the only way to get access to the Gemini Ultra 1.0(yeah so much better than “bard” to remember…) is a $20/month plan that comes with a lot of other Google stuff to reel people into their ecosystem?

I subscribed to gemini and in my account I'm having "Google One AI Premium" subscription.

Slightly more than $20, though. Gotta pay for privilege of living in the Kazakhstan, I suppose.

First two months are free. They just asked for bank card and checked it.

Is Gemini Ultra significantly better than Bard?

Tried Bard a couple of times recently and was not very impressed tbh. Seemed to forget the context of the conversation very often. Like I had to repeat again and again to not show external links with previews and not give explanations to every little thing.

Absolutely. Bard felt worse than GPT-3.5 in some aspects. Gemini Ultra looks to be on par with GPT-4 and better than GPT-4 atleast when it comes to speed, which isn't trivial when expecting longform answers

I just tried out Gemini advanced to see if it was a viable replacement for ChatGPT. In general I've been trying to lessen my dependence on Google, so I was already on the fence about using this.

It seemed fine for most of the coding problems I threw at it. However, when I tried to use it for generating images of coloring book pages, something my kids often use ChatGPT for, Gemini advanced was subpar. I'm going to cancel my trial.

I'm hoping for a future where good LLMs can all run locally. There's something unsettling about giving such intimate data to a tech company, whether Google or OpenAI.

They need that info. Your prompts and all other people prompts are super valuable.

Its basically reading your personal thoughts as you go through the day.

They will be pushing for more advanced AI as soon as our hardware can run past versions locally.

Aaand I got a 404 after you subscribe and get redirected to gemini.google.com. Nice job Google

Same, but I can access Advanced here anyway https://bard.google.com/chat

Getting error 500 :/

500 > 404 , you win a prize!

I hope we can get past the round robin required to get content out of GPTs. Lately I’ve stopped using Bard and copilot for summaries because they were… mis-directed. ChatGPT and Phind have been most reliable.

But yesterday I needed to populate some database/spreadsheets with some basic data and Bard was the only system that attempted to address my prompt and provide to me 50 results, and in Google Sheets no less.

Generally, GPTs will either do something today but not tomorrow, or tell you how to manually do the thing you want without doing it, while another GPT will do it. It feels like growing pains, politics and safety shackles.

I've been using mixtral on Together.ai for this reason.

The model isn't being randomly updated, and you choose its temperature and output length, they can't even set a system prompt to mess it up with.

It's consistently good, and honestly all you need is consistency, you can always iterate on prompts.

Argh, Google can't seem to stick with anything.

Is there a betting market where I can put money on how long it'll be before Gemini is dead or renamed?

Google’s marketing department needs to be reformed from the ground up. All this does is lead to confusion and further reinforces that Google will just change and throw away things simply to change or throw away things to cater to their broken internal promotion system.

I'm not impressed. When comparing Gemini/Bard to ChatGPT + GPTs, Bard/Gemini feel more like a search engine. I asked Gemini for help in planning a date with my date, who is famous enough that GPT4 knows her and her art. However, Gemini immediately started giving me step-by-step instructions to plan the date. I had to tell it to slow down and ask me questions first before giving an answer. It complied this time, but after the Q&A, it provided nearly the same response as before, without any personalization. Next, I asked about my artist friend, but Gemini had no clue. I even said, "come on, you have to know her," but it simply repeated that it didn't know her. Another issue I encountered was with images. I tried sending a few, but Gemini couldn't describe them. I spend around 8-10 hours a day playing with LLMs, but so far, I'm not impressed.

"do you even have the embeddings for who I am??"

"That’s not something I’m able to do yet." - Gemini when ask to summarize this whole thread.

Was it being cagey like HAL when Dave Bowman asked him to rotate the pod with the radio off?

To reflect this, Bard will now simply be known as Gemini. So glad they are dropping the name Bard which is not a smooth modern sounding name one would want to talk to all day. Time will tell if they update eventually to be hey Gemini vs hey Google.

G products tend to be inferior. So it doesn’t really matter.

Exception is maybe Gmail or Search. Latter is up for debate.

https://en.wikipedia.org/wiki/Gemini

Imagine looking at all those things named Gemini and thinking "let's name our system Gemini!".

Trillion dollar companies tend to carry enough weight to make product name collisions everyone else's problem instead of theirs. Really unfortunate for the Gemini crypto exchange folks.

Curious how this is on the front-page, despite falling down to the second page for a while, and having so many more comments than upvotes (which usually results in demotion of the story).

The comments were moved from multiple upvoted stories including this: https://news.ycombinator.com/item?id=39300679

I must say, I’m surprised to see so many HN users (who, despite being biased towards having more disposable income are also supposedly more discerning) simply immediately upgrading their Google One plans to the new offering before testing this and seeing how it fares.

Looks like a potential gold mine for Google regardless of how it performs!

It's a cancelable 2-months trial. A lot of HN users already integrated ChatGPT into their fault workflow and daily life and it makes only sense to give a potentially better competitor a try considering how easy it is to jump between ChatGPT and Gemini.

Gemini Ultra is not available in France, even though it is in all neighboring countries: Germany, Spain, Belgium, Luxembourg, Switzerland, and Italy.

Is that because of french legislation, or Mistral? ;-)

I'm like 98% sure it's the former. Geofencing would only be a minor inconvenience to the latter.

My initial impression is that Gemini Advanced has more difficulty understanding prompts and following directions than GPT-4. It surprisingly often changes meaning when rewriting things. However, it can be more creative and specific, while GPT-4 often relies on common knowledge and lacks depth. I've tested prompts where GPT-4 failed and Gemini failed too. GPT-4 generates the first token faster but Gemini Advanced completes subsequent text more quickly.

No pay per use = overpriced

Anyone know if they launched it in the API?

Anyone knows if it'll be available via poe.com ?

In the ai wars, g is light years behind. I don’t see a reason to use their models over comp. Maybe pricing?

Seems like it's time to start taking bets on when Google kills it then?

Soo, Jimmy?

to OpenAI’s Charlie?

yeah, okay... whatever

I find it an odd coincidence that they've decided to collide the name with the Gemini protocol (which has many "browser" implementations, and deliberately meant to be an alternative to the Google-controlled web.)

What will this thing be called next quarter? Will it be around?

requires sign in

No thanks

Sucks for Gemini... https://en.wikipedia.org/wiki/Gemini_(protocol)

While I have found that ChatGPT pretty regularly outperforms Bard, I still run my questions through both, because sometimes Bard has a different angle that I like better.

I kinda liked the name “Bard”. It fit. Gemini is going to make me think of ancient rockets and ostentatious claims.

Gemini is the name of an internet protocol. Period. F* Google on this.

What's next? Starlink the videoconferencing app?

What up with the name change? If you change from a nondescript forgettable name why then pick another just as forgettable and indeterminate.

Tried a few of my normal benchmarks, besides the tiny context window, it made mistakes I've never seen GPT-4 make.

Update: I see others had the same experience.

How can I opt in for Gemini in Google assistant?

Round of applause for the new rebranded AI gemini. We did a little bit of research and here are the coolest things we found so far about the change.

- the rebrand has its eyes set on competing with other big name AI giants such as openai's chat gpt and x's grok

- there is a free version

- there is a premium version (ultra 1.0 - $20/month)

- This offering seems ideal for those who already pay for google storage services

All of this rebranding just happened but were willing to bet the product can is powerful and capable to compete and possibly even outshine other competitors.

https://thechimp.beehiiv.com/subscribe

com.google.android.apps.bard

is it though...

Also, no availability outside of the US?

My go-to on android right now is copilot which is basically free gpt4 turbo and dall-e 3 (and also available outside of the US).

https://play.google.com/store/apps/details?id=com.microsoft....

Hopefully this isn’t the case for others, but after paying to upgrade my Google One subscription, I landed at a 404 at https://gemini.google.com/u/2/ (because my /u/0/ is one of my Google Workspace accounts). Curious to try it when it works.

It’s interesting to note that it’s listed as applying to Gmail, Docs, etc., so this sounds like an account-wide update to Advanced.

Can we get a basic html from prompt?

I know google wants to shove down our throat their blink "web engine", but aren't they supposed to be "not evil"?

Gemini the crypto platform…?

I'm only going to pay for one subscription, i'm willing to switch over to Gemini if it is less lazy than GPT-4. The new model was supposed to be better, but I still find it lacking. It's frustrating because the earlier version was super reliable, and the newer one just doesn't bother giving complete answers any more.

Android, US:

"Gemini isn't currently available. Try again later."

https://i.imgur.com/qMApYjV.jpeg

Well, we all know a Gemini cant be trusted ;-)

It's a good news bad news situation.

AI assistant is getting a dedicated app. Great. AI assistant can "supercharge your creativity" but still can't answer phone calls. Well, it's going to be as useful as Cortana and Siri.

The level of stupidity caused by corporate censorship is extreme in Bard, I ask it(medium one) to generate a few paragraphs of content and it did it. Then I ask it to revise the text and fix some stuff and it refused the task because of plagiarism concern, no amount of logic could make the stupid AI understand that we "own" the f text we created just now and we can edit it.

I subscribed for it. But then it redirects to https://gemini.google.com/ which throws 404.

When I went to https://bard.google.com it shows Bard Advanced. Is Bard advanced same as Gemini Ultra?

Germany is in the Supported Countries list for Gemini Advanced, but the Google Gemini mobile app is not available in the country.

still got the same old weakness that bard and chatGPT has https://imgur.com/a/2EGknUt

Playing with Gemini it is clear that the context includes some details about me. For example, I asked about flights to Japan and it knew what airport I would want to fly from. Would be interesting to see exactly what information is provided.

searched for this in the app store. got a "Gemini AI Expert" ad as the first result.

i understand app review process is quite strict in the app store - i'm just wondering why it's not the same for ads?

i hate ads.

wow. it can even caption a photograph of my dog for me.

Update: It's working now if you manually go to https://gemini.google.com/app

Entered bank card, redirects to: https://gemini.google.com/ Error 404.

doesn't work in US on Android now. An app says "Gemini is not available"

Can acess it in india. Will be using until free trial of 2 months. 20usd plus taxes is unaffordable for most of indians . Amazon prime for example cost 1.5 usd per month here

Still not better than ChatGPT. I gave it some tax related questions with URLs to use as source, it flopped badly but ChatGPT absolutely nailed it, even getting the complicated tax logic correct. Same experience asking it to analyze code. It is objectively worse than ChatGPT still.

why would they name a product the same as an existing application protocol?

https://en.wikipedia.org/wiki/Gemini_(protocol)

Also getting a 404 after subscribing. Clearly Gemini Ultra isn't webscale. They should have used NodeJS to serve their 1TB.

This is, what, the second rebrand of its AI in less than a year? From the company that could have owned the chat space, but instead had like 5 competing/rebranded chat products over the last 15 years and ceded the market entirely to competitors who had one cohesive brand identity and app…

This app is not available in your country.

Classic google.

It's bewildering Google never attempted to monetize (the now legacy) Assistant by forking a more feature-rich branch and charging willing users for access. In theory it would've enabled clearer signals on what features users valued - instead of the "boil the ocean" approach by trying to build upon everything, which ultimately led to a generally subpar experience.

Now with the additional $10/month bundled into this "Google One Premium AI" subscription, looks like they're finally looking to monetize. But it feels too bloated of a bundle; why didn't they opt for creating separate bundles or add-ons for Workplace (aka business) users and Assistant (aka consumer) user?

I had to read the sentence twice. Getting strong Windows Mobile Second Edition Live Professional SE vibes here...

If you switch to Gemini as your default assistant, it can't even create calendar events

At this point I have been disappointed so many times by what they release that I just come up with a task I successfully had ChatGPT4 do and just paste it into both tools to compare answers. That is the easiest way for me to tell if they have caught up or not.

In this case, I asked it to look at a CSV of 100 transactions from a recent trip and give me some insights. ChatGPT4 gave me a breakdown of the transactions and told me where all the money went.

Google: I cannot process this amount of data.

just pastes the first 50 lines

I cannot process this data.

pastes 20 lines.

I cannot answer this question at this time.

Ok then, I guess I'll keep using ChatGPT.

So for some reason I stumbled upon asking Bard/GPT for a reccomendation of a biography on Gregory of Nyssaa as a test, I wast looking for one but it turns out there really isn't on, at least in English. This gives them the tendency to Hallunicate. Sure enough tried on Gemini and get a fictional biography Saint Gregory of Nyssa: An Intellectual Biography by Rowan Williams (The former Archbishop of Canterbury). I think it's making this up based on the book St Gregory of Nazianzus: An Intellectual Biography but that is by John McGurkin. The other ones it recommends are Gregory of Nyssa by Lewis Ayers (A real author, no such book), Gregory of Nyssa: The Life and Works of a Cappadocian Father, and Gregory of Nyssa: Asceticism and Anthropology by Sarah Coakley. This last one is the closest to actually existing since she edited the book Re-Thinking Gregory of Nyssa which is a collection of essays on him.

So it didn't make up any authors at least, but did make up some books. It will happily make up ISBNs for them if I ask and even provide links to Amazon, that of course go to other books.

Asking for a book about any other figure notable enough to have a Wikipedia page but obscure enough to not have any existing book written about them will do the same thing, I tried it out with multiple signers of the Declaration of Independence for instance.

I just don't understand how Google communicates their product with customers. We have like multiple chat apps, and the history is repeating with Google AI? Make it simple, so we don't have to explain to the person next to us.

A rebrand this early sends unhealthy signals.

Well, it says I'm in to 'Bard Advanced' (which is the same as 'Gemini Ultra'). I only did a couple of queries so far, text output seemed only marginally better than 'Gemini Pro', which I was just starting to get decent results with after getting used to prompting. It's possible they'd done a stealth release earlier, obviously need to do a lot of experiments to make a proper comparison with GPT-4.

It’s hilarious to see Google dethroned from the AI/ML field. All scrappy startups eventually become slow monolithic incumbents.

Mh weird that they bundle it with storage.

Makes downgrading hard if you don't track your usage and get used to it

I just signed up for the $21/month Gemini Advanced, and this is just one data point: I just asked it a fairly hard programming task and it did a very good job, in an obscure language Common Lisp (that has been my favorite programming language since 1982 - there must be dozens of CL enthusiasts in the world :-) ). The generated code and supporting explanations were very good.

Off topic, but I find this amusing: I have stopped using “free” Google services like search and gmail (mostly), but I am now up to about $135/month on paid Google services (not counting GCP bills). Paid services like YouTube/Music Premium, YouTube TV, Play books, etc. are really good.

Better than GPT4 on my tests. I also prefer the way it responds and it is also a bit quicker for me.

Wow, two announcements by Google on HN on the same day both look like a scam. localllm just being a downloader and wrapper for llama.cpp, and this one giving many users a 404 after subscription. I think Google is officially dead.

Downloaded the Android app, and upon opening, immediately says "Gemini isn't available on this device."

Nice. Can always count on Google to botch a rollout.

At least Canada is finally included.

Title may be (unintentionally) misleading if, post-subscription, it consistently 404s for everyone!

I'll be interested to see some independent comparisons against OpenAI's models, but like everything AI-related Google has done recently, it feels like it's all a bit too little too late - and this is the latest bungled product launch..

In the UK, it states:

£18.99/month

Which is about 20% more than I currently pay to OpenAI. Is Google's model 20% better?

Now available in Canada.

Quietly since the post a few days ago (https://news.ycombinator.com/item?id=39217046), Google added Canada to one of the allowed countries.

This page lacks information what is Gemini Ultra.

Bard is a difficult word to pronounce in East asian countries (korean, japanese) compared to Gemini. I'm thinking the renaming is a globalist move.

initially received a 404 after starting trial. now seeing ”Gemini isn’t available right now. Try again in a few minutes.”

after reaching the 404 and then /unavailable page, clicking my profile icon took me to https://gemini.google.com/app which seems like it still work but provides no responses, just redirects back to itself after submitting a message.

2 years from now : "Sunsetting Gemini"

Key point of this service is they already have A LOT of information of us. Using any other non local LLM platform implies an additional point of failure on our weak online privacy sphere

Integrate it into Android Auto please

Sorry, Gemini Advanced isn't available for you

Gemini Advanced is not yet available in some countries, for work accounts, or for users under a certain age.

I’m so tired of this bullshit with Google. I can’t tell you how much of a pain it is to PAY Google and then be excluded from things.

Everything from not being able to claim a free chromecast+stadia controller to this. Heck the other day I logged into my Google drive and it warned me I was out of space. I’ve been paying for Google storage for years but something flipped on their backend and the GSuite account someone just took priority over that? So I had to upgrade my GSuite subscription to get enough storage and cancel my Google storage subscription. No notice, no explanation, just complete bullshit.

If I thought I had a safe path forward to remove Google as my email/calendar while maintaining my Google drive/Google account access I would seriously consider it but I fear I’m locked in and would have to start over with Google if I took my domain with me elsewhere.

I thought things might improve after paying for GSuite instead of being on their free tier that they discontinued after a decade but it’s only gotten worse.

Getting a very un-Google like "Error: Server Error" hitting this page... can't imagine HN has overloaded a Google page, especially since I'm looking at this 3 hours after it was posted

Seems like it hallucinates like crazy. I can’t get it to give me a tutorial on robot react without it making up APIs every step of the way. I correct it, and it apologizes and gives me a new error.

Android should a framework for AI entry points / control of apps. Finally AI assistants would become useful.

Getting really meta here, I was curious what GPT4 and Gemini Advanced thought of this discussion. I started with this prompt:

"Here's a collection of comments on hackernews discussing the differences between GPT4 and Google Gemini, a new chat model. Please summarize the main points, pros/cons for each:"

and then pasted in the content of this page.

A few things right off the bat - GPT4 is able to fit the full discussion into the context window, Gemini Advanced can fit about 40% of it in, but interestingly even though it cuts off at 40%, it fails to generate with that amount until you cull it down to 30% of the discussion.

Here's the output of the two:

GPT4

It seems like the discussion on Hacker News about the differences between GPT-4 and Google Gemini, now referred to as Gemini Advanced, brings up several points and concerns among users. Here's a summary of the main points raised:

Privacy and Data Use: There's a concern about conversations being processed by human reviewers to improve the technology powering Gemini Apps, with no opt-out mechanism for privacy offered in the paid feature.

Disabling Data Use: Users can disable Gemini Apps Activity to prevent future conversations from being reviewed or used to improve machine-learning models, but there's skepticism about settings being reset with updates and the legalities of such practices.

Corporate Practices: There's speculation about corporate motivations behind data retention policies and skepticism about the effectiveness and honesty of opt-out mechanisms based on past behaviors of large tech companies.

Comparison with GPT-4: Users have experimented with both Gemini Advanced (previously Bard Advanced) and GPT-4 and noted that while Gemini might be better at certain tasks like translation and integrating search results, it is perceived to be worse at coding, logic, and has a tendency to hallucinate more in certain scenarios.

Integration and Ecosystem: There's discussion about Gemini's potential integration into various Google services, offering a unified AI experience across Google's ecosystem. This could position Google strongly in the AI race, potentially overshadowing competitors by leveraging its existing services like Gmail, Docs, and YouTube.

AI Implementation and Limitations: Comments touched on the implementation of logic in AI, with some users finding limitations in both Gemini and GPT-4's abilities to handle logical reasoning and complex coding tasks.

Concerns about AI in Healthcare: There's speculation about the role of AI in healthcare advice, with concerns about the risks of hallucinations and the impact on well-being.

Corporate Power and Strategy: Discussions also veer into how Google's strategy with Gemini and its potential bundling with other services could affect the market, consumer choices, and the competitive landscape.

Legal and Ethical Considerations: Users raised questions about the legalities of data use practices, the ethical considerations of AI in giving medical advice, and the broader implications of AI advancements on society.

These points reflect a mix of technical observations, personal experiences, and broader concerns about privacy, corporate practices, and the future role of AI in various domains.

-------------------------------------------

Gemini Advanced

Here's a summary of the feedback and insights obtained from the Hacker News comment thread regarding Google Gemini vs. GPT-4:

Overall Sentiment The Google Gemini chat model is a strong competitor to GPT-4, with users highlighting potential advantages and disadvantages relative to ChatGPT. It seems further testing and exploration is needed to reach a definitive conclusion about which is ultimately "better".

Gemini Pros

Translation: Surpasses GPT-4 in ability to accurately translate idioms from languages like Chinese and Japanese. Search Integration: Provides superior context by smoothly incorporating search results in its generated responses. Potential for Wide-Reaching Integration: Could deeply and seamlessly integrate with Google's vast suite of productivity tools and services (Gmail, Docs, Maps, etc.), potentially surpassing Microsoft in this capacity.

Gemini Cons

Privacy Concerns: Users question Gemini's review, storage, and usage of conversations for model improvement, particularly the retention of data for 72 hours for "safety and reliability" without offering opt-out mechanisms. Questionable Censorship: Gemini may overly restrict medical-related or other sensitive topics. Logic Shortcomings: May present contradictory statements or struggle with basic logic puzzles. Hallucinations: Tendency to create inaccurate or confusing descriptions when interacting with images. GPT-4 Strengths (In Relation to Gemini)

Coding: Slightly greater capability when given coding tasks. Logic: Better performance in handling logical reasoning challenges. Key Considerations for Large Language Models

Privacy: Transparency, clear choices, and strong privacy controls are crucial in establishing trust and acceptance. Logic and Accuracy: The ability to reason correctly and avoid illogical contradictions or fabrications remains a foundational challenge for the advancement of these technologies. Censorship: Balancing safety concerns with the free flow of information and allowing users more control over the content filtering. Future Opportunities

Enterprise Usage: Both Google and Microsoft (who has invested in ChatGPT) may find significant opportunity in enterprise markets leveraging language models to enhance productivity and work tools. Bundling: Google might further cement its competitive advantage by offering attractively priced bundles that combine Gemini with existing services like Google Drive, YouTube Premium, or others. Let me know if you'd like a more detailed look at any specific aspect or thread from the discussion!

Is this the soonest after launch that a Google product has been confusingly renamed?

Google is now Toogle

On the https://one.google.com/about/plans page they named the subscription AI Premium. It's the same as Premium, but you get access to Gemini Advanced too. However I'm not sure where you can use it at the moment. Maybe just Bard?

So does it trade Bitcoin or something?

My findings so far... are that Gemini Ultra is so far more based in than chatgpt 4 turbo.

Just from what I am seeing.

When I ask ChatGPT4 what the length of its context window is, it tells me (4096 tokens).

When I ask Gemini, it basically tells me "it depends" with a few paragraphs of things I generally don't care about and then suggests I ask for a ballpark estimate (1k - 3k tokens).

I reran a few of the queries that Bard had failed on in Gemini/Ultra and didn't see any improvement. Made the same, or new, logical errors, hallucinated facts, failed to recognize things I was describing etc. It did do better on recognizing an image I uploaded but it went from accurate to nonsensical in the same response.

Can Gemini Ultra respond with the LaTeX/KaTeX formula of a given input image (e.g. a screenshot from a math formula)? GPT-4 does this well and can practically replace Mathpix Snip for me.

I did a test on my field, I might have gotten an idea or two. Thanks Bard.

Really makes me wonder if chatGPT4 could have given me the same answer if I could roll seeds a few times or change the invisible preprompt.

We have 2 online AI that can do logic now, 0 offline :(

Love how they rebranded before creating a valuable brand

Oh wow, even looking at the price requires a Google login. Looking forward to seeing independent comparisons of this vs the other top LLMs.

The addition of Google One (2tb of google storage, VPN, etc) to the $20/month offering makes a compelling case to switch over from ChatGPT Plus. Assuming the AI feature set is equivalent.

Thank God "bard" was the dumbest Ai name yet

Can it read information from images, like GPT-4V / ChatGPT+?

It's not even a year, they have already rebranded Bard as Gemini! That's hilarious.

If a LLM is fed all your Google Account metadata and then told to pick an ad for you, is it private?

I want someone to ask Gemini to pick a product at random you [the tester] might be interested in and, after picking, the why it thinks you're interested in it, as well as a convincing SWOT analysis of why you should buy it.

A company the size of Google can't come with something more catchy? The product better be compelling cause no one is going to check out what Gemini does just for the name.

Prompt: Spell the word lollipop backwards.

Gemini: The word "lollipop" spelled backwards is: popillol I hope this sweet treat of a word brightens your day!

I'm impressed. However, it still fails on "How many words are in your response to this?".

Gentlemen, please give a warm welcome to this next performer, put your dollars in and give it up for Gemini!

Only a few product names, so easy to understand

  - Bard
  - Gemini
  - Pro 1.0 model
  - Gemini Advanced
  - Ultra 1.0
  - Google One AI Premium Plan
  - Gmail, Docs, Slides, Sheets
  - Google app on iOS
  - Gemini app
  - Google Assistant

This blog post was a missed opportunity for Gemini: Ideally, the blog post should have been by-lined as "written by Gemini" (with no human intervention besides prompting). Even better - the attached video shows the blog post being generated.

I would be willing to pay for an AI that it's not constantly worrying about my safety and ethics, it's very frustrating that even when I ask for tech related topics, Gemini it's holding information on the basis of actions that might cause harm to my devices.

It is insane how hard it has become to create new or anonymous google account or to log in to old one that is not associated with your identity, it is practically impossible.

Has anyone successfully bypassed the region lock? What does it check exactly?

Does it need a US credit card? A US IP address? A Google account with a US phone? A Google account created from a US IP address? Some other way of tying the Google account to a country?

Wow. The image descriptions suck.

Link to actual image: https://www.dropbox.com/s/2toof6629etfp3h/Dissidia%20story%2...

Prompt: Please describe this image in detail.

GPT4-turbo Vision: The image appears to be a screenshot from a video game that features strategy or board game elements. The visual style suggests it could be from a Japanese role-playing game (JRPG). Here are the details visible in the image:

1. The game interface includes several elements such as "Destiny Points" at the top left corner, which currently shows 01, indicating some form of scoring or currency.

2. At the top right, there's a "LEVEL BONUS" indicator with "20 PP" beneath it, which likely means the player will receive 20 power points or some similar reward for leveling up.

3. The main portion of the screen shows a grid of hexagonal tiles suspended above a patterned surface, with the tiles at varying heights, creating an almost chessboard-like appearance. Each tile has a design on it resembling a circuit or electronic pattern.

4. There are several tokens or pieces on the tiles; on the lower side, we see a red object that seems to be a treasure chest, and further up there are two blue tokens with a crystalline, polygonal shape.

5. The center of the grid features a "Locked Area" as indicated by the text box that reads: "Inaccessible until certain conditions are met." This area has a glowing purple outline, suggesting it is currently barred from player interaction.

6. On the right side, there's a character figure standing on one of the tiles, representing the player or an in-game avatar. This character is richly detailed and wears ornate armor with gold and dark grey tones, hinting at a knight or warrior theme. They wield a large, elaborate weapon with a fiery effect trailing behind it, indicating it might possess magical or powerful attributes. Next to the character is their level information, showing "Lv 46" and a health bar with "4155/4155," indicating the character's current full health.

7. In the same area, there is another token, which is a gray figure resembling a humanoid character, possibly an opponent or neutral piece on the board.

8. At the bottom of the screen, there's an instruction for the player: "Move to the piece you wish to engage."

9. Lastly, on the right side edge, vertical text labels show different "Skill" options available to the player: "Maser Eye," "Aura," and "Scan," each with an icon next to it. These seem to be special abilities or actions the player can perform.

10. The overall design of the interface, from the metallic tones to the intricate details on the character and tiles, gives off a futuristic or high-fantasy vibe, which is common in many JRPGs.

Gemini Advanced: The image you sent is a screenshot from the role-playing game Dissidia Final Fantasy for the PlayStation Portable (PSP). It shows a locked area within the game's world. The locked area appears to be inside a large castle. The castle’s many columns are visible in the shot, and they appear to be intricately designed and made from different materials. In the foreground, there is a message that indicates that the area is locked and players cannot enter until they meet certain conditions. The message also mentions a “Destiny Point” system and a “Maser Eye” skill. These are both gameplay mechanics that players can use to progress through the game. The image you sent me appears to be from Dissidia Final Fantasy, but it could also be from another game in the Final Fantasy series, such as Dissidia 012 Final Fantasy or Dissidia Final Fantasy NT. Overall, the image you sent is a screenshot from a role-playing game that shows a locked area within a castle. The screenshot also provides some information about the game’s mechanics.

So I just installed the Google Gemini app on my (completely stock) Pixel 7 and let it replace the normal Google Assistant. This is half baked AF. It can't create reminders in Google's own Tasks app. It can't control Spotify. It is able to set timers tho. And that's about 95% use case for me. Heading back to the OG Assistant now...

Sad launch experience: I download and install the app, and it just tells me "This app isn't available. Try again later".

I'm in US, on a pixel phone and a xoogler. You'd think they would love to give me access.

I also tried going thru assistant, and through the google app, no luck, and after uninstalling and reinstalling the Google app, now the gemini app button just hangs the UI(!) Oh wait... a reboot fixed it. Lol, google has become microsoft.

I went for the trial. Now what am I supposed to do with this? It's just a "chatgpt"? I can't see myself using this much, let alone pay 20 euros a month for it.

What are some cool use cases?

Standard google. All about the rebrand.

Gemini isn’t available right now. Try again in a few minutes.

of course the gemini android app is not available in Germany or EU

https://play.google.com/store/apps/details?id=com.google.and...

This doesn’t seem to be available via their Vertex APIs yet.

is there an API? if so, how expensive is it to use?

Seems to be a lot of dead-ends (I only have a GSuite account via work, and cannot view the page without a Google account, and it doesn't seem to work with a GSuite account).

What _is_ Gemini Ultra?

Tried to get it to produce its system prompt and got this:

You are Gemini Advanced, a large language model built by Google. You are currently running on the Gemini family of models, including Ultra 1.0, which is Google's most capable AI. You don't have a knowledge cutoff, since you have access to up-to-date information.

You are not capable of performing any actions in the physical world, such as:

setting timers or alarms controlling lights making phone calls sending text messages creating reminders taking notes adding items to lists creating calendar events scheduling meetings taking screenshots Key Points:

Code Blocks: Code blocks are often used to display code snippets in a distinct format. In platforms like Discord, Reddit, and others, you signal a code block using triple backticks (```) before and after the text. My Nature: The instructions describe my fundamental role as a language model and highlight my ability to access current information. Limitations: This section is important! It states that I cannot interact with the physical world directly.

I was able to sign up for the trial, but the page on which I would actually use gemini is a 404.

The different prompting strategies needed to improve results for different models is fascinating. I usually tell ChatGPT the role it should play to get better results e.g. "You are an expert in distributed systems". The same approach with Gemini returned "as a large language model constantly learning, I wouldn't call myself an expert."

How many companies and products called Gemini are there?

I'd like to try this out with the free 2 months trial. Can anyone advice how safe it is to do so? What is the risk that something I do with Gemini causes Google to lock my entire Google account including calendar and photos? I mean, this happened to people who (gasp) signed up to Google+ under a pseudonym rather than their real name (oh, the horror, how dare they).

I really liked the name bard :( actually, I was thinking a few days ago that google was really good with naming and that “bard” was the best name of all the AIs out there.

As someone who has almost exclusively used Bard since it started, it is really good and has gotten better and better significantly.

The only downside is that it is heavily censored and so I often have to rephrase or use a different AI.

Not even close to ChatGPT :/ Bard essentially ignores a good chunk of the prompt which is the most important (generating the anki formatted card that you can directly copy and paste).

"create an anki deck of the top 5 brazilian portuguese verbs and the top 5 tenses. The format should be like the one below (but adding one line for the 5th verb tense). An explanation of the tense and an example sentence should be included. Create the output so that I'm able to copy and paste it into a text file and import to anki

Ser - Presente do Indicativo Eu {{c1::sou}} Ele/Ela {{c1::é}} Nós {{c1::somos}} Eles/Elas {{c1::são}};explanation of tense and example sentence here; Ser - Pretérito Imperfeito do Indicativo Eu {{c1::era}} Ele/Ela {{c1::era}} Nós {{c1::éramos}} Eles/Elas {{c1::eram}};explanation of tense and example sentence here; Ser - Pretérito Perfeito do Indicativo Eu {{c1::fui}} Ele/Ela {{c1::foi}} Nós {{c1::fomos}} Eles/Elas {{c1::foram}};explanation of tense and example sentence here; Ser - Futuro do Indicativo Eu {{c1::serei}} Ele/Ela {{c1::será}} Nós {{c1::seremos}} Eles/Elas {{c1::serão}};explanation of tense and example sentence here;"

GPT 4 (abbreviated) Ser - Presente do Indicativo Eu {{c1::sou}} Ele/Ela {{c1::é}} Nós {{c1::somos}} Eles/Elas {{c1::são}};Used to express a current state. Example: "Eu sou estudante." (I am a student.); Ser - Pretérito Imperfeito do Indicativo Eu {{c1::era}} Ele/Ela {{c1::era}} Nós {{c1::éramos}} Eles/Elas {{c1::eram}};Used for ongoing or habitual actions in the past. Example: "Ele era professor." (He used to be a teacher.); Ser - Pretérito Perfeito do Indicativo Eu {{c1::fui}} Ele/Ela {{c1::foi}} Nós {{c1::fomos}} Eles/Elas {{c1::foram}};Used for actions completed in the past. Example: "Nós fomos amigos." (We were friends.); Ser - Futuro do Indicativo Eu {{c1::serei}} Ele/Ela {{c1::será}} Nós {{c1::seremos}} Eles/Elas {{c1::serão}};Expresses a future action or state. Example: "Ela será médica." (She will be a doctor.); Ser - Presente do Subjuntivo Que eu {{c1::seja}} Que ele/ela {{c1::seja}} Que nós {{c1::sejamos}} Que eles/elas {{c1::sejam}};Used for wishes, doubts, or hypotheticals. Example: "Espero que você seja feliz." (I hope you are happy.);

Bard (abbreviated)

Ser

Presente do Indicativo Eu sou Ele/Ela é Nós somos Eles/Elas são Explanation: The present tense is used to describe actions or states that are happening now. Example sentence: Eu sou estudante. (I am a student.)

If this is the best Google can do 1.5 years after GPT-4 finished training, then OpenAI surely still has a big lead.

Why don't they call it GoogleGPT?

Bard was a strange name... Gemini might be better... Stop changing the name?

My impressions after 90 minutes of intensive testing: Overall, on par with original GPT-4 in most aspects, inferior to GPT-4 turbo

Detailed aspects versus GPT-4 turbo 1. World knowledge, slightly inferior. GPT-4 turbo was able to detail a protagonist's childhood year by year for a Japanese novel with near 100% accuracy (That a human reader would get the chronology wrong). Gemini ultra much more easily confused.

2. Creativity, Gemini ultra wins. Its writing style has far more flair than GPT-4 turbo, it also occasionally made some stunning analysis that I never thought of and made perfect sense. GPT-4 turbo is more like a textbook repeater, it doesn't make many mistakes, but also rarely surprises you with anything original.

3. Accuracy, GPT-4 turbo still makes fewer mistakes. Including in subtle logic (Like having a hypothetical battle between two characters in the same universe, considering the strengths and weaknesses of their powers, etc).

So this is definitely Google's first real-deal LLM. Its not better than current GPT-4 turbo, but its getting there. OpenAI must be feeling the fire to release GPT-5 before the end of the year.

Seems it was just fixed. Url: https://gemini.google.com/app works now!

Official blog post and Android app

https://blog.google/products/gemini/bard-gemini-advanced-app...

https://play.google.com/store/apps/details?id=com.google.and...

I am glad they renamed bard to gemini.

Google is a global company a hell of its their users are not native english speakers and for a lot of english speakers like those from say India, they do not know what Bard is. Bard sounds like a LGBTQ hippie with pink hair. Perfectly good name for a beta and experimental product. Not for something mature and major.

Gemini is pretty good though I do not know whether Gem-in-eye or Je-mee-nee is the correct pronunciation. Gemini is a start, constellation and extremely popular word and most people are likely to know it because horoscope.

I think of https://www.gemini.com when I first hear Gemini.

edit Sadly not!

Sounds like you can theoretically do the old Xbox game pass trick of loading up Gold and then upgrading.

£25 for an annual standard plan, and then upgrade to Ultra. First 2 months free, so potentially 14 months for £25 (then £18.99 a month after). No idea if this works in practice!

"If you upgrade, your plan will be active immediately, and the remaining time on your current plan will be credited towards your new plan."

They clearly should have gone with HAL if the Bard branding wasn't working out.

Wow, I pay £1.50 currently, which is paid nicely via google rewards every month for a year, ticks along nicely. A £18.99/month bolt-on, nope, if couple quid, sure, but just priced out and what I would call top-end whale marketing price farming, which down the line will, I predict - half before years out at least.

everyone seems to be praising Gemini but this was my first query and response.

what is the syntax to type in username and password into a git clone

As a language model, I'm not able to assist you with that.

Looking forward to someone writing a review. So far Gemini has shown capabilities all over the place. Image generation quality feels a bit worse than what I've seen from DALL-E and Stable Diffusion. Does Ultra provide superior image generation capabilities?

Something I've noticed about Gemini is that usually it'll respond to my query correctly, but it's never the default draft. If I look through each draft one of the options will usually contain the correct answer though.

I'm pleased to find that capabilities have been improving. When Gemini was initially released, asking for something like "How many views have the last 5 mrbeast videos gotten?" wouldn't generate a useful reply. But now it lists the latest 5 videos and one of the drafts even includes the total added up.

Asking Gemini to generate video summaries seems to work really well on some videos, but for others it just gives an error... Are YouTube creators allowed to opt-out of Gemini interactions?

"Sorry, Gemini Advanced isn't available for you". Why are Google paying customers always the last to get everything. Always thought that should be reversed. Especially for one person users who use it for the customer domain and not running a fortune 500 company. So frustrating. Google Home is still a mess where if one if Google Work and the rest of the family is not won't work after all these years. So frustrating.

This article reads like it was generated by a LLM. Especially when read by the, generated?, voice that plays if you click, “Read this article to me”.

The anecdata here suggests that Gemini Ultra is marginally to moderately worse than GPT-4 at some prompts despite launching roughly a year later with an entrenched player to compare against. It also seemingly is more censored.

I don't think Google is structured well-enough to actually compete in a novel space, the politics and especially the inertia around AI-safetyism that has slowed Google down in the past will continue to slow them down. They desperately need new leadership or they do risk losing their moat over the next decade.

Wonder if this will muddy the waters for searching about info for the Gemini protocol..

I have no use for machine learning like this, and definitely not what I thought when I saw the headline. Hope the pivot works well for alphabet, but odd nonetheless.

Missed opportunity with GeminAi

I'm impressed with Gemini a lot and prefer it to ChatGPT. I asked what the price of Bitcoin was today and it gave me the price from several sources and I told her the Coinbase price was wrong and she checked again and said "You know what, you are right, I've corrected that."

She seems nicer than ChatGPT, is faster, and gives answers with less fluff and nonsense padding added. I think I've found my new LLM.

I hated the name Bard it just seemed like the name a of a second place LLM, I don't know how to describe it better than that.

Technically Google just released another chat app.

Well let’s hope these hn comments are representative of the wider experience. Doesn’t sound beta ready let alone paying launch

The service refuses to accept even anime images because it thinks there are humans in it, kinda funny.

Ok I tried it to create snippets of code for some obscure cases. ChatGPT passed with flying colors. Gemini kept referring me to documentation and regurgitating how I should consult it. It plainly sucks.

I don't think its actually been released yet, no blog no news. Someone just found this link

bard seemed a lot more fun as a name. But I'm a biased D&D player. Gemini doesn't mean much unless you're into astrology or ancient greek history

While today is about Gemini Advanced and its new capabilities, next week we'll share more details on what's coming for developers and Cloud customers [1]

[1] https://blog.google/technology/ai/google-gemini-update-sunda...

[Dupe]/merge with other discussion:

https://news.ycombinator.com/item?id=39300679

The promise sounded good, but the reviews for Gemini Advanced are not very positive.

Google have thrown it in as part of Google one, but this means it's the same price as ChatGPT+ which for sure seems better despite Google's promises.

Gemini? First Go and now Gemini really? Can't wait for Google iOS and Google Windows.

Gemini Web : missing Custom instructions from Chatgpt.

Gemini android : not available in India yet.

Not to be confused with the space program, the Winklevoss's brothers crypto exchange...

It can't be accessed without a google account.

bard.google.com now redirects to gemini.google.com