Some people on Codeforces (the competitive programming platform this was tested on) are discussing the model: https://codeforces.com/blog/entry/123035
Seems like they don't believe it solved the 3200-rated problem (https://codeforces.com/contest/1810/problem/G) without data leakage.
For context, only around 20 humans in the world are rated above 3200. During the contest, that problem got only 21 successful submissions out of ~25k participants.
It doesn't code like a human, so you would expect it to be better at some kinds of tasks. It brute-forces problems by generating a million candidate solutions and then trying to trim that down; a few problems might be vulnerable to that style of approach.
Are you sure? "brute forces the problems by generating a million solutions and then tries to trim that down" isn't how I would describe the way an LLM works.
The original AlphaCode paper in Science explains the approach: they generate many potential solutions with the LLM and do a lot of post-processing to select candidates. Here's where the probabilistic nature of LLMs hurts, I think.
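For anyone curious, the selection stage the paper describes is roughly sample, then filter on the example tests, then cluster by behavior. Here's a minimal sketch of that pipeline, not DeepMind's actual code: `run_program` is a placeholder for executing a candidate in a sandbox, and in the real system the extra clustering inputs come from a separately trained test-input generator.

```python
import collections

def passes_examples(program, examples, run_program):
    """True iff the candidate reproduces every example output from the statement."""
    return all(run_program(program, inp) == out for inp, out in examples)

def select_candidates(samples, examples, extra_inputs, run_program, k=10):
    # 1. Filter: discard every sampled program that fails the
    #    example tests given in the problem statement.
    survivors = [p for p in samples if passes_examples(p, examples, run_program)]

    # 2. Cluster: programs that produce identical outputs on extra
    #    inputs are (probably) semantically equivalent, so group them.
    clusters = collections.defaultdict(list)
    for p in survivors:
        behavior = tuple(run_program(p, inp) for inp in extra_inputs)
        clusters[behavior].append(p)

    # 3. Submit one representative from each of the k largest clusters
    #    (their evaluation allowed 10 submissions per problem).
    ranked = sorted(clusters.values(), key=len, reverse=True)
    return [cluster[0] for cluster in ranked[:k]]
```

The point is that the vast majority of the sampled programs die at the filtering step (the paper reports filtering removes on the order of 99% of samples), and clustering collapses the survivors into a handful of behaviorally distinct candidates.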
That is how it works; read the paper.
AdamantChicken2 (aka AlphaCode2) replied to the thread! https://codeforces.com/blog/entry/123035?#comment-1091379
I guess we'll know in a few months (whenever the model is available and the next competition is run).