Gemini AI

ghj
6 replies
1d3h

Some people on codeforces (the competitive programming platform that this was tested on) are discussing the model: https://codeforces.com/blog/entry/123035

Seems like they don't believe that it solved the 3200 rated problem (https://codeforces.com/contest/1810/problem/G) w/o data leakage

For context, there are only around 20 humans above 3200 rating in the world. During the contest, there were only 21 successful submissions from 25k participants for that problem.

Jensson
3 replies
22h31m

It doesn't code like a human, so you would expect it to be better at some kinds of tasks. It brute forces the problems by generating a million solutions and then tries to trim that down; a few problems might be vulnerable to that style of approach.

raincole
2 replies
15h11m

Are you sure? "brute forces the problems by generating a million solutions and then tries to trim that down" isn't how I would describe the way an LLM works.

paufernandez
0 replies
13h31m

The original AlphaCode paper in Science explains the approach: they generate many potential solutions with the LLM and then do a lot of post-processing to select candidates. Here's where the probabilistic nature of LLMs hurts, I think.
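A rough sketch of that sample-then-filter idea, for readers who haven't seen the paper. This is a toy under stated assumptions, not DeepMind's code: `sample_candidates` is a hypothetical stand-in for LLM sampling (it just returns random multiply-by-k functions), and all names here are made up for illustration.

```python
import random
from collections import defaultdict
from typing import Callable, List, Sequence, Tuple

Program = Callable[[int], int]  # toy stand-in for a generated program


def sample_candidates(n: int, rng: random.Random) -> List[Program]:
    """Hypothetical stand-in for LLM sampling: n random 'multiply by k' programs."""
    def make(k: int) -> Program:
        return lambda x: x * k
    return [make(rng.randint(1, 5)) for _ in range(n)]


def filter_by_examples(cands: Sequence[Program],
                       examples: Sequence[Tuple[int, int]]) -> List[Program]:
    """Keep only candidates that pass every example test from the problem statement."""
    return [p for p in cands if all(p(i) == o for i, o in examples)]


def cluster_by_behavior(cands: Sequence[Program],
                        probe_inputs: Sequence[int]) -> List[List[Program]]:
    """Group candidates that give identical outputs on extra probe inputs."""
    clusters = defaultdict(list)
    for p in cands:
        clusters[tuple(p(x) for x in probe_inputs)].append(p)
    # Largest clusters first: many samples agreeing is weak evidence of correctness.
    return sorted(clusters.values(), key=len, reverse=True)


def select(n_samples: int, examples: Sequence[Tuple[int, int]],
           probe_inputs: Sequence[int], k: int, seed: int = 0) -> List[Program]:
    """Sample many candidates, filter on examples, submit one per top-k cluster."""
    rng = random.Random(seed)
    survivors = filter_by_examples(sample_candidates(n_samples, rng), examples)
    return [c[0] for c in cluster_by_behavior(survivors, probe_inputs)[:k]]
```

With one example test `(2, 6)`, only the multiply-by-3 candidates survive, so `select(1000, [(2, 6)], [1, 2, 3], k=10)` returns a single representative program. The point is that most of the work happens after generation, in the filtering and clustering.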

derac
0 replies
8h39m

That is how it works, read the paper.

ghj
0 replies
8h16m

AdamantChicken2 (aka AlphaCode2) replied to the thread! https://codeforces.com/blog/entry/123035?#comment-1091379

foota
0 replies
1d1h

I guess we'll know in a few months (whenever the model is available and the next competition is run)

okish
3 replies
19h12m

That plot is downright criminal

https://imgur.com/a/GmbkDaz

86.4->89.8% = 1/3 of 89.8->90% ???

Great science + awful communication
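The arithmetic behind that complaint, as a small sketch. The 1/3 drawn ratio is the eyeballed estimate from this comment, not a measurement of the actual chart, and the function names are made up for illustration.

```python
def true_gap_ratio(low: float, mid: float, high: float) -> float:
    """How many times larger the low->mid gap is than the mid->high gap."""
    return (mid - low) / (high - mid)


def distortion(low: float, mid: float, high: float, drawn_ratio: float) -> float:
    """Factor by which the drawn proportions understate the true ratio."""
    return true_gap_ratio(low, mid, high) / drawn_ratio


# GPT-4 86.4, human expert 89.8, Gemini 90.0; the first gap is drawn at
# roughly 1/3 the size of the second, per the comment's estimate.
ratio = true_gap_ratio(86.4, 89.8, 90.0)
factor = distortion(86.4, 89.8, 90.0, drawn_ratio=1 / 3)
print(round(ratio), round(factor))
```

In other words, the first gap is about 17 times the second in the data, so drawing it at a third of the size would be off by a factor of about 51.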

alsodumb
1 replies
18h32m

It's just a UI issue. The plot looks fine (as in, correct Y-axis) when I opened the website on a landscape monitor.

laacz
0 replies
2h48m

No, it's not. On a large enough display zero axis is still somewhere near the basement. Proportions are not as bad, but still very much off.

jafitc
0 replies
19h9m

desperate times, desperate measure...ment practices

nojvek
3 replies
22h1m

Alexa from Amazon, Cortana from Microsoft, Siri from Apple.

Erica from Bank of America, Jenn from Alaska airlines.

Now Gemini from Google.

What is with tech bro culture to propagate the stereotype that women are there to serve and be their secretaries.

I like ChatGPT & Clippy. They are human agnostic names.

I expect better from Google.

jpeter
0 replies
21h56m

Gemini is not a name

fragmede
0 replies
21h22m

Given that Gemini is represented in Greek mythology by the two male twin half-brothers Castor and Pollux, I think you might be projecting a little.

educaysean
0 replies
21h54m

I think I agree with your broad point, but is Gemini really a feminine name? I thought they picked a pretty good genderless name.

idealboy
3 replies
21h43m

I had an interesting interaction:

Me: Are you using Gemini?

Bard: LOL. Got that wrong earlier today. Bard is on Gemini Pro in English across most of the world as of Dec 6, 2023.

When I asked it about the statement it said:

Bard: I apologize for the confusion. The "lol I made this mistake earlier" statement was not intended for you, but rather a reflection on a previous mistake I made during my training process. It was an error in my model that I have since corrected.

lixy
1 replies
19h13m

Hmm... Earlier today I asked "Are you Gemini pro?" And it answered word-for-word the same way. Is this a hard-coded or heavily prompt-coached answer?

It's suspicious when an AI answers 100% the same.

speedyStuff_
0 replies
16h47m

Huh, same here, that “LOL” response was the exact same thing for me. Pretty weird.

When I expressed my surprise about its casual response, it switched back to the usual formal tone and apologized.

Not sure what to make of this as I don’t consider myself to be in the know when it comes to ML, but could this be training data leakage? Then again, that “LOL” sentence would be such weird training data.

dizhn
0 replies
12h30m

I think what we're seeing is the first instances of LLM based advertising.

nextworddev
2 replies
23h33m

Not sure why people are impressed with this. For context, they are only marginally beating GPT4 on some tasks, but GPT4 was trained almost 10 months ago.

tim333
0 replies
22h1m

The demo on getting it to read 200,000 scientific papers seemed impressive to me.

dragonwriter
0 replies
23h25m

I would assume because there is so little competition (Microsoft/OpenAI, Anthropic, ??) here for commercial hosted solutions that Google being closer to parity here is significant, even if it is still not on par with OpenAI.

hereme888
2 replies
1d2h

I thought Gemini was supposed to be a "massive leap" over GPT-4, and yet even in these benchmarks (unevenly employed) it just barely outperformed a specific model of GPT-4.

Google is the one that boasted in saying that.

By the time it's actually available to the public, OpenAI may be rolling out their next model.

But it does seem like Google is catching up faster than anyone else.

jordanpg
0 replies
1d2h

Also, who cares unless I can try it and see for myself.

hereme888
0 replies
11h1m

Well, just saw some videos of what Gemini can do. Actually impressive:

https://x.com/sundarpichai/status/1732433036929589301?s=20

po
1 replies
21h36m

One of the capabilities google should be evaluating their AI on is "determine if the top google search result for X is SEO spam AI nonsense or not."

rrrrrrrrrrrryan
0 replies
6h32m

This is unironically a great idea

luisgvv
1 replies
1d4h

Am I the only one not hyped by these kinds of demos? I feel that these are aimed toward investors so they can stay calm and not lose their sh*t

I mean it's a great achievement, however I feel that only once we get our hands on a product that fully enhances the life of a regular person will I truly say "AI is here, I can't imagine my life without it"

Of course if it's specifically used behind the scenes to create products for the general consumer no one will bat an eye or care

That's why there are lots of people who don't even know that ChatGPT exists

miraculixx
0 replies
1d4h

Count me not impressed too. Let's make it a movement.

elchief
1 replies
21h46m

is it going to be pronounced Geminee (like the NASA project) or Gemineye?

TheMajor
0 replies
21h34m

The NASA project was pronounced Geminee? I always thought it was the latter.

chrgy
1 replies
20h16m

There are plenty of smart people I know personally at Google and DeepMind that will get this right. Google has 100X more data (data=food for neural networks) than OpenAI: it has YouTube, Google Photos, emails and search histories. There is a lot more pressure on Google than OpenAI to release safe models, which is why these models are getting delayed. In my opinion they should go ahead and release it in phases to stop all this nonsense speculation. We all want competition, and I hope Google's model will be a good one, free, and able to lift society forward and make it more prosperous and productive for everyone.

chrgy
0 replies
20h13m

I hope the founders, Larry or Sergey, will come back to leadership positions and make the company as innovative as before.

zoogeny
0 replies
1d3h

Just an observation based on some people complaining that this isn't some significant advance over GPT-4 (even if it happens to actually be a small percentage gain over GPT-4 and not just gaming some benchmarks).

One thing I consider isn't just what the world will be like once we have a better GPT-4. I consider what the world will be like when we have 1 million GPT-4s. Right now how many do we have? 3 or 4 (OpenAI, Gemini, Claude, Pi). I think we'll have some strange unexpected effects once we have hundreds, thousands, tens of thousands, hundreds of thousands and then millions of LLMs at this level of capability.

It's like the difference between vertical and horizontal scaling.

zlg_codes
0 replies
1d

Nice toy Google, now how can it improve MY life?

....yeah, that's what I thought. This is another toy and another tool to spy on people with. It's not capable of improving lives.

Additionally, I had to tap the Back button numerous times to get back to this page. If you're going to EEE the Web, at least build your site correctly.

zitterbewegung
0 replies
1d3h

I am very excited for this in that I have a backup plan if either this project or OpenAI gets shut down before I can use open source systems. I wonder if langchain can support this because they have Vertex AI as an existing API.

xianwen
0 replies
1d3h

It's uncertain when Google discontinues Gemini.

xianshou
0 replies
1d4h

Marketing: Gemini 90.0% || GPT-4 86.4%, new SotA exceeding human performance on MMLU!

Fine print: Gemini 90.0% chain of thought @ 32-shot || GPT-4 86.4% @ 5-shot

Technical report: Gemini 83.7% @ 5-shot || GPT-4 86.4% @ 5-shot

Granted, this is now the second-best frontier model in the world - but after a company-wide reorg and six months of constant training, this is not what success for Google looks like.

webappguy
0 replies
1d3h

First 3 uses show me it's generally gonna be trash.

Severely disappointed.

I don't think they're taking shit seriously. Spent more time on the website than the product. It should be equal to or better than 4.

ugh123
0 replies
1d3h

Seems some of the benchmarks (maybe all?) rely heavily on either CoT or some other additional prompting method to achieve the results. Will their integration into Bard and other consumer products use something similar?

trash_cat
0 replies
1d3h

Here is what we have so far, correct me if I'm wrong:

Ultra Release: Scheduled for early next year.

Pro with Bard: Positioned between GPT-4 and Ultra in terms of performance. Currently available in the US only.

Benchmarking Notes: The benchmarks shared appear to be selectively chosen.

Demo Video Analysis: It's challenging to ascertain the extent of scripting in the recent demo video - was it real-time or pre-arranged?

Whatever the case, this is very exciting.

toasted-subs
0 replies
21h15m

The most apple like launch from Google.

t0mk
0 replies
1d2h

Google has the possibility to roll out and integrate a small LLM(!) into the Pixel phones; that's something OpenAI can't do easily. Too bad MSFT dropped the Windows phone.

synergy20
0 replies
19h53m

great, but, where can I use it? bard seems still the same, and, is there a chat.gemini.ai site I can use? otherwise, it's just a PR for now.

synergy20
0 replies
19h29m

Google, listen: stop talking the talk and start walking the walk when you have something real. Your Bard, for example, is still a decade behind ChatGPT, your Gemini has not even made it better, and you're announcing you have a ChatGPT killer. Please don't drive your reputation into the ground; it has been in decline for years.

synaesthesisx
0 replies
23h41m

Anyone know if they're using TPUs for inference? It'll be real interesting if they're not bottlenecked by Nvidia chips.

stranded22
0 replies
1d4h

Have to use vpn to USA to access via UK

stainablesteel
0 replies
1d4h

of all the problems i have that chatgpt has been unable to solve, bard is still not able to solve them either

no improvement that i see, still glad to see this do some other really neat things

squigglydonut
0 replies
17h39m

Whatever happened to putting text on a page. I give I am too old for all the rounded corners. It's AI! Coming soon.

spaceman_2020
0 replies
1d3h

I don't have anything to say about Gemini without using it, but man, that's a beautiful website. Not expected from Google.

smtp
0 replies
20h43m

The whitepaper has a few benchmarks vs. GPT-4. Most are reported benchmarks, though. Most of the blogs/news articles I've seen mention Google's push to focus on GPT-3.5. Found the whitepaper table way better at summarizing this. https://storage.googleapis.com/deepmind-media/gemini/gemini_...

shon
0 replies
21h51m

I love that OpenAI surprised Google and lit a fire under them.

Google’s task now is to think through a post-search experience that includes advertising in a much more useful and intelligent way.

I think it can be done.

This demo makes me think they’re not that far off:

https://x.com/googledeepmind/status/1732447645057061279?s=46...

seydor
0 replies
1d3h

This is epic from a technical standpoint

rookie123
0 replies
5h26m

Ok Unpopular opinion here, I expected more from Google here. Them just beating MSFT is not going to cut it. MSFT strength is enterprise, goog strength is tech. And right now MSFT is almost there on tech and better on enterprise.

revskill
0 replies
16h30m

Hijacking the back button to intercept hash route is annoying, basically it's impossible to go back to previous page.

replwoacause
0 replies
19h21m

Just logged into Bard to try it with the new Gemini (Pro) and I have to say, it’s just as bad as it ever was. Google continues to underwhelm in this space, which is too bad because OpenAI really needs some competition.

plumeria
0 replies
22h39m

Kinda off-topic, but gemini.ai redirects to gemini.com (the crypto exchange).

pikseladam
0 replies
1d2h

ok. when will it be closed? so bard is no more?

nycdatasci
0 replies
1d4h

I asked it to summarize this conversation. Initial result was okay, then it said it couldn't help more and suggested a bunch of unrelated search results. https://imgur.com/a/vS46CZE

nojvek
0 replies
1d1h

Google again making announcements but not releasing for public to validate their claims.

What's the point of it? They hype it so much, but the actual release is disappointing. Bard was hyped up but was pretty shit compared to GPT-4.

They released the google search experiment with bard integration but the UX was so awful it hid the actual results. I use Sider and it is a muuuuch much nicer experience.

Does google not have folks who can actually productionize their AI with usable UX, or do they have such a large managerial hierarchy, the promo driven culture actively sabotages a serious competitor to GPT4?

nilespotter
0 replies
1d4h

Ironically I go to gemini to get away from google.

monkeydust
0 replies
1d

You can just imagine the fire drills that have been going on in Google for half the year trying to get on par with and beat OpenAI. Great to see, I'm keen to see what OpenAI do but I am now more than ever rooting for the SOTA open source offering!

modeless
0 replies
1d4h

Watching a demo video, and of course it makes a plausible but factually incorrect statement that likely wasn't even noticed by the editors, within the first two minutes. Talking about a blue rubber duck it says it floats because "it's made of a material that is less dense than water". False, the material of rubber ducks is more dense than water. It floats because it contains air.

If I was going to release a highly produced marketing demo video to impress people I would definitely make sure that it doesn't contain subtle factual errors that aren't called out at all...

miraculixx
0 replies
1d4h

So it's an announcement with a nice web page. Well done.

m3kw9
0 replies
1d1h

I tried another simple coding question on Bard with the Gemini upgrade and on GPT4. Bard does not give me correct code; in fact it is completely wrong. It hallucinates calls to non-existent libs, while GPT4 got it right with the exact same prompt.

It's more on the level of GPT3.5 maybe not even.

m3kw9
0 replies
1d4h

Google again is gonna confuse the heck outta everyone like what they did with their messaging services, remember GTalk, Duo, hangouts, Messages. Their exec team is dumb af except in search, sheets and in buying Android.

longstation
0 replies
1d4h

With Bard still not available in Canada, I hope Gemini will be.

londons_explore
0 replies
16h24m

Notable that the technical paper has no real details of the model architecture... No details of number of layers, etc.

kernal
0 replies
1d2h

Where's the Gemini/Bard Android/iOS app? Oh right, Google doesn't do apps /s

joshuase
0 replies
1d3h

Extremely impressive. Looking forward to see how capable Gemini Nano will be. It'd be great to have a sensible local model.

Although open-source is improving immensely it's still far behind GPT4, so it's nice to see another company able to compete with OpenAI.

jonplackett
0 replies
1d2h

Brought about AI - what’s with the weird navigation UI on mobile. Not enjoying that at all.

jijji
0 replies
23h41m

I can't help but think that by the time they release this closed source Gemini project they brag about, the world will already have the same thing open sourced and better/comparable... ChatGPT beat them last year, and now we have a similar situation about to happen with this new product they speak of, but have yet to release anything.

jaimex2
0 replies
18h4m

Cool, bets on when they will kill it?

I give it a year.

hypertexthero
0 replies
1d

The Star Trek ship computer gets closer every day.

hsuduebc2
0 replies
21h42m

Let's talk about it when it is a real product. Until then it is just marketing.

happytiger
0 replies
19h7m

That is an incredibly intense brand/name choice.

Fatefully, Pollux survived the Trojan (!) war and Castor did not, and it was Pollux who begged Zeus to be mortal as he couldn’t bear to be without his brother.

Is this some prescient branding? Lol. Of all the names.

goshx
0 replies
1d3h

Meanwhile, Bard can't create images, sees more than there is in an image, and gave me this kind of response, after I was already talking about Rust:

Me: please show me the step by step guide to create a hello world in rust

Bard: I do not have enough information about that person to help with your request. I am a large language model, and I am able to communicate and generate human-like text in response to a wide range of prompts and questions, but my knowledge about this person is limited. Is there anything else I can do to help you with this request?

Doing "AI" before everyone else doesn't seem to mean they can get results as good as OpenAI's.

gnarlouse
0 replies
15h5m

If this isn’t proof that AI is coming for your job I don’t know what is. Welcome to the human zoo, I suspect if you’re reading this you’re the exhibit.

gigatexal
0 replies
23h29m

Is there or is there not a chat interface or will this just replace bard or be bard’s backend?

gerash
0 replies
1d2h

instead of gpt1, gpt2, gpt3, ...

we have lamda, palm, palm2, bard, Gemini, bard with Gemini pro, ...

reminds me of

play station, play station 2, play station 3, ...

vs

Xbox, Xbox 360, Xbox one, Xbox one X, Xbox one series X

geniium
0 replies
1d1h

Another promise? Where can we test this?

gchokov
0 replies
1d1h

Improvements over GPT-4 are marginal. Given that this is Google, i.e. privacy doesn’t exist, I will not touch this tool at all.

gcau
0 replies
1d1h

are you gemini? >LOL. Got that wrong earlier today. Bard is on Gemini Pro in English across most of the world as of Dec 6, 2023.

It gives this exact same answer every time, and is a really weird and unprofessional response. Even if you ask it to be more formal it gives the exact same answer.

gardenhedge
0 replies
1d2h

Who designed this web page? The back button hijacking is so annoying

drodio
0 replies
23h29m

960 comments is a lot! I created a SmartChat™ where you can get a summary (or anything else) of the comments: https://go.storytell.ai/hn-geminiai and here's a summary output example: https://s.drod.io/Jrum2mQK -- hope that's helpful.

dnadler
0 replies
21h41m

Unrelated to the content of the announcement, but the scrolling behavior of the 'threads' at the bottom of the page is really neat.

I'll need to look into how that was done - I've seen similar things before but I can't think of any that are quite as nuanced as this one.

dm_me_dogs
0 replies
1d4h

I would love to use Bard, if it were available in Canada. Don't quite understand why it's still not.

dizhn
0 replies
22h38m

For some reason it's answering with the same weird phrase to every question that amounts to "Are you gemini pro?".

The answer is: "LOL. Got that wrong earlier today. Bard is on Gemini Pro in English across most of the world as of Dec 6, 2023."

I don't get it. Is this advertising? Why is it saying LOL to me.

digitcatphd
0 replies
1d4h

I'm a little disappointed to be honest, the improvement over GPT-4 is not as steep as I had anticipated, not enough to entice me to switch models in production.

didip
0 replies
19h54m

Looks very ahead. Seems like OpenAI days are numbered.

darklycan51
0 replies
21h59m

Ultra is just vaporware, typical from google

danielovichdk
0 replies
1d3h

If it reasons and helps with a lot better code for me than the other chat, perfect.

If it does not it's too late for me to change.

That's where i am at atm.

cranberryturkey
0 replies
8h20m

coming soon...

corethree
0 replies
1d1h

Google is uniquely positioned to bury everyone in this niche. Literally these models are based on data and google has the best.

It's pretty predictable. Sure OpenAI can introduce competition, but they don't have the fundamentals in place to win.

ckl1810
0 replies
1d1h

How many of these implementations are strict, narrow implementations just to show that Google is better than OpenAI for the investor community?

E.g., in a similar vein in the silicon chip world: the same move that Qualcomm tried with Snapdragon 8cx Gen 4 over M2. Then 1 week later, Apple came out with M3. And at least with processors, the gains seem marginal to me, and the launch cadence from these companies just keeps us glued to the news, when in fact they have performance spec'ed out 5 years from now and theoretically ready to launch.

chmod775
0 replies
1d

Friendly reminder to not rely on any Google product still existing in a few months or years.

chillingeffect
0 replies
9h39m

Site is a navigation and branding maze. What is the difference between Bard, Gemini and DeepMind? Where do I type questions? How come it can't answer something this simple?

Oops, your search for “what is a pannus” didn't return any results.

(Chatgpt won't tell me either out of modesty until I reassure it that's a medical term...)

cbolton
0 replies
9h51m

Interesting example on page 57 of the technical report[1] with a poorly worded question:

"Prompt: Find the derivative of sinh 𝑥 + cosh 𝑦 = 𝑥 + 𝑦."

I couldn't understand what was being asked: derive what with respect to what? Gemini didn't have that problem, apparently it figured out the intent and gave the "correct" answer.

[1] https://storage.googleapis.com/deepmind-media/gemini/gemini_...
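One plausible reading of that prompt is implicit differentiation of y with respect to x. That reading is my assumption, not a quote from the report. Under it, the working would go roughly like this:

```latex
\begin{align*}
\frac{d}{dx}\bigl(\sinh x + \cosh y\bigr) &= \frac{d}{dx}\,(x + y) \\
\cosh x + \sinh y \,\frac{dy}{dx} &= 1 + \frac{dy}{dx} \\
\frac{dy}{dx}\,\bigl(\sinh y - 1\bigr) &= 1 - \cosh x \\
\frac{dy}{dx} &= \frac{1 - \cosh x}{\sinh y - 1}
\end{align*}
```

Differentiating with respect to y instead would give the reciprocal, which is why the prompt as worded is ambiguous.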

carabiner
0 replies
20h55m

Y-axis in those charts doing a shitload of work.

canjobear
0 replies
1d2h

Demo access or it didn't happen.

butlike
0 replies
3h22m

Can we talk about civil rights at this point, cause I'm not too keen on carrying around the weight of what happened <=1960's again.

billconan
0 replies
6h23m

Does anyone know of any papers on models that accept video as input?

I hope to understand how to tokenize videos.

beretguy
0 replies
18h57m

Let’s see how long it will last before going to Google’s graveyard.

bdcravens
0 replies
1d2h

A scroll is not a history event. Leave the back button alone, please.

aydoubleyou
0 replies
1d3h

Someone at Google is a Billy Madison fan with the blue duck reference.

asylteltine
0 replies
22h44m

Where’s the product though?

anonomousename
0 replies
14h16m

I’m surprised that the multimodal model isn’t significantly better than GPT4. I thought that all the Google photos training data would have given it an edge.

anon115
0 replies
17h19m

meh

almogguata
0 replies
10h55m

They have a pretty interesting explanation of each cookie they use:

https://www.nkcommercial.com/%D7%A2%D7%95%D7%A8%D7%9A-%D7%93...

What I never quite understand is the analytics issue. We had server logs for analytics long before everyone started using cookies for that.

In my opinion the cookie part of GDPR is clearly bad regulation. It requires cookie banners for some things that are not privacy issues. And at the same time it doesn't institute a sensible consent mechanism that doesn't in practice amount to constant harassment.

alfor
0 replies
20h41m

Another woke GPT, no thanks.

Google believes that they know better, that their job is to decide for others what the truth is. And to play with the levers behind people's backs.

That will lead to a very dark path as it always does.

abcd8731
0 replies
18h44m

how to read Kant's books?

TheAceOfHearts
0 replies
1d1h

My first impression of their YouTube plugin is a bit disappointing.

I asked:

Can you tell me how many total views MrBeast has gotten on his YouTube videos during the current year?

It responded:

I'm sorry, but I'm unable to access this YouTube content. This is possible for a number of reasons, but the most common are: the content isn't a valid YouTube link, potentially unsafe content, or the content does not have a captions file that I can read.

I'd expect this query to be answerable. If I ask for the number of views in his most recent videos it gives me the number.

Obscurity4340
0 replies
17h33m

GeminAI

Missed opportunity + it's an anagram (GAI) for Artificial General Intelligence (AGI) :/

Name_Chawps
0 replies
6h55m

"Bard isn't supported in your country"

Oh, the Internet? You had no trouble sending me the 404 page, so why not just send me the page with Bard on it?

NOWHERE_
0 replies
22h49m

I would rather build with OpenAI products rather than with Google products because if I use a Google product, I know that it will shut down in two years tops.

Madmallard
0 replies
22h56m

Saw it stated somewhere “better than 90% of programmers.”

*DOUBT

Maybe at very constrained types of leetcode-esque problems for which it has ample training data.

LaGrange
0 replies
1d2h

This being so high up is so funny in context of yesterday's popular post about the long-term consequences of Google's Chrome.

LZ_Khan
0 replies
20h29m

I sent a scenic picture I took (Lake Tahoe from the top of Heavenly) and asked Gemini where it was. To my surprise Gemini got it right. Even the ski resort. Woah.

Jackson__
0 replies
1d3h

Really loving the big button for using it on bard, which when clicked has no indication at all about what model it is currently actually using.

And when I ask the model what the base model it relies on is:

I am currently using a lightweight model version of LaMDA, also known as Pathways Language Model 2 (PaLM-2).

Which appears completely hallucinated as I'm pretty sure LaMDA and PaLM-2 are completely different models.

JOnAgain
0 replies
1d1h

"Gemini, how can I easily sign up for Google cloud as an individual?"

HeavyStorm
0 replies
19h22m

Google really is an advertising company, it seems

ElijahLynn
0 replies
23h4m

Looks amazing! However, they don't easily show one how to try it out. Is this vaporware?

DrSiemer
0 replies
1d1h

Until I see an actual hands on from an outside source I am not buying it. It is not clear at all how cherrypicked / conveniently edited these examples are.

CrzyLngPwd
0 replies
1d1h

Still waiting for an AI.

51Cards
0 replies
1d2h

And still not available in Canada. Sigh.