For people interested in AI research, there's nothing new here.
IMO they should do a better job of referencing existing papers and techniques. The way they wrote about "adapters" can make it seem like something novel, but it's actually just reiterating vanilla LoRA. It was enough to convince one of the top-voted HackerNews comments that this was a "huge development".
Benchmarks are nice though.
Was anyone expecting anything new?
Apple has never been big on living at the cutting edge of technology, exploring spaces that no one has explored before. From laptops to the iPhone to iPads to watches, every success they've had has come from taking tech that was already prototyped by many other companies and smoothing out the usability kinks to get it ready for the mainstream. Why would deep learning be different?
Prototyping tech is one thing; making it a widely adopted success is another. For instance, Apple was the first to bring WiFi to laptops in 1999. Everyone laughed at them at the time. Who needs a wireless network when you can have a physical LAN, ey?
On the other hand, people who laughed at them removing the 3.5mm jack can still safely laugh away.
Then laugh at Samsung and their flagship line of phones as well, since they haven't had headphone jacks for a while now. "After Note 10 dumps headphone jack, Samsung ads mocking iPhone dongles disappear" (2019):
* https://www.cnet.com/tech/mobile/after-removing-headphone-ja...
"Samsung is hiding its ads that made fun of Apple's removal of headphone jack":
* https://www.androidauthority.com/samsung-headphone-jack-ads-...
I totally do. One of the problems with Apple is the industry seems to mindlessly ape their good and bad decisions. Their marketing has been so good, many people just assume whatever they do must be the best way.
At the time I felt like Apple was getting rid of the 3.5mm jack as a potential bottleneck for future iPhone designs (as one of the limiting aspects of the form factor), but there still doesn't seem to be anything design-wise to justify it, even several years later. It is very clear now that it was merely to encourage AirPods adoption.
I would say this was obvious to the cynical among us from the very beginning. Unless you are trying to go portless (for water resistance, perhaps?) or make a very thin phone, there's very little benefit to removing the jack… except to drive AirPods sales, of course.
I mean, to go thinner than the 6/6s I can see the 3.5mm jack causing trouble. Part of me is still sad they went the other direction when it comes to iPhone thickness.
That's not a problem with Apple.
It's more of a regulatory problem, under a certain light.
We absolutely do laugh at both already.
This is such a tired talking point. Use a (lightning|USB-C)->3.5mm adapter or use bluetooth.
Interesting that you suggest laughing at their decision to remove the headphone jack, when it was actually just the first move in an industry-wide shift that other companies have since followed.
From https://en.wikipedia.org/wiki/AirPort:
"AirPort 802.11b card"
"The original model, known as simply AirPort card, was a re-branded Lucent WaveLAN/Orinoco Gold PC card, in a modified housing that lacked the integrated antenna."
That was also how Lucent's access points worked.
Was that really the case? I remember they were mocked for e.g. offering wifi only, firewire only etc., while the respective removed alternatives were way more common.
In the consumer space at least, WiFi was nowhere to be seen on a typical PC when Apple adopted it. Same with USB. So while it technically originated and existed elsewhere, there was no serious traction on it prior to Apple adoption.
What you say is also true: many people weren't ready to ditch the old when Apple decided to deprecate it.
I think you misinterpreted OP's comment. Apple makes it sound like there's something new, but there isn't. They don't have to innovate, but it's good practice to credit the people whose work they're taking and using, and to use the names everyone else is already using.
That would be a very un-Apple thing to do. They really like to use their own marketing terms for technologies. It's not ARM, it's Apple Silicon. It wasn't Wi-Fi, it was AirPort. etc. etc.
See also: FireWire, iSight, Retina, FaceTime, etc.
None of these really fit the pattern. Apple invented FireWire, called it FireWire, and other companies chose to call it different things in their implementations (partly because Apple originally charged for licensing the name, IIRC). iSight is an Apple product. FaceTime is an Apple product. Retina is branding for high-resolution displays beyond a certain visual density.
"Apple invented FireWire" is maybe not fully accurate (but actually a good example of the point here).
Wikipedia: FireWire is Apple's name for the IEEE 1394 High Speed Serial Bus. Its development was initiated by Apple[1] in 1986,[3] and developed by the IEEE P1394 Working Group, largely driven by contributions from Sony (102 patents), Apple (58 patents), Panasonic (46 patents), and Philips (43 patents), in addition to contributions made by engineers from LG Electronics, Toshiba, Hitachi, Canon,[4] INMOS/SGS Thomson (now STMicroelectronics),[5] and Texas Instruments.
What might be interesting in this regard is that Sony was also using its own trademark for it: "i.LINK".
FWIW the term “airport” predated the name “wifi” — in those days you had to otherwise call it IEEE 802.11.
And the name was great: people were buying them like crazy and hiding them in the drop ceiling to get around the corporate IT department. A nice echo of how analysts would buy their own Apple II + VisiCalc to…get around corporate IT.
I’m OK with Apple using “apple silicon” as the ARM is only part of it.
Just commenting on your two examples; in general I agree with your point.
As far as I know, both the AirPort trademark and the term Wi-Fi were introduced in 1999 (it could be that AirPort was a couple of weeks earlier).
The strange thing is Apple did mention (twice) in the article that their adapters are LoRAs, so I don't understand OP's comment.
I gathered from OP's "huge development" comment that he was talking about other people's popular perception that it wasn't a LoRA.
There was such a time. Same as with Google. Interestingly, around 2015-2016 both companies significantly shifted to iterative products from big innovations. It's more visible with Google than Apple, but here's both.
Apple:
- Final Cut Pro
- 1998: iMac
- 1999: iBook G3 (father of all MacBooks)
- 2000: Power Mac G4 Cube (the early grandparent of the Mac Mini form factor), Mac OS X
- 2001: iPod, iTunes
- 2002: Xserve (rackable servers)
- 2003: Iterative products only
- 2004: iWork Suite, Garage Band
- 2005: iPod Nano, Mac mini
- 2006: Intel Macs, Boot Camp
- 2007: iPhone and Apple TV
- 2008: MacBook Air, iPhone 3G
- 2009: iPhone 3Gs, all-in-one iMac
- 2010: iPad, iPhone 4
- 2011: Final Cut Pro X
- 2012: Retina displays, iBooks Author
- 2013: iWork for iCloud
- 2014: Swift
- 2015: Apple Watch, Apple Music
- 2016: Iterative products only
- 2017: Iterative products mainly, plus ARKit
- 2018: Iterative products only
- 2019: Apple TV+, Apple Arcade
- 2020: M1
- 2021: Iterative products only
- 2022: Iterative products only
- 2023: Apple Vision Pro
Google:
- 1998: Google Search
- 2000: AdWords (this is where it all started going wrong, lol)
- 2001: Google Images Search
- 2002: Google News
- 2003: Google AdSense
- 2004: Gmail, Google Books, Google Scholar
- 2005: Google Maps, Google Earth, Google Talk, Google Reader
- 2006: Google Calendar, Google Docs, Google Sheets, YouTube bought this year
- 2007: Street View, G Suite
- 2008: Google Chrome, Android 1.0
- 2009: Google Voice, Google Wave (early Docs if I recall correctly)
- 2010: Google Nexus One, Google TV
- 2012: Google Drive
- 2013: Chromecast
- 2014: Android Wear, Android Auto, Google Cardboard, Nexus 6, Google Fit
- 2015: Google Photos
- 2016: Google Assistant, Google Home
- 2017: Mainly iterative products; Google Lens announced, but it never really rolled out
- 2018: Iterative products only
- 2019: Iterative products only
- 2020: Iterative products only, and some rebrands (Talk->Chat, etc)
- 2021: Iterative products only, and Tensor Chip
- 2022: Iterative products only
- 2023: Iterative products only, and Bard (half-baked).
Some of your choices and what you consider iterative/innovative are strange to me. For 2009, a chassis update for the iMac and a spec/camera bump for the iPhone doesn't seem particularly innovative especially in comparison to say the HomePod in 2017 or Satellite SOS in 2022.
Also small correction but iTunes (as Soundjam MP) was originally third-party software and Final Cut was acquired by Apple.
Yes, it's not perfect.
About iTunes: I did not know that! Thank you.
About iterative/innovative: I considered hardware and software that became household names or general knowledge to be significant innovations. It is not rigorous; I tried to include more rather than less. Still, in some years these companies mostly did version increases for their hardware and software, like new iOS and macOS versions, and that was it. Those years I marked as iterative.
I included a few too many iPhones, although when I wrote that, my thought process was that these phones were pivotal to how iPhones developed. I should have included the original iPhone, and iPhone 3G — the first iPhone developed around the concept of an app platform and with an App Store. This has undoubtedly been a big innovation. iPhone 4 and 3Gs, perhaps, should not have been included.
It's loose and just meant to illustrate a general trend; individual items are less important, and we could all pick slightly different ones. But I believe the trend would remain.
You're missing Apple Silicon, which has had a huge impact across the entire industry even if random soccer dad doesn't know about it -- if any one thing is responsible for Intel's marketshare collapsing in the future, the M series of processors is it.
The iMac refined a form factor that dated back to at least Commodore. The iBook came after decades of iteration by other companies on laptops. The Cube was just a PC with a more compressed form factor. The iPod came a few years after other commercial digital media players. Etc, etc.
Note that I'm not saying that there's anything wrong with their approach or that they didn't make real improvements. I'm just saying that Apple has never produced any successful product that would count as "new" to someone interested in cutting-edge research. They've always looked around at things that exist but aren't yet huge successes and given them the final push to the mainstream.
It depends on the definition of "new". With some definitions, we may claim that nothing is ever new — we may say computers started with the Antikythera mechanism and abaci, or maybe before. With other definitions (like "new" as in a "new for most people") we will see that Apple has brought about many new things. So we need to agree on the definition.
I used the definition of new somewhere between "new for most people", "newly popular", and "meaningfully advanced from the previous iteration". With such a definition, I think you can agree with me.
In the consumer space, I'm not sure I can think of any examples from anyone ever that are examples of cutting-edge research at the time. It's hard to build consumer products on the bleeding edge. You'd be releasing phones today using CHERI, for example, which is not quite ready for prime time.
No Newton?
Missed it, but should have been included for 1998. A very good example.
Apple was first with 64-bit iPhone chips. Remember, the Qualcomm VP at the time claimed it was nothing. Apple Silicon with the M1 was impressive too, for instance in delivering high performance at low power.
Those are both still (major) incremental improvements to known tech, not cutting-edge research. Apple takes what other companies have already done and does it better.
All the cutting-edge research other companies are supposedly doing is also incremental. Depends on your vantage point.
But last at bringing a calculator on the iPad =)
I think he is pointing that out for people interested in research.
OTOH, it is interesting to see how a company is applying AI to end customers. It will bring up new challenges that will be interesting from at least an engineering point of view.
I thought the news of them using Apple Silicon rather than NVIDIA in their data centers was significant.
Perhaps there is still hope of a relaunch of Xserve; with the widespread use of Apple computers amongst developers, Apple has a real chance of challenging NVIDIA's CUDA moat.
Not at Apple's price points.
I think NVIDIA has the highest hardware markup at the moment.
You get considerably more ML FLOPS per dollar with a 4090 than with any Mac. It seems like the base M2 Max is at roughly the same price point. It does grant you more RAM.
Quadro and Tesla cards might be a different story. I would still like to see concrete FLOPS/$ numbers.
The M2 is a chip designed to be in a laptop (and it is quite powerful given its low power consumption). Presumably they have a different chip, or at least a completely different configuration (RAM, network, etc.), in their data centers.
The interesting point here is that developers targeting the Mac can safely assume that the users will have a processor capable of significant AI/ML workloads. On the Windows (and Linux) side of things, there's no common platform, no assumption that the users will have an NPU or GPU capable of doing what you want. I think that's also why Microsoft was initially going for the ARM laptops, where they'd be sure that the required processing power is available.
That's probably where Microsoft's "Copilot+ PCs" come in.
Plus DirectML, which as the name implies builds on top of DirectX, allowing multiple backends: CPU, GPU, NPU.
I believe MS is trying to standardize this, in the same way as they do with DirectX support levels, but I agree it's probably going to be inherently a bit less consistent than Apple offerings
DirectML can use multiple backends.
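(For what it's worth, on the Python side that looks roughly like this with Microsoft's torch-directml package, assuming it's installed; DirectML then runs on whatever DX12-capable adapter is present.)

```python
import torch
import torch_directml  # Microsoft's DirectML backend for PyTorch (pip install torch-directml)

# DirectML abstracts over whatever DX12-capable hardware the machine has.
dml = torch_directml.device()

x = torch.randn(1024, 1024).to(dml)
w = torch.randn(1024, 1024).to(dml)
y = x @ w          # runs on the DirectML device
print(y.device)
```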
How does it help me (with a maxed-out M3 Max) that Apple might have some chip in the future, right now? I do DL on an A6000 and a 4090; I'm not waiting for Apple to someday produce a chip that is faster than a 1650 at ML...
Also that a significant proportion (majority?) of them will have just 8 GB of memory which is not exactly sufficient to run any complex AI/ML workloads.
That sounds like a big issue, but surely making assumptions either way is bad.
I expect OS's will expose an API which, when queried, will indicate the level of AI inference available.
Similar to video decoding/encoding where clients can check if hardware acceleration is available.
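(Frameworks already do a version of this today. A trivial sketch of that kind of capability check with PyTorch, just to illustrate the pattern; it's framework-level, not the OS-level API being predicted here.)

```python
import torch

def pick_inference_device():
    # Prefer whatever hardware acceleration is actually present, fall back to CPU.
    if torch.cuda.is_available():          # NVIDIA (or ROCm-built) GPU
        return torch.device("cuda")
    if torch.backends.mps.is_available():  # Apple Silicon GPU via Metal
        return torch.device("mps")
    return torch.device("cpu")

device = pick_inference_device()
print(f"Running inference on: {device}")
```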
There was a rumor floating around that Apple might try to enter the server chip business with an AI chip, which is an interesting concept. Apple's never really succeeded in the B2B business, but they have proven a lot of competency in the silicon space.
Even their high-end prosumer hardware could be interesting as an AI workstation given the VRAM available if the software support were better.
Idk, every business I've worked at and all the places my friends work seem to be 90% Apple hardware, with a few Lenovos issued for special-case roles in finance or something.
They mean server infrastructure.
They don't need the entire mac. Their cost per Max chip is probably $200-300 which beats the 4090 by a massive margin and each chip can do more than a 4090 because it also has a CPU onboard.
A 4090 peaks at around 550W, which means they can run 5+ of their Max chips in the same power budget.
A 4090 is $2000. Apple can probably get 5 chips on a custom motherboard for that cost. They'll use the same amount of power, but get a lot more raw compute.
The GPU in the M-series is much slower than a 4090. 4060-4070ish performance at best, and it varies quite a bit.
If they can get 5 4070s for the price and power of one 4090, that's a win for them as they'll get more performance per dollar and per watt.
Of course you do; Apple's selling mobile SoCs, not high-end cards. That doesn't mean they're incapable of making them for the right application. You don't seriously think the server farms are running on M4 Pro Max chips, do you...
Depends on which card one is talking about.
Maybe. It is not really obvious how much you pay for the AI accelerator part of their offerings. For example, the chips in iPhones are quite powerful even adjusted for price. However, for some cases - like the Max chip in the MacBooks or the extra RAM - their pricing seems high, maybe even NVIDIA high.
I mean, even Apple can't match the markups nVidia has right now. If you break a GPU in your compute server, you wait months for a replacement, and the part is sent back if you can't replace it in five days.
Crazy times.
Enterprise offerings tend to differ. You can get a replacement NVIDIA GPU via a partner, like Lenovo, in 2-3 weeks. And that's on the high side for some support contracts.
That's from HPE, for an Apollo 6500.
I'm wondering how much electricity they will save just from moving from Intel to Apple Silicon in their data centers.
"AI for the rest of us."
Except Apple isn't really for the rest of us. Outside of America and a handful of wealthy Western countries, it's for the top 5-20% of earners only.
In the EU the market share is 30%
Yes but not evenly distributed, BeNeLux, Germany, Austria, and Nordic countries have a lot of iPhone users, while moving further east (or south) you see lower market share. Maybe it’s “two handfuls” of wealthy western countries rather than just one, but I think OPs point holds true.
In Poland it’s 33%
Poland seems to be 14%: https://gs.statcounter.com/os-market-share/mobile/poland
In Romania it's 24.7%
Huh interesting, I missed that. You’re right (actually I see even 25.5%).
One average American user is probably worth 5-10 average European users.
(I've dabbled in mobile games)
Yes. Americans are THE most valuable customer base, y'all use insane amounts of money on mobile crap.
https://worldpopulationreview.com/country-rankings/iphone-ma...
Poland, Greece, Hungary and Bosnia-Herzegovina are the only ones under 20% (and maybe a few others).
OTOH Britain is over 50% as is Sweden. Finland, the land of Nokia is over 35%.
Approximately 33% of all smartphones in the world are iPhones.
60% in the US
Who do you think this presentation is geared toward?
Japan and Taiwan are both more than 50% iOS.
Ref: https://worldpopulationreview.com/country-rankings/iphone-ma...
This sounds like every newcomer to the stage except for big players like Apple.
This gives me the vibe of calling high-resolution screens "retina" screens.
I don't see anything wrong with that at all. They've created a branding term that allows consumers to get an idea of the sort of pixel density they can expect without having to actually check, should they not want to bother.
Except that everyone has different visual acuity and uses the same devices at different distances, so in the end "retina" means nothing at all.
But this is exactly the type of marketing Apple is good at, though "retina" is probably not the most successful example.
If your "visual acuity" is so good that you can see the pixels of a retina-branded display from the intended viewing distance, you might need to be studied for science.
It's not so impossible to spot flaws if you're using worst-case testing scenarios. Which are not worthless because such patterns do actually pop up in real world usage, albeit rarely.
Examples?
Had one happen to me recently where I was scrolling Spotify, and they do the thing where if you try to scroll past max they will stretch the content.
One of the album covers being stretched had some kind of fine pattern on it that caused a clearly visible shifting/flashing Moiré pattern as it was being stretched.
Wish I could remember what album cover it was now.
Though really it's simple enough: As long as you can still spot a single dark pixel in the middle of an illuminated white screen, the pixels could benefit from being smaller. (Edit: swapped black and white)
If your visual acuity is 20/10, you'd roughly need 3600 pixels vertically to not notice any pixelation if Bill Otto did the calculations right at https://www.quora.com/What-resolution-does-20-10-vision-corr...
20/10 is rare but can easily be corrected to with glasses or contacts.
You also left that "intended viewing distance" hanging there, without at all acknowledging what that is at a minimum?
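(If you want to sanity-check that ~3600 number, the arithmetic is short. A quick version assuming 20/10 vision resolves about 0.5 arcminutes; the screen size and viewing distance below are just illustrative, not taken from the Quora post.)

```python
import math

def vertical_pixels_needed(screen_height_m, viewing_distance_m, acuity_arcmin=0.5):
    """Pixels needed vertically so one pixel subtends no more than acuity_arcmin."""
    angle_rad = 2 * math.atan(screen_height_m / (2 * viewing_distance_m))
    angle_arcmin = math.degrees(angle_rad) * 60
    return math.ceil(angle_arcmin / acuity_arcmin)

# e.g. a 65" 16:9 TV (~0.81 m tall) viewed from about 1.5 m
print(vertical_pixels_needed(0.81, 1.5))   # roughly 3600 for 20/10 vision
```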
Agreed. It is not high resolution as such, but high resolution that the user can relate to - like not being able to see the pixels.
I still remember the hard time I had using the Apple Newton at a conference vs. the Palm, handed out freely on loan at a Gartner Group conference. Palm solved a problem, even though it wasn't very Apple … the user could input on a small device. I kept it, on top of my newly bought Newton.
It is the user …
Still, no manufacturer compares to the quality of Apple screens and resolution …
Those screens are produced by Samsung.
Part of the screen is, yes. Apple designs the full stack and sources new technology from multiple suppliers including Samsung.
By your logic, I own a Foxconn smartphone with a FreeBSD-based OS. If you bought a Porsche, would you call it a Volkswagen?
Retina means high pixel density, not high resolution. And there are very few standalone displays on the market which can be called “retina”, unfortunately.
I think the thing they're saying that's novel, isn't what they have (LoRAs), but where and when and how they make them.
Rather than just pre-baking static LoRAs to ship with the base model (e.g. one global "rewrite this in a friendly style" LoRA, etc), Apple seem to have chosen a bounded set of behaviors they want to implement as LoRAs — one for each "mode" they want their base model to operate in — and then set up a pipeline where each LoRA gets fine-tuned per user, and re-fine-tuned any time the data dependencies that go into the training dataset for the given LoRA (e.g. mail, contacts, browsing history, photos, etc) would change.
In other words, Apple are using their LoRAs as the state-keepers for what will end up feeling to the user like semi-online Direct Preference Optimization. (Compare/contrast: what Character.AI does with their chatbot response ratings.)
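(For anyone who wants to see what the "one base model, many task adapters" pattern looks like mechanically, here is a minimal sketch with Hugging Face PEFT. The base model and the adapter paths/names are placeholders; this is just the open-source analogue, not Apple's actual stack.)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# One shared base model (placeholder choice)...
base = AutoModelForCausalLM.from_pretrained("gpt2")
tok = AutoTokenizer.from_pretrained("gpt2")

# ...plus one small LoRA adapter per "mode" (placeholder adapter paths).
model = PeftModel.from_pretrained(base, "adapters/summarize", adapter_name="summarize")
model.load_adapter("adapters/friendly-rewrite", adapter_name="friendly_rewrite")
model.load_adapter("adapters/mail-reply", adapter_name="mail_reply")

def run(task, prompt):
    model.set_adapter(task)                      # hot-swap which LoRA is active
    ids = tok(prompt, return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=64)
    return tok.decode(out[0], skip_special_tokens=True)

print(run("friendly_rewrite", "Rewrite this in a friendly tone: ..."))
```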
---
I'm not as sure, from what they've said here, whether they're also implying that these models are being trained in the background on-device.
It could very well be possible: training something that's only LoRA-sized, on a vertically-integrated platform optimized for low-energy ML, that sits around awake but doing nothing for 8 hours a day, might be practical. (Normally it'd require a non-quantized copy of the model, though. Maybe they'll waste even more of your iPhone's disk space by having both quantized and non-quantized copies of the model, one for fast inference and the other for dog-slow training?)
But I'm guessing they've chosen not to do this — as, even if it were practical, it would mean that any cloud-offloaded queries wouldn't have access to these models.
Instead, I'm guessing the LoRA training is triggered by the iCloud servers noticing you've pushed new data to them, and throwing a lifecycle notification into a message queue of which the LoRA training system is a consumer. The training system reduces over changes to bake out a new version of any affected training datasets; bakes out new LoRAs; and then basically dumps the resulting tensor files out into your iCloud Drive, where they end up synced to all your devices.
I don't think the LoRAs are fine-tuned locally at all. It sounds like they use RAG to access data.
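(For anyone unfamiliar: "RAG" here just means retrieving relevant user records at query time and stuffing them into the prompt, instead of baking them into the weights. A minimal sketch with sentence-transformers; the model name is a common default and the records are made up, nothing Apple-specific.)

```python
from sentence_transformers import SentenceTransformer, util

# Embed the user's local records once (or incrementally as they change).
embedder = SentenceTransformer("all-MiniLM-L6-v2")
records = [
    "Dinner with Sam on Friday at 7pm",
    "Flight BA117 to SFO departs June 21, 10:05",
    "Mom's birthday is July 3",
]
record_emb = embedder.encode(records, convert_to_tensor=True)

def build_prompt(query, k=2):
    # Retrieve the k most relevant records and prepend them as context.
    q_emb = embedder.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, record_emb, top_k=k)[0]
    context = "\n".join(records[h["corpus_id"]] for h in hits)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("When is my mum's birthday?"))
```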
Consider a feature from earlier in the keynote: the thing Notes (and Math Notes) does now where it fixes up your handwriting into a facsimile of your handwriting, with the resulting letters then acting semantically as text (snapping to a baseline grid; being reflowable; being interpretable as math equations) but still having the kind of long-distance context-dependent variations that can't be accomplished by just generating a "handwriting font" with glyph variations selected by ligature.
They didn't say that this is an "AI thing", but I can't honestly see how else you'd do it other than by fine-tuning a vision model on the user's own handwriting.
For everything other than handwriting I don't think the LoRAs are fine-tuned locally.
Well, here's another one: they promised that your local (non-iCloud) photos don't leave the device. Yet they will now — among many other things they mentioned doing with your photos — allow you to generate "Memoji" that look like the people in your photos. Which includes the non-iCloud photos.
I can't picture any way to use a RAG to do that.
I can picture a way to do that that doesn't involve any model fine-tuning, but it'd be pretty ridiculous, and the results would probably not be very good either. (Load a static image2text LoRA tuned to describe the subjects of photos; run that once over each photo as it's imported/taken, and save the resulting descriptions. Later, whenever a photo is classified as a particular subject, load up a static LLM fine-tune that summarizes down all the descriptions of photos classified as subject X so far, into a single description of the platonic ideal of subject X's appearance. Finally, when asked for a "memoji", load up a static "memoji" diffusion LoRA, and prompt it with that subject-platonic-appearance description.)
But really, isn't it easier to just fine-tune a regular diffusion base-model — one that's been pre-trained on photos of people — by feeding it your photos and their corresponding metadata (incl. the names of subjects in each photo); and then load up that LoRA and the (static) memoji-style LoRA, and prompt the model with those same people's names plus the "memoji" DreamBooth-keyword?
(Okay, admittedly, you don't need to do this with a locally-trained LoRA. You could also do it by activating the static memoji-style LoRA, and then training to produce a textual-inversion embedding that locates the subject in the memoji LoRA's latent space. But the "hard part" of that is still the training, and it's just as costly!)
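(To make "fine-tune a diffusion base model on your photos" concrete: here's roughly what a DreamBooth-style LoRA training step looks like in open-source terms with diffusers + peft. The model id, the placeholder subject token, and the single-photo step are all illustrative; Apple hasn't described doing any of this.)

```python
import torch
import torch.nn.functional as F
from diffusers import DDPMScheduler, StableDiffusionPipeline
from peft import LoraConfig

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
noise_sched = DDPMScheduler.from_pretrained("runwayml/stable-diffusion-v1-5",
                                            subfolder="scheduler")

# Attach a small LoRA to the UNet's attention projections; only those weights train.
pipe.unet.add_adapter(LoraConfig(r=8, lora_alpha=8,
                                 target_modules=["to_q", "to_k", "to_v", "to_out.0"]))
opt = torch.optim.AdamW([p for p in pipe.unet.parameters() if p.requires_grad], lr=1e-4)

def train_step(pixel_values, caption="a photo of <subjectX> person"):
    # pixel_values: (1, 3, 512, 512) photo of the subject, scaled to [-1, 1]
    latents = pipe.vae.encode(pixel_values).latent_dist.sample() * pipe.vae.config.scaling_factor
    noise = torch.randn_like(latents)
    t = torch.randint(0, noise_sched.config.num_train_timesteps, (latents.shape[0],))
    noisy = noise_sched.add_noise(latents, noise, t)
    ids = pipe.tokenizer(caption, padding="max_length", truncation=True,
                         max_length=pipe.tokenizer.model_max_length,
                         return_tensors="pt").input_ids
    pred = pipe.unet(noisy, t, encoder_hidden_states=pipe.text_encoder(ids)[0]).sample
    loss = F.mse_loss(pred, noise)   # standard noise-prediction objective
    loss.backward()
    opt.step()
    opt.zero_grad()
    return loss.item()
```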
I believe this could be achieved by providing a seed image to the diffusion model and generating memoji based on it. This way fine tuning isn't required.
Yup, this is pretty much it, and DALL-E and others can do this already.
That's going to be something similar to IPAdapter FaceID: https://ipadapterfaceid.com. Basically, you use a facial-structure representation of the kind you'd use for face recognition (which of course Apple already computes on all your photos), together with some additional feature representations, to guide the image generation. No need for additional fine-tuning. A similar approach could likely be used for handwriting generation.
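(For concreteness, the open-source version of that idea looks roughly like this with a recent diffusers release and its IP-Adapter support. It's a sketch: the FaceID variant additionally feeds in a face-recognition embedding, which I've left out, and the photo path and prompt are made up.)

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Plug in an IP-Adapter: the reference image conditions generation,
# no per-user fine-tuning involved.
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models",
                     weight_name="ip-adapter_sd15.bin")
pipe.set_ip_adapter_scale(0.7)

face = load_image("photos/subject.jpg")          # placeholder path
out = pipe(prompt="3d cartoon avatar, memoji style",
           ip_adapter_image=face,
           num_inference_steps=30).images[0]
out.save("avatar.png")
```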
I didn't see the presentation but judging by your description, this is achievable using in-context learning.
I think you’re misunderstanding what they mean by adapting to use cases. See this passage:
This along with other statements in the article about keeping the base model weights unchanged says to me that they are simply swapping out adapters on a per app or per task basis. I highly doubt they will fine tune adapters on user data since they have taken a position against this. I wonder how successful this approach will be vs merging the adapters with the base model. I can see the benefits but there are also downsides.
Easel has been on iMessage for a bit now: https://apps.apple.com/us/app/easel-ai/id6448734086
There is no way they would secretly train LoRAs in the background on their users' phones. The benefits are small compared to the many potential problems. They describe some LoRA training infrastructure, which is likely using the same capacity they used to train the base models.
Apple would not implement these sophisticated user specific LoRA training techniques without mentioning them anywhere. No big player has done anything like this and Apple would want the credit for this innovation.
Was there anything about searching through our own photos using prompts? I thought this could be pretty amazing and still a natural way to find very specific photos in one’s own photo gallery.
Which is in turn just multimodal embedding
Besides, I could do "named person on a beach in August" and get the correct result in Photos on Android, so I don't get it.
It's amazing for Apple users if they didn't have it before. But from a tech standpoint, people could have had it for a while.
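(The core of that kind of search really is just embedding photos and the text query into a shared space, CLIP-style. A minimal sketch with the open-source CLIP weights; the file names are placeholders, and this is not what Apple or Google actually ship.)

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

paths = ["IMG_0001.jpg", "IMG_0002.jpg", "IMG_0003.jpg"]   # placeholder library
images = [Image.open(p) for p in paths]

with torch.no_grad():
    img_emb = model.get_image_features(**proc(images=images, return_tensors="pt"))
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)

def search(query, k=3):
    with torch.no_grad():
        q = model.get_text_features(**proc(text=[query], return_tensors="pt", padding=True))
        q = q / q.norm(dim=-1, keepdim=True)
    scores = (q @ img_emb.T).squeeze(0)
    best = scores.topk(min(k, len(paths))).indices.tolist()
    return [paths[i] for i in best]

print(search("a person on a beach"))   # dates/names need photo metadata, not CLIP
```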
The difference is that Apple has been doing this on-device for maybe 4-5 years already with the Neural Engine. Every iOS version has brought more stuff you can search for.
The current addition is "just" about adding a natural language interface on top of data they already have about your photos (on device, not in the cloud).
My iPhone 14 can, for example, detect the breed of my dog correctly from the pictures and it can search for a specific pet by name. Again on-device, not by sending my stuff to Google's cloud to be analysed.
They have been trying and failing to do a tiny little bit of this. It's so broken and useless that I've been uploading all my iCloud photos to Google as well, for search and sharing.
If you like Google using your personal photos for machine learning, that's your option. Now they have your every photo, geotagged and timestamped so they can see where you have been and at what times. Then they of course anonymise that information into an "advertiser id" they tag on to you and a sufficient quantity of other people so they can claim they're not directly targeting anyone.
I prefer Apple's privacy focused option myself.
>If you like Google using your personal photos for machine learning, that's your option.
It's a trade-off between getting the features I need and the price I have to pay. All else being equal I do prefer privacy as well. Unfortunately, all else is not equal.
>I prefer Apple's privacy focused option myself.
It's only an option if it works.
Photos has had this for a while with structured natural language queries, and this kind of prompt was part of the WWDC video.
Run OpenAI's CLIP model on iOS to search photos. https://github.com/mazzzystar/Queryable
Yes, exactly this. I have had this for a while and it works wonderfully well in most cases, but it's wonky and not seamless. I wanted a more integrated approach with the Photos app, which only Apple can bring to the table.
Very little of the “AI” boom has been novel, most has been iterative elaborations (though innovative nonetheless). Academics have been using neural network statistical models for decades. What’s new is the combination of compute capability and data volume available for training. It’s iterative all the way down though, that’s how all technologies are developed.
This is the important part.
My advisor said "new" means an old method applied to new data, or a new method applied to old data.
Commercially, that means price points, i.e., discrete points where something becomes viable.
Maybe that's iterative, but maybe not. Either way, once the opportunity presents, time is of the essence.
Most people don't realize this, but almost all research works that way. Only the media spins research as breakthrough-based, because that way it is easier to sell stories. But almost everything is incremental/iterative. Even the transformer architecture, which in some way can be seen as the most significant architectural advancement in AI in the past years, was a pretty small, incremental step when it came out. Only with a lot of further work building on top of that did it become what we see today. The problem is that science-journalists vastly outnumber scientists producing these incremental steps, so instead of reporting on topics when improvements actually accumulated to a big advancement, every step along the way gets its own article with tons of unnecessary commentary heralding its features.
It's a huge development in terms of it being a consumer-ready, on-device LLM.
And if Karpathy thinks so then I assume it's good enough for HN:
https://x.com/karpathy/status/1800242310116262150
The productization of it (like Karpathy mentioned) is awesome. But I think the URL for that would be this maybe? [link](https://www.apple.com/apple-intelligence/)
They refer to LoRA explicitly in the post.
Although I caught that on the first read, when I got to the adapters part I found myself asking, "is this not just LoRA?".
Maybe it's my fault as a reader, but I think the writing could be clearer. Usually in a research paper you would link to the LoRA paper there too.
This isn't about AI research, it's about delivering AI at unimaginable scale.
180 million users for chatgpt isn’t unimaginable but it does exceed the number of iPhone users in the United States.
You know what company you are talking about here?
I think you’re referring to my comment about this being huge for developers?
Just want to point out I call this launch huge, didn’t say “huge development” as quoted, and didn’t imply what was interesting was the ML research. No one in this thread used the quoted words, at least that I can see.
My comment was about dev experience, memory swapping, potential for tuning base models to each HW release, fine tune deployment, and app size. Those things do have the potential to be huge for developers, as mentioned. They are the things that will make a local+private ML developer ecosystem work.
I think the article and comment make sense in their context: a developer conference for Mac and iOS devs.
Apple also explicitly says it’s LoRA.
reminds me of Easel on iMessage: https://easelapps.ai/
I feel Apple should have just focused on their own models for this one and not complicated the conversation with OpenAI. They could have left that to another announcement later.
Quick straw poll around the office: many think their data will be sent off to OpenAI by default for these new features, which is not the case.
I think your conclusion is uncharitable, or at least it depends on how deep your interest in AI research actually is. Reading the docs, there are at least several points of novelty/interest:
* Clearly outlining their intent/policies for training/data use. Committing to not using user data or interactions for training their base models is IMO actually a pretty big deal and a differentiator from everyone else.
* There's a never-ending stream of new RL variants ofc, but that's how technology advances, and I'm pretty interested to see how these compare with the rest: "We have developed two novel algorithms in post-training: (1) a rejection sampling fine-tuning algorithm with teacher committee, and (2) a reinforcement learning from human feedback (RLHF) algorithm with mirror descent policy optimization and a leave-one-out advantage estimator. We find that these two algorithms lead to significant improvement in the model’s instruction-following quality." (See the sketch of my reading of the leave-one-out part just after this list.)
* I'm interested to see how their custom quantization compares with the current SoTA (probably AQLM atm)
* It looks like they've done some interesting optimizations to lower TTFT, including the use of some sort of self-speculation. It looks like they also have a new KV-cache update mechanism, and I'm looking forward to reading about that as well. 0.6 ms/token means that for your average, I dunno, 20-token query you might only wait 12 ms for TTFT (I have my doubts; maybe they're getting their numbers from much larger prompts; again, I'm interested to see for myself).
* Yes, it looks like they're using pretty standard LoRAs; the more interesting part is their (automated) training/re-training infrastructure, but I doubt that's something that will be shared. The actual training pipeline (feedback collection, refinement, automated deployment) is where the real meat and potatoes of being able to deploy AI for prod/at scale lies. Still, what they shared about their tuning procedures is pretty interesting, as is seeing which models they're comparing against.
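Here's the sketch of the leave-one-out part mentioned above. My reading is that "leave-one-out advantage estimator" means an RLOO-style baseline, where each sampled response is scored against the mean reward of the other samples for the same prompt; that's my interpretation, not something Apple spells out.

```python
import torch

def leave_one_out_advantages(rewards: torch.Tensor) -> torch.Tensor:
    """rewards: (k,) rewards for k sampled responses to the same prompt."""
    k = rewards.numel()
    total = rewards.sum()
    baselines = (total - rewards) / (k - 1)   # mean of the *other* k-1 rewards
    return rewards - baselines                # per-sample advantage

print(leave_one_out_advantages(torch.tensor([0.2, 0.9, 0.5, 0.4])))
# the best response gets a positive advantage, the worst a negative one
```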
As this article doesn't claim to be a technical report or a paper, while citations would be nice, I can also understand why they were elided. OpenAI has done the same (and sometimes gotten heat for it, like with Matryoshka embeddings). For all we know, maybe the original author had references; or maybe, since PEFT isn't new to those in the field, describing it is just being done as a service to the reader. At the end of the day, it's up to the reader to make their own judgements on what's new or not, or a huge development or not. From my reading of the article, your conclusion, which funnily enough is now the top-rated comment on this thread, isn't actually much more accurate than the one you're criticizing.
Those people aren’t looking at Apple.
They seem to have a good model for adding value to their products without the hold my beer, conquer the world bullshit that you get from OpenAI, et al.
Thing is, Apple takes these concepts and polishes them, makes them accessible to maybe not laypeople but definitely a much wider audience compared to those already "in the industry", so to speak.