Relationship with CVNets
CoreNet evolved from CVNets, to encompass a broader range of applications beyond computer vision. Its expansion facilitated the training of foundational models, including LLMs.
We can expect it to have grown from here: https://apple.github.io/ml-cvnets/index.html
It looks like a mid-level implementations of training and inference. You can see in their "default_trainer.py"[1] that the engine uses Tensors from torch but implements its own training method. They implement their own LR scheduler and optimizer; the caller can optionally use Adam from torch.
It's an interesting (maybe very Apple) choice to build from the ground up instead of partnering with existing frameworks to provide first class support in them.
The MLX examples seem to be inference only at this point. It does look like this might be a landing ground for more MLX specific implementations: e.g. https://github.com/apple/corenet/blob/5b50eca42bc97f6146b812...
It will be interesting to see how it tracks over the next year; especially with their recent acquisitions:
Datakalab https://news.ycombinator.com/item?id=40114350
DarwinAI https://news.ycombinator.com/item?id=39709835
1: https://github.com/apple/corenet/blob/main/corenet/engine/de...
It smells of a somewhat panicked attempt to prepare for WWDC to me. Apple has really dropped the ball on AI and now they're trying to catch up.
I don’t get the idea that Apple dropped the ball on AI. They were fairly early with adding neural engine hardware to their chips and have been using ML extensively on-device for a long time now
They haven’t put an LLM assistant out there. But they don’t make their own search engine either so I don’t think “online LLM assistant” is something they’ll ever put much effort into unless it’s part of a bigger effort to launch their own AI-based search engine as well.
As for generative AI I don’t think the quality is up to a level that would be reasonable for Apple.
The only area where i would expect Apple to keep up is the kind of Copilot integration Microsoft is working on. And we know Apple is working on on-device AI assistant, and probably have for a long time. It’ll be launched when they can get good quality results on-device. Something nobody else has achieved anyway, so we can’t say that they’re behind anyone yet.
That's the public perception. Maybe due to them not getting in on a quick cash grab off the LLM hype wave?
I share this perception for two reasons:
1) Siri
2) Dearth of published AI research
https://machinelearning.apple.com/research seems to have too many publications to be considered a "dearth" IMO.
[delayed]
I agree with 1. For 2, have they ever been a company big into research? They're very consumer focused and it can take time to integrate new tech into consumer products at scale. Especially the way Apple likes to do it: polished and seamlessly integrated into the rest of their ecosystem.
I would say not doing AI research (or buying another big company that does) is tantamount to dropping the ball on AI, if it turns out that AI is a capability they should have had and must have to succeed.
You could argue that publishing research is not the same thing as doing it. But they don't seem to have done much of it until fairly recently.
I agree that Apple does less research than other big tech companies. But they do it where they think it matters. Their M-series CPUs are more than just integration and polishing. And they have been doing some research in health AI as well, I think.
If that's all it takes to stay ahead of the curve, then Rockchip and Qualcomm are arguably right up there alongside them. Tons of vendors shipped their own AI silicon, and of those vendors, it seems like Nvidia is the only one that shipped anything truly usable. Medium-sized LLMs, Stable Diffusion and probably even stuff like OAI Whisper is faster run on Apple's GPUs than their AI coprocessor.
Be careful not to have NVIDIA-shaped tunnel vision. Performance isn't the whole story. It's very telling that approximately everybody making SoCs for battery powered devices (phones, tablets, laptops) has implemented an AI coprocessor that's separate from the GPU. NVIDIA may take exception, but the industry consensus is that GPUs aren't always the right solution to every AI/ML-related problem.
Ideally, you're right. Realistically, Apple has to choose between using their powerful silicon (the GPU) for high-quality results or their weaker silicon (the Neural engine) for lower-power inference. Devices that are designed around a single power profile (eg. desktop GPUs) can integrate the AI logic into the GPU and have both high-quality and high-speed inference. iPhones gotta choose one or the other.
There's not nothing you can run on that Neural Engine, but it's absolutely being misunderstood relative to the AI applications people are excited for today. Again; if chucking a few TOPS of optimized AI compute onto a mobile chipset is all we needed, then everyone would be running float16 Llama on their smartphone already. Very clearly, something must change.
Curious then, why they keep recruiting search engineers[1]. And why they run a web crawler[2]. And why typing "Taylor Swift" into safari offers a Siri Suggested website before Google.
I guess what people mean by search engine is "show ads alongside web search to as many people as possible"?
1: https://jobs.apple.com/en-us/details/200548043/aiml-senior-s... 2: https://support.apple.com/en-us/HT204683
Wouldn’t WWDC-related endeavors be more product-facing? I’m not so sure this has to do with their efforts to incorporate ai into products, and tbh I would say their ai research has been pretty strong generally speaking.
I expect that a lot of WWDC will be Apple trying to get more developers to build AI products for their platforms, because at the moment, Apple products don't have much AI. The other tech companies have integrated user facing LLM products into a significant part of their ecosystem - Google and Microsoft have them up front and center in search. Apple's AI offerings for end users are what exactly? The camera photos app that does minor tweaks to photos (composing from multiple frames). What else actually is there in the first party ecosystem that significantly leverages AI? Siri is still the same trash it's been for the last 10 years - in fact IMO it's become even less useful, often refusing to even do web searches for me. (I WANT Siri to work very well).
So because their first party AI products are so non-existent, I think WWDC is a desperate attempt by Apple to get third party developers to build compelling AI products. I say desperate because they're already a year behind the competition in this space.
(I can imagine they'll be trying to get developers to build Vision Pro software too, though I hear sales there have collapsed so again, way too little, too late)
I am guessing you are not familiar with the AI-powered vision features that already ship since a few years. Mostly accessibility related, so I am not surprised you missed it.
Yep. Google, the AI company, only recently launched image descriptions in TalkBack, which VoiceOver has had for years now. Google still doesn't have Screen Recognition, which basically does OCR and image/UI classification to make inaccessible apps more accessible.
Can you be more specific?
What AI products are present in other ecosystems (eg. Android, Samsung, whatever) and missing from Apple?
Honest question: I find the platform distinction largely meaningless in most cases apart from “what your phone looks like” and “can you side load apps”…
They have tons of computer vision, NN inference and natural language processing in their products. It's reductive to say Apple products don't have much AI.
I'm not sure what you mean by "AI products", and why you think Apple needs them for their platforms.
For one thing, I can search for any text I’ve ever take a photo of. Finding a picture of someone I took 20+ years ago by searching for a single work I remember on their t-shirt is pretty cool, and is all done on-device.
I think it’s important to remember that there are a bunch of actual useful AI-driven features out there that aren’t just GenAI chatbots.
Apple put a neural engine on-die in the A11 back in 2017:
* https://en.wikipedia.org/wiki/Apple_A11#Neural_Engine
The A-derived M-series chips had them from the beginning in 2020:
* https://en.wikipedia.org/wiki/Apple_M1#Other_features
Seems like they've been doing machine learning for a while now.
They've been using them, too. Auto OCR so selecting text in images Just Works, image enhancements, Siri. I'm sure LLM Siri is in the works. Scanning your photos for CSAM. Let's hope that last one is more reliable than Siri :/
Wasn’t csam ultimately rolled back? And wasn’t it not Ai based but hash based?
The interface looks very Apple as well. Looks like you create a config file, and you already have a model in mind with the hyperparameters and it provides a simple interface. How useful is this to researchers trying to hack the model architecture?
One example: https://github.com/apple/corenet/tree/main/projects/clip#tra...
Not much. But if you just want to adapt/optimize hyperparams, this is a useful approach. So I can certainly see a possible, less technical audience. If you actually want to hack and adapt architectures it's probably not worth it.
I’m not familiar with how any of this works but what does state of the art training look like? Almost no models release their training source code or data sets or pre processing or evaluation code. So is it known what the high level implementation even is?
https://github.com/NVIDIA/Megatron-LM
This is probably a good baseline to start thinking about LLM training at scale.
What you say is true about the project but both PyTorch works on Mace and Tensorflow was ported to Macs by Apple