HN comments for: CoreNet: A library for training deep neural networks

gbickford

28 replies

15h50m

2024-04-24 02:35:27 UTC

Relationship with CVNets

CoreNet evolved from CVNets, to encompass a broader range of applications beyond computer vision. Its expansion facilitated the training of foundational models, including LLMs.

We can expect it to have grown from here: https://apple.github.io/ml-cvnets/index.html

It looks like a mid-level implementation of training and inference. You can see in their "default_trainer.py"[1] that the engine uses Tensors from torch but implements its own training method. They implement their own LR scheduler and optimizer; the caller can optionally use Adam from torch.
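To make "its own LR scheduler and optimizer" concrete, here is a minimal pure-Python sketch of a hand-written training loop. It is not CoreNet's code: the names `cosine_lr`, `sgd_step`, and `train`, the warmup-plus-cosine schedule, and the toy quadratic objective are all assumptions chosen for illustration.

```python
import math


def cosine_lr(step, total_steps, max_lr, min_lr=0.0, warmup_steps=0):
    """Hypothetical scheduler: linear warmup, then cosine decay to min_lr."""
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp linearly from max_lr / warmup_steps up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def sgd_step(params, grads, lr):
    """Hypothetical optimizer: plain SGD, returning the updated parameters."""
    return [p - lr * g for p, g in zip(params, grads)]


def train(total_steps=200):
    """Minimize the toy objective f(w) = (w - 3)^2, whose gradient is 2(w - 3)."""
    params = [0.0]
    for step in range(total_steps):
        lr = cosine_lr(step, total_steps, max_lr=0.1, warmup_steps=10)
        grads = [2.0 * (params[0] - 3.0)]
        params = sgd_step(params, grads, lr)
    return params[0]


if __name__ == "__main__":
    print(f"final w = {train():.6f}")
```

The point of the sketch is the division of labour: the loop owns the schedule and the update rule, and a tensor library would only be needed to compute the gradients.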

It's an interesting (maybe very Apple) choice to build from the ground up instead of partnering with existing frameworks to provide first class support in them.

The MLX examples seem to be inference only at this point. It does look like this might be a landing ground for more MLX specific implementations: e.g. https://github.com/apple/corenet/blob/5b50eca42bc97f6146b812...

It will be interesting to see how it tracks over the next year; especially with their recent acquisitions:

Datakalab https://news.ycombinator.com/item?id=40114350

DarwinAI https://news.ycombinator.com/item?id=39709835

1: https://github.com/apple/corenet/blob/main/corenet/engine/de...

davedx

22 replies

9h48m

2024-04-24 08:37:33 UTC

It's an interesting (maybe very Apple) choice to build from the ground up instead of partnering with existing frameworks to provide first class support in them.

It smells of a somewhat panicked attempt to prepare for WWDC to me. Apple has really dropped the ball on AI and now they're trying to catch up.

audunw

10 replies

8h29m

2024-04-24 09:56:22 UTC

I don’t get the idea that Apple dropped the ball on AI. They were fairly early with adding neural engine hardware to their chips and have been using ML extensively on-device for a long time now

They haven’t put an LLM assistant out there. But they don’t make their own search engine either so I don’t think “online LLM assistant” is something they’ll ever put much effort into unless it’s part of a bigger effort to launch their own AI-based search engine as well.

As for generative AI I don’t think the quality is up to a level that would be reasonable for Apple.

The only area where I would expect Apple to keep up is the kind of Copilot integration Microsoft is working on. And we know Apple is working on an on-device AI assistant, and probably has been for a long time. It’ll be launched when they can get good quality results on-device. Something nobody else has achieved anyway, so we can’t say that they’re behind anyone yet.

chrsw

5 replies

3h9m

2024-04-24 15:16:57 UTC

I don’t get the idea that Apple dropped the ball on AI.

That's the public perception. Maybe due to them not getting in on a quick cash grab off the LLM hype wave?

fauigerzigerk

4 replies

3h2m

2024-04-24 15:23:28 UTC

I share this perception for two reasons:

1) Siri

2) Dearth of published AI research

jldugger

1 replies

28m

2024-04-24 17:57:11 UTC

Dearth of published AI research

https://machinelearning.apple.com/research seems to have too many publications to be considered a "dearth" IMO.

fauigerzigerk

0 replies

2024-04-24 18:23:56 UTC

[delayed]

chrsw

1 replies

2h16m

2024-04-24 16:09:09 UTC

I agree with 1. For 2, have they ever been a company big into research? They're very consumer focused and it can take time to integrate new tech into consumer products at scale. Especially the way Apple likes to do it: polished and seamlessly integrated into the rest of their ecosystem.

fauigerzigerk

0 replies

55m

2024-04-24 17:30:05 UTC

I would say not doing AI research (or buying another big company that does) is tantamount to dropping the ball on AI, if it turns out that AI is a capability they should have had and must have to succeed.

You could argue that publishing research is not the same thing as doing it. But they don't seem to have done much of it until fairly recently.

I agree that Apple does less research than other big tech companies. But they do it where they think it matters. Their M-series CPUs are more than just integration and polishing. And they have been doing some research in health AI as well, I think.

talldayo

2 replies

2h31m

2024-04-24 15:54:43 UTC

They were fairly early with adding neural engine hardware to their chips

If that's all it takes to stay ahead of the curve, then Rockchip and Qualcomm are arguably right up there alongside them. Tons of vendors shipped their own AI silicon, and of those vendors, it seems like Nvidia is the only one that shipped anything truly usable. Medium-sized LLMs, Stable Diffusion and probably even stuff like OAI Whisper run faster on Apple's GPUs than on their AI coprocessor.

wtallis

1 replies

1h21m

2024-04-24 17:04:19 UTC

and of those vendors, it seems like Nvidia is the only one that shipped anything truly usable. Medium-sized LLMs, Stable Diffusion and probably even stuff like OAI Whisper run faster on Apple's GPUs than on their AI coprocessor.

Be careful not to have NVIDIA-shaped tunnel vision. Performance isn't the whole story. It's very telling that approximately everybody making SoCs for battery powered devices (phones, tablets, laptops) has implemented an AI coprocessor that's separate from the GPU. NVIDIA may take exception, but the industry consensus is that GPUs aren't always the right solution to every AI/ML-related problem.

talldayo

0 replies

16m

2024-04-24 18:09:21 UTC

Ideally, you're right. Realistically, Apple has to choose between using their powerful silicon (the GPU) for high-quality results or their weaker silicon (the Neural engine) for lower-power inference. Devices that are designed around a single power profile (eg. desktop GPUs) can integrate the AI logic into the GPU and have both high-quality and high-speed inference. iPhones gotta choose one or the other.

There's not nothing you can run on that Neural Engine, but it's absolutely being misunderstood relative to the AI applications people are excited for today. Again; if chucking a few TOPS of optimized AI compute onto a mobile chipset is all we needed, then everyone would be running float16 Llama on their smartphone already. Very clearly, something must change.
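A rough back-of-the-envelope sketch of why float16 weights are hard to fit on a phone. The comment does not name a model size or a device, so the 7-billion-parameter count and the 8 GiB RAM figure below are assumptions for illustration, and the calculation covers weights only (no activations or KV cache).

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: a 7B-parameter model; the thread does not specify a size.
ASSUMED_PARAMS = 7_000_000_000
# Assumption: a phone with 8 GiB of RAM shared with the OS and other apps.
ASSUMED_DEVICE_RAM_GIB = 8

fp16_gib = weight_memory_gib(ASSUMED_PARAMS, 2)    # 2 bytes per float16 weight
int4_gib = weight_memory_gib(ASSUMED_PARAMS, 0.5)  # 4 bits per quantized weight

for label, size in (("float16", fp16_gib), ("4-bit", int4_gib)):
    fits = "fits" if size < ASSUMED_DEVICE_RAM_GIB else "does not fit"
    print(f"{label}: {size:.1f} GiB of weights, {fits} in {ASSUMED_DEVICE_RAM_GIB} GiB")
```

Under these assumptions the float16 weights alone exceed the device's memory, which is independent of how many TOPS the accelerator offers.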

jldugger

0 replies

32m

2024-04-24 17:53:24 UTC

they don’t make their own search engine

Curious then, why they keep recruiting search engineers[1]. And why they run a web crawler[2]. And why typing "Taylor Swift" into Safari offers a Siri Suggested website before Google.

I guess what people mean by search engine is "show ads alongside web search to as many people as possible"?

1: https://jobs.apple.com/en-us/details/200548043/aiml-senior-s...

2: https://support.apple.com/en-us/HT204683

pizza

7 replies

9h43m

2024-04-24 08:42:52 UTC

Wouldn’t WWDC-related endeavors be more product-facing? I’m not so sure this has to do with their efforts to incorporate ai into products, and tbh I would say their ai research has been pretty strong generally speaking.

davedx

6 replies

9h23m

2024-04-24 09:02:20 UTC

I expect that a lot of WWDC will be Apple trying to get more developers to build AI products for their platforms, because at the moment, Apple products don't have much AI. The other tech companies have integrated user facing LLM products into a significant part of their ecosystem - Google and Microsoft have them up front and center in search. Apple's AI offerings for end users are what exactly? The camera photos app that does minor tweaks to photos (composing from multiple frames). What else actually is there in the first party ecosystem that significantly leverages AI? Siri is still the same trash it's been for the last 10 years - in fact IMO it's become even less useful, often refusing to even do web searches for me. (I WANT Siri to work very well).

So because their first party AI products are so non-existent, I think WWDC is a desperate attempt by Apple to get third party developers to build compelling AI products. I say desperate because they're already a year behind the competition in this space.

(I can imagine they'll be trying to get developers to build Vision Pro software too, though I hear sales there have collapsed so again, way too little, too late)

lynx23

1 replies

7h26m

2024-04-24 10:59:57 UTC

I am guessing you are not familiar with the AI-powered vision features that have been shipping for a few years already. They are mostly accessibility related, so I am not surprised you missed them.

devinprater

0 replies

4h47m

2024-04-24 13:38:31 UTC

Yep. Google, the AI company, only recently launched image descriptions in TalkBack, which VoiceOver has had for years now. Google still doesn't have Screen Recognition, which basically does OCR and image/UI classification to make inaccessible apps more accessible.

wokwokwok

0 replies

7h51m

2024-04-24 10:34:25 UTC

Can you be more specific?

What AI products are present in other ecosystems (eg. Android, Samsung, whatever) and missing from Apple?

Honest question: I find the platform distinction largely meaningless in most cases apart from “what your phone looks like” and “can you side load apps”…

tzakrajs

0 replies

8h37m

2024-04-24 09:48:33 UTC

They have tons of computer vision, NN inference and natural language processing in their products. It's reductive to say Apple products don't have much AI.

niek_pas

0 replies

8h53m

2024-04-24 09:32:00 UTC

I'm not sure what you mean by "AI products", and why you think Apple needs them for their platforms.

matthewmacleod

0 replies

15m

2024-04-24 18:10:28 UTC

For one thing, I can search for any text I’ve ever taken a photo of. Finding a picture of someone I took 20+ years ago by searching for a single word I remember on their t-shirt is pretty cool, and is all done on-device.

I think it’s important to remember that there are a bunch of actual useful AI-driven features out there that aren’t just GenAI chatbots.

throw0101c

2 replies

7h26m

2024-04-24 10:59:27 UTC

Apple has really dropped the ball on AI and now they're trying to catch up.

Apple put a neural engine on-die in the A11 back in 2017:

* https://en.wikipedia.org/wiki/Apple_A11#Neural_Engine

The A-derived M-series chips had them from the beginning in 2020:

* https://en.wikipedia.org/wiki/Apple_M1#Other_features

Seems like they've been doing machine learning for a while now.

jdewerd

1 replies

7h0m

2024-04-24 11:25:28 UTC

They've been using them, too. Auto OCR so selecting text in images Just Works, image enhancements, Siri. I'm sure LLM Siri is in the works. Scanning your photos for CSAM. Let's hope that last one is more reliable than Siri :/

thealistra

0 replies

5h21m

2024-04-24 13:04:05 UTC

Wasn’t the CSAM scanning ultimately rolled back? And wasn’t it hash-based rather than AI-based?

error9348

1 replies

13h59m

2024-04-24 04:26:57 UTC

The interface looks very Apple as well. Looks like you create a config file, and you already have a model in mind with the hyperparameters and it provides a simple interface. How useful is this to researchers trying to hack the model architecture?

One example: https://github.com/apple/corenet/tree/main/projects/clip#tra...

sigmoid10

0 replies

12h10m

2024-04-24 06:15:20 UTC

Not much. But if you just want to adapt/optimize hyperparams, this is a useful approach. So I can certainly see a possible, less technical audience. If you actually want to hack and adapt architectures it's probably not worth it.
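A minimal sketch of the config-driven pattern being discussed, assuming a simple name-to-class registry. `MODEL_REGISTRY`, `register_model`, `TinyMLP`, and `build_from_config` are hypothetical names, not CoreNet's API, and a plain dict stands in for the parsed config file.

```python
# Hypothetical registry mapping config names to model classes.
MODEL_REGISTRY = {}


def register_model(name):
    """Decorator that records a model class under a config-visible name."""
    def decorator(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


@register_model("tiny_mlp")
class TinyMLP:
    """Stand-in model that only stores its hyperparameters."""

    def __init__(self, hidden_dim=64, num_layers=2, dropout=0.0):
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers
        self.dropout = dropout


def build_from_config(config):
    """Look up the model by name and pass the remaining keys as hyperparameters."""
    model_cfg = dict(config["model"])  # copy so the caller's config is untouched
    name = model_cfg.pop("name")
    if name not in MODEL_REGISTRY:
        raise KeyError(f"unknown model {name!r}; known: {sorted(MODEL_REGISTRY)}")
    return MODEL_REGISTRY[name](**model_cfg)


# Stand-in for a parsed config file.
config = {"model": {"name": "tiny_mlp", "hidden_dim": 128, "num_layers": 4}}
model = build_from_config(config)
print(model.hidden_dim, model.num_layers, model.dropout)
```

Changing `hidden_dim` or `num_layers` is a one-line config edit, while a new architecture means writing and registering a new class, which matches the trade-off described above.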

blackeyeblitzar

1 replies

15h24m

2024-04-24 03:01:07 UTC

It looks like a mid-level implementation of training and inference

I’m not familiar with how any of this works but what does state of the art training look like? Almost no models release their training source code or data sets or pre processing or evaluation code. So is it known what the high level implementation even is?

spott

0 replies

15h19m

2024-04-24 03:06:06 UTC

https://github.com/NVIDIA/Megatron-LM

This is probably a good baseline to start thinking about LLM training at scale.

zitterbewegung

0 replies

3h38m

2024-04-24 14:47:33 UTC

What you say is true about the project, but PyTorch works on Macs and TensorFlow was ported to Macs by Apple.

symlinkk

25 replies

16h21m

2024-04-24 02:04:14 UTC

Pretty funny that Apple engineers use Homebrew too.

guywithabike

24 replies

16h19m

2024-04-24 02:06:01 UTC

Why is it funny? Homebrew is the de facto standard terminal packaging tool for macOS.

ramesh31

9 replies

15h48m

2024-04-24 02:37:32 UTC

Why is it funny? Homebrew is the de facto standard terminal packaging tool for macOS.

It's funny because a multi-trillion dollar company can't be bothered to release a native package manager or an official binary repository for their OS after decades of pleading from developers.

Tagbert

3 replies

15h9m

2024-04-24 03:16:18 UTC

So you want them to Sherlock Homebrew?

TillE

2 replies

14h2m

2024-04-24 04:23:39 UTC

"Sherlocking" can be unfortunate for a developer, but it's odd to view it as an inherently bad thing. A package manager is a core OS feature, even Microsoft has WinGet now.

fragmede

0 replies

13h12m

2024-04-24 05:13:14 UTC

it's odd to feel empathetic when someone has their livelihood taken from them?

Someone

0 replies

12h7m

2024-04-24 06:18:53 UTC

A package manager is a core OS feature

It has become a core OS feature. Historically, you see the set of core OS features expand tremendously. Back in the 80’s drawing lines and circles wasn’t even a core OS feature (not on many home computers, and certainly not on early PCs), bit-mapped fonts were third-party add-ons for a while, vector-based fonts were an Adobe add-on (https://en.wikipedia.org/wiki/Adobe_Type_Manager), printer drivers were third party, etc.

I think that’s natural. As lower layers become commodities (try making money selling an OS that only manages memory and processes), OS sellers have to add higher layer stuff to their products to make money on them.

As to Sherlocking, big companies cannot do well there in the eyes of “the angry internet”:

- don’t release feature F: “They don’t even support F out of the box. On the competitor’s product, you get that for free”

- release a minimal implementation: “They have F, but it doesn’t do F1, F2, or F3”

- release a fairly full implementation: “Sherlocking!” and/or nitpicking about their engineering choices.

randomdata

1 replies

14h39m

2024-04-24 03:46:11 UTC

They released "App Store" for the average Joe. We can all agree it is not suitable for power users, but at the same time what would power users gain over existing solutions if they were to introduce something?

katbyte

0 replies

13h22m

2024-04-24 05:03:14 UTC

You can brew install mas (I think) and then install/manage Mac store stuff via the cli pretty easily

etse

1 replies

15h31m

2024-04-24 02:54:16 UTC

Well, without charging for it, right?

2muchcoffeeman

0 replies

15h25m

2024-04-24 03:00:14 UTC

They should do it to become the de facto platform for programming.

astrange

0 replies

15h26m

2024-04-24 02:59:45 UTC

They did, they sponsored MacPorts. (And then Swift Package Manager.)

AceJohnny2

6 replies

16h15m

2024-04-24 02:10:42 UTC

TMWNN

5 replies

16h4m

2024-04-24 02:21:48 UTC

I also use MacPorts, but certainly have often noticed that Homebrew has some package that MacPorts doesn't.

I guess there's nothing stopping me from moving to Homebrew other than familiarity.

fastball

3 replies

14h36m

2024-04-24 03:49:03 UTC

I used MacPorts a decade ago, but at some point realized that Homebrew had more packages that were kept consistently up-to-date. Switched and never looked back.

nicolas_t

2 replies

13h37m

2024-04-24 04:48:26 UTC

I switched away back to macports when homebrew decided to get rid of formula options. To be honest, I always find homebrew frustrating, it feels that they've often made technical decisions that are not necessarily the best but they've been much more successful at marketing themselves than macports.

pnw_throwaway

1 replies

12h38m

2024-04-24 05:47:14 UTC

If I’m reading the formula docs right, only homebrew-core packages don’t support it (due to CI not testing them). That part does suck, though.

Other taps, like homebrew-ffmpeg, offer a ton of options.

nicolas_t

0 replies

5h35m

2024-04-24 12:50:07 UTC

oh, I actually hadn't realized that this is what they settled on in the end. ffmpeg is the quintessential package where options make sense so good that that's still supported.

The other issue I experienced with homebrew around that time was related to having different versions of openssl installed because I had some old codebase I had to run (and for performance reasons didn't want to use docker). But that's definitely an edge case.

detourdog

0 replies

5h41m

2024-04-24 12:44:09 UTC

I haven't looked at Homebrew since that got started. The philosophical difference at that time was using macports and having a consistent and managed */local/ collection of tools with self contained dependencies vs. adding new tools with dependencies tied to the current Mac OS release.

I still use MacPorts for that reason and it is easy enough to create a local portfile for whatever isn't in Macports.

I find this to be the easy way to manage networked development computers.

photonbeam

5 replies

15h46m

2024-04-24 02:39:27 UTC

I hear a lot about people moving to nix-darwin, is it popular or am I showing my own bubble

jallmann

1 replies

15h0m

2024-04-24 03:25:14 UTC

I use nixpkgs on MacOS, is nix-darwin a different project?

I love Nix but it probably has too many rough edges for the typical homebrew user.

tymscar

0 replies

14h8m

2024-04-24 04:17:30 UTC

It's a different, complementary thing. It lets you define your macOS settings the same way you would on NixOS.

pyinstallwoes

0 replies

12h26m

2024-04-24 05:59:38 UTC

I never even heard of nix-darwin. Interesting.

firecall

0 replies

14h37m

2024-04-24 03:48:39 UTC

I've never heard of it until now, but will check it out! :-)

armadsen

0 replies

13h44m

2024-04-24 04:41:10 UTC

I’m a full-time Mac and iOS developer, have been for almost 20 years, and this is the first I’ve heard of it. Might just be my bubble, but I don’t think it’s a huge thing yet. (I’m going to check it out now!)

sevagh

0 replies

13h27m

2024-04-24 04:58:29 UTC

Apple should do like this library, re-release Homebrew with their own name on the README and people would lap it up.

buildbot

9 replies

16h4m

2024-04-24 02:21:45 UTC

Does this support training on Apple silicon? It’s not very clear unless I missed something in the README.

blackeyeblitzar

6 replies

15h26m

2024-04-24 02:59:26 UTC

Would such a capability (training) be useful for anything other than small scale experimentation? Apple doesn’t make server products anymore and even when they did, they were overpriced. Unless they have private Apple silicon based servers for their own training needs?

MBCook

3 replies

14h29m

2024-04-24 03:56:42 UTC

There are an insane number of Apple Silicon devices out there.

If your product runs on an iPhone or iPad, I’m sure this is great.

If you only ever want to run on 4090s or other server stuff, yeah this probably isn’t that interesting.

Maybe it’s a good design for the tools or something, I have no experience to know. Maybe someone else can build off it.

But it makes sense Apple is releasing tools to make stuff that works better on Apple platforms.

blackeyeblitzar

2 replies

14h21m

2024-04-24 04:04:47 UTC

I can understand the inference part being useful and practical for Apple devs. I’m just wondering about the training part, for which Apple silicon devices don’t seem very useful.

spmurrayzzz

0 replies

5h17m

2024-04-24 13:08:47 UTC

My M2 Max significantly outperforms my 3090 Ti for training a Mistral-7B LoRA. It's sort of a case-by-case situation though, as it depends on how optimized the CUDA kernels happen to be for whatever workload you're doing (i.e. for inference, there's a big delta between standard transformers vs exllamav2; Apple silicon may outperform the former, but certainly not the latter).

rgbrgb

0 replies

13h18m

2024-04-24 05:07:22 UTC

I’ve seen several people fine tune mistral 7B on MacBooks.

jjtheblunt

0 replies

13h28m

2024-04-24 04:57:21 UTC

Isn’t the current Mac Pro available in rack mount form?

https://www.apple.com/mac-pro/

donavanm

0 replies

13h21m

2024-04-24 05:04:00 UTC

Unless they have private Apple silicon based servers for their own training needs?

I'd be SHOCKED if so. It's been 15 years, but I was there when xserve died. Priorities were iphone > other mobile devices >>> laptops > displays & desktops >>> literally anything else. When xserve died we still needed osx for OD & similar. Teams moved on to 3P rack mount trays of mac minis as a stop gap. Any internal support/preference for server style hardware was a lolwut response. Externally I see no reason to suspect that's changed.

zmk5

1 replies

16h2m

2024-04-24 02:23:18 UTC

I believe the MLX examples allow for it. Seems like a general purpose framework rather than a Mac specific one.

gbickford

0 replies

15h41m

2024-04-24 02:44:01 UTC

I couldn't find any training code in the MLX examples.

leodriesch

7 replies

14h29m

2024-04-24 03:56:26 UTC

How does this compare to MLX? As far as I understand MLX is equivalent to PyTorch but optimized for Apple Silicon.

Is this meant for training MLX models in a distributed manner? Or what is its purpose?

dagmx

4 replies

14h25m

2024-04-24 04:00:10 UTC

Just skimming the README it looks like it’s a layer above MLX. So looks like a framework around it to ease ML

ipsum2

3 replies

14h16m

2024-04-24 04:09:36 UTC

It's a layer on top of PyTorch, and it has code to translate PyTorch models into MLX.

Mandelmus

2 replies

12h43m

2024-04-24 05:42:54 UTC

So, is CoreNet the equivalent of Keras, whereas MLX is the Jax/PyTorch equivalent?

ipsum2

0 replies

8h33m

2024-04-24 09:52:24 UTC

Not quite. The closest equivalent would be something like fairseq. It's config (yaml) driven.

hmottestad

0 replies

12h39m

2024-04-24 05:46:39 UTC

Sounds reasonable. Apple writes the following about MLX: "The design of MLX is inspired by frameworks like NumPy, PyTorch, Jax, and ArrayFire."

simonw

0 replies

14h26m

2024-04-24 03:59:15 UTC

It looks like MLX is a part of this initiative. https://github.com/apple/corenet lists "MLX examples" as one of the components being released in April.

reader9274

0 replies

12h41m

2024-04-24 05:44:38 UTC

As mentioned in the "mlx_examples/open_elm": "MLX is an Apple deep learning framework similar in spirit to PyTorch, which is optimized for Apple Silicon based hardware."

miki123211

4 replies

15h49m

2024-04-24 02:36:04 UTC

What's the advantage of using this over something like Huggingface Transformers, possibly with the MPS backend?

pshc

1 replies

15h25m

2024-04-24 03:00:54 UTC

"MLX examples demonstrate how to run CoreNet models efficiently on Apple Silicon. Please find further information in the README.md file within the corresponding example directory."

> mlx_example/clip: ... an example to convert CoreNet's CLIP model implementation to MLX's CLIP example with some customized modification.

  - FP16 Base variant: 60% speedup over PyTorch
  - FP16 Huge variant: 12% speedup

> mlx_example/open_elm: ... an MLX port of OpenELM model trained with CoreNet. MLX is an Apple deep learning framework similar in spirit to PyTorch, which is optimized for Apple Silicon based hardware.

Seems like an advantage is extra speedups thanks to specialization for Apple Silicon. This might be the most power-efficient DNN training framework (for small models) out there. But we won't really know until someone benchmarks it.

HarHarVeryFunny

0 replies

2h32m

2024-04-24 15:53:08 UTC

OpenELM (ELM = Efficient Language Models) has an unfortunate name clash with another LLM-related open source project.

https://github.com/CarperAI/OpenELM (ELM = Evolution through Large Models)

upbeat_general

0 replies

10h10m

2024-04-24 08:15:18 UTC

The implementation seems to be pretty clean and modular here where transformers (and diffusers) isn’t, unless you take their modules standalone.

This repo has a lot of handy utilities but also a bunch of clean implementations of common models, metrics, etc.

In other words, this is more for writing new models rather than inference.

jaimex2

0 replies

14h20m

2024-04-24 04:05:14 UTC

Nothing, it's basically PyTorch with an Apple logo.

ipsum2

3 replies

14h38m

2024-04-24 03:47:00 UTC

It's interesting that Apple also actively develops https://github.com/apple/axlearn, which is a library on top of Jax. Seems like half the ML teams at Apple use PyTorch, and the other half use Jax. Maybe it's split between Google Cloud and AWS?

josephg

1 replies

11h43m

2024-04-24 06:42:55 UTC

In my experience, this is pretty normal in large companies like Apple. Coordination costs are real. Unless there's a good reason to standardize on a single tool, it's usually easier for teams to just pick whichever tool makes the most sense based on the problem they're solving and what the team has experience with.

tomComb

0 replies

6h41m

2024-04-24 11:44:14 UTC

Big companies like Apple yes, but not Apple

te_chris

0 replies

11h5m

2024-04-24 07:20:38 UTC

I don’t know, as I haven’t worked there, but I have always heard Apple described more as a series of companies/startups than one coherent entity like Meta or whatever. Each is allowed a large degree of autonomy from what I’ve heard.

coder543

3 replies

15h35m

2024-04-24 02:50:04 UTC

They also mention in the README:

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data

This is the first I’m hearing of that, and the link seems broken.

simonw

0 replies

14h25m

2024-04-24 04:00:32 UTC

The link should go here I think: https://github.com/apple/corenet/tree/main/projects/catlip

seanvelasco

0 replies

2h59m

2024-04-24 15:26:16 UTC

somewhat related, i came across this, mlx examples for openai clip: https://github.com/ml-explore/mlx-examples/tree/main/clip

curious to know how fast catlip is. the above using openai clip is already fast.

huac

0 replies

15h28m

2024-04-24 02:57:11 UTC

cat's out of the bag, too early?

javcasas

1 replies

8h32m

2024-04-24 09:53:43 UTC

Looks at Apple: CoreNet

Looks at Microsoft: Net Core

My inner trademark troll demands a bucket of popcorn.

pixl97

0 replies

2024-04-24 18:16:42 UTC

Heh, when I saw this post this is the first thing I thought.

benob

1 replies

9h34m

2024-04-24 08:51:11 UTC

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework https://arxiv.org/abs/2404.14619

Apple is pushing for open information on LLM training? World is changing...

tzakrajs

0 replies

8h35m

2024-04-24 09:50:37 UTC

We are all starting to better understand the ethos of their engineering teams more generally.

RivieraKid

1 replies

3h2m

2024-04-24 15:23:41 UTC

What library would you recommend for neural net training and inference on Apple M1? I want to use it from C++ or maybe Rust. The neural net will have 5M params at most.

the_king

0 replies

13m

2024-04-24 18:12:50 UTC

I would use PyTorch as your starting point. Its Metal backend is pretty quick on Apple Silicon, and it's the most widely used library for everyone from hackers to foundation model builders.
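A minimal sketch of selecting PyTorch's Metal (MPS) backend when it is present. The `pick_device` helper and its fallback order are assumptions for illustration; `torch.backends.mps.is_available()` and `torch.cuda.is_available()` are the PyTorch calls it relies on, and the import is guarded so the snippet still runs where PyTorch is not installed.

```python
def pick_device(mps_available, cuda_available):
    """Hypothetical helper: prefer Apple's Metal backend, then CUDA, then CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


try:
    import torch

    device = pick_device(
        torch.backends.mps.is_available(),
        torch.cuda.is_available(),
    )
except (ImportError, AttributeError):
    # PyTorch is missing, or too old to expose the MPS backend: fall back to CPU.
    device = pick_device(False, False)

print(f"selected device: {device}")
```

With PyTorch installed, the resulting string can be passed to `torch.device(device)` and then to `.to(...)` on models and tensors.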

orena

0 replies

3h58m

2024-04-24 14:27:38 UTC

The style is not very different from NeMo (Nvidia), fairseq (Facebook), ESPnet (OSS), etc.

mxwsn

0 replies

16h14m

2024-04-24 02:11:52 UTC

Built on top of pytorch.

m3kw9

0 replies

3h28m

2024-04-24 14:57:12 UTC

Ok, why would anyone use this when you have industry standard methods already?

jn2clark

0 replies

9h37m

2024-04-24 08:48:00 UTC

I would love an LLM agent that could generate small api examples (reliably) from a repo like this for the various different models and ways to use them.

gnabgib

0 replies

16h57m

2024-04-24 01:28:52 UTC

h1: CoreNet: A library for training deep neural networks

andreygrehov

0 replies

16h16m

2024-04-24 02:09:30 UTC

What hardware would one need for CoreNet to train efficiently?