
Launch HN: Sorcerer (YC S24) – Weather balloons that collect more data

jviotti
22 replies
1d

Very cool! How are the balloons transferring telemetry back to earth for analysis, etc?

Asking because my research at the University of Oxford was around hyper space-efficient data transfer from remote locations for a fraction of the price.

The result was an award-winning technology (https://jsonbinpack.sourcemeta.com) to serialise plain JSON that was proven to be more space-efficient than every tested alternative (including Protocol Buffers, Apache Avro, ASN.1, etc) in every tested case (https://arxiv.org/abs/2211.12799).

If it's interesting, I'd love to connect and discuss (jv@jviotti.com) how at least the open-source offering could help.

f549abd01
5 replies
22h44m

Sounds cool. How does it differ from CBOR?

jviotti
4 replies
21h28m

CBOR is a schema-less binary format. JSON BinPack supports both schema-less (like CBOR) and schema-driven (with JSON Schema) modes. Even on the schema-less mode, JSON BinPack is more space-efficient than CBOR. See https://benchmark.sourcemeta.com for a live benchmark and https://arxiv.org/abs/2211.12799 for a more detailed academic benchmark

f549abd01
3 replies
8h30m

Thanks for linking the benchmarks. I appreciate the work on shaving additional bytes especially in cases where every byte matters. Real savings seem to be in the schema-driven mode. Comparing a "realistic", schemaless payload for a general storage use-case (eg. the config examples), it looks pretty even with CBOR. E: my bad, BinPack is getting more efficient with larger payloads https://benchmark.sourcemeta.com/#jsonresume

galangalalgol
1 replies
7h37m

As a note, while cbor is schemaless, there do exist tools to make it work with schemas. In rust cborium will generate rust types from a json schema that serde can use.

jviotti
0 replies
4h8m

I never used cborium, but if I'm understanding it correctly, I think it adds types at the language deserialisation stage and not over the wire. Which means that it makes it a lot more ergonomic to use within Rust, but doesn't use typings for space-efficiency over the wire.

jviotti
0 replies
4h5m

Exactly! The real hardcore savings will always be when you pass a schema, as JSON BinPack uses that to derive smarter encoding rules.

However, schema-less is very useful too. The idea with supporting both is that clients can start with schema-less, without bothering with schemas, and already get some space-efficiency. But once they are using JSON BinPack, they can start incrementally adding schema information on the messages they care about, etc to start squeezing out more performance.

Compare that with e.g. Protocol Buffers, which pretty much forces you to use schemas from the beginning, and that can be a bit of a barrier for some projects, mainly at the start.
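To make the schema-less vs schema-driven difference concrete, here is a minimal Python sketch. It is not JSON BinPack's wire format or API. The toy encoders `encode_schemaless` and `encode_with_schema`, the `SCHEMA` list, and the sample reading are all hypothetical and only illustrate the general idea: when both sides agree on field names, order, and ranges in advance, none of that has to travel with each message.

```python
import json
import struct

# Hypothetical telemetry reading; the field names are made up for illustration.
reading = {"altitude_m": 18250, "temp_c": -56, "humidity_pct": 3}

def encode_schemaless(doc):
    """Toy schema-less encoding: every key travels with its value."""
    out = bytearray()
    for key, value in doc.items():
        name = key.encode("utf-8")
        out += struct.pack("<B", len(name)) + name  # 1-byte length + key bytes
        out += struct.pack("<i", value)             # 4-byte signed integer
    return bytes(out)

# Toy "schema" agreed in advance: field order plus the narrowest integer type
# that covers each field's expected range (H = uint16, b = int8, B = uint8).
SCHEMA = [("altitude_m", "H"), ("temp_c", "b"), ("humidity_pct", "B")]

def encode_with_schema(doc, schema):
    """Toy schema-driven encoding: only the values are sent, in schema order."""
    fmt = "<" + "".join(code for _, code in schema)
    return struct.pack(fmt, *(doc[name] for name, _ in schema))

def decode_with_schema(data, schema):
    """Rebuild the document using the same schema on the receiving side."""
    fmt = "<" + "".join(code for _, code in schema)
    values = struct.unpack(fmt, data)
    return {name: value for (name, _), value in zip(schema, values)}

json_size = len(json.dumps(reading, separators=(",", ":")).encode("utf-8"))
schemaless_size = len(encode_schemaless(reading))
schema_size = len(encode_with_schema(reading, SCHEMA))
print(json_size, schemaless_size, schema_size)
```

The gap between the last two sizes comes entirely from information the schema already carries, which is why adding schemas incrementally can pay off message by message.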

leeoniya
4 replies
23h26m

JSON BinPack is space-efficient, but what about runtime-efficiency?

When transmitting data over the Internet, time is the bottleneck, making computation essentially free in comparison.

i thought this was an odd sales pitch from the jsonbinpack site, given that a central use-case is IoT, which frequently runs on batteries or power-constrained environments where there's no such thing as "essentially free"

ok_dad
1 replies
20h29m

batteries or power-constrained environments

I would imagine that CPUs are much more efficient than a satellite transmitter, probably? I guess you'd have to balance the additional computational energy required vs. the savings in energy from less transmitting.

jviotti
0 replies
3h34m

Yeah, it all depends very much, given how huge the "embedded/IoT" spectrum is. Each use case has its own unique constraints, which makes it very hard to give general advice.

jviotti
0 replies
23h11m

Fair point! "Embedded" and "IoT" are overloaded terms. For example, you find "IoT" devices all the way from extremely low powered micro-controllers to Linux-based ones with plenty of power and they are all considered "embedded". I'll take notes to improve the wording.

That said, the production-ready implementation of JSON BinPack is designed to run on low powered devices and still provide those same benefits.

A lot of the current work is happening at https://github.com/sourcemeta/jsontoolkit, a dependency of JSON BinPack that implements a state-of-the-art JSON Schema compiler (I'm a TSC member of JSON Schema btw) to do fast and efficient schema evaluation within JSON BinPack on low powered devices compared to the current prototype (which requires schema evaluation for resolving logical schema operators). Just an example of the complex runtime-efficiency tracks we are pursuing.

freeone3000
0 replies
1h42m

For sure, but radio transmitter time is almost always much more expensive than CPU time! It’s 4mA-20mA vs 180mA on an esp32; having the radio on is a 160mA load! As long as every seven milliseconds compressing saves a millisecond of transmission, your compression algorithm comes out ahead.
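The break-even arithmetic above can be sketched in a few lines of Python. The currents are the ones quoted in this comment (20 mA for a busy CPU, 160 mA of extra load with the radio on). The function names are hypothetical, and the sketch assumes both loads run from the same supply voltage, so comparing charge (mA x ms) is equivalent to comparing energy.

```python
def net_charge_saved(compress_ms, tx_saved_ms, cpu_ma=20.0, radio_extra_ma=160.0):
    """Charge saved in mA*ms by compressing; a positive result means compression wins."""
    spent_on_cpu = cpu_ma * compress_ms
    saved_on_radio = radio_extra_ma * tx_saved_ms
    return saved_on_radio - spent_on_cpu

def break_even_ratio(cpu_ma=20.0, radio_extra_ma=160.0):
    """Milliseconds of CPU time that cost the same as one millisecond of radio time."""
    return radio_extra_ma / cpu_ma

print(net_charge_saved(7, 1))   # 7 ms of compression to save 1 ms of airtime
print(break_even_ratio())       # CPU ms per radio ms at break-even
```

At the low end of the quoted CPU range (4 mA) the break-even ratio is larger still, so the seven-to-one figure is the conservative case.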

lajr
3 replies
22h13m

This looks promising! One of the important aspects of protocol buffers, avro etc is how they deal with evolving schemas and backwards/forward compatibility. I don't see anything in the docs addressing that. Is it possible for old services to handle new payloads / new services to handle old payloads or do senders and receivers need to be rewritten each time the schema changes?

michaelmior
1 replies
17h39m

A lot of people already think about this problem with respect to API compatibility for REST services using the OpenAPI spec for example. It's possible to have a JSON Schema which is backwards compatible with previous versions. I'm not sure how backwards-compatible the resulting JSON BinPack schemas are however.

jviotti
0 replies
4h4m

Great seeing you over here Michael :) For other people reading this thread, Michael and I are collaborating on a paper covering the schema compiler I've been working on for JSON BinPack. Funny coincidence!

jviotti
0 replies
21h25m

Good question! Compared to Protocol Buffers and Apache Avro, that each have their own specialised schema languages created by them, for them, JSON BinPack taps into the popular and industry-standard JSON Schema language.

That means that you can use any tooling/approach from the wide JSON Schema ecosystem to manage schema evolution. A popular one from the decentralised systems world is Cambria (https://www.inkandswitch.com/cambria/).

That said, I do recognise that schema evolution tech in the JSON Schema world is not as great as it should be. I'm a TSC member of JSON Schema and a few of us are definitely thinking hard on this problem too and trying to make it even better than the competition.

promiseofbeans
1 replies
16h33m

Do you have any info on how your system stacks up to msgpack? (https://msgpack.org/index.html)

Asking because we use msgpack in production at work and it can sometimes be a bit slower to encode/decode than is ideal when dealing with real-time data.

jviotti
0 replies
4h9m

We do! See https://benchmark.sourcemeta.com for a live benchmark and https://arxiv.org/abs/2211.12799 for a more detailed academic benchmark.

The TLDR is that if you use JSON BinPack in schema-less mode, it's still more space-efficient than MessagePack, but not by a huge margin (depends on the type of data of course). But if you start passing a JSON Schema along with your data, the results become way smaller.

Please reach out to jv@jviotti.com. I would love to discuss your use case more.

kyrofa
1 replies
1d

From the OP:

Our payload uses a satellite transceiver for communications

jviotti
0 replies
1d

That's the hardware. I meant on the software side through the transceiver. If you transfer fewer bits through the satellite transceiver, I believe you can probably reduce costs.

tndl
0 replies
1d

Let's definitely talk, we're using protobufs right now. I'll send an email

the__alchemist
0 replies
2h56m

Why this over a compact, data-specific format? JSON feels like an unnecessary limitation for this company's use case. I am having a hard time believing it is more space-efficient than a purpose-built format.

jviotti
0 replies
3h57m

It surprised me how popular this message got. I love nerding out about binary serialization and space-efficiency and great to see I'm not the only one :)

If you want to get deeper, I published two (publicly available) deep papers studying the current state of JSON-compatible binary serialization that you might enjoy. They study in a lot of detail technologies like Protocol Buffers, CBOR, MessagePack, and others that were mentioned in the thread:

- https://arxiv.org/abs/2201.02089

- https://arxiv.org/abs/2201.03051

Hope they are useful!

pagade
8 replies
1d

To a layperson like me, could you explain how these balloons will be cleaned up / collected after their life? What material are they made up of?

tndl
6 replies
1d

Sure thing! They're made of about 300 grams of polyethylene. Towards the end of their lifespan, we can steer them to an area that's easy for us to drive out and pick them up. The payload has a GPS, which lets us track where they are both in the sky and on the ground.

Right now, most weather balloons fall back to Earth and stay where they land unless someone happens across them (since they can't be controlled and only last a couple of hours).

DAlperin
4 replies
1d

we can steer them to an area that's easy for us to drive out and pick them up.

What does this look like in practice? As you mentioned I know you don't really have any lateral control, but I imagine you can wait for it to overfly somewhere convenient to descend?

shagie
2 replies
23h37m

I believe it is along the line of...

Pull up https://www.pivotalweather.com/model.php?m=nam&p=sfct-mean-i... and pick some point (note the 'click for point sounding'). You can see the wind direction at that location as a function of altitude.

Using this as a vector field, you can do "the balloon is here now, 30 minutes from now it will be there, if it is at altitude Z at that time, it will follow the wind in this direction" which in turn allows you to predict where it will be in 30 minutes and take the forecast for that location at that time and determine what altitude you want to be at.

Saying I want it to be at X,Y at some time is solving this backwards. Which isn't necessarily easy, but it's computable.
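A minimal Python sketch of that stepping idea follows. Everything in it is an assumption for illustration: the `WINDS` table is made up and constant (a real forecast varies with position and time), positions are flat x/y metres, and the greedy "pick the layer that ends closest to the target" rule is just one simple way to solve the problem forwards. It is not how any particular company does it.

```python
import math

# Hypothetical wind table: altitude in metres -> (eastward, northward) wind in m/s.
WINDS = {
    12000: (10.0, 0.0),
    15000: (0.0, 8.0),
    18000: (-6.0, 0.0),
    21000: (0.0, -5.0),
}

def drift(position, altitude, dt_s):
    """Where the balloon ends up after dt_s seconds of drifting at this altitude."""
    u, v = WINDS[altitude]
    x, y = position
    return (x + u * dt_s, y + v * dt_s)

def best_altitude(position, target, dt_s):
    """Greedy choice: the layer whose wind leaves us closest to the target."""
    def distance_after(altitude):
        x, y = drift(position, altitude, dt_s)
        return math.hypot(target[0] - x, target[1] - y)
    return min(WINDS, key=distance_after)

def plan(start, target, steps, dt_s=1800):
    """Repeat the 30-minute look-ahead, returning (altitude, position) per step."""
    position = start
    path = []
    for _ in range(steps):
        altitude = best_altitude(position, target, dt_s)
        position = drift(position, altitude, dt_s)
        path.append((altitude, position))
    return path

for altitude, position in plan((0.0, 0.0), (18000.0, 14400.0), steps=2):
    print(altitude, position)
```

A greedy step can get stuck when no single layer points toward the target, which is where searching over whole altitude sequences (the "solving this backwards" part) comes in.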

tndl
1 replies
23h28m

Pretty much this. We add the data the balloons themselves are collecting to make things more precise as well

shagie
0 replies
20h49m

Digging into it a little bit more...

The Balloon Learning Environment https://research.google/blog/the-balloon-learning-environmen... (https://news.ycombinator.com/item?id=31155137 - 73 points | 10 comments)

(2016) Station-keeping of a high-altitude balloon with electric propulsion and wireless power transmission: A concept study https://www.sciencedirect.com/science/article/abs/pii/S00945...

(2022) Station-keeping for high-altitude balloon with reinforcement learning - https://www.sciencedirect.com/science/article/abs/pii/S02731...

(2023) Resource-Constrained Station-Keeping for Helium Balloons using Reinforcement Learning - https://arxiv.org/abs/2303.01173

Chasing the citations from those papers to previous works can provide a fairly deep rabbit hole of things to read.

ImPostingOnHN
0 replies
23h34m

Due to the rotation of the earth, wind current direction rotates based on altitude. If you want to go in a particular direction, you ascend or descend to an altitude that has winds blowing in that direction.

At least, that's how I understand hot air balloons "steer".

mariushn
0 replies
21h5m

How do you control the altitude? I would imagine 'heat/cool the air inside the balloon', but this would be too energy intensive?

Congratulations for a great non-saas market and product!

dev2point0
8 replies
1d

This is awesome. How do you manage climbing and descending with a balloon? Are you compressing the gas on board or using thermals?

tndl
3 replies
1d

How we make it go up and down is the secret sauce :) I'm a hangglider guy, so I'd love to be using thermals, but I can say that's not how we do it right now

scottyah
1 replies
17h49m

Do you plan to sell to hobbyist consumers? As another thermal-rider I realize this would be amazing if it could do short term soundings at a launch site, or to have a fleet that can navigate back to a designated site for pickup and redeployment.

I'm picturing having a few dozen at launch site containers, launching them at the start of a day of flying, having them programmed to land in a rural area that a member can pick them up from and return to the launch sites.

tndl
0 replies
17h37m

This might be the coolest use case I’ve heard suggested. We’re not quite at the level where all of this is user friendly enough, but let’s catch up in 9-12 months

dev2point0
0 replies
21h54m

That’s what I figured :) Being able to control it with a one pound payload is very impressive.

mdorazio
3 replies
22h37m

I’d love to know, too. If it was me I’d use a small piezo plate on the side of the balloon to do it the same way hot air balloons do. Heat the gas in the balloon = go up, reverse the polarity to cool it = go down. That would be pretty energy inefficient, though, so hopefully their secret method is better.

Loon used pumps and an interior air ballast like blimps do. So clearly there are a few ways.

Ecco
2 replies
15h32m

Did you mean Peltier?

mdorazio
0 replies
8h58m

Ah yes, got my P plates mixed up. Thanks.

aflukasz
8 replies
22h51m

In 1981, weather disasters caused $3.5 billion in damages in the United States. In 2023, that number was $94.9 billion (https://www.ncei.noaa.gov/access/billions/time-series).

Does that surprise anyone? I would not have guessed growth on such a scale. The chart suggests that severe storms are the main culprit.

mdorazio
3 replies
22h45m

Not at all. Look at the growth in human buildings in the most at-risk areas and you’ll see why that number is so big now. It’s only slightly due to an increase in severe weather event frequency / severity.

mrandish
2 replies
21h45m

Indeed, and not just building more in more at-risk places but also the cost of building materials, construction labor and code compliance requirements have all generally increased more than baseline inflation. Factors like these tend to greatly increase recent estimates vs historical.

I read a paper a few years back which dove into how the data sources for weather damage assessment have changed a lot over the years. Much of the increase is due to more complete reporting and changes in categorization. Also, nowadays more things are insured and modern IT has made gathering the insurance reporting far more exhaustive. Plus local, state and federal agencies responsible for relief and/or recovery are gathering and reporting increasing amounts of data with each decade since the 70s (in part because their budgets rely on it). Factors like these mean in prior decades the total damage costs may have been more similar to today's than they appear but a lot of the damage data we gather and report now wasn't counted or gathered then.

Although I have no experience related to weather science, I remember the paper because it made me realize how many broad-based, multi-decadal historical data comparisons we see should have sizable error bars (which never make it into the headline and rarely even into the article). Data sources, gathering and reporting methods and motivations are rarely constant on long time scales - especially since the era of modern computing. Of course, good data scientists try to adjust for known variances but in a big ecosystem with so many evolving sources, systems, entities and agencies, it quickly gets wickedly complex.

tndl
1 replies
20h20m

Factors like these mean in prior decades the total damage costs may have been more similar to today's than they appear but a lot of the damage data we gather and report now wasn't counted or gathered then

This is definitely part of it. Another part is that people live in more at-risk regions now than in the past (Florida is a great example, population has more than 10x'd since 1950).

Ultimately, the way we think about it is no matter what the underlying cause, weather-related damages could be significantly reduced with better data/forecasts

mrandish
0 replies
18h36m

weather-related damages could be significantly reduced with better data/forecasts

I agree. My point was only so those surprised by the massive increase in cost estimates can put those numbers in perspective since neither the average quantity nor severity of adverse weather events have changed substantially over the decades.

Providing more granular data to enable more accurate and timely weather forecasts is a sound business thesis even if adverse weather isn't happening 2x more frequently or energetically. It's still a large economic impact where money can be saved. More broadly, better forecasts can improve agricultural yields, reduce business disruption and increase throughput of transportation networks.

agurk
1 replies
8h37m

One detail here is that 1981 dollars aren't 2023 dollars, so to compare they need to be adjusted.

Using [0] $3.5 bn in 1981 would have been worth $11.7 bn in 2023.

Another comment [1] noted (but unfortunately didn't cite) that two years later the damage was assessed at $36 bn, or $110 bn in 2023 dollars.

[0] https://www.usinflationcalculator.com/

[1] https://news.ycombinator.com/item?id=41295116
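For reference, the adjustment described here is just a ratio of price-index values. The sketch below uses approximate annual-average CPI-U figures, which are assumptions to be checked against official BLS data. The `adjust` helper is hypothetical. Whether the NOAA series needs this adjustment at all is disputed in the reply that follows.

```python
# Approximate annual-average CPI-U index values (1982-84 = 100).
# Assumed for illustration; verify against official BLS data before relying on them.
CPI_U = {1981: 90.9, 1983: 99.6, 2023: 304.7}

def adjust(amount, from_year, to_year, cpi=CPI_U):
    """Convert a nominal dollar amount from one year's dollars to another's."""
    return amount * cpi[to_year] / cpi[from_year]

print(round(adjust(3.5, 1981, 2023), 1))  # $3.5 bn of 1981 dollars, in 2023 dollars
print(round(adjust(36, 1983, 2023)))      # $36 bn of 1983 dollars, in 2023 dollars
```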

freestyle24147
0 replies
4h42m

No, they don't need to be adjusted. The linked website has already adjusted for CPI. There's even an option to turn on/off adjusting, and it's on by default. I didn't cite because this is using the same data / website as the original claim.

tndl
0 replies
22h35m

Yeah, it's a shocking number, and it's just for the US. The global estimates for severe weather are even higher [0], and in places with less infrastructure, the costs are usually more heavily weighted toward human life lost.

Obviously what we're doing can't prevent severe weather from happening, but even very small improvements in accuracy and timelines can have a massive beneficial effect when a disaster does happen. My cofounders and I are all from Florida, so hurricanes are the most visceral examples for us. When hurricanes hit, there are always issues along the lines of "we didn't have the right resources in the right places to respond effectively." Those types of issues can be combated with better info.

[0]: https://www.statista.com/statistics/818411/weather-catastrop...

freestyle24147
0 replies
19h59m

Definitely a bit of cherry picking. Just 2 years later in 1983 the damages were $36 billion, but that wouldn't make quite as scary of a statement for the website.

simjnd
7 replies
1d

These conditions make the stratosphere a very difficult place to deploy to prod

This sentence is legendary

tndl
3 replies
23h53m

Every once in a while things go spectacularly wrong up there and we kick ourselves for not doing a b2b saas

darknavi
2 replies
23h34m

b2b

You are, as long as you mean balloons to business

tndl
1 replies
22h34m

XD

ahazred8ta
0 replies
21h40m

You're totally crushing the balloons-to-boondocks market segment.

freestyle24147
2 replies
20h3m

[flagged]

OccamsMirror
1 replies
14h14m

You didn't have to contribute this snark.

freestyle24147
0 replies
4h44m

And we don't constantly have to label mundane things as legendary

lormayna
5 replies
1d

As someone that is involved in catching radiosondes, this is quite cool!

tndl
4 replies
1d

Thanks! Let me know if you want to chase ours sometime.

lormayna
3 replies
23h14m

Yes, this would be really interesting! Are they using VHF frequencies as the actual radiosondes (Vaisala, etc.)?

tndl
2 replies
23h8m

We do all comms via satellite, including the current GPS location, no comms over VHF. I'm curious how you're tracking down a traditional radiosonde though, do you actively chase while it's in the air and then use visual reference as it comes down?

sciurus
0 replies
21h34m

You can use the data collected from hobbyist ground stations and displayed at https://sondehub.org/ to track them while in the air.

lormayna
0 replies
22h49m

I am not really into tracking radiosondes while they are in the air. What I do is like a radio "fox-hunting" when they come back to the surface, using a directional antenna.

johnsillings
5 replies
1d2h

This is one of the most fascinating Launch HNs in a while. Excited to follow your progress and congrats on the launch!

faitswulff
2 replies
1d

It’s a literal launch!

tndl
1 replies
1d

Being a balloon company means we get to launch pretty much every day, which is very fun :)

ahazred8ta
0 replies
21h49m

My coworker used to fly the chase plane for the Canadian Space Agency's balloons; they would call position and altitude for air traffic control and recover the instrument gondola. Lots of bushwhacking; one came down on an eagle's nest and mama wouldn't let them near the tree.

tndl
0 replies
1d2h

Thanks :)

davidw
0 replies
18h11m

Yeah, I don't have anything of substance to say other than it's really cool to see someone doing something innovative in a niche I'd never really thought of.

DoctorOetker
4 replies
4h59m

Is there a reason weather balloons can't be designed to stay aloft for much longer? Is it really impossible to miniaturize gas separators in weight to replenish the lighter fraction of molecules to compensate for the slow leakage?

worldvoyageur
2 replies
4h4m

In reading the documents google's Loon made available before shutting down (and saying that hydrogen airships were the way to go) [https://storage.googleapis.com/x-prod.appspot.com/files/The%...], the bottleneck on staying up longer was helium loss. Lots of things, mechanical failure etc, caused flights to end early. But the ultimate limit seems to be either helium loss or the impact of helium absorption on seals etc causing failure. Helium is a very small molecule, so very hard to keep confined.

However, as they learned from experience, Loon was slowly increasing their upper limit on how long their balloons could stay up.

DoctorOetker
1 replies
3h24m

Sure there will be leakage.

Let's look at the composition vs height:

https://en.wikipedia.org/wiki/Atmosphere_of_Earth#/media/Fil...

I understand helium will be sparse at low heights, but at each height there is a diversity of species, among which will be a lighter one. Could a balloon be oversized so that instead of using helium, at each height an 80% fill of the locally (that height) lightest gas could keep the balloon afloat. Oversized for this but also for the added weight for separating equipment for the wanted gas and associated panels to power it.

I.e. what prevents a balloon from flying indefinitely? At least up until radiation damage of the balloon membrane...

Have such attempts been made and what were the lessons?

worldvoyageur
0 replies
1h54m

These guys [https://www.scientificballoonsolutions.com/news/], as of October 2018, claim the world record. [edit: the site linked says Alan Adamson but an upstream post mentions Lee Meadows. That thread also credits Bill Brown as a pioneer - https://www.stratoballooning.org/membership#!biz/id/5f4d7b97...]

The world record balloon was launched on September 21, 2016, stayed up over 767 days, circled the world 35 times and traveled over 1.5 million kilometers.

Their website, still from 2018, indicates they are working on even better designs.

tndl
0 replies
3h17m

One of the first ideas we explored was putting an electrolyzer on the balloon to replenish hydrogen over time. Unfortunately right now, for balloons our size and power budget it's just not feasible. And actually, we can get a pretty low leakage rate with our materials which lets us stay aloft for a really long time, but eventually the UV degradation becomes too extreme.

xd1936
3 replies
1d

Very cool, congrats on the launch. You're spending nearly all of your time in the stratosphere collecting data, but what correlation does that have to ground forecasts? Are your "AI models" that you're producing forecasting stratosphere conditions, or more than that?

tndl
2 replies
1d

We cruise in the stratosphere until we gather enough solar power to descend to ground level. On the descent and ascent we're collecting soundings, which is the data that's useful for all types of forecasts.

Right now we're focused on the stratospheric forecasts because that's what we know really well (and we already have some interested customers). Our data/models are great for all kinds of forecasts, including ground forecasts, and we'll quickly expand beyond the stratosphere.

OccamsMirror
1 replies
14h10m

Are you pairing your data with satellite observations?

avecchi-sorc
0 replies
11h24m

Yes, however, like with traditional forecasts, we weight our balloon observations much more heavily.

tomnicholas1
3 replies
22h17m

So the equivalent of these balloons in oceanography are called ARGO floats, which similarly cannot be driven laterally but can control their own depth like a submarine. So far millions of timeseries have been collected across the world ocean using these floats.

https://argo.ucsd.edu/

One difference though is that the ARGO floats are unfortunately not recycled, and just wash up on various beaches. (I'm curious whether you think you can realistically collect many of these mini balloons?)

If you do want to control the lateral position of fleets of sensors, oceanographers also now have "gliders", which are basically small powered drone submarines. These are used by a few groups, but most of the gliders in the world are operated by the US Navy, who launch them out of torpedo tubes to survey local ocean conditions (which is badass).

https://oceanservice.noaa.gov/facts/ocean-gliders.html

The recorded measurements present an interesting data assimilation challenge - they record data along 3D trajectories (4D including time), sampling jagged and twisting lines through the 4D space. But we normally prefer to think of weather/ocean data as gridded, so you need to interpolate the trajectory data onto the grid, whilst keeping the result physically-consistent. Oceanographers use systems like ECCO for ocean state estimation, which effectively find the "ocean of best fit" to various data sources.

https://www.ecco-group.org/

Interestingly ECCO uses an auto-differentiable form of the governing equations for the ocean flow to ensure that updates stay physically consistent. This works by using a differentiable ocean fluid model called [MITgcm](https://github.com/MITgcm/MITgcm) to perform runs which match experimental data as closely as possible, and minimizing a loss function through gradient descent. The gradient is of a loss function (error) with respect to model input parameters + forcings, which is calculated by running MITgcm in adjoint mode - i.e. automatic differentiation. Therefore this approach is sort of ML before it was cool (they were doing all this well before the new batch of AI weather models). See slides 9-18 of this deck for a nice explanation:

https://firebasestorage.googleapis.com/v0/b/firescript-577a2...
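As a toy illustration of the adjoint idea (not ECCO or MITgcm code), the Python sketch below fits a constant forcing `f` in a made-up scalar model `x[t+1] = a*x[t] + f` to observations by gradient descent. The gradient comes from a hand-written reverse sweep rather than an automatic-differentiation tool, and the model, names, learning rate, and iteration count are all assumptions chosen so this small problem converges.

```python
def forward(x0, a, f, steps):
    """Run the toy model x[t+1] = a*x[t] + f and return the states x[1..steps]."""
    states = []
    x = x0
    for _ in range(steps):
        x = a * x + f
        states.append(x)
    return states

def loss(states, obs):
    """Half the sum of squared model-minus-observation misfits."""
    return 0.5 * sum((x - y) ** 2 for x, y in zip(states, obs))

def grad_forcing_adjoint(states, obs, a):
    """dJ/df from one reverse (adjoint) sweep over the stored trajectory."""
    lam = 0.0   # adjoint variable: sensitivity of the loss to the current state
    grad = 0.0
    for x, y in zip(reversed(states), reversed(obs)):
        lam = (x - y) + a * lam  # local misfit plus sensitivity passed back in time
        grad += lam              # every state depends directly on f with weight 1
    return grad

def fit_forcing(x0, a, obs, f=0.0, lr=0.01, iters=200):
    """Gradient descent on the forcing; lr is tuned for this toy problem only."""
    for _ in range(iters):
        states = forward(x0, a, f, len(obs))
        f -= lr * grad_forcing_adjoint(states, obs, a)
    return f

# Synthetic "observations" generated with a known forcing of 0.5.
observations = forward(1.0, 0.9, 0.5, 5)
print(fit_forcing(1.0, 0.9, observations))
```

One forward run plus one reverse sweep yields the full gradient no matter how many control parameters there are, which is what makes this style of fitting practical for large models.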

The trajectory data is also interesting because it's sort of tabular, but also you often want to query it in an array-like 4D space. You could also call it a "ragged" array. We have nice open-source tools for gridded (non-ragged) arrays (e.g. xarray and zarr, and the pangeo.io project) but I think we could provide scientists with better tools for trajectory-like data in general. If that seems relevant to you I would love to chat.
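To show what "ragged" means here, a minimal Python sketch: variable-length trajectories stored as one flat list plus row offsets, with a brute-force 4D box query on top. The class and method names are hypothetical and this is not how xarray or zarr represent data. A real tool would add indexing so the query does not scan every sample.

```python
class RaggedTrajectories:
    """Variable-length trajectories stored as one flat list plus row offsets."""

    def __init__(self):
        self.samples = []   # flat list of (time, x, y, z, value) tuples
        self.offsets = [0]  # trajectory i occupies samples[offsets[i]:offsets[i + 1]]

    def append(self, trajectory):
        """Add one trajectory (a list of samples of any length)."""
        self.samples.extend(trajectory)
        self.offsets.append(len(self.samples))

    def __len__(self):
        return len(self.offsets) - 1

    def trajectory(self, i):
        """Table-style access: all samples of trajectory i, in order."""
        return self.samples[self.offsets[i]:self.offsets[i + 1]]

    def in_box(self, lo, hi):
        """Array-style access: samples whose (time, x, y, z) lie within [lo, hi]."""
        return [s for s in self.samples
                if all(l <= c <= h for l, c, h in zip(lo, s[:4], hi))]

store = RaggedTrajectories()
store.append([(0, 0.0, 0.0, 10.0, 4.1), (1, 0.5, 0.1, 20.0, 4.0), (2, 1.0, 0.2, 30.0, 3.8)])
store.append([(1, 5.0, 5.0, 15.0, 6.2)])
print(len(store), store.trajectory(1), store.in_box((1, 0, 0, 0), (2, 2, 2, 25)))
```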

P.S: Sorcerer seems awesome, and I applaud you for working on something hard-tech & climate-tech!

tndl
2 replies
21h52m

This is super interesting, I'd never come across ARGO before. Data assimilation is a similar problem for our data, and there currently exist systems for assimilating weather balloon observations into gridded reanalysis data (https://www2.mmm.ucar.edu/wrf/users/). One thing we believe, however, is that the reanalysis step in weather forecasting is unnecessary in the long term, and that future (ML) weather models will eventually opt to generate predictions based on un-assimilated raw data and will get better results in doing so.

That being said, trajectory-based data tooling could be super interesting to us. Let's definitely chat: austin@sorcerer.earth

And re: recovery, we're pretty confident we'll be able to recover the majority of our systems. Being in the air has the advantage that we can choose to 'beach' ourselves in a specific location, rather than the first place we run across land like with the buoys. At his previous company, Alex wrote a prediction engine able to get similar balloon systems to land in a predicted 1kmx1km zone for recovery

counters
1 replies
21h13m

One thing we believe, however, is that the reanalysis step in weather forecasting is unnecessary in the long term, and that future (ML) weather models will eventually opt to generate predictions based on un-assimilated raw data and will get better results in doing so.

The idea that we'll be able to run ML weather models using "raw" observations and skip or implicitly incorporate an assimilation is spot-on - there's been an enormous shift in the AI-weather community over the past year to acknowledge that this is coming, and very soon.

But... in your launch announcement you seem to imply that you're already using your data for building and running these types of models. Can you clarify how you're actually going to be using your data over the next 12-24 months while this next-generation AI approach matures? Are you just doing traditional assimilation with NWP?

Also, to the point about reanalysis - that's almost certainly not correct. There are massive avenues of scientific research which rely on a fully-assimilated and reconciled, corrected, consistent analysis of atmospheric conditions. AI models in the form of foundation models or embeddings might provide new pathways to build reanalysis products, but they are a vital and critical tool and will likely be so for the foreseeable future.

avecchi-sorc
0 replies
19h57m

There are massive avenues of scientific research which rely on a fully-assimilated and reconciled, corrected, consistent analysis of atmospheric conditions.

That’s a good point! In fact, the outputs for observation based foundational models will likely include a "reanalysis-like" step for the final output.

Regarding the next 6-12 months, we will be integrating our data with traditional NWP models and utilizing AI for forecasting. We've developed a compact AI model that can directly assimilate our "ground truth" data with reanalysis, specifically for use in AI forecasting models.

Once we have hundreds of systems deployed, we'll use the collected observations, combined with historical publicly available data, to train a foundational model that will directly predict specific variables based on raw observations.

popctrl
3 replies
22h51m

This sounds like super interesting and meaningful work. Are you hiring, or do you have any advice for your average software engineer on getting into this space?

tndl
1 replies
22h28m

We're not hiring right now, but definitely check back in a few months. As for advice, there's almost always a place for talented engineers. https://www.climatetechlist.com/ and https://jobs.climatebase.org/ both aggregate jobs at climatetech cos specifically if that's what you want to pursue.

popctrl
0 replies
22h7m

Thank you, this is just what I was looking for!

avecchi-sorc
0 replies
14h52m

I was in the same boat. I've been a software engineer for as long as I can remember and always wanted to do more than just build B2B SaaS.

Max, the first engineer at Urban Sky, hit me up and asked if I wanted to build their mission control. At the time, Urban Sky was just a four-person team, so they couldn’t pay me as much, but I jumped at the chance, even though it meant taking about half my usual salary.

Funny enough, my SaaS background actually helped me create mission control software that was way ahead of the curve!

I guess my advice is, find a small company you're passionate about, where you can make a big impact, and be open to taking a pay cut. It helps the company take less of a risk on you, and you get to work on something that really matters. Plus, when you’re solving real problems, things tend to work out, and eventually, you’ll end up making what you should in salary.

justinl33
3 replies
20h9m

as someone who's worked on a project involving weather data for agriculture over developing regions, I can't explain how bad (sparse and inconsistent) the data is. Can I ask how you handled the regulatory aspect of launching from urban areas in SF? I can imagine the FAA would have given you some trouble?

tndl
1 replies
19h42m

What's the project? Maybe we can help!

Can I ask how you handled the regulatory aspect of launching from urban areas in SF? I can imagine the FAA would have given you some trouble?

We actually fall under a weather balloon exemption to the normal FAA rules for unmanned balloon flights. Here’s a quick rundown of the relevant rules and regulations (Part 101.1) for weather balloon flights in the U.S. Our balloons fully comply with all of these:

- Any on-board cellular tracking devices must be set to Airplane mode before takeoff (we don't have any cellular)

- Each individual payload box/package must weigh less than 6 pounds (ours is <250g)

- If a payload has a weight-to-size ratio exceeding 3.0 ounces per square inch, it must weigh less than 4 pounds.

  - Calculation: Divide the total payload weight in ounces by the area of its smallest face in square inches.

- If multiple payloads are carried by a single balloon, their combined weight must be under 12 pounds.

- The string connecting the payload to the balloon must break under an impact force of no more than 50 pounds (our string is 30g)

- It’s prohibited to design or operate an unmanned free balloon in a way that poses a hazard to people or property.

- Dropping objects from the balloon that could endanger people or property is not allowed.

We don't have to notify the FAA about our operations as long as we meet these criteria. To be super safe, we don't launch near airports or other busy areas.
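As a rough illustration, the weight and line-strength limits listed above can be written as a small pre-flight check. This is only a sketch of the rundown in this comment, not the regulation text itself; the `Payload` class, the `check_flight` function, and the example numbers are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List

OZ_PER_LB = 16.0


@dataclass
class Payload:
    """One payload package (hypothetical helper type for this sketch)."""
    weight_oz: float            # total package weight in ounces
    smallest_face_sq_in: float  # area of the smallest face in square inches


def weight_to_size_ratio(p: Payload) -> float:
    """Weight in ounces divided by the area of the smallest face."""
    return p.weight_oz / p.smallest_face_sq_in


def check_flight(payloads: List[Payload], line_break_force_lb: float) -> List[str]:
    """Return a list of problems; an empty list means every listed limit is met."""
    problems = []
    for i, p in enumerate(payloads):
        weight_lb = p.weight_oz / OZ_PER_LB
        # Each individual package must weigh less than 6 pounds.
        if weight_lb >= 6:
            problems.append(f"payload {i}: weighs 6 lb or more")
        # Dense packages (ratio above 3.0 oz per sq in) must weigh less than 4 pounds.
        if weight_to_size_ratio(p) > 3.0 and weight_lb >= 4:
            problems.append(f"payload {i}: dense package weighs 4 lb or more")
    # Multiple packages on one balloon must weigh under 12 pounds combined.
    total_lb = sum(p.weight_oz for p in payloads) / OZ_PER_LB
    if len(payloads) > 1 and total_lb >= 12:
        problems.append("combined payload weight is 12 lb or more")
    # The suspension line must separate at an impact force of no more than 50 pounds.
    if line_break_force_lb > 50:
        problems.append("line needs more than 50 lb of force to break")
    return problems


# Example with assumed numbers: a ~250 g (about 8.8 oz) package with a 4 sq in smallest face.
print(check_flight([Payload(weight_oz=8.8, smallest_face_sq_in=4.0)], line_break_force_lb=30.0))
```

An empty list means the flight passes every limit in the rundown; anything else names the rule that would need a closer look.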

schoen
0 replies
19h38m

I had idly wondered about FAA restrictions on (non-passenger-sized) balloons, but never learned about them.

Is there any history of weather balloons having caused damage to aircraft? It seems like it could be really bad, but that the aircraft would also have to be exceptionally unlucky.

maxmclau
0 replies
19h34m

Speaking to farmers in Central America was a huge impetus for us starting Sorcerer! The FAA's Part 101.1 regulation for ballooning is confusing [0], but ultimately very open if your payload is below 6 pounds. Several thousand balloons are launched every day under this regulation, and as far as I know there haven't been any incidents outside of a few radiosondes landing on people's cars.

[0](https://www.ecfr.gov/current/title-14/chapter-I/subchapter-F...)

cstasuik
3 replies
1d

Incredibly cool, thanks for sharing. Are you planning on hiring any time soon? Also, if you don't mind sharing, what designer/firm worked on your web and branding?

tndl
1 replies
1d

We're not hiring right now, but send me an email (austin@sorcerer.earth). And Max did the website, it's built on webflow

cstasuik
0 replies
1d

Will do, thanks! Also very cool, as a designer I always enjoy seeing a big emphasis on design out of the gate, and the brand does a great job of sticking to modern design trends while doing some things a bit differently.

samstephenson
0 replies
1d

I'm curious too!

thoop
2 replies
18h53m

Very cool and very useful!

"Worldwide, most radiosonde observations are taken daily at 00Z and 12Z (6 a.m. and 6 p.m. EST)" - NOAA.gov

I know we get more weather data from other sources, but it seems insane that these 2 launch times per day (per balloon location) are what make up most of our current weather forecasting data.

You mentioned solar. Do you have the capability (or plans) to run these over night as well?

supdudesupdude
0 replies
4h47m

Weather soundings play a part in making a forecast but it's not the only thing? Many times per day we collect weather station data, correlate against past forecasts, bias correct, etc.

Forecasts are made of so much input data it's insane. Yes, balloons matter, but they're not the only thing. They are the only decent source of conditions aloft, and the jet stream is the main controller of our surface weather, so it makes sense.

maxmclau
0 replies
16h25m

You’re right - Satellite data like GPS occultations and aircraft derived data still make up most of the data by volume. Soundings remain the ideal data source and are weighted very heavily in models, punching way above their volume.

As of right now power constraints mean we maintain tracking throughout the night, but cannot execute altitude maneuvers. We have a solution to this cooking though!

ramonyc
2 replies
5h54m

Neat :) I'm involved in a few projects focused on vertical profiling in the coastal urban boundary layer.

I saw in a response you said the balloons will periodically return to sea level and ascend (which sounds like a fun design challenge by itself.) Will you be doing so near populated areas as well?

Good luck!

tndl
1 replies
3h6m

Very interesting, I'd love to hear more about it! In short, yes, we plan to do descents near urban areas if there's a route where we can go down to a safe height and stay away from any airspace. What cities are you looking at right now?

ramonyc
0 replies
1h40m

Most recently Houston had a large ARM field campaign which brought together a lot of atmospheric scientists from around the country. (https://www.arm.gov/research/campaigns/amf2021tracer)

ARM is a DOE program that ships top-tier instrumentation to various sites around the world. Loads of university researchers will follow, and you end up with a massive open source data pool. Houston in particular was focused on Aerosol effects on precipitation in the Coastal-Urban environment. There were loads of balloon launches from sites all over the city during the campaign, from large ozonesondes to the tiny sparv embedded foam cup ones (https://sparvembedded.com/products/windsond)

I'm in NY, and my university NOAA department has a focus on PBL Ozone measurements lately. My work in particular is focused on low cost UAV profiling up to about 150m, with a pipe dream of doing 0-3km.

I'm just a grad student, but if anything there sounds interesting feel free to email and I can try and get you in touch with more knowledgeable people.

mnky9800n
2 replies
23h49m

How many balloons do you have currently deployed? Do you have a data API for the balloons?

tndl
1 replies
23h20m

We've deployed a few dozen, and we don't have a public API right now, but send me an email: austin@sorcerer.earth

samstave
0 replies
22h47m

May you please deploy whatever-camera (gopro) to these with a 1-fps connector:

https://news.ycombinator.com/item?id=41173161

Such that we can see them?

---

As others mentioned, this is a fantastic launch.

I'd love to have one permanently tethered to a place of my choice using a fiber-optic+carbon/kevlar thread to hold it in place, with data coming down the fiber, and have the camera, pico compute, and radios powered by solar.

mbokinala
2 replies
1d

Congrats on the launch! Seeing this reminded me of building and launching a high-altitude weather balloon with some buddies back in high school - one of the coolest projects I've gotten to work on.

If it's not proprietary, I'd love to know - how do you "steer" vertically between different wind layers to move in the direction you want to go?

Can't wait to see where you guys take this!

tndl
1 replies
1d

We think balloons are pretty cool too!

So I can't get into exactly how we do our altitude control, but the Google Loon project has a really great explanation of how they made their (very big) balloons go up and down: https://x.company/projects/loon/

Loon made all of their research public after they shut down, and we're obviously heavily inspired by their work. Our systems use a lot of the tech they pioneered, just on a much, much smaller scale (for reference, Loon's balloons were the size of tennis courts). Here's the PDF in case you're interested in checking out the 400+ page writeup: https://storage.googleapis.com/x-prod.appspot.com/files/The%...

mbokinala
0 replies
23h9m

Awesome, thanks for sharing!

hyperific
2 replies
17h5m

It seems to me there is a real application for wide area persistent surveillance (WAPS). This is a significant concern for civil liberties and WAPS is a largely controversial technology. To whom will this technology be licensed and what, if any, limitations will you impose on allowed payloads?

tndl
1 replies
16h31m

I don’t know much about WAPS at all, but generally speaking, we’re only interested in doing weather sensing. As of now, we’re not planning to license the vehicle at all. And generally speaking our platform doesn’t work for payloads that are heavier than ~0.25lb and have more than minimal power requirements.

lozaning
0 replies
9h4m

I'm honestly shocked in-q-tel isn't one of your backers.

gusgordon
2 replies
22h45m

This is awesome, nice work! In case it's useful, I made a Python package for calculating solar irradiance at altitude: https://github.com/gusgordon/airmass

Takes into account lots of stuff (e.g. attenuation from air, ozone, and water vapor) with the goal of estimating solar power at any altitude/latitude/day/time.
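To give a feel for why altitude matters so much for solar power, here is a minimal standalone sketch. It does not use the airmass package and is not how that package works: it combines the Kasten and Young (1989) air mass approximation with an assumed exponential pressure profile and Meinel's empirical clear-sky fit, and it ignores ozone and water vapor entirely. The function names, the 8.4 km scale height, and the sample altitudes are assumptions made for illustration.

```python
import math

SOLAR_CONSTANT_KW_M2 = 1.353  # top-of-atmosphere value used in the empirical fit below
SCALE_HEIGHT_KM = 8.4         # assumed scale height for a simple exponential atmosphere


def relative_air_mass(zenith_deg: float) -> float:
    """Kasten and Young (1989) approximation; intended for zenith angles of 0 to 90 degrees."""
    cos_z = math.cos(math.radians(zenith_deg))
    return 1.0 / (cos_z + 0.50572 * (96.07995 - zenith_deg) ** -1.6364)


def pressure_ratio(altitude_km: float) -> float:
    """Pressure relative to sea level under the assumed exponential profile."""
    return math.exp(-altitude_km / SCALE_HEIGHT_KM)


def direct_irradiance_kw_m2(zenith_deg: float, altitude_km: float) -> float:
    """Rough clear-sky direct-beam irradiance in kW/m^2."""
    # Scale the sea-level air mass by the pressure ratio: less air overhead, less attenuation.
    absolute_air_mass = relative_air_mass(zenith_deg) * pressure_ratio(altitude_km)
    # Meinel-style empirical attenuation, applied here to the pressure-scaled air mass.
    return SOLAR_CONSTANT_KW_M2 * 0.7 ** (absolute_air_mass ** 0.678)


for altitude_km in (0.0, 10.0, 20.0):
    print(altitude_km, round(direct_irradiance_kw_m2(30.0, altitude_km), 3))
```

The printed values rise with altitude toward the top-of-atmosphere constant, which is the basic effect a fuller model refines with the extra attenuation terms mentioned above.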

tndl
1 replies
22h34m

Very cool! Are you working on some kind of airborne solar platform?

firesteelrain
2 replies
16h22m

Very Cool! We have made and built many PicoBalloons that have circumnavigated the globe. No weather reports - just WSPR reports. We can detect spots in the world where GPS spoofing is happening.

“Each vehicle (balloon + payload) weighs less than a pound and can be launched from anywhere in the world, per the FAA and ICAO reg”

Florida recently passed a law that does not allow PicoBalloon or your weather balloon type launches from Florida soil. It will result in a $150 fine.

HB321

https://www.flsenate.gov/Session/Bill/2024/321/BillText/er/P...

Article

https://www.cbsnews.com/miami/news/floridas-balloon-ban-will...

tndl
0 replies
15h51m

We're actually just going to have kids launch our balloons, it's no problem:

  A person who is 6 years of age or younger who
  intentionally releases, organizes the release of, or
  intentionally causes to be released balloons as
  prohibited by s. 379.233 does not violate subsection (4)
  and is not subject to the penalties specified in
  subparagraph 1.

maxmclau
0 replies
16h13m

Got our start with WSPR pico balloons!

Just saw that - No Florida launches on the horizon luckily

datageek_31
2 replies
11h31m

Since you are generating hundreds of terabytes of data every day, can we get some insight into what data platform is being used here to handle such scale?

maxmclau
0 replies
2h3m

That's cumulative for all data collected, from sources like satellites, weather radars, and surface observing systems. Soundings are relatively "lightweight" compared to most sets. The major difficulty is pushing the data through our low-power satellite backhaul - in practice we average ~10B/s
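For a sense of scale, the ~10 B/s average works out to a bit under a megabyte per day per link. A quick back-of-the-envelope sketch follows; the 32-byte record size is purely an assumed example, not a detail of the actual telemetry format.

```python
BYTES_PER_SECOND = 10            # average backhaul rate quoted above (~10 B/s)
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds


def daily_budget_bytes(rate_bytes_per_s: int = BYTES_PER_SECOND) -> int:
    """Total bytes that fit through the link in one day at a constant average rate."""
    return rate_bytes_per_s * SECONDS_PER_DAY


def max_records_per_day(record_size_bytes: int,
                        rate_bytes_per_s: int = BYTES_PER_SECOND) -> int:
    """How many fixed-size records fit in the daily budget, ignoring protocol overhead."""
    return daily_budget_bytes(rate_bytes_per_s) // record_size_bytes


print(daily_budget_bytes())      # 864000 bytes, roughly 0.86 MB per day
print(max_records_per_day(32))   # 27000 records of an assumed 32 bytes each
```

At that assumed record size, the budget allows about one record every 3.2 seconds, which is why compact encodings matter on this kind of link.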

cosmic_quanta
0 replies
6h38m

I would be interested in this as well. I hope they start an engineering blog at some point.

OhMeadhbh
2 replies
22h53m

Cool Stuff. Sounds like you're following your dreams and doing something that needs doing.

It would be very cool if you could do an open house for bay area geeks to come and just ooh and ahh at the gadgetry. Even a virtual open house would be cool. Something less than a full demo, and more focused on the story behind the gestation and launch of the project (and then a demo.)

tndl
0 replies
22h47m

For sure, we'd love to do some launches with some local folks. We've done launches with a few different companies in SF at their offices too, send me an email and we can figure something out: austin@sorcerer.earth

dang
0 replies
11h8m

(Sorry for the offtopicness but your otherwise fine comment was formatted in HN's 'code' style (https://news.ycombinator.com/formatdoc) rather than as a regular comment. I changed the formatting, but for the future it would be better not to indent your text with spaces unless it's code or something that needs that format.)

zebomon
1 replies
1d2h

This is awesome! Congratulations! It makes sense to me as a layperson that as more companies are doing more expensive things in the sky there would be greater value to high-precision weather insights. This seems like an exciting mission and well-timed business.

worldmerge
1 replies
19h2m

This is incredibly cool!

Are you hiring? This is really exciting work.

voxelizer
1 replies
1d

Wow! I am surprised it is even legal to launch those balloons from SF, especially that close to the airport. What is the regulation? Is it based on the size/weight of the balloon?

tndl
0 replies
1d

Our balloons fall under the FAA Part 101 weather balloon exemption, which is based on the weight and purpose (basically it has to be small and has to collect weather data)

underdeserver
1 replies
6h30m

How did you come up with the name?

maxmclau
0 replies
2h0m

Sorcerer (1977)

reachableceo
1 replies
1d

Happy to discuss super pressure / float in depth.

Can help with the federal contract side and mass manufacturing etc.

Charles@turnsys.com

tndl
0 replies
1d

Sent you an email, thanks!

plopz
1 replies
23h31m

For your gridded data, what file format are you using, grib, netcdf, zarr?

tndl
0 replies
22h11m

a little bit of each right now, depending on what the customer wants

pkhodiyar
1 replies
11h47m

curious to know how you guys store and process the data and make meaningful inferences out of the data collected. Redshift? BigQuery? Datazip?

avecchi-sorc
0 replies
11h17m

Internally yes we use a data warehouse to store the raw observations. We go through a similar process to traditional NWP to produce forecasts except with a few AI steps in between. Unfortunately, the standard currently is to store these as large multi-dimensional data files like grib2/netCDF/zarr.

In case you’re curious, here’s where NOAA stores all their GFS related forecasts: https://registry.opendata.aws/noaa-gfs-bdp-pds/

petervandijck
1 replies
23h27m

When they said "launch more often" I guess you took that advice to heart :) Congrats on the launch(es).

bigveech
0 replies
16h38m

Yes! It’s quite confusing to distinguish between launches at the office haha

mrandish
1 replies
21h40m

Very cool! Seems like you're leveraging some similar navigational ideas as Google X's Project Loon. Loon was such a good fundamental idea, just a little too soon and not the right use case or business model. Initially operating on a smaller scale and focusing on less power-hungry data acquisition vs high bandwidth two way comms seems like a much more viable plan.

tndl
0 replies
21h11m

What we're doing wouldn't be possible without the work the Loon team did (and published)

lispisok
1 replies
20h24m

If this works this is huge. It will improve everything from knowing the weather for your weekend hike to hurricane evacuation order lead time and area

tndl
0 replies
20h18m

That's the goal. We're all from Florida, so the possibility of reducing the damage caused by hurricanes was a big reason we started the company in the first place.

Our balloons are actually cheap and steerable enough that we plan to fly them into TS/Hurricanes to get data way out in the Atlantic, farther than hurricane hunters can operate.

chasemgray
1 replies
1d1h

This looks amazing. Super exciting to see how this project grows!

Nathanael_M
0 replies
22h44m

*inflates

bobbob1921
1 replies
1d2h

Very cool and wishing you all the best of luck.

One question that came to mind, and this applies to all weather balloons, not yours specifically: with the large number of weather balloons launched daily, how is it that more aren’t sucked into airplane engines, causing potential disaster for the airplane? Thanks

tndl
0 replies
1d2h

Weather balloons (ours included) are pretty small and don't spend much time at the altitudes where planes are. And in the very, very unlikely case a plane were to hit one, our payload is less than 250g (about the mass of a pigeon).

bnycum
1 replies
19h39m

Awesome project. I currently work on flight data recorders. As you are aware, weather plays a big part in aviation. We do collect some weather information and it’s always one of the more exciting topics in the work we do. Understand a lot of the pain points with data and satellite communication. Sounds like something to keep my eye on.

tndl
0 replies
19h34m

Reach out if you think we could help at all :)

ahaucnx
1 replies
22h28m

Interesting. I have two questions:

1) What parameters are you measuring? Did you think about also measuring gases?

2) What's your business model?

tndl
0 replies
22h14m

1. Wind speed/direction, air pressure, temperature, humidity, and solar irradiance. We are considering doing atmospheric composition as well, there are a couple of partners that are interested in GHGs, VOCs, water vapor, etc. We have the weight budget to do it, but we haven't flown any production payloads with those sensors yet.

2. The US National Weather Service actually has a commercial data buy program called MESONET [0], where they buy weather data from both academic and commercial partners. We're in the process of becoming one of their commercial partners now. Once we are, a single balloon will pay for itself in a matter of days with the data it can collect, which will let us scale up the number of systems we have deployed. The data we collect right now also lets us build niche weather forecast products like the stratospheric wind forecast mentioned in the post. Once we have enough balloons up, we can start producing useful weather forecast products at a regional and then global scale.

[0]: https://nationalmesonet.us/

Prcmaker
1 replies
9h22m

Nice one team, I like it. Anything a remote mech eng can help with? Not looking to be hired, but keen to help with something different.

tndl
0 replies
3h12m

Send us an email! team@sorcerer.earth

Faaak
1 replies
13h11m

Check out r2ho.me, I'm sure you could have great synergies for lowering price

maxmclau
0 replies
1h59m

We love Yohan!

zuckerma
0 replies
6h4m

This looks super cool

supdudesupdude
0 replies
4h44m

Cool stuff but i've never seen much come out of these kinds of things. Crowd sourced surface pressure application will bring in so many observations the forecasts will be improved 30%. Fuck no.

So you really think you can launch a giant network of balloons and have that data integrated into the NOAA/NCEP model suite? Even if you get over the red tape it will take 10 years + to integrate this shit into the data assimilation program. You claim that you can input your balloon data into magical AI and it produces better forecasts than the GFS? What is the standard of measurement? I don't actually believe you at all

ptsd_dalmatian
0 replies
1d

Very exciting, congrats on this! I’ve been watching numerous weather forecasts over the last couple of years because of my interest in mountain sports, and I am very, very curious how this will improve forecast accuracy. Good luck with everything.

nojvek
0 replies
23h13m

YC money well spent instead of the next LLM app.

nilaymodi123
0 replies
23h5m

Congrats on launching. I love this idea. Makes so much sense haha

kyrofa
0 replies
1d

This is a great idea. I had no idea about the single-use radiosondes.

h1fra
0 replies
10h20m

Once in a while, a very interesting project that seems to be truly disrupting, congrats!

StanislavPetrov
0 replies
9h57m

Sounds very cool.

Sorcerer was also an amazing Infocom game. Good company.

PaywallBuster
0 replies
1h50m

flying condom /s

the idea sounds great tho!