Python toolkit for quantitative finance

openrisk
6 replies
18h10m

The financial industry never considered a serious open source strategy to be aligned with its interests, and that has painted the sector into increasingly narrow corners.

Compare, e.g., the acumen of the adtech sector, which supports (among countless other things) the most used open source mobile OS, the most used open source web browser, the most sophisticated open source suites for machine learning, etc.

In fact a good reason why "adtech" is (absurdly) considered part of "big tech" is that no other business sector has managed to articulate a long-term sustainable digitization story.

jgalt212
5 replies
18h4m

The problem is that the models (closed source or open source) only get you part of the way. For example (to name just a few items), a stock option pricing model is useless without

- holiday calendars

- ex dividend dates

- interest rate curves

- real-time stock prices

- corporate actions database

Are there open source and free sources of the above? For the first two, sort of, for the remainder, no. And I'm sure I'm forgetting a number of other inputs.
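To make the dependency on those inputs concrete, here is a minimal sketch of a European call priced with Black-Scholes and escrowed cash dividends. Everything in it is an illustrative assumption rather than any library's API: the function names, the flat `rate` standing in for a real interest rate curve, the 252-day business year for volatility time, and the hand-typed holiday set and dividend list. The point is only that every argument after `strike` has to come from one of the data sources listed above.

```python
import math
from datetime import date, timedelta


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def business_days(start, end, holidays):
    """Count weekdays in (start, end] that are not in the holiday set."""
    count = 0
    day = start
    while day < end:
        day += timedelta(days=1)
        if day.weekday() < 5 and day not in holidays:
            count += 1
    return count


def call_price(spot, strike, vol, rate, today, expiry, dividends, holidays):
    """Black-Scholes call with escrowed discrete dividends (a sketch).

    spot      -- current stock price (needs a price feed)
    rate      -- flat continuously compounded rate (stand-in for a curve)
    dividends -- list of (ex_date, cash_amount) (needs ex-dividend data)
    holidays  -- set of non-trading dates (needs a holiday calendar)
    """
    # Volatility accrues over trading days; discounting uses calendar days.
    t_vol = business_days(today, expiry, holidays) / 252.0
    t_cal = (expiry - today).days / 365.0

    # Subtract the present value of dividends going ex before expiry.
    adj_spot = spot
    for ex_date, amount in dividends:
        if today < ex_date <= expiry:
            t_div = (ex_date - today).days / 365.0
            adj_spot -= amount * math.exp(-rate * t_div)

    std = vol * math.sqrt(t_vol)
    d1 = (math.log(adj_spot / strike) + rate * t_cal + 0.5 * std * std) / std
    d2 = d1 - std
    return adj_spot * norm_cdf(d1) - strike * math.exp(-rate * t_cal) * norm_cdf(d2)


# Made-up inputs, for illustration only.
today, expiry = date(2024, 7, 1), date(2024, 12, 20)
holidays = {date(2024, 7, 4), date(2024, 9, 2), date(2024, 11, 28)}
dividends = [(date(2024, 9, 13), 1.50)]
price = call_price(100.0, 100.0, 0.25, 0.05, today, expiry, dividends, holidays)
```

Dropping the dividend list or the holiday set changes the price, which is the practical meaning of "useless without": the formula runs either way, but the number it returns is wrong.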

drexlspivey
1 replies
6h30m

Yes, the US Treasury has an open API for the yield curve, and Yahoo Finance has a free stock price API.

brewmarche
0 replies
5h18m

That’s not the correct curve to use for pricing in general. You’d infer discount factors from overnight indexed swaps (OIS) instead as the overnight rate (ESTR, SOFR, SONIA, etc.) is what is used for collateralisation typically. To create such a discount curve you need OIS swap rates.
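A minimal sketch of what "infer discount factors from OIS swap rates" means, under loud simplifying assumptions: one currency, quotes at whole-year maturities 1, 2, ..., n, annual fixed payments with a constant accrual fraction, and no payment lag or day-count adjustments. The function name and the quotes are made up. A par OIS with rate S_n satisfies S_n * sum(accrual * DF_i for i <= n) = 1 - DF_n, which can be solved for DF_n one maturity at a time.

```python
def bootstrap_ois_discount_factors(par_rates, accrual=1.0):
    """Bootstrap discount factors from par OIS rates (a sketch).

    par_rates -- par swap rates for maturities 1, 2, ..., n periods,
                 as decimals (0.03 means 3%)
    accrual   -- year fraction of each fixed period (assumed constant)
    """
    discount_factors = []
    annuity = 0.0  # running sum of accrual * DF_i over earlier periods
    for rate in par_rates:
        # Solve rate * (annuity + accrual * df) = 1 - df for df.
        df = (1.0 - rate * annuity) / (1.0 + rate * accrual)
        discount_factors.append(df)
        annuity += accrual * df
    return discount_factors


# Hypothetical quotes, for illustration only.
quotes = [0.030, 0.032, 0.034, 0.035]
dfs = bootstrap_ois_discount_factors(quotes)
```

A production curve would also handle sub-annual tenors, interpolation between quoted maturities, and each market's conventions. None of that removes the need for the OIS quotes themselves.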

solatic
0 replies
8h22m

I can't see the financial industry getting behind FOSS, but there's very much a missed opportunity for Data API companies. Let trading firms pay standard rates to get access to high-resolution, realtime data; the secret sauce in each firm then becomes what kind of trading algorithm you write to make the most profit off that data. Ensuring that everyone has potential access to the same underlying data helps dissuade claims that profits are made from insider trading. There should be all kinds of data for all kinds of domains available when you fork over a little money for an API key.

openrisk
0 replies
11h52m

Any data requirement in the above list that is public knowledge can be solved (in principle), but it takes coordination between parties that are not used to collaborative/coopetitive behavior.

There is also the bit of data cleaning work that is costly - somebody must be paid to design and operate it, but again, with modern tech solutions it's likely that this could become immaterial.

Yet there is a broader challenge beyond concrete applications: the financial industry is 100% an information processing industry but is largely inconsequential and absent in the development of modern digital technology.

HolyLampshade
0 replies
16h8m

You can limp into a fair bit of the corpact data via open source/free channels, but reliable sources are definitely expensive.

Not to mention the real-time data which is, quite simply, catastrophically expensive. And that’s assuming the least sophisticated (retail) implementation of this stuff.

sk11001
1 replies
21h50m

Does someone have LOC as a performance indicator?

ng12
0 replies
21h47m

Unironically probably. The only place I've ever worked that used LoC as a performance indicator was a hedge fund.

heyoni
1 replies
21h57m

The pointless act of "encapsulating" nothing? I see that a lot unfortunately =[

IshKebab
0 replies
21h44m

Presumably @plot_function does something. Also there are lots of other functions in the same API that are more than one line.

I suspect he wouldn't have thought it was over engineering if it didn't have such a long comment for one line of code... Which is silly.

wepple
0 replies
18h47m

It’s so bizarre my first reaction was that surely it must be so an LLM can make some kind of sense of it, otherwise what on earth are they doing?

nixpulvis
0 replies
21h51m

What about the @plot_function bit? Also, wrapping dependency calls for a more consistent interface isn’t necessarily a bad thing.
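For anyone who hasn't seen the pattern, here is a hypothetical sketch of why a decorated one-line wrapper can still earn its keep. This `plot_function` is a stand-in written for illustration and is not the library's actual implementation. It assumes the decorator's job is to register the function so a charting front end can discover it, and the `mean` wrapper and `PLOT_FUNCTIONS` registry are invented names.

```python
import functools
import statistics

# Hypothetical registry that a plotting front end could enumerate.
PLOT_FUNCTIONS = {}


def plot_function(fn):
    """Stand-in decorator: registers fn by name, leaves behavior unchanged."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)

    PLOT_FUNCTIONS[fn.__name__] = wrapper
    return wrapper


@plot_function
def mean(series):
    """One-line wrapper giving a dependency call a project-wide signature."""
    return statistics.fmean(series)


result = mean([1.0, 2.0, 3.0])
```

The body is still a single line, but callers get a stable name and signature, and the dependency underneath can be swapped without touching them.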

sexy_year
3 replies
12h10m

You can also check OpenBB, one of the best-known projects in this category: https://github.com/OpenBB-finance/OpenBBTerminal

Most data vendors will require a free API key because that's their GTM. They want you to create an account with them to get a free API key and then expect to be able to upsell you over time.

Or you can access "free data" (e.g. yfinance) that relies on projects that actively scrape financial data from a website - these tend to need a lot of updates from the main maintainer, since there are no aligned incentives between the maintainer of the scraped API and the company that has the data.

PS: I'm the main creator behind the OpenBB project on GitHub.

dash2
1 replies
10h26m

Is there a way out of this dilemma?

Are there any decent APIs for UK data, by the way?

financetechbro
0 replies
2h29m

You can pull public company data straight from SEC / EDGAR. They have a public and free API
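A minimal sketch of pulling a company's filing index from EDGAR using only the standard library. The helper names are made up. The URL pattern (CIK zero-padded to 10 digits) and the need to send a User-Agent identifying yourself reflect the SEC's published access guidance as I understand it, so check the current documentation before relying on either.

```python
import json
import urllib.request

# Submissions endpoint; the CIK is zero-padded to 10 digits.
EDGAR_SUBMISSIONS_URL = "https://data.sec.gov/submissions/CIK{cik:010d}.json"


def submissions_url(cik):
    """Build the submissions URL for a company's integer CIK."""
    return EDGAR_SUBMISSIONS_URL.format(cik=cik)


def fetch_submissions(cik, user_agent):
    """Download and parse the filing index for one company.

    user_agent -- a string identifying you (e.g. name and contact email);
                  requests without one may be rejected.
    """
    request = urllib.request.Request(
        submissions_url(cik), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


url = submissions_url(320193)  # 320193 is Apple's CIK
```

Calling `fetch_submissions(320193, "Your Name you@example.com")` performs the actual request. It is left out of the snippet so the example runs without network access.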

TacticalCoder
0 replies
5h17m

> Most data vendors will require a free API key because that's their GTM. They want you to create an account with them to get a free API key and then expect to be able to upsell you over time.

> PS: I'm the main creator behind the OpenBB project on GitHub.

As you point out the GTM of others, can you tell us what your GTM is?

raluk
3 replies
23h19m

In the README, they should write a tutorial on how to make money with this. I mean, is there any other reason for using this software?

nurettin
0 replies
23h5m

To make money, write software for people who will make money.

constantcrying
0 replies
23h1m

> In the README, they should write a tutorial on how to make money with this.

You don't make money with this.

affyboi
0 replies
28m

Trading is a zero-sum game; no one is ever going to give up their secret sauce.

curiousgal
1 replies
10h13m

How could one study said design? By generating class diagrams or what?

ironSkillet
0 replies
6h32m

By reading through the code, understanding how they've laid out their abstractions and user interfaces. I am in the field, and in my experience, the most impactful design decisions for libraries like this are related to how researchers will actually interact with the tool.

999900000999
0 replies
19h11m

I was really surprised there aren't any real examples here. You have a couple of videos linked, but am I really going to pause the video and hand copy each line of code?

snovymgodym
2 replies
20h9m

Nah, this looks pretty orthogonal to that. This just looks like a collection of pure Python libraries for doing common quant work. The thing Cal Paterson is describing (which is pretty transparently JP Morgan's Athena) would be SecDB at Goldman and would be running on their proprietary scripting language called "Slang". None of that is open source.

Goldman was the first place to do a system like that, and when it was copied at other investment banks like JP Morgan and Bank of America, they opted to use Python instead of an in-house language and so "bank python" was born. Actually, the banks all poached engineers from one another, so many of the people that built the system at one place ended up building it again at another, hence why there are so many similarities between the equivalent systems at all these US investment banks. Some of those people eventually went on to build it again as a SaaS offering: https://www.beacon.io/

bostik
0 replies
10h5m

> Beacon is more PaaS than SaaS from what I've seen

You are correct. (Full disclosure: I work at Beacon.)

Financial institutions are extremely sensitive about where their data is held, processed, stored and/or sent to. Some of it is just basic corporate governance ("we do not like the additional risk"). Some you could lump in with secrecy and competitive edge ("this is our secret sauce, no way are we going to let anyone else get it"). Some is driven by regulations ("we hold/process highly sensitive financial and personal data on individuals, sending it to a third party is a huge no-no"). And some is just garden variety contract obligations.

[Note that I intentionally chose to omit any consideration for "plain" security. In this industry that can get political.]

Where data governance/sovereignty is concerned, the term "SaaS" is commonly understood as: "send data to a third party, get results back". You can imagine how well that plays with any data an institution considers precious.

scrlk
2 replies
22h32m

Does GS still use their proprietary Slang language or has that been phased out in favour of Python?

1htfp
1 replies
22h14m

Still very much present, it powers "SecDB" (which is pretty much the nervous system of the entire markets business). While there's certainly been openness to Python/etc and tech to integrate Slang into the 21st century, it's the kind of thing that's hard to imagine ever being phased out.

simonh
0 replies
20h7m

I worked at Bank of America for a while on the Quartz platform, which is their Python-based clone of SecDB. The lead architect was one of the founders of the Slang/SecDB platform at Goldman. It was great fun to work on. Incredible power at your fingertips.

fifilura
1 replies
9h10m

"Copyright 2020 Goldman Sachs."

Probably added in panic mode around March 2020?

lizmutton
0 replies
1h27m

Aha. Wonder if they ever actually used it xD

NewsaHackO
2 replies
23h16m

This seems pretty basic, really just classes of common data structures used in finance. Closer to what you would expect for a final project from an undergrad in an OOP course.

tomrod
0 replies
17h45m

I have bad news for you. Most code you tend to come across is at that level.

nerdponx
0 replies
15h45m

That's what you find in a lot of domain specific libraries written by scientists, mathematicians, etc. Professional engineer-quality code written by people who aren't professional engineers is rare. Or it's some enormously popular library that has had a lot of attention from engineers over the years.

rr808
0 replies
15h35m

I did a trawl for quant libraries a year ago and didn't see this. Any other big firms out there with quant libs, especially vol-related ones?

richardw
0 replies
20h41m

Rather than just dump on it, I’d find it more interesting to hear how it came to be and what it’s used for internally (if so).

I’ve seen surprising stuff that had rational reasons under closer investigation. Companies have cultures and internal priorities that make sense when you’re inside the bubble, but look weird from outside.

dash2
0 replies
10h27m

Huge missed opportunity here to name the library "vampire-squid"...

From the README this looks like a piece of advertising to developers about what GS does, more than anything useful to the outside world.

caseyf7
0 replies
19h56m

The toolkit may be free, but the data is very expensive.

CoderJoshDK
0 replies
22h14m

This code commits so many sins. The contributing standards are so strange. And what is up with the licensing?

Looking at this code hurts my eyes.