
New top story on Hacker News: Is KDB a sane choice for a datalake in 2024?

Is KDB a sane choice for a datalake in 2024?
9 by sonthonax | 8 comments on Hacker News.
Pardon the vague question, but KDB is very much institutional knowledge hidden from the outside world. People have built their livelihoods around it and use it as a hammer for all sorts of nails. It's also extremely expensive and written in a language with origins so obtuse that its progenitor, APL, needed a custom keyboard laden with mathematical symbols.

Within my firm, it's very hard to get an outside perspective. The KDB developers are true believers in KDB, but they obviously don't want to be professionally replaced. So I'm asking the more forward-leaning HN.

One nail in my job is KDB as a data lake, and I'm being driven nuts by it. I write code in Rust that prices options. There's a lot of complex code involved in this: I use a mix of numeric simulations to calculate greeks and somewhat lengthy analytical formulas. The data that I save to KDB is quite raw: I save the market data and derived volatility surfaces, which are themselves complex-ish models needing some carefully unit-tested code to convert into implied vols.

Right now my desk has no proper tooling for backtesting that uses our own data. I'm constantly being asked to do something about it, and I don't know what to do! I'm 99% sure KDB is the wrong tool for the job, because of three things:

- It's not horizontally scalable. A divide and conquer algo on N<{small_number} cores is pointless.
- I'm scared to do queries that return a lot of data. It's non-trivial to get a day's worth of data. The query will often just freeze; it doesn't even buffer. Even if I'm just trying to fetch what should be a logical partition, the wire format is really inefficient and uncompressed. I feel like I need to do engineering work for trivial things.
- The main thing is that I need to do complex math to convert my raw data, order books and vol surfaces, into useful data to backtest. I have no idea how to do any of this in KDB.

My firm is primarily a spot desk, and while I respect my colleagues, their answer is:

> Other firms are really invested in KDB and use KDB for this, just figure it out.

I'm going nuts because I'm under the assumption that these other firms are way larger and have teams of KDB quants doing the actual research, while we have some quant traders who know a bit of KDB but work on the spot side with far simpler math.

I keep on advocating for some Parquet-style data store with Spark/Dask/Arrow/Polars running on top of it that can be horizontally scaled and, most importantly, with Polars, lets me write my backtests in Rust and leverage the libraries I've already written. I get shot down with "we use KDB here".

I just don't know how I can deliver a maintainable solution to my traders with this current infrastructure. Bizarrely, and this is a financial firm, no one in a team of ~100 devs other than me has ever touched Spark-style tech.

What should I do? Are my concerns overblown? Am I misunderstanding the power of KDB?
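
For illustration only, here is a minimal sketch of the partition-per-day shape that the Parquet-style proposal relies on: each trading day is processed independently and the results are combined afterwards. Everything in it is an assumption made for the example. `PARTITIONS`, `backtest_day`, and `run_backtest` are hypothetical names, the in-memory lists stand in for per-day Parquet files, and the "strategy" is a toy. It uses only the Python standard library so that it runs as written; a real version would read each partition with something like Polars or Arrow.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a date-partitioned store: one entry per trading day.
# In a Parquet-style layout, each value would be a file on disk instead.
PARTITIONS = {
    "2024-01-02": [100.0, 101.0, 99.5, 102.0],
    "2024-01-03": [102.0, 103.5, 103.0],
    "2024-01-04": [103.0, 101.0, 104.0, 104.5],
}


def backtest_day(mid_prices):
    """Toy per-day backtest: PnL of buying the first mid and selling the last."""
    if len(mid_prices) < 2:
        return 0.0
    return mid_prices[-1] - mid_prices[0]


def run_backtest(partitions, max_workers=4):
    """Map the per-day function over each partition independently, then reduce."""
    days = sorted(partitions)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        daily_pnl = list(pool.map(lambda d: backtest_day(partitions[d]), days))
    return dict(zip(days, daily_pnl)), sum(daily_pnl)


per_day, total = run_backtest(PARTITIONS)
```

Because no day depends on another, the same map-then-reduce shape could in principle be spread across processes or machines. The thread pool here only demonstrates the structure and is not a claim about speedup.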
