New top story on Hacker News: Show HN: ScratchDB – Open-Source Snowflake on ClickHouse

Show HN: ScratchDB – Open-Source Snowflake on ClickHouse
36 by memset | 5 comments on Hacker News.
Hello! For the past year I've been working on a fully-managed data warehouse built on ClickHouse. I built this because I was frustrated with how much work was required to run an OLAP database in prod: rewriting my app to do batch inserts, managing clusters, and needing to look up special CREATE TABLE syntax every time I made a change. I found pricing for other warehouses confusing (what is a "credit" exactly?) and worried about getting capacity planning wrong.

I was previously building accounting software for firms with millions of transactions. I desperately needed to move from Postgres to an OLAP database but didn't know where to start. I eventually built abstractions around ClickHouse: my application code called an insert() function, but in the background I had to stand up Kafka for streaming, bulk loading, DB drivers, and ClickHouse configs, and manage schema changes. This was all a big distraction when all I wanted was to save data and get it back. So I decided to build a better developer experience around it.

The software is open-source: https://ift.tt/WYzwRel and the paid offering is a hosted version: https://ift.tt/knCwO4X . It's called "ScratchDB" because the idea is to make it easy to get started from scratch. It's a massively simpler abstraction on top of ClickHouse.

ScratchDB provides two endpoints [1]: one to insert data and another to query. When you send any JSON, it automatically creates tables and columns based on the structure [2]. Because table creation is automated, you can just start sending data and the system will just work [3]. It also means you can use Scratch as any webhook destination without prior setup [4,5]. When you query, just pass SQL as a query param and it returns JSON.

It handles streaming and bulk loading data. When data is inserted, I append it to a file on disk, which is then bulk loaded into ClickHouse. The overall goal is for the platform to automatically handle managing shards and replicas.
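
As a rough illustration of the automatic table creation described above, here is a minimal Go sketch. It is not ScratchDB's actual code: the function names InferColumns and CreateTableSQL, the underscore flattening of nested objects, the type mapping, and the choice of MergeTree with an empty sort key are all assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// InferColumns (hypothetical helper) decodes one JSON object and returns a
// map of column name -> ClickHouse type inferred from the values.
func InferColumns(raw []byte) (map[string]string, error) {
	var doc map[string]any
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, fmt.Errorf("decode JSON: %w", err)
	}
	cols := make(map[string]string)
	flatten("", doc, cols)
	return cols, nil
}

// flatten walks nested objects and joins keys with "_" (an assumption of
// this sketch). Numbers become Float64 and booleans become Bool; strings,
// arrays, and nulls all fall back to String.
func flatten(prefix string, obj map[string]any, cols map[string]string) {
	for key, val := range obj {
		name := key
		if prefix != "" {
			name = prefix + "_" + key
		}
		switch v := val.(type) {
		case map[string]any:
			flatten(name, v, cols)
		case float64:
			cols[name] = "Float64"
		case bool:
			cols[name] = "Bool"
		default:
			cols[name] = "String"
		}
	}
}

// CreateTableSQL (hypothetical helper) renders a CREATE TABLE statement with
// columns in sorted order. Identifiers are not escaped here; real code would
// need to validate or quote them.
func CreateTableSQL(table string, cols map[string]string) string {
	names := make([]string, 0, len(cols))
	for n := range cols {
		names = append(names, n)
	}
	sort.Strings(names)
	defs := make([]string, len(names))
	for i, n := range names {
		defs[i] = n + " " + cols[n]
	}
	return fmt.Sprintf(
		"CREATE TABLE IF NOT EXISTS %s (%s) ENGINE = MergeTree ORDER BY tuple()",
		table, strings.Join(defs, ", "))
}
```

Calling InferColumns on a payload such as {"user":{"id":7},"paid":true} and passing the result to CreateTableSQL would yield a statement with the columns paid and user_id. A real system would also have to handle new columns appearing in later payloads, which this sketch leaves out.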
The whole thing runs on regular servers. Hetzner has become our cloud of choice, along with Backblaze B2 and SQS. It is written in Go. From an architecture perspective I try to keep things simple: I want folks to make economical use of their servers. So far ScratchDB has ingested about 2 TB of data and handled 4,000 requests/second on about $100 worth of monthly server costs.

Feel free to download it and play around. If you're interested in this stuff then I'd love to chat! I'm really looking for feedback on what is hard about analytical databases and what would make the developer experience easier!

[1] https://ift.tt/EN215lI
[2] https://ift.tt/JYbnlzM
[3] https://ift.tt/xdDgmjb
[4] https://ift.tt/Hw9D0K1
[5] https://ift.tt/lzH6KyA
