We use Vega pretty heavily; it's a broad ecosystem. Where it really shines is in combination with Altair and VegaFusion: do the number-crunching on the backend and return a chart spec that can just be rendered on the front-end.
That makes it particularly useful when building interactive visualizations with a lot of data.
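For the curious, here's a minimal sketch of that pattern, assuming Altair 5+ with the vegafusion package installed (the DataFrame and column names below are made up):

```python
# Sketch: pre-aggregate with VegaFusion on the backend, ship a small spec.
import altair as alt
import pandas as pd

alt.data_transformers.enable("vegafusion")  # evaluate transforms server-side

def build_chart_spec(df: pd.DataFrame) -> dict:
    chart = (
        alt.Chart(df)
        .mark_bar()
        .encode(x="category:N", y="mean(value):Q")
    )
    # With VegaFusion enabled, the aggregation runs in Python/Rust and the
    # returned Vega spec contains only the summarised rows.
    return chart.to_dict(format="vega")
```

The front-end then only needs vega-embed (or similar) to render the returned spec.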
Year 1 - Getting the first version out that worked for me and my colleagues.
Year 2 - Cleaning it up enough to promote wider use. Documentation.
Years 3-5 - Minor bug fixes only as I thought qStudio solved the problem.
Year 6 - I realised restricting qStudio to only 1-2 database technologies was foolish. Major change to support many more databases. Improved generic SQL highlighting. Added a partial dark mode.
Years 7-8 - Minor bug fixes.
Year 9 - Added a proper Dark Mode and support for many themes by using FlatLaf. Now looking properly modern.
Year 10 - Realised that I'm not fully solving the problem: for most data analysts I should support creating the analysis itself (pivot tables) and improve exporting (real Excel export, not just nicely escaped CSV).
There were more learnings, like that I should definitely have gone fully open source at the start. It's harder to do later.
Typically data warehouses are OLAP databases that have much better performance than OLTP databases for large queries.
There might also be several applications in a company, each with their own database, and a need to produce reports based on combinations of data from multiple applications.
I think that in many cases your question is based on an idea that is completely right: engineers are too eager to split applications out into multiple databases and tack on separate data warehouses. The cost of maintaining separate databases is often higher than initially thought, especially when some of the data in the warehouse needs to go back into the application database, for example for customer-facing analytics.

I think many companies would be better served by handling traditional data warehousing needs directly in their main application databases and abstaining from splitting out databases. Having one single ACID source of truth and paying a bit more for a single beefy database server makes a lot more sense than is commonly thought, especially now that many customer-facing products, like recommendation systems, are “data driven”. At least that’s my impression after working in the space for a while.
If the Postgres database is recording business transactions, you don't want your business to stop being able to take credit cards because you generated a report.
Furthermore, Postgres is an OLTP (transactional) database, designed to efficiently perform updates and deletes on individual rows. OLAP (analytical) databases/query engines like ClickHouse, Presto, Druid, etc. are designed for efficient processing across many rows of mostly unchanging data.
Analytical queries (like "find the average sales price across all orders over the past year, grouped by store and region") can be 100-1000x faster in an OLAP database compared to Postgres.
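To make that concrete, here's roughly what such a query looks like, using DuckDB purely as an easy local stand-in for an OLAP engine (the table and column names are invented):

```python
# Toy example of the aggregate query described above on a columnar engine.
import duckdb

con = duckdb.connect()
# tiny in-memory stand-in for an "orders" fact table
con.execute("""
    CREATE TABLE orders AS
    SELECT * FROM (VALUES
        ('store_a', 'west', 10.0, DATE '2024-06-01'),
        ('store_a', 'west', 14.0, DATE '2024-07-01'),
        ('store_b', 'east', 20.0, DATE '2023-01-01')
    ) AS t(store, region, sale_price, order_date)
""")
print(con.execute("""
    SELECT store, region, avg(sale_price) AS avg_price
    FROM orders
    WHERE order_date >= current_date - INTERVAL 1 YEAR
    GROUP BY store, region
""").fetchdf())
```

A columnar engine only scans the handful of columns the query touches and can vectorise the aggregation, which is where most of that speed-up comes from.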
That’s the use case for CDC: to make it (almost) as easy to use a DW. As always, the complexity is just air you move around in the balloon. The OLTP DB can spit out the events and forget them; how you load them efficiently is now a data engineer's problem to solve (if it were easy to write at event grain on an OLAP store, you wouldn't need an OLTP one). Kafka usually enters the room at this stage, and the simplification promise becomes tenuous.
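For a rough sense of what that hand-off can look like on the Postgres side, here's a sketch (not anyone's production setup): it assumes a replication slot already created with the wal2json output plugin, and the DSN, slot name, and load_batch() stub are all made up.

```python
# Sketch: stream row-change events out of Postgres via logical replication
# and batch-load them into the warehouse elsewhere.
import json
import psycopg2
from psycopg2.extras import LogicalReplicationConnection

conn = psycopg2.connect("dbname=app user=replicator",
                        connection_factory=LogicalReplicationConnection)
cur = conn.cursor()
cur.start_replication(slot_name="warehouse_feed", decode=True)

buffer = []

def load_batch(rows):
    # stand-in for the real warehouse load (COPY into ClickHouse, files on S3, ...)
    print(f"loading {len(rows)} change events")

def handle(msg):
    buffer.append(json.loads(msg.payload))        # one insert/update/delete event
    if len(buffer) >= 1000:
        load_batch(buffer)
        buffer.clear()
    msg.cursor.send_feedback(flush_lsn=msg.data_start)

cur.consume_stream(handle)                        # blocks; the OLTP side is done
```

The OLTP database's job ends at emitting the events; everything after that (batching, ordering, dedup, schema drift) is the data engineering work the comment is pointing at.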
Additionally, unless your data model is designed as append-only (which is unusual and requires logic downstream), you won't be able to track updates and deletions, which are valuable for reporting.
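A toy illustration of that point, with invented table and column names (DuckDB again just for convenience): when every change is an insert, reporting can still see updates and deletions, and the current state can be derived on demand.

```python
# Append-only "history" table instead of updating rows in place.
import duckdb

con = duckdb.connect()
con.execute("""
    CREATE TABLE order_events (
        order_id   INTEGER,
        status     VARCHAR,     -- 'created', 'updated', 'cancelled', ...
        changed_at TIMESTAMP
    )
""")
# every change is recorded as a new row, so nothing is lost to reporting
con.execute("""
    INSERT INTO order_events VALUES
        (1, 'created',   '2024-01-01 10:00:00'),
        (1, 'cancelled', '2024-01-02 09:30:00')
""")
# the latest state per order can still be reconstructed when needed
print(con.execute("""
    SELECT order_id, arg_max(status, changed_at) AS current_status
    FROM order_events
    GROUP BY order_id
""").fetchdf())
```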
Also, reporting/analytics workloads tend to be ad hoc queries that are hard to optimize, so you generally favor fast storage over indexes. Frequently for analytics and reporting it's more efficient to use a columnar DB than a row-oriented relational DB.
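A small illustration of the columnar point, using Parquet via pyarrow (file, table, and column names invented): an analytical query only has to read the columns it touches rather than whole rows.

```python
# Column pruning: read just the columns an aggregate needs.
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({
    "store":      ["a", "a", "b"],
    "region":     ["west", "west", "east"],
    "sale_price": [10.0, 14.0, 20.0],
    "notes":      ["...", "...", "..."],   # wide columns the query never needs
})
pq.write_table(table, "orders.parquet")

# only the two referenced columns are read back from disk
subset = pq.read_table("orders.parquet", columns=["region", "sale_price"])
print(subset.group_by("region").aggregate([("sale_price", "mean")]))
```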
When you need to do medium- to large-scale analytical queries. Postgres is fairly slow for the aggregate/group-by queries analytics needs. Think of building Google Analytics-type functionality.
Data warehouses are structured to handle large volumes of data and complex queries more efficiently than a typical transactional database like PostgreSQL.
This is great, especially for introducing new devs to data models. I use (and love) TablePlus (https://tableplus.com/), which has a diagram generator plugin that does the same!
I don't have a formal compsci background, but one of the most enlightening things I've undertaken recently was a course involving data modeling. This seems like an important exercise many overlook.
We are building a modern alternative to Jupyter, something like Cursor meets Jupyter.