These sqlc-style libraries are a great solution to the problem of “make running a query as easy as calling a function”, but I’ve always thought SQL’s lack of composability is a more interesting problem that I haven’t seen addressed (the problems with views are well documented).
There are two different approaches to solving SQL composability issues:
1. Compiles-to-SQL domain-specific languages. This category spans from ORM DSLs embedded in another programming language, like Ruby on Rails's ActiveRecord/Arel or Django's ORM, to stand-alone text-based languages like PRQL, the "Pipelined Relational Query Language" (https://prql-lang.org), which a compiler converts to SQL text or SQL files. The downside of the DSL option is that it requires practitioners to be fluent both in the SQL query they want and in the DSL, so they know how to obtain that SQL query from the DSL.
2. Query fragment literals in the host programming language, like sql`name = ${name}` in TypeScript (e.g. https://github.com/gajus/slonik). These are usually thin abstractions over concatenating a `{ queryText: string[], queryArgs: T[] }` structure. The author only needs to be fluent in SQL and in the host language, but these libraries do less to save you from the expressive limitations of SQL itself.
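To make that "thin abstraction" concrete, here is a minimal sketch of such a tag in TypeScript. This is an assumed implementation for illustration (slonik's real API differs); the key idea is that interpolated fragments splice their text and arguments into the parent query:

```typescript
// A minimal query-fragment tag (assumed shape for illustration; slonik's real
// API differs). A fragment is parallel arrays of text chunks and argument values.
type SqlFragment = { strings: string[]; values: unknown[] };

function isFragment(v: unknown): v is SqlFragment {
  return typeof v === "object" && v !== null && "strings" in v && "values" in v;
}

function sql(strings: TemplateStringsArray, ...values: unknown[]): SqlFragment {
  const outStrings: string[] = [strings[0]];
  const outValues: unknown[] = [];
  values.forEach((value, i) => {
    const next = strings[i + 1];
    if (isFragment(value)) {
      // Splice a nested fragment's text and arguments into the parent query.
      outStrings[outStrings.length - 1] += value.strings[0];
      for (let j = 0; j < value.values.length; j++) {
        outValues.push(value.values[j]);
        outStrings.push(value.strings[j + 1]);
      }
      outStrings[outStrings.length - 1] += next;
    } else {
      // A plain value becomes a query argument.
      outValues.push(value);
      outStrings.push(next);
    }
  });
  return { strings: outStrings, values: outValues };
}

// Render to Postgres-style numbered placeholders only at the very end.
function render(f: SqlFragment): { text: string; args: unknown[] } {
  const text = f.strings.reduce((acc, s, i) => acc + "$" + i + s);
  return { text, args: f.values };
}

const inner = sql`SELECT * FROM block WHERE id = ${42}`;
const outer = sql`SELECT count(*) FROM (${inner}) b WHERE space_id = ${"s1"}`;
console.log(render(outer).text);
// SELECT count(*) FROM (SELECT * FROM block WHERE id = $1) b WHERE space_id = $2
```

Because placeholders are assigned only when the final query is rendered, fragments nest arbitrarily and the argument numbering stays correct.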
I've found query fragment composition to be the sweet spot. Easy SQL queries remain trivial to express and understand, since it's Just SQL:
sql`SELECT * FROM block WHERE id = ${args.id}`
But you can DRY up repetition in the codebase through regular function calls. Abbreviated example from Notion's client code:
function selectOfflinePageMetadata(args: { userId: string }) {
  return sql`
    SELECT
      offline_page.id,
      offline_page.space_id,
      offline_page.download_status,
      offline_page.last_downloaded_at,
      offline_page.last_downloaded_version,
      offline_page.last_downloaded_sync_cursor,
      offline_page.target_sync_cursor,
      CASE
        WHEN EXISTS (
          SELECT 1 FROM offline_action
          WHERE offline_action.impacted_page_id = offline_page.id
            AND offline_action.origin_page_id = offline_page.id
            AND offline_action.autosync_type = 'not_autosynced'
        )
        THEN 1
        ELSE 0
      END AS is_explicitly_offlined_origin,
      CASE
        WHEN EXISTS (
          SELECT 1 FROM offline_action
          WHERE offline_action.impacted_page_id = offline_page.id
            AND offline_action.origin_page_id = offline_page.id
            AND offline_action.autosync_type = 'created_offline'
        )
        THEN 1
        ELSE 0
      END AS is_offline_created_origin,
      CASE
        WHEN EXISTS (
          SELECT 1 FROM offline_action
          WHERE offline_action.impacted_page_id = offline_page.id
            AND offline_action.origin_page_id = offline_page.id
            AND offline_action.autosync_type = 'frecent'
        )
        THEN 1
        ELSE 0
      END AS is_autosynced_origin
    FROM offline_page
    WHERE offline_page.meta_user_id = ${args.userId}
  `
}

function selectOfflinePageById(args: {
  userId: string
  pageId: string
}) {
  const { userId, pageId } = args
  return sql`
    SELECT * FROM (${selectOfflinePageMetadata({ userId })}) WHERE id = ${pageId}
  `
}

function selectOfflineAutosyncedOrigins(args: {
  userId: string
  spaceId: string
}) {
  const { userId, spaceId } = args
  return sql`
    WITH offline_page_metadata AS (
      ${selectOfflinePageMetadata({ userId })}
    )
    SELECT offline_page_metadata.* FROM offline_page_metadata
    WHERE space_id = ${spaceId}
      AND is_autosynced_origin = 1
    ORDER BY last_downloaded_at ASC
  `
}
I'm not sure if it solves your "view problem", but it does a pretty good job for _my_ view problem.
This is interesting! A field being nullable because it's legitimately optional in the domain model is one thing, but for new fields which shouldn't be nullable in the domain model, unless you can pick a reasonable identity value, you need a concept of absence that's different from null. Luckily the intersection of "non-nullable fields" and "fields with no reasonable identity value" and "fields which didn't exist in v1 of the domain model" is normally pretty small, but it's painful when it happens.
This reminds me of frozen/nonfrozen enums in Swift. You can do exhaustive case analysis on frozen enums, but case analysis on nonfrozen enums requires adding an `@unknown default` case.
What they are saying is that the field is always present in the domain model but we don't have the information to backfill it. For example, say you have a customers table. Originally, it just stored their name and internal ID. But now you are adding in their government ID as well. Except that you already have thousands of customers and you don't have their government ID. So you either make the column nullable and slowly backfill it over time. Or you find some default value which isn't null but the code understands it to still be empty. And again, you slowly backfill over time.
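Sketched in SQL (Postgres syntax; the customers table and government_id column are just the thread's hypothetical example), the two routes look like this:

```sql
-- Route 1: add the column as nullable, backfill over time, then tighten.
ALTER TABLE customers ADD COLUMN government_id text;

-- Backfill in batches as the data is collected (values are placeholders):
-- UPDATE customers SET government_id = ... WHERE id = ...;

-- Once every row has a value, make the domain-model rule explicit:
ALTER TABLE customers ALTER COLUMN government_id SET NOT NULL;

-- Route 2: a non-null sentinel value the application treats as "empty".
-- ALTER TABLE customers ADD COLUMN government_id text NOT NULL DEFAULT '';
```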
> So you either make the column nullable and slowly backfill it over time. Or you find some default value which isn't null but the code understands it to still be empty. And again, you slowly backfill over time.
What would be the justification for using some other “default value” in this case? That’s just null with extra steps. Null _is_ the default value.
There’s nothing gross or unholy about null values. Very smart people many years ago envisioned them as part of relational databases specifically for the use cases like these.
I want to represent that a field should never get new null values, it should be treated as non-nullable for the purpose of writing; however there are historical records which were made before the introduction of the field, so it should be treated as nullable for the purpose of reading.
> I want to represent that a field should never get new null values, it should be treated as non-nullable for the purpose of writing
That's a job for DB triggers — arbitrary restrictions that are automatically applied when you create or update records. You can restrict new null values, you can restrict numeric values to be odd or prime, you can make sure that only one of two fields can be null but not both at the same time — all things that you can't even describe in a type system.
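As a concrete sketch (Postgres syntax; the table, column, and trigger names are hypothetical), such a trigger rejects NULLs on new writes while leaving historical NULL rows readable:

```sql
-- Hypothetical names. Rejects NULL government_id on new writes while leaving
-- pre-existing NULL rows untouched and readable.
CREATE FUNCTION reject_null_government_id() RETURNS trigger AS $$
BEGIN
  IF NEW.government_id IS NULL THEN
    RAISE EXCEPTION 'government_id may not be NULL on new writes';
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER government_id_not_null
  BEFORE INSERT OR UPDATE OF government_id ON customers
  FOR EACH ROW EXECUTE FUNCTION reject_null_government_id();
```

Postgres also has a shortcut for this pattern: adding a `CHECK (government_id IS NOT NULL)` constraint with `NOT VALID` enforces the rule on new inserts and updates without validating existing rows.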
If you are referring to Tony Hoare's talk, it is not nulls that are the billion dollar mistake. The mistake is type systems which do not distinguish between nullable and non-nullable.
Rails definitely seems optimized for creating code, but how is it at maintaining code? I've never used it, but it seems like Ruby's dynamic types would make large refactors really challenging. What techniques do Rails developers use?
The key to maintaining a healthy Rails codebase is to make most of your code independent of Rails. Rails is great at the basics: handling requests and composing views. It's okay at storage stuff. But it's not good at handling complex use cases, managing JS, etc.
This isn't to say that you can't use Rails if you're making complicated stuff, or using JS, it just means you need to know when to keep your code to Rails's structures and when to write your own structures that you then invoke from within Rails.
> with functions named by their normalized hash contents, and referred to anywhere by that, but I can't seem to remember what it's called right now. Something like "Universe" I think?
Given that the literal founder of Flutter is in this thread using this introduction multiple times, it’s kinda hard to give you the benefit of the doubt that you honestly just meant “a founder using Flutter”.
What was that phrase about not assuming malice when you could assume something else?
Most likely GP saw the actual Flutter founder announcing themselves as "flutter founder" and assumed it was a common phrase for founders who use Flutter lol.
That is seriously what happened. I didn't realize the actual Flutter authors were here using that phrase. Because it sounds odd to say founder for author of a software package. I genuinely thought those were Founders using Flutter.
I am not the Founder of Flutter. Sorry. I am a founder using Flutter. Just skimmed the comments and saw the phrase used multiple times and thought those were founders using Flutter, and thought to use the same term.
I can understand the test frustration on both sides.
* He's being asked to fix tests that were already failing. That can be an enormous task depending on the nature of the failure and the code base.
* The team doesn't want to merge his PR since they don't have a test. The code is presumably working without his PR and understandably they won't change a single byte without seeing tests work.
> Redundant conditions are nice because they require no changes to the database [...] This makes them useful for queries that are only sometimes run or where indexes can't be easily added to the main conditions
but where possible I typically prefer adding an index on the expression that the query is using (if your database supports it) since it expresses your intent more clearly. A redundant condition is more likely to get "optimized away" by other developers or changes in the query planner.
Yeah, I would think functional/function-based indexes should be mentioned. Add an index or add a redundant column? Almost all of the time I would prefer the index.
Of course if your database can't be changed then maybe you can use this trick if there is already a suitable indexed column.
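For concreteness, the expression-index alternative looks like this (Postgres syntax; the table and predicate are made up):

```sql
-- Index the exact expression the query filters on, rather than relying on a
-- redundant condition against some other already-indexed column.
CREATE INDEX customers_email_lower_idx ON customers (lower(email));

-- The planner can now satisfy this directly from the index:
-- SELECT * FROM customers WHERE lower(email) = 'a@example.com';
```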