kemono2/docs/database.md
2024-11-26 00:11:49 +01:00

2.3 KiB

Database

Schema reference

The folder /db/schema/<schema_name> holds the information about relevant symbols for the codebase in <schema_name>.
This information is plain SQL files with only symbol declarations without any data-modifying queries.

The main usecase is to be able to look into database schema without reving up the docker stack/setting up database.
The other perk coming from this is schema reference data is bound to commits, so it allows to diff schema changes
before trying to make sense of migrations.

From this point on (2024-07-03T00:00:0.000Z) any commit which includes schema-changing migrations
must also include changes in the schema reference.

File structure

There is no particular structure beyond having files in a specific schema folder.
But in general small enough declarations without any category go into schema.sql,
otherwise they go into separate files. If a separate file becomes too big, then it goes into
a separate folder split into separate files.

Migrations already include this information

They do, but as of time of writing (2024-07-03T00:00:0.000Z) there 75 migration files with a ton of data-changing boilerplate.

Why like this?

It's quick, located alongside migrations and (eventually) queries and doesn't require to learn some goofy DSL
which will eventually fall behind the new features of Postgresql.

The alternative considered was relying on pg_dump with --schema-only argument
but it has pretty big drawbacks:

  • it requires a functioning and running database to work, which isn't going to always happen during development.
  • The output SQL is not a clean one, i.e. it includes system-specific data, which is irrelevant to the repo code.
    That means it has to be parsed and transformed which is not gonna happen.
  • Included symbols make sense in context of the command, but not in the context of repo. I.e. the repo doesn't need to know the user ownership of symbols. That means the repo has to store all symbol names somewhere in the first place. And that list has to be updated manually anyway.
  • The output SQL is written in a desugarized form, which is a dealbreaker since migrations are written in sugar
    and this disparity only makes it harder to understand them and the whole schema.