vineri, 12 decembrie 2025

 <optimization>

Assume you have several Docker containers running PostgreSQL — for example, postgres1, postgres2, and so on — and you want to dump their databases and copy them to another server. Doing this manually for each container quickly becomes tedious. Here’s a streamlined approach I use.

Instead of dumping each container manually, combine docker exec with xargs to iterate over them:

docker ps --filter "name=postgres" --format "{{.Names}}" | xargs -I {} sh -c 'docker exec {} pg_dumpall -U postgres > {}.sql'

(The sh -c wrapper matters: without it, the shell processes the redirection before xargs substitutes {}, so every dump would land in a single literal file named {}.sql.)

You can then use xargs with scp to transfer all dumps efficiently:

ls postgres*.sql | xargs -I {} scp {} user@remote-server:/path/to/dumps/
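The redirection subtlety in the dump command is easy to demonstrate without Docker. A minimal sketch, using printf as a stand-in for `docker exec ... pg_dumpall`:

```shell
cd "$(mktemp -d)"

# Naive form: the parent shell opens "{}.sql" BEFORE xargs substitutes {},
# so both "dumps" land in one literal file named {}.sql.
printf 'pg1\npg2\n' | xargs -I {} printf 'dump of %s\n' {} > '{}.sql'

# Correct form: the redirection happens inside sh -c, after substitution,
# producing one file per "container": pg1.sql and pg2.sql.
printf 'pg1\npg2\n' | xargs -I {} sh -c 'printf "dump of %s\n" {} > {}.sql'
```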

 

How I tracked down a thread block that looked like a database problem

 <method>

Recently I had a web application that behaved as if it were quietly freezing from the inside out. Threads were piling up, each one waiting forever, never terminating, like cars stuck behind an invisible traffic light that never turns green.

At first glance, the symptom looked straightforward: the threads were blocked at the database level. They were trying to update a table, couldn’t acquire a lock, and ended up waiting indefinitely. Because only operations related to a single element were stuck, I assumed it wasn’t a full table lock but a row lock. Unfortunately, I couldn’t confirm this, because by the time I saw the issue, the application had already been restarted, clearing all locks.

Still, the database block felt like a consequence, not the root cause. My job was to find the first thread that got stuck. That original thread triggered the row lock; all other threads simply joined the queue behind it.

So I needed a way to reconstruct the story from logs alone.

My method was simple:

  1. Export all logs from Graylog for the timeframe into a CSV.
  2. Instead of analysing it with Python/Pandas, I chose a quicker path: upload the CSV into a Postgres table with matching columns.
  3. Query the table to find the last time each thread wrote a log line. If a thread is blocked, you stop seeing it. And there it was: the first thread that went silent.
  4. Check which endpoint it called and with what parameters. A second voilà: the request was trying to move a folder under itself.
  5. In parallel I checked the endpoint code with Cursor, and it clearly showed several recursive branches that could loop forever.
  6. Reproduce: call the endpoint with parameters that move a folder under itself. Instant block.
  7. Fix: add validation to prevent this. 
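Step 3 above boils down to a single GROUP BY. A minimal sketch of the idea, using Python's built-in sqlite3 as a stand-in for the Postgres table (the column names `thread` and `ts` are illustrative assumptions; a real Graylog CSV export has its own):

```python
import sqlite3

# In-memory stand-in for the Postgres table loaded from the Graylog CSV.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (thread TEXT, ts TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?)",
    [
        ("worker-1", "2025-11-10 10:00:01", "handling request"),
        ("worker-1", "2025-11-10 10:00:05", "still alive"),
        ("worker-2", "2025-11-10 10:00:02", "handling request"),
        # worker-2 goes silent here: it is the first blocked thread.
        ("worker-3", "2025-11-10 10:00:03", "handling request"),
        ("worker-3", "2025-11-10 10:00:09", "still alive"),
    ],
)

# Last log line per thread; the earliest "last line" belongs to the first
# thread that went silent, i.e. the prime suspect for the original lock.
rows = conn.execute(
    "SELECT thread, MAX(ts) AS last_seen FROM logs "
    "GROUP BY thread ORDER BY last_seen"
).fetchall()
print(rows[0][0])  # → worker-2, the first thread to go silent
```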

The Moment of Contradiction: The Point Where Learning Begins

Necessity... if you have a need, you want to move, and you learn.
If you have no need, you don't want to move from the state you are in, and you don't learn.

It is one thing to talk about something, but when you practice it you reach the moment of contradiction, when what you know no longer helps you solve the problem.
The first reaction is to get annoyed.
But if you keep going, you force yourself to change, to make a modification or to accumulate something new.

When Learning Jumps Levels: Rethinking the Path from Simple to Advanced

I often say that complex topics need to be approached step by step, so that the reader understands how we arrived there: from simple to complex, following one of the core principles of education, where sequencing matters.
And indeed, the order does matter. But it seems to me that some concepts, even though they are more advanced than earlier ones and should logically come later, actually mark the start of a different stage of thinking and could therefore deserve to be introduced first.

I often think about this: when a problem appears, the first solution that comes to mind is usually the trivial one. But sometimes it would be better if the advanced, optimal solution came to mind first.

***

 From https://youtu.be/X1HSGEADAhE?t=1573 : 

***

We even teach student programmers in their first years, when they still don’t understand what programming is and are seeing a computer for the first time. Many of them we teach to program in C or C++.

And then, once they’ve learned C or C++, a year or two passes, and in their third or fourth year, or even later when they’ve already become programmers, we start retraining or further educating them in proper design.

We tell them:
"You know, objects are actually certain entities, certain abstractions inside your program. And these objects should have the property of encapsulation. They should hide information."

The programmer—well, the C-style programmer—looks at this and it’s all new to them.
"What do you mean, hide information? How is it that they should encapsulate data and not expose it?"

Well, in C, everything is exposed. In C, there’s global state basically everywhere.
Sure, it’s not encouraged, but people still do it. Global variables are a common practice.

And suddenly, in their third or fourth year—or even in their fifteenth year of working as a programmer—they’re told that everything should have been done differently in OOP. And that’s the problem.

So Richard proposes:
"Let’s introduce them to OOP in the first year—before we even teach them how to program, before they write their first program. Let’s explain what OOP is, why objects are needed, what encapsulation and polymorphism are."

He believes they can understand it—even if it’s with sticks and pictures, with some auxiliary or educational languages.

***
 

joi, 13 noiembrie 2025

Avoiding the Planning Trap: Why N×M is a Planning Disaster (The Cartesian Product of Tasks)

I’ve been thinking a lot about planning lately. I realized I’ve fallen into the same trap many times, and understanding the root cause has made all the difference.

My first impulse was always to over-organize. I’d create a neat system:

  •  I'd create a separate folder and file for each family of tasks (e.g., "Work," "Personal," "Learning"). It felt great at the beginning—so structured! But after a few days, it became a hassle.
  •  The same feeling occurred when I tried building a complex goalscape, with first-level goals for task families, and then another level of goals for each individual task.

In both cases, I was creating a system that demanded too much energy just to maintain. 

*** 

I was unknowingly introducing an unnecessary dimension to my planning.

When I used multiple files, it felt like I was generating a final complexity of N x M, where N is the number of files (or task families) and M is the total number of tasks. 

*** 

Then, I tried the opposite: one single file.

In this file, I created a section for each task family, and a simple Table of Contents (ToC) at the beginning. Each task lives in its relevant section.

Immediately, the system simplified. By removing the file-switching dimension, the complexity was converted to a simple line: M, the total number of tasks. 

***

This is exactly like when I analyze code.

If I am analyzing a method x() with N lines, and it calls another method y() with M lines, it’s easy to get lost by constantly jumping: analyze a line in x(), then go deep into y(), then jump back to x(), and so on.
 

The better approach is to stay at the level of x(), or to finish analyzing y() completely before returning; either way, avoid the inefficient back-and-forth.

 

duminică, 19 octombrie 2025

PostgreSQL: ERROR: missing chunk number 0 for toast value 123456 in pg_toast_2619

Problem: After database migration

SELECT * FROM <table_name>

fails with the following message:

ERROR: missing chunk number 0 for toast value 123456 in pg_toast_2619


At first, I found https://gist.github.com/supix/80f9a6111dc954cf38ee99b9dedf187a and https://newbiedba.wordpress.com/2015/07/07/postgresql-missing-chunk-0-for-toast-value-in-pg_toast/, but I could not isolate the corrupted row: every query of the following form
SELECT * FROM <table_name> ORDER BY id LIMIT 5000 OFFSET 0;
SELECT * FROM <table_name> ORDER BY id LIMIT 5000 offset 5000;
... 
even
SELECT * FROM <table_name> ORDER BY id LIMIT 1;
failed with the same error message.

A second article (see https://fluca1978.github.io/2021/02/08/PostgreSQLToastCorruption), together with a set of Postgres functions for detecting corrupted TOAST entries (see https://gitlab.com/fluca1978/fluca1978-pg-utils/-/blob/master/examples/toast/find_bad_toast.sql), showed me how to find the table backed by `pg_toast_2619`, since it was already clear that the problem was not in my <table_name>:
SELECT reltoastrelid, reltoastrelid::regclass, pg_relation_filepath( reltoastrelid::regclass ), relname
FROM   pg_class
WHERE  reltoastrelid = 'pg_toast.pg_toast_2619'::regclass
  AND  relkind = 'r';
The result was pg_statistic, and I ran the inverse query to double-check:
SELECT reltoastrelid, reltoastrelid::regclass, pg_relation_filepath( reltoastrelid::regclass ), relname
FROM   pg_class
WHERE  relname = 'pg_statistic'
  AND  relkind = 'r';
the result was pg_toast_2619. Hence the problem was not in <table_name> but in pg_statistic, more precisely in the TOAST table of pg_statistic. But why did SELECT * FROM <table_name> trigger this error? Because Postgres reads pg_statistic to plan the query. Meanwhile, a colleague of mine
found https://www.ongres.com/blog/solving-postgresql-statistic-corruption/. This article shows how to deal with such cases and also notes that pg_toast_2619 is effectively constant across Postgres installations (pg_statistic is a system catalog with a fixed OID, 2619).
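For reference, the approach in that article rests on the fact that pg_statistic is derived data: ANALYZE rebuilds it. A blunt sketch of the idea (the article shows a more targeted variant that deletes only the affected rows; this must be run as a superuser, and a backup first is wise):

```sql
-- pg_statistic is rebuilt by ANALYZE, so its corrupted rows can be deleted.
DELETE FROM pg_statistic;

-- Regenerate planner statistics for the whole database.
ANALYZE;
```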

What I learned from this problem:
  1. How to find a table by its corresponding TOAST table.
  2. There are hidden actions behind a simple SELECT * FROM <table_name>: PostgreSQL may first read another table (here, pg_statistic) before reading `<table_name>` itself. Not everything is as it seems at first sight.

PostgreSQL: ERROR: Key (id)=(4703) is duplicated. could not create unique index ""

Problem: After database migration
REINDEX TABLE <table_name>;
fails with the following message:

ERROR:  Key (id)=(4703) is duplicated. could not create unique index "<table_name>"

It's very strange because a simple
SELECT * FROM <table_name> WHERE id = 4703;
returns only one row. So where does REINDEX find the second (or more) row with id = 4703?

After searching Google for "Key (id)=(4703) is duplicated. could not create unique index", I found
https://stackoverflow.com/questions/45315510/postgres-could-not-create-unique-index-key-is-duplicated,
which inspired me to try
BEGIN;
SET LOCAL enable_indexscan = off;
SET LOCAL enable_bitmapscan = off;
SET LOCAL enable_indexonlyscan = off;
SELECT * FROM <table_name> WHERE id = 4703;
COMMIT;
(Note: SET LOCAL only takes effect inside a transaction block, hence the BEGIN/COMMIT.)
This query returned two rows. To understand the difference in behavior, run both queries with EXPLAIN (i.e. EXPLAIN SELECT * FROM <table_name> ...): the first query uses the index to locate the row, and the second row is simply not present in the index, which is why we don't see it. The second query scans the table physically and therefore finds both rows.

Why there were two rows with the same id is another question. My assumption is that one row was marked as deleted in the source database, the data was corrupted somewhere along the way, and the row reappeared as non-deleted in the destination database. The database was migrated byte by byte, not via an .sql export/import.
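I did not need to clean this up myself, but for completeness, a sketch of how such a phantom row could be removed before reindexing. This is an untested assumption on my part: which copy to keep needs inspection first (the sketch keeps the one with the smallest ctid), and the planner settings force the physical scan so the DELETE actually sees both rows:

```sql
BEGIN;
-- Force a physical scan so the DELETE sees rows missing from the index.
SET LOCAL enable_indexscan = off;
SET LOCAL enable_bitmapscan = off;
SET LOCAL enable_indexonlyscan = off;

-- Keep one physical copy (smallest ctid), delete the other(s).
DELETE FROM <table_name>
 WHERE id = 4703
   AND ctid <> (SELECT MIN(ctid) FROM <table_name> WHERE id = 4703);

-- Inspect the result, then COMMIT (or ROLLBACK) and re-run REINDEX.
COMMIT;
```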

What I learned from this problem:

  1. How to force Postgres not to use (or to use) a table's index for a specific query.
  2. A SELECT * FROM <table_name> is, in most cases, not a purely physical read of `<table_name>`; more generally, in Postgres, not everything is as it seems.