I once encountered a curious situation where a particular DO $$ BEGIN ... END $$; block in PostgreSQL executed forever the first time I ran it. But then — surprisingly — every subsequent execution was relatively fast and uneventful.
The same behavior repeated when I deployed to a new environment: the first run took ages, and then all future runs were fast again.
CREATE TABLE a (...);
DO BEGIN $$
INSERT INTO a ... -- many many rows, e.g. 1 million
UPDATE a ...
FROM (SELECT a ... JOIN b ... GROUB BY ...) AS c
WHERE a.some_column=c.some_column;
END $$;
The solution was to add : ANALYZE a; right after the INSERT INTO a statement.I was interested in seeing the query plan, and I got an idea from ChatGPT on how to display the result of EXPLAIN inside a DO BEGIN END block.
DO DECLARE q text; r text; BEGIN $$ INSERT INTO a ... -- many many rows, e.g. 1 million q = 'UPDATE a ... FROM (SELECT a ... JOIN b ... GROUB BY ...) AS c WHERE a.some_column=c.some_column'; FOR r IN EXECUTE 'EXPLAIN ' || q LOOP RAISE NOTICE '%', r; END LOOP; END $$;The query plan for the first version
NOTICE: Update on a (cost=2355271.49..1300407104.89 rows=0 width=0) NOTICE: -> Nested Loop (cost=2355271.49..1300407104.89 rows=858585 width=86) NOTICE: Join Filter: (a.some_column = c.some_column) NOTICE: -> Seq Scan on a (cost=0.00..16320.85 rows=858585 width=14) NOTICE: -> Materialize (cost=2355271.49..1297825633.93 rows=200 width=104) NOTICE: -> Subquery Scan on c (cost=2355271.49..1297825632.93 rows=200 width=104)
i.e for each row form a there will be a loop in c.
The query plan after ANALYZE
NOTICE: Update on a (cost=682681.45..725641.23 rows=0 width=0) NOTICE: -> Hash Join (cost=682681.45..725641.23 rows=200 width=86) NOTICE: Hash Cond: (a.some_column = c.some_column) NOTICE: -> Seq Scan on a (cost=0.00..37022.45 rows=2261645 width=14) NOTICE: -> Hash (cost=682678.95..682678.95 rows=200 width=104) NOTICE: -> Subquery Scan on c (cost=682674.45..682678.95 rows=200 width=104)
Niciun comentariu:
Trimiteți un comentariu