Friday, June 6, 2025

Unexpected Behavior: PL/pgSQL Block Ran Forever the First Time

I once encountered a curious situation where a particular DO $$ BEGIN ... END $$; block in PostgreSQL seemed to run forever the first time I executed it. But then, surprisingly, every subsequent execution was relatively fast and uneventful.

The same behavior repeated when I deployed to a new environment: the first run took ages, and then all future runs were fast again.


CREATE TABLE a (...);

DO $$
BEGIN
  INSERT INTO a ... -- many many rows, e.g. 1 million

  UPDATE a ...
  FROM (SELECT a ... JOIN b ... GROUP BY ...) AS c
  WHERE a.some_column = c.some_column;
END $$;
The solution was to add ANALYZE a; right after the INSERT INTO a statement, so the planner had up-to-date statistics for the freshly loaded table before planning the UPDATE.
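With the fix applied, the block looks like this (a sketch; the elided statements stay as in the original, and ANALYZE works here because, unlike VACUUM, it may run inside a transaction block):

```sql
DO $$
BEGIN
  INSERT INTO a ... -- many many rows, e.g. 1 million

  ANALYZE a; -- refresh planner statistics for the freshly loaded table

  UPDATE a ...
  FROM (SELECT a ... JOIN b ... GROUP BY ...) AS c
  WHERE a.some_column = c.some_column;
END $$;
```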

I was interested in seeing the query plan, and ChatGPT gave me an idea for displaying the output of EXPLAIN inside a DO ... END block.

DO $$
DECLARE
  q text;
  r text;
BEGIN
  INSERT INTO a ... -- many many rows, e.g. 1 million

  q := 'UPDATE a ...
  FROM (SELECT a ... JOIN b ... GROUP BY ...) AS c
  WHERE a.some_column = c.some_column';

  FOR r IN EXECUTE 'EXPLAIN ' || q
  LOOP
    RAISE NOTICE '%', r;
  END LOOP;
END $$;
The query plan for the first version

NOTICE:  Update on a  (cost=2355271.49..1300407104.89 rows=0 width=0)
NOTICE:    ->  Nested Loop  (cost=2355271.49..1300407104.89 rows=858585 width=86)
NOTICE:          Join Filter: (a.some_column = c.some_column)
NOTICE:          ->  Seq Scan on a  (cost=0.00..16320.85 rows=858585 width=14)
NOTICE:          ->  Materialize  (cost=2355271.49..1297825633.93 rows=200 width=104)
NOTICE:                ->  Subquery Scan on c  (cost=2355271.49..1297825632.93 rows=200 width=104)
i.e., for each row from a, the materialized subquery c is scanned in a nested loop, which is what made the first run so slow.

The query plan after ANALYZE


NOTICE:  Update on a  (cost=682681.45..725641.23 rows=0 width=0)
NOTICE:    ->  Hash Join  (cost=682681.45..725641.23 rows=200 width=86)
NOTICE:          Hash Cond: (a.some_column = c.some_column)
NOTICE:          ->  Seq Scan on a  (cost=0.00..37022.45 rows=2261645 width=14)
NOTICE:          ->  Hash  (cost=682678.95..682678.95 rows=200 width=104)
NOTICE:                ->  Subquery Scan on c  (cost=682674.45..682678.95 rows=200 width=104)

Friday, May 23, 2025

Cleaner HTTP Request Assertions with AssertJ

I switched to using AssertJ's fluent API with .satisfies() for verifying MockWebServer requests—and I love it. It keeps the code concise and readable, and best of all, I didn’t need to define an extra variable just to run my assertions.

Original (JUnit-style):

RecordedRequest request = mockWebServer.takeRequest();
assertEquals("/graph", request.getPath());
assertTrue(request.getBody().readUtf8().contains("mutation AddSomething"));

Improved (AssertJ-style):

assertThat(mockWebServer.takeRequest()).satisfies(request -> {
    assertThat(request.getPath()).isEqualTo("/graph");
    assertThat(request.getBody().readUtf8()).contains("mutation AddSomething");
});

Tuesday, May 20, 2025

Extracting First Names from full_name with LLaMA 3 and Ollama in Kotlin

Assume we have a users table containing the columns full_name, last_name, and first_name. In 10% of the cases, the first_name field is NULL, but we observe that the first name is embedded within the full_name field, just without a clear delimiter (e.g., 'DUMBRAVEANURADU'). Or we can make the problem more difficult by considering compound first names, which are also not separated by spaces from the last name (e.g., 'IONESCUANDREIMIHAIL', 'POPESCOANAMARIA', 'VASILESCUIONAELENA', 'TOMAIOANDANIEL', 'MOLDOVANMARIAGABRIELA').

Here is how I went about solving this problem.

1. Export all rows where first_name is NULL into a data.csv file (PgAdmin > Save results to file).

2. I started Ollama and loaded the LLaMA 3 model locally

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama pull llama3
3. I Created a Kotlin Project and Asked ChatGPT for Code Fragments
build.gradle.kts
dependencies {
    implementation("com.squareup.okhttp3:okhttp:4.12.0")
    implementation("com.fasterxml.jackson.module:jackson-module-kotlin:2.17.0")
    implementation("com.fasterxml.jackson.core:jackson-databind:2.17.0")
    testImplementation(kotlin("test"))
}

4. I Asked ChatGPT How to Construct the Prompt
Main.kt
import okhttp3.MediaType.Companion.toMediaType
import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.RequestBody.Companion.toRequestBody
import com.fasterxml.jackson.module.kotlin.jacksonObjectMapper
import com.fasterxml.jackson.module.kotlin.readValue
import java.io.File
import java.io.PrintWriter
import java.util.concurrent.TimeUnit

fun main() {
    val client = OkHttpClient.Builder()
        .connectTimeout(10, TimeUnit.SECONDS)  // Time to establish connection
        .readTimeout(30, TimeUnit.SECONDS)     // Time to wait for server response
        .writeTimeout(15, TimeUnit.SECONDS)    // Time to send request data
        .build()

    val mapper = jacksonObjectMapper()

    val inputStream = object {}.javaClass.getResourceAsStream("/data-1746619055858.csv")
        ?: error("File not found in resources")

    val writer = PrintWriter(File("output.csv"))

    writer.println(listOf("id", "full_name", "last_name", "first_name", "sql")
        .joinToString(",") { "\"$it\"" })

    inputStream.bufferedReader().useLines { lines ->
        lines
            .drop(1) // Skip the header line
            .take(100)
            .forEach { line ->
                val columns = line.split(",")
                    .map { col -> col.trim('"') } // Remove surrounding quotes (naive; breaks on embedded commas)

                val id = columns[0]
                val fullName = columns[1]
                
                println("ID: $id, Name: $fullName")

                val json = mapper.writeValueAsString(
                    mapOf(
                        "model" to "llama3",
                        "system" to """
                            The following string contains a full name in uppercase without clear delimiters between the last name(s) and the first name(s).
                            Your task is to analyze it and split it into two parts:
                            - lastName: which can include compound surnames
                            - firstName: which may also be a compound name
                            Try to detect where the first name begins even if it is attached directly to the last name (e.g., DUMBRAVEANURADU → DUMBRAVEANU + RADU).
                            Important: Return only a JSON object with the following format:
                            { "lastName": "...", "firstName": "..." }
                            Do not explain or add any other text. Only return valid JSON.

                            Examples:
                            - "DUMBRAVEANURADU" should be split into "lastName": "DUMBRAVEANU", "firstName": "RADU"
                            - "IONESCUANDREIMIHAIL" should be split into "lastName": "IONESCU", "firstName": "ANDREI-MIHAIL"
                            - "POPESCOANAMARIA" should be split into "lastName": "POPESCO", "firstName": "ANA-MARIA"
                        """.trimIndent(),
                        "prompt" to "Name: $fullName\nReturn JSON only.",
                        "stream" to false
                    )
                )
                

                val request = Request.Builder()
                    .url("http://localhost:11434/api/generate")
                    .post(json.toRequestBody("application/json".toMediaType()))
                    .build()

                client.newCall(request).execute().use { response ->
                    if (!response.isSuccessful) {
                        println("Unexpected response: $response")
                    } else {
                        val body = response.body?.string()
                        val parsed = mapper.readValue<Map<String, Any>>(body!!)
                        println(parsed["response"])
                        // The model's answer is itself a JSON string; parse it a second time.
                        val name = mapper.readValue<Map<String, Any>>(parsed["response"].toString())
                        val lastName = name["lastName"]
                        val firstName = name["firstName"]

                        val sql = """
                            UPDATE users
                            SET last_name = '$lastName',
                                first_name = '$firstName'
                            WHERE id = '$id' AND full_name = '$fullName';
                        """.trimIndent().replace("\n", " ")
                        writer.println(listOf(id, fullName, lastName, firstName, sql)
                            .joinToString(",") { "\"$it\"" })
                    }
                }
            }
    }

    writer.close()
}

5. Execute and Analyze the Output File
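Two details are worth hardening before trusting the generated output: the naive split(",") breaks on quoted fields that contain commas, and a name with an apostrophe (e.g. O'BRIEN) would break the generated UPDATE statement. A minimal stdlib sketch of both fixes (hypothetical helpers, shown in Java for brevity):

```java
import java.util.ArrayList;
import java.util.List;

public class CsvSqlHelpers {
    // Quote-aware split of a single CSV line; handles "" as an escaped quote.
    static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    field.append('"'); // escaped quote inside a quoted field
                    i++;
                } else {
                    inQuotes = !inQuotes;
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());
        return fields;
    }

    // Double single quotes so a value is safe inside a SQL string literal.
    static String escapeSqlLiteral(String value) {
        return value.replace("'", "''");
    }

    public static void main(String[] args) {
        System.out.println(splitCsvLine("\"1\",\"O'BRIEN,PATRICK\"")); // prints [1, O'BRIEN,PATRICK]
        System.out.println(escapeSqlLiteral("O'BRIEN")); // prints O''BRIEN
    }
}
```

For throwaway one-off scripts like this, a proper CSV library (e.g. opencsv) or parameterized statements would be the robust choice; the sketch just shows the minimum needed.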

Friday, May 16, 2025

Generating Structured Output with OpenAI Java SDK

When working with OpenAI's API responses in Java, it's often useful to request structured output instead of plain text. This enables better parsing, validation, and downstream processing. The OpenAI Java SDK supports this through JSON Schema-based formats.

Reference: StructuredOutputsExample.java (GitHub) 

https://github.com/openai/openai-java/blob/main/openai-java-example/src/main/java/com/openai/example/StructuredOutputsExample.java

ResponseFormatTextJsonSchemaConfig.Schema schema = ResponseFormatTextJsonSchemaConfig.Schema.builder()
        .putAdditionalProperty("type", JsonValue.from("object"))
        .putAdditionalProperty(
            "properties", JsonValue.from(Map.of("employees", Map.of("type", "array", "items", Map.of("type", "string")))))
        .putAdditionalProperty("required", JsonValue.from(List.of("employees")))
        .putAdditionalProperty("additionalProperties", JsonValue.from(false))
        .build();

createParams = createParams.toBuilder()
        .text(ResponseTextConfig.builder()
            .format(ResponseFormatTextJsonSchemaConfig.builder()
                .name("employee-list")
                .schema(schema)
                .build())
            .build())
        .build();

openAiClient.getClient().responses().create(createParams).output().stream()
        .flatMap(item -> item.message().stream())
        .flatMap(message -> message.content().stream())
        .flatMap(content -> content.outputText().stream())
        .forEach(outputText -> System.err.println(outputText.text()));
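For reference, the putAdditionalProperty calls above assemble the equivalent of this raw JSON Schema (sketched here as a Java text block; structured output requires a closed object schema, i.e. additionalProperties: false):

```java
public class SchemaSketch {
    // The JSON Schema that the builder calls construct, written out verbatim.
    static final String EMPLOYEE_LIST_SCHEMA = """
            {
              "type": "object",
              "properties": {
                "employees": { "type": "array", "items": { "type": "string" } }
              },
              "required": ["employees"],
              "additionalProperties": false
            }
            """;

    public static void main(String[] args) {
        System.out.println(EMPLOYEE_LIST_SCHEMA.contains("\"employees\"")); // prints "true"
    }
}
```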

Converting an Assistant-defined response format:

Assistant assistant = openAiClient.getClient().beta()
        .assistants()
        .retrieve(AssistantRetrieveParams.builder()
            .assistantId(openAiAssistantId)
            .build());

ResponseFormatJsonSchema.JsonSchema jsonSchema = assistant.responseFormat().get()
        .asResponseFormatJsonSchema()
        .jsonSchema();

ResponseFormatTextJsonSchemaConfig.Schema schema = ResponseFormatTextJsonSchemaConfig.Schema.builder()
        .additionalProperties(jsonSchema.schema().get()._additionalProperties())
        .build();

ResponseCreateParams createParams = ResponseCreateParams.builder()
        .input(input)
        .model(model)
        .text(ResponseTextConfig.builder()
            .format(ResponseFormatTextJsonSchemaConfig.builder()
                .name(jsonSchema.name())
                .schema(schema)
                .build())
            .build())
        .build();

Monday, May 5, 2025

Configure svenstaro/keycloak-http-webhook-provider for Keycloak 24.0.5

To configure svenstaro/keycloak-http-webhook-provider with Keycloak 24.0.5, I followed these steps:

Clone the repository: https://github.com/svenstaro/keycloak-http-webhook-provider

Create a Dockerfile in the root of the repository with the following content to build a custom Keycloak image that includes the webhook provider:

FROM quay.io/keycloak/keycloak:24.0.5-0

COPY target/keycloak_http_webhook_provider.jar /opt/keycloak/providers/

ENTRYPOINT ["/opt/keycloak/bin/kc.sh", "start-dev"]

Build the JAR (e.g., with mvn package) so that target/keycloak_http_webhook_provider.jar exists.

Build the Docker image:

docker buildx build --debug -t keycloak-http-webhook-provider .

Run the container with the required environment variables to configure the webhook:

docker run -e KC_SPI_EVENTS_LISTENER_HTTP_WEBHOOK_SERVER_URL=https://your-webhook-url \
           -e KC_SPI_EVENTS_LISTENER_HTTP_WEBHOOK_USERNAME=youruser \
           -e KC_SPI_EVENTS_LISTENER_HTTP_WEBHOOK_PASSWORD=yourpassword \
           -e KEYCLOAK_ADMIN=admin \
           -e KEYCLOAK_ADMIN_PASSWORD=admin \
           -p 8080:8080 keycloak-http-webhook-provider
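Keycloak derives these environment variable names from its configuration option keys: uppercase everything, replace dashes and dots with underscores, and prefix with KC_. A small sketch of that mapping (my own helper, assuming this documented convention):

```java
import java.util.Locale;

public class KcEnvName {
    // Convert a Keycloak SPI option key to its environment-variable form,
    // e.g. "spi-events-listener-http-webhook-server-url"
    //   -> "KC_SPI_EVENTS_LISTENER_HTTP_WEBHOOK_SERVER_URL"
    static String toEnvVar(String optionKey) {
        return "KC_" + optionKey.toUpperCase(Locale.ROOT)
                .replace('-', '_')
                .replace('.', '_');
    }

    public static void main(String[] args) {
        System.out.println(toEnvVar("spi-events-listener-http-webhook-server-url"));
    }
}
```

This is handy when translating a provider's documented CLI options (--spi-...) into docker run -e flags like the ones above.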

Monday, April 28, 2025

Short Example: Streaming OpenAI Chat Completions in Java Using WebClient

 

Here is a short example of how to stream OpenAI chat completions using Spring's WebClient and Flux in Java.

First, set up your WebClient:

 

private final WebClient webClient;

public YourClassName(String openAiBaseUrl, String openAiApiKey) {
    this.webClient = WebClient.builder()
        .baseUrl(openAiBaseUrl)
        .defaultHeader("Authorization", "Bearer " + openAiApiKey)
        .build();
}

Then, send a request to stream chat completions:

public Flux<ChatCompletionResponseChunk> streamChatCompletion(ChatCompletionRequest request) {
    return webClient.post()
        .uri("/chat/completions")
        .header("Accept", "text/event-stream")
        .contentType(MediaType.APPLICATION_JSON)
        .bodyValue(request)
        .retrieve()
        .bodyToFlux(ChatCompletionResponseChunk.class)
        .onErrorResume(error -> {
            // See https://hilla.dev/blog/ai-chatbot-in-java/calling-chatgpt-and-openai-apis-in-spring-boot/
            // The stream terminates with a `[DONE]` message, which causes a serialization error.
            // Ignore this error and return an empty stream instead.
            if (error.getMessage().contains("JsonToken.START_ARRAY")) {
                return Flux.empty();
            } else {
                // If the error is not caused by the `[DONE]` message, propagate the error.
                return Flux.error(error);
            }
        });
}
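Each streamed chunk typically carries only a small delta of the final message, so the consumer concatenates fragments in arrival order. A minimal stdlib sketch with a hypothetical chunk type (the real ChatCompletionResponseChunk would mirror OpenAI's choices[].delta.content structure):

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class StreamAssembly {
    // Hypothetical stand-in for the "delta" part of a streamed chunk.
    record Chunk(String deltaContent) {}

    // Concatenate the content fragments of all chunks, skipping empty deltas
    // (e.g. the initial role-only chunk has no content).
    static String assemble(List<Chunk> chunks) {
        return chunks.stream()
                .map(Chunk::deltaContent)
                .filter(Objects::nonNull)
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        List<Chunk> chunks = List.of(new Chunk("Hel"), new Chunk("lo"), new Chunk("!"));
        System.out.println(assemble(chunks)); // prints "Hello!"
    }
}
```

In the reactive version above, the same idea is a scan or reduce over the Flux of chunks.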

Saturday, March 29, 2025

Monitoring HikariCP Metrics Without Micrometer or OpenTelemetry

You can quickly monitor HikariCP connection pool metrics without enabling Micrometer or OpenTelemetry. Just follow these two simple steps:

1. Enable Debug Logging in logback-spring.xml. Add one of the following logging configurations: 

For general Hikari pool logging:

<logger name="com.zaxxer.hikari.pool.HikariPool" level="DEBUG" />

For more detailed logging:

<logger name="com.zaxxer.hikari" level="DEBUG" />

2. Set the Housekeeping Interval via a VM Option. The following JVM option adjusts the logging frequency (the default is 30 seconds; this reduces it to 5):

-Dcom.zaxxer.hikari.housekeeping.periodMs=5000

Example Log Output 

Once enabled, you’ll start seeing logs like this:

2025-03-29 16:30:07,869 DEBUG c.z.h.p.HikariPool[HikariPool-1 housekeeper] - HikariPool-1 - Pool stats (total=2, active=0, idle=2, waiting=0)

Breakdown of Pool Stats: 

  • total=2 - The total number of connections in the pool (both active and idle). 
  • active=0 - The number of connections currently in use (0 means no connections are being used). 
  • idle=2 - The number of available (idle) connections that are not in use. 
  • waiting=0 - The number of threads waiting for a connection (0 means no requests are waiting). 

By enabling this, you get real-time insights into your connection pool without additional monitoring tools.
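If you want to chart these numbers over time, the stats line is easy to parse. A stdlib sketch (the regex assumes the exact "Pool stats" format shown above):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PoolStatsParser {
    // Matches e.g.: "Pool stats (total=2, active=0, idle=2, waiting=0)"
    private static final Pattern STATS = Pattern.compile(
            "Pool stats \\(total=(\\d+), active=(\\d+), idle=(\\d+), waiting=(\\d+)\\)");

    record PoolStats(int total, int active, int idle, int waiting) {}

    static PoolStats parse(String logLine) {
        Matcher m = STATS.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("No pool stats in: " + logLine);
        }
        return new PoolStats(
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)),
                Integer.parseInt(m.group(4)));
    }

    public static void main(String[] args) {
        String line = "HikariPool-1 - Pool stats (total=2, active=0, idle=2, waiting=0)";
        PoolStats s = parse(line);
        // Sanity check: total should equal active + idle.
        System.out.println(s.total() == s.active() + s.idle()); // prints "true"
    }
}
```

A quick script feeding such parsed values into a spreadsheet already reveals pool exhaustion (waiting > 0) or oversizing (idle always high).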