Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: PL/pgSQL to convert between bytea to crockford base32 #6

Open
coolaj86 opened this issue Sep 19, 2024 · 0 comments
Open

doc: PL/pgSQL to convert between bytea to crockford base32 #6

coolaj86 opened this issue Sep 19, 2024 · 0 comments

Comments

@coolaj86
Copy link
Contributor

coolaj86 commented Sep 19, 2024

My ultimate goal is to convert between UUIDv4 and UUIDv7 to a more friendly prefixed base32 (like what Stripe and many other API providers do).

There may be minor tweaks that could be done (arrays vs substrings vs maps) that could yield significant gains for different use cases (ours is just short 128-bit IDs, not large blobs).

PL/pgSQL function to encode bytea to crockford base32

CREATE
OR REPLACE function bytea_to_crockford_base32 (input_bytea bytea) returns TEXT language plpgsql AS $$
DECLARE
    crockford_chars TEXT := '0123456789abcdefghjkmnpqrstvwxyz';
    input_len INT := LENGTH(input_bytea);
    result TEXT := '';
    byte INTEGER;
    bits INT := 0;
    value INTEGER := 0;
    buffer TEXT[] := '{}';  -- Use an array to accumulate characters
BEGIN
    FOR i IN 0..input_len-1 LOOP
        -- Get each byte and process its 8 bits
        byte := get_byte(input_bytea, i);
        bits := bits + 8;
        value := (value << 8) + byte;

        -- While we have at least 5 bits, map it to a Crockford character
        WHILE bits >= 5 LOOP
            bits := bits - 5;
            buffer := array_append(buffer, substr(crockford_chars, 1 + ((value >> bits) & 31), 1));
        END LOOP;
    END LOOP;

    -- Handle any remaining bits (pad to 5 bits if necessary)
    IF bits > 0 THEN
        buffer := array_append(buffer, substr(crockford_chars, 1 + ((value << (5 - bits)) & 31), 1));
    END IF;

    -- Join all characters in the buffer into a single result string
    result := array_to_string(buffer, '');

    RETURN result;
END;
$$
;

PL/pgSQL function to decode crockford base32 to bytea

CREATE
OR REPLACE function crockford_base32_to_bytea (input_text TEXT) returns bytea AS $$
DECLARE
    decoded BYTEA := '\x';  -- Initialize as an empty bytea string.
    crockford_map TEXT := '0123456789abcdefghjkmnpqrstvwxyz';  -- Crockford Base32 Alphabet (without I, L, O, and U)
    bits_accum INTEGER := 0;  -- Accumulates the bits as we decode.
    bits_count INTEGER := 0;  -- Keeps track of how many bits we have accumulated.
    ch CHAR;
    value INT;
BEGIN
    -- Loop through each character in the input string.
    FOR i IN 1..length(input_text) LOOP
        ch := lower(substr(input_text, i, 1));  -- Extract and uppercase the character.
        -- Get the corresponding value from the crockford_map.
        value := position(ch IN crockford_map) - 1;

        IF value = -1 THEN
            RAISE EXCEPTION 'Invalid character in base32 string: %', ch;
        END IF;

        -- Accumulate the value's bits into the accumulator.
        bits_accum := (bits_accum << 5) | value;
        bits_count := bits_count + 5;

        -- If we've accumulated at least 8 bits, extract a byte.
        WHILE bits_count >= 8 LOOP
            decoded := decoded || chr((bits_accum >> (bits_count - 8)) & 255)::BYTEA;
            bits_count := bits_count - 8;
        END LOOP;
    END LOOP;

    -- If there are leftover bits that don't form a full byte, ignore them (padding).
    RETURN decoded;
END;
$$ language plpgsql
;

Alternate bytea_to_crockford_base32

This is probably not as performant as the other, but it also works.

CREATE
OR REPLACE function bytea_to_crockford_base32_naive (input_bytea bytea) returns TEXT language plpgsql AS $$
DECLARE
    crockford_chars TEXT[] := '{0,1,2,3,4,5,6,7,8,9,a,b,c,d,e,f,g,h,j,k,m,n,p,q,r,s,t,v,w,x,y,z}';
    input_len INT;
    result TEXT := '';
    byte INTEGER;
    bits INT := 0;
    value INTEGER := 0;
BEGIN
    input_len := LENGTH(input_bytea);

    FOR i IN 1..input_len LOOP
        -- Get each byte and process its 8 bits
        byte := get_byte(input_bytea, i - 1);
        bits := bits + 8;
        value := (value << 8) + byte;

        -- While we have at least 5 bits, map it to a Crockford character
        WHILE bits >= 5 LOOP
            bits := bits - 5;
            result := result || crockford_chars[1 + ((value >> bits) & 31)];
        END LOOP;
    END LOOP;

    -- Handle any remaining bits (pad to 5 bits if necessary)
    IF bits > 0 THEN
        result := result || crockford_chars[1 + ((value << (5 - bits)) & 31)];
    END IF;

    RETURN result;
END;
$$
;

License

The code snippets and documentation in this issue are licensed under the CC0-1.0 (Public Domain), as follows:

PostgreSQL Crockford Base32 - convert between bytea and crockford base32

Authored in 2024 by AJ ONeal aj@therootcompany.com
To the extent possible under law, the author(s) have dedicated all copyright
and related and neighboring rights to this software to the public domain
worldwide. This software is distributed without any warranty.

You should have received a copy of the CC0 Public Domain Dedication along with
this software. If not, see https://creativecommons.org/publicdomain/zero/1.0/

@coolaj86 coolaj86 changed the title add PL/pgSQL function to convert between bytea to crockford base32 doc: PL/pgSQL to convert between bytea to crockford base32 Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant