GOTOphobia considered harmful (in C)

2023-02-26T00:00:00+00:00

Everybody and their grandpa knows (the meme title of) Dijkstra's Letters to the editor: go to statement considered harmful (submitted under the title: A case against the goto statement), but most forget the context of the 60s in which it was written: things we take for granted today were a novelty back then.

A lot of programmers learnt the craft in a world where goto was the main method of flow control; even in structured languages it was easy for them to fall back on the bad habits and techniques they had learned. Today, on the other hand, we have the very opposite situation: programmers avoid goto even when it's appropriate and abuse other constructs instead, which ironically only makes the code less readable. They overfocus on the WHAT ("remove goto") rather than the WHY ("because it improves readability and maintainability").

Academic teachers parroting "goto evil" while not really understanding the language they teach only worsen the matter [speaking from experience]. Because who needs to learn good practices and discipline, right? It's obviously better to just ignore the topic entirely and let the students later wonder why they get attacked by velociraptors.

A "goto" is not, in and of itself, dangerous – it is a language feature, one that directly translates to the jump instructions implemented in machine code. Like pointers, operator overloading, and a host of other "perceived" evils in programming, "goto" is widely hated by those who've been bitten by poor programming. Bad code is the product of bad programmers; in my experience, a poor programmer will write a poor program, regardless of the availability of "goto."

If you think people can't write spaghetti code in a "goto-less" language, I can send you some lovely examples to disabuse you of that notion. ;)

Used over short distances with well-documented labels, a "goto" can be more effective, faster, and cleaner than a series of complex flags or other constructs. The "goto" may also be safer and more intuitive than the alternative. A "break" is a goto; a "continue" is a "goto" – these are statements that move the point of execution explicitly.

~ Scott Robert Ladd

The Linux kernel is one thing, but if even a coding standard as restrictive as MISRA C (2012 edition) can downgrade the prohibition on goto from required to advisory, I think we can safely use goto in regular code, as long as we do so judiciously. Thus I want to present some situations and patterns where goto could be an acceptable (perhaps the best?) choice and where you might want to consider using it.

Error/exception handling & cleanup
Restart/retry
- goto-less alternative: loop
- Less trivial example
  - goto version
  - goto-less version
Common code in switch statement
Nested break, labeled continue
Simple state machines
Jumping into event loop
- goto-less alternative 1: guard flag
- goto-less alternative 2: code duplication
Optimizations
Structured Programming with go to Statements

Error/exception handling & cleanup

The poster child of using goto – accepted most of the time, often recommended, sometimes even straight up mandated. This pattern results in good-quality code, because the operations of the algorithm are structured in a clear order, while errors and other overhead are handled somewhere else, outside the mainline. The alternatives make the code less readable, as it's hard to spot where the main code is buried among the error checks.

From SEI CERT C Coding Standard:

Many functions require the allocation of multiple resources. Failing and returning somewhere in the middle of this function without freeing all of the allocated resources could produce a memory leak. It is a common error to forget to free one (or all) of the resources in this manner, so a goto chain is the simplest and cleanest way to organize exits while preserving the order of freed resources.

int* foo(int bar)
{
    int* return_value = NULL;

    if (!do_something(bar)) {
        goto error_1;
    }
    if (!init_stuff(bar)) {
        goto error_2;
    }
    if (!prepare_stuff(bar)) {
        goto error_3;
    }
    return_value = do_the_thing(bar);

error_3:
    cleanup_3();
error_2:
    cleanup_2();
error_1:
    cleanup_1();

    return return_value;
}
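To try the chain out with real resources, here is a minimal runnable sketch using only the standard library. The helper name first_byte is made up for illustration. Unlike foo above, each label here undoes only the steps that already succeeded, which is the arrangement the CERT quote describes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns the first byte of the file at `path`,
 * or -1 if any step fails. */
int first_byte(const char *path)
{
    int result = -1;
    FILE *fp;
    unsigned char *buf;

    fp = fopen(path, "rb");
    if (fp == NULL) {
        goto out;            /* nothing acquired yet */
    }

    buf = malloc(1);
    if (buf == NULL) {
        goto close_file;     /* only the file needs releasing */
    }

    if (fread(buf, 1, 1, fp) != 1) {
        goto free_buf;       /* empty file or read error */
    }
    result = buf[0];

    /* success falls through the same cleanup chain */
free_buf:
    free(buf);
close_file:
    fclose(fp);
out:
    return result;
}
```

The mainline reads top to bottom, and the release order is visible in one place as the exact reverse of the acquisition order.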

A real-life example, picked at random from the Linux kernel:

// SPDX-License-Identifier: GPL-2.0-or-later
/*
 * MMP Audio Clock Controller driver
 *
 * Copyright (C) 2020 Lubomir Rintel <lkundrak@v3.sk>
 */

static int mmp2_audio_clk_probe(struct platform_device *pdev)
{
	struct mmp2_audio_clk *priv;
	int ret;

	priv = devm_kzalloc(&pdev->dev,
			    struct_size(priv, clk_data.hws,
					MMP2_CLK_AUDIO_NR_CLKS),
			    GFP_KERNEL);
	if (!priv)
		return -ENOMEM;

	spin_lock_init(&priv->lock);
	platform_set_drvdata(pdev, priv);

	priv->mmio_base = devm_platform_ioremap_resource(pdev, 0);
	if (IS_ERR(priv->mmio_base))
		return PTR_ERR(priv->mmio_base);

	pm_runtime_enable(&pdev->dev);
	ret = pm_clk_create(&pdev->dev);
	if (ret)
		goto disable_pm_runtime;

	ret = pm_clk_add(&pdev->dev, "audio");
	if (ret)
		goto destroy_pm_clk;

	ret = register_clocks(priv, &pdev->dev);
	if (ret)
		goto destroy_pm_clk;

	return 0;

destroy_pm_clk:
	pm_clk_destroy(&pdev->dev);
disable_pm_runtime:
	pm_runtime_disable(&pdev->dev);

	return ret;
}

`goto`-less alternative 1: nested `if`s

Drawbacks:

- nesting (arrow anti-pattern)
- potentially duplicated code (see example function from Linux)

int* foo(int bar)
{
    int* return_value = NULL;

    if (do_something(bar)) {
        if (init_stuff(bar)) {
            if (prepare_stuff(bar)) {
                return_value = do_the_thing(bar);
            }
            cleanup_3();
        }
        cleanup_2();
    }
    cleanup_1();

    return return_value;
}

Example from Linux kernel rewritten

static int mmp2_audio_clk_probe(struct platform_device *pdev)
{
    // ...
    pm_runtime_enable(&pdev->dev);

    ret = pm_clk_create(&pdev->dev);
    if (!ret) {
        ret = pm_clk_add(&pdev->dev, "audio");
        if (!ret) {
            ret = register_clocks(priv, &pdev->dev);
            if (ret) { // tear down only if register_clocks() failed
                pm_clk_destroy(&pdev->dev);
                pm_runtime_disable(&pdev->dev);
            }
        } else {
            pm_clk_destroy(&pdev->dev);
            pm_runtime_disable(&pdev->dev);
        }
    } else {
        pm_runtime_disable(&pdev->dev);
    }

    return ret; // original was returning 0 explicitly
}

And here Microsoft provides us with a lovely example of such "beautiful" nesting (archived version).

`goto`-less alternative 2: if not then clean

Drawbacks:

- duplicated code
- multiple exit points

int* foo(int bar)
{
    int* return_value = NULL;

    if (!do_something(bar)) {
        cleanup_1();
        return return_value;
    }
    if (!init_stuff(bar)) {
        cleanup_2();
        cleanup_1();
        return return_value;
    }
    if (!prepare_stuff(bar)) {
        cleanup_3();
        cleanup_2();
        cleanup_1();
        return return_value;
    }

    // do the thing before cleaning up, as in the goto version
    return_value = do_the_thing(bar);

    cleanup_3();
    cleanup_2();
    cleanup_1();

    return return_value;
}

Example from Linux kernel rewritten

static int mmp2_audio_clk_probe(struct platform_device *pdev)
{
    // ...
    pm_runtime_enable(&pdev->dev);

    ret = pm_clk_create(&pdev->dev);
    if (ret) {
        pm_runtime_disable(&pdev->dev);
        return ret;
    }

    ret = pm_clk_add(&pdev->dev, "audio");
    if (ret) {
        pm_clk_destroy(&pdev->dev);
        pm_runtime_disable(&pdev->dev);
        return ret;
    }

    ret = register_clocks(priv, &pdev->dev);
    if (ret) {
        pm_clk_destroy(&pdev->dev);
        pm_runtime_disable(&pdev->dev);
        return ret;
    }

    return 0;
}

`goto`-less alternative 3: flags

Drawbacks:

- additional variables
- "cascading" booleans
- potential nesting
- potentially complicated boolean expressions

int* foo(int bar)
{
    int* return_value = NULL;

    bool flag_1 = false;
    bool flag_2 = false;
    bool flag_3 = false;

    flag_1 = do_something(bar);
    if (flag_1) {
        flag_2 = init_stuff(bar);
    }
    if (flag_2) {
        flag_3 = prepare_stuff(bar);
    }
    if (flag_3) {
        return_value = do_the_thing(bar);
    }

    if (flag_3) {
        cleanup_3();
    }
    if (flag_2) {
        cleanup_2();
    }
    if (flag_1) {
        cleanup_1();
    }

    return return_value;
}

`goto`-less alternative 3.5: so-far-ok flag

int foo(int bar)
{
    int return_value = 0;
    bool something_done = false;
    bool stuff_inited = false;
    bool stuff_prepared = false;
    bool oksofar = true;

    if (oksofar) {  // this IF is optional (always execs) but included for consistency
        if (do_something(bar)) {
            something_done = true;
        } else {
            oksofar = false;
        }
    }

    if (oksofar) {
        if (init_stuff(bar)) {
            stuff_inited = true;
        } else {
            oksofar = false;
        }
    }

    if (oksofar) {
        if (prepare_stuff(bar)) {
            stuff_prepared = true;
        } else {
            oksofar = false;
        }
    }

    // Do the thing
    if (oksofar) {
        return_value = do_the_thing(bar);
    }

    // Clean up
    if (stuff_prepared) {
        cleanup_3();
    }
    if (stuff_inited) {
        cleanup_2();
    }
    if (something_done) {
        cleanup_1();
    }

    return return_value;
}

Example from Linux kernel rewritten

static int mmp2_audio_clk_probe(struct platform_device *pdev)
{
    // ...
    pm_runtime_enable(&pdev->dev);

    bool destroy_pm_clk = false;

    ret = pm_clk_create(&pdev->dev);
    if (!ret) {
        ret = pm_clk_add(&pdev->dev, "audio");
        if (ret) {
            destroy_pm_clk = true;
        }
    }
    if (!ret) {
        ret = register_clocks(priv, &pdev->dev);
        if (ret) {
            destroy_pm_clk = true;
        }
    }

    if (ret) {
        if (destroy_pm_clk) {
            pm_clk_destroy(&pdev->dev);
        }
        pm_runtime_disable(&pdev->dev);
        return ret;
    }

    return 0;
}

Example from Linux kernel rewritten

static int mmp2_audio_clk_probe(struct platform_device *pdev)
{
    // ...
    pm_runtime_enable(&pdev->dev);

    bool destroy_pm_clk = false;
    bool disable_pm_runtime = false;

    ret = pm_clk_create(&pdev->dev);
    if (ret) {
        disable_pm_runtime = true;
    }
    if (!ret) {
        ret = pm_clk_add(&pdev->dev, "audio");
        if (ret) {
            destroy_pm_clk = true;
            disable_pm_runtime = true; // the original falls through to this step too
        }
    }
    if (!ret) {
        ret = register_clocks(priv, &pdev->dev);
        if (ret) {
            destroy_pm_clk = true;
            disable_pm_runtime = true; // the original falls through to this step too
        }
    }

    if (destroy_pm_clk) {
        pm_clk_destroy(&pdev->dev);
    }
    if (disable_pm_runtime) {
        pm_runtime_disable(&pdev->dev);
    }

    return ret;
}

`goto`-less alternative 4: functions

Drawbacks:

- "Entia non sunt multiplicanda praeter necessitatem"
- reading bottom-up instead of top-down
- may require passing context around

static inline int foo_2(int bar)
{
    int return_value = 0;
    if (prepare_stuff(bar)) {
        return_value = do_the_thing(bar);
    }
    cleanup_3();
    return return_value;
}

static inline int foo_1(int bar)
{
    int return_value = 0;
    if (init_stuff(bar)) {
        return_value = foo_2(bar);
    }
    cleanup_2();
    return return_value;
}

int foo(int bar)
{
    int return_value = 0;
    if (do_something(bar)) {
        return_value = foo_1(bar);
    }
    cleanup_1();
    return return_value;
}

Example from Linux kernel rewritten

// priv has to be passed down explicitly: the helpers can't see probe's locals
static inline int mmp2_audio_clk_probe_3(struct mmp2_audio_clk* priv,
                                         struct platform_device* pdev)
{
    int ret = register_clocks(priv, &pdev->dev);
    if (ret) {
        pm_clk_destroy(&pdev->dev);
    }
    return ret;
}

static inline int mmp2_audio_clk_probe_2(struct mmp2_audio_clk* priv,
                                         struct platform_device* pdev)
{
    int ret = pm_clk_add(&pdev->dev, "audio");
    if (ret) {
        pm_clk_destroy(&pdev->dev);
    } else {
        ret = mmp2_audio_clk_probe_3(priv, pdev);
    }
    return ret;
}

static inline int mmp2_audio_clk_probe_1(struct mmp2_audio_clk* priv,
                                         struct platform_device* pdev)
{
    int ret = pm_clk_create(&pdev->dev);
    if (ret) {
        pm_runtime_disable(&pdev->dev);
    } else {
        ret = mmp2_audio_clk_probe_2(priv, pdev);
        if (ret) {
            pm_runtime_disable(&pdev->dev);
        }
    }
    return ret;
}

static int mmp2_audio_clk_probe(struct platform_device* pdev)
{
    // ...
    pm_runtime_enable(&pdev->dev);

    ret = mmp2_audio_clk_probe_1(priv, pdev);

    return ret;
}

`goto`-less alternative 5: abuse of loops

Drawbacks:

- half of the drawbacks of goto
- half of the drawbacks of the other alternatives
- none of the benefits of either of the above
- not structural anyway
- creates a loop which doesn't loop
- abuse of one language construct just to avoid using the right tool for the job
- less readable
- counterintuitive, confusing
- adds unnecessary nesting
- takes more lines
- don't even think about using a legitimate loop somewhere among this mess

int* foo(int bar)
{
    int* return_value = NULL;

    do {
        if (!do_something(bar)) break;
        do {
            if (!init_stuff(bar)) break;
            do {
                if (!prepare_stuff(bar)) break;
                return_value = do_the_thing(bar);
            } while (0);
            cleanup_3();
        } while (0);
        cleanup_2();
    } while (0);
    cleanup_1();

    return return_value;
}

Example from Linux kernel rewritten

static int mmp2_audio_clk_probe(struct platform_device *pdev)
{
    // ...
    pm_runtime_enable(&pdev->dev);

    do {
        ret = pm_clk_create(&pdev->dev);
        if (ret) break;

        do {
            ret = pm_clk_add(&pdev->dev, "audio");
            if (ret) break;

            ret = register_clocks(priv, &pdev->dev);
            if (ret) break;

            return 0; // success: skip the teardown below, as the original does
        } while (0);
        pm_clk_destroy(&pdev->dev);
    } while (0);
    pm_runtime_disable(&pdev->dev);

    return ret;
}

Restart/retry

Especially common on *nix systems when dealing with system calls that return an error after being interrupted by a signal and set errno to EINTR to indicate that the call was doing fine and was merely interrupted. Of course, it's not limited to system calls.

#include <errno.h>

int main()
{
retry_syscall:
    if (some_syscall() == -1) {
        if (errno == EINTR) {
            goto retry_syscall;
        }

        // handle real errors
    }

    return 0;
}

I think in this particular case the one additional level of nesting isn't so bad, but without rewriting it I wouldn't be able to present the goto-less alternative fairly.

Version with reduced nesting

#include <errno.h>

int main()
{
    int res;
retry_syscall:
    res = some_syscall();
    if (res == -1 && errno == EINTR) {
        goto retry_syscall;
    }

    if (res == -1) {
        // handle real errors
    }

    return 0;
}

`goto`-less alternative: loop

We can of course use a do {} while loop with conditions in while:

#include <errno.h>

int main()
{
    int res;
    do {
        res = some_syscall();
    } while (res == -1 && errno == EINTR);

    if (res == -1) {
        // handle real errors
    }

    return 0;
}

I think both versions are comparably readable, but goto has a slight advantage: it makes it immediately clear that looping is not the desired situation, whereas a while loop may be misread as a waiting loop.
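For a version that actually compiles, the retry can be wrapped around a real system call. The sketch below assumes a POSIX system and uses read(); the wrapper name read_retry is invented here for illustration.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: like read(), but restarts the call
 * whenever it was merely interrupted by a signal. */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

retry:
    n = read(fd, buf, count);
    if (n == -1 && errno == EINTR) {
        goto retry;   /* not a real error, just try again */
    }

    return n;         /* bytes read, 0 on end of file, -1 on a real error */
}
```

Callers then only ever see real errors, so the EINTR handling does not have to be repeated at every call site.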

Less trivial example

For these examples, I'm willing to break the overall monochrome theme of the site and define colors for syntax highlighting. Even with the simple parsing done by kramdown (your code editor would certainly do a better job here), we already notice labels and goto statements standing out a little from the rest of the code. Flags, on the other hand, get lost among other variables.

`goto` version

#include <string.h>

enum {
    PKT_THIS_OPERATION,
    PKT_THAT_OPERATION,
    PKT_PROCESS_CONDITIONALLY,
    PKT_CONDITION_SKIPPED,
    PKT_ERROR,
    READY_TO_SEND,
    NOT_READY_TO_SEND
};

int parse_packet()
{
    static int packet_error_count = 0;

    int packet[16] = { 0 };
    int packet_length = 123;
    _Bool packet_condition = 1;
    int packet_status = 4;

    // get packet etc. ...

REPARSE_PACKET:
    switch (packet[0]) {
        case PKT_THIS_OPERATION:
            if (/* problem condition */) {
                goto PACKET_ERROR;
            }
            // ... handle THIS_OPERATION
            break;

        case PKT_THAT_OPERATION:
            if (/* problem condition */) {
                goto PACKET_ERROR;
            }
            // ... handle THAT_OPERATION
            break;

        // ...

        case PKT_PROCESS_CONDITIONALLY:
            if (packet_length < 9) {
                goto PACKET_ERROR;
            }
            if (packet_condition && packet[4]) {
                packet_length -= 5;
                memmove(packet, packet+5, packet_length);
                goto REPARSE_PACKET;
            } else {
                packet[0] = PKT_CONDITION_SKIPPED;
                packet[4] = packet_length;
                packet_length = 5;
                packet_status = READY_TO_SEND;
            }
            break;

        // ...

        default:
PACKET_ERROR:
            packet_error_count++;
            packet_length = 4;
            packet[0] = PKT_ERROR;
            packet_status = READY_TO_SEND;
            break;
    }

    // ...

    return 0;
}

`goto`-less version

#include <string.h>

enum {
    PKT_THIS_OPERATION,
    PKT_THAT_OPERATION,
    PKT_PROCESS_CONDITIONALLY,
    PKT_CONDITION_SKIPPED,
    PKT_ERROR,
    READY_TO_SEND,
    NOT_READY_TO_SEND
};

int parse_packet()
{
    static int packet_error_count = 0;

    int packet[16] = { 0 };
    int packet_length = 123;
    _Bool packet_condition = 1;
    int packet_status = 4;

    // get packet etc. ...

    _Bool REPARSE_PACKET = true;
    _Bool PACKET_ERROR = false;

    while (REPARSE_PACKET) {
        REPARSE_PACKET = false;
        PACKET_ERROR = false;

        switch (packet[0]) {
            case PKT_THIS_OPERATION:
                if (/* problem condition */) {
                    PACKET_ERROR = true;
                    break;
                }
                // ... handle THIS_OPERATION
                break;

            case PKT_THAT_OPERATION:
                if (/* problem condition */) {
                    PACKET_ERROR = true;
                    break;
                }
                // ... handle THAT_OPERATION
                break;

            // ...

            case PKT_PROCESS_CONDITIONALLY:
                if (packet_length < 9) {
                    PACKET_ERROR = true;
                    break;
                }
                if (packet_condition && packet[4]) {
                    packet_length -= 5;
                    memmove(packet, packet+5, packet_length);
                    REPARSE_PACKET = true;
                    break;
                } else {
                    packet[0] = PKT_CONDITION_SKIPPED;
                    packet[4] = packet_length;
                    packet_length = 5;
                    packet_status = READY_TO_SEND;
                }
                break;

            // ...

            default:
                PACKET_ERROR = true;
                break;
        }

        if (PACKET_ERROR) {
            packet_error_count++;
            packet_length = 4;
            packet[0] = PKT_ERROR;
            packet_status = READY_TO_SEND; // same status as in the goto version
            break;
        }
    }

    // ...

    return 0;
}

Common code in `switch` statement

This situation may be a good opportunity to check whether the code needs to be refactored altogether; that being said, sometimes you want a switch statement whose cases make minor changes and then run the same code.

Sure, you could extract the common code into a function, but then you need to pass all the context to it. That may be inconvenient (you may need to pass a lot of parameters or create a dedicated structure, in both cases probably with pointers) and may increase the complexity of the code; in some cases, you may also want only one call to the function instead of several.

So why not just jump to the common code?

int foo(int v)
{
    // ...
    int something = 0;
    switch (v) {
        case FIRST_CASE:
            something = 2;
            goto common1;
        case SECOND_CASE:
            something = 7;
            goto common1;
        case THIRD_CASE:
            something = 9;
            goto common1;
common1:
            /* code common to FIRST, SECOND and THIRD cases */
            break;

        case FOURTH_CASE:
            something = 10;
            goto common2;
        case FIFTH_CASE:
            something = 42;
            goto common2;
common2:
            /* code common to FOURTH and FIFTH cases */
            break;
    }
    // ...
}

`goto`-less alternative 1: functions

Drawbacks:

- "Entia non sunt multiplicanda praeter necessitatem"
- reading bottom-up instead of top-down
- may require passing context around

struct foo_context {
    int* something;
    // ...
};

static void common1(struct foo_context ctx)
{
    /* code common to FIRST, SECOND and THIRD cases */
}

static void common2(struct foo_context ctx)
{
    /* code common to FOURTH and FIFTH cases */
}

int foo(int v)
{
    struct foo_context ctx = { NULL };
    // ...
    int something = 0;
    ctx.something = &something;

    switch (v) {
        case FIRST_CASE:
            something = 2;
            common1(ctx);
            break;
        case SECOND_CASE:
            something = 7;
            common1(ctx);
            break;
        case THIRD_CASE:
            something = 9;
            common1(ctx);
            break;

        case FOURTH_CASE:
            something = 10;
            common2(ctx);
            break;
        case FIFTH_CASE:
            something = 42;
            common2(ctx);
            break;
    }
    // ...
}

`goto`-less alternative 2: `if`s

We can abandon elegance and replace the switch statement with ifs:

int foo(int v)
{
    // ...
    int something = 0;
    if (v == FIRST_CASE || v == SECOND_CASE || v == THIRD_CASE) {
        if (v == FIRST_CASE) {
            something = 2;
        } else if (v == SECOND_CASE) {
            something = 7;
        } else if (v == THIRD_CASE) { // it could be just `else`
            something = 9;
        }
        /* code common to FIRST, SECOND and THIRD cases */
    } else if (v == FOURTH_CASE || v == FIFTH_CASE) {
        if (v == FOURTH_CASE) {
            something = 10;
        } else {
            something = 42;
        }
        /* code common to FOURTH and FIFTH cases */
    }
    // ...
}

`goto`-less alternative 3: interlacing `if (0)`

Please, don't, just don't…

int foo(int v)
{
    // ...
    int something = 0;
    switch (v) {
        case FIRST_CASE:
            something = 2;
      if (0) {
        case SECOND_CASE:
            something = 7;
      }
      if (0) {
        case THIRD_CASE:
            something = 9;
      }
            /* code common to FIRST, SECOND and THIRD cases */
            break;

        case FOURTH_CASE:
            something = 10;
      if (0) {
        case FIFTH_CASE:
            something = 42;
      }
            /* code common to FOURTH and FIFTH cases */
            break;
    }
    // ...
}

`goto`-less alternative: capturing lambda

Yeah, maybe some day…

Nested `break`, labeled `continue`

I think this one doesn't require further explanation:

#include <stdio.h>

int main()
{
    for (int i = 1; i <= 5; ++i) {
        printf("outer iteration (i): %d\n", i);

        for (int j = 1; j <= 200; ++j) {
            printf("    inner iteration (j): %d\n", j);
            if (j >= 3) {
                break; // breaks from inner loop, outer loop continues
            }
            if (i >= 2) {
                goto outer; // breaks from outer loop, and directly to "Done!"
            }
        }
    }
outer:

    puts("Done!");

    return 0;
}

We can use an analogous mechanism for continue.
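As a small sketch of that continue variant: the label sits at the very end of the outer loop's body, so jumping to it moves on to the next outer iteration. The function name count_clean_rows and the row-major matrix layout are assumptions made only for this illustration.

```c
#include <stddef.h>

/* Counts the rows of a rows x cols matrix (stored row by row)
 * that contain no negative value. */
size_t count_clean_rows(const int *m, size_t rows, size_t cols)
{
    size_t clean = 0;

    for (size_t i = 0; i < rows; ++i) {
        for (size_t j = 0; j < cols; ++j) {
            if (m[i * cols + j] < 0) {
                goto next_row;   /* acts as "continue" for the outer loop */
            }
        }
        ++clean;                 /* reached only if the inner loop ran to completion */
next_row:;                       /* the empty statement gives the label something to attach to */
    }

    return clean;
}
```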

Beej's Guide to C Programming has a nice example of using this technique alongside the cleanup one:

    for (...) {
        for (...) {
            while (...) {
                do {
                    if (some_error_condition) {
                        goto bail;
                    }
                    // ...
                } while(...);
            }
        }
    }

bail:
    // Cleanup here

Without goto, you’d have to check an error condition flag in all of the loops to get all the way out.
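To make that cost concrete, here is a sketch of the flag-checking style with just two loops. The function find_zero and its parameters are hypothetical and not taken from Beej's guide. Note how the flag has to appear in every loop condition.

```c
#include <stdbool.h>
#include <stddef.h>

/* Finds the first zero in a rows x cols grid (stored row by row).
 * On success stores its position in *row and *col and returns true. */
bool find_zero(const int *grid, size_t rows, size_t cols,
               size_t *row, size_t *col)
{
    bool found = false;

    for (size_t i = 0; i < rows && !found; ++i) {       /* flag check no. 1 */
        for (size_t j = 0; j < cols && !found; ++j) {   /* flag check no. 2 */
            if (grid[i * cols + j] == 0) {
                *row = i;
                *col = j;
                found = true;
            }
        }
    }

    return found;
}
```

With four nested loops, as in the quoted example, the same flag would be tested in four places, and any code after an inner loop would need to be guarded by it as well.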

Simple state machines

The following is a 1:1 implementation of the above state machine, not far from verbatim mathematical notation:

_Bool machine(const char* c)
{
qA:
    switch (*(c++)) {
        case 'x': goto qB;
        case 'y': goto qC;
        case 'z': goto qA;
        default: goto err;
    }

qB:
    switch (*(c++)) {
        case 'x': goto qC;
        case 'z': goto qB;
        case '\0': goto F;
        default: goto err;
    }

qC:
    switch (*(c++)) {
        case 'x': goto qB;
        case 'y': goto qA;
        default: goto err;
    }

F:
    return true;

err:
    return false;
}
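For comparison, here is a goto-less sketch of the same transitions, read off the machine function above. The state now lives in a variable and is dispatched on in a loop. The names machine_loop and enum state are made up for this illustration.

```c
#include <stdbool.h>

enum state { QA, QB, QC };

/* Accepts exactly the strings for which the goto version ends in state F. */
bool machine_loop(const char *c)
{
    enum state s = QA;

    for (;;) {
        char ch = *c++;

        switch (s) {
        case QA:
            if (ch == 'x')      s = QB;
            else if (ch == 'y') s = QC;
            else if (ch == 'z') s = QA;
            else return false;
            break;
        case QB:
            if (ch == 'x')       s = QC;
            else if (ch == 'z')  s = QB;
            else if (ch == '\0') return true;   /* the only accepting exit */
            else return false;
            break;
        case QC:
            if (ch == 'x')      s = QB;
            else if (ch == 'y') s = QA;
            else return false;
            break;
        }
    }
}
```

The transitions are the same, but each one is now an assignment followed by another trip through the outer switch, so the current state is no longer visible from the position in the code.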

Jumping into event loop

Yeah, yeah, I know that jumping into a loop warrants at least a raised eyebrow. That being said, there are cases when you may want to do just that.

Here, in the first iteration, the program skips increasing the variable and goes straight to allocation. Each following iteration executes the code as written, completely ignoring the label, which is relevant only for the first run; so can you when analyzing the loop.

#include <stdio.h>
#include <fancy_alloc.h>

int main()
{
    int* buf = NULL;
    size_t pos = 0;
    size_t sz = 8;

    int* temp;

    goto ALLOC;
    do {
        if (pos > sz) { // resize array
            sz *= 2;
ALLOC:      temp = arrayAllocSmart(buf, sz, pos);
            /* check for errors */
            buf = temp;
        }

        /* do something with buf */
    } while (checkQuit());

    return 0;

    /* handle errors ... */
}

`goto`-less alternative 1: guard flag

It probably says more about the state of my sleep-deprived brain than anything else, but I actually managed to make an honest, very dumb mistake in this simple snippet. I didn't notice until after examining the assembly output and seeing far fewer instructions than expected. Since it's simple, yet quite severe in its consequences, I decided to leave it as an exercise for the reader to spot the bug (it should be easy, since you already know it exists).

The drawbacks, as per usual: nesting and keeping track of flags.

#include <stdio.h>
#include <stdbool.h>
#include <fancy_alloc.h>

int main()
{
    int* buf = NULL;
    size_t pos = 0;
    size_t sz = 8;

    int ret = 0;

    _Bool firstIter = true;

    do {
        if (pos > sz || firstIter) { // resize array
            if (!firstIter) {
                sz *= 2;
                firstIter = false;
            }

            int* temp = arrayAllocSmart(buf, sz, pos);
            /* handle errors ... */
            buf = temp;
        }

        /* do something with buf */
    } while (checkQuit());

    return 0;
}

`goto`-less alternative 2: code duplication

The drawback is obvious, thus no further comment.

#include <stdio.h>
#include <fancy_alloc.h>

int main()
{
    size_t pos = 0;
    size_t sz = 8;

    int* buf = arrayAllocSmart(NULL, sz, pos);
    /* handle errors ... */

    do {
        if (pos > sz) { // resize array
            sz *= 2;
            int* temp = arrayAllocSmart(buf, sz, pos);
            /* handle errors ... */
            buf = temp;
        }

        /* do something with buf */
    } while (checkQuit());

    return 0;
}

Optimizations


Structured Programming with go to Statements

Read at: [ACM Digital Library] [PDF] [HTML]

Since I started with Dijkstra, it's only natural that I conclude with Knuth. Almost anybody who says anything positive about goto refers to this paper. And rightfully so! To this day it's one of the most comprehensive resources on the topic (it's a go-to resource about goto). Perhaps some examples are quite dated, and some concerns are less crucial today than back in the day, but it's an excellent read nevertheless.

One thing we haven't spelled out clearly, however, is what makes some go to's bad and others acceptable. The reason is that we've really been directing our attention to the wrong issue, to the objective question of go to elimination instead of the important subjective question of program structure. In the words of John Brown, "The act of focusing our mightiest intellectual resources on the elusive goal of go to-less programs has helped us get our minds off all those really tough and possibly unresolvable problems and issues with which today's professional programmer would otherwise have to grapple." By writing this long article I don't want to add fuel to the controversy about go to elimination, since that topic has already assumed entirely too much significance; my goal is to lay that controversy to rest, and to help direct the discussion towards more fruitful channels.

Few lesser known tricks, quirks and features of C

2023-02-19T00:00:00+00:00

There are some tricks, quirks and features (some quite fundamental to the language!) which seem to throw even experienced developers off track. Thus I did a sloppy job of gathering some of them in this post (in no particular order) with even sloppier short explanations and/or examples (or quotes thereof).

Array pointers
Comma operator
Digraphs, trigraphs and alternative tokens
Designated initializer
Compound literals
Compound literals are lvalues
Multi-character constants
Bit fields
0 bit fields
volatile type qualifier
restrict type qualifier
register storage-class specifier
Flexible array member
%n format specifier
%.* (precision) format specifier
Other less known format specifiers
Interlacing syntactic constructs
--> "operator"
idx[arr]
Negative array indexes
Constant string concatenation
Using && and || as conditionals
Compile time assumption checking using enums
Ad hoc struct declaration in the return type of a function
"Nested" struct definition is not kept nested
Flat initializer lists
Static array indices in function parameter declarations
Macro Overloading by Argument List Length
Function types
X-Macros
Named function parameters
Combining default, named and positional arguments
Abusing unions for grouping things into namespaces
Matching character classes with sscanf()
Garbage collector
Cosmopolitan Libc
Inline assembly
Object Oriented Programming
Metaprogramming
Evaluate sizeof at compile time by causing duplicate case error


Array pointers

Array-to-pointer decay means that proper pointers to arrays are usually not needed:

int arr[10];

int* ap0 = arr;        // array decay-to-pointer
// ap0[2] = ...

int (*ap1)[10] = &arr; // proper pointer to array
// (*ap1)[2] = ...

But the ability to allocate a big multi-dimensional array on the heap is nice:

int (*ap3)[90000][90000] = malloc(sizeof *ap3);

With pointers, even VLAs can find a use (more here):

int (*ap4)[n] = malloc(sizeof *ap4);

Comma operator

The comma operator is used to separate two or more expressions that are included where only one expression is expected. When the set of expressions has to be evaluated for a value, only the right-most expression is considered.

For example, b = (a=3, a+2); would first assign the value 3 to a, and then a+2 would be assigned to variable b. So, at the end, b would contain the value 5 while variable a would be 3.

Wikipedia lists a few more examples.

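As a minimal sketch, the example from the text plus the place where the comma operator shows up most often in practice, the increment part of a for loop. The function names comma_demo() and reverse() are made up for this illustration.

```c
#include <stddef.h>

/* The example from the text: a ends up as 3, b as 5. */
int comma_demo(void)
{
    int a, b;

    b = (a = 3, a + 2);   /* a = 3 is evaluated first, then a + 2 is the value */
    return a * 10 + b;    /* packs both results into one number for checking */
}

/* Reverses buf[0..n) in place. */
void reverse(int *buf, size_t n)
{
    if (n < 2) {
        return;
    }
    /* The comma in the declaration is just a separator;
       the one in `++i, --j` is the comma operator. */
    for (size_t i = 0, j = n - 1; i < j; ++i, --j) {
        int tmp = buf[i];
        buf[i] = buf[j];
        buf[j] = tmp;
    }
}
```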

Digraphs, trigraphs and alternative tokens

C code may not be portable, but the language itself is probably more portable than any other; there are systems using e.g. EBCDIC encoding instead of ASCII. To support them, C has digraphs and trigraphs: multi-character sequences treated by the compiler as other characters.

Digraph	Equivalent	Trigraph	Equivalent	iso646.h macro	Equivalent
`<:`	`[`	`??=`	`#`	`and`	`&&`
`:>`	`]`	`??(`	`[`	`and_eq`	`&=`
`<%`	`{`	`??/`	`\`	`bitand`	`&`
`%>`	`}`	`??)`	`]`	`bitor`	`|`
`%:`	`#`	`??'`	`^`	`compl`	`~`
`%:%:`	`##`	`??<`	`{`	`not`	`!`
-	-	`??!`	`|`	`not_eq`	`!=`
-	-	`??>`	`}`	`or`	`||`
-	-	`??-`	`~`	`or_eq`	`|=`
-	-	-	-	`xor`	`^`
-	-	-	-	`xor_eq`	`^=`

Designated initializer

These allow you to specify which elements of an object (array, structure, union) are to be initialized by the values following. The order does not matter!

struct Foo {
    int x, y;
    const char* bar;
};

void f(void)
{
    int arr[] = { 1, 2, [5] = 9, [9] = 5, [8] = 8 };

    struct Foo foo = { .y = 23, .bar = "barman", .x = -38 };

    struct Foo foos[] = {
        [10] = {      8,  8,  "bar1" },
         [8] = {      1,  8,  "bar3" },
        [12] = { .x = 9, .bar = "bar2" },
    };
}

Compound literals

A compound literal looks like a cast of a brace-enclosed initializer list. Its value is an object of the type specified in the cast, containing the elements specified in the initializer.

#include <stdio.h>

struct Foo { int x, y; };

void bar(struct Foo p)
{
    printf("%d, %d", p.x, p.y);
}

int main(void)
{
    bar((struct Foo){2, 3});
    return 0;
}

See also: Compound Literals (Using the GNU Compiler Collection (GCC))

Compound literals are lvalues

(struct Foo){0};
((struct Foo){0}).x = 4;
&(struct Foo){0};

Multi-character constants

They are implementation-dependent, and even the standard itself suggests that it is usually best to avoid them. That being said, using them as self-documenting enums can be quite handy when you may need to deal with raw memory dumps later on.

enum state {
    waiting = 'WAIT',
    running = 'RUN!',
    stopped = 'STOP',
};

For example, on my machine I could locate 'WAIT' like this:

00001120: c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00  .ff...........@.
00001130: f3 0f 1e fa e9 67 ff ff ff 55 48 89 e5 48 83 ec  .....g...UH..H..
00001140: 10 c7 45 fc 54 49 41 57 8b 45 fc 89 c6 48 8d 05  ..E.TIAW.E...H..
00001150: b0 0e 00 00 48 89 c7 b8 00 00 00 00 e8 cf fe ff  ....H...........
00001160: ff b8 00 00 00 00 c9 c3 f3 0f 1e fa 48 83 ec 08  ............H...

Bit fields

Declares a member with explicit width, in bits. Adjacent bit-field members may be packed to share and straddle the individual bytes.

struct cat {
    unsigned int legs  : 3;  // 3 bits for legs  (0-4 fit in 3 bits)
    unsigned int lives : 4;  // 4 bits for lives (0-9 fit in 4 bits)
};

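A declared width also limits what the member can hold. The sketch below reuses struct cat; the helper store_legs() is hypothetical and only shows that a value too large for an unsigned bit-field is reduced modulo 2 to the power of the width.

```c
struct cat {
    unsigned int legs  : 3;  /* can hold 0..7  */
    unsigned int lives : 4;  /* can hold 0..15 */
};

/* Stores `value` in the 3-bit member and returns what the member actually holds. */
unsigned int store_legs(unsigned int value)
{
    struct cat c = { 0 };

    c.legs = value;   /* unsigned bit-fields wrap modulo 2^width (here: 8) */
    return c.legs;
}
```

How the members are laid out inside the storage unit is left to the implementation, so this kind of struct is best not used as a wire or file format.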

0 bit fields

Description from Arm Compiler 6 docs:

A zero-length bit-field can be used to make the following changes:

Creates a boundary between any bit-fields before the zero-length bit-field and any bit-fields after the zero-length bit-field. Any bit-fields on opposite sides of the boundary are treated as non-overlapping memory locations. This has a consequence for C and C++ programs. The C and C++ standards require both load and store accesses to a bit-field on one side of the boundary to not access any bit-fields on the other side of the boundary.

Insert padding to align any bit-fields after the zero-length bit-field to the next available natural boundary based on the type of the zero-length bit-field. For example, char:0 can be used to align to the next available byte boundary, and int:0 can be used to align to the next available word boundary.

An example taken from the SO answer (with slight changes):

struct bar {
    unsigned char x : 5;
    unsigned short  : 0;
    unsigned char y : 7;
};

The above in memory would look like this (assuming 16-bit short, ignoring endianness):

char pad pad      short boundary
 |    |   |        |
 v    v   v        v
 xxxxx000 00000000 yyyyyyy0

The zero-length bit-field causes the position to move to the next short boundary (or: be placed on the nearest natural alignment for the target platform). We defined short to be 16-bit, so 16 minus 5 gives 11 bits of padding.

`volatile` type qualifier

This qualifier tells the compiler that a variable may be accessed by means other than the current code (e.g. we are dealing with an MMIO device), so reads and writes to this resource must not be optimized away.

`restrict` type qualifier

By adding this type qualifier, a programmer hints to the compiler that for the lifetime of the pointer, no other pointer will be used to access the object to which it points. This allows the compiler to make optimizations (for example, vectorization) that would not otherwise have been possible.

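A minimal sketch of what such a promise looks like in a signature. The function scale_add() is a hypothetical example; the caller is responsible for making sure that dst and src really do not overlap, otherwise the behaviour is undefined.

```c
#include <stddef.h>

/* dst[i] += k * src[i] for i in [0, n).
   `restrict` promises that dst and src do not alias each other, so the
   compiler need not reload src[i] after every store to dst[i]. */
void scale_add(size_t n, float *restrict dst, const float *restrict src, float k)
{
    for (size_t i = 0; i < n; ++i) {
        dst[i] += k * src[i];
    }
}
```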

`register` storage-class specifier

It suggests that the compiler store the declared variable in a CPU register (or some other faster location) instead of in random-access memory. The address of a variable declared with register cannot be taken (but the sizeof operator can be applied).

Nowadays register is usually meaningless, as modern compilers place variables in registers when appropriate regardless of whether the hint is given. It may sometimes be useful on embedded systems, but even then the compiler will probably provide better optimizations.

Flexible array member

From Wikipedia:

struct vectord {
    short len;    // there must be at least one other data member
    double arr[]; // the flexible array member must be last

    // The compiler may reserve extra padding space here,
    //   like it can between struct members.
};

struct vectord *vector = malloc(...);
vector->len = ...;
for (int i = 0; i < vector->len; ++i) {
     vector->arr[i] = ...;  // transparently uses the right type (double)
}

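The malloc(...) call in the snippet is elided. One common way to size such an allocation is the size of the struct plus the element count times the element size. A hedged sketch follows; struct dvec and dvec_new() are hypothetical names invented here, not the ones from the Wikipedia snippet.

```c
#include <stdint.h>
#include <stdlib.h>

struct dvec {
    size_t len;
    double arr[];   /* flexible array member, must be last */
};

/* Allocates a zero-filled dvec with room for `len` doubles; NULL on failure. */
struct dvec *dvec_new(size_t len)
{
    /* Guard against overflow in the size computation below. */
    if (len > (SIZE_MAX - sizeof(struct dvec)) / sizeof(double)) {
        return NULL;
    }

    struct dvec *v = malloc(sizeof *v + len * sizeof v->arr[0]);
    if (v == NULL) {
        return NULL;
    }

    v->len = len;
    for (size_t i = 0; i < len; ++i) {
        v->arr[i] = 0.0;
    }
    return v;
}
```

The whole object is released with a single free(), which is one of the main attractions over a struct holding a separately allocated pointer.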

`%n` format specifier

This StackOverflow answer presents it reasonably well:

%n returns the current position of the imaginary cursor used when printf() formats its output.

int pos1, pos2;
const char* str_of_unknown_len = "we don't care about the length of this";

printf("Write text of unknown %n(%s)%n length\n", &pos1, str_of_unknown_len, &pos2);
printf("%*s\\%*s/\n", pos1, " ", pos2-pos1-2, " ");
printf("%*s", pos1+1, " ");
for (int i = pos1+1; i < pos2-1; ++i) {
    putc('-', stdout);
}
putc('\n', stdout);

will produce the following output:

Write text of unknown (we don't care about the length of this) length
                      \                                      /
                       --------------------------------------

Granted, it's a little bit contrived, but it can have some uses when making pretty reports.

`%.*` (precision) format specifier

Instead of this:

char fmt_buf[MAX_BUF];
snprintf(fmt_buf, MAX_BUF, "%%.%df", prec);
printf(fmt_buf, num);

do this:

printf("%.*f", prec, num);

when you want to set the precision at run time. The analogous %* takes the minimum field width from an argument instead, which is what you want when padding with a variable number of characters.

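Both asterisks can be combined in one conversion: the first int argument supplies the minimum field width and the second the precision. A small sketch; format_number() is a hypothetical wrapper made up for this illustration.

```c
#include <stdio.h>

/* Writes `value` into `out`, right-aligned in at least `width` columns
   with `prec` digits after the decimal point. Returns what snprintf returns. */
int format_number(char *out, size_t cap, int width, int prec, double value)
{
    return snprintf(out, cap, "%*.*f", width, prec, value);
}
```

For instance, a width of 8 and a precision of 2 applied to 3.14159 gives four spaces followed by 3.14.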

Other less known format specifiers

Have a look at §7.21.6.1 and §7.21.6.2 of the draft of the C11 standard. You'll find %#, %e, %-, %+, %j, %g, %a and a few other interesting specifiers.

Interlacing syntactic constructs

The following is syntactically correct C code:

#include <stdio.h>

int main()
{
    int n = 3;
    int i = 0;

    switch (n % 2) {
        case 0:
            do {
                ++i;
        case 1:
                ++i;
            } while (--n > 0);

    }

    printf("%d\n", i); // 5
}

I know gotophobic programmers using it like this:

    switch (x) {
        case 1:
            // 1 specific code

      if (0) {
        case 2:
            // 2 specific code
      }

            // common for 1 and 2
    }

The most famous usage of this quirk/"feature" is Duff's device:

send(to, from, count)
    register short *to, *from;
    register count;
{
    register n = (count + 7) / 8;
    switch (count % 8) {
    case 0: do { *to = *from++;
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
            } while (--n > 0);
    }
}

`-->` "operator"

The following is correct C code:

size_t n = 10;
while (n --> 0) {
    printf("%zu\n", n);
}

You may ask since when C has had such an operator, and the answer is: since never. --> is not an operator, but two separate operators, -- and >, written in a way that makes them look like one. It's possible because C mostly doesn't care about whitespace.

n --> 0 is equivalent to (n--) > 0

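The idiom is handy for counting down with an unsigned index, where the naive condition i >= 0 is always true. A minimal sketch; reverse_copy() is a hypothetical function written for this illustration.

```c
#include <stddef.h>

/* Copies src[0..n) into dst in reverse order. */
void reverse_copy(int *dst, const int *src, size_t n)
{
    size_t i = n;

    while (i --> 0) {             /* tests i > 0, then decrements: body sees n-1 .. 0 */
        dst[n - 1 - i] = src[i];
    }
}
```

Note that i wraps around to SIZE_MAX after the final test, so it should not be used once the loop has finished.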

`idx[arr]`

Square bracket notation for accessing array elements is syntactic sugar for pointer arithmetic:

arr[5] ≡ *(arr + 5) ≡ *(5 + arr) ≡ 5[arr]

You absolutely must never use this in actual code… but it's hella fun otherwise!

// array[index]
boxes[products[myorder.product].box].weight;

// index[array]
myorder.product[products].box[boxes].weight;

Negative array indexes

For quick and dirty debugging purposes I wanted to check whether the padding at the end of an array was filled with the correct value, but I didn't know where the padding started. Thus I did the following:

int* end = arr + (len - 1);
if (end[0] == VAL && end[-1] == VAL && end[-5] == VAL) {
    puts("Correct padding");
}

Constant string concatenation

You don't need sprintf() (nor strcat()!) to concatenate string literals:

#define WORLD "World!"
const char* s = "Hello " WORLD "\n"
                "It's a lovely day, "
                "innit?";

Using `&&` and `||` as conditionals

If you write shell scripts, you know what I mean.

#include <ctype.h>
#include <stdio.h>
#include <stdbool.h>

int main(void)
{
    1 && puts("Hello");
    0 && puts("I won't");
    1 && puts("World!");
    0 && puts("be printed");
    1 || puts("I won't be printed either");
    0 || puts("But I will!");

    true && (9 > 2) && puts("9 is bigger than 2");

    isdigit('9') && puts("9 is a digit");
    isdigit('n') && puts("n is a digit") || puts("n is NOT a digit!");

    return 0;
}

The compiler will probably scream warnings at you, as it's really uncommon to do this in C code.

Compile time assumption checking using `enum`s

#define D 1
#define DD 2

enum CompileTimeCheck
{
    MAKE_SURE_DD_IS_TWICE_D = 1/(2*(D) == (DD)),
    MAKE_SURE_DD_IS_POW2    = 1/((((DD) - 1) & (DD)) == 0)
};

Can be useful for libraries with compile-time configurable constants.

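The trick works because a false condition evaluates to 0, and division by zero is not allowed in a constant expression, so the compiler rejects the enum. If C11 is available, _Static_assert states the same checks directly and lets you attach a message. A sketch reusing D and DD; wrap_index() is a hypothetical helper that relies on the power-of-two property being checked.

```c
#define D  1
#define DD 2

_Static_assert(2 * (D) == (DD), "DD must be twice D");
_Static_assert((((DD) - 1) & (DD)) == 0, "DD must be a power of two");

/* Cheap modulo that is only valid because DD is a power of two. */
unsigned int wrap_index(unsigned int i)
{
    return i & ((DD) - 1u);
}
```

The enum version remains useful when the code has to compile as C99 or older.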

Ad hoc `struct` declaration in the return type of a function

You can define structs in very (at first glance) random places:

#include <stdio.h>

struct Foo { int a, b, c; } make_foo(void) {
    struct Foo ret = { .c = 3 };
    ret.a = 11 + ret.c;
    ret.b = ret.a * 3;
    return ret;
}

int main()
{
    struct Foo x = make_foo();
    printf("%d\n", x.a + x.b + x.c);
    return 0;
}

"Nested" `struct` definition is not kept nested

#include <stdio.h>

struct Foo {
    int x;
    struct Bar {
        int y;
    };
};

int main()
{
    struct Bar s = { 34 };  // correct
    // struct Foo.Bar s;    // wrong
    printf("%d\n", s.y);
    return 0;
}

Flat initializer lists

int arr[3][3] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
//            = { {1,2,3}, {4,5,6}, {7,8,9} };


struct Foo {
    const char *name;
    int age;
};

struct Foo records[] = {
    "John",   20,
    "Bertha", 40,
    "Andrew", 30,
};

Static array indices in function parameter declarations

Except in certain contexts, an unsubscripted array name (for example, region instead of region[4]) represents a pointer whose value is the address of the first element of the array, provided that the array has previously been declared. An array type in the parameter list of a function is also converted to the corresponding pointer type. Information about the size of the argument array is lost when the array is accessed from within the function body.

To preserve this information, which is useful for optimization, C99 allows you to declare the index of the argument array using the static keyword. The constant expression specifies the minimum pointer size that can be used as an assumption for optimizations. This particular usage of the static keyword is highly prescribed. The keyword may only appear in the outermost array type derivation and only in function parameter declarations. If the caller of the function does not abide by these restrictions, the behavior is undefined.

The following examples show how the feature can be used.

int n;
void foo(int arr[static 10]);       // arr points to the first of at least 10 ints
void foo(int arr[const 10]);        // arr is a const pointer
void foo(int arr[const]);           // const pointer to int
void foo(int arr[static const n]);  // arr points to at least n ints (VLA)

void foo(int p[static 1]); is effectively a standard way to declare that p must be a non-null pointer.

Macro Overloading by Argument List Length

#include <stdio.h>
#include "cmoball.h"

#define NoA(...) CMOBALL(FOO, __VA_ARGS__)
#define FOO_3(x,y,z) "Three"
#define FOO_2(x,y)   "Two"
#define FOO_1(x)     "One"
#define FOO_0()      "Zero"


int main()
{
    puts(NoA());
    puts(NoA(1));
    puts(NoA(1,1));
    puts(NoA(1,1,1));
    return 0;
}

Function types

Function pointers ought to be well known, but as we know, the syntax is a bit awkward. On the other hand, fewer people know that you can (as with most things in C) create a typedef for a function type.

#include <stdio.h>

int main()
{
    typedef double fun_t(double);
    fun_t sin, cos, sqrt;
    fun_t* ftpt = &sqrt;

    printf("%lf\n", ftpt(4)); // 2.000000

    return 0;
}

X-Macros

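An X-macro keeps a list in one macro and expands it several times with different definitions of X, so that related declarations cannot drift out of sync. A minimal sketch, with a hypothetical colour list and helper color_name() invented for this illustration:

```c
#include <stddef.h>

/* The list is written once; X is defined differently for each expansion. */
#define COLOR_LIST \
    X(RED)         \
    X(GREEN)       \
    X(BLUE)

enum color {
#define X(name) COLOR_##name,
    COLOR_LIST
#undef X
    COLOR_COUNT
};

static const char *const color_names[] = {
#define X(name) #name,
    COLOR_LIST
#undef X
};

/* Returns the enumerator's name without the COLOR_ prefix, or "?" if out of range. */
const char *color_name(enum color c)
{
    return (size_t)c < COLOR_COUNT ? color_names[c] : "?";
}
```

Adding a colour now means touching a single line; the enum and the name table follow automatically.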

Named function parameters

#include <stdio.h>

struct _foo_args {
    int num;
    const char* text;
};

#define foo(...) _foo((struct _foo_args){ __VA_ARGS__ })
int _foo(struct _foo_args args)
{
    puts(args.text);
    return args.num * 2;
}

int main(void)
{
    int result = foo(.text = "Hello!", .num = 8);
    return 0;
}

Combining default, named and positional arguments

Using compound literals and macros to create named arguments (…):

typedef struct { int a,b,c,d; } FooParam;
#define foo(...) foo((FooParam){ __VA_ARGS__ })
void (foo)(FooParam p);

Adding default arguments is also quite easy:

#define foo(...) foo((FooParam){ .a=1, .b=2, .c=3, .d=4, __VA_ARGS__})

But now positional arguments don't work anymore, and there may be situations where you want to support both options. But I recently realized that you can make them work by adding a dummy parameter:

typedef struct { int _; int a,b,c,d; } FooParam;
#define foo(...) foo((FooParam){ .a=1, .b=2, .c=3, .d=4, ._=0, __VA_ARGS__})

Now, foo can be called in the following ways:

foo();           // a=1, b=2, c=3, d=4
foo(.a=4, .b=5); // a=4, b=5, c=3, d=4
foo(4, 5);       // a=4, b=5, c=3, d=4
foo(4, 5, .d=8); // a=4, b=5, c=3, d=8

The dummy parameter isn't needed when you have arguments that are required to be passed by name:

typedef struct { int alwaysNamed; int a,b,c,d; } FooParam;
#define foo(...) foo((FooParam){.a=1,.b=2,.c=3,.d=4, .alwaysNamed=5, __VA_ARGS__})

Abusing unions for grouping things into namespaces

Suppose that you have a struct with a bunch of fields, and you want to deal with some of them all together at once under a single name; perhaps you want to conveniently copy them as a block through struct assignment.

By using unions you can access both a.field2 and a.sub (and a.field2 is the same as a.sub.field2) without any macros.

struct a {
    int field1;
    union {
        struct {
            int field2;
            int field3;
        };
        struct {
            int field2;
            int field3;
        } sub;
    };
};

Matching character classes with `sscanf()`

From this comment on Reddit:

sscanf() can be used as an ersatz "regex" (not really, only character classes) matcher. For example, one can write something like this to check if the input consists of letters or underscores:

int len = 0;
char buf[256];
int read_token = sscanf(input, "%255[a-zA-Z_]%n", buf, &len);
if (read_token == 1) { /* do something */ }

or skip whitespace characters:

int len = 0;
char buf[256];
sscanf(input, "%255[\r\n]%n", buf, &len);
input += len;

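Wrapped in a function, the first pattern might look like the sketch below. The name read_identifier() is hypothetical. Keep in mind that ranges such as a-z inside a scanset are implementation-defined according to the standard, although common C libraries treat them as expected.

```c
#include <stdio.h>

/* Copies a leading run of [A-Za-z_] from `input` into `out` and returns
   the number of characters consumed, or 0 if the input does not start
   with such a run. `out` must have room for 256 bytes. */
int read_identifier(const char *input, char out[256])
{
    int len = 0;

    /* sscanf returns 1 only if the scanset matched at least one character;
       it returns 0 on a mismatch and EOF on empty input. */
    if (sscanf(input, "%255[a-zA-Z_]%n", out, &len) != 1) {
        out[0] = '\0';
        return 0;
    }
    return len;
}
```

Comparing the return value with 1 rather than testing it for truth matters here, because EOF is a non-zero value too.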

Garbage collector

Boehm GC is a library providing a garbage collector for C and C++.

Cosmopolitan Libc

Description from the project's website:

Cosmopolitan Libc makes C a build-once run-anywhere language, like Java, except it doesn't need an interpreter or virtual machine. Instead, it reconfigures stock GCC and Clang to output a POSIX-approved polyglot format that runs natively on Linux + Mac + Windows + FreeBSD + OpenBSD + NetBSD + BIOS with the best possible performance and the tiniest footprint imaginable.

Inline assembly

For a high-level language, C communicates quite well with the low-level world. You can write assembly code and link it against a program written in C quite easily. In addition to that, many compilers offer as an extension (listed as common in Annex J of the C Standard) a feature called inline assembly, typically introduced to the code by the asm keyword.

Object Oriented Programming


Metaprogramming

C11 added _Generic to the language, but it turns out that metaprogramming by inhumanely abusing the preprocessor is possible even in pure C99: meet the Metalang99 library.

#include <datatype99.h>

datatype(
    BinaryTree,
    (Leaf, int),
    (Node, BinaryTree *, int, BinaryTree *)
);

int sum(const BinaryTree *tree) {
    match(*tree) {
        of(Leaf, x) return *x;
        of(Node, lhs, x, rhs) return sum(*lhs) + *x + sum(*rhs);
    }

    return -1;
}

Evaluate `sizeof` at compile time by causing duplicate case error

Assume you are working on an embedded system, or generally on something where getting printf() output may not be a trivial task.

int foo(int c)
{
    switch (c) {
        case sizeof (struct Foo): return c + 1;
        case sizeof (struct Foo): return c + 2;
    }
}

Adding such a simple function anywhere in your code may (depending on the compiler) produce an error message telling us the result of the sizeof operator.

error: duplicate case value '16'
        case sizeof(struct Foo): return c + 2;
             ^

When VLA in C doesn't smell of rotten eggs

2023-02-10T00:00:00+00:00

An earlier version of my Pitfalls of VLA in C article contained an example of a useful case of VLA. I pulled it out, having decided that the two cases where VLA are clearly useful (although I'd be overjoyed to be presented with more) deserve their own dedicated, if low-effort, post.

Size check when passing to function

"Only" a bit over two decades after the introduction of VLA to the C language, GCC started giving warnings about passing arrays smaller than the declared size to functions, provided we actually decide to utilize VLA syntax in parameters.

#include <stdio.h>

void f(const size_t size, const int buf[static size]);

int main(void)
{
    int arr[50] = { 0 };
    f(10, arr);  // acceptable
    f(50, arr);  // correct
    f(100, arr); // *WARNING*
    return 0;
}

Added bonus: explicit size annotation


Multidimensional arrays

Dynamically allocating multi-dimensional arrays where the inner dimensions are not known until runtime is really simplified by using VM (variably modified) types. It isn't even as unsafe as an automatic VLA (aVLA), since there's no arbitrary stack allocation.

int (*arr)[n][m] = malloc(sizeof *arr); // `n` and `m` are variables with dimensions
if (arr) {
    // (*arr)[i][j] = ...;
    free(arr);
}

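A variation some may find more convenient is a pointer to a row rather than to the whole array, which lets you write arr[i][j] instead of (*arr)[i][j]. A minimal sketch; table_sum() is a hypothetical function, and it assumes a compiler that supports VLA types.

```c
#include <stdlib.h>

/* Hypothetical example: fills an n x m table with i*j and returns the sum
   of all cells, or -1 if the allocation fails. */
long table_sum(size_t n, size_t m)
{
    if (n == 0 || m == 0) {
        return 0;                 /* a VLA type must have a non-zero size */
    }

    int (*arr)[m] = malloc(n * sizeof *arr);   /* n rows of m ints, on the heap */
    if (arr == NULL) {
        return -1;
    }

    long sum = 0;
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < m; ++j) {
            arr[i][j] = (int)(i * j);
            sum += arr[i][j];
        }
    }

    free(arr);
    return sum;
}
```

Only the type is variably modified here; the storage itself comes from malloc(), so a failed allocation can be detected and handled.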

The VLA-free alternatives aren't as sexy:

piecemeal allocation with malloc()

int** arr = malloc(n * (sizeof *arr));
if (arr) {
    for (int i = 0; i < n; ++i) {
        arr[i] = malloc(m * (sizeof *arr[i]));
    }
    // arr[i][j] = ...
    for (int i = 0; i < n; ++i) {
        free(arr[i]);
    }
    free(arr);
}

1D array with offsets

int* arr = malloc(n * m * (sizeof *arr));
if (arr) {
    // arr[i*m + j] = ...
    free(arr);
}

big fixed array

int arr[SAFE_SIZE][SAFE_SIZE]; // SAFE_SIZE must be safe for SAFE_SIZE*SAFE_SIZE
// arr[i][j] = ...;

A brief overview of pseudo-random number generators and testing of our own simple generator

2022-05-03T00:00:00+00:00

Pitfalls of VLA in C

2021-07-05T00:00:00+00:00

It generates much more code, and much slower code (and more fragile code), than just using a fixed key size would have done ~ Linus Torvalds

VLA (variable-length array: a real array, not just a block of memory acting like one, whose size is determined at runtime instead of at compile time) is a feature introduced to C with the C99 revision of the standard. A very useful feature, one may think, and indeed… in some cases… But since the world we live in is less than ideal, one needs to know the pitfalls of using VLA well before doing so.

If you want to know the few cases when VLA may actually be useful, you can check my other blogpost.

A fair share of the text here will focus on problems caused by automatic VLA; to reflect that, the abbreviation aVLA will be used when referring to those cases.

Allocation on stack

Let's address the elephant in the room: aVLA are usually allocated on the stack. This is the source of most of the problems, the source of discontent among programmers, and the reason why even allowing any VLA into the codebase is usually a code smell.

Let's consider a painfully simple example, very favourable to aVLA:

#include <stdio.h>

int main(void) {
    int n;
    scanf("%d", &n);
    char arr[n];
    printf("%d", arr[0]);
    return 0;
}

As we can see, it takes a number from the user and then makes an array of that size. Compile it and try it. Check how large a value you can input before getting a segfault caused by stack overflow. In my case, it was around 8 MiB. How much is that? One raw image? An MP3 or two? A few seconds of video? And the program wasn't doing anything meaningful. What if it wasn't just main()? Maybe a recursive function? The limit shrinks tremendously.

And you don't have any (portable, standard) way to react after a stack overflow: the program has already crashed, you've lost control. So you either need to make elaborate checks before declaring an array or bet that the user won't input too large values (the outcome of such a gamble ought to be obvious).

So the programmer must ensure that the aVLA size doesn't exceed some safe maximum, but in reality, if you know the safe maximum, there is rarely any reason not to always use it.

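In code, "always using the safe maximum" might look like the sketch below. MAX_ITEMS and sum_squares() are hypothetical, and the value 256 is an arbitrary assumption standing in for whatever limit is actually safe on the target.

```c
#include <stddef.h>

enum { MAX_ITEMS = 256 };   /* hypothetical safe maximum, chosen up front */

/* Sums the squares of 0..n-1 through a scratch buffer.
   Returns -1 if n exceeds the maximum instead of growing the stack. */
long sum_squares(size_t n)
{
    int scratch[MAX_ITEMS];          /* fixed size: stack use is known at compile time */

    if (n > MAX_ITEMS) {
        return -1;                   /* the caller gets a chance to react */
    }

    long sum = 0;
    for (size_t i = 0; i < n; ++i) {
        scratch[i] = (int)(i * i);
        sum += scratch[i];
    }
    return sum;
}
```

The bound check is the same one an aVLA version would need anyway; the only thing given up is the unused tail of the buffer.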

Worst of it is…

… that a segfault is actually one of the best outcomes of improperly handled aVLA. The worst case is an exploitable vulnerability, where an attacker may choose a value that causes the array to overlap with other allocations, giving them control over those values as well. A security nightmare.

So how to fix this example?

What if I need to let the user define the size, and creating a ridiculously large fixed array would be too wasteful? It's simple: use malloc()!

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int n;
    scanf("%d", &n);
    char* arr = malloc(n * (sizeof *arr));
    printf("%d", arr[0]);
    free(arr);
    return 0;
}

In this case I was able to request over 4.5 GB before a segfault. Nearly three orders of magnitude more! But I still got the segfault, right? Well, the difference is in getting at least some chance of checking the value returned by malloc() and thus being able to, for example, inform the user about the error:

    char* arr = malloc(n * (sizeof *arr));
    if (arr == NULL) {
        perror("malloc()"); // output: "malloc(): Cannot allocate memory"
    }

"but I cannot use `malloc()`!"

I've encountered a counterargument that, as C is often used as a systems/embedded language, there are situations where using malloc() may not even be possible.

I'm basically going to repeat myself here, but it is really important:

Such a device is not going to have a lot of stack either. So instead of allocating dynamically, you (probably) should determine how much you need and just always use that fixed amount.

When using aVLA on a system with a small amount of stack, it's really easy to make something which seems to work, but which blows your stack if your function gets called from a deep call stack combined with a large amount of data.

If you always allocate fixed amounts of stack space everywhere, and you test it, you know you're good. If you dynamically allocate on the stack, you have to test all your code paths with all the largest sizes of allocated space, which is much harder and makes mistakes much easier. Don't make it even easier to shoot yourself in the foot for no real advantage.

Creation by accident

Unlike most other dangerous C functionality, aVLAs don't have the barrier of being unknown. Many newbies learn to use them via trial and error, but don't learn about the pitfalls.
The following is a simple mistake I have observed even experienced developers making (especially those with a C++ background); it silently creates an aVLA when it's clearly not necessary:

const int n = 10;
int A[n];

Thankfully, any half-decent compiler will notice and optimize the aVLA away, but… what if it doesn't notice? Or what if, for some reason (safety?), optimizations are not turned on? But surely it isn't that much worse, right? Well…

Way slower than fixed size

Without compiler optimizations, a function with the aVLA from the previous example results in 7 times more Assembly instructions than its fixed-size counterpart before moving past the array definition (look at the body before jmp .L2). But that's without optimizations; with them, the produced Assembly is exactly the same.

So let's look at an example where the aVLA is not used by mistake:

#include <stdio.h>
void bar(int*, int);

void foo(int n) {

#if VLA
    int A[n];
#else
    int A[1000];  // Let's make it bigger than 10! (or there won't be anything to examine)
#endif

    for (int i = n; i--;) {
        scanf("%d", &A[i]);
    }
    bar(A, n);
}

int main(void) {
    foo(10);
    return 0;
}

For our educational purposes in this example, the -O1 level of optimisation will work best (as the Assembly will be clearer and -O2 won't really help the aVLA's case much here).

When we compile the aVLA version, before the instructions corresponding to the for loop, we get:

push    rbp
mov     rbp, rsp
push    r14
push    r13
push    r12
push    rbx
mov     r13d, edi
movsx   r12, edi       ; here aVLA "starts"...
sal     r12, 2         ;
lea     rax, [r12+15]  ;
and     rax, -16       ;
sub     rsp, rax       ;
mov     r14, rsp       ; ... and there "ends"

The aVLA-free version, on the other hand, generates:

push    r12
push    rbp
push    rbx
sub     rsp, 4000      ; this is caused by array definition
mov     r12d, edi

So not only does the fixed array produce less code, but the code is also much simpler. Why, the aVLA even causes more overhead at the beginning of the function. It's not that much more in the grand scheme of things, but it still isn't just a pointer bump.

But are those differences significant enough to care? Yes, they are.

No initialization

To add to the issue of inadvertent aVLAs, the following isn't allowed:

int n = 10;
int A[n] = { 0 };

Even with optimizations, initialisation isn't allowed for an aVLA. So despite you wanting a fixed-size array, and the compiler being technically able to provide one, it won't work (and if it does… it's breaking the specification…).
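For comparison, here is a sketch of two VLA-free ways to get a zeroed array. It relies on the fact that an enum constant is an integer constant expression in C, whereas a const int object is not. The names FIXED_LEN, sum_fixed() and zeroed_ints() are made up for illustration.

```c
#include <stdlib.h>

enum { FIXED_LEN = 10 };  /* hypothetical name; an integer constant expression */

/* Fixed-size array: FIXED_LEN is a constant, so an initialiser is allowed. */
int sum_fixed(void) {
    int A[FIXED_LEN] = { 0 };
    int sum = 0;
    for (int i = 0; i < FIXED_LEN; i++) {
        sum += A[i];
    }
    return sum;   /* always 0: every element was zero-initialised */
}

/* Runtime size: calloc() hands back zeroed memory (or NULL); the caller frees it. */
int *zeroed_ints(size_t n) {
    return calloc(n, sizeof(int));
}
```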

Mess for compiler writers

A few months ago I saved a comment on Reddit listing problems encountered with VLAs from a compiler writer's perspective. I'll allow myself to cite the listed issues:

- A VLA applies to a type, not an actual array. So you can create a typedef of a VLA type, which "freezes" the value of the expression used, even if elements of that expression change at the time the VLA type is applied
- VLAs can occur inside blocks, and inside loops. This means allocating and deallocating variable-sized data on the stack, and either screwing up all the offsets, or needing to do things indirectly via pointers.
- You can use goto into and out of blocks with active VLAs, with some things restricted and some not, but the compiler needs to keep track of the mess.
- VLAs can be used with multi-dimensional arrays.
- VLAs can be used as pointer targets (so no allocation is done, but it still needs to keep track of the variable size).
- Some compilers allow VLAs inside structure definitions (I really have no idea how that works, or at what point the VLA size is frozen, so that all instances have the same VLA(s) sizes.)
- A function can have dozens of VLAs active at any one time, with some being created or destroyed at different times, or conditionally, or in loops.
- sizeof needs to be specially implemented for VLAs, and all the necessary info (for actual VLAs, VLA-types, and hybrid VLA/fixed-size types and arrays and pointed-to VLAs).
- 'VLA' is also the term used for multi-dimensional array parameters, where the dimensions are passed by other parameters.
- On Windows, with some compilers (GCC at least), declaring local arrays which make the stack frame size over 4 KiB, mean calling a special allocator (__chkstk()), as the stack can only grow a page at a time. When a VLA is declared, since the compiler doesn't know the size, it needs to call __chkstk for every such function, even if the size turns out to be small.

And believe me, if you take a stroll around some C forums (or the meetings of the standard committee [sic!]), you will see even more complaints.

Reduced portability

Due to all the problems presented so far, some compiler vendors decided not to fully support C99. The primary example is Microsoft with its MSVC. The C Standard Committee also noticed the problem, and with the C11 revision all instances of VLAs were made optional; C2x partially reverts that decision by mandating VM types (aVLAs are still optional; there is even a slight sentiment towards deprecating them entirely, but removing something from the, nomen omen, standard is way harder than putting it in).

That means code using a VLA won't necessarily be compiled by a C11 compiler, so, assuming you are aiming for portability, you need to check whether VLAs are supported using the __STDC_NO_VLA__ macro and provide a version without (a)VLA as a fallback. Wait… if you need to implement a VLA-free version either way, then what's the point of doubling the code and creating a VLA in the first place?!
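To make that duplication concrete, here is a minimal sketch of such a check. The bound MAX_LEN and the function reverse_copy() are hypothetical; the only thing taken from the standard is the __STDC_NO_VLA__ macro itself.

```c
#include <stddef.h>

enum { MAX_LEN = 256 };  /* hypothetical bound needed by the fallback */

/* Reverses src[0..n) into dst[0..n); returns 0 on success, -1 if n is out of range. */
int reverse_copy(size_t n, const int *src, int *dst) {
    if (n == 0 || n > MAX_LEN) {
        return -1;
    }
#ifdef __STDC_NO_VLA__
    int tmp[MAX_LEN];       /* VLA-free fallback */
#else
    int tmp[n];             /* aVLA, only where the compiler supports it */
#endif
    for (size_t i = 0; i < n; i++) {
        tmp[i] = src[n - 1 - i];
    }
    for (size_t i = 0; i < n; i++) {
        dst[i] = tmp[i];
    }
    return 0;
}
```

Note that the fallback branch forces an upper bound on n anyway, at which point the aVLA branch buys nothing.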

(nitpick) Breaking conventions

This one is more of a nitpick, but it's still another reason to dislike VLAs. There is a widely used convention of passing the object first and then its parameters, which for arrays means:

void foo(int** arr, int n, int m) { /* arr[i][j] = ... */ }

C99 specified that array sizes need to be parsed immediately when encountered within a function definition's parameter list, which means that when using a VLA you cannot write an equivalent of the above:

void foo(int arr[n][m], int n, int m) { /* arr[i][j] = ... */ } // INVALID!

You need to break with the convention and write:

void foo(int n, int m, int arr[n][m]) { /* arr[i][j] = ... */ }
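A small sketch of the sizes-first order in use follows. The function fill_identity() is made up for this example, and it assumes a compiler that supports variably modified parameter types.

```c
/* Sizes come first so that they are in scope for the array parameter. */
void fill_identity(int n, int m, int arr[n][m]) {
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < m; j++) {
            arr[i][j] = (i == j);   /* 1 on the diagonal, 0 elsewhere */
        }
    }
}
```

A caller with an ordinary int a[2][3] would write fill_identity(2, 3, a); no aVLA is created in the process, only the parameter type is variably modified.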

Alternatively, you could use the obsolete syntax (obsolescent even in ANSI C; finally removed in C2x), but that would be pointless, as compilers don't check parameters in such a case, so any benefits of using a VLA would be lost.

void foo(int[*][*], int, int);
void foo(arr, n, m)
    int n;
    int m;
    int arr[n][m];
{
    // arr[i][j] = ...
}

Conclusion

In short, refrain from using VLAs and avoid automatic VLAs like the devil avoids holy water; if your compiler supports it, compile with the -Wvla flag or similar (and definitely with -Wvla-larger-than=0 - this allows VM types, while warning about aVLAs).

If you find yourself in one of the situations where a VLA (or a VM type) is a valid/good solution, then of course do use it, but keep in mind the limits I've outlined here.

Steps to learn Vim

2021-01-23T00:00:00+00:00

Learn how to ask a good question and to type it into a search engine before asking on forums
Do vimtutor - it's a 30-minute tutorial that teaches the most basic Vim functionality hands-on
RTFM! The user manual (:h user-manual) will guide you through every feature from basic to advanced
Use :help and :helpgrep to find more detailed documentation of a specific feature
:h faq - Frequently Asked Questions
:h quickref - quick reference guide
idiomatic vimrc by romainl

Make Vim follow XDG Base Directory specification

2020-12-13T00:00:00+00:00

The XDG Base Directory specification, $XDG_CONFIG_HOME etc. It's a great thing - configs separated from user data and cache, no clutter in the home directory. Unfortunately, many programs still don't respect it, including Vim. But what kind of favourite text editor would it be if we couldn't reconfigure it!

TL;DR

Put this into your shell config (e.g. ~/.profile):

export VIMINIT="set nocp | source ${XDG_CONFIG_HOME:-$HOME/.config}/vim/vimrc"

At the top of vimrc:

" XDG support

if empty($MYVIMRC) | let $MYVIMRC = expand('<sfile>:p') | endif

if empty($XDG_CACHE_HOME)  | let $XDG_CACHE_HOME  = $HOME."/.cache"       | endif
if empty($XDG_CONFIG_HOME) | let $XDG_CONFIG_HOME = $HOME."/.config"      | endif
if empty($XDG_DATA_HOME)   | let $XDG_DATA_HOME   = $HOME."/.local/share" | endif
if empty($XDG_STATE_HOME)  | let $XDG_STATE_HOME  = $HOME."/.local/state" | endif

set runtimepath^=$XDG_CONFIG_HOME/vim
set runtimepath+=$XDG_DATA_HOME/vim
set runtimepath+=$XDG_CONFIG_HOME/vim/after

set packpath^=$XDG_DATA_HOME/vim,$XDG_CONFIG_HOME/vim
set packpath+=$XDG_CONFIG_HOME/vim/after,$XDG_DATA_HOME/vim/after

let g:netrw_home = $XDG_DATA_HOME."/vim"
call mkdir($XDG_DATA_HOME."/vim/spell", 'p', 0700)

set backupdir=$XDG_STATE_HOME/vim/backup | call mkdir(&backupdir, 'p', 0700)
set directory=$XDG_STATE_HOME/vim/swap   | call mkdir(&directory, 'p', 0700)
set undodir=$XDG_STATE_HOME/vim/undo     | call mkdir(&undodir,   'p', 0700)
set viewdir=$XDG_STATE_HOME/vim/view     | call mkdir(&viewdir,   'p', 0700)

if !has('nvim') " Neovim has its own special location
  set viminfofile=$XDG_STATE_HOME/vim/viminfo
endif

Step-by-step

Relocating vimrc

To begin with, since version 7.3.1178, Vim will search for ~/.vim/vimrc if ~/.vimrc is not found. So let's move the file there.

Next, let's move our ~/.vim to $XDG_CONFIG_HOME/vim. Now we need to tell Vim to read the config from this new location in preference to ~/.vim. There are three ways to do it.

Shell alias

A pretty straightforward method. The shell will just substitute the command vim with the alias body.

alias vim='vim -u ${XDG_CONFIG_HOME:-$HOME/.config}/vim/vimrc'

Downside? It works only in the shell.

`VIMINIT` environmental variable

export VIMINIT="set nocp | source ${XDG_CONFIG_HOME:-$HOME/.config}/vim/vimrc"

Cons? It affects Neovim as well. If you wish for Neovim and Vim configurations to still be separated, then:

export VIMINIT="if has('nvim') | so ${XDG_CONFIG_HOME:-$HOME/.config}/nvim/init.vim | else | set nocp | so ${XDG_CONFIG_HOME:-$HOME/.config}/vim/vimrc | endif"

Wrapper script

Save the following code as vim in $HOME/.local/bin * and make it executable with chmod +x vim

#!/usr/bin/env sh

# Find the first vim in PATH that is not in this script's own directory.
for dir in $(echo "$PATH" | tr ":" "\n" | grep -Fxv "$(dirname "$0")"); do
    if [ -x "$dir/vim" ]; then
        exec "$dir/vim" -u "${XDG_CONFIG_HOME:-$HOME/.config}"/vim/vimrc "$@"
    fi
done

It doesn't affect Neovim and works outside the shell, but you need to carry it together with your config.

* Remember to add it to the beginning of the PATH environment variable. It can also be another location of your choice instead of $HOME/.local/bin.

Now the code in our vimrc

First of all, although it's not mandatory, let's set the $MYVIMRC variable:

if empty($MYVIMRC) | let $MYVIMRC = expand('<sfile>:p') | endif

Let's define fallback locations in case XDG_* variables are not set.

if empty($XDG_CACHE_HOME)  | let $XDG_CACHE_HOME  = $HOME."/.cache"       | endif
if empty($XDG_CONFIG_HOME) | let $XDG_CONFIG_HOME = $HOME."/.config"      | endif
if empty($XDG_DATA_HOME)   | let $XDG_DATA_HOME   = $HOME."/.local/share" | endif
if empty($XDG_STATE_HOME)  | let $XDG_STATE_HOME  = $HOME."/.local/state" | endif

Let's add entries to runtimepath:

set runtimepath^=$XDG_CONFIG_HOME/vim
set runtimepath+=$XDG_DATA_HOME/vim
set runtimepath+=$XDG_CONFIG_HOME/vim/after

$XDG_CONFIG_HOME/vim and $XDG_CONFIG_HOME/vim/after are just equivalents of ~/.vim and ~/.vim/after, but $XDG_DATA_HOME/vim is brand new - that's where we will keep downloadables (like plugins and spell files), Netrw bookmarks etc.

Let's set the directory for Vim 8 built-in packages:

set packpath^=$XDG_DATA_HOME/vim
set packpath+=$XDG_DATA_HOME/vim/after

Netrw is just as easy:

let g:netrw_home = $XDG_DATA_HOME."/vim"

What about spell files? Well, this one is trickier, because it isn't controlled by any option. Instead, Vim searches for a spell directory in the whole runtime path. If none is found, it falls back to ~/.vim/spell. So let's create one at the desired location ourselves!

call mkdir($XDG_DATA_HOME."/vim/spell", 'p', 0700)

So far so good. We are left with state (backup, undo, swap, viminfo, view). Vim doesn't create directories for them (even for the defaults), so we will need to do it ourselves - thankfully VimL has the mkdir() function.

set backupdir=$XDG_STATE_HOME/vim/backup | call mkdir(&backupdir, 'p', 0700)
set directory=$XDG_STATE_HOME/vim/swap   | call mkdir(&directory, 'p', 0700)
set undodir=$XDG_STATE_HOME/vim/undo     | call mkdir(&undodir,   'p', 0700)
set viewdir=$XDG_STATE_HOME/vim/view     | call mkdir(&viewdir,   'p', 0700)

if !has('nvim') " Neovim has its own location which already complies with XDG specification
  set viminfofile=$XDG_STATE_HOME/vim/viminfo
endif

Congratulations! Now your Vim is configured in accordance with the XDG Base Directory specification.

Sources

Best aspects of C language

2020-11-03T00:00:00+00:00

How come that, after over half a century, C is still a relatively popular and widely used language when others have withered into obscurity? Why, over all this time, has nothing been able to fully replace it? Why is it still taught in schools?

Let's have a look at some (although not all) of what are, in my opinion, the best aspects of the language that contributed to this state of affairs.

Spirit of C

Let's start with a quote from the document C99RationaleV5.10:

The C89 Committee kept as a major goal to preserve the traditional spirit of C. There are many facets of the spirit of C, but the essence is a community sentiment of the underlying principles upon which the C language is based. Some of the facets of the spirit of C can be summarized in phrases like:

- Trust the programmer.
- Don’t prevent the programmer from doing what needs to be done.
- Keep the language small and simple.
- Provide only one way to do an operation.
- Make it fast, even if it is not guaranteed to be portable.

"Mid-level"

With regard to level of abstraction, there are two types of languages: low-level and high-level.

Low-level languages are close to the hardware; the only thing closer to the CPU would be electricity itself. Those languages are divided into machine code and Assembly. The former is a stream of raw, usually binary, data. If somebody is required to work with it, they usually do so using the more "readable" hexadecimal form.

Second-generation languages - Assembly - provide one abstraction level on top of the machine code. Those languages are mostly only a mapping of human-readable symbols, including symbolic addresses, to opcodes, addresses, numeric constants, strings and so on. They also differ for each processor.

How do high-level languages, providing more abstraction, compare? Quoting Wikipedia:

In contrast to low-level programming languages, it may use natural language elements, be easier to use, or may automate (or even hide entirely) significant areas of computing systems (e.g. memory management), making the process of developing a program simpler and more understandable than when using a lower-level language. The amount of abstraction provided defines how "high-level" a programming language is.

As a big oversimplification: low = more machine friendly, high = more human friendly.

C is high-level, but back when it was created, most of the work was still being done in low-level Assembly. As a result, C has a lower level of abstraction than other (still) widely used languages and is often utilized for low-level programming, hence I like to call it "mid-level".

You can also easily (with less language bloat) compile C code to Assembly and examine what instructions the processor will execute.

And if there is a need, many popular C compilers offer you the option to level down and use inline Assembly to squeeze everything out of the CPU. It's a feature that not many other languages implement.

Fairly simple

Low-level languages are harder to program in. Not because they are more complicated, but because they are more error-prone and thus require way more commitment, memorizing and fiddling.

C is mid-level, so "by definition" it's easier. But here comes the surprise: learning it is also easier compared to higher-level languages! Why? Because its syntax isn't extensive, it doesn't take much to learn the basics. Loops, functions, structures, pointers, variables, types - that's the core of the language. An intense week is enough to get the general idea. The rest is "just" maths and CS theory.

But, but, but, but! Don't get me wrong!
The language is simple; programming is not necessarily!
To master anything you will need a lot more practice!
A lot! And that's true for anything out there!

Fast, lightweight and flexible

The standard C library is small compared to those of other languages (e.g. Java). It's small enough that you could try to memorize all its functions and succeed (not that it would be a huge benefit). Yeah, many things should have been deprecated long ago and there obviously is some bloat (try to maintain something without bloat for a few years, let alone a few decades), but there is not enough of it to hinder performance.

And what if libc is still too much? Nothing stands in the way of not using it at all! Just don't include any of its headers - not even a simple printf() will be present. Replace it with any other library of your choice.

Maturity, emphasis on proper memory management, inline Assembly, little abstraction and little bloat give the programmer really good control over the program.

This makes C an ideal choice for OS kernels (Linux, Windows NT or macOS's XNU, to name a few) or for implementing other languages (e.g. Python). That's also why C is so popular on embedded systems, where you cannot afford to waste resources.

Standard, no blessed implementation

This one relates to both the previous and the next point. The C programming language is defined basically only by a document published by the International Organization for Standardization every few years. Contrary to languages like Python, Rust or Java, there is no particular implementation that is the C language.

Combine it with flexibility and you have a language which can easily target any platform.

Ubiquity = portability

Does any (still) significant platform with no C compiler available exist? Yes, the ones that are programmed exclusively in Assembly; for all others, C is available. C programs are present on your high-end gaming PC, on NASA spacecraft and in ticket machines. Literally everywhere. C software runs the world.

In line with the previous paragraphs, C is a particularly strong choice for microcontrollers and other forms of embedded systems, which surround us every day.

And have you heard about FFI? Turns out many other languages like to have some kind of compatibility with C.

You don't need to worry whether you will be able to use this language somewhere, as in 99% of cases you can! (Although it doesn't mean you should…) It means that while your code may not be 100% portable, you will be a portable programmer.

The influencer

C has both directly and indirectly influenced countless languages. C++, Java, Go, D, Rust, Perl, even PHP and Python - those are but a few examples.

Obviously, knowledge of C isn't needed to learn any of them and sometimes may even push you towards less-than-ideal practices.

Nevertheless, I think it's beneficial to remember the roots. And if you are cautious, familiarity with C might give you some foothold. It's especially the case with C++.

Rich collection of libraries

I suspect all this talk about fastness, lightness, mid-level, Assembly etc. might have given you the idea that you will need to implement everything yourself. There may indeed not be any LinkedHashMap or other functionality like garbage collection available for C… except… not entirely.

C is a mature and popular language, so while those features aren't built-in, believe me: name a thing and somebody somewhere has already created a library for it (although it may be too obscure to find easily, it does exist).

You want a garbage collector? Boehm GC has you covered. TUI? Nothing like timeless ncurses. Examples can be listed almost infinitely: GTK, PDCurses, libcurl, ALSA, Genann, libsoundio, SDL, SQLite, getopt, OpenGL, inih, GMP, cJSON, MuPDF, libXDGdirs, OpenSSL…

It's a very universal language - you can program basically anything: a web server, a video game (e.g. classics from id Software), an operating system, another programming language or a wrapper forcing Firefox to obey the XDG Base Directory specification, because when I'm an administrator, the programs will do exactly what I told them to! There were madlads doing WebDev in C via CGI scripts (and nowadays with WebAssembly).

However, please remember - the fact that you can doesn't mean you should. For example, if you want to create a video game, you really ought to turn your eyes to C++. And you should know that…

C++ is highly backward compatible

Why do I even make a whole point out of C++ here? Because it's one of the most widely used languages today and you are all but certain to encounter it.

In contrast to other languages embracing C compatibility, C++ was created as its direct descendant, and the committee goes to great lengths to keep the "copy-paste" compatibility with it - in most cases you can compile C code as C++ just fine.

Don't be mistaken: C++ isn't by any means a superset of C - C code isn't always going to work as C++, and good C code isn't necessarily good C++ code. Consider this example:

int* x = malloc(10 * sizeof (*x));

This is the proper way in C, but in C++ there has to be an (int*) cast before malloc() for it to compile, not to mention that you should use new int[10] instead.

That said, in most cases you can safely use a C library in your C++ project.

All the examples from the previous point not only can be, but often are, used that way.

Even libraries already compiled with a C compiler can be made compatible with C++, thanks to the extern "C" linkage specifier.

Safety

In my opinion, this comment by Reddit user u/tim36272 captures this point perfectly:

You're thinking of things like type safety, garbage collection etc.

I'm talking about safety in terms of people dying. Things like garbage collection are the opposite of life safety. What if your airplane decided it needed to free up memory ten seconds from touchdown so it ran the garbage collector? What if running the garbage collector caused a valve to respond 0.1 seconds late to a command, which caused a chain reaction resulting in a hydraulic line bursting and losing control of the rudder?

C can be safe because it does exactly what the programmer tells it to do, nothing more and nothing less. There's no magic going on behind the scenes which could have complex interactions with other behind the scenes magic.

A common example is std::vector from C++. This container expands as needed to accommodate as many elements as you need. But you have a limited amount of memory on the system, so you need to do static analysis to determine the maximum size of that vector. And you need to be sure that you have enough memory for that plus everything else in your system.

Well, now you've eliminated a lot of the convenience of using std::vector. You might as well just allocate that max size to it and avoid all the overhead std::vector imposes by growing in size.

The other main advantage of std::vector are templates. If you were to use a template in safety critical code you'd need to prove that the code generated by the compiler is correct for every template. Now that you're diving down into all this auto-generated machine code, it would be easier to just write the code yourself and avoid the complexity introduced by the compiler's template generator.

So, if we eliminate all the usefulness of std::vector, why use it at all?

Repeat that process for most features in most languages and voila! You're back at C

Important note: if you want such safety, you throw portability out of the window!
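To give a feel for what "just allocate that max size" from the quote can look like in C, here is a rough sketch of a fixed-capacity container. The names fixed_vec, fixed_vec_push() and the capacity of 8 are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

enum { VEC_CAPACITY = 8 };  /* hypothetical bound, e.g. from static analysis */

struct fixed_vec {
    int    items[VEC_CAPACITY];
    size_t len;
};

/* Appends value; returns false instead of growing when the vector is full. */
bool fixed_vec_push(struct fixed_vec *v, int value) {
    if (v->len >= VEC_CAPACITY) {
        return false;
    }
    v->items[v->len++] = value;
    return true;
}
```

There is no hidden allocation: the memory footprint is known at compile time, and the "full" case is an explicit return value that the caller has to handle.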

Preprocessor

C (and its direct derivatives like C++ or Objective-C) is the only language I know of which includes a lexical preprocessor in its specification.

That's understandable, considering the fact that many newer languages contain mechanisms which make preprocessing partially obsolete.
And when needed, there are always fallbacks:

- using another language (or even the language itself) as a preprocessor (e.g. Python as a preprocessor for Java)
- using an external preprocessor (e.g. m4)
- …or the C preprocessor (yes, there is nothing stopping you from preprocessing JavaScript with a C compiler!)

I will repeat the link with its proper title: The C Preprocessor in Javascript? - I really recommend reading this short text, as I think it's enough to understand why having a standardized, portable preprocessor is a good thing in C.

Program in C song

Lyrics

Ariel, listen to me
OO languages?
It's a mess.
Programming in C is better than anything they got over there.

The syntax might seem much sweeter
Where objects and subtypes play
But frills like inheritance
Will only get in the way!
Admire C's simple landscape
Efficiently dangerous!
No templates or fancy pitfalls
... like Java and C++!

Program in C
Program in C

Pointers, assembly,
Manage your memory
With malloc() and free()!
Don't sink your app with runtime bloat
Software in C will stay afloat
Do what you want there
Close to the hardware!

Program in C!

Conclusion

Learning C is a valuable experience and may be really worth it. If not as your first language, then as your second, third, fourth or whatever. There are advantages, but (as always) also some disadvantages; at the very least, trying won't hurt. Give it a chance; who knows, you may truly get to like it.

And don't believe people saying "C is dead". Love it or hate it, C is still kicking, and the number of crucial projects written in it will keep it relevant for the next few decades too.

… a blot on the landscape

C was created in 1972 on the foundation of the B language, so over the years it has acquired some quirks (memcpy() is declared in the string.h header!); some things became obsolete, some useless, and are kept only for compatibility with old code.

A beginner is likely to burn a lot of time chasing down strange behavior caused by memory corruption with no idea how to reason about it, which may lead to real discouragement from programming in general. There are few to no mechanisms preventing programmers from shooting themselves in the foot.

It's also important to take into account that C is not an introduction to Computer Science. Learning any single language isn't. You need to study the field properly to get a true understanding of how vast it is. If not through formal education at a university, then an online one will suffice too. The Internet is full of resources (you may find a few on my list).
