Git bisect run techniques

git bisect run lets us find the breaking commit in O(log(N)) time for N commits, by doing a binary search through commits to determine the one which broke things. It is extremely useful, but the thing which often takes a long time is figuring out which command to use to reliably determine whether a commit is good or bad. This article explains some techniques to help with this task.¹

Script file

Put anything but the most trivial command in a script file, for easier reading and writing. This way we won’t need to deal with an extra layer of complexity and bugs. A quick template:

cat > bisect.bash << 'EOF'
#!/usr/bin/env bash

set -o errexit # Exit on the first non-zero exit code
set -o noclobber # Don't allow redirecting to existing files, for safety
set -o nounset # Treat dereferencing non-existing variables as an error
set -o pipefail # Fail fast when using pipes
shopt -s failglob # Fail when a glob does not match any files
shopt -s inherit_errexit # Apply errexit setting to command substitutions

# Setup & build
# For example, (re-)install prerequisites, reset DB, etc
# Exit code 125 means the code can't be tested, and Git skips the current commit
COMMAND || exit 125
[…]

# Test
COMMAND
[…]
EOF
chmod u+x bisect.bash

This way, once we set up the commands, the Git bisect run is always just git bisect run ./bisect.bash.

Search for text in command output

grep --quiet is probably the simplest way to search through standard output of a command for a value. But for maximum debugging potential we probably want to print the original output of the command as-is (slightly modified for readability):

{
    COMMAND |
        tee -p /dev/fd/3 |
        grep --quiet REGEX
} 3>&1

The above means that the commit will be considered good if any line of standard output of COMMAND matches REGEX. Examples:

my_command() {
    echo 'stdout content'
    echo 'stderr content' >&2
}

# Good commit because it finds "stdout"
{
    my_command |
        tee -p /dev/fd/3 |
        grep --quiet stdout
} 3>&1

# Bad commit because "stderr" occurs on a different file descriptor
{
    my_command |
        tee -p /dev/fd/3 |
        grep --quiet stderr
} 3>&1

# Bad commit because "other" does not appear anywhere
{
    my_command |
        tee -p /dev/fd/3 |
        grep --quiet other
} 3>&1

# Prints "stderr content"
{
    {
        my_command |
            tee -p /dev/fd/3 |
            grep --quiet stdout
    } 3>&1
} > /dev/null

# Prints "stdout content"
{
    {
        my_command |
            tee -p /dev/fd/3 |
            grep --quiet stdout
    } 3>&1
} 2> /dev/null

Negating the exit code

To treat a commit as bad if any line of standard output of COMMAND matches REGEX²:

{
    if COMMAND | tee -p /dev/fd/3 | grep --quiet REGEX; then
        false
    else
        true
    fi
} 3>&1

Examples:

my_command() {
    echo 'stdout content'
    echo 'stderr content' >&2
}

# Bad commit because it finds "stdout"
{
    if my_command | tee -p /dev/fd/3 | grep --quiet stdout; then
        false
    else
        true
    fi
} 3>&1

# Good commit because we're not searching through standard error
{
    if my_command | tee -p /dev/fd/3 | grep --quiet stderr; then
        false
    else
        true
    fi
} 3>&1

# Good commit because "other" does not appear anywhere
{
    if my_command | tee -p /dev/fd/3 | grep --quiet other; then
        false
    else
        true
    fi
} 3>&1

Search through standard error

To search through standard error instead of standard output, we need to swap standard error and standard output at the start and end of the pipeline, while telling tee to use another file descriptor for the copy of standard output³:

{
    COMMAND 3>&2 2>&1 1>&3- |
        tee -p /dev/fd/4 |
        grep --quiet REGEX
} 3>&2 2>&1 1>&3- 4>&1

Examples:

my_command() {
    echo 'stdout content'
    echo 'stderr content' >&2
}

# Bad commit because "stdout" occurs on a different file descriptor
{
    my_command 3>&2 2>&1 1>&3- |
        tee -p /dev/fd/4 |
        grep --quiet stdout
} 3>&2 2>&1 1>&3- 4>&1

# Good commit because it finds "stderr"
{
    my_command 3>&2 2>&1 1>&3- |
        tee -p /dev/fd/4 |
        grep --quiet stderr
} 3>&2 2>&1 1>&3- 4>&1

# Bad commit because "other" does not appear anywhere
{
    my_command 3>&2 2>&1 1>&3- |
        tee -p /dev/fd/4 |
        grep --quiet other
} 3>&2 2>&1 1>&3- 4>&1

# Prints "stderr content"
{
    {
        my_command 3>&2 2>&1 1>&3- |
            tee -p /dev/fd/4 |
            grep --quiet stdout
    } 3>&2 2>&1 1>&3- 4>&1
} > /dev/null

# Prints "stdout content"
{
    {
        my_command 3>&2 2>&1 1>&3- |
            tee -p /dev/fd/4 |
            grep --quiet stdout
    } 3>&2 2>&1 1>&3- 4>&1
} 2> /dev/null

Interactive program outputs

Sometimes the output is coming from an interactive program. It’s kind of tedious to deal with shutting down the program manually for each commit, so let’s kill the program after some generous timeout, assuming that it would’ve had the time to output what we were looking for if it was going to:

{
    {
        timeout DURATION COMMAND || [[ $? -eq 124 ]]
    } |
        tee -p /dev/fd/3 |
        grep --quiet REGEX
} 3>&1

Examples:

my_app() {
    echo 'stdout content'
    echo 'stderr content' >&2
    sleep infinity
}
export -f my_app # Propagate the function to the child `bash` shell

# Good commit because it finds "stdout"
{
    {
        timeout 0.5s bash -c my_app || [[ $? -eq 124 ]]
    } |
        tee -p /dev/fd/3 |
        grep --quiet stdout
} 3>&1

# Bad commit because "stderr" occurs on a different file descriptor
{
    {
        timeout 0.5s bash -c my_app || [[ $? -eq 124 ]]
    } |
        tee -p /dev/fd/3 |
        grep --quiet stderr
} 3>&1

# Bad commit because "other" does not appear anywhere
{
    {
        timeout 0.5s bash -c my_app || [[ $? -eq 124 ]]
    } |
        tee -p /dev/fd/3 |
        grep --quiet other
} 3>&1

# Prints "stderr content"
{
    {
        {
            timeout 0.5s bash -c my_app || [[ $? -eq 124 ]]
        } |
            tee -p /dev/fd/3 |
            grep --quiet stdout
    } 3>&1
} > /dev/null

# Prints "stdout content"
{
    {
        {
            timeout 0.5s bash -c my_app || [[ $? -eq 124 ]]
        } |
            tee -p /dev/fd/3 |
            grep --quiet stdout
    } 3>&1
} 2> /dev/null

All of the above

To reproduce this issue⁴ and verify the fix, we need to check for a warning in the standard error of digiKam, a GUI program. This is the resulting script, with some annotations:

#!/usr/bin/env bash

set -o errexit -o noclobber -o nounset -o pipefail
shopt -s failglob inherit_errexit

# Reusable & reproducible build command; will not rebuild the second time around
build=(nix-build --attr digikam --no-out-link)
"${build[@]}" || exit 125

# If "ExifTool process cannot be started" appears on standard error within 10
# seconds, this is a bad commit.
# ShellCheck insists that the *whole* command exit code must be negated.
{
    if {
        timeout 10s "$("${build[@]}")/bin/digikam" 3>&2 2>&1 1>&3- || [[ $? -eq 124 ]]
    } | tee -p /dev/fd/4 | grep --quiet 'ExifTool process cannot be started'; then
        false
    else
        true
    fi
} 3>&2 2>&1 1>&3- 4>&1

Verifying that the test failed before the fix:

$ git checkout 14d2848c9251bc92c475ba3c0d379b5696c4e04b~1 # before fix
Previous HEAD position was […]
HEAD is now at b0baa71c8a72 […]
$ ./bisect.bash # Should show the error message
/nix/store/6g7i3267qdj5brdd3y5yp667r2vfzh90-digikam-8.4.0
qt.qpa.plugin: Could not find the Qt platform plugin "wayland" in ""
digikam.metaengine: ExifTool process cannot be started ( "" )
digikam.general: DK_PLUGIN_PATH env.variable detected. We will use it to load plugin...
digikam.facedb: Cannot found faces engine model "shapepredictor.dat"
digikam.facedb: Faces recognition feature cannot be used!
digikam.facedb: Cannot found faces engine DNN model "openface_nn4.small2.v1.t7"
digikam.facedb: Faces recognition feature cannot be used!
Path override failed for key base::DIR_APP_DICTIONARIES and path '/nix/store/6g7i3267qdj5brdd3y5yp667r2vfzh90-digikam-8.4.0/bin/qtwebengine_dictionaries'
Path override failed for key base::DIR_APP_DICTIONARIES and path '/nix/store/i9c1lffqmhm8ham6wapnbm015s2zyz25-qtwebengine-6.7.2/libexec/qtwebengine_dictionaries'
Path override failed for key base::DIR_APP_DICTIONARIES and path '/nix/store/i9c1lffqmhm8ham6wapnbm015s2zyz25-qtwebengine-6.7.2/libexec/qtwebengine_dictionaries'
kf.xmlgui: Unhandled container to remove :  Digikam::DigikamApp
$ echo $? # Should've failed
1

Verifying that the test succeeds after the fix:

$ git checkout 14d2848c9251bc92c475ba3c0d379b5696c4e04b # after fix
Previous HEAD position was b0baa71c8a72 […]
HEAD is now at 14d2848c9251 digikam: wrap with exiftool
$ ./bisect.bash # Should *not* show the error message
/nix/store/qjllzrspb5dmm8yywr4rb54ffps41nah-digikam-8.4.0
qt.qpa.plugin: Could not find the Qt platform plugin "wayland" in ""
digikam.general: DK_PLUGIN_PATH env.variable detected. We will use it to load plugin...
digikam.facedb: Cannot found faces engine model "shapepredictor.dat"
digikam.facedb: Faces recognition feature cannot be used!
digikam.facedb: Cannot found faces engine DNN model "openface_nn4.small2.v1.t7"
digikam.facedb: Faces recognition feature cannot be used!
Path override failed for key base::DIR_APP_DICTIONARIES and path '/nix/store/qjllzrspb5dmm8yywr4rb54ffps41nah-digikam-8.4.0/bin/qtwebengine_dictionaries'
Path override failed for key base::DIR_APP_DICTIONARIES and path '/nix/store/i9c1lffqmhm8ham6wapnbm015s2zyz25-qtwebengine-6.7.2/libexec/qtwebengine_dictionaries'
Path override failed for key base::DIR_APP_DICTIONARIES and path '/nix/store/i9c1lffqmhm8ham6wapnbm015s2zyz25-qtwebengine-6.7.2/libexec/qtwebengine_dictionaries'
kf.xmlgui: Unhandled container to remove :  Digikam::DigikamApp
$ echo $? # Should've succeeded
0

This article assumes that you’re using GNU tools. If that means nothing to you, then you are probably using them. ↩
Normally we would just use ! to negate the exit code of a command, but this does not play nicely with errexit. ↩
This can probably be simplified to not use file descriptor 4, but that’s enough file descriptor fiddling for now. ↩
This issue seems to have existed basically forever, because a bisect with the previous release of the repo didn’t find a relevant commit. ↩