Daily Source Reading: `echo`

`echo`

For the first entry in our Daily Source Reading exercise, we are picking the simplest program to start with when exploring BSD code: echo!

While not a particularly exciting program, it does make for a really good exercise as we will see. Especially once we compare it between OpenBSD and FreeBSD.

(Btw: if you are wondering while reading this why the syntax highlighting is so minimal, see this post.)

OpenBSD

Alright, let's start with the simplest solution. Sponsored by OpenBSD.

if (pledge("stdio", NULL) == -1)
    err(1, "pledge");

pledge(2) is an OpenBSD-specific syscall that limits what the running program is allowed to do, for example to input/output only. More permissions can always be taken away later, but never re-enabled while the program is running.

In this case, everything but the "stdio" promise is taken away. So echo pledges that it will only perform basic input and output on file descriptors that are already open, such as standard output, and will not open new files, sockets and so on.

If pledge fails, we abort.

/* This utility may NOT do getopt(3) option parsing. */
if (*++argv && !strcmp(*argv, "-n")) {
    ++argv;
    nflag = 1;
}
else
    nflag = 0;

This part is just some simple argument parsing. Not very exciting, I know, so let's move on from that.
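Still, the pre-increment in the condition is worth a second look, because it skips the program name and tests for "-n" in one go. Here is a minimal sketch of the same check pulled out into a function. The name skip_nflag is my own and does not appear in the OpenBSD source:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the OpenBSD source).
 * Takes main's argv, steps past the program name and an optional
 * leading "-n". Stores 1 in *nflag if "-n" was seen, 0 otherwise,
 * and returns a pointer to the first word to print.
 */
char **
skip_nflag(char **argv, int *nflag)
{
    *nflag = 0;
    /* ++argv moves past argv[0]; *argv is NULL if nothing follows. */
    if (*++argv && !strcmp(*argv, "-n")) {
        ++argv;
        *nflag = 1;
    }
    return argv;
}
```

Note that only the first argument is checked. A "-n" anywhere else is treated as an ordinary word and gets printed.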

while (*argv) {
    (void)fputs(*argv, stdout);
    if (*++argv)
        putchar(' ');
}
if (!nflag)
    putchar('\n');

return 0;

Yes, that really is all there is to it. We just iterate over the remaining arguments in argv, print them to standard out via fputs and throw some spaces in between.

If nflag was not set, we end with an additional newline and we are done.

(Note: the naming of nflag is a bit confusing in my opinion. I read that name as newline flag, meaning we WANT a newline. Not that we want to avoid one. But I get it, the n option was set, so it's the nflag.)

Are there more performant ways to do this? Yes, absolutely, and we will soon see how. Is it needed though? Not really. echo is not really a tool that requires a high degree of performance. If you were to throw a billion words at echo, I'd question your sanity, and it might exceed the system's limit on the argument list passed to a program anyway.
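That limit can be looked up at run time. A minimal sketch, assuming a POSIX system where sysconf(3) knows _SC_ARG_MAX; the helper name max_arg_bytes is made up for this post. Note that the value is a size in bytes covering arguments and environment together, not a count of arguments:

```c
#include <unistd.h>

/*
 * Hypothetical helper: returns the maximum combined size in bytes of
 * the argument list and environment that exec will accept, or -1 if
 * the limit is indeterminate or the query fails.
 */
long
max_arg_bytes(void)
{
    return sysconf(_SC_ARG_MAX);
}
```

The actual number differs between systems, so I won't quote one here.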

FreeBSD

if (caph_limit_stdio() < 0 || caph_enter() < 0)
    err(1, "capsicum");

caph_limit_stdio and caph_enter are both helper functions for Capsicum, FreeBSD's capability and sandboxing framework, which serves a similar purpose to pledge. The idea here is the same: we only allow standard I/O access.

veclen = (argc >= 2) ? (argc - 2) * 2 + 1 : 0;

if ((vp = iov = malloc((veclen + 1) * sizeof(struct iovec))) == NULL)
    err(1, "malloc");

Now this is where we see the first difference. It looks like FreeBSD prefers to allocate ahead of time. iovec here is nothing more than a base address and a length. It's defined in sys/sys/_iovec.h. We'll include it here so it's easier to understand:

struct iovec {
    void    *iov_base;  /* Base address. */
    size_t   iov_len;   /* Length. */
};

Nothing fancy, just a very straightforward struct definition. Moving on.
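One note on the veclen arithmetic from the allocation above: every word gets one iovec entry, and every gap between two words gets one entry for the space. A minimal sketch of that count, with a helper name of my own (iov_entries_for_words is not in the FreeBSD source), assuming argc at that point counts the program name plus the words to print, so the number of words is argc - 1:

```c
#include <stddef.h>

/*
 * Hypothetical helper mirroring the veclen arithmetic: nwords words
 * need nwords entries for the words themselves plus nwords - 1
 * entries for the spaces between them. Zero words need zero entries.
 */
size_t
iov_entries_for_words(size_t nwords)
{
    return nwords > 0 ? 2 * nwords - 1 : 0;
}
```

For three words that gives 5 entries: three words and two spaces. The extra + 1 in the malloc call leaves room for the trailing newline.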

while (argv[0] != NULL) {
    size_t len;

    len = strlen(argv[0]);

We start iterating over all remaining arguments and get the length of each word.

/*
 * If the next argument is NULL then this is the last argument,
 * therefore we need to check for a trailing \c.
 */
if (argv[1] == NULL) {
    /* is there room for a '\c' and is there one? */
    if (len >= 2 &&
        argv[0][len - 2] == '\\' &&
        argv[0][len - 1] == 'c') {
        /* chop it and set the no-newline flag. */
        len -= 2;
        nflag = 1;
    }
}

Then we gotta do some bookkeeping, which also introduces another difference from OpenBSD. If we have reached the last argument, we check whether its last two characters are \c. If they are, we chop them off and omit the newline at the end, which is basically what the -n flag does. It's a holdover from System V and ensures compatibility. OpenBSD chose to ignore that compatibility for easier code maintenance.
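To see the check in isolation, here is a minimal sketch of it as a standalone function. The name printable_len is hypothetical and not part of the FreeBSD source, which does this inline on the last argument only:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns how many bytes of word should be
 * printed. If word ends in a backslash followed by 'c', those two
 * characters are excluded and *nflag is set to 1. Otherwise *nflag
 * is left untouched.
 */
size_t
printable_len(const char *word, int *nflag)
{
    size_t len = strlen(word);

    /* Is there room for a '\c' and is there one? */
    if (len >= 2 && word[len - 2] == '\\' && word[len - 1] == 'c') {
        len -= 2;
        *nflag = 1;
    }
    return len;
}
```

The len >= 2 test matters: without it, a one-character word would make the code read in front of the start of the string.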

    vp->iov_base = *argv;
    vp++->iov_len = len;
    if (*++argv) {
        vp->iov_base = space;
        vp++->iov_len = 1;
    }
}

The remainder of the loop is filling the iovec array with the pointer to each word and its length. While we haven't reached the end yet, we also add spaces.
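The filling pattern can be sketched on its own as well. This is a simplified version without the \c handling, and fill_iov is a name I made up for illustration. It assumes the caller has already allocated enough entries:

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: fill iov with one entry per word and one entry
 * per separating space. words must be NULL-terminated, and iov must
 * have room for 2 * n - 1 entries for n words.
 * Returns the number of entries written.
 */
size_t
fill_iov(struct iovec *iov, char **words)
{
    static char space[] = " ";
    struct iovec *vp = iov;

    while (*words != NULL) {
        /* Point at the word itself; nothing is copied. */
        vp->iov_base = *words;
        vp->iov_len = strlen(*words);
        vp++;
        /* Only add a space if another word follows. */
        if (*++words != NULL) {
            vp->iov_base = space;
            vp->iov_len = 1;
            vp++;
        }
    }
    return (size_t)(vp - iov);
}
```

The important detail is that no string data is copied. Every entry just points into argv or at the single shared space character.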

That was a whole bunch of setup for something that is already longer than the entire OpenBSD implementation. Both have their upsides, as we will see in a bit.

if (!nflag) {
    veclen++;
    vp->iov_base = newline;
    vp++->iov_len = 1;
}

This is equivalent to the OpenBSD version. Just check if nflag is not set, and add a newline into the mix.

while (veclen) {
    int nwrite;

    nwrite = (veclen > IOV_MAX) ? IOV_MAX : veclen;
    if (writev(STDOUT_FILENO, iov, nwrite) == -1)
        err(1, "write");
    iov += nwrite;
    veclen -= nwrite;
}
return 0;

We then hand the array to writev in batches of at most IOV_MAX entries, advancing iov and decreasing veclen by nwrite after each call. writev is a POSIX call that takes a whole bunch of buffers (our iovec entries) and writes them with a single system call, as if they were one contiguous buffer. The goal is to reduce the number of syscalls that have to be made, making it more performant.
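The batching loop works for any file descriptor, which makes it easy to try out. A minimal sketch under a few assumptions: write_batched is a hypothetical name, the fallback value for IOV_MAX is only there in case limits.h does not expose it, and like the original loop it does not retry after a short write:

```c
#include <limits.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef IOV_MAX
#define IOV_MAX 1024    /* assumption: fallback if <limits.h> hides it */
#endif

/*
 * Hypothetical helper: write count iovec entries to fd, using at most
 * IOV_MAX entries per writev(2) call.
 * Returns 0 on success and -1 if writev reports an error.
 */
int
write_batched(int fd, struct iovec *iov, size_t count)
{
    while (count > 0) {
        int n = count > (size_t)IOV_MAX ? IOV_MAX : (int)count;

        if (writev(fd, iov, n) == -1)
            return -1;
        iov += n;
        count -= (size_t)n;
    }
    return 0;
}
```

For a typical echo invocation the loop body runs exactly once, so the whole output goes out in a single system call. The OpenBSD version relies on stdio buffering instead.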

And with that, we are done echo-ing to the standard output.

Conclusion

So, clearly the two approaches here are very different from each other.

OpenBSD obviously favors the simple and easily maintainable solution. It's not that they don't have access to writev (they do), but sometimes simplicity is king. And let's be fair, echo is rarely a performance critical bottleneck. It makes the code easier to reason about and that is the main focus.

FreeBSD on the other hand very much favors performance here, even for tools like echo. For echo itself that might not be critical, but fewer system calls means fewer switches into kernel context and less load on a system that other tools wanna use as well.

With echo, it was easy enough to make the writeup include every line, but going forward this will become a bit more tricky. Once it comes to that, I will only include the critical / important parts.

But as we have seen, even a simple tool like echo can turn into a great learning experience.
