Daily Source Reading: echo
echo
For the first entry in our Daily Source Reading exercise, we are
picking the simplest program to start with when exploring BSD code:
echo!
While not a particularly exciting program, it makes for a really good
exercise, as we will see, especially once we compare the OpenBSD and
FreeBSD versions.
(Btw: if you are wondering while reading this why the syntax
highlighting is so minimal, see this post.)
OpenBSD
Alright, let's start with the simplest solution. Sponsored by
OpenBSD.
if (pledge("stdio", NULL) == -1)
    err(1, "pledge");
pledge(2) is an OpenBSD-specific syscall that restricts the running
program to a declared set of operations, for example input/output
only. Permissions can always be reduced further later, but never
re-enabled while the program is running.
In this case, everything but stdio is taken away. So echo
pledges that it will only perform input and output operations on
standard I/O.
If pledge fails, we abort.
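To make the idea a bit more tangible, here is a minimal sketch of the same call wrapped in a function. The name restrict_to_stdio is made up for this post and is not part of echo, and the fallback branch for systems without pledge(2) is my own assumption so the snippet compiles elsewhere.

```c
#ifdef __OpenBSD__
#include <unistd.h>	/* declares pledge(2) on OpenBSD */
#endif

/*
 * restrict_to_stdio is a hypothetical helper, not part of echo.
 * Returns 0 on success and -1 if pledge(2) fails.
 * On systems without pledge(2) it does nothing and returns 0.
 */
int
restrict_to_stdio(void)
{
#ifdef __OpenBSD__
	/* Keep only the "stdio" promise; NULL leaves execpromises alone. */
	return pledge("stdio", NULL);
#else
	return 0;
#endif
}
```

A caller would do the same as echo: check for -1 and bail out via err(1, "pledge").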
/* This utility may NOT do getopt(3) option parsing. */
if (*++argv && !strcmp(*argv, "-n")) {
    ++argv;
    nflag = 1;
} else
    nflag = 0;
This part is just some simple argument parsing. Not very exciting, I know, so let's move on from that.
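For anyone who wants to poke at it anyway, the parsing step can be sketched as a standalone function. skip_nflag is a name I made up; it assumes argv is NULL-terminated and still points at the program name, just like in main.

```c
#include <string.h>

/*
 * skip_nflag is a hypothetical wrapper around the parsing step.
 * argvp points at the argv pointer; on return it points at the first
 * word to print. Returns 1 if a leading "-n" was consumed, else 0.
 */
int
skip_nflag(char ***argvp)
{
	char **argv = *argvp;
	int nflag;

	/* Step past argv[0]; only an exact "-n" in first position counts. */
	if (*++argv && !strcmp(*argv, "-n")) {
		++argv;
		nflag = 1;
	} else
		nflag = 0;

	*argvp = argv;
	return nflag;
}
```

Note that only the first argument is checked, so in "echo hi -n" the -n is printed like any other word.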
while (*argv) {
    (void)fputs(*argv, stdout);
    if (*++argv)
        putchar(' ');
}
if (!nflag)
    putchar('\n');

return 0;
Yes, that really is all there is to it. We just iterate over the
remaining arguments in argv, print them to standard out via fputs
and throw some spaces in between.
If nflag was not set, we end with an additional newline and we are done.
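To try the loop outside of echo, here is a small sketch with the output stream turned into a parameter. print_words is a hypothetical name, and passing the stream in is my change so the output can be captured; the logic is otherwise the same loop.

```c
#include <stdio.h>

/*
 * print_words is a hypothetical function wrapping the output loop,
 * with the stream made a parameter so the result can be inspected.
 * argv must be NULL-terminated and already point at the first word.
 */
void
print_words(FILE *out, char **argv, int nflag)
{
	while (*argv) {
		(void)fputs(*argv, out);
		/* Only add a space if another word follows. */
		if (*++argv)
			(void)fputc(' ', out);
	}
	if (!nflag)
		(void)fputc('\n', out);
}
```

Calling print_words(stdout, words, 0) behaves like the original; passing a different FILE * makes it easy to check that no space trails the last word.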
(Note: the naming of nflag is a bit confusing in my opinion. I
read that name as newline flag, meaning we WANT a newline, not that
we want to avoid one. But I get it, the n option was set, so it's
the nflag.)
Are there more performant ways to do this? Yes, absolutely, and we
will soon see how. Is it needed though? Not really. echo is not a
tool that requires a high degree of performance. If you were to throw
a billion words at echo:
- I'd question your sanity
- you would likely exceed the maximum size of the argument list that can be passed to a program anyway
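That limit can be queried at runtime. A minimal sketch, assuming a POSIX system: sysconf(_SC_ARG_MAX) reports the maximum combined size, in bytes, of the arguments and environment handed to a new program. The function name arg_limit_bytes is made up for illustration.

```c
#include <unistd.h>

/*
 * arg_limit_bytes is a hypothetical helper. It returns the limit on
 * the combined size of arguments and environment for exec, in bytes,
 * or -1 if the system does not report a fixed limit.
 */
long
arg_limit_bytes(void)
{
	return sysconf(_SC_ARG_MAX);
}
```

So the limit is about total bytes rather than a plain argument count, but a billion words would blow past it either way.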
FreeBSD
if (caph_limit_stdio() < 0 || caph_enter() < 0)
    err(1, "capsicum");
caph_limit_stdio and caph_enter are both helper functions for
Capsicum, FreeBSD's capability and sandbox framework, which serves a
purpose similar to pledge. The idea here is the same: we only allow
standard I/O access.
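As a rough sketch, the same two calls can be wrapped in a function. enter_stdio_sandbox is a hypothetical name, and the do-nothing branch for other systems is my assumption so the snippet compiles outside FreeBSD.

```c
#include <sys/types.h>
#ifdef __FreeBSD__
#include <capsicum_helpers.h>	/* caph_limit_stdio(), caph_enter() */
#endif

/*
 * enter_stdio_sandbox is a hypothetical helper, not part of echo.
 * Returns 0 on success and -1 if either Capsicum call fails.
 * On systems other than FreeBSD it does nothing and returns 0.
 */
int
enter_stdio_sandbox(void)
{
#ifdef __FreeBSD__
	/* Limit the standard descriptors first, then enter capability mode. */
	if (caph_limit_stdio() < 0 || caph_enter() < 0)
		return -1;
#endif
	return 0;
}
```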
veclen = (argc >= 2) ? (argc - 2) * 2 + 1 : 0;

if ((vp = iov = malloc((veclen + 1) * sizeof(struct iovec))) == NULL)
    err(1, "malloc");
Now this is where we see the first difference. It looks like
FreeBSD prefers to allocate ahead of time: one iovec entry per word,
one per space in between, and a spare one for the trailing newline.
An iovec is nothing more than a base address and a length. It's
defined in sys/sys/_iovec.h. We'll include it here so it's easier to
understand:
struct iovec {
    void    *iov_base;  /* Base address. */
    size_t   iov_len;   /* Length. */
};
Nothing fancy, just a very straightforward struct definition.
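As a quick aside on the veclen formula in the allocation: it is easier to read when written in terms of the number of words. The sketch below assumes that argc at that point counts the program name plus the words to print, so nwords = argc - 1. iov_slots is a name I made up.

```c
#include <stddef.h>

/*
 * iov_slots is a hypothetical helper that restates the veclen formula
 * in terms of the number of words to print: one slot per word plus one
 * per separating space. The slot for the trailing newline is not
 * counted; the malloc call reserves it separately via veclen + 1.
 */
size_t
iov_slots(size_t nwords)
{
	return (nwords >= 1) ? nwords * 2 - 1 : 0;
}
```

For three words this gives five slots: word, space, word, space, word.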
while (argv[0] != NULL) {
    size_t len;

    len = strlen(argv[0]);
We start iterating over all remaining arguments and get the length of each word.
    /*
     * If the next argument is NULL then this is the last argument,
     * therefore we need to check for a trailing \c.
     */
    if (argv[1] == NULL) {
        /* is there room for a '\c' and is there one? */
        if (len >= 2 &&
            argv[0][len - 2] == '\\' &&
            argv[0][len - 1] == 'c') {
            /* chop it and set the no-newline flag. */
            len -= 2;
            nflag = 1;
        }
    }
Then we have to do some bookkeeping, which also introduces another
difference to OpenBSD. If we have reached the last argument, we
check whether its last two characters are \c. If they are, we chop
them off and omit the newline at the end, which has the same effect as
the -n flag. It's a holdover from System V that is kept for
compatibility. OpenBSD chose to ignore that compatibility for easier
code maintenance.
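The check is easy to isolate. Here is a small sketch of it as a function of a single word; chop_backslash_c is a hypothetical name, and unlike the real code it does not know whether the word is the last argument, so the caller would have to make sure of that.

```c
#include <string.h>

/*
 * chop_backslash_c is a hypothetical helper mirroring the \c check.
 * It returns the number of bytes of word to print and sets *nflag to 1
 * when the word ends in the two characters '\' and 'c'.
 * *nflag is left untouched otherwise.
 */
size_t
chop_backslash_c(const char *word, int *nflag)
{
	size_t len = strlen(word);

	if (len >= 2 && word[len - 2] == '\\' && word[len - 1] == 'c') {
		/* Drop the two marker characters from the output. */
		len -= 2;
		*nflag = 1;
	}
	return len;
}
```

Only the length shrinks. The string itself is never modified, which works because each iovec entry carries its own length.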
    vp->iov_base = *argv;
    vp++->iov_len = len;
    if (*++argv) {
        vp->iov_base = space;
        vp++->iov_len = 1;
    }
}
The remainder of the loop is filling the iovec array with the
pointer to each word and its length. While we haven't reached the end
yet, we also add spaces.
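Here is a minimal sketch of that filling step on its own, without the \c handling. fill_iov is a made-up name, and the caller is assumed to have sized the array for at least 2 * nwords - 1 entries.

```c
#include <string.h>
#include <sys/uio.h>

static char space[] = " ";

/*
 * fill_iov is a hypothetical helper. It stores one entry per word and
 * one per separating space into iov. argv must be NULL-terminated and
 * point at the first word. Returns the number of entries written.
 */
int
fill_iov(struct iovec *iov, char **argv)
{
	struct iovec *vp = iov;

	while (*argv != NULL) {
		vp->iov_base = *argv;
		vp++->iov_len = strlen(*argv);
		/* A space entry is only added when another word follows. */
		if (*++argv) {
			vp->iov_base = space;
			vp++->iov_len = 1;
		}
	}
	return (int)(vp - iov);
}
```

Nothing is copied here: every entry points either into the original argv strings or at the single shared space buffer.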
That was a whole bunch of setup, and it is already longer than the
entire OpenBSD implementation. Both approaches have their upsides, as
we will see in a bit.
if (!nflag) {
    veclen++;
    vp->iov_base = newline;
    vp++->iov_len = 1;
}
This is equivalent to the OpenBSD version. Just check if nflag is
not set, and add a newline into the mix.
while (veclen) {
    int nwrite;

    nwrite = (veclen > IOV_MAX) ? IOV_MAX : veclen;
    if (writev(STDOUT_FILENO, iov, nwrite) == -1)
        err(1, "write");
    iov += nwrite;
    veclen -= nwrite;
}
return 0;
In each pass of the loop, we use writev to write nwrite iovec
entries, which is at most IOV_MAX at a time. writev is a POSIX call
that takes a whole array of buffers (our iovec entries) and writes
them with a single system call. The goal is to reduce the number of
syscalls that have to be made, making it more performant.
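The chunked loop can be sketched as a reusable function that takes the file descriptor as a parameter. write_iov is a hypothetical name, and the fallback value for IOV_MAX is my assumption for systems where limits.h does not expose it by default.

```c
#include <limits.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef IOV_MAX
#define IOV_MAX 1024	/* assumption: fallback if limits.h lacks it */
#endif

/*
 * write_iov is a hypothetical helper. It hands the entries to writev(2)
 * in chunks of at most IOV_MAX and returns 0 on success, -1 on error.
 * Like the loop in echo, it treats every non-error return as a
 * complete write and does not retry partial ones.
 */
int
write_iov(int fd, struct iovec *iov, int veclen)
{
	while (veclen) {
		int nwrite = (veclen > IOV_MAX) ? IOV_MAX : veclen;

		if (writev(fd, iov, nwrite) == -1)
			return -1;
		iov += nwrite;
		veclen -= nwrite;
	}
	return 0;
}
```

For a typical echo invocation with a handful of words, the loop body runs exactly once, so the whole line goes out in one syscall.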
And with that, we are done echo-ing to the standard output.
Conclusion
So, clearly the two implementations take very different approaches.
OpenBSD favors the simple and easily maintainable solution. It's not
that they don't have access to writev (they do), but sometimes
simplicity is king. And let's be fair, echo is rarely a
performance-critical bottleneck. The simple version is easier to
reason about, and that is the main focus.
FreeBSD, on the other hand, very much favors performance here, even
for tools like echo. For echo itself it might not be that critical,
but keeping the number of system calls low leaves more of the kernel's
time for other programs and reduces switching into kernel context.
With echo, it was easy enough to make the writeup include every
line, but going forward this will become a bit more tricky. Once it
comes to that, I will only include the critical / important parts.
But as we have seen, even a simple tool like echo can turn into a
great learning experience.