Daily Source Reading: ed [Part 4 - Regexps]
ed regexps
The final stretch! We are very close to ending our journey in the
wonderful land of ed. We have looked at its overall structure, its
buffers and its commands. Now it is time for the one command we
skipped over in our last part: s, the substitution command.
    do {
        switch (*ibufp) {
        case '\n':
            sflags |= SGF;
            break;
        case 'g':
            sflags |= SGG;
            ibufp++;
            break;
        case 'p':
            sflags |= SGP;
            ibufp++;
            break;
        case 'r':
            sflags |= SGR;
            ibufp++;
            break;
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            STRTOI(sgnum, ibufp);
            sflags |= SGF;
            sgflag &= ~GSG;    /* override GSG */
            break;
        default:
            if (sflags) {
                seterrmsg("invalid command suffix");
                return ERR;
            }
        }
    } while (sflags && *ibufp != '\n');
This first part performs some additional option parsing for the s
command. It behaves much like what you'd expect from getopt or the
like.
These flags apply when we are repeating a substitution. For example,
sg repeats the last substitution with its global flag toggled (the
code further down flips the saved flag with ^= rather than setting it).
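To see the loop in isolation, here is a minimal standalone sketch of the same idea. The function name parse_repeat_suffix and the SK_* flag values are made up for this example and are not ed's; the sketch also skips the sgflag bookkeeping the real code does for numeric suffixes. It assumes the input ends with a newline, as ed's input buffer does.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical flag bits for this sketch; ed's SGG/SGP/SGR/SGF may differ. */
enum {
    SK_GLOBAL = 1 << 0, /* 'g' suffix */
    SK_PRINT  = 1 << 1, /* 'p' suffix */
    SK_REGEX  = 1 << 2, /* 'r' suffix */
    SK_REPEAT = 1 << 3  /* bare "s" or a numeric count */
};

/*
 * Parse what follows the 's' of a substitution command.
 * Returns 0 for a normal substitution (e.g. "/a/b/"), a set of SK_* bits
 * for a repeat, or -1 for an invalid suffix. A numeric suffix is stored
 * in *count.
 */
int parse_repeat_suffix(const char *p, long *count)
{
    int flags = 0;

    do {
        switch (*p) {
        case '\n':
            flags |= SK_REPEAT;
            break;
        case 'g':
            flags |= SK_GLOBAL;
            p++;
            break;
        case 'p':
            flags |= SK_PRINT;
            p++;
            break;
        case 'r':
            flags |= SK_REGEX;
            p++;
            break;
        default:
            if (isdigit((unsigned char)*p)) {
                char *end;

                *count = strtol(p, &end, 10);
                p = end;
                flags |= SK_REPEAT;
            } else if (flags != 0) {
                /* junk after at least one valid suffix character */
                return -1;
            }
            break;
        }
    } while (flags != 0 && *p != '\n');

    return flags;
}
```

A return value of 0 is the interesting case: nothing looked like a repeat suffix, so the caller goes on to treat the next character as a pattern delimiter.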
    if (sflags && !pat) {
        seterrmsg("no previous substitution");
        return ERR;
    } else if (sflags & SGG)
        sgnum = 0;    /* override numeric arg */
    if (*ibufp != '\n' && *(ibufp + 1) == '\n') {
        seterrmsg("invalid pattern delimiter");
        return ERR;
    }
Then it's off to some quick sanity checking. To repeat something, we require a previous substitution to have happened.
    tpat = pat;
    SPL1();
    if ((!sflags || (sflags & SGR)) &&
        (tpat = get_compiled_pattern()) == NULL) {
        SPL0();
        return ERR;
    } else if (tpat != pat) {
        if (pat) {
            regfree(pat);
            free(pat);
        }
        pat = tpat;
        patlock = 1;    /* reserve pattern */
    }
    SPL0();
Here we first lock down interrupts so we can proceed without any
disturbance. If we are not repeating anything, or we are repeating
with the r flag, we recompile the pattern. In some implementations
this is a hand-crafted regexp engine. OpenBSD uses the regex library
here to keep it simple. We don't have to reinvent everything.
If the new pattern is different from the old one, we release the old one and save the new pattern into it.
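The compile, swap and free dance can be sketched with the POSIX regex calls regcomp, regfree and regexec. This is not ed's get_compiled_pattern: the function name compile_or_reuse and the last_pat variable are hypothetical, there is no delimiter parsing or interrupt handling, and treating an empty source string as "reuse the previous pattern" is an assumption of the sketch.

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical stand-in for the saved pattern that ed keeps around. */
static regex_t *last_pat;

/*
 * Compile src as a basic regular expression and remember it.
 * An empty src returns the previously saved pattern instead.
 * Returns NULL on error or when there is no previous pattern.
 */
regex_t *compile_or_reuse(const char *src)
{
    regex_t *fresh;

    if (*src == '\0')
        return last_pat;

    fresh = malloc(sizeof(*fresh));
    if (fresh == NULL)
        return NULL;
    if (regcomp(fresh, src, 0) != 0) { /* cflags 0 selects basic regexps */
        free(fresh);
        return NULL; /* the old pattern stays untouched */
    }

    /* The new pattern differs from the old one: release the old one. */
    if (last_pat != NULL) {
        regfree(last_pat);
        free(last_pat);
    }
    last_pat = fresh;
    return last_pat;
}
```

Note that a failed compile leaves the saved pattern alone, so a typo in a new pattern does not cost you the old one.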
    if (!sflags && extract_subst_tail(&sgflag, &sgnum) < 0)
        return ERR;
    else if (isglobal)
        sgflag |= GLB;
    else
        sgflag &= ~GLB;
    if (sflags & SGG)
        sgflag ^= GSG;
    if (sflags & SGP) {
        sgflag ^= GPR;
        sgflag &= ~(GLS | GNP);
    }
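So, if we are not repeating anything (i.e. sflags is not set), we
extract the substitution text and any flags after it. The text we want
to replace was already handled by the pattern compilation step. When
we are repeating, the g and p suffixes flip the saved global and print
flags with ^= instead of setting them.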
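The ^= lines in the snippet above are easy to read past, so here is a small sketch of just that bit of flag handling. The function name toggle_saved_flags and the T_* bit values are invented for the example; only the XOR and AND-NOT pattern is taken from the code above.

```c
/* Hypothetical bit values; only the XOR / AND-NOT pattern matters here. */
enum {
    T_GSG = 1 << 0, /* replace every match on the line */
    T_GPR = 1 << 1, /* print the line afterwards */
    T_GLS = 1 << 2, /* list the line afterwards */
    T_GNP = 1 << 3  /* print the line with its number afterwards */
};

/*
 * Apply the g and p repeat suffixes to the flags saved from the
 * previous substitution and return the new flag set.
 */
int toggle_saved_flags(int saved, int repeat_g, int repeat_p)
{
    if (repeat_g)
        saved ^= T_GSG;            /* flip: on becomes off, off becomes on */
    if (repeat_p) {
        saved ^= T_GPR;            /* flip the print flag the same way */
        saved &= ~(T_GLS | T_GNP); /* and drop the other print styles */
    }
    return saved;
}
```

So if the previous substitution was already global, repeating it with a g suffix in this sketch turns the global behaviour off again.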
    do {
        switch (*ibufp) {
        case 'p':
            sgflag |= GPR;
            ibufp++;
            break;
        case 'l':
            sgflag |= GLS;
            ibufp++;
            break;
        case 'n':
            sgflag |= GNP;
            ibufp++;
            break;
        default:
            n++;
        }
    } while (!n);
This is plain parsing of any trailing print-related flags. Not
interesting, but it's there. Much of the apparent "complexity" of
the s command code stems from the option parsing.
    if (check_addr_range(current_addr, current_addr) < 0)
        return ERR;
    GET_COMMAND_SUFFIX();
    if (!isglobal)
        clear_undo_stack();
    if (search_and_replace(pat, sgflag, sgnum) < 0)
        return ERR;
    break;
And here finally we check if we are good to go and then perform our
search_and_replace. The implementation of this one exceeds my
concentration today (came down with a nasty cold). But as we have
seen, the code is not dark magic and does not require any blood
sacrifices.
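For a rough idea of what a function like this has to do, here is a heavily reduced sketch built on POSIX regexec. It is not ed's search_and_replace: the name substitute_line and its signature are hypothetical, it works on a single string instead of the buffer, the replacement is taken literally (no & or back-references), and there is no numeric count or undo handling.

```c
#include <regex.h>
#include <string.h>

/*
 * Replace matches of pat in line with the literal text rep, writing the
 * result to out. If global is zero, only the first match is replaced.
 * Returns the number of replacements, or -1 if out is too small.
 */
int substitute_line(const regex_t *pat, const char *line, const char *rep,
    int global, char *out, size_t outsz)
{
    regmatch_t m[1];
    size_t used = 0, rest, replen = strlen(rep);
    int count = 0, eflags = 0;

    while (regexec(pat, line, 1, m, eflags) == 0) {
        size_t pre = (size_t)m[0].rm_so;
        size_t len = (size_t)(m[0].rm_eo - m[0].rm_so);

        if (used + pre + replen >= outsz)
            return -1;
        memcpy(out + used, line, pre);  /* text before the match */
        used += pre;
        memcpy(out + used, rep, replen); /* the replacement itself */
        used += replen;
        count++;

        line += m[0].rm_eo;   /* continue after the match */
        eflags = REG_NOTBOL;  /* we are no longer at the line start */

        if (len == 0) {
            /* Empty match: copy one character so the loop advances. */
            if (*line == '\0')
                break;
            if (used + 1 >= outsz)
                return -1;
            out[used++] = *line++;
        }
        if (!global)
            break;
    }

    rest = strlen(line);
    if (used + rest >= outsz)
        return -1;
    memcpy(out + used, line, rest + 1); /* the tail, including the NUL */
    return count;
}
```

The REG_NOTBOL flag matters once we have moved past the first match: without it, a pattern anchored with ^ could match again in the middle of the line.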
Shorter version
If we break all that code down into a simple digestible version, it would look something like this (very, very simplified):
    case 's':
        parse_repeat_flags();
        compile_or_reuse_pattern();
        parse_substitution_tail();
        apply_repeat_flag_toggles();
        execute_substitution();
        break;
Conclusion
I think this is it for now in our trip through ed.
I highly recommend trying to use it for a bit. Build some experience
with it. There are systems that don't even have a vi installed. So
knowing ed can potentially save your bacon if your network switch
gets hosed, or in other obscure scenarios.
It will also explain the heritage of a few commands in vi and vim.
What's up next? I haven't decided yet. Due to being sick at the
moment, it will probably be something small, yet interesting. I am
thinking maybe the yes command. You'd be surprised what stark
differences there can be.