April 13th, 2024 UCL

Tool Command Language: Macros And Blocks

More work on the tool command language (which still needs a name, since I can't use the abbreviation TCL), this time on getting multi-line statement blocks working. As in:

echo "Here"
echo "There"

I got a little wrapped up in how to configure the parser to recognise new-lines as statement separators. I tried this in the past with a hand-rolled lexer and ended up peppering NL tokens all around the grammar, and I feared I would need to do something similar here. After a bit of experimentation, I think I've come up with a way to recognise new-lines as statement separators without making the grammar too messy. So far, the unit tests verifying this seem to pass.

// Excerpt of the grammar showing all the 'NL' token matches.
// These match a new-line, plus any whitespace afterwards.

type astStatements struct {
    First *astPipeline   `parser:"@@"`
    Rest  []*astPipeline `parser:"( NL+ @@ )*"`
}

type astBlock struct {
    Statements []*astStatements `parser:"LC NL? @@ NL? RC"`
}

type astScript struct {
    Statements *astStatements `parser:"NL* @@ NL*"`
}
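To get a feel for what the NL token does to a flat script, here is a rough, standard-library-only sketch. It is not the real parser and ignores blocks entirely: `nlToken` and `splitStatements` are hypothetical names, and the regular expression only approximates "a new-line, plus any whitespace afterwards".

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nlToken mimics the NL token described above: a new-line plus any
// whitespace after it. In Go's regexp syntax \s also matches '\n', so a run
// of blank lines collapses into a single match.
var nlToken = regexp.MustCompile(`\n\s*`)

// splitStatements is a hypothetical helper that cuts a flat script (no
// blocks) at every NL match and drops the empty pieces at either end, which
// is roughly what the NL* padding in astScript permits.
func splitStatements(src string) []string {
	var out []string
	for _, part := range nlToken.Split(src, -1) {
		if s := strings.TrimSpace(part); s != "" {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	script := "\n\necho \"Here\"\n\n  echo \"There\"\n"
	for i, stmt := range splitStatements(script) {
		fmt.Printf("%d: %s\n", i, stmt)
	}
}
```

Leading and trailing blank lines vanish, and the blank line between the two echo statements counts as a single separator.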

I'm still using a stateful lexer as it may come in handy when it comes to string interpolation. Not sure if I'll add this, but I'd like the option.

Another big addition today was macros. These are much like commands, but instead of arguments being evaluated before being passed through to the command, they're deferred and the command can explicitly request their evaluation whenever it chooses. I think Lisp has something similar, so this is not that novel.

This was used to implement the if command, which is now working:

set x "true"
if $x {
  echo "Is true"
} else {
  echo "Is not true"
}

Of course, there are actually no operators yet, so it doesn't really do much at the moment.
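The command/macro distinction can be sketched in plain Go. This is only an illustration under my own assumptions, not the actual implementation: `evaluable`, `callCommand`, `callMacro`, `ifMacro` and `traced` are all made-up names, values are plain strings, and the `else` keyword is dropped so the macro just takes a condition and up to two branches.

```go
package main

import "fmt"

// evaluable is a hypothetical stand-in for an argument that has not been
// evaluated yet.
type evaluable func() string

// command receives arguments that were evaluated before the call.
type command func(args []string) string

// macro receives the unevaluated arguments and decides which ones to evaluate.
type macro func(args []evaluable) string

// callCommand evaluates every argument eagerly, then invokes cmd.
func callCommand(cmd command, args []evaluable) string {
	vals := make([]string, len(args))
	for i, a := range args {
		vals[i] = a()
	}
	return cmd(vals)
}

// callMacro hands the arguments over untouched.
func callMacro(m macro, args []evaluable) string {
	return m(args)
}

// ifMacro evaluates the condition first, then only the branch it selects.
func ifMacro(args []evaluable) string {
	if len(args) < 2 {
		return ""
	}
	if args[0]() == "true" {
		return args[1]()
	}
	if len(args) > 2 {
		return args[2]()
	}
	return ""
}

// traced builds an evaluable that records its name in log when evaluated.
func traced(log *[]string, name, val string) evaluable {
	return func() string {
		*log = append(*log, name)
		return val
	}
}

func main() {
	var log []string
	args := []evaluable{
		traced(&log, "cond", "true"),
		traced(&log, "then", "Is true"),
		traced(&log, "else", "Is not true"),
	}
	fmt.Println(callMacro(ifMacro, args), log) // the else branch is never evaluated
}
```

Passing the same arguments through `callCommand` would evaluate all three before the command even starts, which is exactly what a conditional needs to avoid.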

This spurred the need for blocks, which are the third large addition made today. They're just a group of statements wrapped in an object type. They're "invokable" in that the statements can be executed and produce a result, but they're also a value that can be passed around. It gels nicely with the macro approach.
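As a rough picture of a block being both invokable and an ordinary value, here is a small Go sketch. The `statement` and `block` types and the `repeat` function are hypothetical, and results are simplified to strings.

```go
package main

import "fmt"

// statement is a hypothetical stand-in for one parsed statement.
type statement func() string

// block wraps a group of statements in a value that can be stored, passed
// around, and invoked later.
type block struct {
	statements []statement
}

// invoke runs the statements in order and returns the result of the last one.
func (b block) invoke() string {
	result := ""
	for _, stmt := range b.statements {
		result = stmt()
	}
	return result
}

// repeat is a hypothetical macro-like function: it receives a block as a
// plain value and chooses how many times to invoke it.
func repeat(n int, body block) []string {
	results := make([]string, 0, n)
	for i := 0; i < n; i++ {
		results = append(results, body.invoke())
	}
	return results
}

func main() {
	counter := 0
	body := block{statements: []statement{
		func() string { counter++; return "ignored" },
		func() string { return fmt.Sprintf("run %d", counter) },
	}}
	fmt.Println(repeat(2, body))
}
```

Nothing runs when `body` is constructed; the statements only execute when whoever holds the block calls `invoke`.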

I must say that I like the idea of using macros for things like if over baking them into the language. It can only add to the "embed-ability" of this, which is what I'm looking for.

Finally, I did see something interesting in the tests. I was trying the following test:

echo "Hello"
echo "World"

And I was expecting Hello and World to be returned over two lines. But only World was being returned. Of course! Since echo is actually producing a stream and not printing anything to stdout, the script would only return the result of the last statement, World.

I decided to change this. If I want to use echo to display a message, then the above script should display both Hello and World in some manner. The downside is that I don't think I'll be able to support constructs like this, where echo provides a source for a pipeline:

# This can't work anymore
echo "Hello" | toUpper

I mean, I could probably detect whether echo is connected to a pipe (the parser can give that information). But what about other commands that output something? Would they need to be treated similarly?

I think it's probably best to leave this out for now, and have a new construct for providing literals like this to a pipe. Heck, maybe just having the string itself would be enough:

"hello" | toUpper

Anyway, that's all for today.