Domain-Specific Languages (DSLs) are a new buzzword these days. But DSLs have been around for a long time, and are not always computer related. For example, various notations for chess moves, like N-QB3 or Nf6, have been in use since at least the 1700s, and probably earlier. They can describe a chess game much more concisely than any natural language. Some SWAT teams use a visual DSL called SWATCOM, a series of hand signals that can say things like, “I see two men and one woman with guns, moving towards you.” These languages are very concise, but at a cost of being very narrow in scope. For example, neither chess notation nor SWATCOM can say “How did you like the new Britney Spears video?”
This same characteristic applies to computer DSLs. A programmer can use a standard programming language, such as C++, Java, or assembly language, to write a program. But sometimes you need to have programs written by subject-matter experts who are not programmers. Teaching them a general-purpose language is a major undertaking. But you can probably teach them a simple language tailored to their domain.
The RTSTRAN story
In the mid-1970s, I was working for a division of General Motors that needed a way to describe vehicle wiring diagrams to an automated test system. If you have ever looked at the wiring diagram in the back of the owner’s manual for your car, it is drawn like a plate of spaghetti, with devices located based on where they are in the car, and lines crisscrossing each other to connect the devices. It would be very difficult to enter this kind of diagram into the computer without making mistakes. Clearly, we needed another approach.
The first step was to get the Electrical department to redraw the electrical diagrams in the style of industrial ladder diagrams. These diagrams have vertical lines representing the power supply and ground, and a series of horizontal lines that go from the power supply wire to the ground, showing all the devices in between. There are as many lines as are required to describe all the circuits. No attempt is made to represent where the device is physically located.
Here is an example of how a horn circuit looked in the new style of diagram:
Once we had the wiring diagrams in this new form, we developed a domain-specific language, RTSTRAN, that could describe them. Here is how the diagram above could be described in this language:
Since the wiring diagrams were drawn in a standard way, it was fairly easy to teach someone how to convert them to RTSTRAN and enter them into the test system.
By the way, if I were doing this today, three decades later, I would consider having the engineers enter the diagrams using a graphical ladder diagram editor, which would eliminate re-keying the information completely. Here is one example of such an editor.
There are several other available ladder diagram editors, both proprietary and open source, as well as compilers that will compile the diagrams into XML. For more information about them, do a search on the Internet for “IEC 61131-3”, “LD”, and “compiler”. (IEC 61131-3 is an international standard that defines programming languages for programmable logic controllers. The LD section describes graphical ladder diagram representations.)
Once we had the diagrams entered in RTSTRAN, we needed to process them in order to convert them to tables that would be downloaded to the minicomputer that ran the test system. W.H. McKeeman had published a book called A Compiler Generator. It described, and provided the source code for, a compiler for a PL/I-like language called XPL, as well as a grammar analyzer that made parsing tables that could be used to process other languages. These were written in XPL. (It was a self-compiling compiler).
Luckily, Mary Pickett, Heinrik Schultz, and Fred Krull at General Motors Research Labs were working on various industrial computer languages and had ported the XPL compiler and analyzer to PL/I, so they could run on the IBM 370.
Using these programs, along with the RTSTRAN grammar (which was written in BNF), produced a language translator with stubs for processing each element of the language. All I had to do was fill in the processing in each stub, which was a lot easier than writing a compiler from scratch would have been.
Here is what the beginning of the RTSTRAN grammar specification looked like. (In all, there were 111 rules):
Today, there are several open-source compiler-generating tools such available, such as the following:
You might be thinking “That’s nice, but I don’t have any electrical wiring diagrams to process in my product.” That may be, but before you reject the idea of a DSL, think about whether you have any complicated configuration information that the customer has to enter. For example, does the customer need to describe his network to your product, or perhaps the devices in his or her data center? For complicated input of this nature, a DSL can make it much easier and faster for the customer to enter the necessary information.
Implementing a DSL does not have to be as complicated as writing a full compiler. For example, in the late 1980s, my colleague, Ron Colmone, worked on a security product for a minicomputer. He developed a DSL that, with a nod to COBOL, was nicknamed ColmoneBOL. The purpose of the DSL was to process command packets sent up from the minicomputer and issue the appropriate commands on the mainframe security system. There were many types of command packets, and the DSL provided the building blocks necessary to add conditional logic to some very powerful runtime functions. The DSL also provided a debugging/trace facility that assisted in diagnostics and debugging.
In this case, the DSL was processed by the MVS High-Level Assembler, using a macro that did an
AREAD for the entire script, and then generated blocks that could be executed by the interpreter.
Domain-Specific Languages by Martin Fowler with Rebecca Parsons
This brings us to the main subject of this review, Domain-Specific Languages, by Martin Fowler with Rebecca Parsons. It was published in 2011 as part of the Martin Fowler Signature Series. The book begins with some introductory material to ease the reader into the topic of DSLs, then provides a number of chapters about various aspects of DSLs.
Chapter 1 begins with a hypothetical example that is simple and fun, but also lays out some of the reasons one might want to use a DSL. It describes a “Gothic Security System”, based on old Gothic movies, where one opens a secret panel in the wall by pulling the candle holder at the top of the stairs and tapping the wall twice, or something like that. It discusses how one might design a language so this system could be easily adapted for each haunted mansion where it is installed.
In the example, a customer wants a secret panel to open when she closes her bedroom door, opens a drawer in her dresser, and turns on the bedside light. The toy DSL for this project defines the events, shown below in green (the designations like D1CL refer to the inputs from the various sensors), the commands, shown below in red (the designations like PNUL refer to outputs to control locks and such), and then a series of states, shown below in blue. Each state lists events that cause a change of state, and what the new state is. They can optionally specify commands to be issued while in that state. The “resetEvents” section lists events that immediately put the system back into the idle state.
Here is a diagram that shows the state transitions of the finite state machine described by the above language. (The commands issued in each state are not shown.) The reset events are not shown, as they would make the diagram difficult to read. They would appear as lines from each node below the “idle” node, going back to the “idle” node and labeled with “doorOpened”.
The above language, while perhaps not intuitive, is concise, but flexible, since it can trigger actions based any series of events (e.g., close the door, open each dresser door in order, tap on the wall three times, turn on the TV, close each dresser door in order, etc.). And it doesn’t have a lot of syntactic noise, like semicolons and continuation characters.
The book mentions the possibility of using XML to configure the Gothic Security system, and quickly rejects it. XML has a lot of syntactic noise, and a lot of opportunities for syntax errors, with its nested tags, and opening and closing angle brackets that have to match, and that sort of thing. This all makes it fairly unfriendly to humans, particularly those who are not programmers by trade.
The book then talks about internal versus external DSLs. An external DSL is a standalone language, like the one described above. An internal DSL (sometimes referred to as an embedded DSL) is a general-purpose language, like Java or Ruby, that is warped, through special usage, into a DSL. For example, here is the same description of a Gothic Security system, but described in specially formatted Ruby.
This looks very much like the original DSL, with a bit more syntactic noise, but it is actually valid Ruby. The keywords event, command, resetEvents, and state are actually methods. The blocks like “state :unlockedPanel do” use a Ruby construct where the contents of the block, method calls and all, are passed to the method.
Thus it is possible to tack the user’s configuration specification onto the end of the program that defines the methods, and run the whole thing through the Ruby interpreter. This has pros and cons: you don’t have to write a separate compiler for your DSL, but it is syntactically noisier, and the interpreter will probably give confusing messages if the user makes any errors.
The rest of Chapter 1 deals with semantic models, code generation, language workbenches, and visualization. This is followed by 56 chapters in six parts:
- Common Topics
- External DSL Topics
- Internal DSL Topics
- Alternative Computational Models
- Code Generation
With 57 chapters total, the book seems a bit intimidating at first, and it would be, except for the way it is organized, which is similar to the “design patterns” books in the Martin Fowler Signature Series. The book consists of very short chapters, often just three or four pages, that talk about a particular topic. When that topic is referenced, the page where its chapter starts is given in parentheses, like this:
With each event declaration, I can create an event from the Semantic Model and put it into a Symbol Table (165).
Using the page number in the reference, rather than a chapter or section number, or referring to a footnote, means you do not have to go to the table of contents or look somewhere else to find the page you want. And the book has a built-in ribbon to hold your place, so if you want to spend some time in the referenced chapter, you do not have keep your finger on the page you came from to hold your position. This seems like a simple thing, but in practice, it makes it much easier to follow the references.
The short chapters, each about a single topic, make it easy to look up what you need to, and skip things you already know or do not need. If you are implementing a DSL on a deadline, rather than taking a college course on compilers, this organization works much better than the older compiler books that would have a large chapter on each general area of compiler design, long on theory and short on practice. This is a book for the working programmer who needs to implement a DSL.
Now, you may never need to implement a DSL. But it is also possible that doing so would make your life a lot easier, and you have just not considered the possibility. This book will help you decide, and show you some of the possibilities, without bogging you down with too much theory.
One parting thought: some of us still spend most of our working lives coding in assembly language. It is possible to implement a DSL in assembly language. (In fact, ColmoneBOL, mentioned earlier, was implemented in assembly language.) But there are so many more tools and facilities available in high-level languages, not to mention faster coding (studies have shown that programmers code about the same number of lines of code per day, regardless of whether they are coding in assembly language or a high-level languages, and high-level languages do more per line) and fewer error possibilities, that the cost of learning a new language may be outweighed by the improved efficiency it brings.