How To Create Your Own Programming Language (Using Ruby and/or Java)
Create Your Own
Programming Language is
an interesting new information product and community by Marc-André Cournoyer (of
Thin fame) that promises to teach you how to
create a simple programming language. The official site is well worth checking out, even if you don't
want to buy it, as it's a great example of how to sell a product like this. Create
Your Own Programming Language costs $39.99 and has a two-month money-back guarantee.
What You Get - 2 Different Approaches
The package includes a 53-page PDF (only 44 pages in the earlier
copy I received), a pile of source code (for two different types of
bootstrapped languages), and a screencast, along with access to a community of
other users. Out of the box, you can create your own "programming
language" (of sorts) with a single shell script.
Two different types of approaches are provided. The first is a
pure Ruby lexer, parser, and interpreter that lets you build your programming
language using mostly Ruby. This is what the PDF covers. The second is a JVM
(Java Virtual Machine) based language that provides a bootstrap and execution
environment upon which you can build a higher performance language; this is
what the screencast covers. In both cases, the default languages are ultra-bare
Ruby variants of sorts.
Quick Results Rather Than Detail
The PDF is short but well produced. It leads you through
building a new Ruby-like language called "Awesome" on top of the Ruby-powered lexer, parser, and interpreter supplied in the package. It covers
the broad concepts well, with a focus on quick results rather than
detail or inane computer science.
Likewise, the screencast video isn't an "everything from
start to finish" production in the PeepCode manner. It's only 11 minutes
long and moves incredibly quickly. The screencast covers adding a
"while" construct to the JVM-backed language, as well as a
"substring" string method and "eval". The video is good for
getting a "high level" view of what's involved (and is probably worth
watching before opening the book, just to get a feel), but to get the most out of
it you need to be either familiar with the terminology and concepts being
covered (some of which the PDF explains) or ready to hit the pause button a lot.
It Whets The Appetite; Great For Dabblers
Create Your Own Programming Language is
suitably titled but potentially misleading: although it does let you
create your "own" programming language, the result covers only
a narrow slice of what could be considered a "programming
language." No, CYOPL isn't going to meet lofty technical expectations, but
it provides a great way to dip your toes into the waters of creating a
language, and I wouldn't hesitate to recommend it to those who want to have a
dabble and learn a few things.
If, however, you consider yourself a bit of a hotshot and want to really dig deep into building compilers and programming languages, the materials in CYOPL lack the detail and the frankly intimidating level of knowledge you'd need to really design and build a robust programming language. If that's you, buy a copy of Compilers: Principles, Techniques and Tools by Aho, Lam, Sethi, and Ullman and enjoy the ride - it's an awesome book and considered canonical in the compiler construction field.
Alternatively, read
Loren Segal's awesome (free) online Writing Your Own Toy
Compiler Using Flex, Bison, and LLVM series.
Free Sample Chapter
Let’s say you have
the following code:
    print(
      "I ate",
      3,
      pies
    )
Once this code goes through the lexer,
it will look something like this:
    [IDENTIFIER print] ["("]
    [STRING "I ate"] [","]
    [NUMBER 3] [","]
    [IDENTIFIER pies]
    [")"]
What the lexer does is split the
code into atomic units (tokens) and tag each one with the type of token it
contains. This job can be done by some parsers, as we’ll see in the next
chapter, but separating it into two distinct processes makes it simpler for us
developers and easier to understand.
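To make the splitting and tagging concrete, here is a minimal sketch of a hand-written lexer in Ruby. It is not the lexer shipped with the package: the `tokenize` helper, the token names, and the exact patterns are assumptions chosen to mirror the sample output above.

```ruby
require "strscan"

# Hypothetical helper (not from the book): split source code into
# [TYPE, value] pairs, one pair per token.
def tokenize(code)
  scanner = StringScanner.new(code)
  tokens = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      # Whitespace separates tokens but produces none.
    elsif (text = scanner.scan(/[A-Za-z_]\w*/))
      tokens << [:IDENTIFIER, text]
    elsif (text = scanner.scan(/\d+/))
      tokens << [:NUMBER, text.to_i]
    elsif (text = scanner.scan(/"[^"]*"/))
      tokens << [:STRING, text[1..-2]] # drop the surrounding quotes
    else
      # Any other single character (parentheses, commas) is its own token type.
      char = scanner.getch
      tokens << [char, char]
    end
  end
  tokens
end

code = <<~SOURCE
  print(
    "I ate",
    3,
    pies
  )
SOURCE

tokenize(code).each { |token| p token }
```

Each token carries both its type and its value, so a parser working from this list no longer has to look at individual characters.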
Lexers can be implemented using regular
expressions, but more appropriate tools exist. Each of these tools takes a
grammar that will be compiled into the actual lexer. The formats of these
grammars are all alike. Regular expressions on the left-hand side are
repeatedly matched, in order, against the next portion of the input code
string.
When a match is found,
the action on the right is taken.
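The pattern-on-the-left, action-on-the-right idea can be imitated in plain Ruby with an ordered rule table. This is only a sketch of the concept, not the grammar format of any particular lexer generator: `RULES`, `run_rules`, and the `while` keyword rule are hypothetical names introduced for illustration.

```ruby
require "strscan"

# Hypothetical rule table: regular expression on the left, action on the right.
# An action receives the matched text and returns a token, or nil to emit nothing.
RULES = [
  [/\s+/,          ->(_text) { nil }],                   # skip whitespace
  [/while\b/,      ->(text) { [:WHILE, text] }],         # keyword, listed before identifiers
  [/[A-Za-z_]\w*/, ->(text) { [:IDENTIFIER, text] }],
  [/\d+/,          ->(text) { [:NUMBER, text.to_i] }],
  [/"[^"]*"/,      ->(text) { [:STRING, text[1..-2]] }], # drop the quotes
  [/./,            ->(text) { [text, text] }]            # any other single character
].freeze

def run_rules(code, rules = RULES)
  scanner = StringScanner.new(code)
  tokens = []
  until scanner.eos?
    # Try the rules in order; the first pattern that matches at the
    # current position wins.
    pattern, action = rules.find { |regex, _| scanner.match?(regex) }
    raise ArgumentError, "no rule matches at offset #{scanner.pos}" unless pattern

    token = action.call(scanner.scan(pattern))
    tokens << token if token
  end
  tokens
end

p run_rules('while whiley')
p run_rules('print("I ate", 3, pies)')
```

Because the rules are tried in order, the keyword rule has to come before the general identifier rule; otherwise `while` would be tagged as an ordinary identifier.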