What is an enumeration?
(If you know, you can skip to the next section.) First you need to know what a constant is. A constant is a named value that is not intended to be changed. LB has a good number of Windows constants predefined for us to use, like _NULL and _SW_SHOW. However, LB doesn't allow us to define our own true constants. We are forced to use variables, like LVS.REPORT. That's not too bad unless you're using SUBs and FUNCTIONs; then you have to either declare them all GLOBAL or remember to define them inside each SUB or FUNCTION that uses one.
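As a rough illustration of that last point, here is a minimal sketch. The SUB name ShowStyle is made up for the example, and 1 is the Win32 value of LVS_REPORT.
Code: |
' LVS.REPORT is an ordinary variable standing in for a constant
Global LVS.REPORT
LVS.REPORT = 1          ' Win32 value of LVS_REPORT

Call ShowStyle          ' hypothetical SUB, for illustration only
End

Sub ShowStyle
    ' Without the Global line above, LVS.REPORT would be a new
    ' local variable here and would hold 0 instead of 1
    Print LVS.REPORT
End Sub |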
Sometimes you need a bunch of constants. You may not really care what each constant's value is; you just want each one to have a unique value. Many languages have enumeration types.
Code: |
(* Pascal *)
TYPE token = ( END_OF_LINE, FOR_STMT, WHILE_STMT );
VAR T: token;
BEGIN
    T := FOR_STMT;
    ...

// C and C++
enum token { END_OF_LINE, FOR_STMT, WHILE_STMT };
token T = FOR_STMT;
...

' Visual Basic
Enum token
    END_OF_LINE
    FOR_STMT
    WHILE_STMT
End Enum
Dim T As token
...
T = FOR_STMT |
What I used in LB
When I wrote my VFScript language in LB a few years back, I used STRUCTs as a substitute for enumerated types. It looked something like the following:
Code: |
Struct token, EndOfLine As Word, ForStmt As Word, WhileStmt As Word |
That sure is complicated, but after all of that is done we have ourselves a bunch of constants that can be used anywhere in the program because STRUCTs are global.
Code: |
T = token.ForStmt.struct |
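The declaration by itself does not give the fields distinct values, so a fuller sketch might look like the following. The values 5, 10 and 15 are arbitrary picks for illustration, not necessarily what VFScript used.
Code: |
Struct token, EndOfLine As Word, ForStmt As Word, WhileStmt As Word

' Assumed values: each field has to be assigned once, at startup,
' before it can stand in for a constant
token.EndOfLine.struct = 5
token.ForStmt.struct = 10
token.WhileStmt.struct = 15

T = token.ForStmt.struct
If T = token.ForStmt.struct Then Print "FOR statement" |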
Besides the declaration being difficult to read and edit, and hard for almost anyone to understand, STRUCTs are really slow. See here for further explanation and code examples.
In addition to parsing tokens, VFScript needed error reporting. I needed to associate some text with each token so when the VFScript programmer did something wrong, VFScript could tell him or her what was wrong. (I suppose I could have made VFScript act like LB itself and just say "Syntax Error," but if you can make a program friendlier, why not do it?) I used an array initialized from DATA statements, which made adding new tokens an even bigger nightmare. Now I had to keep two sections of code perfectly synchronized; otherwise the user would receive an erroneous error message. Not only is that potentially confusing, it's also ironic. A better way of doing this--and the only way in many languages--is explicit assignments.
Code: |
TName$(token.EndOfLine.struct) = "End of line"
TName$(token.ForStmt.struct) = "FOR"
TName$(token.WhileStmt.struct) = "WHILE" |
Some non-solutions
Obviously the fastest expression for LB to evaluate is a literal number (e.g. 12). However, a number in isolation has little meaning, so we want a way to give a pseudo-enumeration some names.
My first thought was to use code comments. I would make a list of comments in a block like so:
Code: |
' 5_'End of line
' 10_'FOR
' 15_'WHILE |
Code: |
If token = 10_'FOR
Then ... |
Code: |
If token = /for/ ... |
Code: |
token = 10'FOR |
The problem with the preceding method is that comments in LB (and in every BASIC I know) are a one-per-line deal. If only LB had an inline form of comment like C (/* */) or Pascal ((* *) or { }). Well, how about this?
Code: |
token = val("10 FOR")
...
If token = val("10 FOR") Then |
Code: |
token = hexdec("A FOR")
...
If token = hexdec("A FOR") Then |
Anyway, by now I was getting pretty annoyed because I realized that none of these methods have yet addressed how to display the text given an enumeration constant. I needed to rethink this.
A viable solution
So now I'm re-evaluating my goals for this algorithm. If strings are too slow, how can I use something numeric to comment on something numeric? Ohh, I have numeric variables I can use. But how to use them?
I had the sneaking suspicion that DATAs were going to be involved. So what comes naturally with DATA? Well, the items can be separated by commas. Okay, if I put on each DATA line a number and some text that just happens to follow the rules for forming numeric variable names in LB, I might search it with relative ease. But what about the "constant" side of the problem? I'd need some expression with a comma in it--a function call--but what's a function I can pass a number and a numeric variable that will always be zero and get back the number? MAX is the answer. Because the variable is never assigned and so stays zero, max(10, FOR.STMT) returns 10, provided the number itself is not negative.
So then the question is, "Is MAX up to the task?" As it turns out, MAX is only about 1.2 times slower than a literal number in my benchmarks. I think that's plenty fast for what it gives us. We still have to Find/Replace, and due to the naming restrictions the messages the user gets won't be as "friendly."
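For anyone who wants to repeat the comparison, a timing loop along these lines is one way to do it. This is only a sketch: the loop count is arbitrary, it is not the benchmark that produced the figure above, and it will misbehave if the clock passes midnight during the run, since time$("ms") counts milliseconds since midnight.
Code: |
N = 1000000                     ' arbitrary loop count

start = time$("ms")
For i = 1 To N
    T = 10                      ' plain literal
Next i
literalMs = time$("ms") - start

start = time$("ms")
For i = 1 To N
    T = max(10, FOR.STMT)       ' literal "commented" with a variable name
Next i
maxMs = time$("ms") - start

Print "literal: "; literalMs; " ms"
Print "max():   "; maxMs; " ms" |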
Here's some example code.
Code: |
Function TokenName$( Token )
    Restore     ' rewind so every call searches from the first DATA item
    Do
        Read t, n$
    Loop Until t = Token
    TokenName$ = n$
    Data 5, END.OF.LINE
    Data 10, FOR.STMT
    Data 15, WHILE.STMT
End Function

T = max(10, FOR.STMT) |
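One gap in the function above is a token value that is not in the DATA list: the loop keeps reading until it runs out of data. A possible variation, sketched below, ends the list with a sentinel record. The -1 sentinel, the END.OF.LIST name and the "UNKNOWN" fallback are my own assumptions for illustration.
Code: |
Print TokenName$(max(10, FOR.STMT))
Print TokenName$(99)            ' 99 is not in the list
End

Function TokenName$( Token )
    Restore                     ' start from the first DATA item on every call
    TokenName$ = "UNKNOWN"      ' assumed fallback text
    Do
        Read t, n$
        If t = Token Then TokenName$ = n$
    Loop Until t = Token Or t = -1
    Data 5, END.OF.LINE
    Data 10, FOR.STMT
    Data 15, WHILE.STMT
    Data -1, END.OF.LIST        ' sentinel record marking the end of the list
End Function |
The cost is one extra DATA line to maintain, but adding a token is still a single-line change.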
It's certainly not a perfect solution. Do you think you can do better? If so, post it here and now.