In the on-going saga of developing FELT (a mammoth transpiler, v0.1 is viewable at http://felt-lang.com, I had need to implement simple type inference...once introduced it touches everything; there are no half measures so for now I have had to make a compromise to avoid getting side-tracked right now.
What is type inference anyway ?
Putting it simply, it’s the ability of a compiler or an interpreter to work out what the data type of an expression or variable is from the context in which it is used rather than having it explicitly declared by the developer.
For example, in the following code, FELT would already know that the type of
int from its internal knowledge base, if you decided to call that something else just to be awkward, well, that would be your own silly fault.
From the above, we (note, -we- not my language yet!) can see that defining the variable
n to have the initial value of
argc implies that they will have the same data type,
int to be precise. Then when the generic
EMIT instruction sees its only argument, all it has to do is ask for the type. Note that it is not fooled by the global scoped definition of argc as a float either, I have already implemented lexical scoping and that is working well.
Where it breaks down, for now.
I broke it with a simple test case that had this in it:
And the nub of the problem is this: in order to know what printf format specifier is needed for the second argument, I would have to know the types of
foo, bar, baz and if they were different types, then I would also now be forced into knowing the type coercion / promotion rules for C as well. As you can see, things mushroom like mad! :D
Don’t need it right now
The reason why I have spent a lot of time on it but not enough time to absolutely smash it is that I have a plan of execution regarding the order of instructions and how to go about squaring the circle to provide complete translation of FELT into C and this is basically, an irksome diversion from that plan.
At the moment the render phase will throw an exception stating that you need to provide the type as it has failed if it can’t work it out. This feels bad and I will fix it eventually, but not today. Not this week. Maybe this month. It will have a good go by examining the lexical scoping of the variable, and also by checking to see if the string is a hex number, a floating point number or decimal number, using “%x”, “%f” and “%i” accordingly but this may not work in all cases.
Call printf directly for C
The language contains the #F:WHEN instruction which lets you have control over what code makes it through to the output stream, so assuming C needed special treatment we could do this instead.
Sadly the current implementation of this instruction does not have an “else” so you would have to explicitly provide an implementation. It was initially designed to allow different code rendering inside a function not to provide complete different functions, but fear not it has been fixed in the new version and will do the right thing as if by magic.
<> which renders an HTML tag with attributes and body.
The function is DEFUN-ed into scope, it has the name
<> which is fine by me, and it says it has one argument we call
body contains the functions arguments. We then array shift the first two arguments away (the tag name and attributes) leaving us with the body content.
Currently you assign a type to a variable like this, “varname/type” (without the quotes) and type is stored and used when needed. I could change the syntax so that the emit argument uses the “starred form”, something the parser already allows for but is not currently used. The starred form? Yes! All instructions can be entered as
foo*… this goes back the first implementation. I might drop this feature but not until I see how much I used it. Basically it lets you choose an alternative internal handling of the instruction, for example, DEFUN defines a function, DEFUN* defines an anonymous function, allowing closures and local functions to be used.
Not a problem. I already have a solution in mind but I really need to make the code generation process my top priority. For the most part, the type inference and lexical scoping is “good enough” and I always knew that the “C” target would be the hardest to do which is why I chose it first…once I have it done the rest of the targets will be a walk in the park.