Why Generics Constraints ? - Felix John COLIBRI.
- abstract : do generics really require constraints ?
- key words : generics - type parameters - constraints - compilation
- software used : Windows XP Home, Rad Studio 2007, Delphi 2009
- hardware used : Pentium 2.800Mhz, 512 M memory, 140 G hard disc
- scope : Rad Studio 2007, Delphi 2009
- level : Delphi developer
- plan :
1 - Craig STUNTZ vs Sergey ANTONOV
Craig STUNTZ published a series of blog posts about generics. The comments were even more interesting: there was a heated debate, led by Sergey ANTONOV, about whether constraints were required at all for Delphi generics.
The main point was that the compiler has all the information at hand and could, before generating binary code, decide without any constraint whether "a+b" is legal or not.
Here is our interpretation of those explanations.
2 - Generics Constraints Crash Course
Generics are used in two steps:
- a first piece of code defines some algorithms (say a container like a tList, or a dictionary with key-value pairs), like this:
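type c_generic_stack<T>= class
       m_generic_array: Array of T;
       m_top_of_stack: Integer;
       constructor create_generic_stack(p_length: Integer);
       procedure push(p_gen: T);
       function f_pop: T;
     end; // c_generic_stack

procedure c_generic_stack<T>.push(p_gen: T);
  begin
    // store the value only while there is room left
    if m_top_of_stack< Length(m_generic_array)
      then begin
          m_generic_array[m_top_of_stack]:= p_gen;
          Inc(m_top_of_stack);
        end;
  end; // push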
- in order to use this algorithm (push a value on a stack in our case), the "true" code must define the actual type which will be used in place of the generic T:
var l_c_integer_stack: c_generic_stack<Integer>;
var l_c_string_stack: c_generic_stack<string>;
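To make the two steps concrete, here is a minimal console sketch of the same stack, from definition to specialization. The bodies of create_generic_stack and f_pop are not shown in this article, so they are assumptions: a push on a full stack is ignored and a pop on an empty stack returns the default value of T.

```delphi
program p_generic_stack_sketch;
{$APPTYPE CONSOLE}

type
  c_generic_stack<T> = class
    m_generic_array: array of T;
    m_top_of_stack: Integer;
    constructor create_generic_stack(p_length: Integer);
    procedure push(p_gen: T);
    function f_pop: T;
  end; // c_generic_stack

constructor c_generic_stack<T>.create_generic_stack(p_length: Integer);
begin
  inherited Create;
  SetLength(m_generic_array, p_length);
  m_top_of_stack := 0;
end; // create_generic_stack

procedure c_generic_stack<T>.push(p_gen: T);
begin
  // assumption: a push on a full stack is silently ignored
  if m_top_of_stack < Length(m_generic_array) then
  begin
    m_generic_array[m_top_of_stack] := p_gen;
    Inc(m_top_of_stack);
  end;
end; // push

function c_generic_stack<T>.f_pop: T;
begin
  // assumption: popping an empty stack returns the default value of T
  Result := Default(T);
  if m_top_of_stack > 0 then
  begin
    Dec(m_top_of_stack);
    Result := m_generic_array[m_top_of_stack];
  end;
end; // f_pop

var
  l_c_integer_stack: c_generic_stack<Integer>;
  l_c_string_stack: c_generic_stack<string>;
begin
  // step 2: the actual types replace T
  l_c_integer_stack := c_generic_stack<Integer>.create_generic_stack(4);
  l_c_integer_stack.push(10);
  l_c_integer_stack.push(20);
  WriteLn(l_c_integer_stack.f_pop); // prints 20
  l_c_integer_stack.Free;

  l_c_string_stack := c_generic_stack<string>.create_generic_stack(4);
  l_c_string_stack.push('alpha');
  l_c_string_stack.push('beta');
  WriteLn(l_c_string_stack.f_pop); // prints beta
  l_c_string_stack.Free;
end.
```

Nothing in this stack inspects the T values: they are only stored and returned, which is why no constraint is needed so far.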
In the previous example we handled the T instances globally, simply pushing them on the stack. Very often we would like to perform some operations, like comparisons or arithmetic, on T values. Let's assume we want to locate a cell in an array: we must be able to compare cell values with some target value. To do this (Delphi for .Net), we put a constraint on T which guarantees that the comparison is available.
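As an illustration of such a constraint, here is a minimal sketch written for Delphi 2009 Win32. It is not the listing of the Delphi for .Net tutorial: the base class c_comparable, the class c_amount and the container c_generic_array are hypothetical names introduced for this example, and a class constraint is used where an interface constraint would work as well.

```delphi
program p_constraint_sketch;
{$APPTYPE CONSOLE}

type
  // hypothetical base class: every type stored in the array derives from it
  c_comparable = class
    function f_compare_to(p_c_other: c_comparable): Integer; virtual; abstract;
  end; // c_comparable

  c_amount = class(c_comparable)
    m_value: Integer;
    constructor create_amount(p_value: Integer);
    function f_compare_to(p_c_other: c_comparable): Integer; override;
  end; // c_amount

  // the constraint "T: c_comparable" tells the compiler that f_compare_to exists on T
  c_generic_array<T: c_comparable> = class
    m_generic_array: array of T;
    procedure add(p_gen: T);
    function f_index_of(p_gen: T): Integer;
  end; // c_generic_array

constructor c_amount.create_amount(p_value: Integer);
begin
  inherited Create;
  m_value := p_value;
end; // create_amount

function c_amount.f_compare_to(p_c_other: c_comparable): Integer;
begin
  Result := m_value - (p_c_other as c_amount).m_value;
end; // f_compare_to

procedure c_generic_array<T>.add(p_gen: T);
begin
  SetLength(m_generic_array, Length(m_generic_array) + 1);
  m_generic_array[High(m_generic_array)] := p_gen;
end; // add

function c_generic_array<T>.f_index_of(p_gen: T): Integer;
var
  l_index: Integer;
begin
  Result := -1; // -1 means "not found"
  for l_index := 0 to High(m_generic_array) do
    if m_generic_array[l_index].f_compare_to(p_gen) = 0 then
    begin
      Result := l_index;
      Exit;
    end;
end; // f_index_of

var
  g_c_amounts: c_generic_array<c_amount>;
  g_c_target: c_amount;
  g_index: Integer;
begin
  g_c_amounts := c_generic_array<c_amount>.Create;
  g_c_amounts.add(c_amount.create_amount(10));
  g_c_amounts.add(c_amount.create_amount(25));
  g_c_target := c_amount.create_amount(25);

  WriteLn(g_c_amounts.f_index_of(g_c_target)); // prints 1

  // release the stored objects, then the container and the target
  for g_index := 0 to High(g_c_amounts.m_generic_array) do
    g_c_amounts.m_generic_array[g_index].Free;
  g_c_amounts.Free;
  g_c_target.Free;
end.
```

With the constraint in place, the call to f_compare_to inside the generic code is checked when the generic class itself is compiled, and c_generic_array<Integer> is rejected at the point of specialization because Integer does not derive from c_comparable.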
The discussion was whether this constraint technique was necessary or not in Delphi 2009.
To decide this, the key points are
- how does a compiler translate the generic code into binary code
- can the compiler generate the binary without using constraints
- if it can, should it ?
3 - From Interpreter to Compiler
Since the discussion revolved around how C++, C# and Delphi Win32 generate the code, let's quickly present the alternatives.
Basically the programmer types source code, and the customer runs binary code.
To translate from one to the other, here are some of the possible techniques:
In this figure
- 3+'4' represents some operation (could be total:= amount+ taxes, or with literal values, etc)
- push_i 3 is the translation in some pseudo-code (translation in the simple, standardized instruction set of a virtual processor)
- the virtual code is interpreted or translated into binary code of some
- MOV AX, 3 is the representation of this binary code. We represented the "assembler" side, the real stuff being a sequence of bytes ($A9 etc). This is the only thing that the processor will ever understand. Everything before it, including our source code, is an intermediate step
- green arrows represent the handling on the programmer's side
- blue arrows, the handling on the customer's side
- red bars mark the errors detected by the compiler, fuchsia arrows the errors detected at run time
To summarize some of the routes:
- with a pure Basic interpreter
  - the programmer only writes the source code
  - on the user PC, the interpreter analyzes and runs the code. Errors (usually all called "syntax error") are detected there
- with the Apple ][ UCSD system
  - the source code is translated into some "pseudo processor code", and all type checking is performed at this time.
  - the interpreter runs this code (by calling routines which translate each P-code instruction into binary), and runtime errors are flagged there
- with the C# environment
  - on the developer's site:
    - the source code is translated into MSIL (Microsoft Intermediate Language), and all type checking is performed at this time.
    - when the programmer tests his code, the Just In Time compiler translates this byte code (another name for P-code) into binary
  - the MSIL is deployed, the Just In Time compiler on the user PC translates it into binary, and runtime errors are caught then
- with a source to binary compiler, like Turbo Pascal, Delphi, and usually all C / C++ compilers
  - the source code is translated into binary and all type checking is performed at this time (except for C compilers which do not check anything, since any programmer writing code in C is by definition a system programmer, and we all know those never make any error, so there's nothing to check anyway).
  - this binary is deployed, and runtime errors are caught then
Also notice that 3+'4' could be accepted by a Basic interpreter (which would automatically convert '4' into 4), whereas the compilers would complain about adding a number and a string.
4 - Compiling Generics - Generics implementation
4.1 - The Goal
When we use generics, there is an additional step in the loop:
- the developer writes two pieces of code
- the generic code, telling about, say, adding a value of some generic type A and another value of type B (noted A+B, although in the code you add "values of type A" to "values of type B"). The concrete types A and B are not specified at this stage
- the "specialization code", where he specifies what types A and B actually are: Integer, Double, tDateTime, Complex number or whatever
- at the customer end of the transformation, as usual, only binary can be used.
So the transformation must convert the addition into a processor ADD instruction for "A / Integer", but into a call to the FPU for "A / Double".
Somewhere the generic code and the actual types must be mixed together to generate the binary code. Here are 3 ways to do it:
4.2 - C++ templates
In C++, the generic classes are just some kind of templates.
When the compiler finds some actual type, it reads the blueprint of the algorithm in the template, and compiles this template, replacing the generic type (A) with the actual type (Integer).
This can be represented like this:
The ETH people stressed over and over again that this kind of "macro generics" was not "true generics", and that their Oberon language had more "compiled generics".
However, this technique allows a lot of flexibility when you are writing generics: there is theoretically no need to impose any limitation on the operations performed on generic types, since by the time the code reaches the compiler they have been replaced by the macro processor, yielding generic-free code on which the compiler performs its usual type checks.
4.3 - The C# implementation
As we explained in our Delphi .Net generics tutorial, C# uses constraints. By putting constraints on what T can or cannot do, the compiler can check whether the generic code respects those constraints.
When some class specializes a generic class, additional tests check that the
actual type meets the generic ancestor's constraints.
As we wrote in our tutorial, generics are implemented at the .Net intermediate language level (IL: Intermediate Language = C# pseudo code) and at the CLR level (Common Language Runtime: the library managing the code, containing the IL-to-native compiler, the type checker, the loader, the memory manager etc).
The intermediate language contains:
- the parameterized types, along with the standard types
- markers for the type arguments
- information about generics included in the IL metadata
When the intermediate code is compiled into binary code (x86 assembler):
- when the code defines type arguments, the metadata is used to update the generic metadata with the argument metadata
- the JIT compiler can then perform its type checking
- if the type argument is a value type (Integer, Double etc), the parameters are replaced with the actual type, and the corresponding code is generated. Therefore, there is no boxing / unboxing for those actual types. In addition, if the type is used in some other places, the compiler uses a reference pointing to the compiled code
- if the type argument is a reference type (classes, arrays, lists etc), the type parameter is replaced by tObject. The native code uses a reference pointing to the object, without any casting.
This can be represented like this:
4.4 - The Delphi 2009 Generics
Delphi 2009 follows the C# constraint technique.
So in Delphi, the compiler
- transforms the generic code into compiled units
- for the units deriving new classes from generic classes by providing the actual type, reads the generic units back and generates the binary
This can be represented as follows:
The big difference is that in Delphi 2009 the compiler alone is involved in generating the code, whereas the C# system uses 2 steps:
- the compiler which generates the IL
- the JIT compiler which transforms this in binary code.
4.5 - With or Without Constraints ?
In C++ there is no need for constraints since the compiler has everything to check the code.
In C#, since there are 2 compilers, they had the choice
- either you use no constraints, and write code with any operator, say addition, and no checks. The JIT then flags the inconsistencies
- or you want earlier checks, and use constraints to perform them.
In Delphi there was the same choice.
Craig STUNTZ clearly prefers the constraints technique, explaining that
- the errors are flagged earlier. So this will be a big help to the developer who writes the generic Unit
- even more compelling, if there is no constraint, when the code with the actual type hits an error, there is a temptation to change the generic code to fix this problem, and thereby perhaps break other Units using the same generic ancestor. In addition, the user of the generic library would have to delve into the generic code, which he might not want to do.
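The practical effect of this choice can be sketched in a few lines of Delphi 2009. With an unconstrained T, the generic code may only use operations valid for any type, so "a+b" is refused when the generic class itself is compiled, before any actual type is known. One possible workaround, shown here as an assumption and not as the technique discussed in the blog comments, is to let the caller supply the operation. The names t_adder, c_generic_sum, f_add_integers and f_add_strings are hypothetical.

```delphi
program p_unconstrained_sketch;
{$APPTYPE CONSOLE}

type
  // hypothetical helper type: the caller supplies the addition for its own T
  t_adder<T> = function(p_a, p_b: T): T;

  c_generic_sum<T> = class
    class function f_sum(const p_values: array of T; p_adder: t_adder<T>): T;
  end; // c_generic_sum

class function c_generic_sum<T>.f_sum(const p_values: array of T;
    p_adder: t_adder<T>): T;
var
  l_index: Integer;
begin
  Result := Default(T);
  for l_index := Low(p_values) to High(p_values) do
    // Result := Result + p_values[l_index];
    //   the line above is rejected: "+" is not known for an unconstrained T
    Result := p_adder(Result, p_values[l_index]);
end; // f_sum

function f_add_integers(p_a, p_b: Integer): Integer;
begin
  Result := p_a + p_b;
end; // f_add_integers

function f_add_strings(p_a, p_b: string): string;
begin
  Result := p_a + p_b;
end; // f_add_strings

begin
  WriteLn(c_generic_sum<Integer>.f_sum([1, 2, 3], f_add_integers)); // prints 6
  WriteLn(c_generic_sum<string>.f_sum(['ab', 'cd'], f_add_strings)); // prints abcd
end.
```

The generic unit stays valid for every T, and a type without a sensible addition simply has no adder to pass, so the error shows up in the user's code rather than inside the generic library.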
4.6 - Security vs Flexibility
We have had this discussion for a long time, and it is going to become even more important with the coming features.
On the security side:
- if you can spot some error, the compiler can. And, as Niklaus WIRTH emphasized, it must: stop right there, tell you about it, and not move on until you correct it. "The compiler will shoot into the foot" (the quote from the joke Shoot yourself in the foot)
- the general idea being that a mistake caught by the compiler will cost 100 times less than taking a plane to the customer's premises to understand and fix the bug, not to mention inviting him to a nice restaurant to try to forget about the whole incident.
On the flexibility side, we want to be as expressive as possible, and the code should present what we want to achieve in the most natural way. On the other hand, we are happy that the compiler catches inconsistencies as soon and as thoroughly as possible.
5 - Your comments are welcome
- we welcome any comment, criticism, enhancement, other sources or reference suggestion. Just send an e-mail to firstname.lastname@example.org.
6 - References
Just a couple of links:
And don't forget to watch Craig's presentation at CodeRage III: Tuesday, December 2 at 8:45 PST. He will present a more general point of view about "Functional Programming in Delphi 2009".
We all see the writing on the wall: functional programming is the obvious way to go. For one main reason: mathematics. In the same way that relational databases overcame the other (hierarchical, navigational) database models because of Codd's mathematical insight, the functional programming model will allow us to perform, some day, program validation and verification. No longer fiddling to check whether it works, but quietly moving from concept to implementation.
7 - The author
Felix John COLIBRI works at the Pascal Institute. Starting with Pascal in 1979, he then became involved with Object Oriented Programming, Delphi, Sql, Tcp/Ip, Html, UML. Currently, he is mainly active in the area of custom software development (new projects, maintenance, audits, BDE migration, Delphi Xe_n migrations, refactoring), Delphi Consulting and Delphi training. His web site features tutorials, technical papers about programming with full downloadable source code, and the description and calendar of forthcoming Delphi, FireBird, Tcp/IP, Web Services, OOP / UML, Design Patterns, Unit Testing training sessions.