Felix Colibri- Why are Generics Constraints required

Why Generics Constraints ? - Felix John COLIBRI.

abstract : do generics really require constraints ?

key words : generics - type parameters - constraints - compilation

software used : Windows XP Home, Rad Studio 2007, Delphi 2009

hardware used : Pentium 2.800Mhz, 512 M memory, 140 G hard disc

scope : Rad Studio 2007, Delphi 2009

level : Delphi developer

plan :

Craig STUNTZ vs Sergey ANTONOV

A Generics Constraints Crash Course

From Interpreter to Compiler

Compiling Generic code

Your comments are Welcome

1 - Craig STUNTZ vs Sergey ANTONOV

Craig STUNTZ published a series of blog posts about generics:

D2009 Generics and Type Constraints (2008 08 29), where he presents the Delphi version of the classical generic calculator, using a generic tAdder Class
Building a Generic Statistics Library - part 1 to 5, which is a very concrete presentation of higher order functions.

The comments were even more interesting. There was a heated debate, led by Sergey ANTONOV, about whether constraints were required at all for Delphi Generics. The main point was that the compiler has all the information at hand and could, before generating binary code, decide without any constraint whether "a+b" is legal or not.

Here is our interpretation of those explanations.

2 - Generics Constraints Crash Course

Generics are used in two steps

a first piece of code defines some algorithms (say a container like a tList, or a dictionary with key-value pairs), like this

type c_generic_stack<T>= class
                           m_generic_array: Array of T;
                           m_top_of_stack: Integer;

                           constructor create_generic_stack(p_length: Integer);
                           procedure push(p_gen: T);
                           function f_pop: T;
                         end; // c_generic_stack

// ooo

procedure c_generic_stack<T>.push(p_gen: T);
  begin
    if m_top_of_stack< Length(m_generic_array)
      then begin
          m_generic_array[m_top_of_stack]:= p_gen;
          Inc(m_top_of_stack);
        end;
  end; // push

in order to use this algorithm (push a value on a stack in our case), the "true" code must define the actual type which will be used in place of the generic T

var l_c_integer_stack: c_generic_stack<Integer>;

l_c_integer_stack:= c_generic_stack<Integer>.create_generic_stack(5);
l_c_integer_stack.push(111);

var l_c_string_stack: c_generic_stack<string>;

l_c_string_stack:= c_generic_stack<string>.create_generic_stack(5);
l_c_string_stack.push('abc');
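The fragments above leave out the constructor and f_pop. Here is a minimal sketch of a complete console program, assuming that popping an empty stack returns Default(T) and that a push on a full stack is silently ignored (both are our assumptions, as is the program name). The FPC directive is only there for readers who try it with Free Pascal.

```pascal
program generic_stack_demo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type c_generic_stack<T>= class
                           m_generic_array: Array of T;
                           m_top_of_stack: Integer;

                           constructor create_generic_stack(p_length: Integer);
                           procedure push(p_gen: T);
                           function f_pop: T;
                         end; // c_generic_stack

constructor c_generic_stack<T>.create_generic_stack(p_length: Integer);
  begin
    Inherited Create;
    SetLength(m_generic_array, p_length);
    m_top_of_stack:= 0;
  end; // create_generic_stack

procedure c_generic_stack<T>.push(p_gen: T);
  begin
    // -- a push on a full stack is ignored (assumption)
    if m_top_of_stack< Length(m_generic_array)
      then begin
          m_generic_array[m_top_of_stack]:= p_gen;
          Inc(m_top_of_stack);
        end;
  end; // push

function c_generic_stack<T>.f_pop: T;
  begin
    // -- an empty stack yields the default value of T (assumption)
    Result:= Default(T);
    if m_top_of_stack> 0
      then begin
          Dec(m_top_of_stack);
          Result:= m_generic_array[m_top_of_stack];
        end;
  end; // f_pop

var g_c_integer_stack: c_generic_stack<Integer>;
    g_c_string_stack: c_generic_stack<string>;

begin
  g_c_integer_stack:= c_generic_stack<Integer>.create_generic_stack(5);
  g_c_integer_stack.push(111);
  g_c_integer_stack.push(222);
  writeln('integer pop : ', g_c_integer_stack.f_pop); // 222, the last pushed value

  g_c_string_stack:= c_generic_stack<string>.create_generic_stack(5);
  g_c_string_stack.push('abc');
  writeln('string pop : ', g_c_string_stack.f_pop); // abc

  g_c_string_stack.Free;
  g_c_integer_stack.Free;
end.
```

Nothing in push or f_pop depends on what T is: the values are only stored and copied, which is why no constraint is needed here.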

In the previous example we handled the T instances as a whole, simply pushing them on the stack. Very often we would like to perform some operations, like comparisons or arithmetic, on T values. Let's assume we want to locate a cell in an array: we must be able to compare cell values with some target value. To do this (Delphi for .Net):

we look for an Interface which defines equality. This is the iEquatable Interface which has an Equals(T) Function

In the generic Class, we constrain the generic type T to implement this Interface, and can then use the Equals function on all T values:

type c_find_array<T_data: iEquatable<T_data> >=
         class
           m_array: Array of T_data;
           m_count: Integer;

           constructor create_find_array(p_length: Integer);
           procedure add_to_array(p_cell: T_data);
           function f_index_of(p_cell: T_data): Integer;
         end; // c_find_array

// ooo

function c_find_array<T_data>.f_index_of(p_cell: T_data): Integer;
  var l_index: Integer;
  begin
    Result:= -1;

    for l_index:= 0 to m_count- 1 do
    begin
      if m_array[l_index].Equals(p_cell)
        then begin
            Result:= l_index;
            Break;
          end;
    end; // for l_index
  end; // f_index_of

when we want to use the generic array, the actual type must implement the iEquatable Interface. It turns out that in .Net, all the usual types are equatable: Integer, Double, String etc. So we could write

var g_c_integer_array: c_find_array<Integer>;

g_c_integer_array:= c_find_array<Integer>.create_find_array(10);

g_c_integer_array.add_to_array(111);
g_c_integer_array.add_to_array(222);

writeln('index of 222 : ', g_c_integer_array.f_index_of(222));
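The example above is Delphi for .Net. For Delphi 2009 Win32, here is a minimal sketch of the same idea, assuming an ancestor-class constraint instead of the iEquatable Interface (which sidesteps any interface reference counting question). The names c_keyed_cell, c_customer and f_key are hypothetical and only serve the illustration.

```pascal
program constraint_demo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type c_keyed_cell= class
                     function f_key: Integer; virtual; abstract;
                   end; // c_keyed_cell

     c_customer= class(c_keyed_cell)
                   m_id: Integer;

                   constructor create_customer(p_id: Integer);
                   function f_key: Integer; override;
                 end; // c_customer

     // -- the constraint: T_data must be c_keyed_cell or one of its descendents
     c_find_array<T_data: c_keyed_cell>=
         class
           m_array: Array of T_data;
           m_count: Integer;

           constructor create_find_array(p_length: Integer);
           procedure add_to_array(p_cell: T_data);
           function f_index_of(p_cell: T_data): Integer;
         end; // c_find_array

constructor c_customer.create_customer(p_id: Integer);
  begin
    Inherited Create;
    m_id:= p_id;
  end; // create_customer

function c_customer.f_key: Integer;
  begin
    Result:= m_id;
  end; // f_key

constructor c_find_array<T_data>.create_find_array(p_length: Integer);
  begin
    Inherited Create;
    SetLength(m_array, p_length);
    m_count:= 0;
  end; // create_find_array

procedure c_find_array<T_data>.add_to_array(p_cell: T_data);
  begin
    if m_count< Length(m_array)
      then begin
          m_array[m_count]:= p_cell;
          Inc(m_count);
        end;
  end; // add_to_array

function c_find_array<T_data>.f_index_of(p_cell: T_data): Integer;
  var l_index: Integer;
  begin
    Result:= -1;

    for l_index:= 0 to m_count- 1 do
      // -- f_key is accepted on T_data only because of the constraint
      if m_array[l_index].f_key= p_cell.f_key
        then begin
            Result:= l_index;
            Break;
          end;
  end; // f_index_of

var g_c_customer_array: c_find_array<c_customer>;
    g_c_customer_111, g_c_customer_222, g_c_customer_333: c_customer;

begin
  g_c_customer_111:= c_customer.create_customer(111);
  g_c_customer_222:= c_customer.create_customer(222);
  g_c_customer_333:= c_customer.create_customer(333);

  g_c_customer_array:= c_find_array<c_customer>.create_find_array(10);
  g_c_customer_array.add_to_array(g_c_customer_111);
  g_c_customer_array.add_to_array(g_c_customer_222);

  writeln('index of 222 : ', g_c_customer_array.f_index_of(g_c_customer_222)); // 1
  writeln('index of 333 : ', g_c_customer_array.f_index_of(g_c_customer_333)); // -1

  // -- the array does not own its cells: they are freed separately
  g_c_customer_array.Free;
  g_c_customer_333.Free;
  g_c_customer_222.Free;
  g_c_customer_111.Free;
end.
```

If the constraint is removed from the declaration of c_find_array, the call to f_key is rejected while the generic Class itself is compiled, before any actual type is supplied.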

The discussion was whether this constraint technique was necessary or not in Delphi 2009.

To decide this, the key points are

how does a compiler translate the generic code into binary code
can the compiler generate the binary without using constraints
if it can, should it?

3 - From Interpreter to Compiler

Since the discussion revolved around how C++, C# and Delphi Win32 generate the code, let's quickly present the alternatives.

Basically the programmer types source code, and the customer runs binary code. To translate from one to the other, here are some of the possible techniques:

interpreter_vs_compiler

In this figure

3+'4' represents some operation (could be total:= amount+ taxes, or with literal values, etc)
push_i 3 is the translation in some pseudo-code (translation in the simple, standardized instruction set of a virtual processor)
the virtual code is interpreted or translated into binary code of some concrete processor
MOV AX, 3 is the representation of this binary code. We represented the "assembler" side, the real stuff being a sequence of bytes ($A9 etc). This is the only thing that the processor will ever understand. Everything before, including our source code are intermediate steps

And

green arrows represent the handling on the programmer's side
blue arrows, the handling on the customer's side
the red bar, the errors detected by the compiler, and the fuchsia arrow, the errors detected at run time

To sum up some of the routes:

with a pure Basic interpreter
- the programmer only writes the source code
- on the user's PC, the interpreter analyzes and runs the code. Errors (usually all called "syntax error") are detected there
with the Apple ][ UCSD system
- the source code is translated into some "pseudo processor code", and all type checking is performed at this time.
- the interpreter runs this code (by calling routines which translate each P-code instruction into binary), and runtime errors are flagged there
with the C# environment
- on the developer's site, the source code is translated into MSIL (Microsoft Intermediate Language), and all type checking is performed at this time
- this MSIL (Byte Code, another name for P-code) is deployed
- when the program is run, the Just In Time Compiler translates the MSIL into binary, and runtime errors are caught then
with a source-to-binary compiler, like Turbo Pascal, Delphi, and usually all C / C++ compilers
- the source code is translated into binary and all type checking is performed at this time (except for C compilers which do not check anything, since any programmer writing code in C is by definition a system programmer, and we all know those never make any error, so there's nothing to check anyway).
- this binary is deployed, and runtime errors are caught then

Also notice that 3+'4' could be accepted by a Basic interpreter (which would automatically convert '4' into 4), whereas the compilers would complain about adding a numeric and a string.
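A small sketch of this point with a Delphi compiler. The rejected line is left as a comment, and the conversion that a Basic interpreter would perform silently has to be written explicitly (the program and variable names are ours).

```pascal
program add_number_and_string;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses SysUtils;

var g_total: Integer;

begin
  // -- uncommenting the next line stops the compilation: an Integer and
  // --   a character literal cannot be added
  // g_total:= 3+ '4';

  // -- the explicit conversion is type checked at compile time
  g_total:= 3+ StrToInt('4');
  writeln('total : ', g_total); // 7
end.
```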

4 - Compiling Generics - Generics implementation

4.1 - The Goal

When we use generics, there is an additional step in the loop:

the developer writes two pieces of code
- the generic code, which tells about, say, adding a value of some generic type A and another value of type B (noted A+B, although in the code you add "values of type A" to "values of type B"). The concrete types A and B are not specified at this stage
- the "specialization code", where he specifies what types A and B actually are: Integer, Double, tDateTime, Complex number or whatever
at the customer end of the transformation, as usual, only binary code can be used. So the transformation must convert the "A / Integer" addition into a processor ADD instruction, but the "A / Double" addition into a call to the FPU
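The two pieces of code can be sketched as follows. Since Delphi 2009 rejects "+" on an unconstrained T, this sketch assumes that the addition is handed over as a function when the actual type is supplied. This is one possible workaround and not necessarily the tAdder technique of the blog posts; c_accumulator and t_add_function are hypothetical names.

```pascal
program generic_sum_demo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// -- the generic code: no actual type is known here
type t_add_function<T>= function(p_a, p_b: T): T;

     c_accumulator<T>= class
                         m_total: T;
                         m_add: t_add_function<T>;

                         constructor create_accumulator(p_zero: T;
                             p_add: t_add_function<T>);
                         procedure add_value(p_value: T);
                       end; // c_accumulator

constructor c_accumulator<T>.create_accumulator(p_zero: T;
    p_add: t_add_function<T>);
  begin
    Inherited Create;
    m_total:= p_zero;
    m_add:= p_add;
  end; // create_accumulator

procedure c_accumulator<T>.add_value(p_value: T);
  begin
    // -- "m_total:= m_total+ p_value" would be rejected: nothing says
    // --   that T supports "+"
    m_total:= m_add(m_total, p_value);
  end; // add_value

// -- the specialization code: each addition compiles to different binary
function f_add_integer(p_a, p_b: Integer): Integer;
  begin
    Result:= p_a+ p_b; // integer addition
  end; // f_add_integer

function f_add_double(p_a, p_b: Double): Double;
  begin
    Result:= p_a+ p_b; // floating point addition
  end; // f_add_double

var g_c_integer_accumulator: c_accumulator<Integer>;
    g_c_double_accumulator: c_accumulator<Double>;

begin
  g_c_integer_accumulator:= c_accumulator<Integer>.create_accumulator(0, f_add_integer);
  g_c_integer_accumulator.add_value(10);
  g_c_integer_accumulator.add_value(20);
  writeln('integer total : ', g_c_integer_accumulator.m_total); // 30

  g_c_double_accumulator:= c_accumulator<Double>.create_accumulator(0, f_add_double);
  g_c_double_accumulator.add_value(1.5);
  g_c_double_accumulator.add_value(2.25);
  writeln('double total : ', g_c_double_accumulator.m_total:0:2); // 3.75

  g_c_double_accumulator.Free;
  g_c_integer_accumulator.Free;
end.
```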

Somewhere the generic code and the actual types must be mixed together to generate the binary code. Here are 3 ways to do it

4.2 - C++ templates

In C++, the generic classes are just some kind of templates.

When the compiler finds some actual type, it reads the blueprint of the algorithm in the template, and compiles this template, replacing the generic type (A) with the actual type (Integer)

This can be represented like this:

cpp_generics

The ETH people stressed over and over again that this kind of "macro generics" was not "true generics", and that their Oberon language had more "compiled generics".

However this technique allows a lot of flexibility when you are writing generics: there is theoretically no need to impose any limitation on the operations performed on generic types. By the time the code reaches the compiler, the macro processor has replaced the generic types, so the compiler performs its usual type checks on generic-free code.

4.3 - The C# implementation

As we explained in our Delphi .Net generics tutorial, C# uses constraints. By putting constraints on what T can or cannot do, the compiler can check whether the generic code respects those constraints.

When some class specializes a generic class, additional tests check that the actual type meets the generic ancestor's constraints.

As we wrote in our tutorial,

generics are implemented at the .Net intermediate language level (IL: Intermediate Language= C# pseudo code) and the CLR level (Common Language Runtime: the library managing the code, containing the IL-to-native compiler, the type checker, the loader, the memory manager etc).
The intermediate language contains :

the parameterized types, along with the standard types

markers for the type arguments

information about generics, included in the IL metadata

When the intermediate code is compiled into binary code (x86 assembler)

when the code defines type arguments, the metadata is used to update the generic metadata with the argument metadata

the JIT compiler can then perform its type checking

if the type argument is a value type (Integer, Double etc), the parameters are replaced with the actual type, and the corresponding code is generated. Therefore, there is no boxing / unboxing for those actual types. In addition, if the type is used in some other places, the compiler uses a reference pointing to the compiled code

if the type argument is a reference type (classes, arrays, lists etc), the type parameter is replaced by tObject. The native code uses a reference pointing to the object, and this without any casting.

This can be represented like this:

c#_generics

4.4 - The Delphi 2009 Generics

Delphi 2009 follows the C# constraint technique.

So in Delphi, the compiler

compiles the generic code into units
for the units deriving new classes from generic classes by providing the actual type, reads the generic units back and generates the binary

This can be represented as follows:

delphi_win32

The big difference is that in Delphi 2009, the compiler alone is involved in generating the code, whereas the C# system uses 2 steps:

the compiler which generates the IL
the JIT compiler which transforms this in binary code.

4.5 - With or Without Constraints ?

In C++ there is no need for constraints since the compiler has everything to check the code.

In C#, since there are 2 compilers, they had the choice

either you use no constraints, and write code with any operator, say addition, and no checks. The JIT then flags the inconsistencies
or you want earlier checks, and use constraints to perform them.

In Delphi there was the same choice :

no constraints on the generic code, and the compilation of the units with the actual type will ferret out the errors. This check is performed only during the compilation of the .EXE, but the compiler can still display the incorrect line. The only difference with the current Delphi 2009 "with constraints" choice is that no checks are performed on the type when compiling the generic Units (our red circle)
or with constraints, and the compiler can reject mistakes even while compiling the generic code

Craig STUNTZ clearly prefers the constraints technique, explaining that

the errors are flagged earlier. So this will be a big help to the developer who writes the generic Unit
even more compelling, if there is no constraint, when the code with the actual type triggers an error, there is a temptation to change the generic code to fix this problem, and maybe thereby invalidate other Units using the same generic ancestor. In addition the user of the generic library would have to delve into the generic code, which he might not like to do.

4.6 - Security vs Flexibility

We have had this discussion for a long time, and it is going to become even more important with the coming features.

On the security side:

if you can spot some error, the compiler can too. And, as Niklaus WIRTH emphasized, it must: stop right there, tell you about it, and not move on until you correct it. "The compiler will shoot into the foot" (the quote from the joke Shoot yourself in the foot)
the general idea being that a mistake caught by the compiler will cost 100 times less than taking a plane to the customer's premises to understand and fix the bug, not to mention inviting him to a nice restaurant to try to forget about the whole incident.

On the flexibility side:

writing the generic code without having to bother about which Interface has the required operation is obviously both quicker and more readable
years ago, I remember people mentioning Python as a good language which did not require Types, while still guaranteeing type safety

the same goes for Type Inference (from the Delphi .Net draft documentation by Yooichi TAGAWA, with our identifier notation style):

Type t_my_procedure<Y> = Procedure(p_1, p_2: Y) Of Object;

     c_my_class = Class
                    Procedure my_procedure<T>(p_a, p_b: T);
                    Procedure test;
                  end; // c_my_class

Procedure c_my_class.my_procedure<T>(p_a, p_b: T);
  begin
    Write(p_a.ToString, p_b.ToString);
  end; // my_procedure

Procedure c_my_class.test;
  begin
    my_procedure<String>('Hello', 'World');
    my_procedure('Hello', 'World');

    my_procedure<Integer>(10, 20);
    my_procedure(10, 20);
  end; // test

// ooo

var l_my_procedure: t_my_Procedure<Integer>;

l_my_procedure := my_procedure<Integer>;
l_my_procedure(40, 50);

It certainly is more readable to avoid typing <Integer> or <String> before each call of a generic method.

On the one hand we want to be as expressive as possible, and the code should present what we want to achieve in the most natural way; on the other, we are happy that the compiler catches inconsistencies as soon and as thoroughly as possible.

5 - Your comments are welcome

we welcome any comment, criticism, enhancement, other sources or reference suggestion. Just send an e-mail to fcolibri@felix-colibri.com.

6 - References

Just a couple of links:

Craig STUNTZ: D2009 Generics and Type Constraints (2008 08 29), where he presents the Delphi version of the classical generic calculator, using a generic tAdder Class
Craig STUNTZ: Building a Generic Statistics Library - part 1 to 5, which is a very concrete presentation of higher order functions.
Shoot yourself in the foot
Delphi .Net generics tutorial which I wrote when Rad Studio 2007 (Delphi .Net) came out

And don't forget to watch Craig's presentation at CodeRage III: Tuesday, December 2 at 8:45 PST. He will present a more general point of view about "Functional Programming in Delphi 2009".

We all see the writing on the wall: functional programming is the obvious way to go. For one main reason: mathematics. In the same way that relational databases did overcome the other (hierarchical, navigational) database models (because of Codd's mathematical insight), the functional programming model will allow us to perform, some day, program validation and verification. No longer fiddling to check whether it works, but quietly moving from concept to implementation.

7 - The author

Felix John COLIBRI works at the Pascal Institute. Starting with Pascal in 1979, he then became involved with Object Oriented Programming, Delphi, Sql, Tcp/Ip, Html, UML. Currently, he is mainly active in the area of custom software development (new projects, maintenance, audits, BDE migration, Delphi Xe_n migrations, refactoring), Delphi Consulting and Delphi training. His web site features tutorials, technical papers about programming with full downloadable source code, and the description and calendar of forthcoming Delphi, FireBird, Tcp/IP, Web Services, OOP / UML, Design Patterns, Unit Testing training sessions.

Created: oct-07. Last updated: jul-15 - 98 articles, 131 .ZIP sources, 1012 figures
Copyright © Felix J. Colibri http://www.felix-colibri.com 2004 - 2015. All rights reserved