Why Generics Constraints ? - Felix John COLIBRI.




1 - Craig STUNTZ vs Sergey ANTONOV

Craig STUNTZ published a series of blog posts about generics. The comments were even more interesting: there was a heated debate, led by Sergey ANTONOV, about whether constraints were required at all for Delphi Generics. The main argument was that the compiler has all the information at hand and could, before generating binary code, decide without any constraint whether "a+b" is legal or not.

Here is our interpretation of those explanations.




2 - Generics Constraints Crash Course

Generics are used in two steps
  • a first piece of code defines some algorithms (say a container like a tList, or a dictionary with key-value pairs), like this

    type c_generic_stack<T>= class
                               m_generic_array: Array of T;
                               m_top_of_stack: Integer;

                               constructor create_generic_stack(p_length: Integer);
                               procedure push(p_gen: T);
                               function f_pop: T;
                             end; // c_generic_stack

    // ooo

    procedure c_generic_stack<T>.push(p_gen: T);
      begin
        if m_top_of_stack< Length(m_generic_array)
          then begin
              m_generic_array[m_top_of_stack]:= p_gen;
              Inc(m_top_of_stack);
            end;
      end; // push

  • in order to use this algorithm (push a value on a stack in our case), the "true" code must define the actual type which will be used in place of the generic T

    var l_c_integer_stack: c_generic_stack<Integer>;

      l_c_integer_stack:= c_generic_stack<Integer>.create_generic_stack(5);
      l_c_integer_stack.push(111);

    var l_c_string_stack: c_generic_stack<string>;

      l_c_string_stack:= c_generic_stack<string>.create_generic_stack(5);
      l_c_string_stack.push('abc');
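
For readers who want to run the fragments above, here is a minimal console sketch that assembles them into one program. The constructor and f_pop bodies are elided in the listing, so the versions below are our assumptions (in particular, popping an empty stack returns Default(T)), not necessarily the original code.

```pascal
program generic_stack_sketch;
{$APPTYPE CONSOLE}

type c_generic_stack<T>= class
                           m_generic_array: array of T;
                           m_top_of_stack: Integer;

                           constructor create_generic_stack(p_length: Integer);
                           procedure push(p_gen: T);
                           function f_pop: T;
                         end; // c_generic_stack

constructor c_generic_stack<T>.create_generic_stack(p_length: Integer);
  begin
    inherited Create;
    SetLength(m_generic_array, p_length);
    m_top_of_stack:= 0;
  end; // create_generic_stack

procedure c_generic_stack<T>.push(p_gen: T);
  begin
    // a push on a full stack is silently ignored, as in the fragment above
    if m_top_of_stack< Length(m_generic_array)
      then begin
          m_generic_array[m_top_of_stack]:= p_gen;
          Inc(m_top_of_stack);
        end;
  end; // push

function c_generic_stack<T>.f_pop: T;
  begin
    // assumption: popping an empty stack returns the default value of T
    Result:= Default(T);
    if m_top_of_stack> 0
      then begin
          Dec(m_top_of_stack);
          Result:= m_generic_array[m_top_of_stack];
        end;
  end; // f_pop

var g_c_integer_stack: c_generic_stack<Integer>;
    g_c_string_stack: c_generic_stack<string>;

begin
  // same generic code, specialized with Integer ...
  g_c_integer_stack:= c_generic_stack<Integer>.create_generic_stack(5);
  g_c_integer_stack.push(111);
  g_c_integer_stack.push(222);
  writeln(g_c_integer_stack.f_pop);   // 222
  g_c_integer_stack.Free;

  // ... and with string
  g_c_string_stack:= c_generic_stack<string>.create_generic_stack(5);
  g_c_string_stack.push('abc');
  writeln(g_c_string_stack.f_pop);    // abc
  g_c_string_stack.Free;
end.
```

Nothing in push or f_pop performs an operation on the T values themselves: they are only stored and copied, which is why no constraint is needed here.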




In the previous example we handled the T values as opaque items, simply pushing them on the stack. Very often we would like to perform some operations on T values, like comparisons or arithmetic. Let's assume we want to locate a cell in an array: we must be able to compare the cell values with some target value. To do this (Delphi for .Net):
  • we look for an Interface which defines equality. This is the iEquatable Interface which has an Equals(T) Function

    In the generic Class, we constrain the generic type T to implement this Interface, and can then use the Equals function on all T values:

    type c_find_array<T_data: iEquatable<T_data> >=
             class
               m_array: Array of T_data;
               m_count: Integer;

               constructor create_find_array(p_length: Integer);
               procedure add_to_array(p_cell: T_data);
               function f_index_of(p_cell: T_data): Integer;
             end; // c_find_array

    // ooo

    function c_find_array<T_data>.f_index_of(p_cell: T_data): Integer;
      var l_index: Integer;
      begin
        Result:= -1;

        for l_index:= 0 to m_count- 1 do
        begin
          if m_array[l_index].Equals(p_cell)
            then begin
                Result:= l_index;
                Break;
              end;
        end; // for l_index
      end; // f_index_of

  • when we want to use the generic array, the actual type must implement the iEquatable Interface. It turns out that in .Net, all the usual basic types are equatable: Integer, Double, String etc. So we could write

    var g_c_integer_array: c_find_array<Integer>;

      g_c_integer_array:= c_find_array<Integer>.create_find_array(10);

      g_c_integer_array.add_to_array(111);
      g_c_integer_array.add_to_array(222);

      writeln('index of 222 : ', g_c_integer_array.f_index_of(222));
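
In Delphi 2009 for Win32, Integer is not an object and implements no Interface, so the iEquatable constraint above cannot be met by the basic types. One possible alternative, which is a different technique from the constraint shown above, is to leave T_data unconstrained and ask the Generics.Defaults unit for a default equality comparer. Here is a minimal sketch under that assumption; the m_i_comparer field and the constructor and add_to_array bodies are ours, introduced for illustration.

```pascal
program find_array_comparer_sketch;
{$APPTYPE CONSOLE}

uses Generics.Defaults;

type c_find_array<T_data>= class
                             m_array: array of T_data;
                             m_count: Integer;
                             // hypothetical field: holds the equality test for T_data
                             m_i_comparer: IEqualityComparer<T_data>;

                             constructor create_find_array(p_length: Integer);
                             procedure add_to_array(p_cell: T_data);
                             function f_index_of(p_cell: T_data): Integer;
                           end; // c_find_array

constructor c_find_array<T_data>.create_find_array(p_length: Integer);
  begin
    inherited Create;
    SetLength(m_array, p_length);
    m_count:= 0;
    // the library picks a comparer suited to the actual type
    m_i_comparer:= TEqualityComparer<T_data>.Default;
  end; // create_find_array

procedure c_find_array<T_data>.add_to_array(p_cell: T_data);
  begin
    // assumption: cells added to a full array are ignored
    if m_count< Length(m_array)
      then begin
          m_array[m_count]:= p_cell;
          Inc(m_count);
        end;
  end; // add_to_array

function c_find_array<T_data>.f_index_of(p_cell: T_data): Integer;
  var l_index: Integer;
  begin
    Result:= -1;

    for l_index:= 0 to m_count- 1 do
      if m_i_comparer.Equals(m_array[l_index], p_cell)
        then begin
            Result:= l_index;
            Break;
          end;
  end; // f_index_of

var g_c_integer_array: c_find_array<Integer>;

begin
  g_c_integer_array:= c_find_array<Integer>.create_find_array(10);

  g_c_integer_array.add_to_array(111);
  g_c_integer_array.add_to_array(222);

  writeln('index of 222 : ', g_c_integer_array.f_index_of(222));   // index of 222 : 1
  writeln('index of 333 : ', g_c_integer_array.f_index_of(333));   // index of 333 : -1

  g_c_integer_array.Free;
end.
```

The equality test is no longer a property that the compiler checks on T_data: it is an object fetched at run time. This is the kind of trade-off the constraint debate is about.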




The discussion was whether this constraint technique was necessary or not in Delphi 2009.

To decide this, the key points are:

  • how does a compiler translate the generic code into binary code
  • can the compiler generate the binary without using constraints
  • if it can, should it ?


3 - From Interpreter to Compiler

Since the discussion revolved around how C++, C# and Delphi Win32 generate the code, let's quickly present the alternatives.

Basically the programmer types source code, and the customer runs binary code. To translate from one to the other, here are some of the possible techniques:

[Figure: interpreter_vs_compiler]

In this figure

  • 3+'4' represents some operation (could be total:= amount+ taxes, or an expression with literal values, etc)
  • push_i 3 is the translation in some pseudo-code (translation in the simple, standardized instruction set of a virtual processor)
  • the virtual code is interpreted or translated into binary code of some concrete processor
  • MOV AX, 3 is the representation of this binary code. We represented the "assembler" side, the real stuff being a sequence of bytes ($A9 etc). This is the only thing that the processor will ever understand. Everything before, including our source code, is an intermediate step
And
  • green arrows represent the handling on the programmer's side
  • blue arrows, the handling on the customer's side
  • red bars mark the errors detected by the compiler, fuchsia arrows the errors detected at run time


To sum up some of the routes:
  • with a pure Basic interpreter
    • the programmer only writes the source code
    • on the user PC, the interpreter analyzes and runs the code. Errors (usually all called "syntax error") are detected there
  • with the Apple ][ UCSD system
    • the source code is translated into some "pseudo processor code", and all type checking is performed at this time.
    • the interpreter runs this code (by calling routines which translate each P-code instruction into binary), and run-time errors are flagged there
  • with the C# environment
    • on the developer's site:
      • the source code is translated into MSIL (Microsoft Intermediate Language), and all type checking is performed at this time.
      • when the programmer tests his code, the Just In Time Compiler translates this Byte Code (another name for P-code) into binary
    • the MSIL is deployed. On the customer's PC the Just In Time Compiler translates it into binary in the same way, and run-time errors are caught then
  • with a source-to-binary compiler, like Turbo Pascal, Delphi and most C / C++ compilers
    • the source code is translated into binary and all type checking is performed at this time (except for C compilers which do not check anything, since any programmer writing code in C is by definition a system programmer, and we all know those never make any error, so therefore there's nothing to check anyway).
    • this binary is deployed, and run-time errors are caught then


Also notice that 3+'4' could be accepted by a Basic interpreter (which would automatically convert '4' into 4), whereas the other compilers would complain about adding a numeric and a string.




4 - Compiling Generics - Generics Implementation

4.1 - The Goal

When we use generics, there is an additional step in the loop:
  • the developer writes two pieces of code
    • the generic code, telling about, say, adding a value of some generic type A, and another value of type B (noted A+B, although in the code you add "values of type A" to "values of type B"). The concrete types A and B are not specified at this stage
    • the "specialization code", where he specifies what the types A and B actually are: Integer, Double, tDateTime, Complex number or whatever
  • at the customer end of the transformation, as usual, only binary can be used. So the transformation must convert the addition into a processor ADD instruction when A is Integer, but into a call to the FPU when A is Double


Somewhere the generic code and the actual types must be mixed together to generate the binary code. Here are 3 ways to do it.



4.2 - C++ templates

In C++, the generic classes are just some kind of templates.

When the compiler finds some actual type, it reads the blueprint of the algorithm in the template, and compiles this template, replacing the generic type (A) with the actual type (Integer).



This can be represented like this:

[Figure: cpp_generics]

The ETH people stressed over and over again that this kind of "macro generics" was not "true generics", and that their Oberon language had more "compiled generics".

However this technique allows a lot of flexibility when you are writing generics. There is theoretically no need to impose any limitation on the operations performed on generic types: by the time the code reaches the compiler, the macro processor has replaced them, producing generic-free code, and the compiler performs its usual type checks on this code.



4.3 - The C# implementation

As we explained in our Delphi .Net generics tutorial, C# uses constraints. By putting constraints on what T can or cannot do, the compiler can check whether the generic code does respect those constraints.

When some class specializes a generic class, additional tests check that the actual type meets the generic ancestor's constraints.

As we wrote in our tutorial,

generics are implemented at the .Net intermediate language level (IL: Intermediate Language= C# pseudo code) and the CLR level (Common Language Runtime: the library managing the code, containing the IL-to-native compiler, the type checker, the loader, the memory manager etc).

The intermediate language contains :

  • the parameterized types, along with the standard types
  • markers for the type arguments
  • information about generics included in the IL metadata
When the intermediate code is compiled into binary code (x86 assembler)
  • when the code defines type arguments, the metadata is used to update the generic metadata with the argument metadata
  • the JIT compiler can then perform its type checking
  • if the type argument is a value type (Integer, Double etc), the parameters are replaced with the actual type, and the corresponding code is generated. Therefore, there is no boxing / unboxing for those actual types. In addition, if the type is used in some other places, the compiler uses a reference pointing to the compiled code
  • if the type argument is a reference type (classes, arrays, lists etc), the type parameter is replaced by tObject. The native code uses a reference pointing to the object, without any casting.



This can be represented like this:

[Figure: c#_generics]



4.4 - The Delphi 2009 Generics

Delphi 2009 follows the C# constraint technique.

So in Delphi, the compiler

  • compiles the generic code into units
  • for the units deriving new classes from generic classes by providing the actual type, reads the generic units back and generates the binary


This can be represented as follows:

[Figure: delphi_win32]



The big difference is that in Delphi 2009, the compiler alone generates the code, whereas the C# system uses 2 steps:

  • the compiler which generates the IL
  • the JIT compiler which transforms this in binary code.


4.5 - With or Without Constraints ?

In C++ there is no need for constraints, since the compiler has everything it needs to check the code once the actual type is known.



In C#, since there are 2 compilers, they had the choice:

  • either you use no constraints, and write code with any operator, say addition, and no checks. The JIT then flags the inconsistencies
  • or you want earlier checks, and use constraints to perform them.


In Delphi there was the same choice:
  • no constraints on the generic code, and the compilation of the units with the actual type will ferret out the errors. These checks are then performed only during the compilation of the .EXE, but the compiler can still display the incorrect line.

    [Figure: delphi_win32_no_constraints]

    The only difference with the current Delphi 2009 "with constraints" choice is that no checks are performed on the type when compiling the generic Units (our red circle)

  • or with constraints, and the compiler can reject mistakes even while compiling the generic code
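
The "with constraints" choice can be sketched with a class constraint. The c_shape, c_square and c_area_sum names are hypothetical, introduced only for this illustration. With the constraint, the call to f_area is checked while the generic class itself is compiled. The comments mark the two places where we would expect the compiler to complain: once in the generic code if the constraint is removed, and once in the specialization code if the actual type does not meet it.

```pascal
program constraint_check_sketch;
{$APPTYPE CONSOLE}

type c_shape= class
                function f_area: Double; virtual; abstract;
              end; // c_shape

     c_square= class(c_shape)
                 m_side: Double;
                 constructor create_square(p_side: Double);
                 function f_area: Double; override;
               end; // c_square

     // the constraint tells the compiler that every T has an f_area function
     c_area_sum<T: c_shape>= class
                               m_total: Double;
                               procedure add_area_of(p_item: T);
                             end; // c_area_sum

constructor c_square.create_square(p_side: Double);
  begin
    inherited Create;
    m_side:= p_side;
  end; // create_square

function c_square.f_area: Double;
  begin
    Result:= m_side* m_side;
  end; // f_area

procedure c_area_sum<T>.add_area_of(p_item: T);
  begin
    // accepted when the generic code is compiled, thanks to the constraint.
    // With a bare <T>, this line is rejected: f_area is unknown for T
    m_total:= m_total+ p_item.f_area;
  end; // add_area_of

var g_c_square: c_square;
    g_c_area_sum: c_area_sum<c_square>;
    // rejected when uncommented: Integer does not descend from c_shape
    // g_c_wrong_sum: c_area_sum<Integer>;

begin
  g_c_square:= c_square.create_square(3);
  g_c_area_sum:= c_area_sum<c_square>.Create;

  g_c_area_sum.add_area_of(g_c_square);
  g_c_area_sum.add_area_of(g_c_square);
  writeln(g_c_area_sum.m_total:0:1);   // 18.0

  g_c_area_sum.Free;
  g_c_square.Free;
end.
```

In both cases the error is reported against the line the developer just wrote, without anybody having to open the generic Unit.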


Craig STUNTZ clearly prefers the constraints technique, explaining that
  • the errors are flagged earlier. So this will be a big help to the developer who writes the generic Unit
  • even more compelling, if there is no constraint, when the code with the actual type hits an error, there is a temptation to change the generic code to fix this problem, thereby possibly breaking other Units which use the same generic ancestor. In addition, the user of the generic library would have to delve into the generic code, which he might not want to do.


4.6 - Security vs Flexibility

We have had this discussion for a long time, and it is going to become even more important with the coming features.

On the security side:

  • if you can spot some error, so can the compiler. And, as Niklaus WIRTH emphasized, it must: stop right there, tell you about it, and not move until you correct it. "The compiler will shoot into the foot" (the quote from the joke Shoot yourself in the foot)
  • the general idea being that a mistake caught by the compiler will cost 100 times less than taking a plane to the customer's premises to understand and fix the bug, not to mention inviting him to a nice restaurant to try to forget about the whole incident.
On the flexibility side:
  • writing the generic code without having to bother about which Interface has the required operation is obviously both quicker and more readable
  • years ago, I remember people mentioning Python as a good language which did not require Types, while still guaranteeing type safety
  • the same goes for Type Inference (from the Delphi .Net draft documentation by Yooichi TAGAWA, with our identifier notation style):

    Type t_my_procedure<Y> = Procedure(p_1, p_2: Y) Of Object;

         c_my_class = Class
                        Procedure my_procedure<T>(p_a, p_b: T);
                        Procedure test;
                      end; // c_my_class

    Procedure c_my_class.my_procedure<T>(p_a, p_b: T);
      begin
        Write(p_a.ToString, p_b.ToString);
      end; // my_procedure

    Procedure c_my_class.test;
      begin
        my_procedure<String>('Hello', 'World');
        my_procedure('Hello', 'World');

        my_procedure<Integer>(10, 20);
        my_procedure(10, 20);
      end; // test

    // ooo

    var l_my_procedure: t_my_procedure<Integer>;

      l_my_procedure := my_procedure<Integer>;
      l_my_procedure(40, 50);

    It certainly is more readable not to have to type <Integer> or <String> before each call of a generic method



On one hand we want to be as expressive as possible, and the code should present what we want to achieve in the most natural way. On the other hand, we are happy that the compiler catches inconsistencies as soon and as thoroughly as possible.




5 - Your comments are welcome

  • we welcome any comment, criticism, enhancement, other sources or reference suggestion. Just send an e-mail to fcolibri@felix-colibri.com.




6 - References

Just a couple of links:

And don't forget to watch Craig's presentation at CodeRage III: Tuesday, December 2 at 8:45 PST. He will present a more general point of view about "Functional Programming in Delphi 2009".

We all see the writing on the wall: functional programming is the obvious way to go. For one main reason: mathematics. In the same way that relational databases overcame the other (hierarchical, navigational) database models (because of Codd's mathematical insight), the functional programming model will allow us to perform, some day, program validation and verification. No longer fiddling to check whether it works, but quietly moving from concept to implementation.




7 - The author

Felix John COLIBRI works at the Pascal Institute. Starting with Pascal in 1979, he then became involved with Object Oriented Programming, Delphi, Sql, Tcp/Ip, Html, UML. Currently, he is mainly active in the area of custom software development (new projects, maintenance, audits, BDE migration, Delphi Xe_n migrations, refactoring), Delphi Consulting and Delphi training. His web site features tutorials, technical papers about programming with full downloadable source code, and the description and calendar of forthcoming Delphi, FireBird, Tcp/IP, Web Services, OOP  /  UML, Design Patterns, Unit Testing training sessions.
Created: oct-07. Last updated: jul-15 - 98 articles, 131 .ZIP sources, 1012 figures
Copyright © Felix J. Colibri   http://www.felix-colibri.com 2004 - 2015. All rights reserved