1 - Introduction

When we download open source Delphi projects, it sometimes happens that not all the components are included with the project. We therefore created a utility which reconstitutes the missing components simply by analyzing the client code.

This utility examines the .Dfm and the .Pas files, finding all the class fields and methods needed to rebuild a "shadow" component which can then be included in the project.

The first step of this utility is to analyze the .Dfm. The analysis is performed by parsing the .Dfm and storing the content in a tree-like structure. The parser and the structure are presented in this paper.

2 - The Delphi Dfm Parser

2.1 - Binary and Textual .DFM

Our parser only analyzes textual .Dfm files. If the .Dfm you wish to analyze is in binary format, it must first be converted with CONVERT.EXE, a binary-to-text conversion utility located in the BIN folder of Delphi (since Delphi 1). In fact we use a utility which converts all the binary .Dfms in a path to their textual form. This is simple to do and will not be shown here.
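As a rough illustration of how the two formats could be told apart before deciding whether a conversion is needed, here is a minimal Python sketch (Python is used only for brevity; this is not the conversion utility mentioned above). It assumes that a textual .Dfm starts with the keyword "object" (or "inherited" / "inline" for inherited forms and frames, which the grammar of this paper does not cover) and that a binary form stream carries the "TPF0" signature near its start. The function name dfm_format is hypothetical.

```python
def dfm_format(data: bytes) -> str:
    """Return 'text', 'binary' or 'unknown' for the raw bytes of a .dfm file."""
    if data[:3] == b"\xef\xbb\xbf":
        # -- skip an optional UTF-8 byte order mark
        data = data[3:]
    first_word = data.lstrip()[:9].lower()
    if first_word.startswith((b"object", b"inherited", b"inline")):
        return "text"
    # -- assumption: binary form streams contain the 'TPF0' signature,
    # --   possibly after a short resource header
    if b"TPF0" in data[:256]:
        return "binary"
    return "unknown"
```

Only the files reported as "binary" would then be handed to the converter.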

2.2 - The Structure of a .DFM

A .Dfm is organized in a hierarchical way, showing each component placed on the tForm and the properties whose values are not the default values initialized by the CONSTRUCTOR.

Here is a simple Delphi Form:

If we select "View as Text" in the contextual menu of the Form Designer, we see the following .Dfm content:

object Form1: TForm1
  Left = 192
  Top = 107
  Width = 463
  Height = 224
  Cursor = crHourGlass
  Caption = 'Form1'
  Color = clBtnFace
  Font.Charset = DEFAULT_CHARSET
  Font.Color = clWindowText
  Font.Height = -11
  Font.Name = 'MS Sans Serif'
  Font.Style = []
  OldCreateOrder = False
  PixelsPerInch = 96
  TextHeight = 13
  object ListBox1: TListBox
    Left = 8
    Top = 8
    Width = 217
    Height = 73
    Color = 8454143
    ItemHeight = 13
    Items.Strings = (
      'Felix COLIBRI Delphi Training Sessions:'
      ' + UML and Design Patterns'
      ' + Interbase Applications'
      ' + Object Oriented Programming'
      ' + Client Server Database')
    TabOrder = 1
  end
  object DBGrid1: TDBGrid
    Left = 240
    Top = 8
    Width = 209
    Height = 89
    Options = [dgTitles, dgIndicator, dgColumnResize, dgColLines, dgRowLines, dgTabs, dgRowSelect, dgConfirmDelete, dgCancelOnExit]
    TabOrder = 0
    TitleFont.Charset = DEFAULT_CHARSET
    TitleFont.Color = clWindowText
    TitleFont.Height = -11
    TitleFont.Name = 'MS Sans Serif'
    TitleFont.Style = []
    Columns = <
      item
        Expanded = False
        Visible = True
      end
      item
        Expanded = False
        Visible = True
      end>
  end
  object Panel1: TPanel
    Left = 8
    Top = 104
    Width = 145
    Height = 57
    TabOrder = 2
    object CheckBox1: TCheckBox
      Left = 8
      Top = 8
      Width = 97
      Height = 17
      Caption = 'CheckBox1'
      TabOrder = 0
    end
    object CheckBox2: TCheckBox
      Left = 8
      Top = 32
      Width = 97
      Height = 17
      Caption = 'CheckBox2'
      TabOrder = 1
    end
  end
  object Image1: TImage
    Left = 240
    Top = 104
    Width = 97
    Height = 89
    Picture.Data = {
// -- removed
    }
  end
end

The coding conventions of the .Dfm file are as follows:

  • for simple types (Integer, Boolean, String, etc.), the value is simply saved after "="

    object Form1: TForm1
      Left = 192
      Cursor = crHourGlass
      Caption = 'Form1'
      Color = clBtnFace
      OldCreateOrder = False

  • for sets, the values are stored between "[" and "]"

      object DBGrid1: TDBGrid
        Options = [dgTitles, dgIndicator, dgColumnResize, dgColLines, ...]
        TitleFont.Style = []

  • for tStrings and tList, the values are between "(" and ")":

      object ListBox1: TListBox
        Items.Strings = (
          'Felix COLIBRI Delphi Training Sessions:'
          '    +  UML and Design Patterns'
          '    +  Interbase Applications'
          '    +  Object Oriented Programming'
          '    +  Client Server Database')

  • for binary values (bitmaps for glyphs etc), the hexadecimal content is saved between "{" and "}"

      object Image1: TImage
        Picture.Data = {

  • for collections, the delimiters are "<" and ">", with each item in between. Each item is enclosed between the item name ("item") and "end", and contains its property names and values

      object DBGrid1: TDBGrid
        Columns = <
          item
            Expanded = False
            Visible = True
          end
          item
            Expanded = False
            Visible = True
          end>

  • for sub properties (like Font), the name is composed of the property name and the sub-property name, separated by ".":

    object Form1: TForm1
      Font.Charset = DEFAULT_CHARSET
      Font.Color = clWindowText
      Font.Height = -11
      Font.Name = 'MS Sans Serif'
      Font.Style = []

  • components dropped on a container are nested within the container definition:

    object Form1: TForm1
      object Panel1: TPanel
        object CheckBox1: TCheckBox
          Left = 8

We did not find other cases, but did not check the Delphi Sources, which would determine all the possibilities.
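The conventions listed above can be summarized in a few lines of code. The following Python sketch (not part of the original utility; the names VALUE_KINDS, value_kind and is_sub_property are hypothetical) classifies one "Name = value" line by the delimiter that opens its value:

```python
# -- opening delimiter -> kind of value, following the conventions above
VALUE_KINDS = {
    "[": "set",
    "(": "string list",
    "{": "binary",
    "<": "collection",
}

def value_kind(line: str) -> str:
    """Classify the value of one 'Name = value' line of a textual .dfm."""
    name, separator, value = line.partition("=")
    if not separator:
        raise ValueError("not a property line: %r" % line)
    # -- anything which does not open with a delimiter is a simple value
    return VALUE_KINDS.get(value.strip()[:1], "simple")

def is_sub_property(line: str) -> bool:
    """True for names such as Font.Color (property '.' sub-property)."""
    return "." in line.partition("=")[0]
```

For instance, value_kind("Columns = <") yields "collection", and is_sub_property("Font.Height = -11") is true.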

2.3 - .Dfm Grammar

In order to parse the .Dfm content, we first built a grammar. We have been using recursive descent parsing since 1980, as explained by Niklaus WIRTH, and have automated the parsing process, like so many other programmers have done (Yacc: Yet Another Compiler-Compiler).

Let us present a simple example here. A simple arithmetic expression like:

value:= amount * ( 1+ rate / 100.0 );

can be analyzed by the following grammar:

assignment= NAME ':=' expression ';' .
expression= [ '+' | '-' ] term { ( '+' | '-' ) term }  .
term= factor { ( '*' | '/' ) factor }  .
factor= NUMBER | NAME | '(' expression ')' .

This grammar is recursive, since expression calls term, which calls factor, which calls expression. However the "entry point" is always expression, and never "factor". Therefore, a possible Pascal structure for a parser could be:

PROCEDURE parse_assignment;

  PROCEDURE parse_expression;

    PROCEDURE parse_term;

      PROCEDURE parse_factor;
        BEGIN
          CASE f_symbol_type OF
            e_string_litteral_symbol : read_symbol;
            e_IDENTIFIER_symbol : read_symbol;
            e_opening_parenthesis_symbol :
                BEGIN
                  read_symbol;
                  parse_expression;
                  IF f_symbol_type= e_closing_parenthesis_symbol
                    THEN read_symbol
                    ELSE display_error(')');
                END; // n
            ELSE display_error('NUMBER, NAME, (');
          END; // case
        END; // parse_factor

      BEGIN // parse_term
        parse_factor;

        WHILE f_symbol_type IN [e_times_symbol, e_divide_symbol] DO
          BEGIN
            IF f_symbol_type IN [e_times_symbol, e_divide_symbol]
              THEN read_symbol
              ELSE display_error('*, /');
            parse_factor;
          END; // WHILE
      END; // parse_term

    BEGIN // parse_expression
      IF f_symbol_type IN [e_plus_symbol, e_minus_symbol]
        THEN read_symbol;
      parse_term;

      WHILE f_symbol_type IN [e_plus_symbol, e_minus_symbol] DO
        BEGIN
          IF f_symbol_type IN [e_plus_symbol, e_minus_symbol]
            THEN read_symbol
            ELSE display_error('+, -');
          parse_term;
        END; // WHILE
    END; // parse_expression

  BEGIN // parse_assignment
    // -- skip the lhs
    read_symbol;
    // -- skip ":="
    read_symbol;

    parse_expression;
  END; // parse_assignment

Notice that the parse_factor procedure is nested in the parse_term procedure, which is nested in the parse_expression procedure, which is nested in the parse_assignment procedure.
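To experiment with this nesting without a Delphi compiler, the same structure can be tried in a few lines of Python. The sketch below is only an illustration under our own assumptions (a regular expression replaces the scanner, and the helper names tokenize, symbol, read_symbol and expect are hypothetical); it recognizes the assignment grammar given above, with parse_factor nested in parse_term, itself nested in parse_expression:

```python
import re

NAME = r"[A-Za-z_]\w*"
NUMBER = r"\d+(?:\.\d+)?"
TOKEN = re.compile(r"\s*(:=|%s|%s|[-+*/();])" % (NUMBER, NAME))

def tokenize(text):
    tokens, position = [], 0
    text = text.rstrip()
    while position < len(text):
        match = TOKEN.match(text, position)
        if not match:
            raise SyntaxError("unexpected character at %d" % position)
        tokens.append(match.group(1))
        position = match.end()
    return tokens

def parse_assignment(text):
    """Return True if text follows the grammar, raise SyntaxError otherwise."""
    tokens = tokenize(text) + ["<eof>"]
    index = 0

    def symbol():
        return tokens[index]

    def read_symbol():
        nonlocal index
        index += 1

    def expect(value):
        if symbol() != value:
            raise SyntaxError("expected %s, found %s" % (value, symbol()))
        read_symbol()

    def parse_expression():
        def parse_term():
            def parse_factor():
                if symbol() == "(":
                    read_symbol()
                    parse_expression()      # -- the recursion
                    expect(")")
                elif re.fullmatch("%s|%s" % (NUMBER, NAME), symbol()):
                    read_symbol()
                else:
                    raise SyntaxError("expected NUMBER, NAME, (")

            parse_factor()
            while symbol() in ("*", "/"):
                read_symbol()
                parse_factor()

        if symbol() in ("+", "-"):
            read_symbol()
        parse_term()
        while symbol() in ("+", "-"):
            read_symbol()
            parse_term()

    # -- NAME ':=' expression ';'
    if not re.fullmatch(NAME, symbol()):
        raise SyntaxError("expected NAME")
    read_symbol()
    expect(":=")
    parse_expression()
    expect(";")
    expect("<eof>")
    return True
```

Calling parse_assignment("value:= amount * ( 1+ rate / 100.0 );") accepts the sample statement, whereas a malformed statement such as "value:= amount * ;" raises a SyntaxError.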

Therefore we are using the Indented Extended Backus Naur Formalism to specify a grammar. In our case, the grammar would be:

assignment= NAME ':=' expression ';' .
  expression= [ '+' | '-' ] term { ( '+' | '-' ) term }  .
    term= factor { ( '*' | '/' ) factor }  .
      factor= NUMBER | NAME | '(' expression ')' .

The indentation has the same benefit as indentation in a programming language, or chapters in a book: it helps structure the program, the book, or, in our case, the specification. In addition we use a "folding editor" when we build the grammar (the same technique that was added to the Delphi 2005 editor: we can edit "by level").

I am fully aware that this nesting is not politically correct. In these days of object programming, we are supposed to put all the procedures in the CLASS, which yields smaller procedures and also gives others a chance to reuse them. The data is then transferred from call to call either through parameters, or by using CLASS "mailbox" attributes: the caller saves the values there, and the callee grabs them from the attributes. So finding nesting beyond level 1 is quite rare today. Wirth went down to level 8 in the P4 compiler, and if it was good enough for Wirth, it certainly is good enough for me. Therefore I am not afraid to nest whenever I can. And if you have deeply recursive programs, you will get used to nesting local data and combining value or variable parameters, which avoids declaring over-parametrized procedures only called from one place, as well as those mailbox attributes. In fact, locality is the name of the game: if you understand local variables, then you should understand the benefit of nesting procedures. For our little expression grammar, or even for the .Dfm grammar below, it does not make any difference whether you nest or not. But when the grammar increases in size (Pure Pascal: 100 lines, Delphi: 300, Sql: 500 for Interbase, beyond 3,000 for Sql 92) then structuring becomes important.

We have been using this IEBNF for a long time. And we have grammars for nearly everything: for Delphi, for C, Java, C++, for HTML, Oberon, Sql, Interbase eSql, Pdf, Zip etc. Just give us a name and we have a grammar for it. Some of my friends call me a grammar junkie. Whenever I have to spelunk into some kind of structured information, I build the grammar first, launch the parser generator and start programming from there.

In the case of the .Dfm grammar, the structure is the following:

dfm= object .
  object= OBJECT NAME ':' NAME
      property { property } { object } END .
    property= NAME [ '.' NAME ] '=' value .
      value= INTEGER | DOUBLE | TRUE | FALSE | STRING
          | NAME [ '.' NAME ]
          | '[' [ value { ',' value } ] ']'
          | '(' value  { [ '+' ] value }  ')'
          | '{' value { value } '}'
          | '<' collection_item { collection_item } '>' .
        collection_item= NAME property { property } END .

2.4 - The Scanner

The scanner is quite easy to write. The definition is the following:

 t_dfm_symbol_type= (e_unknown_symbol,




                  // -- accept as number values starting with A..Z

                  constructor create_dfm_scanner(p_name, p_file_name: String);
                  function f_initialized: Boolean;
                  function f_read_symbol: Boolean;
                  destructor Destroy; Override;
                end; // c_dfm_scanner

The main routine which isolates one symbol is (not all methods shown):

function c_dfm_scanner.f_read_symbol: Boolean;

  procedure get_identifier;
    var l_start: Integer;
    begin
      m_dfm_symbol_type:= e_identifier_symbol;
      l_start:= m_buffer_index;
      while (m_buffer_index< m_buffer_size) and (m_pt_buffer[m_buffer_index] in k_identifier) do
        Inc(m_buffer_index);

      m_symbol_string:= f_extract_string_start_end(l_start, m_buffer_index- 1);

      if m_symbol_string= 'object'
        then m_dfm_symbol_type:= e_object_symbol else
      if m_symbol_string= 'end'
        then m_dfm_symbol_type:= e_end_symbol else
      if LowerCase(m_symbol_string)= 'true'
        then m_dfm_symbol_type:= e_true_symbol else
      if LowerCase(m_symbol_string)= 'false'
        then m_dfm_symbol_type:= e_false_symbol;
    end; // get_identifier

  // -- ... the other extraction procedure here

  begin // f_read_symbol
    m_symbol_string:= '';
    m_dfm_symbol_type:= e_unknown_symbol;

    if f_end_of_text
      then begin
          Result:= False;
          m_dfm_symbol_type:= e_end_of_parse_symbol;
        end
      else begin

          if m_buffer_index< m_buffer_size
            then begin
                case m_pt_buffer[m_buffer_index] of
                  'a'..'z', 'A'..'Z' :
                      // -- inside "{ }", hexadecimal data may start with A..F
                      if m_in_brace and (m_pt_buffer[m_buffer_index] in k_hex_digits)
                        then get_number
                        else get_identifier;
                  '-', '0'..'9' : get_number;
                  ':' : get_one_operator(e_colon_symbol);
                  '.' : get_one_operator(e_point_symbol);
                  '=' : get_one_operator(e_equal_symbol);
                  ',' : get_one_operator(e_comma_symbol);
                  '+' : get_one_operator(e_plus_symbol);
                  '(' : get_one_operator(e_opening_parenthesis_symbol);
                  ')' : get_one_operator(e_closing_parenthesis_symbol);
                  '[' : get_one_operator(e_opening_bracket_symbol);
                  ']' : get_one_operator(e_closing_bracket_symbol);
                  '{' : begin
                          m_in_brace:= True;
                          get_one_operator(e_opening_brace_symbol);
                        end;
                  '}' : begin
                          m_in_brace:= False;
                          get_one_operator(e_closing_brace_symbol);
                        end;
                  '<' : get_one_operator(e_lower_symbol);
                  '>' : get_one_operator(e_greater_symbol);
                  else
                    display_bug_stop('unknown_char_in_dfm '+ m_pt_buffer[m_buffer_index]
                        + '< '+ IntToStr(Ord(m_pt_buffer[m_buffer_index])));
                end; // case
              end
            else m_dfm_symbol_type:= e_end_of_parse_symbol;

          Result:= True;
        end; // not eof
  end; // f_read_symbol
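For readers who want to try the scanning logic outside Delphi, here is a compact Python sketch of the same idea. It is not a translation of the c_dfm_scanner class: regular expressions replace the character loops, the symbol types are plain strings, and the handling of string literals (quoted text and #nn character codes) is our assumption, since that part of the scanner is not shown above. The m_in_brace trick is kept: between "{" and "}", any run of hexadecimal digits is returned as a number, even when it starts with a letter.

```python
import re

HEX_RUN = re.compile(r"[0-9A-Fa-f]+")
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")
STRING = re.compile(r"(?:'[^']*'|#\d+)+")
OPERATORS = {
    ":": "colon", ".": "point", "=": "equal", ",": "comma", "+": "plus",
    "(": "opening_parenthesis", ")": "closing_parenthesis",
    "[": "opening_bracket", "]": "closing_bracket",
    "{": "opening_brace", "}": "closing_brace",
    "<": "lower", ">": "greater",
}

def scan(text):
    """Yield (symbol_type, symbol_string) pairs for a textual .dfm."""
    position, in_brace = 0, False
    while True:
        while position < len(text) and text[position].isspace():
            position += 1
        if position >= len(text):
            yield ("end_of_parse", "")
            return
        char = text[position]
        if char in OPERATORS:
            if char in "{}":
                in_brace = char == "{"
            yield (OPERATORS[char], char)
            position += 1
            continue
        if in_brace and (match := HEX_RUN.match(text, position)):
            kind = "integer"            # -- hexadecimal data
        elif match := NUMBER.match(text, position):
            kind = "double" if "." in match.group() else "integer"
        elif match := IDENTIFIER.match(text, position):
            word = match.group()
            if word in ("object", "end"):
                kind = word
            elif word.lower() in ("true", "false"):
                kind = word.lower()
            else:
                kind = "identifier"
        elif match := STRING.match(text, position):
            kind = "string_litteral"
        else:
            raise ValueError("unknown char in dfm: %r" % char)
        yield (kind, match.group())
        position = match.end()
```

A caller simply iterates: for symbol_type, symbol_string in scan(dfm_text), until the "end_of_parse" symbol is reached.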

2.5 - The Parser

The parser is derived from our grammar. It simply checks whether the selected .Dfm conforms to our grammar:

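PROCEDURE pure_parse_dfm;

  PROCEDURE parse_object;

    PROCEDURE parse_property;

      PROCEDURE parse_value;

        PROCEDURE parse_collection_item;
          BEGIN
            read_symbol; // -- skip the item NAME
            parse_property;
            WHILE f_symbol_type= e_IDENTIFIER_symbol DO
              parse_property;
            read_symbol; // -- skip END
          END; // parse_collection_item

        BEGIN // parse_value
          CASE f_symbol_type OF
            e_INTEGER_symbol : read_symbol;
            e_DOUBLE_symbol : read_symbol;
            e_TRUE_symbol : read_symbol;
            e_FALSE_symbol : read_symbol;
            e_STRING_LITTERAL_symbol : read_symbol;
            e_IDENTIFIER_symbol :
                BEGIN
                  read_symbol;
                  IF f_symbol_type= e_point_symbol
                    THEN BEGIN
                        read_symbol;
                        read_symbol;
                      END; // IF [
                END; // n

            e_opening_bracket_symbol :
                BEGIN
                  read_symbol;
                  IF f_symbol_type IN [e_INTEGER_symbol, e_DOUBLE_symbol, e_TRUE_symbol,
                      e_FALSE_symbol, e_STRING_LITTERAL_symbol, e_IDENTIFIER_symbol]
                    THEN BEGIN
                        parse_value;
                        WHILE f_symbol_type= e_comma_symbol DO
                          BEGIN
                            read_symbol;
                            parse_value;
                          END; // WHILE {
                      END; // IF [
                  read_symbol; // -- skip "]"
                END; // n
            e_opening_parenthesis_symbol :
                BEGIN
                  read_symbol;
                  parse_value;
                  WHILE f_symbol_type IN [e_plus_symbol,
                      e_INTEGER_symbol, e_DOUBLE_symbol, e_TRUE_symbol,
                      e_FALSE_symbol, e_STRING_LITTERAL_symbol, e_IDENTIFIER_symbol] DO
                    BEGIN
                      if f_symbol_type= e_plus_symbol
                        then read_symbol;
                      parse_value;
                    END; // WHILE {
                  read_symbol; // -- skip ")"
                END; // n
            e_opening_brace_symbol :
                BEGIN
                  read_symbol;
                  parse_value;
                  WHILE f_symbol_type IN [e_INTEGER_symbol, e_DOUBLE_symbol] DO
                    parse_value;
                  read_symbol; // -- skip "}"
                END; // n
            e_lower_symbol :
                BEGIN
                  read_symbol;
                  parse_collection_item;
                  WHILE f_symbol_type= e_IDENTIFIER_symbol DO
                    parse_collection_item;
                  read_symbol; // -- skip ">"
                END; // n
            ELSE display_error('INTEGER, DOUBLE, TRUE, FALSE, STRING, NAME, [, (, {, <');
          END; // case
        END; // parse_value

      BEGIN // parse_property
        read_symbol; // -- skip the NAME
        IF f_symbol_type= e_point_symbol
          THEN BEGIN
              read_symbol;
              read_symbol;
            END;
        read_symbol; // -- skip "="
        parse_value;
      END; // parse_property

    BEGIN // parse_object
      // -- skip OBJECT NAME ":" NAME
      read_symbol; read_symbol; read_symbol; read_symbol;
      parse_property;
      WHILE f_symbol_type= e_IDENTIFIER_symbol DO
        parse_property;
      WHILE f_symbol_type= e_OBJECT_symbol DO
        parse_object;
      read_symbol; // -- skip END
    END; // parse_object

  BEGIN // pure_parse_dfm
    parse_object;
  END; // pure_parse_dfm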

Knowing that the IDE has built a correct .Dfm is quite an achievement. So the real part starts now: we have to add to the parsing routines the code that performs some useful task.

We can either embed in the parser the code which will work on the .Dfm, or build intermediate structures, which will in turn be used to operate on the .Dfm. We chose the second solution, and will now present the structure that we build.

Also note that this is quite similar to the handling of .XML files: parsing is trivial, but the difficult part is building the layers that will manipulate the file, either as callbacks, or as a tree structure.

The Data Structure

The .Dfm grammar (and the samples) show that the .Dfm is composed of

  • an object node
  • the object contains
    • a list of properties
    • a list of sub objects
So the structure is simply a class for the object, with two lists for sub objects and sub properties.

Each property has a name and a (list of) value(s).

The only choice came from the collections: we chose to represent them as a kind of object list nested inside a property:

  • the list corresponds to the "<>" part
  • the objects correspond to the items, with the NAME and END
  • each object of this object list has properties (NAME= VALUE)
This can be represented with the following diagram:

This picture has been built using our "delphi-like" picture editor (Palette, Inspector, Design surface etc.). To be able to move, resize and save the figures, we use an in-memory structure. This structure can then be used to generate the corresponding unit, with the CLASSes.

In addition to the members and methods, we can also generate some standard CLASSes, like object lists using tList or tStringList. To detect which CLASS should be generated from a List container, we can either simply check for the "_list" suffix, or use a more complex syntax with all kinds of original / replacement parameters. In our case, we have a single kind of list, with only the name of the object as a parameter, so the "_list" suffix rule is enough. Our starting skeleton is:

// 001 u_c_xxx_list
// 24 jan 2005


unit u_c_xxx_list;
  interface
    uses Classes, u_c_basic_object;

    type c_xxx= class(c_basic_object) // one "xxx"
                  // -- m_name:

                  Constructor create_xxx(p_name: String);
                  function f_display_xxx: String;
                  function f_c_self: c_xxx;
                  Destructor Destroy; Override;
                end; // c_xxx

         c_xxx_list= class(c_basic_object) // "xxx" list
                       m_c_xxx_list: tStringList;

                       Constructor create_xxx_list(p_name: String);

                       function f_xxx_count: Integer;
                       function f_c_xxx(p_xxx_index: Integer): c_xxx;
                       function f_index_of_xxx(p_xxx_name: String): Integer;
                       function f_c_find_by_xxx(p_xxx_name: String): c_xxx;
                       procedure add_xxx(p_xxx_name: String; p_c_xxx: c_xxx);
                       function f_c_add_xxx(p_xxx_name: String): c_xxx;
                       function f_c_add_unique_xxx(p_xxx_name: String): c_xxx;
                       procedure display_xxx_list;

                       Destructor Destroy; Override;
                     end; // c_xxx_list

  implementation
    uses SysUtils, u_display;

    // -- c_xxx

    Constructor c_xxx.create_xxx(p_name: String);
      begin
        Inherited create_basic_object(p_name);
      end; // create_xxx

    function c_xxx.f_display_xxx: String;
      begin
        Result:= Format('%-10s ', [m_name]);
      end; // f_display_xxx

    function c_xxx.f_c_self: c_xxx;
      begin
        Result:= Self;
      end; // f_c_self

    Destructor c_xxx.Destroy;
      begin
        Inherited;
      end; // Destroy

    // -- c_xxx_list

    Constructor c_xxx_list.create_xxx_list(p_name: String);
      begin
        Inherited create_basic_object(p_name);

        m_c_xxx_list:= tStringList.Create;
      end; // create_xxx_list

    function c_xxx_list.f_xxx_count: Integer;
      begin
        Result:= m_c_xxx_list.Count;
      end; // f_xxx_count

    function c_xxx_list.f_c_xxx(p_xxx_index: Integer): c_xxx;
      begin
        Result:= c_xxx(m_c_xxx_list.Objects[p_xxx_index]);
      end; //  f_c_xxx

    function c_xxx_list.f_index_of_xxx(p_xxx_name: String): Integer;
      begin
        Result:= m_c_xxx_list.IndexOf(p_xxx_name);
      end; // f_index_of_xxx

    function c_xxx_list.f_c_find_by_xxx(p_xxx_name: String): c_xxx;
      var l_index_of: Integer;
      begin
        l_index_of:= f_index_of_xxx(p_xxx_name);
        if l_index_of< 0
          then Result:= Nil
          else Result:= c_xxx(m_c_xxx_list.Objects[l_index_of]);
      end; // f_c_find_by_xxx

    procedure c_xxx_list.add_xxx(p_xxx_name: String; p_c_xxx: c_xxx);
      begin
        m_c_xxx_list.AddObject(p_xxx_name, p_c_xxx);
      end; // add_xxx

    function c_xxx_list.f_c_add_xxx(p_xxx_name: String): c_xxx;
      begin
        Result:= c_xxx.create_xxx(p_xxx_name);
        add_xxx(p_xxx_name, Result);
      end; // f_c_add_xxx

    function c_xxx_list.f_c_add_unique_xxx(p_xxx_name: String): c_xxx;
      var l_index_of: Integer;
      begin
        l_index_of:= f_index_of_xxx(p_xxx_name);
        if l_index_of>= 0
          then Result:= f_c_xxx(l_index_of)
          else Result:= f_c_add_xxx(p_xxx_name);
      end; // f_c_add_unique_xxx

    procedure c_xxx_list.display_xxx_list;
      var l_xxx_index: Integer;
      begin
        display(m_name+ ' '+ IntToStr(f_xxx_count));

        for l_xxx_index:= 0 to f_xxx_count- 1 do
          display(f_c_xxx(l_xxx_index).f_display_xxx);
      end; // display_xxx_list

    Destructor c_xxx_list.Destroy;
      var l_xxx_index: Integer;
      begin
        // -- the list owns its items
        for l_xxx_index:= 0 to f_xxx_count- 1 do
          f_c_xxx(l_xxx_index).Free;
        m_c_xxx_list.Free;

        Inherited;
      end; // Destroy

    begin // u_c_xxx_list
    end. // u_c_xxx_list

By using the figure's data, the skeleton list and a simple generator, we get the unit which will represent the .Dfm's structure. After adding some members (not shown in the UML diagram), we then obtain the following definition:

 c_dfm_object_list= Class; // forward

 c_dfm_property= class(c_basic_object) // one "property"
                   // -- m_name: the property name

                   // -- if "Font.Color"

                   // -- if object tree, value  "xxx= yyy.zzz" saved
                   // --   as 3 values "yyy" "." "zzz"
                   // -- modeled as an object list


                   Constructor create_dfm_property(p_name: String);

                   function f_display_dfm_property: String;
                   function f_c_self: c_dfm_property;
                   procedure add_name(p_name: String);
                   procedure add_value(p_value: String);
                   procedure add_unique_value(p_value: String);
                   procedure update_property_type(p_property_type: t_dfm_symbol_type);
                   function f_display_value_list: String;
                   function f_display_tree_value_list: String;
                   function f_display_name_list: String;

                   Destructor Destroy; Override;
                 end; // c_dfm_property

 c_dfm_property_list= class(c_basic_object) // "property" list

                        Constructor create_dfm_property_list(p_name: String);

                        function f_dfm_property_count: Integer;
                        function f_c_dfm_property(p_dfm_property_index: Integer): c_dfm_property;
                        function f_index_of(p_dfm_property_name: String): Integer;
                        function f_c_find_by_dfm_property(p_dfm_property_name: String): c_dfm_property;
                        procedure add_dfm_property(p_dfm_property_name: String; p_c_dfm_property: c_dfm_property);
                        function f_c_add_dfm_property(p_dfm_property_name: String): c_dfm_property;
                        function f_c_add_unique_dfm_property(p_dfm_property_name: String): c_dfm_property;
                        procedure display_dfm_property_list;

                        Destructor Destroy; Override;
                      end; // c_dfm_property_list

 c_dfm_object= Class; // Forward

 c_dfm_object_list= class(c_basic_object) // "dfm_object" list

                      Constructor create_dfm_object_list(p_name: String);

                      function f_dfm_object_count: Integer;
                      function f_c_dfm_object(p_dfm_object_index: Integer): c_dfm_object;
                      function f_index_of(p_dfm_object_name: String): Integer;
                      function f_c_find_by_dfm_object(p_dfm_object_name: String): c_dfm_object;
                      procedure add_dfm_object(p_dfm_object_name: String; p_c_dfm_object: c_dfm_object);
                      function f_c_add_dfm_object(p_dfm_object_name: String): c_dfm_object;
                      function f_c_add_unique_dfm_object(p_dfm_object_name: String): c_dfm_object;
                      procedure display_dfm_object_list;

                      procedure display_dfm_object_and_properties;

                      Destructor Destroy; Override;
                    end; // c_dfm_object_list

 c_dfm_object= class(c_basic_object) // one "dfm_object"
                 // -- m_name: the object name


                 Constructor create_dfm_object(p_name: String);
                 function f_display_dfm_object: String;
                 function f_c_self: c_dfm_object;
                 procedure display_dfm_object_tree;

                 // -- for comparison with scanned text
                 procedure save_to_txt(p_scanner_full_file_name, p_full_file_name: String);
                 // -- for "true" generation
                 procedure generate_indented_text(p_full_file_name: String);

                 Destructor Destroy; Override;
               end; // c_dfm_object

The body of those classes, as well as the building of the structure, are in the companion .ZIP.
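To give an idea of how such a structure can be filled while parsing, here is a short Python sketch. It is not the code of the companion .ZIP: the class and function names (DfmProperty, DfmObject, parse_dfm) are hypothetical, a regular expression stands in for the scanner, and a qualified value such as "yyy.zzz" is stored as one joined string. The organization follows the description above: an object owns a property list and a sub-object list, and a collection is stored as a list of objects nested inside a property.

```python
import re
from dataclasses import dataclass, field

@dataclass
class DfmProperty:
    name: str
    values: list = field(default_factory=list)  # simple, set, string or hex values
    items: list = field(default_factory=list)   # collection items, as DfmObject

@dataclass
class DfmObject:
    name: str
    class_name: str = ""
    properties: list = field(default_factory=list)
    objects: list = field(default_factory=list)

# -- string literal | name | number or hex run | one-character operator
TOKEN = re.compile(
    r"((?:'[^']*'|#\d+)+|[A-Za-z_]\w*|-?\d[\w.]*|[\[\]:.=,+(){}<>])")

def parse_dfm(text):
    tokens = TOKEN.findall(text) + ["<eof>"]
    position = 0

    def peek():
        return tokens[position]

    def read(expected=None):
        nonlocal position
        token = tokens[position]
        if expected not in (None, token) or (token == "<eof>" and expected != token):
            raise SyntaxError("expected %r, found %r" % (expected, token))
        position += 1
        return token

    def until(closing, skip=()):
        # -- collect the values up to the closing delimiter
        values = []
        while peek() != closing:
            token = read()
            if token not in skip:
                values.append(token)
        read(closing)
        return values

    def parse_property():
        name = read()
        if peek() == ".":
            read()
            name += "." + read()
        read("=")
        dfm_property = DfmProperty(name)
        token = read()
        if token == "[":
            dfm_property.values = until("]", skip=(",",))
        elif token == "(":
            dfm_property.values = until(")", skip=("+",))
        elif token == "{":
            dfm_property.values = until("}")
        elif token == "<":
            while peek() != ">":
                item = DfmObject(read())            # -- the item NAME
                while peek() != "end":
                    item.properties.append(parse_property())
                read("end")
                dfm_property.items.append(item)
            read(">")
        else:
            if peek() == ".":                       # -- qualified value
                read()
                token += "." + read()
            dfm_property.values = [token]
        return dfm_property

    def parse_object():
        read("object")
        dfm_object = DfmObject(read())
        read(":")
        dfm_object.class_name = read()
        while peek() not in ("object", "end"):
            dfm_object.properties.append(parse_property())
        while peek() == "object":
            dfm_object.objects.append(parse_object())
        read("end")
        return dfm_object

    root = parse_object()
    read("<eof>")
    return root
```

Once parse_dfm has returned the root object, any processing (display, modification, regeneration of the text) becomes a simple recursive walk over properties and objects.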

2.6 - Using the Parser

To use the parser:
  • select the path and click the .DFM you want to analyze
  • the parser builds the data structure
  • to visualize the data structure, select "display"
  • the project displays the indented content of the tree

2.7 - Why a Parser?

Here are a couple of examples of how we use the parser:
  • to modify some .DFMs (removing or changing some properties). For instance, one of our customers requested the use of Quick Report. To my astonishment, you cannot resize some qrdbText or qrLabel without manually repositioning the components to the right of the resized component. The parser was used to automatically recompute the component positions: we simply start from the left, and set each Left to the previous Left plus the previous Width

  • to create "stub components". Quick Report again: the components need a connection to a valid printer. The printer was on a network which was not always connected. So we built a "phoney quick report" component suite, allowing us to still work with the parts which did not depend on the report layout.
  • we also used the parser to shift from one Report generator to another. Guess who was involved once again?
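The repositioning rule of the first example can be sketched as follows in Python. This is only an illustration: the function name reflow_lefts and the dictionary layout are hypothetical, and the sketch assumes that all the components belong to the same row and must touch each other without any gap.

```python
def reflow_lefts(components):
    """Return copies of the components, ordered from left to right,
    where each Left is the previous Left plus the previous Width."""
    ordered = sorted(components, key=lambda component: component["Left"])
    result = []
    for component in ordered:
        component = dict(component)         # -- do not modify the caller's data
        if result:
            previous = result[-1]
            component["Left"] = previous["Left"] + previous["Width"]
        result.append(component)
    return result

# -- hypothetical row of report labels, after the first one was widened
row = [
    {"Name": "QRLabel1", "Left": 8, "Width": 100},
    {"Name": "QRLabel2", "Left": 60, "Width": 50},
    {"Name": "QRLabel3", "Left": 200, "Width": 30},
]
new_row = reflow_lefts(row)
```

The leftmost component keeps its position, and the new Left values would then be written back into the property lists before regenerating the .Dfm text.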

If such applications are of some interest to you, just send us a mail, and we will try to publish some of the projects using the .dfm parser.

3 - Download the Sources

Here are the source code files:
  • the project with the parser and the data structure (51 K)
The .ZIP file(s) contain:
  • the main program (.DPR, .DOF, .RES), the main form (.PAS, .DFM), and any other auxiliary form
  • any .TXT for parameters, samples, test data
  • all units (.PAS)
Those .ZIP files
  • are self-contained: you will not need any other product (unless expressly mentioned).
  • for Delphi 6 projects, can be used from any folder (the paths are RELATIVE)
  • will not modify your PC in any way beyond the path where you placed the .ZIP (no registry changes, no path creation etc).
To use the .ZIP:
  • create or select any folder of your choice
  • unzip the downloaded file
  • using Delphi, compile and execute
To remove the .ZIP simply delete the folder.

The Pascal code uses the Alsacian notation, which prefixes identifiers by program area: K_onstant, T_ype, G_lobal, L_ocal, P_arametre, F_unction, C_lass etc. This notation is presented in the Alsacian Notation paper.

4 - Conclusion

We presented in this paper a simple parser which analyzes the content of a .Dfm file, which is the starting point for many Delphi utilities.

