Home  ==>  papers  ==>  colibri_utilities  ==>  delphi_string_filter   

Delphi String Filter - Felix John COLIBRI.

  • abstract : a filter used for selecting pathes and file names based on a simple AST interpreter of a simple boolean string expression: "sale export OR 2018_01" accepts "2018_02_sales_exports.pdf" and rejects "2018_02_production.pdf"
  • key words : string filter, parsing, Abstract Syntax Tree, interpreter
  • software used : Windows XP Home, Delphi 6
  • hardware used : Pentium 2.800Mhz, 512 M memory, 140 G hard disc
  • scope : Delphi 1 to 2006, Turbo Delphi for Windows, Kylix
    Delphi 5, 6, 7, 8 Delphi 2005, 2006, Delphi XE, Rad Studio
  • level : Delphi developer
  • plan :

1 - The String Filter purpose

The string filter implements a simple string filter allowing to accept or reject some string base on some substring content.

It was designed to filter pathes and directories. For instance, we can request

export sales OR international contracts OR software -2017


    (*export* AND *sales*)
    (*international* AND *contract*)
    (*software AND NOT *-2017*)

and the following files would qualify


and those would not


2 - The String Filter implementation

2.1 - The Filter Syntax

The syntax is the following
  • the tokens are linked by implicit ANDs, the ORs have to be explicit
  • case insensitive
  • no "word only" filter
  • no string delimiter, therefore the token cannot include spaces
Those choices are justified because the filter was created to analyze pathes and file names with minimal typing

2.2 - The string filter evaluation

In order to implement our filter, we use a simple AST interpreter:
  • a binary tree is built where
    • the terminal nodes are the litteral values
    • the non terminal nodes are the operators, here AND, OR, NOT
  • some string is entered.
  • we first build a table containing as many entries as non-terminals
  • we evaluate the value of the filter by walking in the binary tree
Let's take an example:
  • our test expression is

    aaa bbb OR ccc ddd

  • we build a table of the terminal values


  • we built the binary AST tree:


  • we enter to following string

    uuu ddd fff bbb gggg ccc

  • the array of the token contained in this string is

    aaa False
    bbb True
    ccc True
    ddd True

  • and the evaluation of the tree is


To recap, the steps are the following
  • the filter string is entered
  • the tokens are saved in a token string array
  • a simple parser builds the AST tree
Then, for each string to be evaluated
  • we set the value of each token in the token array to True or False if the token is present in the string
  • we call the evaluate function which will return whether the expression is true or not

2.3 - The Delphi AST tree

Basically we have to build a simple interpreter for a boolean expression. The BNF (IEBNF in our case: Indented Extended Backus Naur grammar notation) is:

expression= term { OR term }
    term= factor { AND factor }
        factor= NOT factor | STRING

2.4 - The Delphi implementation

The litteral values are saved in a variable array:

Type t_variable=

           m_variable_arrayArray Of t_variable;

           Constructor create_variable_array(p_nameString);
           Function f_add_variable(p_variable_nameString): Integer;
           Function f_index_of(p_variable_nameString): Integer;
           Procedure reset_values;
           Procedure initialize_contains(p_stringString);
           Procedure display_variable_array;
         End// c_variable_array

Then we define the Ast Nodes:

Type c_node=
           Constructor create_node(p_nameStringp_c_variable_array_refc_variable_array);
           Function f_evaluate_nodeBooleanVirtualAbstract;
           Procedure display_nodeVirtualAbstract;
         End// c_node
           // -- m_name : the litteral of the test
           m_litteral_valueString// alias of m_name
           // -- the index in the variable array of the string to filter
           Constructor create_value_node(p_nameString;
           Function f_evaluate_nodeBooleanOverride;
           Procedure display_nodeOverride;
         End// c_value_node

           Constructor create_not_node(p_nameStringp_c_variable_array_refc_variable_array;
           Function f_evaluate_nodeBooleanOverride;
           Procedure display_nodeOverride;
           Destructor DestroyOverride;
         End// c_not_node

           Constructor create_binary_node(p_nameStringp_c_variable_array_refc_variable_array;
           Procedure display_nodeOverride;
           Destructor DestroyOverride;
         End// c_binary_node

The string filter contains both the variables and the root of the AST tree:

Type c_string_filter=

           Constructor create_string_filter(p_nameString);

           Procedure build_ast(p_filterString);
           Procedure display_ast;
           Function f_filter_string(p_stringString): Boolean;

           Destructor DestroyOverride;
         End// c_string_filter

The AST building is quite standard, aside from the read_symbol which is more intricate to account for the implicit AND :

Procedure c_string_filter.build_ast(p_filterString);
    // -- "aaa bbb OR ccc OR ddd eee -fff
  Type t_symbol_type= (e_unknown_symbol,
  Var l_indexl_lengthinteger;


  Function _f_symbol_typeString;
      Case l_symbol_type Of
        e_unknown_symbol : Result:= 'unk';
        e_AND_symbol : Result:= 'AND';
        e_OR_symbol : Result:= 'OR ';
        e_litteral_symbolResult:= 'str';
        e_end_symbol : Result:= 'end';
    End// _f_symbol_type

  Procedure _read_symbol;
    Var l_stringString;
      If l_is_term_start
        Then Begin
            If l_index>= l_length
              Then Begin
                  l_symbol_type:= e_end_symbol;
                  l_symbol_string:= 'END';
            l_symbol_string:= f_string_extract_non_blank(p_filterl_index);
            l_symbol_type:= e_litteral_symbol;
            l_is_term_start:= False;
            l_read_next:= True;
        Else Begin
            If l_read_next
              Then Begin
                  If l_index>= l_length
                    Then Begin
                        l_symbol_type:= e_end_symbol;
                        l_symbol_string:= 'END';
                  l_symbol_string:= f_string_extract_non_blank(p_filterl_index);

                  If SameText(l_symbol_string'OR')
                    Then Begin
                        l_is_term_start:= True;
                        l_symbol_type:= e_OR_symbol;
                        l_read_next:= True;
                    Else Begin
                        l_symbol_type:= e_AND_symbol;
                        l_read_next:= False;
              Else Begin
                  l_symbol_type:= e_litteral_symbol  ;
                  l_read_next:= true;
    End// f__read_symbol

  Procedure _display_error(p_errorString);
      display_bug_stop('string_filter 'p_error);
    End// _display_error

  Function f_c_termc_node;

    Function f_c_factorc_node;
      Var l_variable_indexInteger;
        Case l_symbol_type Of
                If l_symbol_string[1]= '-'
                  Then Begin
                      System.Delete(l_symbol_string, 1, 1);
                      l_variable_index:= m_c_variable_array.f_add_variable(l_symbol_string);
                      Result:= c_not_node.create_not_node('not'm_c_variable_array,
                  Else Begin
                      l_variable_index:= m_c_variable_array.f_add_variable(l_symbol_string);
                      Result:= c_value_node.create_value_node(l_symbol_string,
          Else _display_error('string_litteral'); // NUMBER, NAME, (
        End// case
      End// f_c_factor

    Begin // f_c_term
      Result:= f_c_factor;

      While l_symbol_typee_AND_symbol Do
        Result:= c_and_node.create_binary_node('and'NilResultf_c_factor);
      End// WHILE {
    End// f_c_term

  Begin // build_ast
    l_index:= 1; l_length:= Length(p_filter);
    l_is_term_start:= True;

    m_c_root_node:= f_c_term;

    While l_symbol_typee_OR_symbol Do
      m_c_root_node:= c_or_node.create_binary_node('or'm_c_variable_array,
  End// build_ast

Please note that the Pascal structure of the parser strictly parallels the IEBNF (expression, termn factor), which usually is the case for recursive descent parsers.

For each candidate string , we call the f_filter_string function:

Function c_string_filter.f_filter_string(p_stringString): Boolean;
  Var l_indexInteger;
    With m_c_variable_array Do
      // display_variable_array;

    Result:= m_c_root_node.f_evaluate_node;
  End// f_filter_string


  • reset_values reinitalizes the value of the variable array:

    Procedure c_variable_array.reset_values;
      Var l_variable_indexInteger;
        For l_variable_index:= 0 To m_variable_count- 1 Do
          m_variable_array[l_variable_index].m_contains:= False;
      End// reset_values

  • we evaluate the value of each terminal of the candidate string:

    Procedure c_variable_array.initialize_contains(p_stringString);
        // -- check if the string contains the litteral values
      Var l_variable_indexInteger;
        l_lowercase_string:= LowerCase(p_string);
        For l_variable_index:= 0 To m_variable_count- 1 Do
          With m_variable_array[l_variable_indexDo
            m_contains:= Pos(m_variable_namel_lowercase_string)> 0;
      End// initialize_contains

  • finally we visit each tree of the AST node, evaluating the value of the node for this candidate string. We call the root node f_evaluate, and each node calls its own evaluation function:

    Function c_value_node.f_evaluate_nodeBoolean;
        Result:= m_c_variable_array_ref.m_variable_array[m_variable_index].m_contains;
      End// f_evaluate_node

    Function c_not_node.f_evaluate_nodeBoolean;
        Result:= Not m_c_not_node.f_evaluate_node;
      End// f_evaluate_node

    Function c_or_node.f_evaluate_nodeBoolean;
        Result:= m_c_left_node.f_evaluate_node Or m_c_right_node.f_evaluate_node
      End// evaluate_node

    Function c_and_node.f_evaluate_nodeBoolean;
        Result:= m_c_left_node.f_evaluate_node And m_c_right_node.f_evaluate_node;
      End// evaluate_node

2.5 - sample program

Wd want to test different filter on a list of candidate strings.

Our sample program will

  • use a tListBox which will contain the candidate string
  • have second tListBox with the different filters. Clicking on one filter will
    • build the variable array
    • build the AST tree
    • for each candidate string, evaluate the value of the filter

Here is the test procedure

Procedure TForm1.filter_listbox_Click(SenderTObject);
  Var l_c_string_filterc_string_filter;

  Procedure _build_filter;
    Var l_filterString;
      With filter_listbox_ Do
        l_filter:= Items[ItemIndex];

      display('----- 'l_filter);
      With l_c_string_filter Do
    End// _build_filter

  Procedure _evaluate_strings;
    Var l_list_indexInteger;
      display('----- accepted strings');
      With ListBox1 Do
        For l_list_index:= 0 To COunt- 1 Do
          l_string:= Items[l_list_index];
          With l_c_string_filter Do
            If f_filter_string(l_string)
              Then display('  OK 'l_string)
    End// _evaluate_strings

  Begin // filter_listbox_Click
    l_c_string_filter:= c_string_filter.create_string_filter('');
  End// filter_listbox_Click

and the result is:


3 - Download the Sources

Here are the source code files: The .ZIP file(s) contain:
  • the main program (.DPR, .DOF, .RES), the main form (.PAS, .DFM), and any other auxiliary form
  • any .TXT for parameters, samples, test data
  • all units (.PAS) for units
Those .ZIP
  • are self-contained: you will not need any other product (unless expressly mentioned).
  • for Delphi 6 projects, can be used from any folder (the pathes are RELATIVE)
  • will not modify your PC in any way beyond the path where you placed the .ZIP (no registry changes, no path creation etc).
To use the .ZIP:
  • create or select any folder of your choice
  • unzip the downloaded file
  • using Delphi, compile and execute
To remove the .ZIP simply delete the folder.

The Pascal code uses the Alsacian notation, which prefixes identifier by program area: K_onstant, T_ype, G_lobal, L_ocal, P_arametre, F_unction, C_lass etc. This notation is presented in the Alsacian Notation paper.

As usual:

  • please tell us at if you found some errors, mistakes, bugs, broken links or had some problem downloading the file. Resulting corrections will be helpful for other readers
  • we welcome any comment, criticism, enhancement, other sources or reference suggestion. Just send an e-mail to
  • or more simply, enter your (anonymous or with your e-mail if you want an answer) comments below and clic the "send" button
    Name :
    E-mail :
    Comments * :

  • and if you liked this article, talk about this site to your fellow developpers, add a link to your links page ou mention our articles in your blog or newsgroup posts when relevant. That's the way we operate: the more traffic and Google references we get, the more articles we will write.

4 - References

We published another paper about expression evaluations:

5 - The author

Felix John COLIBRI works at the Pascal Institute. Starting with Pascal in 1979, he then became involved with Object Oriented Programming, Delphi, Sql, Tcp/Ip, Html, UML. Currently, he is mainly active in the area of custom software development (new projects, maintenance, audits, BDE migration, Delphi Xe_n migrations, refactoring), Delphi Consulting and Delph training. His web site features tutorials, technical papers about programming with full downloadable source code, and the description and calendar of forthcoming Delphi, FireBird, Tcp/IP, Web Services, OOP  /  UML, Design Patterns, Unit Testing training sessions.
Created: feb-18. Last updated: feb-2018 - 96 articles, 231 .ZIP sources, 1200 figures
Copyright © Felix J. Colibri 2004 - 2018. All rigths reserved
Back:    Home  Papers  Training  Delphi developments  Links  Download
the Pascal Institute


+ Home
  + articles_with_sources
    + database
    + web_internet_sockets
    + oop_components
    + uml_design_patterns
    + debug_and_test
    + graphic
    + controls
    + colibri_utilities
      – delphi_net_bdsproj
      – dccil_bat_generator
      – coliget_search_engine
      – dfm_parser
      – dfm_binary_to_text
      – component_to_code
      – exe_dll_pe_explorer
      – dll_process_viewer
      – the_alsacian_notation
      – html_help_viewer
      – cooking_the_code
      – events_record_playback
      – colibri_hiragana_quiz
      – delphi_string_filter
    + colibri_helpers
    + delphi
    + firemonkey
    + compilers
    + vcl
  + delphi_training
  + delphi_developments
  + sweet_home
  – download_zip_sources
  + links
Site Map
– search :

RSS feed