Article

Precedence Table Construction Algorithm for CFGs Regardless of Being OPGs

Department of Computer Science and Engineering, Universidad del Norte, Barranquilla 081007, Colombia
*
Author to whom correspondence should be addressed.
Algorithms 2024, 17(8), 345; https://doi.org/10.3390/a17080345
Submission received: 24 June 2024 / Revised: 29 July 2024 / Accepted: 30 July 2024 / Published: 7 August 2024

Abstract

Operator precedence grammars (OPGs) are context-free grammars (CFGs) characterized by the absence of two adjacent non-terminal symbols in the body (right-hand side) of each production. Operator precedence languages (OPLs) are deterministic and context-free. Three possible precedence relations between pairs of terminal symbols are established for these languages. Many CFGs are not OPGs, and operator precedence cannot be applied to them because they do not comply with this basic rule. To solve this problem, we have thoroughly redefined the Left and Right sets of terminals that are the basis for calculating the precedence relations, and we have defined a new Leftmost set. The algorithms for calculating them are described in detail. Our work’s most significant contribution is that we establish precedence relationships between terminals even when the basic rule of not having two consecutive non-terminals is violated, through an algorithm that builds the operator precedence table for a CFG regardless of whether it is an OPG. The paper presents the complexities of the proposed algorithms and possible exceptions to the proposed rules. We give examples using an OPG and two non-OPGs to illustrate the operation of the proposed algorithms. With these, the operator precedence table is built, and bottom-up parsing is carried out correctly.

1. Introduction

This paper delves into the family of operator precedence languages (OPL), created by R. Floyd [1] and used for efficient bottom-up parsing [2,3]. Floyd's key insight was inspired by the structure of arithmetic expressions, where multiplicative operators (∗ and /) take precedence over additive operators (+ and −). Through bottom-up parsing, the left side (non-terminal symbol) of a production in a context-free grammar (CFG) completely replaces the identified right side (body), leaving no room for ambiguity. Operator precedence, originally defined between the operators of an expression, was extended to grammars in which the terminals on the right side may be separated by non-terminals. A CFG is considered an operator precedence grammar (OPG) if it does not contain productions with adjacent non-terminal symbols on the right-hand side.
OPLs have greatly aided in inferring context-free grammars. An example is described in [4,5,6], where a method is presented for inferring operator precedence-free grammars from bracketed samples. The method is based on constructing parse trees and utilizing functions that produce the grammars. Precedence rules are used to identify the brackets surrounding the terminal elements of a string in the sample, indicating the order in which the string should be evaluated.
Research on input-driven formal languages (IDL) [7,8,9] (later renamed as Visibly Pushdown Languages (VPL) [10]) has concluded [11] that OP closure properties imply ID closure properties and that ID languages are a specific type of OP languages characterized by limited OP relations [12].
Since the advancements in parallel computing, local parsability properties derived from OPL have been utilized to generate fast parallel parsers [13].
This paper's main contribution is a novel approach to building precedence relationships between terminal symbols in non-OPGs. This approach allows the generation of operator precedence tables and, supported by the table obtained, bottom-up parsing of strings derived by the non-OPG, which sets this work apart from previous research in the field.
  • The sets of Left and Right terminals proposed in previous works [1,11,14] to build precedence relationships between the terminals of an OPG were redefined.
  • A new set of terminals called Leftmost was defined to support the computation of precedence relationships when there are two or more adjacent non-terminals on the right side of the productions. This set is significant because it systematically handles the precedence relationships in such complex scenarios, extending the bottom-up parsing process to non-OPGs.
  • An algorithm was established to build the precedence table with the support of the three sets of terminals proposed.
  • Finally, applying the proposed algorithms makes it possible to obtain, from a non-OPG, the precedence relations and operator precedence table that were previously limited to OPGs. In addition, a bottom-up operator-precedence parser can be used to parse any string generated by the non-OPG, a technique that previously could be applied only to OPGs.
As demonstrated in this work, the proposed construction of the operator precedence table is not just a theoretical concept but a practical tool, and it can also be applied to ordinary OPGs without any restriction. At the end of the work, the functionality is demonstrated by applying it to an OPG and two non-OPGs and carrying out the respective bottom-up parsing, confirming the effectiveness and practical applicability of the proposed solution.
The paper is organized as follows: Section 2 provides the necessary definitions to understand this work. Section 3 presents the previous work carried out in the area of CFG. Section 4 introduces our approach, detailing the problem we aim to solve and the redefinitions of previous works’ concepts used to develop our algorithms. Section 5 presents examples of applying the proposed algorithm to one OPG and two non-OPGs, demonstrating the application of bottom-up parsing with the precedence tables to test their use. Section 6 describes several exceptions where the rules described in the proposed algorithms do not apply. Finally, Section 7 provides conclusions to our work.

2. Basic Definitions

2.1. Context-Free Grammars and Languages

Within Chomsky’s hierarchy [15], context-free grammars play a significant role in programming and compilation language applications by addressing the syntactic structure of programming languages.
A context-free grammar (CFG) G generates a language L(G), commonly called a context-free language (CFL), which is composed of the strings generated by the CFG. The CFG is defined as a 4-tuple G(S) = ⟨T, N, S, P⟩, where:
  • T represents the set of terminal symbols that create the strings within the CFL generated by the CFG.
  • N represents the set of non-terminal symbols or syntactic variables that determine strings leading to CFL generation.
  • S is the initial non-terminal symbol from which all strings defined by the CFG are generated.
  • P is the set of productions or rules that group terminals and non-terminals to generate the strings. A production takes the form A ⟶ α, where A represents a non-terminal and α is a combination of terminals and non-terminals. The symbol ⟶ is pronounced “produces”; thus, the production is read “A produces alpha”.
The following notation will be used: lowercase letters from the beginning of the alphabet represent terminal symbols (a, b, c, … ∈ T); uppercase letters from the beginning of the alphabet represent non-terminal symbols (A, B, C, … ∈ N); lowercase letters late in the alphabet represent terminal strings (u, v, w, x, y, z ∈ T+); lowercase Greek letters generally represent strings of terminals and/or non-terminals (α, β, γ, … ∈ (T ∪ N)∗); and the empty string will be represented by ε (this Greek letter is the exception to the above convention).

2.2. Operator Precedence Grammars (Basic Rule)

A CFG is considered an operator grammar if none of the productions in P have adjacent non-terminal symbols on their right side. In other words, for a production A ⟶ α in G, α never takes the form γ₁ B C γ₂ with B, C ∈ N.

2.3. Derivations

A string η ∈ (T ∪ N)∗ can be derived in k − 1 steps from a non-terminal A ∈ N if there exists a sequence of strings of the form η₁, η₂, …, ηₖ. Using the symbol ⟹, meaning one-step derivation, the sequence of strings is arranged in the form η₁ ⟹ η₂ ⟹ ⋯ ⟹ ηₖ.
In this sequence, η₁ corresponds to A and ηₖ to η. Each intermediate ηᵢ in the sequence has the form δ B γ, where B ∈ N and there is a production B ⟶ β such that, by substituting β for B, ηᵢ₊₁ (= δ β γ) is obtained. Derivation in one or more steps is denoted with the operator ⟹+. In general, every string of terminal symbols w ∈ L(G) can be established as S ⟹+ w or, expanding the derivation into one or more steps, S = η₁ ⟹ η₂ ⟹ ⋯ ⟹ ηₖ = w. For a grammar G, a sentential form η ∈ (T ∪ N)∗ is a string such that S ⟹∗ η, meaning that η can be obtained with zero or more derivations from S. If two grammars generate identical languages, they are considered equivalent.

2.4. Types of Derivations

According to [16], S = η₁ ⟹ η₂ ⟹ ⋯ ⟹ ηₖ = w is a leftmost derivation of the string w ∈ T∗ when each ηᵢ, 2 ≤ i ≤ k − 1, has the form xᵢ Aᵢ βᵢ, with xᵢ ∈ T∗, Aᵢ ∈ N and βᵢ ∈ (T ∪ N)∗. Furthermore, Aᵢ ⟶ αᵢ belongs to P, and then, in each ηᵢ, the symbol Aᵢ is substituted by αᵢ, resulting in the sentential form ηᵢ₊₁. Each ηᵢ is called a left-sentential form.
If each ηᵢ, 2 ≤ i ≤ k − 1, takes the form βᵢ Aᵢ xᵢ, with xᵢ ∈ T∗, Aᵢ ∈ N, and βᵢ ∈ (T ∪ N)∗, then a rightmost derivation is formed. Each ηᵢ is called a right-sentential form.

2.5. Parse Trees

According to reference [16], a derivation tree for a CFG G(S) = ⟨T, N, S, P⟩ is a labeled and ordered tree in which each node receives a label corresponding to a symbol from the set N ∪ T ∪ {ε}. If a non-leaf node is labeled A and its immediate descendants are labeled X₁, X₂, …, Xₙ, then A ⟶ X₁ X₂ ⋯ Xₙ is a production in P. A labeled ordered tree D is a derivation tree for a CFG G(A) = ⟨T, N, A, P⟩ if
  • The root of D is labeled A.
  • If D₁, …, Dₖ are the subtrees of the direct descendants of the root and the root of Dᵢ is labeled Xᵢ, then A ⟶ X₁ ⋯ Xₖ is in P.
  • Dᵢ is a derivation tree for G(Xᵢ) = ⟨T, N, Xᵢ, P⟩ if Xᵢ is in N, or Dᵢ is a single node labeled Xᵢ if Xᵢ is in T, or Dᵢ is a single node labeled ε.
Parse trees can be constructed from any derivation type.

2.6. Bottom-Up Parsing

Bottom-up syntax analysis [2], also known as shift-reduce parsing, attempts to build a parse tree for an input string starting at the leaves (the bottom of the tree) and working up to the root (the top). This process can be seen as the reduction of the string w to the initial symbol S of a CFG. At each reduction step, a particular substring of the sentential form that matches the right side of a production is replaced by the non-terminal symbol on the left side of that production. If, at each step, the substring is chosen correctly, a rightmost derivation is traced out in reverse.

2.7. Handle

A handle [2,17] of a right-sentential form γ is a production A ⟶ β and a position of γ where the string β may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation of γ. Formally, if S ⟹∗ α A w ⟹ α β w, then A ⟶ β in the position following α is a handle of α β w. If the grammar is ambiguous, more than one handle may be obtained; if the grammar is unambiguous, then every right-sentential form has exactly one handle. Usually, the string β itself is said to be a handle of α β w when the conditions for doing so are clear.

2.8. Implementation of a Bottom-Up Parsing

The bottom-up parsing by shift and reduction uses a stack to store grammar symbols and a buffer to handle the input string to be analyzed. The $ symbol is also used to delimit the bottom of the stack and the right side of the input buffer. The recognition model format starts with the stack at $ and the string w $ in the input, as shown in Table 1.
The bottom-up parsing shifts zero or more input symbols onto the stack until a handle is at the top. The parsing then reduces the handle to the left side of the appropriate production to obtain the sentential form corresponding to the previous step of the rightmost derivation. When reducing, the parser recognizes that the right end of the handle is at the top of the stack; it then locates the handle's left end in the stack and determines the non-terminal that will replace it, according to the production whose right side matches the handle. These steps are repeated until an error is detected or until the stack contains $ S and the input contains only the symbol $, at which point the parsing ends and the input string is accepted as valid for the CFG.

3. Previous Work

According to [1,11,14], for a CFG G, the left terminal sets L G ( A ) and right terminal sets R G ( A ) are defined as follows:
L_G(A) = {a | A ⟹∗ γ a α, γ ∈ N ∪ {ε}}.
R_G(A) = {a | A ⟹∗ α a γ, γ ∈ N ∪ {ε}}.
Precedence relationships are binary relationships, established on the parse tree, between terminals that are consecutive or that become consecutive during the bottom-up process toward the start non-terminal symbol S.
The precedence relationships are established as follows:
  • a ≐ b, if and only if A ⟶ α a γ b β ∈ P, with γ ∈ N ∪ {ε}.
  • a ⋖ b, if and only if A ⟶ α a D β ∈ P, with b ∈ L_G(D).
  • a ⋗ b, if and only if A ⟶ α D b β ∈ P, with a ∈ R_G(D).
According to [2], precedence relations are used to delimit a handle in a right-sentential form. That is, in the sentential form α β w, the substring β is a handle if there is a production A ⟶ β in the CFG; therefore, it can be delimited by the precedence relations, obtaining α ⋖ β ⋗ w. The relation ≐ is used between the symbols inside a handle when it has more than one terminal symbol: if β = a₁ a₂ ⋯ aₙ, then the relations aᵢ ≐ aᵢ₊₁, i = 1, …, n − 1, should be established. Suppose a right-sentential form is A₀ a₁ A₁ a₂ ⋯ aₙ Aₙ, where each Aᵢ is a non-terminal. In that case, the relations are established between aᵢ and aᵢ₊₁, with a maximum of one non-terminal (operator grammar principle) between the terminals. Non-terminals can then be eliminated from the right-sentential form, leaving only the relationships between terminals.
In short, each time a handle is found, it can be enclosed between the symbols ⋖ and ⋗. In addition, two terminal symbols inside a handle (possibly with a non-terminal in the middle) are related by ≐. Finally, non-terminal symbols can be removed from the right-sentential form once a handle has been found and enclosed between the precedence relations.
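As an illustration, for the expression grammar E ⟶ E + T | T, T ⟶ T ∗ F | F, F ⟶ ( E ) | id (used later as Example 1), the right-sentential form id + id ∗ id is delimited as $ ⋖ id ⋗ + ⋖ id ⋗ ∗ ⋖ id ⋗ $; the leftmost ⋖ … ⋗ pair encloses the first id, which is therefore the handle to be reduced first.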
According to [2], the concepts discussed in Section 2.8, and the precedence relations, the precedence parsing Algorithm 1 is constructed as follows:
Algorithm 1 Precedence parsing algorithm
Require: Set of precedence relations.
procedure PrecedenceParsing(w)
    stack ← [$]
    end ← false
    pos ← 0
    do
        a ← stack[top]
        b ← w$[pos]
        if a = $ and b = $ then
            print("Action: Accept")
            end ← true
        else
            if a ⋖ b or a ≐ b then
                pos++
                push(stack, b)
                print("Action: Shift")
            else
                if a ⋗ b then
                    do
                        e ← pull(stack)
                    while not (stack[top] ⋖ e)
                    print("Action: Reduce")
                else
                    print("Action: Error")
                    end ← true
                end if
            end if
        end if
    while not end
end procedure
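For readers who prefer executable code, a minimal Python sketch of Algorithm 1 follows. The function name precedence_parse and the encoding of the relations as a dictionary from terminal pairs to '<', '=' and '>' (standing for ⋖, ≐ and ⋗) are illustrative choices of this sketch, not part of the original formulation; as in the paper's tables, only terminal symbols are kept on the stack.

    # Sketch of Algorithm 1: operator-precedence (shift-reduce) parsing.
    # `relations` maps a pair of terminals (a, b) to '<', '=' or '>',
    # standing for a < b, a = b and a > b in the precedence sense.
    def precedence_parse(w, relations):
        """Parse the list of terminals w; return True if it is accepted."""
        stack = ['$']
        tokens = list(w) + ['$']
        pos = 0
        while True:
            a, b = stack[-1], tokens[pos]
            if a == '$' and b == '$':
                print('Action: Accept')
                return True
            rel = relations.get((a, b))
            if rel in ('<', '='):          # shift
                stack.append(b)
                pos += 1
                print('Action: Shift')
            elif rel == '>':               # reduce: pop the handle
                while True:
                    e = stack.pop()
                    if relations.get((stack[-1], e)) == '<':
                        break
                print('Action: Reduce')
            else:
                print('Action: Error')
                return False

    # A subset of the relations of Example 1 (Table 6) suffices to
    # recognize the string id + id * id:
    R = {('$', 'id'): '<', ('id', '+'): '>', ('$', '+'): '<',
         ('+', 'id'): '<', ('id', '*'): '>', ('+', '*'): '<',
         ('*', 'id'): '<', ('id', '$'): '>', ('*', '$'): '>',
         ('+', '$'): '>'}
    precedence_parse(['id', '+', 'id', '*', 'id'], R)   # prints the actions and Accept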

4. Approach

This work proposes a redefinition of the Left and Right terminal sets, corresponding to L_G(A) and R_G(A), respectively. Additionally, the definition of the new Leftmost set is presented. The new rules for determining precedence relations are shown to work even if the CFG does not comply with the fundamental rule of operator grammars, that is, even when the CFG has two or more adjacent non-terminal symbols on the right side of its productions.
We present the algorithms for constructing the three proposed sets: Left, Leftmost, and Right. Additionally, we demonstrate the algorithm that incorporates the newly established precedence rules.

4.1. Definitions

The following symbols are used:
  • A string of terminals x ∈ T+.
  • A string of non-terminals β ∈ N∗.
  • A sentential form γ ∈ (T ∪ N)∗.
The following sets are defined:
  • The Left set, denoted L(A), is the set of terminal symbols that can appear on the left side of any sentential form produced by a rightmost derivation from A.
    L(A) = {a ∈ T | A ⟹∗ β a γ}.
  • The Leftmost set, denoted Lm(A), is the set of terminal symbols that can appear at the left end in any derivation from A, in the absence of other leftmost grammatical symbols (or when those symbols derive ε).
    Lm(A) = {a ∈ T | A ⟹∗ a γ}.
  • The Right set, denoted R(A), is the set of terminal symbols that can appear at the right end of the rightmost derivations from A, or that appear furthest to the right in any production of A.
    R(A) = {a ∈ T | A ⟹∗ γ a ∨ A ⟹ γ a β}.
According to the above, the following precedence rules are defined:
  • If C ⟶ γ₁ a β B γ₂ and b ∈ L(B), then a ⋖ b.
  • If C ⟶ γ₁ A β B γ₂, with a ∈ R(A) ∧ b ∈ Lm(B) ∧ β ⟹∗ ε, then a ⋗ b.
  • If C ⟶ γ₁ A β b γ₂, with a ∈ R(A) ∧ β ⟹∗ ε, then a ⋗ b.
  • If C ⟶ γ₁ a β b γ₂, then a ≐ b.

4.2. Problem Analysis

4.2.1. Absolute Operator Grammar

Given the rightmost derivation series of the form:
S ⟹+ γ₁ A x₁ ⟹ γ₁ ω₁ x₁, where A ⟶ ω₁ and ω₁ = γ₂ B γ₃.      (1)
It is noted that ω 1 is a handle according to the previous definition in Section 2.7.
The right-sentential form of (1) can be rewritten as follows:
γ₁ ω₁ x₁ = γ₁ γ₂ B γ₃ x₁ ⟹+ γ₁ γ₂ B x₂ x₁ ⟹ γ₁ γ₂ ω₂ x₂ x₁, where B ⟶ ω₂.      (2)
From the last sentential form in (2), it follows that ω 2 is a handle according to the definition in Section 2.7.
Because of the above, the analysis will focus on the productions of A that have at least one non-terminal B on the right side, as seen in (1). The grammar is traversed through the productions of A to find the precedence relations involving the non-terminal B located on the right side.
Within the context of ω₁, having B ⟶ ω₂ and ω₂ being a handle, the following forms can be obtained from its definition in (1):
(a) ω₁ = γ₄ a B: it follows that a ⋖ L(B). Here γ₂ = γ₄ a and γ₃ = ε.
(b) ω₁ = B b γ₅: it follows that R(B) ⋗ b. Here γ₂ = ε and γ₃ = b γ₅.
(c) ω₁ = γ₄ a B b γ₅: it follows that a ⋖ L(B) and R(B) ⋗ b. Here γ₂ = γ₄ a and γ₃ = b γ₅.
For the non-terminal start symbol S, an initial dummy production E ⟶ $ S $ is defined, and rule (c) is applied.
For a production A ⟶ γ₁ a X b γ₂, with X ∈ N ∪ {ε}, it follows that a ≐ b.

4.2.2. General Grammar with Two or More Consecutive Non-Terminal Symbols

Redefinition of Left and Right, and definition of the Leftmost set.
  • Left(B):
Given the rightmost derivation:
γ₁ a B x₁ ⟹ γ₁ a ω₁ x₁.      (3)
It is observed that ω₁ is a handle within a bottom-up parsing (inverse rightmost derivation), since B ⟶ ω₁; therefore, from (3), it can be established that γ₁ a ⋖ ω₁ ⋗ x₁. In addition, ω₁ can have the following forms:
(a) ω₁ = β₁ b γ₂.
(b) ω₁ = β₁ C γ₃.
Continuing with the rightmost derivation in (3) and taking each of the options for ω₁, we obtain:
(a) γ₁ a B x₁ ⟹ γ₁ a β₁ b γ₂ x₁; since ω₁ is a handle, consequently γ₁ a ⋖ β₁ b γ₂ ⋗ x₁; therefore, a ⋖ b, where B ⟶ β₁ b γ₂ and b ∈ L(B).
(b) γ₁ a B x₁ ⟹ γ₁ a β₁ C γ₃ x₁ ⟹∗ γ₁ a β₁ C x₂ x₁ ⟹∗ γ₁ a β₁ β₂ D x₃ x₂ x₁ ⟹ γ₁ a β₁ β₂ β₃ b γ₄ x₃ x₂ x₁. It is observed that C ⟹+ β₂ D x₃ and D ⟶ β₃ b γ₄; therefore, β₃ b γ₄ is a handle, and consequently γ₁ a β₁ β₂ ⋖ β₃ b γ₄ ⋗ x₃ x₂ x₁ and a ⋖ b.
In short, γ₁ a B x₁ ⟹+ γ₁ a β₁ β₂ β₃ b γ₄ x₃ x₂ x₁; taking β = β₁ β₂ β₃ and γ₅ = γ₄ x₃ x₂, we obtain the summary derivation γ₁ a B x₁ ⟹+ γ₁ a β b γ₅ x₁, where B ⟹+ β b γ₅ and b ∈ L(B).
  • Right(B):
Given the rightmost derivation:
γ₁ B a x₁ ⟹ γ₁ ω₁ a x₁.      (4)
It is observed that ω₁ is a handle within a bottom-up parsing (inverse rightmost derivation), since B ⟶ ω₁; therefore, from (4), it can be established that γ₁ ⋖ ω₁ ⋗ a x₁. In addition, ω₁ can have the following forms:
(a) ω₁ = γ₂ b β.
(b) ω₁ = γ₃ C β₁, with β₁ ⟹∗ ε.
Continuing with the rightmost derivation in (4) and taking each of the options for ω₁, we obtain:
(a) γ₁ B a x₁ ⟹ γ₁ γ₂ b β a x₁; since ω₁ is a handle, consequently γ₁ ⋖ γ₂ b β ⋗ a x₁; therefore, b ⋗ a, where B ⟶ γ₂ b β and b ∈ R(B).
(b) γ₁ B a x₁ ⟹ γ₁ γ₃ C β₁ a x₁ ⟹∗ γ₁ γ₃ γ₄ D a x₁ ⟹ γ₁ γ₃ γ₄ γ₅ b β₂ a x₁. It is noted that C ⟹+ γ₄ D and D ⟶ γ₅ b β₂; therefore, γ₅ b β₂ is a handle, and consequently γ₁ γ₃ γ₄ ⋖ γ₅ b β₂ ⋗ a x₁ and b ⋗ a.
In short, γ₁ B a x₁ ⟹+ γ₁ γ₃ γ₄ γ₅ b β₂ a x₁; taking γ₆ = γ₃ γ₄ γ₅, we obtain the summary derivation γ₁ B a x₁ ⟹+ γ₁ γ₆ b β₂ a x₁, where B ⟹+ γ₆ b β₂ and b ∈ R(B).
  • Leftmost(B):
Given the production with two non-terminal symbols on the right side (body):
S ⟶ A B      (5)
A ⟶ γ₁ b₁ β₁      (6)
The rightmost derivation from S in (5) is:
S ⟹ A B ⟹+ A a γ ⟹+ A a x₁ ⟹ γ₁ b₁ β₁ a x₁ = γ₂ C a x₁ ⟹+ γ₂ γ₃ γ₄ b₂ β₂ a x₁, where the successive sentential forms are labeled (i) to (vii).
In summary:
  • In (v), the relationship ⋖ γ₁ b₁ β₁ ⋗ a x₁ can be established, since the right part of production (6) is a handle. Therefore, b₁ ⋗ a and b₁ ∈ R(A).
  • In (vi), it is known that β₁ = C₀ C₁ ⋯ Cₙ C, where each Cᵢ ∈ N and C is the non-terminal that is derived first from the right. It is noted that γ₂ = γ₁ b₁ C₀ ⋯ Cₙ.
  • In (vii), C ⟹+ γ₃ γ₄ b₂ β₂, with γ₄ b₂ β₂ being the result of the last rightmost derivation performed and, in turn, a handle; therefore, the relationship γ₂ γ₃ ⋖ γ₄ b₂ β₂ ⋗ a x₁ is established and, since γ₄ b₂ β₂ is a handle, b₂ ⋗ a.
  • From (iv), A a x₁ ⟹+ γ₂ γ₃ γ₄ b₂ β₂ a x₁. Establishing γ₅ = γ₂ γ₃ γ₄, then A ⟹+ γ₅ b₂ β₂; therefore, b₂ ∈ R(A).
  • From (ii), B ⟹+ a γ is established, from which we define the new Leftmost set of B as Lm(B) = {a ∈ T | B ⟹+ a γ}.

4.2.3. An Intuitive Way to Observe the Elements of Each Set

  • Vision of Left(·):
  • Given the CFG:
  • A ⟶ C D E
  • E ⟶ e h
  • D ⟶ d
  • C ⟶ c F
  • F ⟶ f
The rightmost derivation of the string c f d e h illustrates the way Left(A), or L(A), is created:
A ⟹ C D E ⟹ C D e h ⟹ C d e h ⟹ c F d e h ⟹ c f d e h, observing, in this order, the terminals e, d, and c from the left.
The terminal symbols e, d, and c are viewed from the left and recorded in L ( A ) . The symbol e is the first one seen from the left and is recorded in L ( A ) , and the symbol h is blocked by e since both appear when E is substituted. Then, D is replaced by d, which is observed and recorded in L ( A ) . When c F replaces C, we observe and record c in L ( A ) . Finally, when the substitution of F by f is carried out, it can no longer be observed (blocked by c) and cannot be registered. Consequently:
L ( A ) = { e , d , c } .
  • Vision of Right(·):
  • Given the CFG:
  • A ⟶ a b B C
  • B ⟶ e F
  • C ⟶ c D
  • D ⟶ d
  • F ⟶ f
The rightmost derivation of the string a b e f c d illustrates the way Right(A), or R(A), is created:
A ⟹ a b B C ⟹ a b B c D ⟹ a b B c d ⟹ a b e F c d ⟹ a b e f c d, observing, in this order, the terminals b, c, and d from the right.
The terminal symbols b, c, and d are viewed from the right and recorded in R ( A ) . The symbol b is the first one observed and recorded in R ( A ) . The symbol a is blocked by b, since both appear when A is derived. C is then replaced by c D and c is observed and recorded in R ( A ) . When D is substituted for d, we observe and record d in R ( A ) . Then B is derived by e F but e cannot be recorded since the previous inclusion of c and d in R ( A ) makes it invisible or blocks it from being observed from the right. Finally, when substituting F for f, it cannot be observed from the right either (blocked by c and d) and cannot be recorded. Consequently:
R ( A ) = { b , c , d } .
  • Vision of Leftmost(·):
  • Given the CFG:
  • S ⟶ d A B
  • A ⟶ f a
  • B ⟶ C b
  • C ⟶ c
The rightmost derivation of the string d f a c b illustrates the way Leftmost(B), or Lm(B), is created, as shown in Figure 1.
It can be seen that L(B) = {b, c}. When B derives C b, we observe that Lm(B) = {b}, because no more terminal symbols are seen up to that point. In the following derivation, C is replaced by c; therefore, c is located further to the left of the substring c b, leaving Lm(B) = {c}.

4.2.4. Summary of Relations between Grammatical Symbols from Productions and Sets

Table 2 shows the relationships between the grammatical symbols of the productions and the sets of terminals.

4.3. Algorithms

The following algorithms were developed based on the definitions presented in Section 4.1.
Algorithm 2 receives a non-terminal A whose productions are known; for each of these, it is verified whether it corresponds to the form β B γ; if so, Algorithm 2 is executed again, sending the non-terminal B as a parameter (provided that B ≠ A), and its response is included in the set L(A). A consolidated Python sketch of Algorithms 2–4 is given after Algorithm 4.
Subsequently, it is verified whether the production corresponds to the form β a γ , and if so, the terminal a is added to the set L ( A ) .
Once the verification of all productions of A is completed, the set L ( A ) is returned as the output of Algorithm 2.
Algorithm 2  Left algorithm of a non-terminal A
function Left(A)
    L(A) ← ∅
    for each production of A do
        if A ⟶ β B γ then
            L(A) ⊇ Left(B)    // L(A) contains Left(B)
        end if
        if A ⟶ β a γ then
            a ∈ L(A)    // a is in L(A)
        end if
    end for
    return L(A)
end function
Algorithm 3 receives a non-terminal A whose productions are known; for each of these, it is verified whether it corresponds to the form β B γ with β ⟹∗ ε; if so, Algorithm 3 is executed again, sending the non-terminal B as a parameter (provided that B ≠ A), and its response is included in the set Lm(A).
Subsequently, it is verified whether the production corresponds to the form β a γ with β ⟹∗ ε. If so, the terminal a is added to the set Lm(A).
Once the verification of all productions of A is completed, the set L m ( A ) is returned as the output of Algorithm 3.
Algorithm 3  Leftmost algorithm of a non-terminal A
function Leftmost(A)
    Lm(A) ← ∅
    for each production of A do
        if A ⟶ β B γ and β ⟹∗ ε then
            Lm(A) ⊇ Leftmost(B)
        end if
        if A ⟶ β a γ and β ⟹∗ ε then
            a ∈ Lm(A)
        end if
    end for
    return Lm(A)
end function
Algorithm 4 receives a non-terminal A whose productions are known; for each of these, it is verified whether it corresponds to the form γ B β with β ⟹∗ ε; if so, Algorithm 4 is executed again, sending the non-terminal B as a parameter (provided that B ≠ A), and its response is included in the set R(A).
Subsequently, it is verified whether the production corresponds to the form γ a β, and if so, the terminal a is added to the set R(A).
Once all productions of A have been verified, the set R ( A ) is returned as the output of Algorithm 4.
Algorithm 4  Right algorithm of a non-terminal A
function Right(A)
    R(A) ← ∅
    for each production of A do
        if A ⟶ γ B β and β ⟹∗ ε then
            R(A) ⊇ Right(B)
        end if
        if A ⟶ γ a β then
            a ∈ R(A)
        end if
    end for
    return R(A)
end function
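Since Algorithms 2–4 share the same recursive structure, a single Python sketch of the three constructions is given below. The grammar encoding (a dictionary from each non-terminal to its list of bodies, with [ ] representing an ε-production) and the function names are assumptions of this sketch; a visited set generalizes the “provided that B ≠ A” guard so that indirectly recursive grammars also terminate, and, for brevity, nullability is checked only through direct ε-productions.

    # Sketch of Algorithms 2-4. A grammar is a dict: non-terminal -> list of
    # bodies; each body is a list of symbols and [] is an epsilon-production.
    def is_nt(x, g):
        return x in g                      # non-terminals are the dict keys

    def nullable(x, g):
        # simplification: only direct epsilon-productions are considered
        return is_nt(x, g) and [] in g[x]

    def left(A, g, seen=None):
        """Algorithm 2: terminals visible from the left of derivations of A."""
        seen = {A} if seen is None else seen | {A}
        out = set()
        for body in g[A]:
            for X in body:
                if is_nt(X, g):            # leading non-terminal: union its Left
                    if X not in seen:
                        out |= left(X, g, seen)
                else:                      # first terminal of the body: record it, stop
                    out.add(X)
                    break
        return out

    def leftmost(A, g, seen=None):
        """Algorithm 3: terminals that can start a derivation from A."""
        seen = {A} if seen is None else seen | {A}
        out = set()
        for body in g[A]:
            for X in body:                 # the prefix before X must derive epsilon
                if is_nt(X, g):
                    if X not in seen:
                        out |= leftmost(X, g, seen)
                    if not nullable(X, g):
                        break
                else:
                    out.add(X)
                    break
        return out

    def right(A, g, seen=None):
        """Algorithm 4: terminals visible from the right of derivations of A."""
        seen = {A} if seen is None else seen | {A}
        out = set()
        for body in g[A]:
            blocked = False                # set after a non-nullable non-terminal
            for X in reversed(body):
                if is_nt(X, g):
                    if not blocked:
                        if X not in seen:
                            out |= right(X, g, seen)
                        if not nullable(X, g):
                            blocked = True
                else:                      # last terminal of the body: always visible
                    out.add(X)
                    break
        return out

    # Example 1 grammar; the results match Sections 5.1.2-5.1.4:
    g1 = {'E': [['E', '+', 'T'], ['T']],
          'T': [['T', '*', 'F'], ['F']],
          'F': [['(', 'E', ')'], ['id']]}
    print(left('E', g1))      # == {'+', '*', '(', 'id'}
    print(right('E', g1))     # == {')', 'id', '*', '+'}
    print(leftmost('E', g1))  # == {'(', 'id'}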
Algorithm 5 receives a grammar G(S) = ⟨T, N, S, P⟩ and uses Algorithms 2–4. Initially, precedence relations are added between the delimiting symbol $ and the Left and Right sets of the start non-terminal symbol S, according to rules (A.1) and (A.2), respectively.
Algorithm 5 operates in an iterative manner. It traverses all the productions of the set P, and for each iteration, it initializes two temporary variables, u and l, to null and an empty list, respectively. It then performs a series of processes for each pair of contiguous grammatical symbols X and Y in a production.
  • The algorithm checks if X Y has the form a b ; if so, it adds the corresponding precedence relationship between a and b, following rule (B).
  • The algorithm checks if X Y has the form a B ; if so, it assigns u the value of a and adds the corresponding precedence relations between a and terminals in Left ( B ) , following rule (C).
  • The algorithm checks whether X Y has the form A b; if so, it adds the corresponding precedence relations between the terminals in Right(A) and b, following rule (D.1). Then, it checks whether u ≠ null and, if so, adds the corresponding precedence relationship between u and b, following rule (D.2), and resets u to null. Finally, it checks whether the list l has elements and, if so, adds the corresponding precedence relations between the terminals in Right(B), for each B ∈ l, and b, following rule (D.3), and resets the list l to empty.
  • The algorithm checks whether X Y has the form A B; if so, it adds the corresponding precedence relations between the terminals in Right(A) and the terminals in Leftmost(B), following rule (E.1). Then, it checks whether u ≠ null and, if so, adds the corresponding precedence relations between u and the terminals in Left(B), following rule (E.2). Finally, it checks whether the list l has elements and, if so, adds the corresponding precedence relations between the terminals in Right(C), with C ∈ l, and the terminals in Leftmost(B), following rule (E.3), and resets the list l if necessary.
Algorithm 5 Algorithm to obtain the operator precedence table of a grammar G
Require: G(S) = ⟨T, N, S, P⟩
procedure Table(G, Left(·), Right(·), Leftmost(·))
    compute Left(A), Right(A), Leftmost(A), ∀A ∈ N
    add $ ⋖ a, ∀a ∈ Left(S)      (A.1)
    add a ⋗ $, ∀a ∈ Right(S)      (A.2)
    for each A ⟶ γ in P do
        u ← null
        l ← [ ]    // empty list
        for each contiguous pair X Y in γ do
            if X Y = a b then
                add a ≐ b      (B)
            end if
            if X Y = a B then
                u ← a
                add a ⋖ b, ∀b ∈ Left(B)      (C)
            end if
            if X Y = A b then
                add a ⋗ b, ∀a ∈ Right(A)      (D.1)
                if u ≠ null then
                    add u ≐ b      (D.2)
                    u ← null
                end if
                if l ≠ [ ] then
                    for B ∈ l do
                        add a ⋗ b, ∀a ∈ Right(B)      (D.3)
                    end for
                    l ← [ ]
                end if
            end if
            if X Y = A B then
                add a ⋗ b, ∀a ∈ Right(A), ∀b ∈ Leftmost(B)      (E.1)
                if u ≠ null then
                    add u ⋖ b, ∀b ∈ Left(B)      (E.2)
                end if
                if l ≠ [ ] then
                    for C ∈ l do
                        add c ⋗ b, ∀c ∈ Right(C), ∀b ∈ Leftmost(B)      (E.3)
                    end for
                end if
                if B ⟹∗ ε then
                    add A to l
                else
                    l ← [ ]
                end if
            end if
        end for
    end for
end procedure
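The following Python sketch mirrors Algorithm 5. It assumes the grammar encoding of the previous sketch and receives the Left, Right and Leftmost sets as precomputed dictionaries; the conflict warning inside add() is an addition of the sketch (the situations in which it can fire are discussed in Section 6), and '<', '=' and '>' again stand for ⋖, ≐ and ⋗.

    # Sketch of Algorithm 5: building the operator precedence relations.
    # g: grammar dict; start: start symbol; L, R, Lm: dicts mapping each
    # non-terminal to its Left / Right / Leftmost set of terminals.
    def build_table(g, start, L, R, Lm):
        rel = {}

        def add(a, b, r):
            if rel.get((a, b), r) != r:                  # not part of Algorithm 5:
                print('precedence conflict at', (a, b))  # see Section 6
            rel[(a, b)] = r

        for a in L[start]:                     # (A.1)  $ < Left(S)
            add('$', a, '<')
        for a in R[start]:                     # (A.2)  Right(S) > $
            add(a, '$', '>')

        for A, bodies in g.items():
            for body in bodies:
                u, l = None, []
                for X, Y in zip(body, body[1:]):   # contiguous pairs X Y
                    x_nt, y_nt = X in g, Y in g
                    if not x_nt and not y_nt:      # X Y = a b        (B)
                        add(X, Y, '=')
                    if not x_nt and y_nt:          # X Y = a B        (C)
                        u = X
                        for b in L[Y]:
                            add(X, b, '<')
                    if x_nt and not y_nt:          # X Y = A b
                        for a in R[X]:             # (D.1)
                            add(a, Y, '>')
                        if u is not None:          # (D.2)
                            add(u, Y, '=')
                            u = None
                        for B in l:                # (D.3)
                            for a in R[B]:
                                add(a, Y, '>')
                        l = []
                    if x_nt and y_nt:              # X Y = A B
                        for a in R[X]:             # (E.1)
                            for b in Lm[Y]:
                                add(a, b, '>')
                        if u is not None:          # (E.2)
                            for b in L[Y]:
                                add(u, b, '<')
                        for C in l:                # (E.3)
                            for c in R[C]:
                                for b in Lm[Y]:
                                    add(c, b, '>')
                        if [] in g[Y]:             # Y derives epsilon: keep X pending
                            l.append(X)
                        else:
                            l = []
        return rel

    # Driving it with the grammar and sets of Example 1 (Sections 5.1.2-5.1.4):
    g1 = {'E': [['E', '+', 'T'], ['T']],
          'T': [['T', '*', 'F'], ['F']],
          'F': [['(', 'E', ')'], ['id']]}
    L1 = {'E': {'+', '*', '(', 'id'}, 'T': {'*', '(', 'id'}, 'F': {'(', 'id'}}
    R1 = {'E': {')', 'id', '*', '+'}, 'T': {')', 'id', '*'}, 'F': {')', 'id'}}
    Lm1 = {'E': {'(', 'id'}, 'T': {'(', 'id'}, 'F': {'(', 'id'}}
    table = build_table(g1, 'E', L1, R1, Lm1)
    print(table[('(', ')')])   # '=' , i.e. ( and ) have equal precedence, as in Table 6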

4.4. Complexity Analysis

The following variables will be established for the computation of the complexity of the four proposed algorithms:
  • t: Cardinality of the set of terminal symbols.
  • p: Cardinality of the set of productions.
  • g: Number of grammatical symbols in the productions.
  • n: Cardinality of the set of non-terminal symbols.

4.4.1. Complexity of Algorithm 2 (Left)

The Left function checks in which productions the symbol A appears on the left side. In the first if-block, a recursive call of Left is observed, which implies that all non-terminal symbols of G could be traversed at the end of the execution in the search for Left ( A ) . Therefore, all worst-case productions would be traversed, yielding a computational time of O ( p ) . The second if-block has a computational time of O ( 1 ) , which does not contribute to the overall computational time O ( p ) .

4.4.2. Complexity of Algorithm 3 (Leftmost)

The Leftmost function has a similar structure to the previous one; therefore, the computational execution time is O ( p ) .

4.4.3. Complexity of Algorithm 4 (Right)

The Right function has a similar structure to Left; therefore, the computational execution time is O ( p ) .

4.4.4. Complexity of Algorithm 5 (Table)

According to the first for-cycle, the number of times it would be executed will be p. Immediately following the previous one, the inner for-cycle is executed in the worst-case g times, assuming that the total number of grammatical symbols appear in each production. Therefore, the computational cost of the two cycles is O ( p × g ) .
The computational costs for the construction of Left ( A ) , Right ( A ) , and Leftmost ( A ) , A N , are already computed. In computing the complexity of Table, the length of each generated set will be considered, which in the worst case is t.
The analysis of the four if-blocks inside the second for-cycle is as follows:
  • In the first if-block, after checking, the executed instruction is carried out in constant time and is denoted O ( 1 ) .
  • In the second if-block, after checking, the first instruction is executed in a constant time O ( 1 ) , and the second is executed in a computational time O ( t ) ; therefore, the computational time of the if-block is O ( t ) .
  • In the third if-block, after checking, three sub-blocks are distinguished. The first one corresponds to the traversal of Right(A), executed in a computational time O(t). After checking u ≠ null, the second sub-block presents two instructions executed in constant time O(1). The third sub-block, after checking l ≠ [ ], presents a computational cost O(n × t), determined by the for-cycle, executed in the worst case O(n) times (corresponding to the number of non-terminal symbols in G), and the traversal it makes of Right(B), with a cost O(t). In summary, the computational cost of the third if-block is O(n × t), corresponding to the worst case.
  • In the fourth if-block, after checking, four sub-blocks are distinguished. The first corresponds to the combinations formed by traversing Right(A) and Leftmost(B) to obtain the precedence relations a ⋗ b. Since the maximum size of each set is t, the computational cost of these combinations is O(t²). After checking u ≠ null, the second sub-block presents an instruction where Left(B) is traversed; therefore, the computational cost is O(t). The third sub-block, after checking l ≠ [ ], presents a for-cycle that executes at most n times the combinations formed by traversing Right(C) and Leftmost(B) to obtain the precedence relations c ⋗ b. These combinations are traversed in at most O(t²) computational time, because t is the size of both sets; consequently, the computational cost of the third sub-block is O(n × t²). The last sub-block has a cost of O(1). In summary, the computational cost of the fourth if-block is O(n × t²), corresponding to the worst case.
In summary, the computational cost of the four if-blocks within the second for-cycle of Algorithm 5 is O ( n × t 2 ) , this being the worst case; therefore, the computational cost of the main body of Algorithm 5 formed by the two nested for-cycles and the four if-blocks is O ( p × g × n × t 2 ) which is greater than the cost of computing any of the three sets of terminal symbols Left ( A ) , Right ( A ) , and Leftmost ( A ) , A N , when starting the algorithm, and corresponding to O ( p ) .

5. Examples

5.1. Example 1

5.1.1. Grammar

  • Given the following OPG:
  • E ⟶ E + T | T
  • T ⟶ T ∗ F | F
  • F ⟶ ( E ) | id
  • The sets are built step by step following the algorithms.

5.1.2. Left

The Left sets are calculated as detailed in Table 3.
  • Summary of Left sets of non-terminals:
  • Left(E) = {+} ∪ L(T).
  • Left(T) = {∗} ∪ L(F).
  • Left(F) = {(, id}.
  • Substituting the sets:
  • Left(E) = {+, ∗, (, id}.
  • Left(T) = {∗, (, id}.
  • Left(F) = {(, id}.

5.1.3. Right

The Right sets are calculated as detailed in Table 4.
  • Summary of Right sets of non-terminals:
  • Right(E) = R(T) ∪ {+}.
  • Right(T) = R(F) ∪ {∗}.
  • Right(F) = {), id}.
  • Substituting the sets:
  • Right(E) = {), id, ∗, +}.
  • Right(T) = {), id, ∗}.
  • Right(F) = {), id}.

5.1.4. Leftmost

The Leftmost sets are calculated as detailed in Table 5.
  • Summary of Leftmost sets of non-terminals:
  • Leftmost ( E ) = L m ( T ) .
  • Leftmost ( T ) = L m ( F ) .
  • Leftmost ( F ) = { ( , i d } .
  • Substituting the sets:
  • Leftmost ( E ) = { ( , i d } .
  • Leftmost ( T ) = { ( , i d } .
  • Leftmost ( F ) = { ( , i d } .
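As a cross-check, running the sketch of Algorithms 2–4 given in Section 4.3 over this grammar (with the same illustrative dictionary encoding g1 used there) reproduces the three summaries above:

    for A in ['E', 'T', 'F']:
        print(A, left(A, g1), right(A, g1), leftmost(A, g1))
    # E == {'+', '*', '(', 'id'}  {')', 'id', '*', '+'}  {'(', 'id'}
    # T == {'*', '(', 'id'}       {')', 'id', '*'}       {'(', 'id'}
    # F == {'(', 'id'}            {')', 'id'}            {'(', 'id'}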

5.1.5. Case Studies

For Algorithm 5, the application of some precedence relations is exhibited.
  • For the delimiter symbol $:
    (a) $ ⋖ L(E) — Rule (A.1).
      • $ ⋖ +
      • $ ⋖ ∗
      • $ ⋖ (
      • $ ⋖ id
    (b) R(E) ⋗ $ — Rule (A.2).
      • ) ⋗ $
      • id ⋗ $
      • ∗ ⋗ $
      • + ⋗ $
  • For E ⟶ E + T
    (a) X = E, Y = + — Rule (D.1).
      R(E) ⋗ +
      • ) ⋗ +
      • id ⋗ +
      • ∗ ⋗ +
      • + ⋗ +
      u = null, l = [ ]
    (b) X = +, Y = T — Rule (C).
      + ⋖ L(T)
      • + ⋖ ∗
      • + ⋖ (
      • + ⋖ id
      u = +, l = [ ]
  • For F ⟶ ( E )
    (a) X = (, Y = E — Rule (C).
      ( ⋖ L(E)
      • ( ⋖ +
      • ( ⋖ ∗
      • ( ⋖ (
      • ( ⋖ id
      u = (, l = [ ]
    (b) X = E, Y = ) — Rule (D.1).
      R(E) ⋗ )
      • ) ⋗ )
      • id ⋗ )
      • ∗ ⋗ )
      • + ⋗ )
      u ≠ null ⟹ u ≐ b (( ≐ )) — Rule (D.2).
      u = null, l = [ ]
  • In the following cases, the form X Y does not occur on the right side of the production; therefore, no rule can be applied:
    (a) E ⟶ T
    (b) T ⟶ F
    (c) F ⟶ id
The remaining precedence relations are derived using the same method as the previously demonstrated cases.

5.1.6. Operator Precedence Table

At the end of the algorithm execution, operator precedence Table 6 is obtained.

5.1.7. Bottom-Up Parsing

Let id + ( ( id + id ) ∗ ( id ) ) ∗ id be a string obtained by rightmost derivation; then, applying Algorithm 1 results in the bottom-up parsing detailed in Table 7.
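The parse in Table 7 can be reproduced by combining the two sketches given earlier, building the table for g1 with build_table and feeding it to precedence_parse (all names are the illustrative ones introduced in those sketches):

    tokens = ['id', '+', '(', '(', 'id', '+', 'id', ')', '*',
              '(', 'id', ')', ')', '*', 'id']
    precedence_parse(tokens, build_table(g1, 'E', L1, R1, Lm1))   # Accept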

5.2. Example 2

5.2.1. Grammar

  • Given the following non-OPG:
  • S ⟶ S D ; | D ;
  • D ⟶ T id ( L )
  • T ⟶ T ∗ | int
  • L ⟶ I | ε
  • I ⟶ T | T , I
  • The sets are built step by step following the algorithms.

5.2.2. Left

The Left sets are calculated as detailed in Table 8.
  • Summary of Left sets of non-terminals:
  • Left(S) = L(D) ∪ {;}.
  • Left(D) = L(T) ∪ {id}.
  • Left(T) = {∗, int}.
  • Left(L) = L(I).
  • Left(I) = L(T) ∪ {","}.
  • Substituting the sets:
  • Left(S) = {;, id, ∗, int}.
  • Left(D) = {id, ∗, int}.
  • Left(T) = {∗, int}.
  • Left(L) = {∗, ",", int}.
  • Left(I) = {∗, ",", int}.

5.2.3. Right

The Right sets are calculated as detailed in Table 9.
  • Summary of Right sets of non-terminals:
  • Right(S) = {;}.
  • Right(D) = {)}.
  • Right(T) = {∗, int}.
  • Right(L) = R(I).
  • Right(I) = R(T) ∪ {","}.
  • Substituting the sets:
  • Right(S) = {;}.
  • Right(D) = {)}.
  • Right(T) = {∗, int}.
  • Right(L) = {∗, ",", int}.
  • Right(I) = {∗, ",", int}.

5.2.4. Leftmost

The Leftmost sets are calculated as detailed in Table 10.
  • Summary of Leftmost sets of non-terminals:
  • Leftmost ( S ) = L m ( D ) .
  • Leftmost ( D ) = L m ( T ) .
  • Leftmost ( T ) = { i n t } .
  • Leftmost ( L ) = L m ( I ) .
  • Leftmost ( I ) = L m ( T ) .
  • Substituting the sets:
  • Leftmost ( S ) = { i n t } .
  • Leftmost ( D ) = { i n t } .
  • Leftmost ( T ) = { i n t } .
  • Leftmost ( L ) = { i n t } .
  • Leftmost ( I ) = { i n t } .

5.2.5. Case Studies

For Algorithm 5, the application of some precedence relations is exhibited.
  • For the delimiter symbol $:
    (a) $ ⋖ L(S) — Rule (A.1).
      • $ ⋖ ;
      • $ ⋖ id
      • $ ⋖ ∗
      • $ ⋖ int
    (b) R(S) ⋗ $ — Rule (A.2).
      • ; ⋗ $
  • For S ⟶ S D ;
    (a) X = S, Y = D — Rule (E.1).
      R(S) ⋗ Lm(D)
      • ; ⋗ int
      u = null, l = [ ]
    (b) X = D, Y = ; — Rule (D.1).
      R(D) ⋗ ;
      • ) ⋗ ;
      u = null, l = [ ]
  • For D ⟶ T id ( L )
    (a) X = id, Y = ( — Rule (B).
      • id ≐ (
      u = null, l = [ ]
    (b) X = (, Y = L — Rule (C).
      ( ⋖ L(L)
      • ( ⋖ ,
      • ( ⋖ ∗
      • ( ⋖ int
      u = (, l = [ ]
    (c) X = L, Y = ) — Rule (D.1).
      R(L) ⋗ )
      • , ⋗ )
      • ∗ ⋗ )
      • int ⋗ )
      u ≠ null ⟹ u ≐ b (( ≐ )) — Rule (D.2).
      u = null, l = [ ]
  • In the following cases, the form X Y does not occur on the right side of the production; therefore, no rule can be applied:
    (a) T ⟶ int
    (b) L ⟶ I
    (c) L ⟶ ε
    (d) I ⟶ T
The remaining precedence relations are derived using the same method as the previously demonstrated cases.

5.2.6. Operator Precedence Table

At the end of the algorithm execution, operator precedence Table 11 is obtained.

5.2.7. Bottom-Up Parsing

Let int ∗ id ( ) ; int ∗ id ( int , int ) ; be a string obtained by rightmost derivation; then, applying Algorithm 1 results in the bottom-up parsing detailed in Table 12.

5.3. Example 3

5.3.1. Grammar

  • Given the following non-OPG:
  • S ⟶ A B C
  • A ⟶ a A | a
  • B ⟶ b B | b | ε
  • C ⟶ C D c | c
  • D ⟶ d
  • The sets are built step by step following the algorithms.

5.3.2. Left

The Left sets are calculated as detailed in Table 13.
  • Summary of Left sets of non-terminals:
  • Left(S) = L(A) ∪ L(B) ∪ L(C).
  • Left(A) = {a}.
  • Left(B) = {b}.
  • Left(C) = L(D) ∪ {c}.
  • Left(D) = {d}.
  • Substituting the sets:
  • Left(S) = {a, b, d, c}.
  • Left ( A ) = { a } .
  • Left ( B ) = { b } .
  • Left ( C ) = { d , c } .
  • Left ( D ) = { d } .

5.3.3. Right

The Right sets are calculated as detailed in Table 14
  • Summary of Right sets of non-terminals:
  • Right ( S ) = R ( C ) .
  • Right ( A ) = { a } .
  • Right ( B ) = { b } .
  • Right ( C ) = { c } .
  • Right ( D ) = { d } .
  • Substituting the sets:
  • Right ( S ) = { c } .
  • Right ( A ) = { a } .
  • Right ( B ) = { b } .
  • Right ( C ) = { c } .
  • Right ( D ) = { d } .

5.3.4. Leftmost

The Leftmost sets are calculated as detailed in Table 15.
  • Summary of Leftmost sets of non-terminals:
  • Leftmost ( S ) = L m ( A ) .
  • Leftmost ( A ) = { a } .
  • Leftmost ( B ) = { b } .
  • Leftmost ( C ) = { c } .
  • Leftmost ( D ) = { d } .
  • Substituting the sets:
  • Leftmost ( S ) = { a } .
  • Leftmost ( A ) = { a } .
  • Leftmost ( B ) = { b } .
  • Leftmost ( C ) = { c } .
  • Leftmost ( D ) = { d } .

5.3.5. Case Studies

For Algorithm 5, the application of some precedence relations is exhibited.
  • For the delimiter symbol $:
    (a) $ ⋖ L(S) — Rule (A.1).
      • $ ⋖ a
      • $ ⋖ b
      • $ ⋖ d
      • $ ⋖ c
    (b) R(S) ⋗ $ — Rule (A.2).
      • c ⋗ $
  • For S ⟶ A B C
    (a) X = A, Y = B — Rule (E.1).
      R(A) ⋗ Lm(B)
      • a ⋗ b
      u = null, l = [ A ]
    (b) X = B, Y = C — Rule (E.1).
      R(B) ⋗ Lm(C)
      • b ⋗ c
      A ∈ l ≠ [ ] — Rule (E.3).
      R(A) ⋗ Lm(C)
      • a ⋗ c
      u = null, l = [ ]
  • For A ⟶ a A
    (a) X = a, Y = A — Rule (C).
      a ⋖ L(A)
      • a ⋖ a
      u = a, l = [ ]
  • For B ⟶ b B
    (a) X = b, Y = B — Rule (C).
      b ⋖ L(B)
      • b ⋖ b
      u = b, l = [ ]
  • For C ⟶ C D c
    (a) X = C, Y = D — Rule (E.1).
      R(C) ⋗ Lm(D)
      • c ⋗ d
      u = null, l = [ ]
    (b) X = D, Y = c — Rule (D.1).
      R(D) ⋗ c
      • d ⋗ c
      u = null, l = [ ]
  • In the following cases, the form X Y does not occur on the right side of the production; therefore, no rule can be applied:
    (a) A ⟶ a
    (b) B ⟶ b
    (c) B ⟶ ε
    (d) C ⟶ c
    (e) D ⟶ d
The remaining precedence relations are derived using the same method as the previously demonstrated cases.

5.3.6. Operator Precedence Table

At the end of the algorithm execution, operator precedence Table 16 is obtained.

5.3.7. Bottom-Up Parsing

Let a a b b c d c be a string obtained by rightmost derivation; then, applying Algorithm 1 results in the bottom-up parsing detailed in Table 17.
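Example 3 can also be reproduced end to end with the earlier sketches (again using their illustrative names). Because the grammar contains the adjacent non-terminal pairs A B and B C, with B ⟶ ε, it exercises rules (E.1) and (E.3) of Algorithm 5:

    g3 = {'S': [['A', 'B', 'C']],
          'A': [['a', 'A'], ['a']],
          'B': [['b', 'B'], ['b'], []],
          'C': [['C', 'D', 'c'], ['c']],
          'D': [['d']]}
    L3  = {A: left(A, g3) for A in g3}
    R3  = {A: right(A, g3) for A in g3}
    Lm3 = {A: leftmost(A, g3) for A in g3}
    precedence_parse(['a', 'a', 'b', 'b', 'c', 'd', 'c'],
                     build_table(g3, 'S', L3, R3, Lm3))   # Accept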

5.4. Discussion

In the examples, three CFGs are considered: one OPG and two non-OPGs. In the first CFG, it is observed that it complies with the basic rule, while the following ones do not since they present productions with consecutive non-terminal symbols on the right side. Regardless of the nature of the CFG, the proposed algorithms play a crucial role in obtaining the operator precedence table. Using Algorithm 1 applied to a specific string for each CFG, a bottom-up parsing is performed, verifying that the strings are generated by their respective grammars. This process underscores the functionality of the proposal to perform bottom-up parsing based on operator precedence, even when the CFGs do not fulfill the basic requirement of the OPGs.

6. Exceptions

If the CFG has any of the following forms, the rules outlined in the algorithms of this paper would not apply:
  • A ⟶ a B b
    B ⟶ β a
    The sets for B in the second production are:
    L(B) = {a}.
    R(B) = {a}.
    The precedence relations based on the sections of Algorithm 5 exhibit:
    • For rule (C): a ⋖ a and u = a.
    • For rule (D.1): a ⋗ b.
    • For rule (D.2): a ≐ b.
    Applying rules (D.1) and (D.2) generates the conflict.
  • A ⟶ a B b
    B ⟶ a β
    where β ⟹∗ ε ∨ β ⟹∗ γ a.
    According to the established conditions, the sets for B in the second production are:
    L(B) = {a}.
    R(B) = {a}.
    The precedence relations based on the sections of Algorithm 5 exhibit:
    • For rule (C): a ⋖ a and u = a.
    • For rule (D.1): a ⋗ b.
    • For rule (D.2): a ≐ b.
    The application of rules (D.1) and (D.2) generates the conflict.
    If R(B) does not contain a, then the conflict does not arise.
  • A ⟶ b B a
    B ⟶ a β
    The sets for B in the second production are:
    L(B) = {a}.
    R(B) = {a}.
    The precedence relations based on the sections of Algorithm 5 exhibit:
    • For rule (C): b ⋖ a and u = b.
    • For rule (D.1): a ⋗ a.
    • For rule (D.2): b ≐ a.
    The application of rules (C) and (D.2) generates the conflict.
  • A ⟶ b B a
    B ⟶ β a
    where β ⟹∗ ε ∨ β ⟹∗ a γ.
    According to the established conditions, the sets for B in the second production are:
    L(B) = {a}.
    R(B) = {a}.
    The precedence relations based on the sections of Algorithm 5 exhibit:
    • For rule (C): b ⋖ a and u = b.
    • For rule (D.1): a ⋗ a.
    • For rule (D.2): b ≐ a.
    The application of rules (C) and (D.2) generates the conflict.
    If L(B) does not contain a, then the conflict does not arise.
  • When CFG productions take the following form, there will be no conflicts:
  • A ⟶ a B b
  • B ⟶ γ
  • where γ neither has a at its right end nor b at its left end.
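A concrete instance of the first exception can be checked with the sketches above (using their illustrative names): in the grammar A ⟶ a B b, B ⟶ a, we have L(B) = R(B) = {a}, and the conflict between rules (D.1) and (D.2) shows up through the warning added to build_table:

    g_ex = {'A': [['a', 'B', 'b']], 'B': [['a']]}
    Lx  = {N: left(N, g_ex) for N in g_ex}
    Rx  = {N: right(N, g_ex) for N in g_ex}
    Lmx = {N: leftmost(N, g_ex) for N in g_ex}
    build_table(g_ex, 'A', Lx, Rx, Lmx)
    # prints: precedence conflict at ('a', 'b')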

7. Conclusions

Previous work has addressed the application of OPGs to solve problems related to the parsing of various forms of languages; however, not much emphasis has been placed on the definition of the Left and Right sets nor on the generation of an algorithm to find the operator precedence table.
In this work, we have redefined the Right and Left sets previously presented by other authors and introduced a novel set called Leftmost. These sets form the basis of an algorithm that constructs the operator precedence table, and each set is itself obtained by an algorithm that facilitates its computation, adding a new dimension to the existing research.
The proposed algorithm for constructing the precedence table, while breaking (with some exceptions) the basic definition of an OPG, opens up new possibilities. It allows for the construction of an operator precedence table from a non-OPG CFG, paving the way for a bottom-up parsing algorithm to recognize a string generated by the CFG.
Three examples of CFGs are shown: one OPG and two non-OPGs. The proposed algorithms are applied to these examples systematically, step by step, until the operator precedence tables are obtained, regardless of the CFG's nature. Additionally, the bottom-up parsing algorithm is applied to specific strings for each grammar, verifying that the grammars generate them.
It is important to note that, while the algorithm and the definition of the new sets of terminal symbols are significant advancements, they do have limitations: it is not always possible to solve every case in which two or more adjacent non-terminal symbols appear on the right-hand side. There are exceptions, described in Section 6, and we must be aware of them.

Author Contributions

Conceptualization, L.L. and J.M.; methodology, J.M.; software, L.L.; validation, L.L. and J.M.; formal analysis, L.L., E.A., and J.M.; investigation, L.L. and J.M.; resources, J.M.; data curation, E.A.; writing—original draft preparation, E.A. and J.M.; writing—review and editing, E.A. and J.M.; visualization, E.A.; supervision, J.M.; project administration, J.M.; funding acquisition, J.M.; All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding and the APC was funded by Universidad del Norte.

Data Availability Statement

Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Floyd, R.W. Syntactic analysis and operator precedence. J. ACM 1963, 10, 316–333.
  2. Aho, A.V.; Lam, M.S.; Sethi, R.; Ullman, J.D. Compilers: Principles, Techniques, and Tools; Pearson Education: London, UK, 2007.
  3. Grune, D.; Jacobs, C.J.H. Parsing Techniques (Monographs in Computer Science); Springer: Berlin/Heidelberg, Germany, 2006.
  4. Crespi-Reghizzi, S. An effective model for grammar inference. In Information Processing 71; Gilchrist, B., Ed.; Elsevier/North-Holland: New York, NY, USA, 1972; pp. 524–529.
  5. Crespi-Reghizzi, S. Reduction of enumeration in grammar acquisition. In Proceedings of the 2nd International Joint Conference on Artificial Intelligence, San Francisco, CA, USA, 1–3 September 1971; pp. 546–552.
  6. Crespi-Reghizzi, S.; Melkanoff, M.A.; Lichten, L. The use of grammatical inference for designing programming languages. Commun. ACM 1973, 16, 83–90.
  7. von Braunmühl, B.; Verbeek, R. Input driven languages are recognized in log n space. In Topics in the Theory of Computation; North-Holland Mathematics Studies; Karpinski, M., van Leeuwen, J., Eds.; Elsevier: Amsterdam, The Netherlands, 1985; Volume 102, pp. 1–19.
  8. Mehlhorn, K. Pebbling mountain ranges and its application to DCFL-recognition. In Automata, Languages and Programming; de Bakker, J., van Leeuwen, J., Eds.; Springer: Berlin/Heidelberg, Germany, 1980; pp. 422–435.
  9. Mandrioli, D.; Pradella, M. Generalizing input-driven languages: Theoretical and practical benefits. Comput. Sci. Rev. 2018, 27, 61–87.
  10. Alur, R.; Madhusudan, P. Adding nesting structure to words. J. ACM 2009, 56, 1–43.
  11. Crespi Reghizzi, S.; Mandrioli, D. Operator precedence and the visibly pushdown property. J. Comput. Syst. Sci. 2012, 78, 1837–1867.
  12. Crespi Reghizzi, S.; Pradella, M. Beyond operator-precedence grammars and languages. J. Comput. Syst. Sci. 2020, 113, 18–41.
  13. Barenghi, A.; Crespi Reghizzi, S.; Mandrioli, D.; Panella, F.; Pradella, M. Parallel parsing made practical. Sci. Comput. Program. 2015, 112, 195–226.
  14. Crespi-Reghizzi, S.; Mandrioli, D.; Martin, D.F. Algebraic properties of operator precedence languages. Inf. Control 1978, 37, 115–133.
  15. Chomsky, N. Three models for the description of language. IRE Trans. Inf. Theory 1956, 2, 113–124.
  16. Aho, A.V.; Ullman, J.D. The Theory of Parsing, Translation, and Compiling; Prentice-Hall: Englewood Cliffs, NJ, USA, 1973; Volume 1.
  17. Louden, K.C. Compiler Construction: Principles and Practice; Course Technology/PWS Publishing Co.: Boston, MA, USA, 1997.
Figure 1. Derivation of the string d f a c b and the creation sequence of the Leftmost(B) set.
Table 1. Bottom-up parsing table.
Stack | Input | Action
$ | w$ |
… | … | Shift/Reduce
… | … | Shift/Reduce
$S | $ | Accept
Table 2. General productions and precedence relationships obtained.
Production | Relationship
A ⟶ a β b | a ≐ b
A ⟶ C β b, β ⟹∗ ε | R(C) ⋗ b
A ⟶ a β C | a ⋖ L(C)
A ⟶ B β C, β ⟹∗ ε | R(B) ⋗ Lm(C)
Table 3. Left Sets calculation for the CFG in Example 1.
Left(E)
For E E + T
1. Evaluating the first conditional of Algorithm 2. A = E 2. Evaluating the second conditional of Algorithm 2. A = E
β = ε β = E
B = E a = +
γ = + T γ = T
L ( E ) L ( E ) L ( E ) { + }
For E T
3. Evaluating the first conditional of Algorithm 2. A = E   
β = ε   
B = T   
γ = ε   
L ( E ) = { + } L ( T )   
 
Left(T)
For T T F
1. Evaluating the first conditional of Algorithm 2. A = T 2. Evaluating the second conditional of Algorithm 2. A = T
β = ε β = T
B = T a =
γ = F γ = F
L ( T ) L ( T ) L ( T ) { }
For T F
3. Evaluating the first conditional of Algorithm 2. A = T   
β = ε   
B = F   
γ = ε   
L ( T ) = { } L ( F )   
 
Left(F)
For F ( E )  For F i d  
1. Evaluating the second conditional of Algorithm 2. A = F 2. Evaluating the second conditional of Algorithm 2. A = F
β = ε β = ε
a = ( a = i d
γ = E ) γ = ε
L ( F ) { ( } L ( F ) = { ( , i d }
Table 4. Right Sets calculation for the CFG in Example 1.
Right(E)
For E E + T
1. Evaluating the first conditional of Algorithm 4. A = E 2. Evaluating the second conditional of Algorithm 4. A = E
γ = E + γ = E
B = T a = +
β = ε β = T
R ( E ) R ( T ) R ( E ) R ( T ) { + }
For E T
3. Evaluating the first conditional of Algorithm 4. A = E   
γ = ε   
B = T   
β = ε   
R ( E ) = R ( T ) { + }   
 
Right(T)
For T T F
1. Evaluating the first conditional of Algorithm 4. A = T 2. Evaluating the second conditional of Algorithm 4. A = T
γ = T γ = T
B = F a =
β = ε β = F
R ( T ) R ( F ) R ( T ) R ( F ) { }
For T F
3. Evaluating the first conditional of Algorithm 4. A = T   
γ = ε   
B = F   
β = ε   
R ( T ) = R ( F ) { }   
 
Right(F)
For F ( E )  For F i d  
1. Evaluating the second conditional of Algorithm 2. A = F 2. Evaluating the second conditional of Algorithm 2. A = F
γ = ( E γ = ε
a = ) a = i d
β = ε β = ε
R ( F ) { ) } R ( F ) = { ) , i d }
Table 5. Leftmost Sets calculation for the CFG in Example 1.
Leftmost(E)
For E E + T  For E T  
1. Evaluating the first conditional of Algorithm 3. A = E 2. Evaluating the first conditional of Algorithm 3. A = E
β = ε β = ε
B = E B = T
γ = + T γ = ε
L m ( E ) L m ( E ) L m ( E ) = L m ( T )
 
Leftmost(T)
For T T F  For T F  
1. Evaluating the first conditional of Algorithm 3. A = T 2. Evaluating the first conditional of Algorithm 3. A = T
β = ε β = ε
B = T B = F
γ = F γ = ε
L m ( T ) L m ( T ) L m ( T ) = L m ( F )
 
Leftmost(F)
For F ( E )  For F i d  
1. Evaluating the second conditional of Algorithm 3. A = F 2. Evaluating the second conditional of Algorithm 3. A = F
β = ε β = ε
a = ( a = i d
γ = E ) γ = ε
L m ( F ) { ( } L m ( F ) = { ( , i d }
Table 6. Precedence Table of Example 1 (rows: symbol a on the stack; columns: incoming symbol b; empty cells denote error).
    | +  | ∗  | (  | )  | id | $
+   | ⋗  | ⋖  | ⋖  | ⋗  | ⋖  | ⋗
∗   | ⋗  | ⋗  | ⋖  | ⋗  | ⋖  | ⋗
(   | ⋖  | ⋖  | ⋖  | ≐  | ⋖  |
)   | ⋗  | ⋗  |    | ⋗  |    | ⋗
id  | ⋗  | ⋗  |    | ⋗  |    | ⋗
$   | ⋖  | ⋖  | ⋖  |    | ⋖  |
Table 7. Bottom-up Parsing of Example 1.

Stack | Input | Action
$ | id + ( ( id + id ) ∗ ( id ) ) ∗ id $ | Shift
$ id | + ( ( id + id ) ∗ ( id ) ) ∗ id $ | Reduce
$ | + ( ( id + id ) ∗ ( id ) ) ∗ id $ | Shift
$ + | ( ( id + id ) ∗ ( id ) ) ∗ id $ | Shift
$ + ( | ( id + id ) ∗ ( id ) ) ∗ id $ | Shift
$ + ( ( | id + id ) ∗ ( id ) ) ∗ id $ | Shift
$ + ( ( id | + id ) ∗ ( id ) ) ∗ id $ | Reduce
$ + ( ( | + id ) ∗ ( id ) ) ∗ id $ | Shift
$ + ( ( + | id ) ∗ ( id ) ) ∗ id $ | Shift
$ + ( ( + id | ) ∗ ( id ) ) ∗ id $ | Reduce
$ + ( ( + | ) ∗ ( id ) ) ∗ id $ | Reduce
$ + ( ( | ) ∗ ( id ) ) ∗ id $ | Shift
$ + ( ( ) | ∗ ( id ) ) ∗ id $ | Reduce
$ + ( | ∗ ( id ) ) ∗ id $ | Shift
$ + ( ∗ | ( id ) ) ∗ id $ | Shift
$ + ( ∗ ( | id ) ) ∗ id $ | Shift
$ + ( ∗ ( id | ) ) ∗ id $ | Reduce
$ + ( ∗ ( | ) ) ∗ id $ | Shift
$ + ( ∗ ( ) | ) ∗ id $ | Reduce
$ + ( ∗ | ) ∗ id $ | Reduce
$ + ( | ) ∗ id $ | Shift
$ + ( ) | ∗ id $ | Reduce
$ + | ∗ id $ | Shift
$ + ∗ | id $ | Shift
$ + ∗ id | $ | Reduce
$ + ∗ | $ | Reduce
$ + | $ | Reduce
$ | $ | Accept
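The Shift/Reduce column of Table 7 is driven entirely by Table 6: the parser compares the topmost stack terminal with the lookahead, shifts on ⋖ or ≐, and on ⋗ pops a handle back to the most recent ⋖. A skeletal loop that reproduces those decisions follows (terminals only, no parse tree; table is the relation map built in the previous sketch):

# Illustrative sketch: a minimal operator-precedence parsing loop that
# reproduces the Shift/Reduce/Accept column of Table 7. Non-terminals are
# not tracked and reductions are not mapped back to productions.
def op_parse(tokens, table):
    stack = ["$"]                          # terminals only, with the end marker
    tokens = tokens + ["$"]
    i = 0
    while True:
        top, look = stack[-1], tokens[i]
        if top == "$" and look == "$":
            print("Accept")
            return True
        rel = table.get((top, look))
        if rel in ("<", "="):              # <. or =. : shift the lookahead
            print("Shift", look)
            stack.append(look)
            i += 1
        elif rel == ">":                   # .> : reduce a handle
            print("Reduce")
            popped = stack.pop()
            while table.get((stack[-1], popped)) != "<":
                popped = stack.pop()       # pop back to the most recent <.
        else:
            print("Error at", look)
            return False

tokens = ["id", "+", "(", "(", "id", "+", "id", ")", "*",
          "(", "id", ")", ")", "*", "id"]
op_parse(tokens, table)                    # expected: the action sequence of Table 7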
Table 8. Left Sets calculation for the CFG in Example 2.

Left(S)
For S → S D ;:
  1. First conditional of Algorithm 2 (A = S): β = ε, B = S, γ = D ; ⇒ L(S) ⊇ L(S)
  2. First conditional of Algorithm 2 (A = S): β = S, B = D, γ = ; ⇒ L(S) ⊇ L(D)
  3. Second conditional of Algorithm 2 (A = S): β = S D, a = ;, γ = ε ⇒ L(S) ⊇ L(D) ∪ {;}
For S → D ;:
  4. First conditional of Algorithm 2 (A = S): β = ε, B = D, γ = ; ⇒ L(S) ⊇ L(D) ∪ {;}
  5. Second conditional of Algorithm 2 (A = S): β = D, a = ;, γ = ε ⇒ L(S) = L(D) ∪ {;}

Left(D)
For D → T id ( L ):
  1. First conditional of Algorithm 2 (A = D): β = ε, B = T, γ = id ( L ) ⇒ L(D) ⊇ L(T)
  2. Second conditional of Algorithm 2 (A = D): β = T, a = id, γ = ( L ) ⇒ L(D) = L(T) ∪ {id}

Left(T)
For T → T ∗:
  1. First conditional of Algorithm 2 (A = T): β = ε, B = T, γ = ∗ ⇒ L(T) ⊇ L(T)
  2. Second conditional of Algorithm 2 (A = T): β = T, a = ∗, γ = ε ⇒ L(T) ⊇ {∗}
For T → int:
  3. Second conditional of Algorithm 2 (A = T): β = ε, a = int, γ = ε ⇒ L(T) = {∗, int}

Left(L)
For L → I:
  1. First conditional of Algorithm 2 (A = L): β = ε, B = I, γ = ε ⇒ L(L) = L(I)

Left(I)
For I → T:
  1. First conditional of Algorithm 2 (A = I): β = ε, B = T, γ = ε ⇒ L(I) ⊇ L(T)
For I → T , I:
  2. First conditional of Algorithm 2 (A = I): β = ε, B = T, γ = , I ⇒ L(I) ⊇ L(T)
  3. Second conditional of Algorithm 2 (A = I): β = T, a = ,, γ = I ⇒ L(I) = L(T) ∪ {,}
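Unlike Example 1, Example 2 is not an OPG: the production S → S D ; places the non-terminals S and D side by side, which is exactly the situation the proposed algorithm is designed to handle. The violation is easy to detect mechanically; a small sketch over the productions listed in Tables 8–10 (the EXAMPLE_2 encoding is ours, and an ε-alternative for L, if the grammar has one, would not change the outcome):

# Illustrative sketch: report productions that violate the operator-grammar
# condition, i.e. that contain two adjacent non-terminals.
EXAMPLE_2 = {
    "S": [["S", "D", ";"], ["D", ";"]],
    "D": [["T", "id", "(", "L", ")"]],
    "T": [["T", "*"], ["int"]],
    "L": [["I"]],
    "I": [["T"], ["T", ",", "I"]],
}

def adjacent_nonterminals(grammar):
    nts = set(grammar)
    return [(A, body) for A, bodies in grammar.items() for body in bodies
            if any(x in nts and y in nts for x, y in zip(body, body[1:]))]

print(adjacent_nonterminals(EXAMPLE_2))
# expected: [('S', ['S', 'D', ';'])], so Example 2 is not an OPG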
Table 9. Right Sets calculation for the CFG in Example 2.

Right(S)
For S → S D ;:
  1. Second conditional of Algorithm 4 (A = S): γ = S D, a = ;, β = ε ⇒ R(S) ⊇ {;}
For S → D ;:
  2. Second conditional of Algorithm 4 (A = S): γ = D, a = ;, β = ε ⇒ R(S) = {;}

Right(D)
For D → T id ( L ):
  1. Second conditional of Algorithm 4 (A = D): γ = T id ( L, a = ), β = ε ⇒ R(D) = {)}

Right(T)
For T → T ∗:
  1. Second conditional of Algorithm 4 (A = T): γ = T, a = ∗, β = ε ⇒ R(T) ⊇ {∗}
For T → int:
  2. Second conditional of Algorithm 4 (A = T): γ = ε, a = int, β = ε ⇒ R(T) = {∗, int}

Right(L)
For L → I:
  1. First conditional of Algorithm 4 (A = L): γ = ε, B = I, β = ε ⇒ R(L) = R(I)

Right(I)
For I → T:
  1. First conditional of Algorithm 4 (A = I): γ = ε, B = T, β = ε ⇒ R(I) ⊇ R(T)
For I → T , I:
  2. First conditional of Algorithm 4 (A = I): γ = T ,, B = I, β = ε ⇒ R(I) ⊇ R(T) ∪ R(I)
  3. Second conditional of Algorithm 4 (A = I): γ = T, a = ,, β = I ⇒ R(I) = R(T) ∪ {,}
Table 10. Leftmost Sets calculation for the CFG in Example 2.

Leftmost(S)
For S → S D ;:
  1. First conditional of Algorithm 3 (A = S): β = ε, B = S, γ = D ; ⇒ Lm(S) ⊇ Lm(S)
For S → D ;:
  2. First conditional of Algorithm 3 (A = S): β = ε, B = D, γ = ; ⇒ Lm(S) = Lm(D)

Leftmost(D)
For D → T id ( L ):
  1. First conditional of Algorithm 3 (A = D): β = ε, B = T, γ = id ( L ) ⇒ Lm(D) = Lm(T)

Leftmost(T)
For T → T ∗:
  1. First conditional of Algorithm 3 (A = T): β = ε, B = T, γ = ∗ ⇒ Lm(T) ⊇ Lm(T)
For T → int:
  2. Second conditional of Algorithm 3 (A = T): β = ε, a = int, γ = ε ⇒ Lm(T) = {int}

Leftmost(L)
For L → I:
  1. First conditional of Algorithm 3 (A = L): β = ε, B = I, γ = ε ⇒ Lm(L) = Lm(I)

Leftmost(I)
For I → T:
  1. First conditional of Algorithm 3 (A = I): β = ε, B = T, γ = ε ⇒ Lm(I) ⊇ Lm(T)
For I → T , I:
  2. First conditional of Algorithm 3 (A = I): β = ε, B = T, γ = , I ⇒ Lm(I) = Lm(T)
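Running the earlier left_sets and leftmost_sets sketches on the EXAMPLE_2 encoding reproduces the split that Tables 8 and 10 make visible: ∗ belongs to Left(T) but not to Leftmost(T), because ∗ can never be the first terminal of a string derived from T.

# Usage of the earlier sketches (left_sets, leftmost_sets, EXAMPLE_2).
print(left_sets(EXAMPLE_2)["T"])       # expected: {'*', 'int'}            (Table 8)
print(leftmost_sets(EXAMPLE_2)["T"])   # expected: {'int'}                 (Table 10)
print(left_sets(EXAMPLE_2)["S"])       # expected: {';', 'id', '*', 'int'} (Table 8)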
Table 11. Precedence Table of Example 2.
(The table ranges over the terminals ;, id, (, ), ∗, int, ,, and $.)
Table 12. Bottom-up Parsing of Example 2.

Stack | Input | Action
$ | int id ( ) ; int id ( int , int ) ; $ | Shift
$ int | id ( ) ; int id ( int , int ) ; $ | Reduce
$ | id ( ) ; int id ( int , int ) ; $ | Shift
$ id | ( ) ; int id ( int , int ) ; $ | Shift
$ id ( | ) ; int id ( int , int ) ; $ | Shift
$ id ( ) | ; int id ( int , int ) ; $ | Reduce
$ | ; int id ( int , int ) ; $ | Shift
$ ; | int id ( int , int ) ; $ | Reduce
$ | int id ( int , int ) ; $ | Shift
$ int | id ( int , int ) ; $ | Reduce
$ | id ( int , int ) ; $ | Shift
$ id | ( int , int ) ; $ | Shift
$ id ( | int , int ) ; $ | Shift
$ id ( int | , int ) ; $ | Reduce
$ id ( | , int ) ; $ | Shift
$ id ( , | int ) ; $ | Shift
$ id ( , int | ) ; $ | Reduce
$ id ( , | ) ; $ | Reduce
$ id ( | ) ; $ | Shift
$ id ( ) | ; $ | Reduce
$ | ; $ | Shift
$ ; | $ | Reduce
$ | $ | Accept
Table 13. Left Sets calculation for the CFG in Example 3.

Left(S)
For S → A B C:
  1. First conditional of Algorithm 2 (A = S): β = ε, B = A, γ = B C ⇒ L(S) ⊇ L(A)
  2. First conditional of Algorithm 2 (A = S): β = A, B = B, γ = C ⇒ L(S) ⊇ L(A) ∪ L(B)
  3. First conditional of Algorithm 2 (A = S): β = A B, B = C, γ = ε ⇒ L(S) = L(A) ∪ L(B) ∪ L(C)

Left(A)
For A → a A:
  1. Second conditional of Algorithm 2 (A = A): β = ε, a = a, γ = A ⇒ L(A) ⊇ {a}
  2. First conditional of Algorithm 2 (A = A): β = a, B = A, γ = ε ⇒ L(A) ⊇ {a} ∪ L(A)
For A → a:
  3. Second conditional of Algorithm 2 (A = A): β = ε, a = a, γ = ε ⇒ L(A) = {a}

Left(B)
For B → b B:
  1. Second conditional of Algorithm 2 (A = B): β = ε, a = b, γ = B ⇒ L(B) ⊇ {b}
  2. First conditional of Algorithm 2 (A = B): β = b, B = B, γ = ε ⇒ L(B) ⊇ {b} ∪ L(B)
For B → b:
  3. Second conditional of Algorithm 2 (A = B): β = ε, a = b, γ = ε ⇒ L(B) = {b}

Left(C)
For C → C D c:
  1. First conditional of Algorithm 2 (A = C): β = ε, B = C, γ = D c ⇒ L(C) ⊇ L(C)
  2. First conditional of Algorithm 2 (A = C): β = C, B = D, γ = c ⇒ L(C) ⊇ L(D)
  3. Second conditional of Algorithm 2 (A = C): β = C D, a = c, γ = ε ⇒ L(C) ⊇ L(D) ∪ {c}
For C → c:
  4. Second conditional of Algorithm 2 (A = C): β = ε, a = c, γ = ε ⇒ L(C) = L(D) ∪ {c}

Left(D)
For D → d:
  1. Second conditional of Algorithm 2 (A = D): β = ε, a = d, γ = ε ⇒ L(D) = {d}
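Example 3 stacks three non-terminals in S → A B C, yet Table 13 still resolves L(S) to L(A) ∪ L(B) ∪ L(C), since every terminal of B and C can occur with only non-terminals to its left. The left_sets sketch given earlier reproduces this (the EXAMPLE_3 encoding is ours):

# Usage of the earlier left_sets sketch on the grammar of Example 3.
EXAMPLE_3 = {
    "S": [["A", "B", "C"]],
    "A": [["a", "A"], ["a"]],
    "B": [["b", "B"], ["b"]],
    "C": [["C", "D", "c"], ["c"]],
    "D": [["d"]],
}
print(left_sets(EXAMPLE_3)["S"])   # expected: {'a', 'b', 'c', 'd'}, as in Table 13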
Table 14. Right Sets calculation for the CFG in Example 3.

Right(S)
For S → A B C:
  1. First conditional of Algorithm 4 (A = S): γ = A B, B = C, β = ε ⇒ R(S) = R(C)

Right(A)
For A → a A:
  1. First conditional of Algorithm 4 (A = A): γ = a, B = A, β = ε ⇒ R(A) ⊇ R(A)
  2. Second conditional of Algorithm 4 (A = A): γ = ε, a = a, β = A ⇒ R(A) ⊇ {a}
For A → a:
  3. Second conditional of Algorithm 4 (A = A): γ = ε, a = a, β = ε ⇒ R(A) = {a}

Right(B)
For B → b B:
  1. First conditional of Algorithm 4 (A = B): γ = b, B = B, β = ε ⇒ R(B) ⊇ R(B)
  2. Second conditional of Algorithm 4 (A = B): γ = ε, a = b, β = B ⇒ R(B) ⊇ {b}
For B → b:
  3. Second conditional of Algorithm 4 (A = B): γ = ε, a = b, β = ε ⇒ R(B) = {b}

Right(C)
For C → C D c:
  1. Second conditional of Algorithm 4 (A = C): γ = C D, a = c, β = ε ⇒ R(C) ⊇ {c}
For C → c:
  2. Second conditional of Algorithm 4 (A = C): γ = ε, a = c, β = ε ⇒ R(C) = {c}

Right(D)
For D → d:
  1. Second conditional of Algorithm 4 (A = D): γ = ε, a = d, β = ε ⇒ R(D) = {d}
Table 15. Leftmost Sets calculation for the CFG in Example 3.

Leftmost(S)
For S → A B C:
  1. First conditional of Algorithm 3 (A = S): β = ε, B = A, γ = B C ⇒ Lm(S) = Lm(A)

Leftmost(A)
For A → a A:
  1. Second conditional of Algorithm 3 (A = A): β = ε, a = a, γ = A ⇒ Lm(A) ⊇ {a}
For A → a:
  2. Second conditional of Algorithm 3 (A = A): β = ε, a = a, γ = ε ⇒ Lm(A) = {a}

Leftmost(B)
For B → b B:
  1. Second conditional of Algorithm 3 (A = B): β = ε, a = b, γ = B ⇒ Lm(B) ⊇ {b}
For B → b:
  2. Second conditional of Algorithm 3 (A = B): β = ε, a = b, γ = ε ⇒ Lm(B) = {b}

Leftmost(C)
For C → C D c:
  1. First conditional of Algorithm 3 (A = C): β = ε, B = C, γ = D c ⇒ Lm(C) ⊇ Lm(C)
For C → c:
  2. Second conditional of Algorithm 3 (A = C): β = ε, a = c, γ = ε ⇒ Lm(C) = {c}

Leftmost(D)
For D → d:
  1. Second conditional of Algorithm 3 (A = D): β = ε, a = d, γ = ε ⇒ Lm(D) = {d}
Table 16. Precedence Table of Example 3.
(The table ranges over the terminals a, b, c, d, and $.)
Table 17. Bottom-up Parsing of Example 3.

Stack | Input | Action
$ | a a b b c d c $ | Shift
$ a | a b b c d c $ | Shift
$ a a | b b c d c $ | Reduce
$ a | b b c d c $ | Reduce
$ | b b c d c $ | Shift
$ b | b c d c $ | Shift
$ b b | c d c $ | Reduce
$ b | c d c $ | Reduce
$ | c d c $ | Shift
$ c | d c $ | Reduce
$ | d c $ | Shift
$ d | c $ | Reduce
$ | c $ | Shift
$ c | $ | Reduce
$ | $ | Accept