Hello, OnlineGDB Q&A section lets you put your programming query to fellow community users. Asking a solution for whole assignment is strictly not allowed. You may ask for help where you are stuck. Try to add as much information as possible so that fellow users can know about your problem statement easily.

Is this code correct using recursive descent parser? And how to get input from the input.txt file?

+3 votes
asked Jul 16, 2021 by npatel300 (1,440 points)
//A -> I = E
//E -> P O P | P
//O -> + | - | * | / | **
//P -> I | L | UI | UL | (E)
//U -> + | - | !
//I -> C | CI
//C -> a | b | ... | y | z
//L -> D | DL
//D -> 0 | 1 | ... | 8 | 9

#include <iostream>
#include <string>
using std::cout;
using std::endl;
using std::string;

bool A(void);
bool E(void);
bool O(void);
bool P(void);
bool U(void);
bool I(void);
bool C(void);
bool L(void);
bool D(void);

string s;

int main(int argc, char *argv[])
{
    string t;
    s = t = argc == 2 ? argv[1] : "";
    if (A() && s == "")
    {
        cout << "The string \"" << t << "\" is in the language." << endl;
    }
    else
    {
        cout << "The string \"" << t << "\" is not in the language." << endl;
    }
    return 0;
}

bool A(void)
{
    if (E == 0)
    {
        I == E;
        return true;
    }
    return false;
}

bool E(void)
{
    if (P())
    {
        if(s[0] == 'P' && s[0] == 'O' && s[0] == 'P')
        {
            return true;
        }
    }
    return false;
}

bool O(void)
{
    if(s[0] == '+' || s[0] == '-')
    {
        s = s.substr(1);
        if (O()){
            return true;
        }
    }
    if(s[0] == '*' || s[0] == '/' || s[0] == '**')
    {
        s = s.substr(1);
        if (O())
        {
            return true;
        }
        return false;
    }
}

bool P(void)
{
    if(I())
    {
        return true;
    }
    else if (L())
    {
        return true;
    }
    if(s[0] == 'U' && s[0] == 'L')
    {
        return true;
    }
    else if(s[0] == 'U' && s[0] == 'I')
    {
        return true;
    }
    else if (s[0] == '(')
    {
        s = s.substr(1);
        if (E())
        {
            if (s[0] == ')')
            {
                s = s.substr(1);
                return true;
            }
        }
    }
    return false;
}

bool U(void)
{
    if(s[0] == '+' || s[0] == '-' || s[0] == '!')
    {
        s = s.substr(1);
        if (U())
        {
            return true;
        }
    }
    return false;
}

bool I(void)
{
    if (C())
    {
        return true;
    }
    else if (s[0] == 'C' && s[0] == 'I')
    {
       return true;
    }
    else if (s[0] == ')')
    {
        s = s.substr(1);
        return true;
    }
    return false;
}

bool C(void)
{
    if (s[0] == 'a')
    {
        return true;
    }
    else if (s[0] == 'b')
    {
        return true;
    }
    if (s[0] == ')')
    {
        s = s.substr(1);
        return true;
    }
    if ('b' <= s[0] && s[0] <= 'y' && s[0] <= 'z')
    {
        s = s.substr(1);
        return true;
    }
    return false;
}

bool L(void)
{
    if (D())
    {
        return true;
    }
    else if (s[0] == 'D' && s[0] == 'L')
    {
        return true;
    }
    return false;
}

bool D(void)
{
    if (s[0] == '0' || s[0] == '1')
    {
        return true;
    }
    else if (s[0] == '8' || s[0] == '9')
    {
        return true;
    }
    if (s[0] == ')')
    {
        s = s.substr(1);
        return true;
    }
    if ('1' <= s[0] && s[0] <= '8' && s[0] <= '9')
    {
        s = s.substr(1);
        return true;
    }
    return false;
}

2 Answers

0 votes
answered Jul 16, 2021 by Peter Minarik (86,860 points)
edited Jul 16, 2021 by Peter Minarik

The Parser

Okay... Parsers... An LL(k) Parser. I've never been great at these. :)

First of all, it would be nice to include some documentation what your program is supposed to do. What is the syntax of the rules you provided.

Your code is wrong in many-many places. For instance, you compare function pointers (I == E or E == 0). Other places instead of applying the rules P then O then P you search for literal letters 'P', 'O', 'P'.

So first of all, let's clear the requirements, let's clear the language that describes the rules.

The Rules

To my understanding the rules should be something, like this:

  • Assignment:      A -> I '=' E
  • Expression:      E -> P O P | P
  • Binary Operator: O -> '+' | '-' | '*' | '/' | '*''*'
  • Parameter:       P -> I | L | UI | UL | '(' E ')'
  • Unary Operator:  U -> '+' | '-' | '!'
  • Identifier:      I -> C | CI
  • Character:       C -> 'a' | 'b' | ... | 'y' | 'z'
  • Numeric Literal: L -> D | DL
  • Digit:           D -> '0' | '1' | ... | '8' | '9'

Where

  • ? ? ? means a symbol (?) is followed by another symbol (?). A symbol may be a rule or a literal character. E.g. P O P means apply the rule P then the rule O then the rule P
  • '?' means a character literal (e.g. the character '0' or character 'a' or '=' character).
  • X means a rule, e.g. the rule A or the rule E
  • | means OR, e.g. either plus or minus sign would be '+' | '-'

Notes

  • in rule O '*''*' means two asterisk operator following each other
  • spaces must not be included in the language that we want to parse. The parser would reject such sentences.

Examples

Below are a few examples that are part of the above defined language.

Positive Examples

  • area=a*b
  • area=(side**2)
  • zero=-1--1

Negative Examples

  • area = a * b --> no spaces allows
  • i=sqrt(-1) --> identifier must be followed by an operator (parenthesis is not an operator)
  • 0=-1--1 --> left side must be an identifier

Let's see the code

So, now we know how everything should work so we can focus on the code.

The main idea is that we're trying to apply the rules based on the current element. If a rule matches, only then we move to the next element in the list.

main.cpp

#include <iostream>

#include "Parser.h"

int main(int argc, char * argv[])
{
    Parser parser;
    const char * sentence = argc >= 2 ? argv[1] : "";
    if (parser.CanParse(sentence))
    {
        std::cout << '"' << sentence << "\" is part of the language." << std::endl;
        return 0;
    }
    else
    {
        std::cout << "Sentence \"" << sentence << "\" is not part of the language." << std::endl;
        std::cout << "Parsing stopped at offset " << parser.CurrentOffset() << " on character '" << parser.CurrentChar() << "'." << std::endl;
        return 1;
    }
}

Parser.h

#pragma once

#include <cstddef>

class Parser
{
private:
    const char * sentence; // The sentence to be parsed
    const char * current; // The current character being parsed
    
    bool IsCurrent(char ch); // Is the current a specific character? Increment current if true.
    bool IsCurrent(const char * chars); // Is the current character any of the chars (zero terminated string)
    bool IsCurrentInRange(char rangeStart, char rangeEnd); // Check if the current token is withing the [rangeStart, rangeEnd] (start and end inclusive) range
    
    bool A(); // A -> I '=' E
    bool E(); // E -> P O P | P
    bool O(); // O -> '+' | '-' | '*' | '/' | "**"
    bool P(); // P -> I | L | UI | UL | '(' E ')'
    bool U(); // U -> '+' | '-' | '!'
    bool I(); // I -> C | CI
    bool C(); // C -> 'a' | 'b' | ... | 'y' | 'z'
    bool L(); // L -> D | DL
    bool D(); // D -> '0' | '1' | ... | '8' | '9'
    
public:
    Parser();
    bool CanParse(const char * str);
    char CurrentChar() const;
    size_t CurrentOffset() const;
};

Parser.cpp

#include "Parser.h"

Parser::Parser() :
    sentence(nullptr),
    current(nullptr)
{ }

bool Parser::IsCurrent(char ch)
{
    if (*current == ch)
    {
        current++;
        return true;
    }
    
    return false;
}

bool Parser::IsCurrent(const char * chars)
{
    for (const char * ch = chars; *ch != '\0'; ch++)
        if (IsCurrent(*ch))
            return true;

    return false;
}

bool Parser::IsCurrentInRange(char rangeStart, char rangeEnd)
{
    if (rangeStart <= *current && *current <= rangeEnd)
    {
        current++;
        return true;
    }
    
    return false;
}

bool Parser::A() { return I() && IsCurrent('=') && E(); }

bool Parser::E()
{
    if (P())
    {
        const char * start = current;
        if (!(O() && P()))
            current = start; // Revert the pointer if the rules do not apply

        return true;
    }
    
    return false;
}

bool Parser::O()
{
    if (IsCurrent("+-/"))
        return true;
    
    if (IsCurrent('*'))
    {
        IsCurrent('*'); // Optinally we may consume another asterisk
        return true;
    }
}

bool Parser::P()
{
    if (I() || L())
        return true;
    
    const char * start = current;
    if (U() && (I() || L()))
        return true;
    
    current = start;
    if (IsCurrent('(') && E() && IsCurrent(')'))
        return true;
    
    current = start;
    return false;
}

bool Parser::U() { return IsCurrent("+-!"); }

bool Parser::I()
{
    if (C())
    {
        I(); // I is optional, so we don't want to return false if I failed
        return true;
    }
    
    return false;
}

bool Parser::C() { return IsCurrentInRange('a', 'z'); }

bool Parser::L()
{
    if (D())
    {
        L(); // L is optional, so we don't want to return false if L failed
        return true;
    }
    
    return false;
}

bool Parser::D() { return IsCurrentInRange('0', '9'); }

bool Parser::CanParse(const char * str)
{
    sentence = str;
    current = str;
    return A() && *current == '\0';
}

char Parser::CurrentChar() const { return current ? *current : '\0'; }

size_t Parser::CurrentOffset() const { return current - sentence; }
0 votes
answered Jul 16, 2021 by Peter Minarik (86,860 points)

Reading Input from a File

  • For C++, please, check this out.
  • For C, please, check this out instead.
commented Jul 17, 2021 by npatel300 (1,440 points)
I have my main.cc and input file separately, but where should I put my input in order to make it run?
commented Jul 17, 2021 by Peter Minarik (86,860 points)
You can add a new file to your project. Then you can open that file from your source code.

An alternative is to use the "Command Line Argument" (at the bottom of the screen, below the code editor) and redirect the file's content into the standard input. And in your code you read from the standard input.

So the command line argument would be something like "< myInput.txt" (without the quotes, of course).
Welcome to OnlineGDB Q&A, where you can ask questions related to programming and OnlineGDB IDE and and receive answers from other members of the community.
...