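1 Basic Concepts
1.1 Grammar Symbols
- Uppercase letters from the start of the alphabet denote nonterminals, e.g. $A$, $B$, $C$, $D$, $E$, $F$, $G$
- Lowercase letters from the start of the alphabet denote terminals, e.g. $a$, $b$, $c$
- Uppercase letters from the end of the alphabet denote grammar symbols (either a terminal or a nonterminal), e.g. $X$, $Y$, $Z$
1.2 Strings of Grammar Symbols
1.3 Productions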
$$\alpha \to \beta$$
1.4 Grammars
- Type-0 grammar, also called an unrestricted grammar: $$\alpha \to \beta$$
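- Type-1 grammar, also called a context-sensitive grammar: $$\alpha_1A\alpha_2 \to \alpha_1\beta\alpha_2,\quad |\beta| \ge 1$$ so the left-hand side of a production is never longer than its right-hand side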
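- Type-2 grammar, also called a context-free grammar: $$A \to \beta$$
- Type-3 grammar, also called a regular grammar, where $w$ is a string of terminals: either right-linear $$A \to wB \quad\text{or}\quad A \to w$$ or left-linear $$A \to Bw \quad\text{or}\quad A \to w$$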
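The four production shapes above can be checked mechanically. The following is a minimal sketch, not part of the project: it assumes every symbol is a single character, follows the convention from 1.1 (uppercase for nonterminals, lowercase for terminals), and represents $\varepsilon$ as the empty string. The function names are hypothetical.

```python
def is_context_free(lhs, rhs):
    # A -> beta: the left-hand side is exactly one nonterminal.
    return len(lhs) == 1 and lhs.isupper()


def is_right_linear(lhs, rhs):
    # A -> wB or A -> w, where w contains only terminals.
    if not is_context_free(lhs, rhs):
        return False
    body = rhs[:-1] if rhs[-1:].isupper() else rhs
    return all(c.islower() for c in body)


def is_left_linear(lhs, rhs):
    # A -> Bw or A -> w, where w contains only terminals.
    if not is_context_free(lhs, rhs):
        return False
    body = rhs[1:] if rhs[:1].isupper() else rhs
    return all(c.islower() for c in body)


def is_context_sensitive(lhs, rhs):
    # alpha1 A alpha2 -> alpha1 beta alpha2 with |beta| >= 1:
    # try every nonterminal of the left-hand side as the rewritten A.
    for i, symbol in enumerate(lhs):
        if not symbol.isupper():
            continue
        prefix, suffix = lhs[:i], lhs[i + 1:]
        if (len(rhs) >= len(prefix) + len(suffix) + 1
                and rhs.startswith(prefix) and rhs.endswith(suffix)):
            return True
    return False


def grammar_type(productions):
    """Return the most restrictive type (3, 2, 1 or 0) that fits all productions."""
    if all(is_right_linear(l, r) for l, r in productions):
        return 3
    if all(is_left_linear(l, r) for l, r in productions):
        return 3
    if all(is_context_free(l, r) for l, r in productions):
        return 2
    if all(is_context_sensitive(l, r) for l, r in productions):
        return 1
    return 0


if __name__ == "__main__":
    print(grammar_type([("S", "aS"), ("S", "b")]))    # right-linear productions
    print(grammar_type([("S", "aSb"), ("S", "ab")]))  # context-free productions
```

Note that a grammar mixing right-linear and left-linear productions is treated as type 2 here, because the regular forms must be used consistently within one grammar.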
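1.5 Overview of Parsing Methods
- Compute the FIRST sets
- Compute the FOLLOW sets
- Compute the SELECT sets
- Generate the predictive parsing table
- Generate the LR automaton
- Generate the predictive parsing table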
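The SELECT-set and predictive-table steps can be sketched as follows. This is an illustrative sketch under stated assumptions, not the project's code: the FIRST and FOLLOW sets are supplied by hand for a toy grammar, `"ε"` marks the empty string, `"$"` marks end of input, and all function names are hypothetical. It uses the usual textbook rule that SELECT($A \to \alpha$) is FIRST($\alpha$) without $\varepsilon$, plus FOLLOW($A$) when $\alpha$ can derive $\varepsilon$.

```python
EPSILON = "ε"
END = "$"


def first_of_sequence(symbols, first):
    """FIRST of a symbol string, given the FIRST sets of the nonterminals."""
    result = set()
    for symbol in symbols:
        # A terminal's FIRST set is just the terminal itself.
        symbol_first = first.get(symbol, {symbol})
        result |= symbol_first - {EPSILON}
        if EPSILON not in symbol_first:
            return result
    result.add(EPSILON)  # every symbol can vanish, or the string is empty
    return result


def select_sets(grammar, first, follow):
    """Map each production (head, body) to its set of lookahead terminals."""
    select = {}
    for head, alternatives in grammar.items():
        for body in alternatives:
            body_first = first_of_sequence(body, first)
            chosen = body_first - {EPSILON}
            if EPSILON in body_first:
                chosen |= follow[head]
            select[(head, tuple(body))] = chosen
    return select


def predictive_table(select):
    """Build table[(nonterminal, lookahead)] = body; fail on an LL(1) conflict."""
    table = {}
    for (head, body), lookaheads in select.items():
        for terminal in lookaheads:
            if (head, terminal) in table:
                raise ValueError(f"not LL(1): conflict at ({head}, {terminal})")
            table[(head, terminal)] = body
    return table


# Toy grammar S -> a S b | ε, with hand-written FIRST and FOLLOW sets.
grammar = {"S": [["a", "S", "b"], []]}
first = {"S": {"a", EPSILON}}
follow = {"S": {"b", END}}

select = select_sets(grammar, first, follow)
table = predictive_table(select)
```

Two productions of the same nonterminal that share a lookahead terminal make the table ambiguous, which is why the sketch raises an error instead of overwriting the cell.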
1.5.1 FIRST Sets
- Clearly, if $X$ is a terminal, then $First(X)=\{X\}$
1.5.2 FOLLOW Sets
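For nonterminals, FIRST sets are commonly computed by iterating to a fixed point. The sketch below is one way to do that, not the project's implementation: the grammar is a dictionary from nonterminal to a list of alternatives, an empty list stands for an $\varepsilon$ production, any symbol that is not a key of the dictionary is treated as a terminal, and the expression grammar used as input is only an illustrative example.

```python
EPSILON = "ε"


def first_sets(grammar):
    """Compute FIRST for every nonterminal by iterating until nothing changes."""
    first = {nonterminal: set() for nonterminal in grammar}

    def first_of(symbol):
        # For a terminal X, First(X) = {X}.
        return first[symbol] if symbol in grammar else {symbol}

    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for body in alternatives:
                before = len(first[head])
                nullable = True
                for symbol in body:
                    symbol_first = first_of(symbol)
                    first[head] |= symbol_first - {EPSILON}
                    if EPSILON not in symbol_first:
                        nullable = False
                        break
                if nullable:
                    # Every symbol of the body can derive ε (or the body is empty).
                    first[head].add(EPSILON)
                if len(first[head]) != before:
                    changed = True
    return first


# Illustrative grammar (not taken from the project):
# E -> T E'    E' -> + T E' | ε    T -> F T'    T' -> * F T' | ε    F -> ( E ) | id
grammar = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
first = first_sets(grammar)
```

The loop stops because each pass can only add symbols to a finite collection of sets.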
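2 Regular Expressions in Practice
- base package:
- Includes an NFA and a DFA
- Supports the following regular expression wildcards:
- Supports quantifiers
- Supports some escapes, including
- Supports
- The NFA supports capture groups (capture-group information is lost when an NFA is converted to a DFA, so the DFA does not support capture groups yet; no good solution has been found so far)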
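The NFA-to-DFA conversion mentioned above is usually done with the subset construction. The following is a generic sketch of that textbook technique, not the project's code; the transition-table layout and all names are assumptions made for illustration. Each DFA state is a set of NFA states, so the DFA no longer records which NFA path was taken, which is one common explanation for why group boundaries are hard to recover afterwards.

```python
from collections import deque


def epsilon_closure(states, eps):
    """All NFA states reachable from `states` using only ε-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in eps.get(state, ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return frozenset(closure)


def nfa_to_dfa(start, accepting, delta, eps):
    """Subset construction.

    delta: {(nfa_state, char): set of nfa_states}
    eps:   {nfa_state: set of nfa_states} for ε-transitions
    Returns (dfa_start, dfa_accepting, dfa_table) where DFA states are frozensets.
    """
    alphabet = sorted({char for (_, char) in delta})
    dfa_start = epsilon_closure({start}, eps)
    table = {}
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:
        current = queue.popleft()
        for char in alphabet:
            moved = set()
            for state in current:
                moved |= delta.get((state, char), set())
            if not moved:
                continue  # no transition: the DFA rejects on this character
            target = epsilon_closure(moved, eps)
            table[(current, char)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    dfa_accepting = {state for state in seen if state & accepting}
    return dfa_start, dfa_accepting, table


def dfa_accepts(start, accepting, table, text):
    state = start
    for char in text:
        state = table.get((state, char))
        if state is None:
            return False
    return state in accepting


# Hand-built NFA for the pattern a(b|c)*: 0 -a-> 1, 1 -ε-> 2, 2 -b-> 1, 2 -c-> 1.
delta = {(0, "a"): {1}, (2, "b"): {1}, (2, "c"): {1}}
eps = {1: {2}}
start, accepting, table = nfa_to_dfa(0, {2}, delta, eps)
```

With this input the construction yields two DFA states, `{0}` and `{1, 2}`, and calling `dfa_accepts(start, accepting, table, "abcb")` walks the table one character at a time.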
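3 Introduction to the Compilation Engine
- Lexical analysis
- Syntax analysis
- Semantic analysis
- Intermediate code generation
- Code optimization
- Storage management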
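4 The Hua Language in Practice
4.1 Grammar Definition
- Supports method definitions, method overloading, and method calls
- Supports variable definitions
- Supports the primitive types int and boolean
- Supports array types, including multidimensional arrays
- Supports the basic control-flow statements, including
  - if then
  - if then else
  - while
  - do while
  - for
  - conditional expression
- Supports binary operators
- Supports a few system functions; currently only
  - print(int)
  - print(boolean)
  - nextInt(int,int)
  - nextInt()
  - nextBoolean()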
Productions:
- <additive expression> → <additive expression> + <multiplicative expression> | <additive expression> - <multiplicative expression> | <multiplicative expression>
- <and expression> → <and expression> & <equality expression> | <equality expression>
- <argument list> → <argument list> , <expression> | <expression>
- <array access> → <expression name> <mark 286_1_1> [ <expression> ] | <primary no new array> [ <expression> ]
- <array creation expression> → new <primitive type> <dim exprs> <epsilon or dims>
- <array type> → <type> [ ]
- <assignment expression> → <assignment> | <conditional expression>
- <assignment operator> → %= | &= | *= | += | -= | /= | <<= | = | >>= | >>>= | ^= | |=
- <assignment> → <left hand side> <assignment operator> <mark 222_1_1> <assignment expression>
- <block statement> → <local variable declaration statement> | <statement>
- <block statements> → <block statement> | <block statements> <block statement>
- <block> → { <mark 139_1_1> <epsilon or block statements> }
- <boolean literal> → false | true
- <cast expression> → ( <primitive type> ) <unary expression> | ( <reference type> ) <unary expression not plus minus>
- <conditional and expression> → <conditional and expression> && <mark 232_2_1> <inclusive or expression> | <inclusive or expression>
- <conditional expression> → <conditional or expression> | <conditional or expression> ? <mark true block> <expression> : <mark false block> <conditional expression>
- <conditional or expression> → <conditional and expression> | <conditional or expression> || <mark 230_2_1> <conditional and expression>
- <decimal integer literal> → <decimal numeral>
- <decimal numeral> → 0 | <non zero digit> <epsilon or digits>
- <digit> → 0 | <non zero digit>
- <digits> → <digit> | <digits> <digit>
- <dim expr> → [ <expression> ]
- <dim exprs> → <dim expr> | <dim exprs> <dim expr>
- <dims> → [ ] | <dims> [ ]
- <do statement> → do <mark loop offset> <statement> while ( <expression> ) ;
- <empty statement> → ;
- <epsilon or argument list> → __ε__ | <argument list>
- <epsilon or block statements> → __ε__ | <block statements>
- <epsilon or digits> → __ε__ | <digits>
- <epsilon or dims> → __ε__ | <dims>
- <epsilon or expression> → __ε__ | <expression>
- <epsilon or for init> → __ε__ | <for init>
- <epsilon or for update> → __ε__ | <for update>
- <epsilon or formal parameter list> → __ε__ | <formal parameter list>
- <equality expression> → <equality expression> != <relational expression> | <equality expression> == <relational expression> | <relational expression>
- <exclusive or expression> → <and expression> | <exclusive or expression> ^ <and expression>
- <expression name> → @identifier
- <expression statement> → <statement expression> ;
- <expression> → <assignment expression>
- <floating-point type> → float
- <for init> → <local variable declaration> | <statement expression list>
- <for statement no short if> → for ( <mark before init> <epsilon or for init> ; <mark loop offset> <epsilon or expression> ; <mark before update> <epsilon or for update> ) <mark after update> <statement no short if>
- <for statement> → for ( <mark before init> <epsilon or for init> ; <mark loop offset> <epsilon or expression> ; <mark before update> <epsilon or for update> ) <mark after update> <statement>
- <for update> → <statement expression list>
- <formal parameter list> → <formal parameter list> , <formal parameter> | <formal parameter>
- <formal parameter> → <type> <mark 50_1_1> <variable declarator id>
- <if then else statement no short if> → if ( <expression> ) <mark true block> <statement no short if> else <mark false block> <statement no short if>
- <if then else statement> → if ( <expression> ) <mark true block> <statement no short if> else <mark false block> <statement>
- <if then statement> → if ( <expression> ) <mark true block> <statement>
- <inclusive or expression> → <exclusive or expression> | <inclusive or expression> | <exclusive or expression>
- <integer literal> → <decimal integer literal>
- <integral type> → int
- <left hand side> → <array access> | <expression name>
- <literal> → <boolean literal> | <integer literal>
- <local variable declaration statement> → <local variable declaration> ;
- <local variable declaration> → <type> <mark 146_1_1> <variable declarators>
- <mark 139_1_1> → __ε__
- <mark 146_1_1> → __ε__
- <mark 222_1_1> → __ε__
- <mark 230_2_1> → __ε__
- <mark 232_2_1> → __ε__
- <mark 286_1_1> → __ε__
- <mark 50_1_1> → __ε__
- <mark 66_2_1> → __ε__
- <mark 74_1_1> → __ε__
- <mark 74_1_2> → __ε__
- <mark after update> → __ε__
- <mark before init> → __ε__
- <mark before update> → __ε__
- <mark false block> → __ε__
- <mark loop offset> → __ε__
- <mark prefix expression> → __ε__
- <mark true block> → __ε__
- <method body> → ; | <block>
- <method declaration> → <mark 74_1_1> <method header> <mark 74_1_2> <method body>
- <method declarations> → <method declaration> | <method declarations> <method declaration>
- <method declarator> → @identifier ( <epsilon or formal parameter list> )
- <method header> → <result type> <method declarator>
- <method invocation> → <method name> ( <epsilon or argument list> )
- <method name> → @identifier
- <multiplicative expression> → <multiplicative expression> % <unary expression> | <multiplicative expression> * <unary expression> | <multiplicative expression> / <unary expression> | <unary expression>
- <non zero digit> → @nonZeroDigit
- <numeric type> → <floating-point type> | <integral type>
- <postdecrement expression> → <postfix expression> --
- <postfix expression> → <expression name> | <postdecrement expression> | <postincrement expression> | <primary>
- <postincrement expression> → <postfix expression> ++
- <predecrement expression> → -- <mark prefix expression> <unary expression>
- <preincrement expression> → ++ <mark prefix expression> <unary expression>
- <primary no new array> → ( <expression> ) | <array access> | <literal> | <method invocation>
- <primary> → <array creation expression> | <primary no new array>
- <primitive type> → boolean | <numeric type>
- <programs> → <method declarations>
- <reference type> → <array type>
- <relational expression> → <relational expression> < <shift expression> | <relational expression> <= <shift expression> | <relational expression> > <shift expression> | <relational expression> >= <shift expression> | <shift expression>
- <result type> → void | <type>
- <return statement> → return <epsilon or expression> ;
- <shift expression> → <additive expression> | <shift expression> << <additive expression> | <shift expression> >> <additive expression> | <shift expression> >>> <additive expression>
- <statement expression list> → <statement expression list> , <statement expression> | <statement expression>
- <statement expression> → <assignment> | <method invocation> | <postdecrement expression> | <postincrement expression> | <predecrement expression> | <preincrement expression>
- <statement no short if> → <for statement no short if> | <if then else statement no short if> | <statement without trailing substatement> | <while statement no short if>
- <statement without trailing substatement> → <block> | <do statement> | <empty statement> | <expression statement> | <return statement>
- <statement> → <for statement> | <if then else statement> | <if then statement> | <statement without trailing substatement> | <while statement>
- <type> → <primitive type> | <reference type>
- <unary expression not plus minus> → ! <unary expression> | ~ <unary expression> | <cast expression> | <postfix expression>
- <unary expression> → + <unary expression> | - <unary expression> | <predecrement expression> | <preincrement expression> | <unary expression not plus minus>
- <variable declarator id> → @identifier | <variable declarator id> [ ]
- <variable declarator> → <variable declarator id> | <variable declarator id> = <variable initializer>
- <variable declarators> → <variable declarator> | <variable declarators> , <mark 66_2_1> <variable declarator>
- <variable initializer> → <expression>
- <while statement no short if> → while ( <mark loop offset> <expression> ) <mark true block> <statement no short if>
- <while statement> → while ( <mark loop offset> <expression> ) <mark true block> <statement>
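To experiment with the productions above, it helps to load them into a data structure. The sketch below is a hypothetical helper, not part of the project: it splits one production written in this notation into its head and alternatives, treating `<...>` as a nonterminal, `__ε__` as the empty body, and every standalone `|` as an alternative separator. That last assumption is a known limitation: in `<inclusive or expression>` one of the `|` tokens is the bitwise-OR terminal, so that production would be split incorrectly and needs special handling.

```python
import re

# A nonterminal is "<" + letter + anything up to ">"; everything else is a
# whitespace-delimited terminal, so "<", "<=", "<<" and ">>>" stay intact.
TOKEN = re.compile(r"<[A-Za-z][^<>]*>|\S+")


def parse_production(line):
    """Split 'head → alt1 | alt2' into (head, [[symbols of alt1], [symbols of alt2]])."""
    head, _, body = line.partition("→")
    alternatives = [[]]
    for token in TOKEN.findall(body):
        if token == "|":
            # Assumption: a standalone "|" always separates alternatives.
            alternatives.append([])
        elif token != "__ε__":
            alternatives[-1].append(token)
    return head.strip(), alternatives


def is_directly_left_recursive(head, alternatives):
    """True if some alternative starts with the nonterminal being defined."""
    return any(body and body[0] == head for body in alternatives)


head, alternatives = parse_production("<dims> → [ ] | <dims> [ ]")
```

Running `is_directly_left_recursive` over the parsed productions shows which nonterminals, such as `<dims>`, are defined in terms of themselves on the left edge.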