Commit 4a2a93d

committed
Updated docs
1 parent 36260f6 commit 4a2a93d

7 files changed

Lines changed: 179 additions & 115 deletions

docs/AST_Types.md

Lines changed: 19 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -140,6 +140,7 @@ Code Prev;
140140
Code Next;
141141
parser::Token* Tok;
142142
Code Parent;
143+
StringCached Name;
143144
CodeT Type;
144145
```
145146

@@ -155,6 +156,12 @@ Serialization:
155156
{
156157
<Body>
157158
}
159+
160+
// Constructor Source Implementation
161+
<Specs> <Parent>::<Parent->Name>( <Params> ) <Specs>
162+
{
163+
<Body>
164+
}
158165
```
159166

160167
## Define
@@ -191,6 +198,7 @@ Code Prev;
191198
Code Next;
192199
parser::Token* Tok;
193200
Code Parent;
201+
StringCached Name;
194202
CodeT Type;
195203
```
196204

@@ -205,6 +213,12 @@ Serialization:
205213
{
206214
<Body>
207215
}
216+
217+
// Destructor Source Implementation
218+
<Specs> <Parent>::~<Parent->Name>( <Params> ) <Specs>
219+
{
220+
<Body>
221+
}
208222
```
209223

210224
## Enum
@@ -468,12 +482,13 @@ Serialization:
468482
}
469483
```
470484

471-
## Parameters
485+
## Parameters (AST_Param)
472486

473487
Fields:
474488

475489
```cpp
476490
CodeType ValueType;
491+
Code Macro;
477492
Code Value;
478493
CodeParam Last;
479494
CodeParam Next;
@@ -487,7 +502,9 @@ s32 NumEntries;
487502
Serialization:
488503

489504
```cpp
490-
<ValueType> <Name>, <Next>... <Last>
505+
<Macro>, <Next> ... <Last>
506+
507+
<Macro> <ValueType> <Name>, <Next>... <Last>
491508
```
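As a rough illustration of the two serialization forms above, the sketch below joins whichever of the macro, value type, and name are present for each parameter. `ParamSketch` and `serialize_params` are hypothetical names made up for this example and are not part of the library; default values (`Value`) are left out for brevity.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parameter entry; this is not the library's AST_Param.
struct ParamSketch
{
    std::string macro;      // empty when no macro is attached
    std::string value_type; // empty when the macro is the entire argument
    std::string name;       // empty for unnamed parameters
};

// Joins the non-empty parts of each parameter with spaces,
// and the parameters themselves with ", ".
std::string serialize_params(const std::vector<ParamSketch>& params)
{
    std::string out;
    for (std::size_t i = 0; i < params.size(); ++i)
    {
        if (i != 0)
            out += ", ";

        const std::string* parts[] = { &params[i].macro, &params[i].value_type, &params[i].name };
        bool first = true;
        for (const std::string* part : parts)
        {
            if (part->empty())
                continue;
            if (!first)
                out += ' ';
            out += *part;
            first = false;
        }
    }
    return out;
}
```

A parameter that sets only `macro` corresponds to the `<Macro>, <Next> ... <Last>` form, while one that sets all three fields corresponds to `<Macro> <ValueType> <Name>`.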
492509

493510
## Pragma

docs/Parser_Algo.md

Lines changed: 98 additions & 71 deletions
Original file line numberDiff line numberDiff line change
@@ -119,12 +119,20 @@ Below is an outline of the general alogirithim used for these internal procedure
119119
5. If adjacent opening bracket
120120
1. Repeat array declaration parse until no brackets remain
121121

122+
## `parse_assignment_expression`
123+
124+
1. Eat the assignment operator
125+
2. Make sure there is content or at least an end statement after.
126+
3. Flatten the assignment expression to an untyped Code string.
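The three steps above can be sketched roughly as follows. This is a simplified illustration that assumes tokens are plain strings and that `;` is the end statement; `Tokens` and `flatten_assignment` are hypothetical names and do not reflect the library's actual token or Code types.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token stream: each token is already a plain string.
using Tokens = std::vector<std::string>;

// `pos` must index the assignment operator. On success, `out` holds the
// flattened right-hand side and `pos` rests on the end statement (or the end
// of the stream). Returns false when the input is malformed.
bool flatten_assignment(const Tokens& toks, std::size_t& pos, std::string& out)
{
    if (pos >= toks.size() || toks[pos] != "=")
        return false;
    ++pos; // 1. Eat the assignment operator

    if (pos >= toks.size())
        return false; // 2. Neither content nor an end statement follows

    out.clear(); // 3. Flatten everything up to the end statement into one string
    while (pos < toks.size() && toks[pos] != ";")
    {
        if (!out.empty())
            out += ' ';
        out += toks[pos];
        ++pos;
    }
    return true;
}
```

Note that an input such as `= ;` is accepted here and yields an empty string, matching the "or at least an end statement" wording of step 2.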
127+
122128
## `parse_attributes`
123129

124130
1. Check for standard attribute
125131
2. Check for GNU attribute
126132
3. Check for MSVC attribute
127133
4. Check for a token registered as an attribute
134+
a. Check and grab the arguments of a token registered as an attribute if it has any.
135+
5. Repeat for chained attributes. Flatten them to a single attribute AST node.
128136

129137
## `parse_class_struct`
130138

@@ -142,39 +150,40 @@ Below is an outline of the general alogirithim used for these internal procedure
142150

143151
1. Opening curly brace
144152
2. Parse the body (Possible options):
145-
1. Newline : ast constant
146-
2. Comment : `parse_comment`
147-
3. Access_Public : ast constant
148-
4. Access_Protected : ast constant
149-
5. Access_Private : ast constant
150-
6. Decl_Class : `parse_complicated_definition`
151-
7. Decl_Enum : `parse_complicated_definition`
152-
8. Decl_Friend : `parse_friend`
153-
9. Decl_Operator : `parse_operator_cast`
154-
10. Decl_Struct : `parse_complicated_definition`
155-
11. Decl_Template : `parse_template`
156-
12. Decl_Typedef : `parse_typedef`
157-
13. Decl_Union : `parse_complicated_definition`
158-
14. Decl_Using : `parse_using`
159-
15. Operator == '~'
153+
1. Ignore dangling end statements
154+
2. Newline : ast constant
155+
3. Comment : `parse_comment`
156+
4. Access_Public : ast constant
157+
5. Access_Protected : ast constant
158+
6. Access_Private : ast constant
159+
7. Decl_Class : `parse_complicated_definition`
160+
8. Decl_Enum : `parse_complicated_definition`
161+
9. Decl_Friend : `parse_friend`
162+
10. Decl_Operator : `parse_operator_cast`
163+
11. Decl_Struct : `parse_complicated_definition`
164+
12. Decl_Template : `parse_template`
165+
13. Decl_Typedef : `parse_typedef`
166+
14. Decl_Union : `parse_complicated_definition`
167+
15. Decl_Using : `parse_using`
168+
16. Operator == '~'
160169
1. `parse_destructor`
161-
16. Preprocess_Define : `parse_define`
162-
17. Preprocess_Include : `parse_include`
163-
18. Preprocess_Conditional (if, ifdef, ifndef, elif, else, endif) : `parse_preprocess_cond` or else/endif ast constant
164-
19. Preprocess_Macro : `parse_simple_preprocess`
165-
20. Preprocess_Pragma : `parse_pragma`
166-
21. Preprocess_Unsupported : `parse_simple_preprocess`
167-
22. StaticAssert : `parse_static_assert`
168-
23. The following compound into a resolved definition or declaration:
170+
17. Preprocess_Define : `parse_define`
171+
18. Preprocess_Include : `parse_include`
172+
19. Preprocess_Conditional (if, ifdef, ifndef, elif, else, endif) : `parse_preprocess_cond` or else/endif ast constant
173+
20. Preprocess_Macro : `parse_simple_preprocess`
174+
21. Preprocess_Pragma : `parse_pragma`
175+
22. Preprocess_Unsupported : `parse_simple_preprocess`
176+
23. StaticAssert : `parse_static_assert`
177+
24. The following compound into a resolved definition or declaration:
169178
1. Attributes (Standard, GNU, MSVC) : `parse_attributes`
170-
2. Specifiers (consteval, constexpr, constinit, forceinline, inline, mutable, neverinline, static, volatile)
179+
2. Specifiers (consteval, constexpr, constinit, explicit, forceinline, inline, mutable, neverinline, static, volatile, virtual)
171180
3. Possible Destructor : `parse_destructor`
172181
4. Possible User defined operator cast : `parse_operator_cast`
173182
5. Possible Constructor : `parse_constructor`
174183
6. Something that has the following: (identifier, const, unsigned, signed, short, long, bool, char, int, double)
175184
1. Possible Constructor `parse_constructor`
176185
2. Possible Operator, Function, or variable : `parse_operator_function_or_variable`
177-
24. Something completely unknown (will just make untyped...) : `parse_untyped`
186+
25. Something completely unknown (will just make untyped...) : `parse_untyped`
178187

179188
## `parse_comment`
180189

@@ -197,15 +206,17 @@ A portion of the code in `parse_typedef` is very similar to this as both have to
197206
2. If the token has a closing brace its an inplace definition
198207
3. If the `token[-2]` is an identifier & `token[-3]` is the declaration type, its a variable using a namespaced type.
199208
4. If the `token[-2]` is an indirection, then its a variable using a namespaced/forwarded type.
200-
5. If any of the above is the case, `parse_operator_function_or_variable`
201-
4. If the previous token was a closing curly brace, its a definition : `parse_forward_or_definition`
202-
5. If the previous token was a closing square brace, its an array definition : `parse_operator_function_or_variable`
209+
5. If the `token[-2]` is an assign classifier, and the starting tokens were the which type with possible `class` token after, its an enum forward declaration.
210+
6. If any of the above is the case, `parse_operator_function_or_variable`
211+
4. If the `token[2]` is a vendor fundamental type (builtin) then it is an enum forward declaration.
212+
5. If the previous token was a closing curly brace, its a definition : `parse_forward_or_definition`
213+
6. If the previous token was a closing square brace, its an array definition : `parse_operator_function_or_variable`
203214

204215
## `parse_define`
205216

206217
1. Define directive
207218
2. Get identifier
208-
3. Get Content
219+
3. Get Content (Optional)
209220

210221
## `parse_forward_or_definition`
211222

@@ -243,36 +254,47 @@ In the future statements and expressions will be parsed.
243254
1. Make sure this is being called for a valid type (namespace, global body, export body, linkage body)
244255
2. If its not a global body, consume the opening curly brace
245256
3. Parse the body (Possible options):
246-
1. NewLine : ast constant
247-
2. Comment : `parse_comment`
248-
3. Decl_Cass : `parse_complicated_definition`
249-
4. Decl_Enum : `parse_complicated_definition`
250-
5. Decl_Extern_Linkage : `parse_extern_link`
251-
6. Decl_Namespace : `parse_namespace`
252-
7. Decl_Struct : `parse_complicated_definition`
253-
8. Decl_Template : `parse_template`
254-
9. Decl_Typedef : `parse_typedef`
255-
10. Decl_Union : `parse_complicated_definition`
256-
11. Decl_Using : `parse_using`
257-
12. Preprocess_Define : `parse_define`
258-
13. Preprocess_Include : `parse_include`
259-
14. Preprocess_If, IfDef, IfNotDef, Elif : `parse_preprocess_cond`
260-
15. Preprocess_Else : ast constant
261-
16. Preprocess_Endif : ast constant
262-
17. Preprocess_Macro : `parse_simple_preprocess`
263-
18. Preprocess_Pragma : `parse_pragma`
264-
19. Preprocess_Unsupported : `parse_simple_preprocess`
265-
20. StaticAssert : `parse_static_assert`
266-
21. Module_Export : `parse_export_body`
267-
22. Module_Import : NOT_IMPLEMENTED
268-
23. The following compound into a resolved definition or declaration:
257+
1. Ignore dangling end statements
258+
2. NewLine : ast constant
259+
3. Comment : `parse_comment`
260+
4. Decl_Class : `parse_complicated_definition`
261+
5. Decl_Enum : `parse_complicated_definition`
262+
6. Decl_Extern_Linkage : `parse_extern_link`
263+
7. Decl_Namespace : `parse_namespace`
264+
8. Decl_Struct : `parse_complicated_definition`
265+
9. Decl_Template : `parse_template`
266+
10. Decl_Typedef : `parse_typedef`
267+
11. Decl_Union : `parse_complicated_definition`
268+
12. Decl_Using : `parse_using`
269+
13. Preprocess_Define : `parse_define`
270+
14. Preprocess_Include : `parse_include`
271+
15. Preprocess_If, IfDef, IfNotDef, Elif : `parse_preprocess_cond`
272+
16. Preprocess_Else : ast constant
273+
17. Preprocess_Endif : ast constant
274+
18. Preprocess_Macro : `parse_simple_preprocess`
275+
19. Preprocess_Pragma : `parse_pragma`
276+
20. Preprocess_Unsupported : `parse_simple_preprocess`
277+
21. StaticAssert : `parse_static_assert`
278+
22. Module_Export : `parse_export_body`
279+
23. Module_Import : NOT_IMPLEMENTED
280+
24. The following compound into a resolved definition or declaration:
269281
1. Attributes ( Standard, GNU, MSVC, Macro ) : `parse_attributes`
270282
2. Specifiers ( consteval, constexpr, constinit, extern, forceinline, global, inline, internal_linkage, neverinline, static )
271283
3. Is either ( identifier, const specifier, long, short, signed, unsigned, bool, char, double, int)
272-
1. If its an operator cast (definition outside class) : `parse_operator_cast`
273-
2. Its an operator, function, or varaible : `parse_operator_function_or_varaible`
284+
1. Attempt to parse as constructor or destructor : `parse_global_nspace_constructor_destructor`
285+
2. If its an operator cast (definition outside class) : `parse_operator_cast`
286+
3. It's an operator, function, or variable : `parse_operator_function_or_variable`
274287
4. If its not a global body, consume the closing curly brace
275288

289+
## `parse_global_nspace_constructor_destructor`
290+
291+
1. Look ahead for the start of the arguments for a possible constructor/destructor
292+
2. Go back past the identifier
293+
3. Check to see if its a destructor by checking for the `~`
294+
4. Continue; the next token should be a `::`
295+
5. Determine if the next valid identifier (ignoring possible template parameters) is the same as the first identifier of the function.
296+
6. If it is we have either a constructor or destructor so parse using their respective functions (`parse_constructor`, `parse_destructor`).
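The look-ahead described above can be sketched roughly as below. This is a simplified illustration under the assumption that tokens are plain strings and that template arguments are delimited by single `<` and `>` tokens; `DefKind` and `classify_ctor_dtor` are hypothetical names and not the library's API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

enum class DefKind { None, Constructor, Destructor };

// Classifies a token run such as { "Foo", "::", "~", "Foo", "(" }.
DefKind classify_ctor_dtor(const Tokens& toks)
{
    // 1. Look ahead for the start of the arguments.
    std::size_t paren = 0;
    while (paren < toks.size() && toks[paren] != "(")
        ++paren;
    if (paren == toks.size() || paren < 3)
        return DefKind::None; // need at least: Name :: Name (

    // 2. Go back past the identifier.
    std::size_t i = paren - 1;
    const std::string& name = toks[i];
    --i;

    // 3. A `~` right before the identifier marks a possible destructor.
    bool is_dtor = false;
    if (toks[i] == "~")
    {
        is_dtor = true;
        --i;
    }

    // 4. The next token (walking backward) should be `::`.
    if (i == 0 || toks[i] != "::")
        return DefKind::None;
    --i;

    // 5. Skip template arguments, then compare the qualifying identifier.
    if (toks[i] == ">")
    {
        int depth = 0;
        for (;;)
        {
            if (toks[i] == ">") ++depth;
            else if (toks[i] == "<") --depth;
            if (depth == 0) break; // reached the matching `<`
            if (i == 0) return DefKind::None;
            --i;
        }
        if (i == 0) return DefKind::None;
        --i;
    }
    if (toks[i] != name)
        return DefKind::None;

    // 6. Matching names: hand off to the constructor or destructor parse.
    return is_dtor ? DefKind::Destructor : DefKind::Constructor;
}
```

In this sketch the returned value only signals which of `parse_constructor` or `parse_destructor` would be called next; anything else falls through to the remaining options.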
297+
276298
## `parse_identifier`
277299

278300
This is going to get heavily changed down the line to have a more broken down "identifier expression" so that the qualifier, template args, etc, can be distinguished between the targeted identifier.
@@ -284,6 +306,7 @@ The function can parse all of them, however the AST node compresses them all int
284306
1. Consume `::`
285307
2. Consume member identifier
286308
3. `parse_template args` (for member identifier)
309+
4. If a `~` is encountered and the scope is for a destructor's identifier, do not consume it and return with what was parsed.
287310

288311
## `parse_include`
289312

@@ -329,15 +352,17 @@ When this function is called, attribute and specifiers may have been resolved, h
329352
2. If we immediately find a closing token, consume it and finish.
330353
3. If we encounter a variadic argument, consume it and return a `param_varadic` ast constant
331354
4. `parse_type`
332-
5. If we have an identifier
355+
5. If we have a macro, parse it (Unreal has macros as tags to parameters and/or as entire arguments).
356+
6. So long as next token isn't a comma
357+
a. If we have an identifier
333358
1. Consume it
334359
2. Check for assignment:
335-
1. Consume assign operator
336-
2. Parse the expression
337-
6. While we continue to encounter commas
338-
1. Consume them
339-
2. Repeat steps 3 to 5.2.2
340-
7. Consume the closing token
360+
a. Consume assign operator
361+
b. Parse the expression
362+
7. While we continue to encounter commas
363+
a. Consume them
364+
b. Repeat steps 3 to 6.2.b
365+
8. Consume the closing token
341366

342367
## `parse_preprocess_cond`
343368

@@ -456,6 +481,7 @@ This currently doesn't support postfix specifiers (planning to in the future)
456481
2. If there is an assignment operator:
457482
1. Consume operator
458483
2. Consume the expression (assigned to untyped string for now)
484+
3. If a macro is encountered consume it (Unreal UMETA macro support)
459485
3. If there is a comma, consume it
460486

461487
## `parse_export_body`
@@ -476,10 +502,9 @@ This currently doesn't support postfix specifiers (planning to in the future)
476502

477503
1. Consume `friend`
478504
2. `parse_type`
479-
3. If the currok is an identifier its a function declaration (there is no support for inline definitions yet)
480-
1. `parse_identifier`
481-
2. `parse_params`
482-
4. Consume end statement
505+
3. If the currtok is an identifier, it's a function declaration or definition
506+
1. `parse_function_after_name`
507+
4. Consume end statement so long as it's not a function definition
483508
5. Check for inline comment, `parse_comment` if exists
484509

485510
## `parse_function`
@@ -540,7 +565,8 @@ Note: This currently doesn't support templated operator casts (going to need to
540565
5. The following compound into a resolved definition or declaration:
541566
1. `parse_attributes`
542567
2. Parse specifiers
543-
3. `parse_operator_function_or_variable`
568+
3. Attempt to parse as constructor or destructor: `parse_global_nspace_constructor_destructor`
569+
4. Otherwise: `parse_operator_function_or_variable`
544570

545571
## `parse_type`
546572

@@ -553,14 +579,15 @@ Anything that is in the qualifier capture of the function typename is treated as
553579

554580
1. `parse_attributes`
555581
2. Parse specifiers
556-
3. This is where things get ugly for each of these depend on what the next token is.
582+
3. If `parse_type` was called from a template parse, check to see if `class` was used instead of `typename` and consume it as the name.
583+
4. This is where things get ugly, as each of these depends on what the next token is.
557584
1. If its an in-place definition of a class, enum, struct, or union:
558585
2. If its a decltype (Not supported yet but draft impl there)
559586
3. If it's a compound native type expression (unsigned, char, short, long, int, float, double, etc.)
560587
4. Ends up being a regular type alias of an identifier
561-
4. Parse specifiers (postfix)
562-
5. We need to now look ahead to see If we're dealing with a function typename
563-
6. If wer're dealing with a function typename:
588+
5. Parse specifiers (postfix)
589+
6. We need to now look ahead to see if we're dealing with a function typename
590+
7. If we're dealing with a function typename:
564591
1. Shove the specifiers, and identifier code we have so far into a return type typename's Name (untyped string)
565592
1. Reset the specifiers code for the top-level typename
566593
2. Check to see if the next token is an identifier:
@@ -571,7 +598,7 @@ Anything that is in the qualifier capture of the function typename is treated as
571598
3. Consume `)`
572599
4. `parse_params`
573600
5. Parse postfix specifiers
574-
7. Check for varaidic argument (param pack) token:
601+
8. Check for variadic argument (param pack) token:
575602
1. Consume variadic argument token
576603

577604
### WIP - Alternative Algorithim

docs/Parsing.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22

33
The library features a naive parser tailored for only what the library needs to construct the supported syntax of C++ into its AST.
44

5-
This parser does not, and should not do the compiler's job. By only supporting this minimal set of features, the parser is kept (so far) around 5500 loc. I hope to keep it under 10k loc worst case.
5+
This parser does not, and should not, do the compiler's job. By only supporting this minimal set of features, the parser is kept (so far) at around 5600 loc. I hope to keep it under 10k loc worst case.
66

77
You can think of this parser as a frontend parser vs a semantic parser. It's intuitively similar to WYSIWYG. What you perceive as the syntax from the user-side before the compiler gets a hold of it, is what you get.
88

@@ -73,7 +73,7 @@ The lexing and parsing takes shortcuts from whats expected in the standard.
7373
* The parse API treats any execution scope definitions with no validation and are turned into untyped Code ASTs.
7474
* *This includes the assignment of variables.*
7575
* Attributes ( `[[]]` (standard), `__declspec` (Microsoft), or `__attribute__` (GNU) )
76-
* Assumed to *come before specifiers* (`const`, `constexpr`, `extern`, `static`, etc) for a function
76+
* Assumed to *come before specifiers* (`const`, `constexpr`, `extern`, `static`, etc) for a function or right after the return type.
7777
* Or in the usual spot for class, structs, (*right after the declaration keyword*)
7878
* typedefs have attributes with the type (`parse_type`)
7979
* Parsing attributes can be extended to support user defined macros by defining `GEN_DEFINE_ATTRIBUTE_TOKENS` (see `gen.hpp` for the formatting)

docs/Readme.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -82,6 +82,7 @@ union {
8282
AST* ValueType; // Parameter, Variable
8383
};
8484
union {
85+
AST* Macro; // Parameters
8586
AST* BitfieldSize; // Variable (Class/Struct Data Member)
8687
AST* Params; // Constructor, Function, Operator, Template, Typename
8788
};
@@ -461,6 +462,7 @@ The AST and constructors will be able to validate that the arguments provided fo
461462
* If return type must match a parameter
462463
* If number of parameters is correct
463464
* If added as a member symbol to a class or struct, that operator matches the requirements for the class (types match up)
465+
* There is no support for validating new & delete operations (yet)
464466

465467
The user is responsible for making sure the code types provided are correct
466468
and have the desired specifiers assigned to them beforehand.

project/Readme.md

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -10,8 +10,7 @@ Just like the `gen.<hpp/cpp>` they include their components: `dependencies/<depe
1010

1111
Code not making up the core library is located in `auxiliary/<auxiliary_name>.<hpp/cpp>`. These are optional extensions or tools for the library.
1212

13-
**TODO : Right now the library is not finished, as such the first self-hosting iteration is still WIP**
14-
Both libraries use *pre-generated* (self-hosting I guess) version of the library to then generate the latest version of itself.
13+
Both libraries use a *pre-generated* (self-hosting I guess) version of the library to then generate the latest version of itself.
1514

1615
The default `gen.bootstrap.cpp` located in the project folder is meant to produce a standard segmented library, where the components of the library
1716
have relatively dedicated header and source files. Dependencies included at the top of the file and each header starting with a pragma once.
@@ -52,7 +51,7 @@ Names or Content fields are interned strings and thus showed be cached using `ge
5251

5352
The library has its code segmented into component files, use it to help create a derived version without needing to have to rewrite a generated file directly or build on top of the header via composition or inheritance.
5453

55-
The parser is documented under `docs/Parsing.md` and `docs/Parser_Algo.md`.
54+
The parser is documented under `docs/Parsing.md` and `docs/Parser_Algo.md`.
5655

5756
## A note on compilation and runtime generation speed
5857
