Commit 64b7879

More bugfixes.
1 parent f733914 commit 64b7879

9 files changed: 991 additions, 232 deletions

docs/SPECIFICATION.html

Lines changed: 14 additions & 7 deletions
@@ -52,14 +52,21 @@

 - [12. Operators, Functions, and Statements](#12-operators-functions-and-statements)

-This document specifies an imperative programming language with a binary integer data model and an explicit small-step execution semantics. A program is compiled into an initial machine state (a seed configuration) and then executed solely by repeatedly applying a fixed, program-independent state-transition (rewrite) function. All intermediate states are (semi-)human-readable and serializable, and execution can be traced and replayed exactly, including I/O and nondeterministic choices, which are modeled explicitly for deterministic replay.
+This document specifies an imperative programming language with a statically typed data model and an explicit small-step execution semantics.
+
+A program is compiled into an initial machine state (a seed configuration) and then executed solely by repeatedly applying a fixed, program-independent state-transition (rewrite) function. All intermediate states are (semi-)human-readable and serializable, and execution can be traced and replayed exactly, including I/O and nondeterministic choices, which are modeled explicitly for deterministic replay.

 ## 1. Overview

-The language is a familiar statement-based, imperative language. Programs consist of variable declarations via assignment, expressions, and control-flow constructs such as `IF`, `ELSEIF`, `ELSE`, `WHILE`, and `FOR`. Prefix has seven runtime data types: binary integers (`INT`), binary floating-point numbers (`FLT`, IEEE754), strings (`STR`), non-scalar tensors (`TNS`), first-class user-defined functions (`FUNC`), associative maps (`MAP`), and thread handles (`THR`). Identifiers, function parameters, and return values are statically typed; the type of every symbol must be declared when it is first introduced. Computation proceeds by evaluating expressions and executing statements in sequence, with explicit constructs for branching and looping. Input and output are modeled through built-in operators, in particular `INPUT` and `PRINT`.
+The language is a familiar statement-based, imperative language.

-The interpreter compiles source code into a single initial configuration (the seed state), which includes the program code, an empty variable environment, and an initial I/O history. It then advances execution by repeatedly applying a single, fixed small-step transition function that is independent of the particular program. A disassembler and log view expose all intermediate states so that every step and every control-flow decision is inspectable and replay-able.
+Programs consist of variable declarations via assignment, expressions, and control-flow constructs such as `IF`, `ELSEIF`, `ELSE`, `WHILE`, and `FOR`.
+
+Prefix has seven runtime data types: binary integers (`INT`), binary floating-point numbers (`FLT`, IEEE754), strings (`STR`), non-scalar tensors (`TNS`), first-class user-defined functions (`FUNC`), associative maps (`MAP`), and thread handles (`THR`).

+Identifiers, function parameters, and return values are statically typed; the type of every symbol must be declared when it is first introduced. Computation proceeds by evaluating expressions and executing statements in sequence, with explicit constructs for branching and looping. Input and output are modeled through built-in operators, in particular `INPUT` and `PRINT`.
+
+The interpreter compiles source code into a single initial configuration (the seed state), which includes the program code, an empty variable environment, and an initial I/O history. It then advances execution by repeatedly applying a single, fixed small-step transition function that is independent of the particular program. A disassembler and log view expose all intermediate states so that every step and every control-flow decision is inspectable and replay-able.

 ## 2. Lexical Structure

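Reviewer note: the execution model described in the hunk above (build a seed state, then advance only by applying one program-independent transition function) can be illustrated with a minimal C sketch. Everything here is hypothetical: the `Op`, `Instr`, and `State` types and the three-instruction toy machine are assumptions made for illustration and are not the interpreter's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical toy instruction set; NOT the real Prefix instruction set. */
typedef enum { OP_ADD, OP_JUMP_IF_LT, OP_HALT } Op;
typedef struct { Op op; long a; long b; } Instr;

/* A configuration: everything needed to resume or replay execution. */
typedef struct {
    const Instr* code;    /* program text, carried inside the state */
    size_t pc;            /* index of the next instruction */
    long acc;             /* single "variable" standing in for the environment */
    bool halted;
    unsigned long steps;  /* number of transitions applied so far */
} State;

/* Build the seed state; "compilation" is trivial in this sketch. */
static State seed(const Instr* code) {
    State s = { code, 0, 0, false, 0 };
    return s;
}

/* The fixed transition function: one small step, no program-specific logic. */
static State step(State s) {
    assert(!s.halted);
    const Instr in = s.code[s.pc];
    switch (in.op) {
    case OP_ADD:        s.acc += in.a; s.pc += 1; break;
    case OP_JUMP_IF_LT: s.pc = (s.acc < in.a) ? (size_t)in.b : s.pc + 1; break;
    case OP_HALT:       s.halted = true; break;
    }
    s.steps += 1;
    return s;
}

/* Execution is nothing but repeated application of step(); each
   intermediate state can be printed, which is what makes tracing possible. */
static State run(State s, bool trace) {
    while (!s.halted) {
        if (trace) printf("step=%lu pc=%zu acc=%ld\n", s.steps, s.pc, s.acc);
        s = step(s);
    }
    return s;
}
```

Because `step` takes and returns a plain value, running the same seed twice yields the same sequence of states, which is the property the specification relies on for replay. Modeling I/O and nondeterministic choices would require recording them in the state as well; the sketch omits that.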
@@ -228,7 +235,7 @@

 Assignments have the syntax `TYPE : identifier = expression` on first use, where TYPE is `INT`, `FLT`, `STR`, `TNS`, or `FUNC`. Spaces around the colon and equals sign are optional. Subsequent assignments to an existing identifier may omit the type but must preserve the original type. Variables are deallocated only when `DEL(identifier)` is executed.

-Inline assignment operator: the built-in `ASSIGN` provides an expression form of assignment, similar to a walrus operator. The general form is `ASSIGN(target, expression)` and it evaluates `expression`, assigns it to `target`, and returns the assigned value. The `target` may be a plain identifier (e.g. `ASSIGN(x, 1)`) or an indexed assignment target (e.g. `ASSIGN(tensor[1], 0)` or `ASSIGN(map<"k">, 1)`). If the identifier has not yet been declared, the typed form is required: `ASSIGN(TYPE: name, expression)` declares the identifier's type and performs the assignment in one step. The typed form is only valid for plain identifiers; indexed targets must already exist and follow the same indexed-assignment rules as statement assignments.
+Inline assignment operator: the built-in `ASSIGN` provides an expression form of assignment, similar to a walrus operator. The general form is `ASSIGN(target, expression)` and it evaluates `expression`, assigns it to `target`, and returns the assigned value. The `target` may be a plain identifier (for example `ASSIGN(x, 1)`) or an indexed assignment target (for example `ASSIGN(tensor[1], 0)` or `ASSIGN(map<"k">, 1)`). If the identifier has not yet been declared, the typed form is required: `ASSIGN(TYPE: name, expression)` declares the identifier's type and performs the assignment in one step. The typed form is only valid for plain identifiers; indexed targets must already exist and follow the same indexed-assignment rules as statement assignments.

 Tensor elements can be reassigned with the indexed form `identifier[i1,...,iN] = expression`. The base must be a previously-declared `TNS` binding. The indices must match the tensor's dimensionality, follow the same one-based/negative-index rules as ordinary indexing, and must reference an existing position. The element's original type cannot change: attempting to store a different type at that position is a runtime error. Indexed assignment mutates the `TNS`.

@@ -329,7 +336,7 @@

 `GOTO` and `GOTOPOINT` are intended to be low-level primitives and their use can make programs harder to reason about. They are serialized in the state log like other statements so that execution is fully replayable for debugging and tracing.

-- Trigger: a traceback is produced when a runtime error occurs that prevents normal forward execution (e.g., an assertion failure, divide by zero, undefined variable reference, executing `RETURN` outside of a function, or any other interpreter-defined runtime error).
+- Trigger: a traceback is produced when a runtime error occurs that prevents normal forward execution (for example, an assertion failure, divide by zero, undefined variable reference, executing `RETURN` outside of a function, or any other interpreter-defined runtime error).

 - Content: the traceback must list frames in chronological call order from the outermost (earliest) frame to the innermost (where the error occurred), and for each frame include: function name (or `<top-level>` for global code), precise source location, a short excerpt of the offending statement, and identifiers linking to the corresponding states in the state log.

@@ -536,7 +543,7 @@

 - `TYPE(ANY: obj):STR` ; returns the runtime type name of `obj` as a `STR` — one of `INT`, `FLT`, `STR`, or `TNS` (extension-defined type names are returned unchanged when extensions are enabled).

-- `SIGNATURE(SYMBOL: sym):STR` ; returns a textual signature for the identifier `sym`. If `sym` denotes a user-defined function (`FUNC`) the result is formatted in the canonical form used in this specification, e.g. `FUNC name(T1: arg1, T2: arg2 = default):R`. For other bound symbols the result is `TYPE: symbol` (for example `INT: x`). The argument must be a plain identifier.
+- `SIGNATURE(SYMBOL: sym):STR` ; returns a textual signature for the identifier `sym`. If `sym` denotes a user-defined function (`FUNC`) the result is formatted in the canonical form used in this specification, for example `FUNC name(T1: arg1, T2: arg2 = default):R`. For other bound symbols the result is `TYPE: symbol` (for example `INT: x`). The argument must be a plain identifier.

 - `COPY(ANY: obj):ANY` ; return a shallow copy of `obj`. For `INT`, `FLT`, `STR`, and `FUNC` this produces a same-typed value wrapper. For `TNS` it returns a newly-allocated tensor with the same shape whose elements reference the original element values (shallow). For `MAP` it returns a new map with the same keys and the same value references (shallow).

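Reviewer note: the shallow-copy behavior that the `COPY` entry above describes for `TNS` (a new container whose slots reference the original element values) can be sketched in C as follows. The `Elem` and `Tensor1D` types are hypothetical stand-ins chosen for this illustration; the interpreter's real tensor representation is not shown in this diff.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical element object and one-dimensional tensor of element references. */
typedef struct { long value; } Elem;
typedef struct { Elem** items; size_t len; } Tensor1D;

/* Shallow copy: allocate a new slot array, but share the element objects.
   On allocation failure or empty input, returns a tensor with len == 0. */
static Tensor1D shallow_copy(const Tensor1D* src) {
    Tensor1D out = { NULL, 0 };
    assert(src != NULL);
    if (src->len == 0) return out;
    out.items = malloc(src->len * sizeof(Elem*));
    if (!out.items) return out;
    memcpy(out.items, src->items, src->len * sizeof(Elem*));
    out.len = src->len;
    return out;
}
```

In this model, replacing a slot in the copy leaves the original untouched, while mutating a shared element object is visible through both containers. Whether element values in the actual implementation are mutable in place is not stated in the specification text shown here.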
@@ -688,7 +695,7 @@

 - `TFLIP(TNS: obj, INT: dim):TNS` — Returns a new `TNS` with the elements along 1-based dimension `dim` reversed. Errors if `dim` is out of range.

-- `SCAT(TNS: src, TNS: dst, TNS: ind):TNS` — Returns a copy of `dst` with a rectangular slice replaced by `src`. `ind` must be a 2D tensor of `INT` pairs with shape `[TLEN(dst, 1), 10]` (binary `10` = decimal 2), i.e., one `[lo, hi]` row per destination dimension (rank; for example `rank = TLEN(SHAPE(dst), 1)`). Indices are 1-based; negatives follow the tensor indexing rules (for example, `-1` is the last element) and `0` is invalid. For each dimension, the inclusive span `hi - lo + 1` must equal the corresponding `src` dimension length, and all bounds must fall within `dst`. Elements outside the slice are copied from `dst` unchanged.
+- `SCAT(TNS: src, TNS: dst, TNS: ind):TNS` — Returns a copy of `dst` with a rectangular slice replaced by `src`. `ind` must be a 2D tensor of `INT` pairs with shape `[TLEN(dst, 1), 10]` (binary `10` = decimal 2), that is, one `[lo, hi]` row per destination dimension (rank; for example `rank = TLEN(SHAPE(dst), 1)`). Indices are 1-based; negatives follow the tensor indexing rules (for example, `-1` is the last element) and `0` is invalid. For each dimension, the inclusive span `hi - lo + 1` must equal the corresponding `src` dimension length, and all bounds must fall within `dst`. Elements outside the slice are copied from `dst` unchanged.

 - `FILL(TNS: tensor, ANY: value):TNS` — Returns a new tensor with the same shape as `tensor`, filled with `value`. The supplied value's type must match the existing element type at every position.

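Reviewer note: the per-dimension bounds rule in the `SCAT` entry above (1-based indices, negatives counted from the end, `0` invalid, inclusive span equal to the `src` length) can be checked with a small helper. This is a sketch under assumptions: the function names are hypothetical, and it assumes that `-len` addresses the first element, which the text implies but does not state outright.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Map a 1-based index (negative counts from the end, so -1 is the last
   element) to a 0-based offset. Returns false for 0 or out-of-range input. */
static bool normalize_index(long idx, long len, long* out) {
    assert(out != NULL);
    if (idx == 0) return false;
    long pos = idx > 0 ? idx : len + idx + 1;
    if (pos < 1 || pos > len) return false;
    *out = pos - 1;
    return true;
}

/* Validate one [lo, hi] row against one destination dimension:
   both bounds must lie within dst, and the inclusive span must
   equal the matching src dimension length. */
static bool span_ok(long lo, long hi, long dst_len, long src_len) {
    long l, h;
    if (!normalize_index(lo, dst_len, &l)) return false;
    if (!normalize_index(hi, dst_len, &h)) return false;
    return h - l + 1 == src_len;
}
```

For a destination dimension of length 5 and a source length of 3, the rows `[2, 4]` and `[-3, -1]` both pass, while `[0, 2]` (zero index), `[4, 6]` (out of range), and `[1, 2]` (span of 2) are rejected.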

src/ast.c

Lines changed: 26 additions & 0 deletions
@@ -55,6 +55,15 @@ Expr* expr_call(Expr* callee, int line, int column) {
     expr->line = line;
     expr->column = column;
     expr->as.call.callee = callee;
+    expr->as.call.args.items = NULL;
+    expr->as.call.args.count = 0;
+    expr->as.call.args.capacity = 0;
+    expr->as.call.kw_names = NULL;
+    expr->as.call.kw_args.items = NULL;
+    expr->as.call.kw_args.count = 0;
+    expr->as.call.kw_args.capacity = 0;
+    expr->as.call.kw_count = 0;
+    expr->as.call.kw_capacity = 0;
     return expr;
 }

@@ -126,6 +135,18 @@ void expr_list_add(ExprList* list, Expr* expr) {
     list->items[list->count++] = expr;
 }

+void call_kw_add(Expr* call, char* name, Expr* value) {
+    if (!call || call->type != EXPR_CALL) return;
+    if (call->as.call.kw_count + 1 > call->as.call.kw_capacity) {
+        size_t new_cap = call->as.call.kw_capacity == 0 ? 4 : call->as.call.kw_capacity * 2;
+        call->as.call.kw_names = realloc(call->as.call.kw_names, new_cap * sizeof(char*));
+        if (!call->as.call.kw_names) { fprintf(stderr, "Out of memory\n"); exit(1); }
+        call->as.call.kw_capacity = new_cap;
+    }
+    call->as.call.kw_names[call->as.call.kw_count++] = name;
+    expr_list_add(&call->as.call.kw_args, value);
+}
+
 Stmt* stmt_block(int line, int column) {
     Stmt* stmt = ast_alloc(sizeof(Stmt));
     stmt->type = STMT_BLOCK;
@@ -342,6 +363,11 @@ void free_expr(Expr* expr) {
         case EXPR_CALL:
             free_expr(expr->as.call.callee);
             free_expr_list(&expr->as.call.args);
+            if (expr->as.call.kw_names) {
+                for (size_t i = 0; i < expr->as.call.kw_count; i++) free(expr->as.call.kw_names[i]);
+                free(expr->as.call.kw_names);
+            }
+            free_expr_list(&expr->as.call.kw_args);
             break;
         default:
             break;
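
Reviewer note: taken together, the `call_kw_add` and `free_expr` hunks imply an ownership rule. `free_expr` calls `free()` on every entry of `kw_names`, so each `name` passed to `call_kw_add` must be heap-allocated, and the call node owns it afterwards. The sketch below isolates that pattern in a self-contained form. The `KwNames` type and the `dup_str`, `kw_names_add`, and `kw_names_free` helpers are hypothetical and are not part of this commit. Unlike the committed code, which exits on allocation failure, the sketch reports failure to the caller and therefore keeps the old pointer until `realloc` succeeds.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the kw_names / kw_count / kw_capacity fields. */
typedef struct {
    char** names;
    size_t count;
    size_t capacity;
} KwNames;

/* Heap-duplicate a string so the list can own it and free it later. */
static char* dup_str(const char* s) {
    size_t n = strlen(s) + 1;
    char* copy = malloc(n);
    if (copy) memcpy(copy, s, n);
    return copy;
}

/* Append with capacity doubling (0 -> 4 -> 8 ...). Takes ownership of
   `name` on success; returns 0 and leaves the list intact on failure. */
static int kw_names_add(KwNames* kw, char* name) {
    assert(kw != NULL);
    if (kw->count == kw->capacity) {
        size_t new_cap = kw->capacity == 0 ? 4 : kw->capacity * 2;
        char** grown = realloc(kw->names, new_cap * sizeof(char*));
        if (!grown) return 0;   /* kw->names is still valid here */
        kw->names = grown;
        kw->capacity = new_cap;
    }
    kw->names[kw->count++] = name;
    return 1;
}

/* Release every owned name, then the array itself. */
static void kw_names_free(KwNames* kw) {
    for (size_t i = 0; i < kw->count; i++) free(kw->names[i]);
    free(kw->names);
    kw->names = NULL;
    kw->count = 0;
    kw->capacity = 0;
}
```

Passing a string literal or a stack buffer as `name` would make the later `free()` undefined behavior, so callers in this model always hand over a fresh heap copy.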

src/ast.h

Lines changed: 5 additions & 0 deletions
@@ -48,6 +48,10 @@ struct Expr {
         struct {
             Expr* callee;
             ExprList args;
+            char** kw_names;
+            ExprList kw_args;
+            size_t kw_count;
+            size_t kw_capacity;
         } call;
         struct {
             Expr* target;
@@ -135,6 +139,7 @@ Expr* expr_str(char* value, int line, int column);
 Expr* expr_ptr(char* name, int line, int column);
 Expr* expr_ident(char* name, int line, int column);
 Expr* expr_call(Expr* callee, int line, int column);
+void call_kw_add(Expr* call, char* name, Expr* value);
 Expr* expr_tns(int line, int column);
 Expr* expr_map(int line, int column);
 Expr* expr_index(Expr* target, int line, int column);
