22 Commits

Author SHA1 Message Date
Jarvis Carroll
78c9c67f38 typecheck bits
Sophia bitstrings aren't really something you initialize manually, so we have to make up a literal format for them. Failing that, we just accept arbitrary integers and bytearrays as bitstrings.
2026-02-13 06:25:24 +00:00
Jarvis Carroll
9bc0ffafd1 bool/char literals
Character literals were the main complexity here, but I threw booleans in as well, since that covers all the major literals.
2026-02-13 06:25:24 +00:00
Jarvis Carroll
60985130cb Refine tuple parsing errors
There are four major fixes here:
1. some eof tokens were being pattern matched with the wrong arity,
2. tuples that are too long actually speculatively parse as an untyped tuple, and then complain that there were too many elements,
3. singleton tuples with a trailing comma are now handled differently to grouping parentheses, consistently between typed and untyped logic,
4. the extra return values used to detect untyped singleton tuples are also used to pass the close paren position, so that too_many_elements can report the correct file position too.

Point 4 also completely removes the need for the open paren position tracking that I was doing, and that I thought I would need to do even more of in the ambiguous-open-paren-stack case.
2026-02-13 04:08:58 +00:00
6c172c4783 Adjusting a few calls. 2026-02-12 17:44:56 +09:00
Jarvis Carroll
3838a7e3c5 Parse qualified names.
This seemed like it was going to be insanely complex, but
then it turns out the compiler doesn't accept spaces in qualified
names, so I can just dump periods in the lexer and hit it with
string:split/3. Easy.
2026-02-05 07:13:25 +00:00
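A minimal sketch of the split described in 3838a7e3c5, assuming the lexer has already collected the whole dotted name into one string. The module and function names are hypothetical and not taken from the repository; only string:split/3 is the real call.

```erlang
-module(qname_sketch).
-export([split_qualified/1]).

%% Split a qualified name such as "Foo.Bar.baz" into its segments.
%% Assumption: the name contains no spaces, so every period is a separator.
split_qualified(Name) ->
    %% 'all' splits on every occurrence of ".", not just the first one.
    string:split(Name, ".", all).
```

With this sketch, an unqualified name simply comes back as a one-element list, so callers do not need a separate case for it.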
Jarvis Carroll
d014ae0982 Handle token/parse errors more carefully 2026-02-04 07:00:39 +00:00
Jarvis Carroll
bb4bcbb7de remove 'tk' atom from file positions 2026-02-03 06:08:54 +00:00
Jarvis Carroll
a695c21fc9 Parse address literals.
Also signatures.
2026-02-03 06:00:40 +00:00
Jarvis Carroll
493bdb990c Fix lexer row/column calculations. 2026-02-03 01:42:17 +00:00
Jarvis Carroll
17f635af61 Parse long hex escape codes
This doesn't work super consistently in the compiler for codepoints above 127, but it should work fine for us, so, oh well!
2026-02-03 00:41:00 +00:00
Jarvis Carroll
272ed01fdc Singleton record/tuple parsing.
Records are a simple case to detect and handle correctly.

Tuples took an entire rewrite of the little tuple parsing bit of the code.
2026-01-30 08:12:32 +00:00
Jarvis Carroll
49cd8b6687 Parse strings 2026-01-29 06:18:06 +00:00
Jarvis Carroll
966b4b2748 Calculate scalar values during lexing
This saves some effort and probably some performance for things like integers, but I'm mainly doing this in anticipation of string literals, because it would just be ridiculous to read code that lexes string literals twice.
2026-01-29 04:06:19 +00:00
Jarvis Carroll
fe182a5233 Handle underscores in integers/bytes
This forces us to test for alpha/num/hex enough times that it's now worth making macros for these things.
2026-01-29 03:03:11 +00:00
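A rough sketch of what fe182a5233 and 966b4b2748 describe together: a character-class macro used in guards, and an integer lexer that skips underscores while accumulating the value as it goes. The macro name IS_NUM, the module, and the rule that an underscore only counts when a digit follows it are assumptions for illustration, not the repository's actual code.

```erlang
-module(int_lex_sketch).
-export([lex_int/1]).

%% Hypothetical character-class macro, usable inside guards.
-define(IS_NUM(C), (C >= $0 andalso C =< $9)).

%% Lex a decimal integer from the front of a string.
%% Returns {Value, Rest}, where Rest is the unconsumed input.
lex_int([C | Rest]) when ?IS_NUM(C) ->
    lex_int(Rest, C - $0).

%% Plain digit: fold it into the accumulated value.
lex_int([C | Rest], Acc) when ?IS_NUM(C) ->
    lex_int(Rest, Acc * 10 + (C - $0));
%% Underscore followed by a digit: drop the underscore, keep the digit.
lex_int([$_, C | Rest], Acc) when ?IS_NUM(C) ->
    lex_int(Rest, Acc * 10 + (C - $0));
%% Anything else ends the integer.
lex_int(Rest, Acc) ->
    {Acc, Rest}.
```

Because the value is built during lexing, the token can carry the integer itself and the parser never has to re-read the digits.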
Jarvis Carroll
f1696e2b9e Bytes lexing
I don't handle underscores in bytes correctly... Nor in integers, for that matter.
2026-01-29 02:01:16 +00:00
Jarvis Carroll
2bf384ca82 Infer correct values for tests automatically
Now tests compare the literal parser against the output of the
compiler. The little example contracts we are compiling for the
AACI already had the FATE value in them, in the form of the
instruction
	{'RETURNR', {immediate, FateValue}}
so we just extract that and use it for the tests.
2026-01-27 06:42:55 +00:00
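The extraction in 2bf384ca82 could look something like the sketch below. It assumes the instructions have already been flattened into a plain list, which is a simplification; how the compiler output is actually laid out is not shown here, and the module and function names are hypothetical.

```erlang
-module(returnr_sketch).
-export([find_returned_value/1]).

%% Walk a flat list of instructions and return the first immediate
%% value handed to 'RETURNR', or 'error' if there is none.
find_returned_value([{'RETURNR', {immediate, FateValue}} | _]) ->
    {ok, FateValue};
find_returned_value([_Other | Rest]) ->
    find_returned_value(Rest);
find_returned_value([]) ->
    error.
```

A test can then parse the literal's source text and compare the result against the FateValue found this way.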
Jarvis Carroll
4f2a3c6c6f Variant parsing 2026-01-23 06:18:39 +00:00
Jarvis Carroll
7df04a81be Tuple parsing 2026-01-23 02:45:23 +00:00
Jarvis Carroll
6f02d4c4e6 Record parsing 2026-01-23 00:48:06 +00:00
Jarvis Carroll
56e63051bc Map parsing 2026-01-16 05:46:27 +00:00
Jarvis Carroll
3f1c9bd626 List parsing
Slowly chipping away at cases...
2026-01-15 09:38:04 +00:00
Jarvis Carroll
97e32574c4 set up parsing structure
We tokenize, and then do the simplest possible recursive descent.

We don't want to evaluate anything, so infix operators are out,
meaning no shunting yard or tree rearranging or LR(1) shenanigans
are necessary; we just write the code.

If we want to 'peek', just take the next token, and pass it around
from that point on, until it can actually be consumed.
2026-01-15 01:52:30 +00:00
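The 'peek' approach from 97e32574c4 can be sketched as follows: each parsing function receives the token that was already taken, plus the remaining tokens, and keeps passing that token along until some clause consumes it. The token shapes ({int, N}, '[', ']', ',') and all function names are made up for this sketch and are not the repository's own.

```erlang
-module(peek_sketch).
-export([parse/1]).

%% Parse one value from a token list. Returns {ok, Value, RestTokens}
%% or {error, Reason}.
parse([Tok | Rest]) ->
    parse_value(Tok, Rest);
parse([]) ->
    {error, unexpected_eof}.

%% The first argument is the 'peeked' token: already taken off the
%% stream, but not consumed until a clause matches it.
parse_value({int, N}, Rest) ->
    {ok, N, Rest};
parse_value('[', [']' | Rest]) ->
    {ok, [], Rest};
parse_value('[', [Tok | Rest]) ->
    parse_elements(Tok, Rest, []);
parse_value(Tok, _Rest) ->
    {error, {unexpected_token, Tok}}.

%% Parse comma-separated elements up to the closing bracket.
parse_elements(Tok, Rest, Acc) ->
    case parse_value(Tok, Rest) of
        {ok, Value, [',', Next | Rest2]} ->
            parse_elements(Next, Rest2, [Value | Acc]);
        {ok, Value, [']' | Rest2]} ->
            {ok, lists:reverse([Value | Acc]), Rest2};
        {ok, _Value, _Other} ->
            {error, expected_comma_or_close};
        {error, _} = Error ->
            Error
    end.
```

Since there are no infix operators, every value is identified by its first token, so plain function clauses are enough and no lookahead buffer is needed.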