Custom Language

Custom Calculator Programming Language


I've created a custom programming language and an interpreter written in Python that's specifically designed for maths problems (kind of like a CLI calculator). The language's syntax is optimised for CLI usage and supports concepts such as units, tensors, and continued expressions.

See my GitHub Calculator project.

Syntax and Features


The basic syntax is very similar to Python and aims to be simple and easy to read. 

Numeric values are stored as exact decimal values for precise calculations.
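The effect of exact decimal storage can be illustrated with Python's standard decimal module. This is only a sketch of the idea; whether the interpreter uses decimal.Decimal internally is an assumption.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_sum = 0.1 + 0.2

# Decimal values built from strings keep the digits exactly as written.
exact_sum = Decimal("0.1") + Decimal("0.2")

print(float_sum == 0.3)             # False
print(exact_sum == Decimal("0.3"))  # True
```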


Basic Syntax


All basic maths expressions are supported; the order of operations is respected, and parentheses are evaluated first.


In                                  Out               Notes
0.1 + 0.2                           0.3
1 + 2 * 3                           7
(1 + 2) * 3                         9
|0-1|                               1                 Absolute value
|0 - |10-100| |                     90                Nested absolute values
[1,2,,3,4,,5,6] # [1,2,3,,4,5,6]    [[ 9, 12, 15],    Matrix multiplication
                                     [19, 26, 33],
                                     [29, 40, 51]]


Units & Conversions


The language has built-in support for assigning units to values and for combining units in operations as appropriate.


In                         Out             Notes
120cm                      120cm           The value has a unit attached.
120cm + 15mm               121.500cm       The second operand is automatically converted to the unit of the first.
                                           If the units are not of the same category (distance in this case), an error is raised.
20m / 5s                   4m/s            Multiplicative operations combine units into a composite unit.
20m * 10cm                 2.00m^2         Multiple units of the same unit category are converted and combined.
1m / 20cm                  5               If all units cancel, the result is unit-less.
20(m^3/s^2) * 5(m*s^-1)    100(m^4/s^3)    Units can be arbitrarily complex.
(1hr + 35min) * 3          4.75hr          Supports time and other unit categories.
1m * 1cm @ (mm^2)          10000mm^2       Convert a value to any other compatible unit.
160cm * 50cm * 40cm @ L    320L            There are named composite units.
                                           1L = 1dm * 1dm * 1dm.
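The conversion rule for additive operations can be sketched in Python as follows. The UNITS table and the convert and add helpers are hypothetical names chosen for illustration, and the sketch only covers units within a single category; it is not the interpreter's actual implementation.

```python
from decimal import Decimal

# Hypothetical unit table: name -> (category, factor to the category's base unit).
UNITS = {
    "mm": ("distance", Decimal("0.001")),
    "cm": ("distance", Decimal("0.01")),
    "m": ("distance", Decimal("1")),
    "s": ("time", Decimal("1")),
    "min": ("time", Decimal("60")),
    "hr": ("time", Decimal("3600")),
}

def convert(value, from_unit, to_unit):
    """Convert a value between two units of the same category."""
    from_category, from_factor = UNITS[from_unit]
    to_category, to_factor = UNITS[to_unit]
    if from_category != to_category:
        raise ValueError(f"Cannot convert {from_unit} to {to_unit}")
    return value * from_factor / to_factor

def add(left, right):
    """Add two (value, unit) pairs; the result keeps the left operand's unit."""
    left_value, left_unit = left
    right_value, right_unit = right
    return left_value + convert(right_value, right_unit, left_unit), left_unit

# 120cm + 15mm, expressed in the unit of the first operand.
total, unit = add((Decimal("120"), "cm"), (Decimal("15"), "mm"))
print(total, unit)
```

Adding a distance to a time, for example add((Decimal("1"), "m"), (Decimal("1"), "s")), raises a ValueError in this sketch, which mirrors the error described for mismatched unit categories.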


Tensors (Arrays, Matrices, and more)


Tensors (n-dimensional arrays) are defined with brackets, using n consecutive commas to move to the next element in the n-th dimension. So one comma moves to the next element in a vector, two commas move to the next row in a matrix, and so on.


Vector / Array (3)
[2,4,8]
Matrix (2, 3)
[1,2,,3,4,,5,6]
Matrix (3,1)
[1,2,3,,]
Tensor / NdArray (3, 2, 3)
[
 1,2,3,,
 4,5,6
 ,,,
 6,5,4,,
 3,2,1
 ,,,
 1,2,3,,
 4,5,6
]
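To make the comma-run notation concrete, here is a minimal Python sketch that turns it into nested lists. The parse_tensor function is a hypothetical helper, it handles integer elements only, and it assumes a regular (non-ragged) tensor; the real interpreter parses tensors through its own tokens and parse tree.

```python
import re

def parse_tensor(text):
    """Parse the comma-run tensor notation into nested Python lists of ints."""
    body = re.sub(r"\s+", "", text).strip("[]")
    comma_runs = re.findall(r",+", body)
    # The longest run of commas gives the number of dimensions.
    depth = max((len(run) for run in comma_runs), default=1)
    return _split(body, depth)

def _split(body, depth):
    if depth == 0:
        return int(body)
    # Split only on runs of exactly `depth` commas.
    parts = re.split(rf"(?<!,),{{{depth}}}(?!,)", body)
    # Empty parts come from trailing separators such as "[1,2,3,,]".
    return [_split(part, depth - 1) for part in parts if part]

print(parse_tensor("[1,2,,3,4,,5,6]"))  # [[1, 2], [3, 4], [5, 6]]
print(parse_tensor("[1,2,3,,]"))        # [[1, 2, 3]]
```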


Distributed Operations


Scalar operations can be performed between tensors if either one is a scalar value, or both tensors have the same shape.


In                   Out              Notes
0.1 + 0.2            0.3              Scalar - Scalar
2 ^ [1,2,3,4]        [2,4,8,16]       Scalar - Tensor
                                      Scalar operation is applied to each element in the tensor.
[1,2,3] * 2          [2,4,6]          Tensor - Scalar
                                      Scalar operation is applied to each element in the tensor.
[2,4,8] ^ [2,3,4]    [4, 64, 4096]    Tensor - Tensor
                                      Scalar operation is applied between the matching elements in both tensors.
[2,4,8] ^ [2,3]      Error: Incompatible tensor shapes for scalar operation
                                      Tensor - Tensor of different shape
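The distribution rule can be sketched in Python using nested lists as tensors. The shape and distribute functions are hypothetical names for illustration and are not taken from the interpreter's source.

```python
import operator

def shape(value):
    """Return the shape of a regular nested list, or () for a scalar."""
    if isinstance(value, list):
        return (len(value),) + (shape(value[0]) if value else ())
    return ()

def distribute(op, left, right):
    """Apply a scalar operation across scalars and nested lists."""
    left_is_tensor = isinstance(left, list)
    right_is_tensor = isinstance(right, list)
    if left_is_tensor and right_is_tensor:
        # Tensor - Tensor: shapes must match, elements are paired up.
        if shape(left) != shape(right):
            raise ValueError("Incompatible tensor shapes for scalar operation")
        return [distribute(op, a, b) for a, b in zip(left, right)]
    if left_is_tensor:
        # Tensor - Scalar: the scalar is applied to every element.
        return [distribute(op, a, right) for a in left]
    if right_is_tensor:
        # Scalar - Tensor: the scalar is applied to every element.
        return [distribute(op, left, b) for b in right]
    return op(left, right)

print(distribute(operator.mul, [1, 2, 3], 2))          # [2, 4, 6]
print(distribute(operator.pow, [2, 4, 8], [2, 3, 4]))  # [4, 64, 4096]
```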


Continued Expressions


When running the interactive shell in a CLI you can continue expressions on the next line with the previous result implicitly being applied to the start of the line if applicable.


In           Out    Notes
1 + 2 + 3    6      Standard expression
/ 3          2      Because '/' is a binary operation, the previous result '6' is automatically
                    applied as the first operand, so the expression evaluated is '6 / 3'.
= x          2      The previous result is assigned to variable 'x'.


All results are stored in variables with the name out<line-number> so you can write expressions such as 'out5 * 2' to reference the result from line 5.
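One way to picture continued expressions is as a rewrite of the input line before it is parsed. The Python sketch below is an assumption about the mechanism, not the interpreter's code: expand_continuation and BINARY_OPERATORS are hypothetical names, line numbering is assumed to start at 1, and the '= x' assignment form is not handled.

```python
# Operators taken from the examples in this document (assumed to be binary).
BINARY_OPERATORS = ("+", "-", "*", "/", "^", "#", "@")

def expand_continuation(line, line_number):
    """Prefix a line that starts with a binary operator with the previous result."""
    stripped = line.strip()
    # A leading '-' is treated as subtraction here; how the real interpreter
    # distinguishes it from unary minus is not stated in this document.
    if line_number > 1 and stripped.startswith(BINARY_OPERATORS):
        return f"out{line_number - 1} {stripped}"
    return stripped

print(expand_continuation("1 + 2 + 3", 1))  # 1 + 2 + 3
print(expand_continuation("/ 3", 2))        # out1 / 3
```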


Variables


This language supports creating, reassigning, and using variables, just like any other language.


In               Out          Notes
x = 1            1            Variable is assigned a value.
y = 2 * x        2            Variable is assigned the result of the expression.
x = [1,2,3,4]    [1,2,3,4]    Variable x holds a tensor/array.
(a,b,c,d) = x    [1,2,3,4]    Variable 'a' is set to 1, 'b' to 2, 'c' to 3, and 'd' to 4.

Functions


Functions use an arrow syntax similar to JavaScript. All function definitions are expressions that can be assigned to variables, tensor elements, or passed as function parameters.


In                           Out             Notes
f = x => x^3                                 Declare a function and assign it to 'f'.
                                             Single parameter & single expression body.
f(5)                         125             Call function f with a scalar value.
f([1,2,3])                   [1,8,27]        Call function f with a tensor whose shape is compatible with all operations in the function's body.
                                             Result is a tensor of the same shape.
f = (x, y) => x^y                            Declare a new function and assign it to 'f'.
f(3,4)                       81              Call function f with scalar values.
f([1,2,3], [4,5,6])          [1, 32, 729]    Call function f with tensors whose shapes are compatible with all operations in the function's body.
                                             Result is a tensor of the same shape.
f = x => {y = x + 1; x^y}                    Declare a new function and assign it to 'f'.
                                             The function has two statements in a code block.
                                             The value of the last statement is returned.
f(3)                         81              Call function f, which evaluates two statements and returns the last value.

Implementation Details


The language is implemented in Python (because it's a fun side project and performance isn't a concern).

The syntax is


Execution Stages


The code interpretation process occurs in three stages: lexing/tokenization, parsing, and evaluation.


Lexing (Tokenization)


Lexing (lexical tokenization) is the process of converting the source code text into a sequence of semantically meaningful tokens. This is done by repeatedly matching the regular expressions for the different token types against the start of the unmatched source code. Once a match is found, that portion of the source code is removed from the start and a token instance is created.
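The matching loop described above can be sketched in Python like this. The token types and patterns are hypothetical and much smaller than a real token set would be; tokens are plain (type, text) tuples rather than token instances.

```python
import re

# Hypothetical token types; the interpreter's real token set is not shown here.
TOKEN_PATTERNS = [
    ("NUMBER", re.compile(r"\d+(\.\d+)?")),
    ("NAME", re.compile(r"[A-Za-z_]\w*")),
    ("OPERATOR", re.compile(r"=>|[-+*/^#@=]")),
    ("BRACKET", re.compile(r"[()\[\]{}|]")),
    ("SEPARATOR", re.compile(r"[,;]")),
]

def tokenize(source):
    """Split source text into (type, text) tokens by matching at the start."""
    tokens = []
    remaining = source.lstrip()
    while remaining:
        for token_type, pattern in TOKEN_PATTERNS:
            match = pattern.match(remaining)
            if match:
                tokens.append((token_type, match.group()))
                # Drop the matched text (and any following whitespace) from the front.
                remaining = remaining[match.end():].lstrip()
                break
        else:
            raise SyntaxError(f"Unexpected character: {remaining[0]!r}")
    return tokens

print(tokenize("120cm + 15mm"))
```

In this sketch a value such as 120cm is split into a NUMBER token followed by a NAME token; whether the real lexer treats units this way is an assumption.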


Parsing


Parsing (syntax analysis) takes the tokens from the lexing stage and organises them into a parse tree or abstract syntax tree in accordance with the syntax. In this expression language the parse tree is built token by token: each token is converted into a tree node that enters at the tree's cursor and either stays in place or bubbles up the tree towards the root until it reaches the appropriate level according to the order of precedence.
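The cursor-and-bubble approach can be illustrated with a small Python sketch restricted to numbers and left-associative binary operators. The Node class, the PRECEDENCE table, and the sentinel root are assumptions made for this example, and the input is assumed to be well formed; the real parser handles many more node types.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

class Node:
    def __init__(self, value, precedence):
        self.value = value
        self.precedence = precedence
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self
        self.children.append(child)

def parse(tokens):
    """Build a tree from number and operator tokens, one token at a time."""
    root = Node("root", 0)  # sentinel with the lowest precedence
    cursor = root
    for token in tokens:
        if token in PRECEDENCE:
            node = Node(token, PRECEDENCE[token])
            # Bubble up while the parent binds at least as tightly.
            current = cursor
            while current.parent.precedence >= node.precedence:
                current = current.parent
            parent = current.parent
            parent.children.pop()  # detach `current`, always the last child
            parent.add_child(node)
            node.add_child(current)
        else:
            node = Node(token, float("inf"))  # operands are leaves
            cursor.add_child(node)
        cursor = node
    return root.children[0]

def to_prefix(node):
    """Render the tree in prefix notation to make its structure visible."""
    if not node.children:
        return node.value
    parts = [node.value] + [to_prefix(child) for child in node.children]
    return "(" + " ".join(parts) + ")"

print(to_prefix(parse("1 + 2 * 3".split())))  # (+ 1 (* 2 3))
print(to_prefix(parse("1 * 2 + 3".split())))  # (+ (* 1 2) 3)
```

In the second example the '+' node bubbles past the '*' node because '*' has the higher precedence, so the multiplication ends up lower in the tree and is evaluated first.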


If the parse tree is found to be incomplete after all tokens have been consumed (for example, parentheses have not been closed) and the evaluation is happening in an interactive environment, the interpreter will prompt for more input until either a valid parse tree is formed or an error is encountered.
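The prompting behaviour can be approximated in Python with a bracket check standing in for the "is the parse tree complete" test. This is a simplification under stated assumptions: is_complete and read_expression are hypothetical names, and the real interpreter decides completeness from the parse tree itself rather than by counting brackets.

```python
OPENING = {"(": ")", "[": "]", "{": "}"}
CLOSING = set(OPENING.values())

def is_complete(source):
    """Return True when every opening bracket in source has been closed."""
    expected = []
    for char in source:
        if char in OPENING:
            expected.append(OPENING[char])
        elif char in CLOSING:
            if not expected or expected.pop() != char:
                raise SyntaxError(f"Unexpected {char!r}")
    return not expected

def read_expression(read_line):
    """Keep requesting lines until the collected source is complete."""
    source = read_line()
    while not is_complete(source):
        source += "\n" + read_line()
    return source

# Simulate a user typing an expression across two lines.
lines = iter(["(1 + 2", ") * 3"])
print(read_expression(lambda: next(lines)))
```

In an interactive shell, read_line could be something like lambda: input("> ").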


Evaluation


The parse tree from the previous stage is evaluated dynamically by having the nodes in the parse tree evaluate their child nodes if applicable and then apply their own operation such as addition.
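A minimal Python sketch of this kind of tree-walking evaluation is shown below. The Number and BinaryOperation classes are hypothetical stand-ins for the interpreter's node types.

```python
import operator

class Number:
    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value

class BinaryOperation:
    OPERATIONS = {
        "+": operator.add,
        "-": operator.sub,
        "*": operator.mul,
        "/": operator.truediv,
    }

    def __init__(self, symbol, left, right):
        self.symbol = symbol
        self.left = left
        self.right = right

    def evaluate(self):
        # Evaluate the child nodes first, then apply this node's own operation.
        left_value = self.left.evaluate()
        right_value = self.right.evaluate()
        return self.OPERATIONS[self.symbol](left_value, right_value)

# Tree for 1 + 2 * 3
tree = BinaryOperation("+", Number(1), BinaryOperation("*", Number(2), Number(3)))
print(tree.evaluate())  # 7
```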


The actual computations are all performed in Python; there is no compilation step to machine code or to a language-specific VM bytecode.


Once the root node has been evaluated, its result is stored in a variable and printed to the screen.