Arturo: "Rebol-inspired" Language

"Arturo is a modern programming language, vaguely inspired by various other ones - including but not limited to Rebol, Forth, Ruby, Haskell, D, SmallTalk, Tcl and Lisp."


I'd seen that when it showed up on reddit. Not really sure what to make of it. But looking at the code example I will point out that 1..10 is now a valid TUPLE!.

>> type of 1..10
== #[datatype! tuple!]

>> first 1..10
== 1

>> second 1..10
== _

>> third 1..10
== 10

In fact, a..b is also a valid tuple...as is 1..(n - 1). However, a..b is evaluative by default...since it starts with a WORD!. [a]..b isn't, though, and 'a..b would get it to suppress evaluation.

Perhaps something to exploit in dialects that want to express ranges.
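As a rough illustration of how a dialect might exploit this, here is a minimal Python sketch (not Ren-C code). The names `parse_tuple` and `as_range` are hypothetical, and treating the range as inclusive of its end is an assumption. It splits `1..10` on dots the same way the tuple above decomposes, with the empty middle slot modelled as `None`, then reads the first and third slots as bounds.

```python
def parse_tuple(text):
    """Split a dotted literal into slots; an empty slot stands in for a blank."""
    return tuple(int(part) if part else None for part in text.split("."))


def as_range(text):
    """Treat a three-slot tuple with a blank middle as an inclusive range (assumed)."""
    first, second, third = parse_tuple(text)
    if second is not None:
        raise ValueError("expected a blank middle slot, as in 1..10")
    return range(first, third + 1)


print(parse_tuple("1..10"))    # (1, None, 10)
print(list(as_range("1..5")))  # [1, 2, 3, 4, 5]
```

The point is only that the three slots are already separated by the time a dialect sees the value, so no string parsing would be needed on the dialect's side.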


Arturo author here :slight_smile:

Since the language is rather new and Rebol-inspired (although by no means a Rebol derivative or compatible with any version), I would really like to hear your feedback.

Ideas / suggestions / contributions / anything are more than 100% welcome!


Hello Yanis,

I must say, not bad for a one man project. (I should translate that 'not bad' to 'well done').
There is a lot of similarity with Rebol. (In the end all programming languages will end up looking more like Rebol, won't they?)

So why the need to develop yet another new programming language, other than the fact that you must have learned very much in the process by just doing it?

Hope you will have a lot of followers and contributors, because there are too many talents going to waste in the giant populations of the top ten languages.

Best,

iArnold

Arnold, Hi!

I appreciate your comment a lot. :slight_smile:

And you sure do have a point - I'm not pretending I have the solution to a problem that nobody else thought of.

Basically this is the way the story goes: I've been programming since I was a kid, back in 1994, and I've always had a deep interest not only in programming but also in the things many could probably find boring: programming languages, compilers, kernels and chess engines.

Guess what, 26 years (and a weird choice of deciding to study... medicine) later, I'm still interested in the very same things lol - and have designed at some point all of these in different languages and different implementations.

Although my main "business" is publishing and mac applications, I still keep coming back to what I really love. And although it doesn't pay the bills by itself, it's surely very rewarding.

This is how the idea of Arturo came up (aside from being my little pet Emperor scorpion that is):

to make something that I myself will use as an easier do-it-all scripting language, you know... automation scripts, templating, latex generation and perhaps help me a bit in the packaging of webview-based applications (think of Electron, but something far more manageable and without having to deal with Node.js and the likes).

As to what this language would look like, believe me, I've drawn numerous (read: LOTS) sketches throughout the years. Adding, removing, rethinking (and then rethinking things over again). And that's pretty much how the language has been shaped.

Basically, the main ideas behind Arturo are:

  • Simple, flexible syntax (hence the REBOL inspiration; perhaps it's the best least-known language out there - not that I don't see why...)

  • No OOP, nor anything overly bloated like that

  • Ability to create whatever DSL you want easily

  • Easy access to functional programming

  • Keep the easy-and-readable part of REBOL but still allow for shorter, more symbolic syntax if needed (there's already a lot of such syntactic sugar in Arturo). I'm not saying it has to resemble BF, but I definitely appreciate lots of relevant features of Ruby or Haskell.

  • Be fully independent, no install 1-2-3 in order to get something working (btw, although I've obviously written my share of Rebol/Red code, I've never managed so far to get them working on my Catalina 64-bit MacOS - Docker-based aside)

  • Batteries-included philosophy (I hate to have to import one million different libraries only to do something simple and also remember what they are called. Most core functions should already be included.) Also, size doesn't matter. (not meaning that I'm willing to end up with a 10MB binary)

  • Ambitious, but oh well: go full self-hosted. (Basically, my next step is to write an Arturo to Nim transpiler in... Arturo)

Whether Arturo and I will make it or fade into history remains to be seen.

Sure thing is the journey has been more than awesome :slight_smile:

So... if you want to participate in any way (even constructive feedback is more than enough), you'll be all 100% welcome!


Greetings, thanks for posting.

although I've obviously written my share of Rebol/Red code, I've never managed so far to get them working on my Catalina 64-bit MacOS - Docker-based aside

The Ren-C branch runs on many a thing...besides 64-bit, there's Android and WebAssembly.

If you didn't see last year's conference videos:

https://2019.reb4.me/talks

So... if you want to participate in any way (even constructive feedback is more than enough), you'll be all 100% welcome!

My version of this statement would be: "if you think you've resolved any of the thousands of open questions about Rebol language semantics or mechanics, feel free to participate by posting your feedback on an issue". (!)

The problem I see with starting over from scratch is that it will lead to running into the same issues years down the line. Several smart people tried for years to work on the many unanswered questions of Rebol and most gave up on it--including Carl himself. Projects that ignore that do so at their peril...and run the risk of making something that is easily dismissed as broken and brittle.

Ren-C has been about clawing one accomplishment at a time on the basic issues. It's a slow and difficult path...but by locking features and invariants down one at a time I believe the chances of ending up with a notable language artifact are much higher.


@hostilefork

I perfectly understand what you are saying.

However, there is a catch: when I started designing Arturo, I knew nothing about Rebol. Basically, I knew the name, but that's pretty much it.

I started reading about it and playing with Rebol only when I discovered by chance that what I had designed was apparently similar to something that already existed. And even then, apart from trying things and reading up on the basics, I've decided - 100% on purpose - not to delve too much into it, so as to avoid being too influenced and ending up with a copy of something (undoubtedly very interesting) that was already there.

So, I totally get your point. I'm not even saying that I've done things right.

But the way things have been done in Arturo, I wouldn't be too sure that the language shares its issues - simply because I highly doubt it even shares its virtues, or anything at all that is - apart from superficial similarities. (If you try it out, perhaps you would be the ideal candidates to tell me if it does and to what degree :slight_smile: )

If you have an example of what you think is virtuous Arturo code, then I could probably offer an alternative rewrite of it to show another way to look at the same problem.

The first sample given crosses over a point I've brought up, about whether or not the loop variables should have to be quoted...see "Speaking With Tics".

loop 1..10 'x [
    if? even? x -> print [x "is even"]
    else        -> print [x "is odd"]
]

Decorations like ? on if? are something that Rebol generally tried to avoid. Ren-C is taking it further, e.g. by making OF an infix function that quotes a reflector word on its left... so you write type of x instead of type? x. (There was a push among the open-source Rebol users to say that ending in a ? was reserved for boolean functions...due to disapproval of the idea of ending things with ? to signify that they "are functions returning a value")

Historical Rebol avoids the idea of an ELSE. But Ren-C has it...written as a generic "null reactive" infix operator. If what is on the left is NULL then it runs.

>> null else [print "left was null"]
left was null

>> if true [10]
== 10

>> if false [10]
; null

>> if false [10] else [20]
== 20

This makes ELSE work with other constructs (like SWITCH or CASE) or anything that produces nulls. There's a complementary THEN:

switch x [
    1 [print "it was one"]
    2 [print "it was two"]
] then [
    print "it was either one or two"
] else [
    print "it was neither one nor two"
]

You can also make the branches functions instead of pure blocks, and use lambdas to get the results. (x -> [code] is a shorthand for func [x] [code], except it has no RETURN of its own...hence a return inside of it is still bound to the containing context's return)

switch x [
    1 [<one>]
    2 [<two>]
] then arg -> [
    print ["branch runs if arg is <one> or <two>:" arg]
] else [
    print "it was neither <one> nor <two>"
]
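
To make the "null reactive" idea concrete for readers coming from other languages, here is a minimal Python model. It is only a sketch, not how Ren-C implements THEN and ELSE: `Result`, `switch` and `describe` are hypothetical names, `None` stands in for NULL, and the sketch does not model what happens when a branch itself evaluates to null.

```python
class Result:
    """Wraps the outcome of a branching construct; None means 'no branch ran'."""

    def __init__(self, value):
        self.value = value

    def then(self, branch):
        # Runs only when something ran on the left; passes that value along.
        if self.value is not None:
            return Result(branch(self.value))
        return self

    def else_(self, branch):
        # Runs only when the left side is the null stand-in.
        if self.value is None:
            return Result(branch())
        return self


def switch(x, cases):
    """Return the matching case's result, or the null stand-in if none match."""
    return Result(cases[x]() if x in cases else None)


def describe(x):
    cases = {1: lambda: "<one>", 2: lambda: "<two>"}
    return (
        switch(x, cases)
        .then(lambda arg: f"matched {arg}")
        .else_(lambda: "neither <one> nor <two>")
        .value
    )


print(describe(1))  # matched <one>
print(describe(3))  # neither <one> nor <two>
```

Because `then` and `else_` only look at whether the left-hand result is the null stand-in, the same two operators work after any construct that follows the convention, which is the property described above.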

The NULL-ness of control constructs that don't branch is a nice tool.

count-up x 5 [
    print [x "is" (if odd? x ["not"]) "even"]
]

1 is not even
2 is even
3 is not even
4 is even
5 is not even

We unfortunately don't have much going on in terms of looping dialects. I'd like to see something as powerful as Lisp's LOOP macro, but ideally more accessible. Seeing language like [1 to 10] (not including 10) vs. [1 thru 10] (including 10) would be nice to flesh out vs. going too heavily into shorthands and symbols for the same purpose. But I mention that ranges are something that might be an interesting application for the generic TUPLE! type.
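
For what it's worth, the TO versus THRU distinction is small enough to model in a few lines. The Python sketch below is an illustration only: `loop_range` is a hypothetical helper, and the three-element list is just a stand-in for a block like [1 to 10].

```python
def loop_range(spec):
    """Interpret [start, 'to' | 'thru', end] as a Python range."""
    start, keyword, end = spec
    if keyword == "to":
        return range(start, end)      # excludes the end value
    if keyword == "thru":
        return range(start, end + 1)  # includes the end value
    raise ValueError(f"unknown range keyword: {keyword!r}")


print(list(loop_range([1, "to", 4])))    # [1, 2, 3]
print(list(loop_range([1, "thru", 4])))  # [1, 2, 3, 4]
```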

But the way things have been done in Arturo, I wouldn't be too sure that the language shares its issues - simply because I highly doubt it even shares its virtues

I looked at the page again, and you do say "vaguely inspired". So truth in advertising, I suppose. :slight_smile:

IMHO, there's some pretty amazing machinery in Ren-C. Again: the best I can offer is just to show you "this is how we would do that" for any definition of "that".


The way Arturo handles this is: in this type of construct, where you need parameters and an action, the first one can be either a literal (if it's just one argument) or a block (if it's one or more). So, this particular example could also have been written like:

loop 1..10 [x] [ ... ]

Here I was probably more influenced by the way Ruby does things, but taking it to an extreme level: if we use ? to signify something, then we'd use it in every single such case - even if it is by convention.

So, if ? is meant to signify returning a boolean, then every single function returning a boolean would be suffixed with ?, while the ones that don't, won't. For me, that's consistency. When I noticed that Rebol's type? just returned a value, while other ?-suffixed functions returned a boolean, and other functions returning a boolean had no ?, I was totally perplexed to be honest.

Btw, in Arturo, there are two different ifs: if and if?. The first one just checks the condition and executes (or not) the following action block. if? also returns the value (of the condition). This way (and this is the only one) it can be branched with an else. (Basically, else works pretty much like you say - by checking what's left on the stack, in this case by the preceding if? - again, I had no idea Ren-C was doing it like that. It just seemed to me a natural way of having an else statement that could work for ifs, switch-type statements, and even try-catch statements.)

To highlight my point, this in Arturo would be like: false else [print "here"]

It's not about me trying to "market" the language as something that it's not. It's just that I mentioned Rebol so that anyone who already knows the language would get an idea of what this new language is about. As something similar, which it sure is.

Again, the perfect truth is, when I was designing Arturo, I had no clue what Rebol already did. And I'm still amazed that I keep coming across details that I hadn't even heard of: e.g. the thing you mentioned about the else statement. I guess I'm re-inventing a wheel whose existence I totally ignored lol.

I will absolutely do that and give you many "that"s. Just give me some time.

I already totally appreciate your input :slight_smile:


The exact same thing written in Arturo:

which - with a tiny bit more syntactic sugar - could be written like:

loop 1..5 'x ->
    print [x "is" if odd? x -> "not" "even"]

Output:

1 is not even 
2 is even 
3 is not even 
4 is even 
5 is not even 

That was presumably born of the feeling that TYPE was too much a "noun" variable name to be "verbified"...so it was chosen to use the ? to "queryify it". I'd like to see more nouns retaken by default, e.g. switch to HEAD OF instead of HEAD.

The ability to make fast specializations with POINTFREE comes in handy here. I think it's more about letting people bend to what they want...fast...than any particular prescriptivism.

Been there, tried that.

The design of NULL is such that it represents a "non-value", e.g. it cannot be put in a BLOCK!. It is chosen as the exclusive "non-result"...returning null means no branch ran. So if a branch happens to incidentally evaluate to null, it is "voidified":

>> type of (if false [<x>])
; null

>> type of (if true [null])
== #[datatype! void!]  ; VOID! is like a Rebol2 UNSET!

This allows one to discover from the outside of a branching construct whether a branch was taken or not, with only one version.
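
A small Python model may help show why the distinction is useful. This is a sketch of the idea only: `None` stands in for NULL, and `Void`, `VOID`, `if_` and `branch_ran` are hypothetical names rather than anything from Ren-C.

```python
class Void:
    """Stand-in for VOID!: a branch ran, but its result was null."""

    def __repr__(self):
        return "VOID"


VOID = Void()


def if_(condition, branch):
    """Return None only when the branch did not run."""
    if not condition:
        return None
    result = branch()
    # "Voidify" a null branch result so it cannot be confused with "no branch ran".
    return VOID if result is None else result


def branch_ran(outcome):
    """Tell from the outside whether a branch was taken."""
    return outcome is not None


print(branch_ran(if_(False, lambda: "<x>")))  # False
print(branch_ran(if_(True, lambda: None)))    # True
```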

Because voidification may not be appropriate for all situations, a proposed behavior is to distinguish the branch as opposed to the construct, by using a different type of BLOCK! (a SYM-BLOCK!):

>> if true @[null]
; null

Using annotations on the blocks instead of having to name a new construct offers expressive power, e.g. with quoting:

>> either true [print "Hello"] '[print "Hello"]
Hello

>> either false [print "Hello"] '[print "Hello"]
== [print "Hello"]

Trying to get more mileage out of the menagerie of data types instead of names is sort of a theme.

loop 1..5 'x ->
    print [x "is" if odd? x -> "not" "even"]

I do think that the arrows have a disadvantage vs blocks in that they aren't paired delimiters, so you're paying two characters already without establishing the pair. (That, and I've found them very good for lambda.)

One advantage of NULL being non-valued is that it is unambiguous what you mean when you say:

 >> compose [<a> (if false [<b>]) <c>]
 == [<a> <c>]

The NULL can't be mistaken for anything that could be put into a block (as opposed to Rebol2's #[none]...which is a BLANK! (_) in Ren-C). So it really can vaporize.
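
Here is a rough Python analogue of that COMPOSE behavior, offered as a sketch. `compose` and `if_` are hypothetical helpers, callables stand in for the parenthesized groups, and `None` stands in for NULL. The analogy is imperfect, since Python lists can hold `None` while a BLOCK! cannot hold NULL.

```python
def compose(template):
    """Build a list, calling each callable slot and dropping null results."""
    out = []
    for item in template:
        if callable(item):
            value = item()
            if value is None:
                continue  # the null "vaporizes": nothing is appended
            out.append(value)
        else:
            out.append(item)
    return out


def if_(condition, branch):
    """Return the branch result, or the null stand-in when the branch is skipped."""
    return branch() if condition else None


print(compose(["<a>", lambda: if_(False, lambda: "<b>"), "<c>"]))  # ['<a>', '<c>']
print(compose(["<a>", lambda: if_(True, lambda: "<b>"), "<c>"]))   # ['<a>', '<b>', '<c>']
```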


In case you are interested in seeing a bit more about what I'm working on, I've just published a first (quite complete) reference of the language:

It sure needs various revisions. But in the meantime, feel free to comment on anything. (If you spot things that you consider really different than in Rebol/Ren-C, I would also be interested to know)

:slight_smile:

if x=2 -> print "x was 2!"

is the same as writing:

if x=2 [ print "x was 2!" ]

Just remember that -> can wrap only one terminal value.

But how can you calculate how big "one terminal value" is? Especially when it's a branch you are skipping for some reason... quite possibly because the very things needed to run the branch aren't defined?

Pseudocode:

 if function? :foo -> foo bar baz

If foo were a function, then perhaps you could get its arity...and look up bar and if it's a function or terminal, and proceed down the line. But we're trying to skip it because it isn't a function. And nothing guarantees bar is defined either. It's the branch not taken.

You can do this kind of thing in a compiled language where definitions are pinned down and not able to change. But a dynamic interpreted language needs the blocks. Even if sniffing the expression could be done reliably, you'd still be wasting time calculating that boundary when skipping a block is practically free.
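
The boundary problem can be shown with a toy model. The Python sketch below is an assumption-laden illustration, not how either interpreter works: `expression_end` is a hypothetical helper, tokens are plain strings, and `arities` is a made-up table of known definitions. Finding where one expression ends requires the arity of every word along the way, and it fails as soon as a word is undefined.

```python
def expression_end(tokens, start, arities):
    """Return the index just past the expression beginning at tokens[start]."""
    if start >= len(tokens):
        raise LookupError("ran out of tokens while sizing the expression")
    token = tokens[start]
    if token[0].isdigit() or token[0] == '"':
        return start + 1  # a literal is one terminal value
    if token not in arities:
        raise LookupError(f"cannot size expression: {token!r} is undefined")
    position = start + 1
    for _ in range(arities[token]):  # consume each argument in turn
        position = expression_end(tokens, position, arities)
    return position


tokens = ["foo", "bar", "baz"]
print(expression_end(tokens, 0, {"foo": 1, "bar": 0, "baz": 0}))  # 2
print(expression_end(tokens, 0, {"foo": 2, "bar": 0, "baz": 0}))  # 3
```

The same three tokens end at different places depending on the definitions in force, and with an empty table the function raises `LookupError`. A delimited block avoids all of this because its end is known from the brackets alone.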

split .every: 2 "hello world"
; => ["he" "ll" "ow" "or" "ld"]

From time to time, people who think saying append/dup [a b] 'c 2 = [a b c c] separates /DUP from its 2 argument, will suggest moving the /DUP to be nearer the 2. They want something like append [a b] 'c /dup 2 or append /dup 2 [a b] 'c.

But this causes trouble when what you supply to those arguments have optional parameters as well. You get into a prioritization question of "whose /dup is that". It means that people adding optional parameters to a function that didn't have that parameter before can break existing code.

So let's say SPLIT also has an OPTION:

split .every: calc-split-val .option: true "hello world"

At time of writing, imagine that CALC-SPLIT-VAL is a function that gives back an integer. Then one day someone edits CALC-SPLIT-VAL so it gets its own .OPTION.

If the only solution to such brittleness is parentheses...then people will start adding parentheses as a "best practice, just to be safe". Pretty soon you wind up back to traditional looking function calls with parentheses again.

We might soon have a working answer to the desire. Due to having an interesting ability to mix TUPLE! and PATH!, I've considered options like:

 append/dup.2 [a b] 'c
 append/dup.n [a b] 'c
 append/dup.(n + 1) [a b] 'c
 append/dup [a b] 'c 2  ; would still be legal

But generic TUPLE!s are new, and the optimizations and formalizations behind PATH!/TUPLE! are new. So it will be some time before that

print user\name

Backslashes are problematic in a lot of scenarios...and if you're passing code fragments around in C strings or to shell scripts it gets pretty ugly. I think slashes look better and they are in a more comfortable position to type on most keyboards.

Ren-C PATH!s are allowed to have blank elements. So what looks like a lone / is actually a length two PATH! of two blanks. There's some magic mechanics which allow these to dispatch as functions (e.g. divide).

Characters in Arturo can be declared using backticks: \w

Backticks are another one of those "problem characters", used typically in Markup to call out code. In fact, in the quote above of \w it lost the backticks.

Ren-C got rid of the idea of a separate character type and instead has a non-series string form that is immutable, e.g. ISSUE!. This means #a can act as a character and be asked for its codepoint, but #abc is also in the same family. They're not technically strings so it's not an infinite regression:

>> first "abc"
== #a

>> first #a
== 97  ; codepoint value, so you don't get #a again

I think it's pretty fantastic, and it's especially nice to be able to have so many common characters not require delimiting. #a is very clean.

"inline"

Not using Rebol's name PAREN! is an important thing to reject. Terrible name.

But I think GROUP! is a much better name for (...)


Good morning (here it's 6+am) and thanks a lot for your points!

Let me go through them.

But how can you calculate how big "one terminal value" is? Especially when it's a branch you are skipping for some reason... quite possibly because the very things needed to run the branch aren't defined?

Regarding the first point, you're obviously right. In general, given that the only way to figure what goes with what (I mean what is whose parameter) is to know all of the functions beforehand, deferred evaluation (anything in a block practically) certainly makes things complicated. However, I'm still trying to figure out how this could be approached. On one hand, I absolutely love the idea of being able to write more terse code like the one allowed by adding some "sugar". On the other hand, this would probably add at least another pass of the parser/eval. We'll see...

As you can already imagine, I'm also in the process of deciding several things. (Btw, regarding the interpreter/compiler divide... one thing I've been thinking of is to re-shape the current implementation into an Arturo-to-Nim transpiler, and basically rewrite the whole interpreter in Arturo - and slowly move to a more... self-hosted implementation. But we'll see - this may be too ambitious).

At time of writing, imagine that CALC-SPLIT-VAL is a function that gives back an integer. Then one day someone edits CALC-SPLIT-VAL so it gets its own .OPTION.

If the only solution to such brittleness is parentheses...then people will start adding parentheses as a "best practice, just to be safe". Pretty soon you wind up back to traditional looking function calls with parentheses again.

Right now, the way "attributes" work in Arturo is a bit... weird (I have no idea at all if refinements work in the same way): basically, you can set an attribute ANYWHERE in your code, it's not directly tied to a function-call. Think of it as a separate stack. An empty attribute (:attr), let's say .always is like pushing a tuple ("always", true) to our attribute stack. An attribute label (:attrLabel) is the same thing only with a user-defined value (.every: 3).

Now, it is the functions that "consume" these values and pop attributes from the attributes stack at will. For example:

f .with: z g x

(where f, g are functions, .with: z is our attribute, and x is the value passed to g).

In theory, in the above example, the attribute "goes with" function f. In reality, it is available to both functions until one of them needs it or checks it, at which point it is popped from the stack and consumed. So, if g consumes it, there's no .with for f. And if it doesn't, f may consume it. And if neither does, it will remain on the stack. If we wanted to use the same attribute for both functions, we would have to push it twice: f .with: z g .with: q x (and no, the attributes don't necessarily have to follow the function's name; it just looks better than saying: .with: z f .with: q g x, although we obviously could).
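
The mechanism described here can be modelled in a few lines of Python. This is only a sketch built from the description in this post, not Arturo's actual implementation: `attributes`, `push_attribute` and `pop_attribute` are hypothetical names, and it assumes that g (the argument) is evaluated before f.

```python
attributes = []  # shared attribute stack; each entry is a (name, value) pair


def push_attribute(name, value=True):
    """Push an attribute; a bare attribute like .always defaults to True."""
    attributes.append((name, value))


def pop_attribute(name):
    """Remove and return the most recently pushed value for `name`, else None."""
    for index in range(len(attributes) - 1, -1, -1):
        if attributes[index][0] == name:
            return attributes.pop(index)[1]
    return None


def g(x):
    with_value = pop_attribute("with")  # g runs first and may consume .with
    return f"g({x}, with={with_value})"


def f(x):
    with_value = pop_attribute("with")  # nothing is left for f if g took it
    return f"f({x}, with={with_value})"


push_attribute("with", "z")  # models: f .with: z g x
print(f(g("x")))             # f(g(x, with=z), with=None)
```

In this model the attribute written next to f ends up consumed by g, which is the kind of action at a distance the earlier question about "whose option is that" was probing.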

Backslashes are problematic in a lot of scenarios...and if you're passing code fragments around in C strings or to shell scripts it gets pretty ugly. I think slashes look better and they are in a more comfortable position to type on most keyboards.

Ren-C PATH!s are allowed to have blank elements. So what looks like a lone / is actually a length two PATH! of two blanks. There's some magic mechanics which allow these to dispatch as functions (e.g. divide).

I know Rebol/Ren-C use / as a path-delimiter. However, I needed a different symbol for that... simply because the "natural" use of / for me would be to denote division (and I think for the vast majority of programmers). And if I had used it there, using it for something so distinct would be confusing even for me.

That said, I must admit that the whole story with paths (especially in one case: where one of the indexes is not a constant... e.g. user\name looks acceptable, but user<variable> has to be handled somehow) is something that has been bugging me... and it will change one way or the other. I'm not sure if I'll change the symbol used, but I'm 100% sure that I will change something.

I think it's pretty fantastic, and it's especially nice to be able to have so many common characters not require delimiting. #a is very clean.

Again, this is one of the things that confused me. Basically, this is the way I see it:

I wanted to have two functions/symbols used to "arra-ify" and "dictionar-ify" a block. Basically, the first one would calculate the block and get the values on the stack (I think that's pretty much what reduce does, no?), and the second one would calculate the block and get the declared symbols into a hash table. For the first case, we use the function array (or @) and for the second one, the function dictionary (or #). Both of the symbols are among my earliest commitments and preferences regarding what would symbolize what.

So, basically, the # is already reserved for a specific use case. And a rather super-common one (at least for me). Wasting it on character literals would be a shame. (Basically, I get your point, although to be honest I've been tempted to totally eliminate character literals. Initially, the only way to create a character was something like: to :char "x". I'm not sure I won't revert back to that...)

Not using Rebol's name PAREN! is an important thing to reject. Terrible name.

But I think GROUP! is a much better name for (...)

Well, PAREN does sound a bit weird, truth be told lol. I basically chose inline quite naturally. It was the "inner" way I distinguished them from square-bracket blocks. And since the basic difference is that the first ones call for inline evaluation while the second ones don't, hence the name.

Cheers :slight_smile:

You seem to have been inspired by the "no keywords" part of the philosophy of Rebol. (If, for instance, true and false can be redefined...it was non-obvious from your documentation that they could.)

But "space significance" is another pretty big premise in the Rebolverse. There's a near-majority agreement that the only things that should be allowed to break the rule are ][, )(, ](, and )[

There's only so many ASCII characters. And having spaces between things can be a virtue. I think that [a+b/c] has no particularly obvious virtue compared to [a + b / c]. And when you are willing to let spaces do their visual magic as the most natural way to seeing where symbols end...you gain access to more lexical space.

Regardless of what "most programmers" may be used to, I think it's fairly easy to see a/b and a / b as distinct...not much harder than seeing ab and a b as distinct. And if you're habituated to that kind of thinking, it opens many doors.

Right now, the way "attributes" work in Arturo is a bit... weird

Your description does sound... weird.

Guess I will just cite Alan Perlis:

"Beware of the Turing tar-pit in which everything is possible but nothing of interest is easy."

Compare with:

>> append [a b c] [d e]  ; append is arity-2 by default
== [a b c d e]

>> append/dup [a b c] [d e] 2  ; /dup adds an integer parameter to be arity 3
== [a b c d e d e]

>> apd: :append/dup  ; make APD as specialized form, assumes /DUP

>> apd [a b c] [d e] 2  ; APD is thus arity 3
== [a b c d e d e]

>> f: make frame! :append   ; make an object representing call frame for append

>> f/dup: 2  ; set the DUP field in the frame to 2

>> apd2: make action! f  ; create an arity-2 function out of the frame

>> apd2 [a b c] [d e]
== [a b c d e d e]

There's a lot of deep detail in the mechanics; e.g. of how that F can remember it's a frame for APPEND, and moreover if it were a frame for a RETURN function instance, it would even remember what function it was a return for.
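
For readers more at home in mainstream languages, the closest everyday analogue is probably partial application. The Python sketch below is a loose comparison under stated assumptions, not a description of Ren-C's mechanics: this `append` is a hypothetical stand-in that returns a new list, `functools.partial` plays the role of the specialization, and `inspect.Signature.bind_partial` is used as a rough analogue of filling in a frame's fields before the call.

```python
import inspect
from functools import partial


def append(series, value, dup=1):
    """Return a new list with the items of `value` appended `dup` times."""
    return series + list(value) * dup


# Like building APD2 from a frame with DUP set to 2: fix dup ahead of time.
apd2 = partial(append, dup=2)

# A rough analogue of a frame: arguments bound to names before any call happens.
frame = inspect.signature(append).bind_partial(dup=2)

print(apd2(["a", "b", "c"], ["d", "e"]))  # ['a', 'b', 'c', 'd', 'e', 'd', 'e']
print(frame.arguments["dup"])             # 2
print(list(frame.signature.parameters))   # ['series', 'value', 'dup']
```

The bound-arguments object keeps a reference to the signature it was made from, which is a faint echo of a frame remembering which function it belongs to.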
