Language Notes

Another Proposed Lambda Syntax
[2009.10.29]
stuff foreach: λ (item) send: item string
Assembly
[2009.4.25]

'r0', etc. become names of registers, even perhaps in normal Trylon. Target-specific register names too, I suppose. But any sends to them are intercepted by the compiler. One must know what one is doing: values will be translated between the two worlds unsafely.

    r0 = File contents-of: 'test'
    r0[word-size] = r[0]
    r0 += word-size
    eax = eax string    # Unsafe!  Dispatch_() kills eax.
    r0 = r0 string      # Probably safe; r0 is chosen from "free" registers.
    r0 = r2 + r5
C Structures
[2009.3.25]
    c-struct mg_request_info
        string request_method, uri, query_string, post_data, remote_user
        int remote_ip, remote_port, post_data_len
        int http_version_major, http_version_minor
        int status_code, num_headers

Automatically generates a class wrapping a BytePtr, with accessor methods for the fields. The structure definition must be visible in C, probably through the preamble. How to deal with "struct mg_request_info" vs. a typedef'd anonymous struct? How to deal with read-only vs. read-write fields? Writing in general -- when to copy strings?

Uses
[2009.3.7]

Turns out the new parser doesn't implement a "uses" statement; the old one didn't either, apparently! What does a "uses" mean currently? It must only be used to get "Standard" into "Main".

Namespacery
[2009.3.4]

Maybe it really is better to have subclasses also be in the namespace:

    class Collection
        class Tuple
            superclass = Collection
                # New assignment-style (but not yet an actual assignment) superclass declaration.
            ...
        class List
            superclass = Collection
                # Or maybe we really want it to be an assignment?  It might be nice to
                # have that info available at runtime.
            ...

    args = Collection List new

But then we're tempted to "use" Collection just to get those names. The solution gets a little subtle:

    class Collection
        class Collection    # The virtual superclass.  Known to insiders and outsiders alike as "Collection".
        class Empty         # So outsiders can say "Collection Empty".
        class List          # The rest are just "List", "Tuple", etc.

But we need "Empty" in the outer scope so someone who *doesn't* "use" Collection can say "Collection Collection"... oy...

What was my point?

Anyway, here's the CFunction library demonstrating the alternative:

class CFunction class GenericCFunction class VoidFunction superclass GenericCFunction class IntFunction superclass GenericCFunction class StringFunction superclass GenericCFunction class Copied superclass GenericCFunction class Uncopied superclass GenericCFunction
C Function Calls
[2009.3.4]

For a while I've been trying to think about whether adding C-style function call syntax, probably doing C-ABI function calls, would work. And I think we could get it to work with other ABIs (Java, Python) too. Have it be syntactic sugar:

printf("%d\n", num) # == printf call: ("%d\n", num)

Nullary call:

    class VoidFn
        call: args
        call
            call: Collection Empty

[Hey, "Collection Empty" is finally a real use for the prototype objects.]

C calls would have args unboxed at runtime. That may require the use of an FFI lib. Different classes of C calls handle return values:

    class CFunction
        class VoidFn
            call: args
                # Returns nil.
        class IntFn
            call: args
                # Returns an Int.
        ...

Also want a facility for declarative argument checking:

    class Posix
        class popen
            superclass = CFunction PtrFn
            call: args
                check-args: args against: ('string', 'string')
                return super call: "popen" with: args

    class CFunction
        check-args: args against: types
            which-arg = 0
            for type in types
                arg = args at: which-arg
                ok = false
                switch type
                    'string'
                        ok = (type == String)
                    'int'
                        ok = (type == Int)
                    ...
                if !ok
                    throw MessageException new: "Type mismatch in C function call."
                which-arg += 1

Or even:

class Posix popen = VoidFn new: ('string', 'string')

...except we can't declare a shared variable to be initialized to a non-constant value.

Action Objects
[2009.1.28]

Verb-like objects for more shell-scripting style:

    git pull
    git commit
    git commit: "foo.c", "bar.c"    # (with ',' binding tightly)

Actually, that's rather noun-like. And with real verbs like "ls", we run into a problem at the leaves. Maybe.

    ls
    ls: foo/bar    # Assume a "Path" object like in Potion.

But the first will do nothing, it is just a reference.

Anonymous Objects
[2008.6.6]

Finally using the curly braces and the semicolon:

    class KeyValuePairsIterator
        current-item
            if stack-top == nil
                return nil
            node = stack-top node
            return { key = node key; value = node value }

Or can we use the comma? I think so:

return { key = node key, value = node value }

Yeah, that's easy enough to parse, at least currently when ',' has the least binding power. And it has the advantage of not adding any new punctuation.

Implementation can be optimized by sorting the fields alphabetically, and keeping a global dictionary with (orderable) tuples of the names as keys, so there's only one class of { key, value } objects, no matter how many times such objects are created.

Multiple value returns seem to be the main use case for this, and they are rare, so this isn't a high priority.

Packages Over The Internet
[2008.5.19]
    trylon HTTPServer
    superclass http://some.guy.com/trylon-packages/TinyHTTPServer

    create
        super create: "foo"

How does this fit into the namespace system? "HTTP some.guy.com TinyHTTPServer"? Use that syntax instead?

    trylon HTTPServer
    superclass HTTP some.guy.com/trylon-packages TinyHTTPServer

No, I think this would be better:

    trylon HTTPServer
    superclass http://some.guy.com/trylon-packages TinyHTTPServer

Either that, or the first one, and have the package named with the full URL in Main. Yes, that. The only odd thing about that is the internal colon in the unary selector -- which is easy to define as correct: only words *ending* in a colon are keywords.

Also allow Git (eventually maybe Hg, Svn, etc.). This would grab the 'master' branch. Pull on every compile? Probably not.

Need a place to put these; specified as "remote-packages" in build-settings. Default to "~/trylon-remote-packages". It'd be nice to have a "~/.trylon-build-settings" file too for per-user global settings. So the example would be stored as "~/trylon-remote-packages/http/some.guy.com/trylon-packages/TinyHTTPServer/". "main" is grabbed by default, anything else is downloaded on demand. Try "wget" first, then "curl -O".
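
The URL-to-directory mapping might look something like this in Python. The function names are hypothetical; only the directory layout and the "wget, then curl -O" order come from the note above:

```python
from urllib.parse import urlsplit

def remote_package_dir(url, root="~/trylon-remote-packages"):
    """Map a package URL to its local directory under the remote-packages root."""
    parts = urlsplit(url)
    # root / scheme / host / each path component, skipping empty components.
    pieces = [root, parts.scheme, parts.netloc]
    pieces += [p for p in parts.path.split("/") if p]
    return "/".join(pieces) + "/"

def fetch_commands(url):
    """Commands to try, in order, when a file has to be downloaded."""
    return [["wget", url], ["curl", "-O", url]]

example = remote_package_dir("http://some.guy.com/trylon-packages/TinyHTTPServer")
```

Here `example` is the same path as the one spelled out above for TinyHTTPServer.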

Someday, perhaps "http:" should handle "tar.gz", ".tgz", etc. as well.

Foreign Functions
[2008.4.17]

Replace "c-fn" with this:

    foreign c == arg
        return Bool_(BytePtrValue_(this_) == BytePtrValue_(arg));
    foreign jolt == arg
        (if (== (int@ self) (int@ arg)) true false)

What to do when the language isn't targeted? Currently it's an error, so "iff"s must be used. But the above shows the possibilities of simply not defining the function if the language isn't targeted.

Another possibility is foreign blocks (with a "two-level" syntax akin to "switch"):

    foreign c
        == arg
            return Bool_(BytePtrValue_(this_) == BytePtrValue_(arg));
        != arg
            return Bool_(BytePtrValue_(this_) != BytePtrValue_(arg));

I guess I prefer the simplicity of the first alternative, and don't really find it too repetitive.

Far-Out Stuff
[2008.3.6]

Still trying to unify methods and objects. Any object can be considered the activation frame of a function call. Necessitates going to a purer prototypism (var declarations create fields, not shared fields).

    String =
        start = nil
        end = nil
        create: other =
            if other is-a: String
                this start = other start
                this end = other end

Umm, there seems to be some ambiguity about what "this" is. Or maybe not; maybe it always sorta points up a level lexically. So when 'create: other' is executing, 'this' would be a String, passed in like an arg. When 'String' is evaluating, 'this' would be... [String itself, I suppose.]

Anyway, another far-out idea: unify object references and BytePtrs. All objects respond to BytePtr's methods; any 'does-not-understand:' on those selectors is trapped. That includes if the object's classref is random bits.

Interpreter
[2008.2.5]

A file like this (named "foo"):

    #!/usr/bin/env trylon
    trylon foo

    fields options

    create
        options = Set new

    name
        return "My name"

    clean
        sh: "rm", ".objects/*"
            # ^ Experimental precedence change.
            # ',' binds more like a binop.
            # Disadvantage is: ("foo", (foo bar: baz)).  Extra parens.

    add-option: option
        options add: option intern

Should respond to shell commands like:

    foo name    # Prints name on stdout.
    foo clean
    foo add-option: svn

Parse binops or not? Think about field storage (and creation) later. (".foo.values" file in same dir?)

"cond"
[2008.1.30]
    cond
        player is-recording
            draw-icon: record-icon at: icon-loc
        player is-playing-prerecorded
            draw-icon: movie-icon at: icon-loc
        else
            draw-clock-at: icon-loc
"return:" and "throw:"
[2008.1.30]

These could be turned into commands, paving the way for user-defined control structures.

    if options contains: 'test'
        return nil
    if source == nil
        throw: (MessageException new: "Missing source!")
    if source == nil
        throw: "Missing source!"

(Adding "string message". Also add "Collection message".)

But while "break" and "continue" work nicely unchanged, other control structures ("if", "while") can't be done that way. So we're back to the ol' "first word" thing. Except now maybe it's the first word of the expression, not just the statement.

Imperative Commands
[2008.1.30]

I've needed a name for expressions starting with the verb/keyword. You could think of them as having an implied "you":

say: line # == you say: line

Which would make them "imperative commands", or just "commands".

Text Literals
[2008.1.25, 2008.1.28]

Some funny character to start them?

help-text = -> Lorum ipsum in secreo dolores. Et in hoc pluribum est. Also, there was a computer there. help-text = ¶ Lorum ipsum in secreo dolores. Et in hoc pluribum est. Also, there was a computer there.

I prefer the latter.

This suggests a form of "reader" like Scheme has. "¶" is obviously a single word that introduces a reader, but maybe we want some kind of generic reader syntax:

    default-settings = -> dict
        font = "Arial"
        font-size = 12

or

    default-settings = #dict
        font = "Arial"
        font-size = 12
Private fields for functions
[2007.9.8]
    class CompiledClass
        c-name
            field c-name
            if c-name == nil
                c-name = long-calculation
            return c-name
Ellipsis for line continuation
[2007.9.8]
    if name == "sdkj sdlkfj" || name == "owiej fvosdi" || ...
            name == "xcvm ,xmvn x, mcnv"
        # ...

Maybe?

[2007.8.22]
    if no: function
        throw MessageException new: "No such function!"
    if no function
        throw MessageException new: "No such function!"

-----

Infer types of locals that are only assigned to once. But only if something can be gained: that doesn't optimize BytePtrs or Ints (which often increment themselves).

[2007.8.18]

Rename the "prepare-to-emit" phase to "resolve".

Rename "the-compiler" to "compiler".

[2007.8.15]

Double keywords:

print: at: x # == print: (at: x)

Perhaps not... but it would maybe enhance inline conditionals:

    print: (if: (x && y) then: "yeh" else: "nah")
    send: if: X && y then: "yeh" else: "nah"

-----

New stream protocol:

    stream send: "Foo"
    stream sendln: ("x = ", x)

Hmm, I really only want send. It means send a message -- most often that message takes the form of a line.

-----

Double indents ("indentation violations"):

foo: bar + baz bularghis scrantone

parses as a line with a subordinate block ("outer block"). The outer block contains two lines. The first is a null line with its own subordinate block ("inner block"), the second is the "scrantone" line.

That doesn't jibe well with the auto-line-extension idea. It could work, but it groups the extension with the body instead of the line. I think I really don't want auto-line-extension anyway (well, maybe after the first extension).

This could be extended to even more baroque null-line structures by simply counting tabs.

Call the subordinate block a "body" instead of a "block"? A line has the line itself (usually) and also its body.

-----

Underscore for line extension

x = _ Number new random + _ now milliseconds

Not really. If only it were a tiny ellipsis...

-----

Been thinking about trailing "=" instead of ":"/"put:" for setting:

    trylon Font

    size
        return .size

    size= new-size
        .size = new-size
        clear-glyph-cache

Well, maybe just for the ":" case; still use "put:" 'cause "at:=" is a little weird... or is it?

    trylon Array

    at: index
        # ...

    at: index = value
        # ...

Not bad, but I think I still prefer "put:". But "size=" wasn't as ugly as I thought.

[2007.7.30]

"[]" becomes "at:":

    functions[new-function name] = new-function
    (functions at: new-function name) = new-function
    functions at: new-function name put: new-function
[2007.7.27]

I think I need a better name for what's now called a Function (ie. TrylidFunction). That is, the first word in an expression; the thing that's looked up in the context. Linguistically, it unifies two roles: naming the subject (receiver) of a message, or calling a function. In the compiler, it has another dual role: emit-call: vs. emit-function:. (Note lookup-function: vs. lookup-instance-function: in CompiledProto -- "instance" shouldn't be meaningful. "lookup-own-function:"? "lookup-selector:"? Should it also check the proto's directories?)

Don't forget to fix && || ! precedence.

Excessive (redundant) name-mangling may be one reason the Trylon-built Trylid compiler goes faster than the (Trylon-built) Trylon compiler. (But the emit phase isn't the only culprit.)

Keep a dictionary of all selectors, annotated with used/defined bits, to prevent link errors. Actually, it looks like we already have such a dict (minus the annotations) in the Trylon compiler, called "object-function-names", needed to build the dispatch table.
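
Something like this, sketched in Python. `SelectorTable` and its method names are invented here; the real "object-function-names" dict may look nothing like it:

```python
USED = 1
DEFINED = 2

class SelectorTable:
    """All selectors seen by the compiler, each annotated with used/defined bits."""

    def __init__(self):
        self.bits = {}   # selector name -> USED/DEFINED bit mask

    def note_use(self, selector):
        self.bits[selector] = self.bits.get(selector, 0) | USED

    def note_definition(self, selector):
        self.bits[selector] = self.bits.get(selector, 0) | DEFINED

    def undefined(self):
        """Selectors that are sent somewhere but defined nowhere."""
        return sorted(name for name, b in self.bits.items()
                      if (b & USED) and not (b & DEFINED))

table = SelectorTable()
table.note_definition("at:put:")
table.note_use("at:put:")
table.note_use("frobnicate:")   # used but never defined
```

Anything returned by `undefined()` could be reported (or stubbed out) before the C linker ever sees it.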

"primitive" statement declares a variable holding a machine word. It only responds to a few methods, and the compiler can enforce that.

    primitive start = old-style-string start .primitive
    primitive stopper = old-style-string stopper .primitive
    primitive p = start
    while p < stopper
        if p char-deref == `\n`
            return true
        p += 1
    return false

In the compiler, Send will have to be special-cased.

When porting the building-dispatch-table phase, rename DispatchRow to SelectorRow.

Classes vs. Protos
[2007.7.26]

Still speak of "classes"; they are what your code defines. But each class automatically has a prototype, so it acts like a prototype-based system. So "MyClass" or "Standard Int" (as an expression) refers to a prototype.

Trylidization of Trylon
[2007.6.13]

The Trylon compiler needs to change to being prototype-based. Copy Trylid's CompiledProto to Trylon, adapt it to Trylon by cutting-and-pasting from CompiledClass and Package, and hook it into the Parser and Compiler. It is expected that existing Trylon code will not need modification (but should eventually have "class-fn" and "class-field" removed).

Make "nil" the only falsy value, as in Jolt & Trylid.

Proto names ('.proto-name') should be symbols, not strings.

Multi-line Comments
[2007.5.4]
    # This is a block comment.
        All the subordinate lines of a comment statement
        are part of the comment.
[2007.4.29]
    start = field
    stopper = field

Field is a macro, a special form, or perhaps even a function that creates a new field in the current object. No, it's still syntax because "field" has to get the name as an argument.

Macros
[2007.4.26]
    macro "for" var (name) "in" collection (expr)
        .iterator = collection iterator
        while !(.iterator is-done)
            continue-catcher
                var = .iterator current-item
                body
            .iterator go-forward
Line Continuations
[2007.4.25]

The "--" should indicate that the following *block* is a continuation of the line. Extra indentation can be used when both a continuation block and a subordinate block are needed:

    if c == 0 || --
            c == `\n` || c == `\r`
        break
Miscellaneous
[2007.4.18]

"splice" instead of extend:

trylid MyStandard splice Standard proto Int <<+== etc.
Miscellaneous
[2007.1.21]

CamelCase for class names is the last vestige of hungarianism. Should it be eliminated?

When class extension is implemented, I can eliminate "primitive-fn". A function with no body is noted as such, and anyone can replace it. Primitives are a special case of extenders. Also, C-style separation of interface from implementation becomes possible (but not mandatory!).

trylon class math extend float sin cos

I want the interpreter (to replace bash). Maybe that could be a fun mechanical project.

More tests of object-orientation:

trylon class: 'math' extend: float with: sin cos ----- example-of-inline-syntaxes while: {x > 0} do: x -= 1

The latter reaffirms the need to eliminate verbosity in basic control statements.

Extending Classes
[2006.12.4]

Rationale: I'd like to have a Math package that, when "used", would add new methods to Int and Float.

    trylon Math

    extend Float
        primitive-fn sin
        primitive-fn cos
Bytecode
[2006.12.1]

Register-based. A method can have as many registers as it wants, ie. they are locals (including temporaries and arguments). Operands (eg. for an "if" condition) can be a register, a field in 'this', what else? Literals? Or are literals only assignable to registers?

Arguments are registers called '.a1', '.a2', etc., and are shared by the caller and callee. Or perhaps the callee knows them only by their given names, which map to registers starting with number zero. The *last* registers in the frame are called '.a1', '.a2', etc., and are used to pass arguments downward.

If there are not enough bits in the bytecode for an operand, use words after the bytecode, in UTF-8 format, for easy and efficient flexible sizes.
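
For reference, here is what a UTF-8-style operand encoding could look like, as a Python sketch. The function names and the 21-bit limit (the range covered by UTF-8's one- to four-byte forms) are my assumptions, not part of any actual bytecode design:

```python
def encode_operand(n):
    """Encode a non-negative integer below 2**21 using UTF-8's byte layout."""
    if n < 0 or n >= 1 << 21:
        raise ValueError("operand out of range")
    if n < 0x80:
        return bytes([n])
    if n < 0x800:
        return bytes([0xC0 | (n >> 6), 0x80 | (n & 0x3F)])
    if n < 0x10000:
        return bytes([0xE0 | (n >> 12), 0x80 | ((n >> 6) & 0x3F), 0x80 | (n & 0x3F)])
    return bytes([0xF0 | (n >> 18), 0x80 | ((n >> 12) & 0x3F),
                  0x80 | ((n >> 6) & 0x3F), 0x80 | (n & 0x3F)])

def decode_operand(data, pos=0):
    """Return (value, next_pos) for the operand starting at data[pos]."""
    first = data[pos]
    if first < 0x80:
        return first, pos + 1
    if first >> 5 == 0b110:
        extra, value = 1, first & 0x1F
    elif first >> 4 == 0b1110:
        extra, value = 2, first & 0x0F
    elif first >> 3 == 0b11110:
        extra, value = 3, first & 0x07
    else:
        raise ValueError("bad lead byte")
    # Each continuation byte contributes its low six bits.
    for byte in data[pos + 1 : pos + 1 + extra]:
        value = (value << 6) | (byte & 0x3F)
    return value, pos + 1 + extra
```

Small operands cost one byte, and the lead byte alone says how many more bytes follow, which is what makes the format cheap to decode.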

    op1 = op2
    r1 = r2 'selector'    # Any arguments must be in the '.a' registers.
    if op1 skip +-num
    if! op1 skip +-num
    skip +-num
    try
    catch class
    endcatch
    throw op1
    primitive "Int +"    # Name is mangled and that C function is called.

How to specify classes, as we will need to do? They're just items in the literals. The literal section is the one that will need to be linked if it is stored outside a running image. It'll need to link symbols as well as classes.

Mixed Languages
[2006.12.1]

Can mix with other languages, using indentation to determine language boundaries. Already it kind of does that, with separate languages at the package level and the method level.

Dot
[2006.12.1]

Like the Unix filesystem: names starting with a dot exist, but are hidden in any UI listings. '.' == 'this'. '.class' instead of 'class'.

In theory, you could be explicit about 'this':

    add-to-average: value
        . total-items += 1
        . total-values += value

    average
        return . total-values as-float / . total-items

But that'd be stupid, except in a bytecode assembler.

Unicode Characters
[2006.11.24]

First, make sure they can be used in keywords: eg. π. [Done.]

Have codegen use escapes for UTF-8 characters in strings.

Then, add certain binops to the grammar: ≠, ≤, ≥ (and @ while you're at it).

Finally, strings, including counting beginning and end quotes:

text = “This is a “string”. It is a sequence of characters.”
Character Constants Again
[2006.8.9]

Arc may have the answer (http://www.paulgraham.com/arcll1.html)! Backslash introduces character constant:

    digit = c - \0
    if c == \\n
        #...
Replacement for XML
[2006.6.27]

An example, taken from Dia's XML and simplified:

diagram = background = color = "#ffffff" guides = h-guides = true v-guides = true layers = ----- name = "Background" visible = true objects = ----- FlowchartBox: id = 0 rect = (12.825, 2.825, 17.3, 4.725) border-width = 0.05 text = string = "Send Packet" font = ("sans", 0, "Helvetica") height = 0.80 pos = (15.05, 3.925) color = "#000000" alignment = 'center' ----- FlowchartTerminal: id = 1 rect = (21.8072, 6.025, 25.9928, 7.63211) line-width = 0.05 line-color = "#000000" fill-color = "#ffffff" line-style = ('solid', 1) text = string = "FreqOK" font = ("sans", 0, "Helvetica") height = 0.8 pos = (23.9, 6.97855) color = "#000000" alignment = 'center' flip-horizontal = false flip-vertical = false ----- StandardArc: id = 3 pos = (17.275, 3.775) curve-distance = -0.76531600914007569 line-width = 0.5 line-style = 'dashed' end-arrow = 'standard' connections = ----- handle = 0 to = @0 connection = 8 ----- handle = 1 to = @1 connection = 2
Character Constants Revisited
[2006.6.15]

Reunify characters and integers? Replace awkward backtick char constants with a special form of numeric constant?

    digit = char - 0c'0'
    lines = text split-at-delimiter: 0c'\n'
Improved "fields" Declaration; The New Lisp
[2006.3.18]

"fields" statement should handle indents as continuation.

trylon EditScreen fields file lines cur-line column key-mode actions last-action cur-insert-action command-line

Also, extending the dict-constant syntax (see 2006.3.2), so it allows non-valued dict entries (default value? nil? name as symbol?), allows this:

fields = file lines cur-line column key-mode actions last-action cur-insert-action command-line

(But not "fields = file lines"!)

More in that direction:

    List =
        fields =
            head tail
        prepend: value =
            link = List new: (value, tail)
            return: link
        foreach: block =
            block do: head
            tail foreach: block
        # Yeah, right, more like this:
        foreach: block =
            link = this
            while link
                block do: link head
                link = link tail

One is almost tempted to omit the '='s. Actually, just go back to the way it already is, to avoid ambiguity between "fields" and unary function defs. (Brilliant useless insight!: the presence of any function call (that is, any expression) is what differentiates a dict-constant from a function def.)

Trying to head for a syntax that includes only function calls and assignments. But "if" and "while" remain intractable.

if: x == y then: # Needlessly verbose! do-this else: # Always a bitch! do-that while: i > 0 # Yeah, but how do we get the expr to be reevaluated every go-round? # Macros? Is this the new Lisp? Replacing s-expressions with dicts? do: block i -= 1

Also note new auto-"new:" function taking a Tuple.

Larger Constants
[2006.3.2]
text = " This is a multi-line string constant. It starts with this line. But if the "text =" line had had text, that would've been the start. Or maybe not; don't encourage ugliness. The indentation at the start of each line is not included in the string. Trailing space to indicate line continuation/wrapping? It can include blank lines. But trailing blank lines are stripped. An empty multi-line string constant is likely to be an error, so the compiler should warn about it. Hmm, maybe one of those plain-text markup languages had the right idea about line continuations: trailing space for linebreak (and strip it); no trailing space means line continuation (until a blank line). (Also that's more compatible with vim.) dict = name = "Thing" attributes = 'constant', 'fungible' color = "green" is-color = true
Quoted Functions
[2006.02.06]
    at: index put: value = fn: "
        which-block = index / block-size
        if which-block > num-blocks
            grow-to: which-block
        which-block at: (index % block-size) put: value

Multi-line quotes are indent-sensitive (and don't require a closing quote-mark).

BTW, field declarations above don't allow for types.

Program/Data Unification
[2006.01.25]

The "members" view can be unified with the "code" view by treating it as a program run by the compiler. Which it essentially is anyway.

    Entry = class:
        fields = 'name', 'value'
        create: name value: value =
            this name = name
            this value = value

Hmm, there still is a sort of radical context shift once you get into a function.

What about that method declaration syntax? It seems to imply a rather late detection of expressions vs. declarations.

Go back to Cleen(?)-style binop handling: "3 + 4" -> "3 plus: 4".

Need "+=" for "fields":

    Entry = class:
        fields = 'name', 'value'
        fields += 'valence'

That implies Tuple '+' and Symbol "plus: object -> Tuple".

Auto Type Declarations
[2006.1.13]

Auto-declaring a variable with a constant value automatically gives the variable that type. Eg.

    index = 0
    type = 'none'

Having initialization to a function call might be nice, but is probably hard to implement in the current compiler:

    p = text start
    stopper = text stopper

Note that this changes the definition of the language; it's not just an optimization. You're not allowed to store any other type of object in that variable. It can be overridden by explicitly typing to "Object":

thing (Object) = 3

Of course, *explicit* type declarations must first be made to work.
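
To make the rule concrete, here is a toy model in Python. It checks at runtime what the compiler would check statically, and the `Locals` class is purely illustrative:

```python
class Locals:
    """Locals whose type is fixed by their first (declaring) assignment."""

    def __init__(self):
        self.types = {}
        self.values = {}

    def assign(self, name, value, declared_type=None):
        if name not in self.types:
            # First assignment declares the variable and fixes its type,
            # unless an explicit type such as `object` is given.
            self.types[name] = declared_type or type(value)
        elif not isinstance(value, self.types[name]):
            raise TypeError(f"{name} holds {self.types[name].__name__}, "
                            f"not {type(value).__name__}")
        self.values[name] = value

scope = Locals()
scope.assign("index", 0)                        # index = 0 (typed as int)
scope.assign("index", 5)                        # fine, still an int
scope.assign("thing", 3, declared_type=object)  # thing (Object) = 3
scope.assign("thing", "anything goes")          # fine, explicitly untyped
```

Assigning a string to `index` would raise, which is the "not just an optimization" part: the program's meaning changes.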

Modular Programs
[2006.1.5]

build-settings:

extra-modules += "Terminal" "TextEditor" "XMLEditor"

main:

    for module-class in extra-modules
        module = module-class new

Or something like that. But "new" isn't an instance function of a Class object (currently). Either metaclasses can be added (ugh, implementationally; cleaner semantically), or a "Class raw-instance" function can be added (easy but uglier).

    for module-class in extra-modules
        module = module-class raw-instance
        module create
        my-modules append: module
Arrays and Types
[2006.1.3]

Parentheses could be used for type declarations -- they should be parseable in declarations of locals, arguments, and fields -- thus freeing up [] as the index operator again.

    num-items (Int) = 0
    for item in collection
        array[num-items] += 1
        num-items += 1

    create: size (Int)
Tuple Precedence
[2006.1.3]

Currently tuples have a precedence below keyword calls, but maybe they should be like any other binop.

stream write-all: "obj _", c-name, " = "
Optimizing Primitive Types
[2005.12.31]

Recent experiments show the speed inferiority of Trylon when implementing a Lexer. (C is even faster than Cleet. Trylon:Cleet:C == 1500:500:300.)

Putting type names into signatures could work. When calling a function with a primitive type as one (or more) of the arguments, check at compile time if there's a method with the typed signature. If not, box. The exact check means you can't subclass primitive types (or get boxing if you do), but that's okay, since we only need to optimize primitive types anyway.
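
A sketch of that compile-time decision, in Python. The table contents and the `plan_call` name are hypothetical; the point is only the exact-match lookup with boxing as the fallback:

```python
# Hypothetical set of methods that were compiled with typed signatures:
# (receiver class, selector, exact argument types).
TYPED_METHODS = {
    ("Array", "at:", ("Int",)),
    ("BytePtr", "+", ("Int",)),
}

def plan_call(receiver_class, selector, arg_types):
    """Decide at compile time between a typed call and a boxed dynamic send."""
    if (receiver_class, selector, tuple(arg_types)) in TYPED_METHODS:
        return "typed"   # pass raw machine values straight through
    return "boxed"       # box primitive arguments, then dispatch normally

fast = plan_call("Array", "at:", ["Int"])
slow = plan_call("Array", "at:", ["SmallInt"])   # a subclass does not match exactly
```

The second call shows the cost mentioned above: a subclass of a primitive type misses the exact match and gets boxed.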

Introducing A Javascript Influence
[2005.12.6]

Use Javascript-style objects, but without the indexing. I guess that makes them more like Python objects. But anyway, classes are singletons:

    class Entry
        fields = 'date', 'title', 'paragraphs'
        create
            paragraphs = List new

    class Code
        superclass = Paragraph
        fields = 'blank-lines'
        create
            super create
            blank-lines = 0

So class variables now become members of the singleton class object. And they assume greater importance as the compiler treats some of them specially.

(Could go even further with this: "Entry = class", "add-paragraph: = ", but it's probably better not to.)

Anonymous objects: { name = "Thing", callback = at:get: }

For a new language, try going back to traditional (non-Smalltalk-style) calls/arguments. Perhaps these can be unified with the anonymous objects somehow (and maybe that's where Javascript's indices come implicitly in).

    do(action = my-action-func, name = "My Action")

    do = fn | action name args |
        try
            action()
        catch
            print("Function $name failed!")
            print(exception.message)

But can it really be compiled well? If we only pass args positionally it's fine, but I don't know how to do the keyword arguments at compile time. How does CLOS do it?

Actually, do it positionally, but the arglist, being an object, can also have names for the slots. The function can be compiled to assert that any names present in the arglist match the names of its arguments. Oppositely, function calls can be optimized by not passing argument names. Scrambled arguments can be disallowed, and/or caught by assertions and descrambled.
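
Roughly, in Python. `bind_args` is a made-up name, and for simplicity this sketch assumes the arglist either names every slot or none of them:

```python
def bind_args(param_names, values, names=None):
    """Bind positional values to parameters; slot names, if present, are only a check."""
    if len(values) != len(param_names):
        raise TypeError("wrong number of arguments")
    if names is None or list(names) == list(param_names):
        # Fast path: no names passed, or names already in declaration order.
        return dict(zip(param_names, values))
    if sorted(names) == sorted(param_names):
        # Scrambled arguments: descramble by name.
        by_name = dict(zip(names, values))
        return {p: by_name[p] for p in param_names}
    raise TypeError("argument names do not match the function's parameters")

plain = bind_args(["action", "name"], ["run", "My Action"])
scrambled = bind_args(["action", "name"], ["My Action", "run"], names=["name", "action"])
```

An optimizing caller would take the first path and never pass `names` at all.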

Returns in Lambdas
[2005.7.23]

We want to be able to return from the function from inside a lambda:

    first-satisfying: test
        for-each: | element |
            if test do-with: element
                return element

This is a non-local return (in the C code). This could be handled by an exception. The function has an implicit try block around the whole thing:

    obj_ first_satisfying(obj_ this, obj_ test)
    {
        Try_ {
            //...
            }
        Catch_(ReturnInLambda) {
            return ((struct ReturnInLambda*) __caught_object)->fields[0];
            }
        EndCatch_
    }

And add a ReturnFromLambdaStatement.

The problem is that there could be a "catch Object" in the method, which would catch this too.
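
The same trick sketched in Python, where it is easy to try out. Deriving from `BaseException` is a Python-specific way around the catch-all problem (an ordinary `except Exception` will not see it); whether Trylon could do something similar is an open question:

```python
class ReturnInLambda(BaseException):
    """Carries the value of a 'return' executed inside a lambda."""

    def __init__(self, value):
        super().__init__()
        self.value = value

def for_each(collection, block):
    for element in collection:
        block(element)

def first_satisfying(collection, test):
    def body(element):
        if test(element):
            # "return element" inside the lambda becomes a throw.
            raise ReturnInLambda(element)
    # The implicit try block around the whole function.
    try:
        for_each(collection, body)
    except ReturnInLambda as ret:
        return ret.value
    return None

found = first_satisfying([1, 4, 9, 16], lambda x: x > 5)
```

The lambda never returns to `for_each`; control jumps straight back to `first_satisfying`, which unwraps the value.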

More Class/Package Unification
[2005.7.21]

On exhale, a class/package is exhaled as a directory if it contains other classes/packages, and as a file if not. (Unless, of course, the system kept track of how it was inhaled.)

Would like to syntactically distinguish class from instance functions. Really there are only two options: either class functions need a keyword, or method functions do.

    class Foo
        fn new
            return new: ""
        create: name
            this name = name

    class Foo
        new
            return new: ""
        to create: name
            this name = name

The class should become more of a singleton object. (Hopefully without falling into metaclass hell.)

More Closures/Lambdas Syntax
[2005.7.21]
    dict for-each-key: args key
        print-line: key
    dict for-each-key: | key |
        print-line: key
Closures/Lambdas Syntax
[2005.6.19]

(See 2004.12.8 for implementation.) Maybe only allow them as the last argument:

    songs for-each: | song |
        print-line: song name

Hmm, this suggests using a sort of "unary |" to introduce a lambda.

    block = | song |
        print-line: song name
    block do-with: new-song
    block = | |
        print-line: "Hello!"
    block do

A block has "do", "do-with:", "do-with:and:", "do-with:and:and:".

    class Collection
        inject: value into: block
            for item in this
                value = block do-with: value and: item
            return value

        count = inject: 0 into: | count item |
            count + 1

Which raises the issue of a lambda's return value, here taken to be the value of the last evaluated expression.
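
For comparison, the same thing in Python, where a lambda's value is likewise just its one expression. The function names are mine:

```python
def inject(collection, value, block):
    """Fold `block` over the collection, threading `value` through each call."""
    for item in collection:
        value = block(value, item)
    return value

def count(collection):
    # The lambda's result is its last (only) expression: count + 1.
    return inject(collection, 0, lambda count, item: count + 1)

how_many = count(["a", "b", "c"])
total = inject([1, 2, 3], 0, lambda total, item: total + item)
```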

Trylon vs. Cleet
[2005.6.5]

So far I feel like Cleet is faster than Trylon. Also, it doesn't handle functions with many arguments that well. Could Cleet be quickly given xlon blocks?

    require-eol: String
        comment: String = nil
        if current-token.type == 'comment'
            comment = current-token.text
            consume-token
        if current-token.type != 'eol'
            throw ParseException("End of line required.", current-token)
        consume-token()
        return comment

But would implementing that really be any quicker than implementing full type-checking in Trylon? And Trylon could use both systems, vtables when possible and an rddtable otherwise.

But first we need to know if Trylon really *is* slower than Cleet.

Declarations Like Assignments; Trylon 2.0
[2005.5.19]

Python has this:

    x = 3    # "x" is declared
    x = 7    # This time "x" is just assigned to.

It seems like a nice feature to have. But it conflicts (somewhat) with Trylon's setters.

But I have, over time, thought a little about how it might be implemented. But I can't fully remember now... I guess it might have something to do with MethodContext's lookup. It can go to its parent if it doesn't know it immediately, but if the parent doesn't know it, it can declare it. No, actually, it's Block that would do it, and it looks perfectly feasible -- and even trivially easy -- from looking at the actual Trylon compiler source.

Only names containing a single (trailing) colon would trigger an auto-declaration (eg. "name:", but not "length" or "draw:at:with:").

If you want to really go crazy, this can probably be extended to fields and classes/packages:

    class Foo
        sinus-congestion = true    # Class field.
        fn create
            this title = "(Title)"
            this num-items = 0

This actually achieves the age-old dream of a single syntax for class/package fields in classes and packages -- but not for class/package functions. Also note that in the class-field case, the class is a different syntax (than a block), not just a different context.

Note too that any instance function can declare a field, not just a creator. The field will be nil like any other field until someone assigns to it. However, due to the mechanics of parsing, it's likely that the field will not be visible (without a "this" prefix) to methods that are lexically before (or earlier in parse-order, however that turns out) the method in which it is first auto-declared. Parsing creators first may be desirable, in case one is lexically later.

And getting really out of control, there is no "class-fn", only "fn". I think I discussed this before. But here's the implementation: the MethodContext always checks for "this" or function calls on "this" -- and sets a flag if one is ever used. If true, or if the function has the same name as (that is, overrides) a function in a superclass, the function (and the function it overrides) is an instance function. Gee, this is all exactly what I was thinking on 2005.3.22.

Anyway, while I'm at it, I'm starting to really long for the "setting = true" syntax in the build-settings files.

And while I'm making Trylon 2, get rid of "fn" (or rather, make it optional). "class" and "iff" remain as special "statements" in a class/package. How to do primitives? Even class vs. package is unified. There are only classes, but a class with no instances needn't appear in the method table.

    trylon-class SingleThing
        the-thing = nil
        thing
            if the-thing == nil
                the-thing = SingleThing new
            return the-thing
        create
            this name = ""
            #...

Also, maybe make any "create-"-prefixed name be a constructor, so we can do this: "Point new-at-x: 10 y: 20".

Revisiting the implementation of auto-declaration of locals, it's not quite as trivially simple as I thought. That's because there can be many levels of Blocks, but only the lowest one can auto-declare. Probably add Context.lookup-function-autodeclaring(name: String). FunctionCall.prepare-to-emit() calls that; nothing else calls it.
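
A minimal sketch of that lookup, as a Python model rather than Trylon compiler code. The Block class and its method names are hypothetical, and it assumes "the lowest one" means the innermost Block, and that only a single trailing colon ("x:") triggers auto-declaration:

```python
# Hypothetical model of nested Block contexts; not the real compiler classes.
class Block:
    def __init__(self, parent=None):
        self.parent = parent
        self.locals = {}

    def lookup(self, name):
        """Walk outward through enclosing blocks; None if nobody knows the name."""
        block = self
        while block is not None:
            if name in block.locals:
                return block
            block = block.parent
        return None

    def lookup_autodeclaring(self, selector):
        """Resolve a setter call such as "x:"; declare "x" here if it is unknown."""
        # Only a single trailing colon qualifies ("name:", not "draw:at:with:").
        if selector.count(":") != 1 or not selector.endswith(":"):
            return None
        name = selector[:-1]
        owner = self.lookup(name)
        if owner is None:
            self.locals[name] = None    # Auto-declare in the innermost block.
            owner = self
        return owner


outer = Block()
inner = Block(parent=outer)
outer.lookup_autodeclaring("x:")    # Declares x in the outer block.
inner.lookup_autodeclaring("x:")    # Finds the outer x; nothing new is declared.
inner.lookup_autodeclaring("y:")    # Declares y in the inner block only.
```

Only the block that starts the lookup ever declares anything; its parents are consulted but never modified.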

Words And Spaces
[2005.5.19]

How about making *all* symbols delimited by whitespace? (Once again, LISP was there first -- almost, anyway. I guess FORTH is one that really is like that.) Probably the worst problem is parens/brackets:

    ( 3 + 4 ) * 7
    x [ Int ] := y

Yeah, forget it.

Multilingualism
[2005.5.8]

A certain amount of it can be indicated by the first word on a line.

$ ls -l # Human language goes here.

A command line (or possibly an editor) would repeat the previous "prompt" (including the null prompt) as the new prompt. Indentation levels are "prompts"?

Continuation Lines
[2005.4.23]

Make them block sensitive, so the continuation lines don't have to end with the continuator?

glyph-draw-context := -- GlyphDrawContext new: context origin-x origin-y: context origin-y display: display font: cur-font start-y: context start-y end-y: context end-y

Probably best to pay attention to nested indents.

But that implies a layer between the lexer and the parser, doing a tokens->tokens transformation. But if we have such a layer, it can fix up blank lines too, placing them (properly) between blocks rather than at the end of innermost blocks. This layer could possibly be easily made part of the Lexer (but should it be?).

Block Comments
[2005.4.17]

More syntax tests:

    #[
    This is a big block comment, where we talk about lots of stuff. Lines ending with a space imply that the next line is part of the same paragraph, and the editor should automatically wrap that paragraph, if it's smart enough.
    Lines not ending with a space end a paragraph; you can block-comment out a block of code without changing its indentation (like you would if you put an "iff not-now" in front of it).
    #]

    #{
    This is a big block comment, where we talk about lots of stuff. Lines ending with a space imply that the next line is part of the same paragraph, and the editor should automatically wrap that paragraph, if it's smart enough.
    #}

    #(
    This is a big block comment, where we talk about lots of stuff. Lines ending with a space imply that the next line is part of the same paragraph, and the editor should automatically wrap that paragraph, if it's smart enough.
    #)

I think I like the square brackets best.

New C
[2005.4.9]

What would C be like with a Trylon (or Python) -style syntax? Very much like a fully-typed Trylon, but with C-style function calls, I think:

fn style_co__Carbon__ATSUI__TextLayout -- this [obj_] style [obj_] -> obj_ UsingMethod_(primitive_style) atsuiStyle [ATSUStyle] := -- (ATSUStyle) Call_(primitive_style, style) ATSUSetRunStyle(textLayout, atsuiStyle, 0, 1) return NULL

No reason it can't automatically move declarations to the top. And convert inter-hyphens to underscores, "nil" to "NULL", ...

Another example, this time as an instance function with automatic args and return type:

    c-fn int
        c-string [char*] := MakeCString_(this)
        # Allow hex literals to set the high bit
        # (ie. they're unsigned).
        if c-string[0] == '0' && c-string[1] == 'x'
            result = strtoul(c-string, nil, 0)
        else
            result = strtol(c-string, nil, 0)
        return BuildInt_(result)

Nice, but what about conflicting syntax ("[]", "label:", "--" etc.)?

Build Settings Files
[2005.3.38]

Since these are already hypertrophying toward a subset of Trylon, is there a way to use the existing parser to deal with them? By feeding the Parser the right Contexts, and by making the ParseNodes do interpretation as well as codegen?

More Functifying; Also Top Level Declarations
[2005.3.22]

Make declarations more function-style (in much the same way as "virtual" is functified now).

trylon-class: FrameRateControl superclass: control fields: frame font displaying fn: create: frame ...

This could be added quickly to the current compiler.

But I've been thinking that I want to ablate any prefix for function definitions. Also, use ":=" to declare fields.

How about this: use "fields:" to declare instance fields, ":=" to declare class/package variables.

Hmm, try having no distinction, syntactically, between "class/package" functions and "instance" functions. Any function calling "this" is an instance function (and is detected as such by the compiler). This high-level concept corresponds to what's happening at the low level.

However, that requires the compiler to parse the whole function before knowing how other functions can call it (probably not really a problem for the current compiler; not necessarily an issue in a dynamic environment either). Detecting the use of "this" is a little tricky because it can be implicit. (Python doesn't have this problem; it kinda thinks it's C rather than a real o-o language.) Probably "this" always shows up in the method's context; we then detect if it ever gets hit. Actually, it's the contents of "this" -- that is, the instance functions (as FunctionCallOnThis's) -- that need to have this detection. Actually, MethodContext.lookup-function() is the place to do it, and it'll be easy for that to know the method. Voilà!

But syntacticosemantically, it means that any function in a class can be applied either to an instance or to the class.

The problem: detection also depends on inheritance. And how to declare a virtual (not pure-virtual) that doesn't access "this"? Maybe: a "thisless" function goes into both the class and instance functions. (Maybe the instance one is just an adapter calling the class one.)

Block Comments
[2005.3.20]

It'd be nice to have a specific "block comment".

### This is a big long block comment. It says a lot of things, and would hopefully look nicer in an editor. ###
Compiler Bugs
[2005.2.23]

Found while compiling List:

- Type declarations -- even inside a method -- can't refer to a subclass or subpackage that is declared later in the file.
- "== nil" doesn't work. I want nil to still be zero, since that's how objects are initialized. Do it like this: Install a new NilFunction as the definition of "nil" in setup-main(). For "==" and "!=" (parse-equality-expression()), use a new EqualityCall (EqualityObjectCall?). That will check for NilFunction as the argument; if it's not a NilFunction, its emit-code() will make an ObjectCall and use its emit-code(). Since there are no "===" and "!==" operators, we don't need to worry about convert-to-setter-call() and copy().

Class Files
[2005.2.19]

Omit the "class " bit and the consequent indentation of the entire contents of the class. It'd be nice to have an entry declaring the name of the class, rather than just having the name as a comment. It'd also be nice to have it use the "class " syntax, but I'm not going there because I want to retain the subclass usage.

Maybe eliminate the distinction between a Package and a Class? "method" and "field"/"fld" are for instance members, "function"/"fn" and "variable"/"var" for class/package members. Hmm, I think I really want "fn" for instance functions, the commonest case. Maybe try "method" and see how it works out.

Rename the "main" file to "contents"? A class is exhaled as a single file, unless it contains other classes, in which case it is exhaled as a directory. When inhaling from a directory, note whether a "contents" file is present, and don't exhale one if all the members are classes.

Function Argument Declarations
[2005.2.12]

Could streamline the "draw: string x: x y: y" idiom:

    fn draw: string x y
    # is the same as
    fn draw: string x: x y: y

Could even go all the way:

    fn draw x y
    # is the same as
    fn draw: arg x: x y: y
"iff" And Primitives
[2005.2.12]

Eventually, we'll need #ifdefs in primitives. Export names in Main as "verbose__exists_", "Darwin__exists_", etc.?

All String Literals Are Symbols
[2005.2.12]

This should work! Or it would in Cleet: Symbol.==(Symbol): Bool, Symbol.==(String): Bool, String.==(Symbol): Bool. In the new lang, do it all in string:

    fn ==: other
        if (this object-ptr == other object-ptr)
            return true
        # ... loop thru ...

Not good enough! If != is not also fast, there's no point in Symbols. How about this: "Symbol ==:" is fast; it won't match a String. "String ==:" is slow, and will match a Symbol or a String. This means that '"foo" == string' won't work. But that's okay, it's an ugly idiom (and '"foo" == symbol' does work).

    class String
        fn ==: other
            iff safer
                other = other string
            if this object-ptr == other object-ptr
                # Fast-path optimization; != can't do this.
                # Actually, this will probably rarely help. "Symbol ==:" already
                # is its own method.
                return true
            # ... loop ...
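
The asymmetry between the two "==:" methods can be modeled in a few lines of Python. This is only a sketch: the eq method, the interning table, and the class layout are assumptions made for illustration, not the Trylon runtime.

```python
class String:
    def __init__(self, chars):
        self.chars = chars

    def eq(self, other):
        # Slow path: compares contents, so it matches a String or a Symbol.
        if self is other:
            return True
        return self.chars == other.chars


class Symbol(String):
    _interned = {}

    def __new__(cls, chars):
        # One shared object per spelling, so identity can stand in for equality.
        if chars not in cls._interned:
            cls._interned[chars] = super().__new__(cls)
        return cls._interned[chars]

    def eq(self, other):
        # Fast path: identity only; never matches a plain String.
        return self is other
```

With string literals as Symbols, Symbol("foo").eq(String("foo")) is false while String("foo").eq(Symbol("foo")) is true, which is the '"foo" == string' limitation described above.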
Optimization
[2005.2.12]

"Args list as tuple" and Python-style "binding function to object" are both things where the compiler should be able to easily determine that they're used for their common cases, and optimize that (passing args on stack (or in registers) instead of building a Tuple, regular dispatch/call instead of building a binding). The former is easiest; we *know* we built it only as a formality (view draw: string x: x y: y -> view.'draw:x:y:'(string, x, y)).

Name Mangling
[2005.2.12]

__ is space in C names, other escapes as done already. There is no conflict, think of it from the perspective of a reader: _ introduces an escape sequence. __ is space, _XX_ is one of the special characters or "-XX-" if not, any other _ is a hyphen. (___ (triple) could be used for hidden implementation stuff, but don't: all that stuff should use trailing hyphens, putting it out of the mangled namespace entirely. Also, it really can't be used, due to the possibility of adjoining escapes.)

A C function name will now be a mangling of its fully-specified name: "Standard Int +:" -> "Standard__Int___pl__co_".

Oh, but escapes make trailing underscores possible in mangled names... Can we get rid of them?: "Standard__Int___pl_co". Make escapes uppercase to minimize conflicts with hyphen-alpha-alpha-hyphen? "Standard__print_CO", "Standard__Int__PL_CO".
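
A small Python sketch of the first scheme (lowercase escapes wrapped in underscores). Only the "pl" and "co" codes appear in these notes; a real table would need more entries, and any character not in the table is passed through unchanged here as a simplifying assumption.

```python
# Escape codes taken from the examples above; the table is deliberately incomplete.
ESCAPES = {"+": "pl", ":": "co"}

def mangle(full_name):
    """Mangle a fully-specified Trylon name into a C identifier."""
    out = []
    for ch in full_name:
        if ch == " ":
            out.append("__")                      # Space becomes a double underscore.
        elif ch == "-":
            out.append("_")                       # Hyphen becomes a single underscore.
        elif ch in ESCAPES:
            out.append("_" + ESCAPES[ch] + "_")   # Special character: _XX_ escape.
        else:
            out.append(ch)
    return "".join(out)

print(mangle("Standard Int +:"))    # Standard__Int___pl__co_
```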

Inhale/Exhale
[2005.2.12]

Inhaling a package/class: read "main" first; include the comments and blank lines there. Members from the directory are then added at the end. Exhale so classes/packages are always both in "main" (just the empty declaration) and the classes/packages in files/directories. (Assuming there's an occasion to exhale.)

The occasion to exhale comes in a dynamic system.

Accept "sources" as a dir for Main. [Done.]

SingleObjectIterator
[2005.2.7]

"iterator" is a function on Object, which responds with a SingleObjectIterator. So a function can iterate over all the elements of "arg", even if only a single object (not a collection) was passed as "arg". Useful for functions operating on files, among other things.

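A Python sketch of the idea. The iterator function below stands in for the proposed "iterator" function on Object; treating only lists and tuples as collections is an assumption made to keep the example short.

```python
class SingleObjectIterator:
    """Yields one object, then stops."""

    def __init__(self, obj):
        self._obj = obj
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._done:
            raise StopIteration
        self._done = True
        return self._obj


def iterator(arg):
    """Collections iterate normally; any other object iterates as itself, once."""
    if isinstance(arg, (list, tuple)):
        return iter(arg)
    return SingleObjectIterator(arg)


def process_files(arg):
    # Works the same whether "arg" is one path or a collection of paths.
    return ["processed " + path for path in iterator(arg)]
```
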
Word-based Lexing
[2005.1.27]

Words are primary; groups of characters surrounded by spaces. Try completely moving away from C-style lexing. The lexer feeds out a stream of words, some of which have special meaning. Binops, for instance.

However, commas and parentheses will still be separated from any adjoining words. '<' and '>' too? (Not periods, though; they're part of a word.)

    class Token
        word [String]
        type [Symbol]

Some types: 'name', '+', '-', '*', '/', '%', '<=', '=', '+=', 'selector', 'string', 'symbol', 'integer', 'float', 'file-path', 'regex'.

'file-path' examples: foo/bar, foo.c, /dev/foo. But is "/" a 'file-path' or a binop? It's a binop; anything looking for a 'file-path' already needs to accept 'name' as well, so it's no extra difficulty to accept '/'.
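
A rough Python sketch of such a lexer. The regular expressions and the rules for telling 'file-path' from 'name' are guesses for illustration only; string, symbol, and regex literals are not handled.

```python
import re

BINOPS = {"+", "-", "*", "/", "%", "<=", "=", "+="}

def split_words(line):
    # Commas and parentheses become their own words; everything else is
    # delimited by whitespace only.
    return re.findall(r"[(),]|[^\s(),]+", line)

def classify(word):
    if word in BINOPS:
        return word                 # A binop's type is the operator itself.
    if word in ("(", ")", ","):
        return word
    if re.fullmatch(r"\d+", word):
        return "integer"
    if re.fullmatch(r"\d+\.\d+", word):
        return "float"
    if word.endswith(":"):
        return "selector"
    if "/" in word or re.fullmatch(r"[\w-]+\.\w+", word):
        return "file-path"          # Periods stay inside the word.
    return "name"

def lex(line):
    return [(word, classify(word)) for word in split_words(line)]
```

A bare "/" is caught by the binop test before the file-path test, matching the decision above.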

Is it called 'name' or 'word'?

Smalltalk-style names
[2005.1.27]

The first arg is positional; its name is "arg". How does this fit in with command-line use? A typical command would have "arg" be a file, or a list of files. "--some-option=foo" becomes "some-option: foo". "--some-option" becomes "some-option: yes". ("yes" is a synonym for "true" in Standard.)

Examples: "ls: docs". "gcc: file.c o: file.o no-frame-pointer: true".
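
The mapping from a conventional command line can be sketched in Python. The function name and the argv spellings ("--o=file.o") are hypothetical, and rendering several positional arguments as a parenthesized list is an assumption.

```python
def to_keyword_call(command, argv):
    """Translate a command and its argv into the keyword-call form."""
    keywords = []
    positional = []
    for arg in argv:
        if arg.startswith("--"):
            name, sep, value = arg[2:].partition("=")
            # "--opt=foo" becomes "opt: foo"; a bare "--opt" becomes "opt: yes".
            keywords.append(name + ": " + (value if sep else "yes"))
        else:
            positional.append(arg)
    if not positional:
        head = command
    elif len(positional) == 1:
        head = command + ": " + positional[0]
    else:
        head = command + ": (" + ", ".join(positional) + ")"
    return " ".join([head] + keywords)

print(to_keyword_call("gcc", ["file.c", "--o=file.o", "--no-frame-pointer"]))
# gcc: file.c o: file.o no-frame-pointer: yes
```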

Well, okay, this is not quite Smalltalk style, since all args except the first have names shared by the caller and callee.

Can control structures be done this way? Combine with Ruby-style trailing block argument named "block".

    if: x == y
        # ...
    else
        # ...

Here's our first difficulty: multi-branch control structures. Maybe if the "else:" function can get itself attached to the preceding "if:" function somehow...

Also note that they must actually be macros. This is especially clear for the "while" statement:

    while: x < 3
        x += 7
    try
        # ...
        throw: ParseException new: "No dice."
    catch: ParseException
        write exception

Function names: "if:block:", "else:", "while:block:", "try:", "catch:block:", "continue", "throw:".

Optimizing Primitive Types
[2005.1.4]

Let's look at returning a primitive type from a function call. Such a function can generate two functions: "foo" and "foo -> Standard Int". A caller that expects the Int result will call the latter; the former is a wrapper that boxes the object (the compile-time VlangeFunction system thing can take care of this).

Arguments can be handled similarly: "foo value" and "foo value[Int]"

Misc
[2005.1.1]

How much type inference can we get if only function results are typed? Hey, then we could return "[]" to use in indexing.

Vital insight or blindingly obvious?: It is not the object but the *class* which runs operations on an object. The object is under the control of the class, which can have it change its representation in memory if it wishes.

All field references on an object from *outside* the object's class must go through a function call at runtime.

Bytecode/runtime recompilation on certain class changes. No, on function changes (maximum granularity). Arbitrary changes; unit of compilation granularity is the function. (Sometimes a class will recompile *all* of its functions. Perhaps sometimes a change in the *callers* can trigger a function to recompile. (What for?) Or a change in a called function can trigger recompilation of all callers.)

We want the bytecode to be expressible as objects, in the spirit of LISPish code manipulation. Is it a full-on compact representation of the source, including comments? Or a low-level "expressions and control structures only" representation? (Are they really that far apart? Can comments and blank lines be treated as *annotations* on the source?)

"for" loops are an important target for optimization. Ideally, we want the iterator object completely optimized out in some cases.

Argument Unification
[2005.1.1 11:17PM]

Functions are defined with argument names, but can be called without argument names. In that case, the caller passes arguments named "arg-1", "arg-2", etc. (one-based counting). The called method knows each argument by *both* names (declared and positional).

In the compiler, this means that the same function/method has two names in the dictionary (of its enclosing Context). In the runtime, ditto for any reflection data.
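
On the argument side, "both names" might look like this in a Python model. The function name bind_arguments is hypothetical; it just shows each declared argument answering to its positional alias as well.

```python
def bind_arguments(declared_names, passed):
    """Map passed arguments, keyed by declared or positional name, to declared names."""
    aliases = {}
    for position, name in enumerate(declared_names, start=1):
        aliases[name] = name
        aliases["arg-" + str(position)] = name    # One-based positional alias.
    bound = {}
    for key, value in passed.items():
        if key not in aliases:
            raise KeyError("unknown argument: " + key)
        bound[aliases[key]] = value
    return bound

# A caller that names nothing passes arg-1, arg-2, arg-3:
print(bind_arguments(["text", "x", "y"], {"arg-1": "Hello", "arg-2": 20, "arg-3": 100}))
# {'text': 'Hello', 'x': 20, 'y': 100}
```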

How to reconcile Vlange's "func arg-1 arg-2" naming (compatible with Python and positional calling) with Smalltalk's "func:with-arg:" style? Again, by having two names. (Does this work for both *declaration* styles? I guess not, but at least we can have a unified language (maybe) in which a function must be called using the same calling convention in which it was declared.) Properties of canonical function names: They encode the number of arguments. They encode the calling style.

The Deal-Breaker?
[2004.12.23]

Consider:

array object at: at set-to: set-to

This is really parsed as:

array object at: (at set-to: set-to)

When you probably meant:

array object at: (at) set-to: set-to
Conditional Compilation
[2004.12.21]

Available at both the namespace and method levels.

    class Foo
        iff generate-c
            fn generate
                #...
        iff generate-cpp
            fn generate
                #...
        fn foo
            iff noisy
                ("Foo foo: The ", name, " is on.") write-line
            #...

Start with allowing a name, which is tested for presence in the global namespace (or maybe any namespace in the current context). Possibly then allow context lookups: "Curses supports-line-drawing". The final step would be to have full-on expressions, but of course only compile-time constant expressions, and allowing absence of the given name, which is treated as a false value.

The parser can completely skip a block just by counting indents and unindents.
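
A sketch of that skip in Python, assuming the lexer emits explicit "indent" and "unindent" tokens (the token spellings are made up for the example):

```python
def skip_block(tokens, start):
    """Return the index just past the block whose "indent" token is at tokens[start]."""
    if tokens[start] != "indent":
        raise ValueError("skip_block must start at an indent token")
    depth = 0
    for i in range(start, len(tokens)):
        if tokens[i] == "indent":
            depth += 1
        elif tokens[i] == "unindent":
            depth -= 1
            if depth == 0:
                return i + 1    # First token after the skipped block.
    raise ValueError("unterminated block")

tokens = ["iff", "noisy", "eol", "indent", "a", "indent", "b",
          "unindent", "c", "unindent", "fn"]
print(tokens[skip_block(tokens, 3)])    # fn
```

Nothing inside the block is parsed, so a skipped block may contain code that would not otherwise compile.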

Dispatch
[2004.12.10]

Want a table-driven approach for the first implementation. Because class hierarchies tend not to be too deep, it should be fine to list every class that implements a function in the list. The exception is functions on Object. In EE3, there are 300 classes! Cleet has 12 functions on Object, so that's burning 14K; also, dispatch becomes slow on functions defined in Object (including ==).

The solution is to have two dispatch functions. One throws a "function not defined" exception if it doesn't find it in the list, the other calls the version on Object.

    MethodSpec_ foo__methods[] = {
        { &SomeClass, foo__SomeClass },
        { &SubClass, foo__SomeClass },
        { &OtherClass, foo__OtherClass },
        { NULL, NULL }
        };

    MethodSpec_ string__methods[] = {
        { &Int__Standard, string__Int__Standard },
        { &String__Standard, string__String__Standard },
        //...
        { NULL, string__Object }
        };

[2004.12.11]

Hey, it can all be done with *one* dispatch, since that last null entry indicates easily whether it's on Object or not. Also, we want to send message-not-understood instead of throwing an exception.
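
The single-dispatch version, modeled in Python rather than C. The tables mirror the C arrays above; the class and function bodies, and the message_not_understood handler, are stand-ins for illustration.

```python
class SomeClass: pass
class SubClass(SomeClass): pass
class OtherClass: pass
class Int: pass

def foo__SomeClass(this): return "foo on SomeClass"
def foo__OtherClass(this): return "foo on OtherClass"
def string__Int(this): return "an Int"
def string__Object(this): return "an Object"

def message_not_understood(this):
    return "message not understood"

# Each table lists every implementing class and ends with a (None, ...) entry.
foo__methods = [(SomeClass, foo__SomeClass), (SubClass, foo__SomeClass),
                (OtherClass, foo__OtherClass), (None, None)]
string__methods = [(Int, string__Int), (None, string__Object)]

def dispatch(methods, this):
    cls = type(this)
    for entry_class, function in methods:
        if entry_class is cls or entry_class is None:
            if function is None:
                # Terminator with no Object version: nobody implements it.
                return message_not_understood(this)
            return function(this)
    raise ValueError("method table has no terminating entry")
```

The terminator's function slot is what distinguishes "defined on Object" from "not defined at all", so one loop covers both cases.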

Compiling Lambdas To C
[2004.12.8]
    struct foo__locals {
        struct Object* this_;
        struct Object* baz;
    };

    struct Object* foo__lambda_1(struct Object* arg, struct foo__locals* locals)
    {
        _Call_(write__sig, arg);
        _Call_(write__sig, locals->this_->name);
        _Call_(write__sig, locals->baz);
        return NULL;
    }

    struct Object* foo__SomeClass(struct Object* this_)
    {
        /* The locals captured by the lambda live in a struct on the stack. */
        struct foo__locals locals_;
        locals_.this_ = this_;
        locals_.baz = _StringLiteral_("blargh");
        all_names__SomeClass(&foo__lambda_1, &locals_);
        return NULL;
    }

(I actually worked out some or all of this in a dream! In the dream, it was already mid-2005.)

Hmm, this is sorta edging toward having a context bound up with a function, where that context could be an object or lambda-locals.

How would lambdas work, especially when not the last argument in a function call? Use {}?

have-printed [Bool] := false; ("Three", "Five", "Seven", "Nine") do fn: { | string | if (have-printed) write string: ", " else have-printed = true write string: string } -- when: { |string| string != "Seven" }

Syntactically, it'll look like everything between the braces is still on the same line.

[2004.12.2]

- As a Command-Line Language

I'm surprised I wasn't thinking of this before, as it's actually quite well-suited for it. But then it becomes important that arguments can be given in any order, and probably that there be optional arguments. The verbosity can be mitigated by a predictive command line.

Can a Woosh-ish directories-as-objects system be the way to bootstrap the whole language?

Filesystem access falls out pretty easily, perhaps with a unary '/' operator to access the root:

/ usr bin files usr bin home documents new-language edit edit file: home documents new-language edit file: documents "Program Name" # current object is home-dir, String used as function name ee3 files: ( Notes, documents Titles )

Can't quite use programs as commands directly, but a simple spec should suffice to allow it.

ee3 files # "files" indicates that it takes files; # ee3:file: and ee3:files: will be defined. # Maybe "files" is the default, with "no-files" as an option --plain-text # adds plain-text: argument, if true, adds the arg --style-sheets-dir file # also files, comma-separated-files

That would create the following functions:

    ee3:file:
    ee3:files:
    ee3:file:plain-text:
    ee3:files:plain-text:
    ee3:file:style-sheets-dir:
    ee3:files:style-sheets-dir:
    ee3:file:style-sheets-dir:plain-text:
    ee3:files:style-sheets-dir:plain-text:

Hmm, gets combinatorial fast. Really need optional args...
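
To put a number on it: with file/files and n optional arguments, the count is 2 * 2^n. A Python sketch (the helper name is made up) that generates the same set as the list above:

```python
from itertools import combinations

def generated_selectors(command, options):
    """Every selector needed when each option may be present or absent."""
    selectors = []
    for count in range(len(options) + 1):
        for chosen in combinations(options, count):
            for files in ("file", "files"):
                selectors.append(":".join((command, files) + chosen) + ":")
    return selectors

two = generated_selectors("ee3", ("style-sheets-dir", "plain-text"))
four = generated_selectors("ee3", ("a", "b", "c", "d"))
print(len(two), len(four))    # 8 32
```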

- Wrapping/continuations

Use "--" to indicate continuations?

    throw --
        ParseExpression new --
            message: "Expression expected." --    # What kind of expression?
            token: tokens current-token

- Type Declarations

Considering {} for type declarations rather than []:

    name {String} := "Hello"
    name [String] := "Hello"

- Characters

Avoid Cleet's difficulties with characters by making them a separate type (Char, not Int). Comparisons, etc., from Strings/Symbols are easy, but how to do assignment?

[2004.11.30.PM]

- Call Syntax

Instead of the above try this:

name ( selector expression )*

So all arguments have names.

    foo print
    view draw text: "Hello" x: 20 y: 100
    bytes int at: flat-index * 4
    set foo: "bar"                                // foo = "bar" becomes this
    foo set bar: 'baz'                            // foo bar = 'baz' becomes this
    array '()' indices: { x, y }                  // array(x, y) becomes this
    array '()' indices: { x, y } set-to: 'foo'    // array(x, y) = 'foo'
    x '+' arg: y                                  // x + y; "arg" is the argument name of binops

Selector names of the above:

    print
    draw:text:x:y:
    int:at:
    set:foo:
    set:bar:
    ():indices:
    ():indices:set-to:
    +:arg:

Maybe it's fine not to sort argument names. So foo:x:y: is distinct from foo:y:x:.
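
The naming rule is mechanical enough to write down. A Python sketch (the function and its sort_keywords flag are illustrative, not part of any compiler) that reproduces the selector names above and shows what sorting would change:

```python
def selector_name(name, keywords=(), sort_keywords=False):
    """Build a selector from a function name and its argument keywords."""
    if not keywords:
        return name                     # Nullary: no colon at all.
    if sort_keywords:
        keywords = sorted(keywords)     # Canonical order, if we ever want one.
    return name + ":" + "".join(keyword + ":" for keyword in keywords)

print(selector_name("draw", ["text", "x", "y"]))    # draw:text:x:y:
print(selector_name("foo", ["y", "x"]))             # foo:y:x:
```

Without sorting, foo:x:y: and foo:y:x: stay distinct; with sort_keywords=True they collapse to one name.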

Should we call nullary functions with a trailing colon too, when we need to indicate their functionness? Eg. print:.

The conceptual clarity of this approach is counterbalanced by its verbosity.

- Object Creation

    foo [SomeClass] := SomeClass new name: "Globulus"
    // same as: (foo [SomeClass] := SomeClass raw-new. foo create name: "Globulus")

- Dispatch

Could work in reverse: selector is the main thing, then check the receiver type to choose which method. For monomorphic functions, receiver type can be verified or not.

- Tuples

Since {} are not used for blocks, we can use them for tuple-building. Roughly:

    { x, 'foo', 7 }
    // comes out as:
    temp [Tuple] := Tuple new size: 3
    temp at index: 0 set: x
    temp at index: 1 set: 'foo'
    temp at index: 2 set: 7
    temp

Or maybe, like Python, have a "," operator that makes tuples. It'd be lower precedence than keyword calls.

    ( foo, bar size, baz at index: 3 )
    array(x, y)    // same as: array at indices: (x, y)

We'd like these to be a single block of memory:

    class (Tuple)
    num-items
    item[0]
    item[1]
    item[2]

- Comments

Just use "#"? We want to accept it anyway, and it's easier on smart editors if they only have to look for one character to start a comment.

- Wrapping

How to wrap function calls? We want to do this:

    ParseExpression new
        message: "Expression expected."
        token: tokens current-token

But how do we deal with expressions in "if" and "while" statements?

    if some-long-function
            berndehor: 'Amount'
            auflunginden: 3.57
        do-something

I suppose with enough lookahead it would work: name eol block-start selector etc.

- Declarations

Now we don't have to distinguish 'foo' from 'foo:'. Does this give us the ability to go back to colons for type declarations? Function arg declarations without colons have no type.

    fn print object
        print text: object string

    fn print line: String -> String    # hmm, still need a different syntax for return type
        line-end: String
        if is-mac
            line-end = "\r"
        elseif is-winblows
            line-end = "\r\n"
        else
            line-end = "\n"
        total-line: String = line + line-end
        print text: total-line

    fn bytes-output -> Int

So a selector starting a line is a declaration. But that conflicts with function call wrapping in "if"/"while" statements.

- Syntax

function-declaration: "fn" name argument-decls? return-type-decl? eol block "primitive-fn" name argument-decls? return-type-decl? eol argument-decls: ( name | selector type )* eol block-start ( name argument-decl: name selector type return-type-decl: '->' type eol block-start '->' type eol block-end eol: comment? end-of-line block: block-start block-statement* block-end block-statement: statement comment-line blank-line local-decl: selector type? ( '=' expression )? eol

- Containment

Classes can contain fields, functions, class fields, class functions, classes, superclass declarations, and maybe even packages. A class can "use" a package.

Packages can contain classes, fields, functions, and packages. Fields and functions act like class fields and class functions. A package can "use" another package.

The global level is just the package Main. Actually, no, other packages should see globals but not Main, and Standard should not be known as Main Standard. But globals will probably be implemented as a package; it contains all the same kinds of things. "nil", "true", "false", and "globals" can be defined in the globals.

So there are really functions and instance functions (same with fields). In a package, "fn" defines a function. In a class, "fn" defines an instance function, and "class-fn" defines a function.

- Return Values

*All* functions return something. If a return type is not declared, Object is assumed. All functions implicitly end in "return nil" (should compile down to a single instruction that clears a register).

- Primitives

    class FileOutStream
        superclass OutputStream
        primitive-fn create path: String
        primitive-fn write buffer: BytePtr size: Int
[2004.11.30.AM]

- Use indentation like Python.

Honor the line! C/Pascal/Algol pretend the line doesn't exist (probably in reaction against Fortran's assumption that the line was on a punch card). We'll need to allow wrapping (can't quite rely on the editors yet); try to indicate that syntactically (including the use of indentation).

- Function call syntax:

Try to unify C/Fortran/math-style function calls with Smalltalk-style ones: every argument has a keyword, but lists of unnamed arguments can also be passed. There is no default assignment of positional arguments, except when the function takes only one argument. Otherwise, names must be given.

draw-text: "Hello." x: 23 y: 27.

matrix[x, y, z] --same as-- matrix at x: x y: y z: z

No, see, that indicates we do want default positional arguments. Or do we?

matrix[x, y, z] --same as-- matrix at: (x, y, z)

where () syntax makes a tuple

I'm leaning toward some hybrid where the first argument has a name that the caller doesn't specify, but all the other arguments are given by name. Here are some tests, with and without type specifiers.

def draw-text: text x y

...

def draw-text: text: String x: Int y: Int

...

def draw-text: text: String, x: Int, y: Int

...

def draw-text: text, x, y

...

def draw-text: text [String], x [Int], y [Int]

...

In the declaration, if the arguments are typed, commas must be used. (Actually, the compiler can easily accept not having commas now that we're using [] for typing, but probably ought to complain.)

    x + y                    // same as: x "+": y
    matrix(x, y, z)          // same as: matrix with: (x, y, z)
    matrix(x, y, z) = foo    // same as: matrix "()": (x, y, z) new-value: foo

Name them like Smalltalk -- draw-text:x:y: -- but order of arguments is not meaningful.

    Nullary: draw-text
    Unary: draw-text:
    Multiary: draw-text:x:y:

I guess compilers will tend to sort the argument names to canonicalize the name. To be nice about presenting them to humans after compilation, global knowledge of the different declared orders can be used. If only one such order is declared, show that. If more than one, choose one of the declared orders rather than showing the canonical order. Actually, the name of the first argument can even be presented this way.

Writing a "()" operator function:

    class TwoDimensionalArray
        fn "()" [Int]: indices [Tuple]
            flat-index := indices(0) * row-size + indices(1)
            return bytes int-at: (flat-index * 4)
        fn "()": indices [Tuple], new-value [Int]
            flat-index := indices(0) * row-size + indices(1)
            bytes int-at: flat-index * 4 = new-value

More declarations:

class Foo field name [String] fn title [String] ... fn title: new-title [String] ... fn at: [String] index [Int] ... // or?: fn at: index [Int] [String] ... // but then how do you declare a return type without an argument // type? fn at: index | [String] ... fn at: index [Int] | [String] ...

- Namespaces

Get serious about namespaces. Not merely in the Cleet sense (of packages/modules), but with the idea that any given section of code exists in a namespace in which names can be looked up. Hmm, perhaps even some degree of runtime lookup would be useful, even if it means that globals must (usually) use a runtime lookup.

Can we fix Visitor this way somehow? Like, there's sort of a dual "this", and a name

with some-object ...

Single names might be function calls, and get involved in setters too[1]. Their hierarchy is like:

    Local variables
    Function arguments
    Unary functions (including field gets) in "this"
    All dynamic namespaces, including current package and globals

[1] No they don't. It's a function call:

foo = 3 // same as: foo: 3

Is that right? Does setting fall out nicely? (The two functions are "foo" and "foo:".)

So there could almost be local functions (and why not) since variables -- now *all* variables -- are just special cases of function calls.

By the way, we're not too worried about a proliferation of names. So for example, the first argument to a function could be called "argument" as well as its given name.

Another stab at the hierarchy. These are probably installed on the fly as the compiler compiles:

    Locals
    Arguments
    This
    Class
    Package
    Used packages
    Globals
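
A Python sketch of looking a name up through that hierarchy, innermost first. Plain dictionaries stand in for the compiler's contexts here; the scope keys and the lookup function are assumptions for illustration.

```python
SCOPE_ORDER = ["locals", "arguments", "this", "class",
               "package", "used packages", "globals"]

def lookup(name, scopes):
    """Return (scope, value) for the first scope that knows the name, else None."""
    for scope_name in SCOPE_ORDER:
        table = scopes.get(scope_name, {})
        if name in table:
            return scope_name, table[name]
    return None

scopes = {
    "locals": {"x": 1},
    "this": {"title": "field get"},
    "globals": {"nil": None, "x": "shadowed"},
}
print(lookup("x", scopes))        # ('locals', 1)
print(lookup("title", scopes))    # ('this', 'field get')
```

Scopes that have not been installed yet are simply absent from the dictionary, which fits the idea of installing them on the fly as compilation proceeds.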

Package refs:

    Xlib Window new
    Xlib Display expose-event    // class variable
    Xlib expose-event            // package variable
    globals Main MyProgram       // same as MyProgram
    this-package Display

- Optimization

Taking a different tack towards Cleet's goal of C-ish speed with Smalltalkish dynamism. For now, still start with a compiler that reads the entire program and generates C-ABI code. It will report where it cannot use optimizations. For function-call optimization, also make use of global knowledge. Eg. if there's only one function named "eat-my-shorts", anyone who calls "eat-my-shorts" must be calling that function.
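
The global-knowledge check might be sketched like this in Python. The data layout (pairs of class name and function name) and both helper names are invented for the example; a real compiler would also have to report the calls it could not optimize.

```python
from collections import defaultdict

def index_definitions(definitions):
    """Group every (class, function) definition in the program by function name."""
    by_name = defaultdict(list)
    for class_name, function_name in definitions:
        by_name[function_name].append(class_name)
    return by_name

def resolve_call(function_name, by_name):
    """A name with exactly one definition can be called directly."""
    owners = by_name.get(function_name, [])
    if len(owners) == 1:
        return ("direct", owners[0])
    return ("dynamic", None)    # Zero or several definitions: dispatch at runtime.

program = [("Bart", "eat-my-shorts"), ("Int", "string"), ("String", "string")]
by_name = index_definitions(program)
print(resolve_call("eat-my-shorts", by_name))    # ('direct', 'Bart')
print(resolve_call("string", by_name))           # ('dynamic', None)
```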

- Type-Optional

An object's type is not required (Object is implied), but can be given. There will be different levels of reporting: no type check, type check for optimization... ...lost...

Anyway, if a name's type is given, how hard do we try to ensure that the object bound to that name is that type?

I think we have to have a certain amount of leeway, especially for iteration. For instance:

for view: View in views ...

It's unlikely we'll want anything like Cleet's Qualified Types, so we'll have to just trust that the object we get from "views iterator current" is really a View. In other words, a type declaration is a guarantee *by the programmer* of an object's type.

The Pascal-style syntax is becoming unwieldy. Instead, use an attribute-style syntax:

for view [View] in views ...

- Setters

Just working this out:

    foo = 3        // same as: foo: 3
    foo x = 3      // same as: foo x: 3
    array(x, y)    // same as: array "()": (x, y)
                   // Function name is "():"
    x + y = z      // illegal

Hmm, this is not getting us general setters on any conceivable function.

foo globulas: 3 title: "Gah!" = bar

For now, I don't think we care. It doesn't seem to be a truly useful idiom, as long as nullary and operator functions are covered. Wait: unary too:

    bytes int-at: flat-index * 4 = new-value
        // same as: bytes int-at: flat-index * 4 new-value: new-value
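The rewrite rules above can be sketched as a small desugaring pass. This is Python, not Trylon, and the node classes (`Name`, `Send`, `BinOp`) are invented for the sketch. Rejecting multi-keyword setters is also an assumption; the notes only say we don't care about them for now.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Name:
    text: str


@dataclass(frozen=True)
class Send:
    receiver: object              # Name, Send, or None for an implicit receiver
    keywords: Tuple[str, ...]     # ("x",) for a unary send, ("int-at:",) for a keyword send
    args: Tuple[str, ...] = ()    # argument source text, one per keyword ending in ":"


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


def desugar_setter(lhs, value):
    """Rewrite `lhs = value` into the plain send it stands for."""
    if isinstance(lhs, Name):                       # foo = 3  ->  foo: 3
        return Send(None, (lhs.text + ":",), (value,))
    if isinstance(lhs, Send):
        if not lhs.args:                            # foo x = 3  ->  foo x: 3
            (selector,) = lhs.keywords
            return Send(lhs.receiver, (selector + ":",), (value,))
        if len(lhs.keywords) == 1:                  # one keyword: append new-value:
            return Send(lhs.receiver, lhs.keywords + ("new-value:",),
                        lhs.args + (value,))
        raise NotImplementedError("multi-keyword setters are left open in the notes")
    raise SyntaxError("an operator expression cannot be assigned to")


def render(node):
    """Turn a node back into source text."""
    if isinstance(node, Name):
        return node.text
    if isinstance(node, Send):
        parts = [] if node.receiver is None else [render(node.receiver)]
        args = iter(node.args)
        for keyword in node.keywords:
            parts.append(keyword)
            if keyword.endswith(":"):
                parts.append(next(args))
        return " ".join(parts)
    raise TypeError(f"cannot render {node!r}")
```

For example, `render(desugar_setter(Send(Name("bytes"), ("int-at:",), ("flat-index * 4",)), "new-value"))` gives back the "same as" form of the int-at example, while passing a `BinOp` raises `SyntaxError`, matching the `x + y = z` case.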

- Command Line

The prompt is a "----" line above the command. Maybe a "> " on the command line, with "  " (two spaces) as the prompt for continuation lines (if any) so that they line up, or maybe use auto-indentation of some kind, depending on how sophisticated the command line is. Anyway, this divides the screen up into command/response sections.

- Some syntax

    expression:
        binary-expr
        binary-expr selector
        binary-expr ( selector binary-expr )*
        binary-expr selector binary-expr "=" expression

    binary-expr:
        mult-expr
        mult-expr "&&" binary-expr
        mult-expr "||" binary-expr

    mult-expr:
        add-expr
        add-expr "*" mult-expr
        add-expr "/" mult-expr
        add-expr "%" mult-expr

    add-expr:
        primary
        primary "+" add-expr
        primary "-" add-expr

    primary:
        basic
        basic "(" expression ( "," expression )* ")" ( "=" expression )?

    basic:
        name
        literal
        "(" expression ")"
        "(" expression "," expression ( "," expression )* ")"    // tuple

    control-statement:
        "if" expression eol block ( "else" eol block )*
        "while" expression eol block
        "loop" eol block
        "break" eol
        "continue" eol
        "try" eol block ( "catch" name opt-type block )* ( "finally" block )?
        expression eol
        name opt-type ":=" expression eol

    class-declaration:
        "class" name eol ( declaration )*

    declaration:
        "fn" name args? result-type? eol block
        "function" name args? result-type? eol block
        "field" name opt-type eol
        "class-fn" name args? result-type? eol block
        "class-function" name args? result-type? eol block
        "class-field" name opt-type eol

    args:
        ":" name opt-type ( ","? name opt-type )*

    result-type:
        "|"? opt-type

Or alternately:

    declaration:
        "fn" signature result-type? eol block
        "function" signature result-type? eol block

    signature:
        (name | selector name type? ( ("," eol? )? name type? )*)

    result-type:
        "|"? type?
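To see what the binary-expr / mult-expr / add-expr rules under "Some syntax" actually do, here is a toy recursive-descent parser in Python that follows them literally. It is only a sketch: names are restricted to letters, digits and underscores (hyphenated names would clash with "-" in this tokenizer), sends, tuples and calls are left out, and a parenthesized group holds a binary-expr rather than a full expression.

```python
import re

TOKEN = re.compile(r"\s*(&&|\|\||[-+*/%()]|[A-Za-z0-9_]+)")
NOT_A_PRIMARY = ("&&", "||", "+", "-", "*", "/", "%", ")")


def tokenize(text):
    """Split source text into operator, parenthesis and name tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def binary_expr(self):      # mult-expr ( "&&" | "||" ) binary-expr
        left = self.mult_expr()
        if self.peek() in ("&&", "||"):
            return (self.take(), left, self.binary_expr())
        return left

    def mult_expr(self):        # add-expr ( "*" | "/" | "%" ) mult-expr
        left = self.add_expr()
        if self.peek() in ("*", "/", "%"):
            return (self.take(), left, self.mult_expr())
        return left

    def add_expr(self):         # primary ( "+" | "-" ) add-expr
        left = self.primary()
        if self.peek() in ("+", "-"):
            return (self.take(), left, self.add_expr())
        return left

    def primary(self):          # name, literal, or a parenthesized group
        token = self.take()
        if token == "(":
            inner = self.binary_expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if token in NOT_A_PRIMARY:
            raise SyntaxError(f"unexpected {token!r}")
        return token


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.binary_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected {parser.peek()!r}")
    return tree
```

Two things fall out of the rules as written. Because add-expr sits inside mult-expr, `parse("a + b * c")` groups as `("*", ("+", "a", "b"), "c")`, so "+" binds tighter than "*". And because every rule recurses on the right, `parse("a - b - c")` groups as `("-", "a", ("-", "b", "c"))`. Whether either is intended is worth deciding before the grammar hardens.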

- Expression wrapping

    add-expr:
        primary
        primary "+" add-expr-arg
        primary "-" add-expr-arg

    add-expr-arg:
        add-expr
        eol block
        eol rest-of-block

"rest-of-block" will get the remainder of the current block within the wrapped expression. It won't go out of the expression-statement.

So:

    foo = bar somathanonchon-ablamos-querium +
        baz fugison * speed-of-light

- To Do

How can class fields/functions be unified with package fields/functions and global fields/functions?

Declaration wrapping.

Object creation. Point new: 10 y: 20. Or Point new x: 10 y: 20. Point init. But the latter's not so good when only one argument is used, and it's the same as a field:

class Parser fn text: text fields text = text lexer = Lexer new text: text

The first one could be syntactic sugar for:

(Point raw-new) create: 10 y: 20
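The same split between allocation and initialization, sketched in Python for comparison. The method names `raw_new`, `create` and `new` mirror the Trylon spellings above; the `x` and `y` fields are assumed from the "Point new x: 10 y: 20" example.

```python
class Point:
    @classmethod
    def raw_new(cls):
        """Allocate an uninitialized instance (the "raw-new" step)."""
        return object.__new__(cls)

    def create(self, x, y):
        """Initialize the fields and return the object (the "create:y:" step)."""
        self.x = x
        self.y = y
        return self

    @classmethod
    def new(cls, x, y):
        """The sugared form: allocate, then initialize."""
        return cls.raw_new().create(x, y)


point = Point.new(10, y=20)
```

Having `create` return the object is what lets the two steps chain into a single expression, as in the parenthesized form above.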