Feb 27, 2010

Smalltalk Notation

Smalltalk, a very influential language, has a very simple syntax for manipulating objects in the OO sense. In Smalltalk everything is an object and all objects respond to messages (methods).

Declaring an object is done as follows.

Object subclass: #Account
    instanceVariableNames: 'balance'
    classVariableNames: ''
!

The above declaration is a normal Smalltalk method call, Object is the receiver of the message subclass:#instanceVariableNames:#classVariableNames:#. This represents one method with 3 keyword(named) arguments. The exclamation mark is a delimiter indicating the end of a Smalltalk statement. The basic syntax of method calls is: obj method: arg. This is extended to simple functions such as +, * and -, which are messages to numbers. Account inherits from Object, like everything else in Smalltalk (Object inherits from nil). The above expression declares the Account class.


The class Account now has access to Object's methods, one of them being the primitive method new which creates a new instance of a class. In our Account class we have an instance variable balance that must be initialized to zero at the creation of a new Account. This requires us to override the new method such that it calls an initialization function when new is called.


We may open a class and insert methods into it any time we wish, the expression for opening a class is !ObjectName class methodsFor: 'description'!, the word class decides whether the methods declared operate on instances of the object(if omitted) or the class itself(if included). This expression opens a block, in this block we place a method's declaration, and close the block with an exclamation mark. We now override the new method to initialize newly created Account objects, the initialize method is declared for instances of the Account, setting balance to zero. A method is declared for accessing an Account's balance (^ is the return operator) and one further method is needed to set the balance to whatever we want.

!Account class methodsFor: 'instance-creation'!
new
    ^(super new) initialize
! !

!Account methodsFor: 'initialization'!
initialize
    balance := 0
!

balance
    ^balance
!

balance: x
    balance := x
! !



Now, using the Account class.

> a := Account new!
an Account
> a balance!
0
> a balance: 20!
20
> a balance: 20 + (a balance)
40


As a consequence of the way objects respond to method calls, expressions in Smalltalk are interpreted from left to right with no precedence. It can be suprising if you are used to normal mathematical precedence rules.

> 2+4*3!
18
>2+(4*3)!
14


The remaining Smalltalk syntax is the block. A block is a piece of code between square brackets, blocks represent anonymous functions in Smalltalk. Blocks are called using the value:# method.

> []!
a BlockClosure
> [:x | x] value: 3!
3
> [:x | x + 1] value: 3!
4
> 1 to: 5 collect: [:x | x + 1]!
#(2 3 4 5 6)


This uniform, homoiconic syntax means that extensions in the language are indistinguishable from native syntax and makes metaprogramming very easy, this is why many Smalltalkers say that Smalltalk can incorporate other programming language features simply as libraries.

Feb 26, 2010

Learning Ursala (3)

In the last post I introduced constructors in Ursala. In cooperation with deconstructors, constructors may be used in pointer expressions to take apart and then put together data (deconstruct and reconstruct data).

An example of this type of manipulation was already presented in the previous post.

    ~&rlX (1,2)
(2,1)


The above example first deconstructs the pair into its right and left component respectively and then constructs a new pair from these values.

Since constructors take two arguments to the left, a problem occurs when chaining complicated expressions. An expression such as ~&httC is invalid because the C constructor takes only 2 arguments (tt in this case), so the C constructor attempts to cons 2 tails of a list (an invalid operation) and h is then used to address this result.

The P constructor simplifies the writing of complex pointer expressions. P takes two deconstructors on its left and bonds them as one indivisible decontructor, applying them in their respective orders. If you know J, this is analogous to the bond verb (@)

    ~&httPC <1,2,3,4>
<1,3,4>
    ~&lrPrlPX ((1,2),(3,4))
(2,3)


Because pairing (or bonding) deconstructors in such a way is very common, Ursala allows you to place a natural number in a pointer expression to indicate the number of consecutive P constructors.

    ~&htt1C <1,2,3,4>
<1,3,4>
    ~&htttPPC <1,2,3,4>
<1,4>
    ~&httt2C <1,2,3,4>
<1,4>


Two other constructors are provided, G (glooming) and I (pairwise relative addressing). Both these constructors accept two left expression arguments. It is easier to discuss these constructors in terms of the transformations they perform on their arguments.

The G constructor transforms an argument of the form (a,(b,c)) into one of the form ((a,b),(a,c)). To quote from the Ursala manual [pdf], "a copy of the left is paired up with each side of the right". A consequence of this is that the expression ~&lrlPXlrrPXX is equivalent to ~&lrG. For example:

    ~&llPrX ((1,2),(3,4))
# (a,(b,c))
(1,(3,4))
    ~&llPrG ((1,2),(3,4))
# ((a,c),(a,d))
((1,3),(1,4))


The I constructor's actions are specified for four cases outlined in the table below.



The function of the I constructor is unstable right now, it is subject to future change and any uses deviating from the table are considered unspecified behaviour.

(Continued later)

Feb 24, 2010

Learning Ursala (2)

Ursala's pointer expressions have nothing to do with pointers in the C sense(the name doesn't help in alleviating that confusion), they are so named because they are used to address(or point to) elements of a data structure.

Pointers are created using the ampersand character (&). An ampersand without arguments represents the identity pointer. The tilde character (~) is used to invoke pointer expressions, it evaluates a pointer to the function induced by it. To understand what this means, the different types of pointer expressions must be discussed.

Deconstructors are one type of pointer expression, they are used to deconstruct a data structure.
They take a data structure as an argument and return a component of the argument as a result. Functions such as tail and head in functional languages can be thought of as deconstructors. Ursala offers a rich set of deconstructors for manipulating various data types.

To illustrate the idea of pointers more clearly, let us take an example of manipulating a list. Lists in Ursala are comma separated values(numbers, strings, ...) delimited by angle brackets. To ease in reading, any line of code preceded by 4 spaces is an expression entered at the prompt, the result is displayed below the input with no spaces preceding it. Comments are prefixed by a #.


<1,2,3,4>
<1,2,3,4>
# identity pointer
&
&
# call the identity pointer
~&
<0,0>
~& <1,2,3>
<1,2,3>
~&h <1,2,3>
1
~&t <1,2,3>
<2,3>


Looking at the examples above, it is clear that h and t represent the head and tail deconstructors respectively. Deconstructors may be chained together to form more complex deconstructors.


~&tt <1,2,3,4>
<3,4>
~&th <1,2,3,4>
2
~&ttt <1,2,3,4>
<4>


Since the h and t deconstructors operate on lists, such an expression is invalid.


~&ht <1,2,3,4>
0 # zero represents false or nil in Ursala.


It is now clear how we can extract single elements from a data structure, but what if we want to extract more elements? We may achieve this by arranging the pointer expressions in the form we wish to receive the final output. For a deconstructor returning a tuple data structure, we arrange the pointer expressions in the form of a tuple. For a deconstructor returning a list, Ursala provides a semicolon as syntactic sugar (think of it as a cons operator) which joins a head(single element) on the left of the semicolon to the tail(a list).


~(&ll, &rr) ((1,2),(3,4))
(1,4)
~&h:&tt <1,2,3,4>
<1,3,4>


The need for a less clumsy syntax when manipulating complex compound deconstructors is the reason constructors exist, they are the second type of pointer expression in Ursala. A constructor is a combinator which takes deconstructors as arguments, manipulating them in such a way as to construct a data structure from their results.

The X constructor takes two consecutive deconstructors on its left hand side and groups them to form a pair.


~&rlX (1,2)
(2,1)


There are constructors for each data structure, the C constructor (or cons) is used to join the result of two pointer expressions as a list. Constructors allow us to chain more deconstructors together, forming more complex selections.


~&htC <1,2,3,4>
<1,2,3,4>
~&hrlX <(1,2),(3,4)>
(1,1)


A table of the various data structures and their corresponding deconstructors/constructors is available in the Ursala manual.



Later I will be discussing more of the available constructors and deconstructors.

(Continued later)

Feb 23, 2010

Learning Ursala

I have recently been interested in combinatory logic, it is a notation developed to remove the need for explicit variables in function definitions. As you may have guessed, this dramatically reduces code size.

The essence of combinatory logic is to manipulate a set of data using function composition without having to reference the data in each of the functions. Combinatory logic is a form of lambda calculus, which is the basis of the functional programming paradigm(to use a cascade of functions to transform an input into a required output).

A combinator is a higher-order function that accepts other functions as arguments and produces a function application as output.

The language J implements some of the combinatory primitives discussed by Haskell Curry. Of particular interest are the concepts of hooks and forks in J.

Let us define a function g, that takes a number and calculates the sum of its square and factorial. An explicit definition of this function:

g(x) = (x! + x^2)

The concept of a fork defines how a series of adjacent functions is applied to an argument. In J, if f g and h are functions, a fork is reduced in the following manner.

(f g h) x = f(x) g h(x)

while a hook is reduced in the following manner:

(g h) x = x g h(x)

We may use this concept to shorten the explicit function g(x) defined above:

g=: ! + *:

Note that these combinators are implicit in J. Grouping two functions applies a hook combinator to them, grouping three functions ( a train in J terms) applies a fork to them. Various other combinators are present in the language that remove the need to explicitly reference variables in function compositions. This is one of the reasons J's syntax is so terse.

Ursala is another language that implements a form of combinatory logic, although it differs considerably from J's idea. Ursala solves the issue of eliminating variables from functions by using a form of variable addressing called pointer expressions.

(Continued later)

Hello World!

By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and in effect increases the mental power of the race.

- A. N. Whitehead