Notational

Oct 16, 2010

Visual J Interface (2)

A second take:

1) Types are represented by colours (Nouns: grey, Verbs: yellow, Adverbs: blue, Conjunctions: red). Derived entities are nested, the colour of the outermost box is the type of the derived entity. Using a blend of colours is impractical because of the large number of derived entities. Panels 1 and 2 in the first diagram use shapes in addition to colours to represent the types.

2) Hooks and forks have distinct colours, they are expanded to explicit conjunctions. When clicked the display of hooks and forks could be expanded further to the restructuring of their function arguments, this behaviour could be extended to all conjunctions.

3) Some ideas on visualising execution. A part of the structure may be highlighted and interactively executed(changing the structure as functions are applied) or the whole function may be executed step-by-step. Incomplete.

Oct 14, 2010

Visual J Interface

Reading the J general mailing list I came across a post by Bob Therriault on ideas for a visual J interface. I found his ideas very promising, he evolved the simple J tree display into something more user-friendly and informative.

Building on Therriault's ideas, I have sketched out some ideas I would like in a J visual interface.

1) A Derived entity should take a colour which is a mixture of the colours of its parts. This allows us to quickly identify how it was formed. Restricting verbs, conjunctions and nouns to primary colours makes this easier.

2) Hooks and forks should have distinct colours as they are special constructs in the language.

3) Gerunds should be shaded a common colour(close to a verb's colour preferably) to emphasize that it acts as one element.

4) The arguments to a function (x and y) should be optionally displayed(toggled or and off).

5) The user should be allowed to edit the structure by dragging and dropping elements as they wish and have the code updated automatically.

Some more ideas are detailed in the diagrams below:

An alternate version where all arrows point up, indicating data being fed into a pipeline. Expansions of derived entities are represented by dashed lines. This way the figure can be flipped upside down and still be readable.

Oct 6, 2010

Left-Leaning Bidents

Forks in J, previously discussed here, have a very useful extension when the left element is a noun.

Normally, a fork(also known as a trident) is an isolated series of 3 functions(functions placed one after the other, known as a train), when arranged in this manner they have an invisible combinator applied to them such that if f, g and h are functions and x, y are arguments:

      (f g h) y = f(y) g h(y)
    x (f g h) y = (x f(y)) g (x h(y))

A hook(bident) is an isolated series of 2 functions, this also has an invisible combinator applied to it, such that if f and g are functions and x,y are arguments:

      (g h) y = y g h(y)
    x (g h) y = x g h y

The motivation for using such constructs:

1) It eliminates explicit variables, a lot of useful expressions can be simply written as a series of functions which are converted into hooks and forks by the above mechanism.

    mean=: +/ % #

2) A series of functions can now have dyadic functions embedded inside it(e.g. the central element of a fork). This allows us to use infix functions naturally and frees us from the limitations of a composition-only function application style, a limitation such as being unable to use infix functions in a series of functions without declaring explicit variables. This added expressibility is the reason for 1).

3) It gives utility to otherwise useless but syntactically valid expressions(with minimal additions to the parser). An isolated odd number of functions arranged in a series becomes a fork, an even number of functions becomes a hook.

    diff =: - +/ % #
    diff2=: - mean
    (diff i.5) = diff2 i.5  NB. Equivalent
1
    (diff = diff2) i.5      NB. A fork equivalent to the above
1

If the left element of a fork is a noun, then the fork behaves differently, the left noun is used as a left argument to the dyadic function g(the central element of the fork):

       (n g h) y = n g h(y)
     x (n g h) y = n g (x h(y))

This is equivalent to currying g using n and is often used in the same manner, for example the expression 3n+1 can be written as a fork using this extension:

    1+3*]

The special behaviour exhibited by forks with a left noun element is termed a "Left-leaning fork".

I would like to see a similar extension applied to hooks, a "Left-leaning hook", for the case of a noun followed by a verb. This was inspired by Nial which has a similar syntax for currying. Hooks can then be used to directly curry, no longer needing the conjunction &(bond).

      (n g) y = n g y
    x (n g) y = n g y

For example:

    f1     =: (3*)`(_5+~)@.*
    collatz=: -:`(1+3*])@.(2|)
    f2     =: 3 (4 : 'x%+:>:x+y')   NB. Explicit definition curry

The motivation for this:

1) It simplifies code. No longer needing explicit currying achieved using a conjunction, the purity of hooks and forks is maintained. This is analogous to how cap([:) gives forks the ability to perform function composition, in the same vein we are able to extend hooks to perform currying.

2) It is a logical extension to a currently unused syntactic expression.

    (2+) = (2+])

The implementation of this extension is trivial; the hook will simply check the first element's type, if it is a noun, a left-leaning hook is called in the manner specified above. The current parse table doesn't need to be changed because the line matching hooks already allows nouns on the left side. I have experimented with this in the old j7 source code.

Jun 14, 2010

J's Numeric Constants

Numeric shorthands are useful in simplifying the description of compound number expressions. Most programming languages support the scientific notation for numbers. For example, 3e2 is equivalent to 3*10^2. When mixing shorthands, decimal point and scientific for example, we must choose which is obeyed first. Conventionally, the decimal separator always has the highest precedence, followed by negative signs second and the scientific exponent last.

J extends numeric shorthands so that compound numbers may be written with ease.

.            decimal separator.
_            negative sign.
r            exact ratio, quotient.
e            scientific notation.
ad ar j      complex magnitude of degree angles or radian angles, complex.
p x          pi times(apb = a*pi^b), Euler's number(axb = a*e^b).
b            base representation(2b110 = 5).

The letters are ordered in order of decreasing precedence, the decimal separator being obeyed first. Letters on the same line have the same precedence, there is no associativity so two letters with the same precedence may not be in the same numeric expression.

Some examples of applying these shorthands:

    2e2j2p1
628.319j6.28319
    2e2j_2e2
200j_200

The numeric constants constitute an LL(1) grammar.

  E0 -> E1$
  E1 -> E2 b E2
  E2 -> E3 p E3 | x E3
  E3 -> E4 ad E4 | ar E4 | j E4
  E4 -> E5 e E5
  E5 -> E6 r E6
  E6 -> _ E7
  E7 -> E8 . E8
  E8 -> num

(Elaboration on types)

Apr 14, 2010

Pascal's Triangle

Pascal's Triangle is defined recursively as:

if row = 0 or col = 0 or row = col then 1
else P(row, col) = P(row - 1, col - 1) + P(row - 1, col)

The two elements on the row above and columns directly adjacent to the current cell are added giving us a new element.

    0 | 1
    1 | 1 1
    2 | 1 2 1
    3 | 1 3 3 1
    4 | 1 4 6 4 1
       ----------
        0 1 2 3 4

    P (2,1) = P(1,0) + P(1,1) = 1 + 1 = 2

In tacit J:

    pasc=: (-&1 1 +&$: -&1 0)`1:@.(=/ +. 0&=)
    pasc 2 1
2

An alternative way of calculating pascal's triangle is to use the iteration operator:

    pasc2=: 3 : '(0&, + ,&0)^:(i.y)1'
    pasc2 5
1 0 0 0 0
1 1 0 0 0
1 2 1 0 0
1 3 3 1 0
1 4 6 4 1

Using the binomial coefficient is the best way, following directly from the definition of pascal's triangle:

    pasc3=: [: !/~ i.
    pasc3 5
1 1 1 1 1
0 1 2 3 4
0 0 1 3 6
0 0 0 1 4
0 0 0 0 1

An interesting property of pascal's triangle is that the diagonals add up to the fibonacci numbers:

    </. pasc3 5
+-+---+-----+-------+---------+-------+-----+---+-+
|1|1 0|1 1 0|1 2 0 0|1 3 1 0 0|4 3 0 0|6 1 0|4 0|1|
+-+---+-----+-------+---------+-------+-----+---+-+

    ({. [: +//. pasc3) 10
1 1 2 3 5 8 13 21 34 55

Feb 27, 2010

Smalltalk Notation

Smalltalk, a very influential language, has a very simple syntax for manipulating objects in the OO sense. In Smalltalk everything is an object and all objects respond to messages (methods).

Declaring an object is done as follows.

Object subclass: #Account
    instanceVariableNames: 'balance'
    classVariableNames: ''
!

The above declaration is a normal Smalltalk method call, Object is the receiver of the message subclass:#instanceVariableNames:#classVariableNames:#. This represents one method with 3 keyword(named) arguments. The exclamation mark is a delimiter indicating the end of a Smalltalk statement. The basic syntax of method calls is: obj method: arg. This is extended to simple functions such as +, * and -, which are messages to numbers. Account inherits from Object, like everything else in Smalltalk (Object inherits from nil). The above expression declares the Account class.

The class Account now has access to Object's methods, one of them being the primitive method new which creates a new instance of a class. In our Account class we have an instance variable balance that must be initialized to zero at the creation of a new Account. This requires us to override the new method such that it calls an initialization function when new is called.

We may open a class and insert methods into it any time we wish, the expression for opening a class is !ObjectName class methodsFor: 'description'!, the word class decides whether the methods declared operate on instances of the object(if omitted) or the class itself(if included). This expression opens a block, in this block we place a method's declaration, and close the block with an exclamation mark. We now override the new method to initialize newly created Account objects, the initialize method is declared for instances of the Account, setting balance to zero. A method is declared for accessing an Account's balance (^ is the return operator) and one further method is needed to set the balance to whatever we want.

!Account class methodsFor: 'instance-creation'!
new
    ^(super new) initialize
! !

!Account methodsFor: 'initialization'!
initialize
    balance := 0
!

balance
    ^balance
!

balance: x
    balance := x
! !

Now, using the Account class.

> a := Account new!
an Account
> a balance!
0
> a balance: 20!
20
> a balance: 20 + (a balance)
40

As a consequence of the way objects respond to method calls, expressions in Smalltalk are interpreted from left to right with no precedence. It can be suprising if you are used to normal mathematical precedence rules.

> 2+4*3!
18
>2+(4*3)!
14

The remaining Smalltalk syntax is the block. A block is a piece of code between square brackets, blocks represent anonymous functions in Smalltalk. Blocks are called using the value:# method.

> []!
a BlockClosure
> [:x | x] value: 3!
3
> [:x | x + 1] value: 3!
4
> 1 to: 5 collect: [:x | x + 1]!
#(2 3 4 5 6)

This uniform, homoiconic syntax means that extensions in the language are indistinguishable from native syntax and makes metaprogramming very easy, this is why many Smalltalkers say that Smalltalk can incorporate other programming language features simply as libraries.

Feb 26, 2010

Learning Ursala (3)

In the last post I introduced constructors in Ursala. In cooperation with deconstructors, constructors may be used in pointer expressions to take apart and then put together data (deconstruct and reconstruct data).

An example of this type of manipulation was already presented in the previous post.

    ~&rlX (1,2)
(2,1)

The above example first deconstructs the pair into its right and left component respectively and then constructs a new pair from these values.

Since constructors take two arguments to the left, a problem occurs when chaining complicated expressions. An expression such as ~&httC is invalid because the C constructor takes only 2 arguments (tt in this case), so the C constructor attempts to cons 2 tails of a list (an invalid operation) and h is then used to address this result.

The P constructor simplifies the writing of complex pointer expressions. P takes two deconstructors on its left and bonds them as one indivisible decontructor, applying them in their respective orders. If you know J, this is analogous to the bond verb (@)

    ~&httPC <1,2,3,4>
<1,3,4>
    ~&lrPrlPX ((1,2),(3,4))
(2,3)

Because pairing (or bonding) deconstructors in such a way is very common, Ursala allows you to place a natural number in a pointer expression to indicate the number of consecutive P constructors.

    ~&htt1C <1,2,3,4>
<1,3,4>
    ~&htttPPC <1,2,3,4>
<1,4>
    ~&httt2C <1,2,3,4>
<1,4>

Two other constructors are provided, G (glooming) and I (pairwise relative addressing). Both these constructors accept two left expression arguments. It is easier to discuss these constructors in terms of the transformations they perform on their arguments.

The G constructor transforms an argument of the form (a,(b,c)) into one of the form ((a,b),(a,c)). To quote from the Ursala manual [pdf], "a copy of the left is paired up with each side of the right". A consequence of this is that the expression ~&lrlPXlrrPXX is equivalent to ~&lrG. For example:

    ~&llPrX ((1,2),(3,4))
# (a,(b,c))
(1,(3,4))
    ~&llPrG ((1,2),(3,4))
# ((a,c),(a,d))
((1,3),(1,4))

The I constructor's actions are specified for four cases outlined in the table below.

The function of the I constructor is unstable right now, it is subject to future change and any uses deviating from the table are considered unspecified behaviour.

(Continued later)

Feb 24, 2010

Learning Ursala (2)

Ursala's pointer expressions have nothing to do with pointers in the C sense(the name doesn't help in alleviating that confusion), they are so named because they are used to address(or point to) elements of a data structure.

Pointers are created using the ampersand character (&). An ampersand without arguments represents the identity pointer. The tilde character (~) is used to invoke pointer expressions, it evaluates a pointer to the function induced by it. To understand what this means, the different types of pointer expressions must be discussed.

Deconstructors are one type of pointer expression, they are used to deconstruct a data structure.
They take a data structure as an argument and return a component of the argument as a result. Functions such as tail and head in functional languages can be thought of as deconstructors. Ursala offers a rich set of deconstructors for manipulating various data types.

To illustrate the idea of pointers more clearly, let us take an example of manipulating a list. Lists in Ursala are comma separated values(numbers, strings, ...) delimited by angle brackets. To ease in reading, any line of code preceded by 4 spaces is an expression entered at the prompt, the result is displayed below the input with no spaces preceding it. Comments are prefixed by a #.


    <1,2,3,4>
<1,2,3,4>
    # identity pointer
    &
&
    # call the identity pointer
    ~& 
<0,0>
    ~& <1,2,3>
<1,2,3>
    ~&h <1,2,3>
1
    ~&t <1,2,3>
<2,3>

Looking at the examples above, it is clear that h and t represent the head and tail deconstructors respectively. Deconstructors may be chained together to form more complex deconstructors.


    ~&tt <1,2,3,4>
<3,4>
    ~&th <1,2,3,4>
2
    ~&ttt <1,2,3,4>
<4>

Since the h and t deconstructors operate on lists, such an expression is invalid.


    ~&ht <1,2,3,4>
0 # zero represents false or nil in Ursala.

It is now clear how we can extract single elements from a data structure, but what if we want to extract more elements? We may achieve this by arranging the pointer expressions in the form we wish to receive the final output. For a deconstructor returning a tuple data structure, we arrange the pointer expressions in the form of a tuple. For a deconstructor returning a list, Ursala provides a semicolon as syntactic sugar (think of it as a cons operator) which joins a head(single element) on the left of the semicolon to the tail(a list).


    ~(&ll, &rr) ((1,2),(3,4))
(1,4)
    ~&h:&tt <1,2,3,4>
<1,3,4>

The need for a less clumsy syntax when manipulating complex compound deconstructors is the reason constructors exist, they are the second type of pointer expression in Ursala. A constructor is a combinator which takes deconstructors as arguments, manipulating them in such a way as to construct a data structure from their results.

The X constructor takes two consecutive deconstructors on its left hand side and groups them to form a pair.


    ~&rlX (1,2)
(2,1)

There are constructors for each data structure, the C constructor (or cons) is used to join the result of two pointer expressions as a list. Constructors allow us to chain more deconstructors together, forming more complex selections.


    ~&htC <1,2,3,4>
<1,2,3,4>
    ~&hrlX <(1,2),(3,4)>
(1,1)

A table of the various data structures and their corresponding deconstructors/constructors is available in the Ursala manual.

Later I will be discussing more of the available constructors and deconstructors.

(Continued later)

Feb 23, 2010

Learning Ursala

I have recently been interested in combinatory logic, it is a notation developed to remove the need for explicit variables in function definitions. As you may have guessed, this dramatically reduces code size.

The essence of combinatory logic is to manipulate a set of data using function composition without having to reference the data in each of the functions. Combinatory logic is a form of lambda calculus, which is the basis of the functional programming paradigm(to use a cascade of functions to transform an input into a required output).

A combinator is a higher-order function that accepts other functions as arguments and produces a function application as output.

The language J implements some of the combinatory primitives discussed by Haskell Curry. Of particular interest are the concepts of hooks and forks in J.

Let us define a function g, that takes a number and calculates the sum of its square and factorial. An explicit definition of this function:

g(x) = (x! + x^2)

The concept of a fork defines how a series of adjacent functions is applied to an argument. In J, if f g and h are functions, a fork is reduced in the following manner.

(f g h) x = f(x) g h(x)

while a hook is reduced in the following manner:

(g h) x = x g h(x)

We may use this concept to shorten the explicit function g(x) defined above:

g=: ! + *:

Note that these combinators are implicit in J. Grouping two functions applies a hook combinator to them, grouping three functions ( a train in J terms) applies a fork to them. Various other combinators are present in the language that remove the need to explicitly reference variables in function compositions. This is one of the reasons J's syntax is so terse.

Ursala is another language that implements a form of combinatory logic, although it differs considerably from J's idea. Ursala solves the issue of eliminating variables from functions by using a form of variable addressing called pointer expressions.

(Continued later)

Hello World!

By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and in effect increases the mental power of the race.

- A. N. Whitehead