
-- Clunk --

Clunk is a language for putting shapes together.  You define ASCII
shapes, and they get placed on a grid according to some rules.  These
rules are simple - (1) shapes must match other shapes at the edges
where they touch, and (2) some shapes can't be placed unless they can
touch others in enough places.  Clunk is nondeterministic.  Clunk has
no input or output, but its execution state is intended to be viewed.

This distribution contains an interpreter in perl.  Usage is:

  clunk <filename>

This interpreter uses an 80x24 grid and will probably look best in an
80x25 terminal.  It tries to use form feeds to clear the screen.

There are also some example programs:

- tree.clunk, curl.clunk and road.clunk just draw pretty patterns,
which is what this language does best.

- hello.clunk is a sort of hello world program.

- binadd.clunk demonstrates how clunk can add binary numbers.  The
bits are the letters 'I' and 'O', and the (four-bit) operands are read
left to right (least significant bit on the left).  The result
eventually appears at the top.

Be aware that these programs can take a while to perform an execution
step sometimes (e.g. maybe a minute or two on a P166).  This is partly
because the interpreter uses randomness to illustrate the
nondeterminism the language allows, partly because I didn't optimise
it much, partly because it is in perl, and partly because of the
nature of Clunk.

Licence: this stuff may be freely distributed and modified by anyone
who wants to, especially the Esoteric Awards Committee.

======================================================================


-- The Clunk Language Specification --

The source is considered as a grid of characters, and consists of a
number of ASCII shapes.  Characters 32-126 are the only ones allowed,
except for line endings, which don't count as part of the source
except to help represent a grid.

A "shape" is a maximal set of contiguous non-space characters.  A
non-space character is therefore only ever in one shape.  Diagonal
characters are not contiguous.

Shapes may appear anywhere in the source.

The start shapes are all the shapes containing the '@' character,
except if no shapes in the source contain the '@' character, in which
case all the shapes are start shapes.

The "connectitude" of a shape is given by the sum of all digits
contained in the shape.  If a shape does not contain any digits, its
connectitude is 1.  A connectitude of 0 is allowed (and is achieved by
putting one or more 0's, and no other digits, in the shape).

The "field" is an infinite grid of squares, except if the implementor
is lazy, in which case it may be an 80x24 torus, or some other size of
torus, or a finite rectangular grid, or some more exciting shape.
Initially the field is all spaces.

An execution step consists of a shape being placed on the field - that
is, its characters copied in the same relative positions.  Shapes are
not rotated, reflected, or otherwise transformed when being placed on
the field.  Shapes once placed are never removed.

A shape may not overlap a shape that is already on the field.

A character of a shape being placed is said to "abut" a non-space
character already on the field if it is next to it, not including
diagonally.

When a shape is placed on the field, each character in that shape that
abuts any characters must match all characters that it abuts.

A shape can only be in a position placed on the field if at least N of
its characters would abut characters already on the field, where N is
the connectitude of the shape being placed.  Therefore, this rule puts
no restriction on where shapes with a connectitude of 0 may be placed,
and says that shapes with a connectitude of 1 must touch something.
(The count is of the number of characters in the shape being placed,
not the total number of characters that they will abut, which could be
greater.)

At each execution step the shape and position are chosen
nondeterministically from the set allowed by the above rules.  If no
shape can be legally placed in any position, the program terminates.
(An implementation using a finite field must therefore always
terminate.)

At the first execution step, the shape which is placed must be one of
the start shapes, and any position is allowed for it - the
connectitude rule is ignored for this step.

The language has no input or output.  Implementations are therefore
encouraged to display the state of the field during execution, or at
least when the program terminates.

A "Boring Clunk Implementation" is one satisfying the above rules.  A
"Visual Clunk Implementation" is one that also displays the field
during execution.  A "Multimedium Clunk Implementation" is a Visual
Clunk Implementation which also makes a "clunk" sound whenever a shape
is placed on the field.  The "clunk" sound must be approved by the
appropriate standards body.


======================================================================


-- Unconvincing sketch of proof of possible Turing Completeness --


Assume a Clunk implementation with an infinite plane.

Consider a standard Turing machine with one semi-infinite tape (if
semi-infinite is the word I want to mean infinite in one direction
only), a tape head position, a finite alphabet of tape symbols, a
finite number of internal states, and a state transition function
which for each (symbol under head, internal state) pair, maps to a
symbol to replace the one under the head, a new internal state, and a
move (left, right, or stay in the same place).

We will recreate the whole state of the machine, including tape, on
the clunk plane for each execution step.  It will look like this:

  SthththtHth

Obviously each character here is standing for a whole bunch of clunk
characters.  Approximately, each "th" or "tH" is a single clunk shape,
and they are all different.  And they carry lots of extra data so the
right shapes can match each other too.

The S is some kind of start marker.  The t's are tape cells with an
encoding of the symbol currently in that tape position.  The h's are
placeholders beacuse the tape head is not at these positions.  The H
is next to the tape cell where the head is.  As well as indicating
where the tape head is, the H contains an encoding of the machine's
internal state.

We don't represent the whole infinite tape, but only up to the
rightmost position the tape head has ever reached.  Any tape position
off the right hand end is assumed to hold a default alphabet symbol.

The next state of the machine will be built on top of this one.  Here
the tape head has moved to the right.

  SththththtH
  SthththtHth

Our clunk shapes are defined so that for positions which aren't where
the tape head is or next to it, only an identical "th" encoding the
same tape symbol can be put on top of a "th".  For the position with
the tape head, appropriate shapes can be defined so that if the state
and symbol are such that the tape head should stay still, a "tH" will
be placed in the new row, and the "t" will be the appropriate new
symbol.  Similarly the positions next to the tape head may have a "tH"
placed if the head is to move to them.  The "t" part would remain the
same in this case, and a "th" with a possibly different "t" part would
be placed above the old tape head position.

The new shape that is placed above each position has to be next to the
tape symbol and tape head/state information for its tape cell and the
ones on either side.  I think this may be physically impossible, so
some duplication of information will be required.

A possible scheme: We ensure that the new pieces are placed from left
to right using connectitude numbers.  So in the upper row, the piece
with an "A" in it is placed first, then "B", then "C".  Each of a, b,
c, d and z represents all the old symbol+head+state data for a tape
cell ("th" or "tH" above).  Each new piece has data which must match
below, to the left, and to the right.  It also copies some old data
from below itself to assist the next piece to the right.  A, B, C are
the new values for the three tape cells.


#A#     #B#     #C#     
A #     B #     C #
# #     # #     # #     
# ##### # ##### # ##### 
z     a a     b b     c 
#a### # #b### # #c### # 
    # #     # #     # # 
#a# # # #b# # # #c# # #
a # # b b # # c c # # d
# # ### # # ### # # ###
# #     # #     # #


We may need a large number of pieces to get all the combinations
required, especially as we have to deal with things next to the start
and end as well.  But it is still a finite number and Clunk has no
limit on the number of shapes it allows.

If every row is built up left to right, then one row can start
building before the one below it is complete.  This gives a weird kind
of lazy Turing machine.  But if we say that the TM terminates when the
tape head moves off the left hand end, then we can make the Clunk
model stop building rows when this happens (with some incompatible
shape instead of the usual start marker).  Then the unfinished rows
will eventually all get filled up, at which point the Clunk program
terminates.

Alternatively we might be able to build up the rows boustrophedon
style, then only one row could be building at once.

The initial machine state is easily set up as one great big start
shape.  By convention the final state would be read off the topmost
row.  And there you go, it's a Turing machine.

Q.E.D.

are you convinced?  Me neither.  It might work though.
