Computing Pages

by Francesc Hervada-Sala


On Text Structure

Last update on Sat Apr 23, 2011.

Text Integrity

It should be possible to define integrity rules that apply to all instances of a type.

For example:

^website {
    ^(1) title : ustring
    ^(1-)stylesheet : ustring
}

Each website must have exactly one title and one or more stylesheets. Compilation should abort if a website is defined that does not fulfill this.
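A minimal sketch of how such a cardinality rule could be checked, written in Python rather than UTL. The names `Cardinality`, `WEBSITE_RULES` and `check_cardinality` are hypothetical and only illustrate the idea; they are not part of UText.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    """Allowed number of children with a given role (hypothetical helper)."""
    minimum: int
    maximum: Optional[int]  # None means unbounded, as in "(1-)"


# Rules mirroring the website example: ^(1) title, ^(1-) stylesheet.
WEBSITE_RULES = {
    "title": Cardinality(1, 1),
    "stylesheet": Cardinality(1, None),
}


def check_cardinality(children, rules):
    """Return a list of violation messages; an empty list means valid.

    `children` is a list of (role, value) pairs belonging to one unit.
    """
    counts = {}
    for role, _value in children:
        counts[role] = counts.get(role, 0) + 1

    errors = []
    for role, rule in rules.items():
        n = counts.get(role, 0)
        if n < rule.minimum:
            errors.append(f"{role}: expected at least {rule.minimum}, found {n}")
        elif rule.maximum is not None and n > rule.maximum:
            errors.append(f"{role}: expected at most {rule.maximum}, found {n}")
    return errors
```

A compiler built along these lines would abort whenever the returned list is non-empty.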

Such integrity validation requires a sort of "transaction" capability. (An implicit transaction committing when leaving each level or at the end of each OS file read would be too inflexible.) One can commit a transaction only if all validation succeeds. Only committed data can be read.

The possibility to require a constant minimum and maximum number of instances would surely be useful, but validation should be generalized, perhaps as a post-load trigger that gets called when committing a unit. One binds a unit to a post-load transformation, which gets called by the system and has the chance to return an "abort" signal that causes the commit to fail.
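The transaction idea can be sketched as follows, again in Python and under loud assumptions: `Store`, `Transaction` and `CommitAborted` are invented names, and a validator is assumed to be a function that receives the pending units and returns an error message or `None`.

```python
class CommitAborted(Exception):
    """Raised when a validator vetoes the commit (hypothetical signal)."""


class Store:
    def __init__(self):
        self._committed = {}   # the only data visible to readers
        self._validators = []  # post-load checks, run at commit time

    def add_validator(self, validator):
        # A validator takes the dict of pending units and returns
        # an error string to abort, or None to accept.
        self._validators.append(validator)

    def begin(self):
        return Transaction(self)

    def read(self, name):
        # Uncommitted units are never reachable from here.
        return self._committed[name]


class Transaction:
    def __init__(self, store):
        self._store = store
        self._pending = {}

    def enter(self, name, unit):
        # Stage a unit; it stays invisible until commit succeeds.
        self._pending[name] = unit

    def commit(self):
        # Run every validator first; any error aborts the whole commit.
        for validator in self._store._validators:
            error = validator(self._pending)
            if error is not None:
                raise CommitAborted(error)
        self._store._committed.update(self._pending)
        self._pending.clear()
```

In this sketch a failed commit leaves the pending units in place, so the caller may correct them and commit again. Whether a real implementation should instead discard them is an open design choice.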

Text Triggers

Text transformations can be bound to a text trigger, so that they get called when a particular event occurs. For example:

Load-Trigger. Fired when the unit is requested to be loaded. It is responsible for parsing source files, or otherwise obtaining the corresponding text, and entering it.

Preprocess-Trigger. Fired when reading a unit to preprocess the UTL before parsing it.

Postload-Trigger. Fired after loading a unit. Used for post-processing and validation. It can abort the load by returning an error signal.

Output-Trigger. Responsible for conversion, for example generating HTML pages, OpenOffice documents or LaTeX files.

Perhaps not a trigger. Compare to a parser. Or is a parser a trigger, too? What about a "cast" operation to transform a unit from one type into another?

The triggers are bound to a type and get called by the system when the event occurs at any instance of it.

Maybe it should be possible to bind triggers to single units, too.
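Binding triggers to types could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the `TriggerRegistry` class, the `ABORT` sentinel, the event name strings, and the representation of a unit as a dict with a "type" key.

```python
ABORT = object()  # hypothetical "abort" signal a handler may return


class TriggerRegistry:
    def __init__(self):
        # (type name, event name) -> list of bound transformations
        self._handlers = {}

    def bind(self, type_name, event, handler):
        """Bind a transformation to an event on every instance of a type."""
        self._handlers.setdefault((type_name, event), []).append(handler)

    def fire(self, event, unit):
        """Call the handlers bound to the unit's type, in binding order.

        Returns False as soon as a handler returns ABORT, True otherwise.
        """
        for handler in self._handlers.get((unit["type"], event), ()):
            if handler(unit) is ABORT:
                return False
        return True
```

Binding to single units as well would only require a second lookup table keyed by unit identity instead of type name.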

Text Query

Let us now think about what semantics query languages should have, independent of the way of coding expressions.

One can define conditions on single levels and relate them to each other. Conditions on single levels:

Conditions relating levels:

Each condition can be negated.

Text Formula

Update March 8, 2010. After trying to implement text with multiple parents, roles and types (see below), I now think this is not the right way. Not only is it difficult to implement, but the implemented model is ugly: there are "units" on the one side and "relationships" between them on the other side. I think it is cleaner to have just a single rule for defining units as a four-way relationship without exceptions. The question below no longer seems to me an open question, but I leave it here for future reconsideration.

In UText/1 each text unit has exactly one parent, one role and one type. I think the real text structure is more general: a single unit can participate in a text more than once, each time having one parent, one role and one type.

Isn't it somehow obvious? We can establish relationships between symbols. Each time we say something about one of them, but we can make lots of sentences with each symbol: we say this and we say that. (But perhaps this is incorrect; perhaps each text unit should be unambiguous by nature, and we should explicitly build compound units instead, each component being a partial assertion.)

The implementation of the text structure is of course not the problem. Instead of an array of 3 scalars parent, role, type, one needs an array of 4 scalars: unit, parent, role, type. The same unit occurs more than once in this array structure.

But I cannot foresee whether this will work well or whether difficulties will arise. What about navigating the structure? What about selectors? Perhaps one should just implement the structure in a mock-up and see.
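Such a mock-up might start from something like the Python sketch below, which stores the text as rows of four scalars (unit, parent, role, type) and offers two naive navigation helpers. The names `Occurrence`, `Text`, `children` and `occurrences` are hypothetical, and the linear scans are only meant for experimenting, not for performance.

```python
from typing import NamedTuple


class Occurrence(NamedTuple):
    """One participation of a unit in the text: four scalars."""
    unit: str
    parent: str
    role: str
    type: str


class Text:
    def __init__(self):
        self._rows = []  # the same unit may appear in several rows

    def add(self, unit, parent, role, type):
        self._rows.append(Occurrence(unit, parent, role, type))

    def children(self, parent):
        """Navigate downwards: all occurrences placed under a parent."""
        return [row for row in self._rows if row.parent == parent]

    def occurrences(self, unit):
        """Navigate upwards: every place where a unit participates."""
        return [row for row in self._rows if row.unit == unit]
```

Entering a unit `u` once under `p` and once under `q` yields two rows for `u`, which is exactly the case that the three-scalar structure cannot express.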

Single Type, Multiple Parents and Roles

It probably makes sense for each unit to have a unique type, which describes it internally. This unit can then participate as a child in many parents, playing a role each time.

=p {
    =u ~r :t
}
=q {
    =p.u ~s :t
}

Either one must always repeat the type, or one gives it only once, but the latter depends on feed order, which is not so good.

Perhaps one needs a notation that, instead of fixing the parent and enumerating its children, fixes the unit and enumerates all its parents.

=u :t |{
    =p ~r
    =q ~s
}

Note that the role is here the child's role, not the parent's!

Or in a single line:

=u :t |=p ~r
=u :t |=q ~s
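The single-type constraint can be illustrated with a small Python sketch. It assumes a hypothetical `SingleTypeText` store that records one type per unit plus a list of (parent, role) pairs, and rejects any later statement that gives the unit a different type. Repeating the type on every line, as in the single-line notation, makes the check independent of feed order.

```python
class TypeConflict(Exception):
    """Raised when a unit is re-entered with a different type."""


class SingleTypeText:
    def __init__(self):
        self.types = {}  # unit -> its one type
        self.links = {}  # unit -> list of (parent, role) pairs

    def add(self, unit, type, parent, role):
        # The first statement fixes the type; later ones must agree.
        known = self.types.setdefault(unit, type)
        if known != type:
            raise TypeConflict(f"{unit} is already of type {known}, not {type}")
        self.links.setdefault(unit, []).append((parent, role))
```

Feeding the two lines `=u :t |=p ~r` and `=u :t |=q ~s` would correspond to `add("u", "t", "p", "r")` followed by `add("u", "t", "q", "s")`.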

Multiple Parents, Roles and Types

The general case.

=u :t |=p ~r
=u :v |=q ~s