Language Requirements

From dis-Emi-A


For a language to be usable, it needs a set of overriding principles, or goals. These must be taken into account when implementing the desired features, especially with respect to eliminating the common [Programming Errors].

Computable in polynomial time

All constructs of the language must be realizable in polynomial (realistically no more than N^2, though is that possible with auto-typing?) time and space. This ensures that no construct can be used inefficiently by the user -- that is, the user can freely use any construct and not have to worry about accidentally producing a program which will either take forever to compile or to run.

The N in this consideration should be taken as a measure of the code and not of the input passed to that code at run-time (obviously it will always be possible to write non-polynomial algorithms in a programming language). This limitation implies that the compiler/VM is able to decipher the logic/flow of the code in polynomial time.

This restriction blocks the use of global auto-typing, limiting it to what can be exposed in the signatures of functions; global auto-typing would require exponential time for function linking/expansion. This needs to be explained further, especially as to why global auto-typing is rarely needed in practice.
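As a rough illustration of how a fully inferred type can grow much faster than the code that produces it, the sketch below assumes a hypothetical pair function that returns its argument twice, and writes the resulting type out in full with no sharing of sub-terms. The names pair_type, inferred_type and written_size are invented for this example and are not part of the language design.

```python
def pair_type(inner):
    """Type of pair(x) = (x, x), written out in full (no sharing)."""
    return ("pair", inner, inner)


def inferred_type(depth, base="int"):
    """Type of pair(pair(...pair(x)...)) nested `depth` times."""
    current = base
    for _ in range(depth):
        current = pair_type(current)
    return current


def written_size(type_term):
    """Number of nodes needed to write the type out as a tree."""
    if isinstance(type_term, str):
        return 1
    _, left, right = type_term
    return 1 + written_size(left) + written_size(right)


# The source grows by one call per level; the written-out type doubles.
for depth in (1, 5, 10):
    print(depth, written_size(inferred_type(depth)))
```

Real inference engines can share sub-terms, so this is only a picture of the worst case, not a measurement of any particular algorithm.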

Convertible to linear time

The above leaves potentially horrible performance holes in compiled code: as limiting as N^2 is, N can be extremely large for complex programs. Therefore there must also be a step which, through static analysis of the code, produces a form that is computable in linear time/space. This step itself must be computable in polynomial time.

Typically this step will be the compiler, and thus far the biggest conversions from polynomial time to linear time are auto-typing and function linking.

This requirement is split into a second step because that allows for the existence of interpreters which do not perform the static analysis and are able to add/remove functions dynamically.

This probably also means, however, that Scheme's angelic operator amb cannot be allowed... :(
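To see why amb sits badly with a linear-time requirement, here is a minimal sketch that emulates it by brute-force backtracking. The helper amb_solve is a hypothetical stand-in written for this page, not Scheme's actual operator, and it simply tries every combination of choices until the constraint holds.

```python
from itertools import product


def amb_solve(domains, constraint):
    """Return (first satisfying assignment or None, candidates tried)."""
    tried = 0
    for candidate in product(*domains):
        tried += 1
        if constraint(*candidate):
            return candidate, tried
    return None, tried


digits = range(1, 11)

# Three "amb" choices of ten values each: find a Pythagorean triple.
solution, tried = amb_solve(
    [digits, digits, digits],
    lambda a, b, c: a < b and a * a + b * b == c * c,
)
print(solution, tried)

# An unsatisfiable constraint forces the whole 10 * 10 * 10 search space.
nothing, exhausted = amb_solve([digits, digits, digits], lambda a, b, c: False)
print(nothing, exhausted)
```

Each extra choice point multiplies the search space, so the cost is exponential in the number of amb expressions unless the implementation can prune.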

Note: This rests on the assumption that garbage collection can be done in linear time (which, from a brief search, appears to be the case). Should this not be true we might have to relax this restriction.

Deterministic

For the most part it is clear that a language needs to be deterministic, otherwise in practice it may be somewhat useless.

However, there are situations where non-determinism may be a good choice. Such situations arise when one performs calculations, or requests, and needs only an approximate answer, or an approximate match. In these cases it seems to be okay if the result differs depending on the operating context.

These are surely exceptions though, and while they are worth looking at we must assume that the basics of the language are deterministic.

This determinism may not apply to the time/space of execution however, since compilers/interpreters may use different techniques and optimizations. This would additionally mean that the precision of the results may vary. So determinism isn't an absolute requirement.
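A small sketch of how precision can vary with evaluation order, using ordinary IEEE double arithmetic as exposed by Python floats. The regrouping here stands in for what an optimizer might do; it is an assumption for illustration, not a statement about any particular compiler.

```python
import math

values = [0.1, 0.2, 0.3]

# The same sum, grouped two different ways.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

print(left_to_right == right_to_left)              # exact comparison
print(math.isclose(left_to_right, right_to_left))  # tolerant comparison
```

The two groupings disagree in the last bit, so a language that allows such reordering can only promise results up to some tolerance.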

Unambiguous

For the purpose of clarity, portability, and maintenance, there can be no language construct, form, name, or other use which could be interpreted in more than one way. Most languages provide this now, but only in a limited fashion (for example C++ allows variable declarations that mask outer ones; scoping rules resolve the ambiguity but do not remove it). It would be nice to remove this as well, but without causing undue strain.
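The same kind of masking can be shown in Python, used here only as an analogy to the C++ case. The names limit and below_limit are made up for the example.

```python
limit = 10  # module-level name


def below_limit(values):
    limit = 3  # a new local name that silently masks the outer `limit`
    return [v for v in values if v < limit]


print(below_limit([1, 2, 5, 8]))  # filtered with the inner limit
print(limit)                      # the outer limit is unchanged
```

The scoping rules pick the inner name without complaint, which is exactly the "resolved but not removed" ambiguity described above; a stricter language could reject the second declaration outright.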

Pure

In its pure form the language requires a program to be entirely unambiguous: there are no rules of precedence, and either a single match for an expression is found or it is an error.

Precedence

In practice however one might be mixing code together and occasionally there is a name overlap which cannot be easily resolved (for example in code which you're not allowed to modify). In these cases there will be rules of precedence, but on the understanding that these are "weak errors" and that technically the program is incorrect.

Operators / Math

In order not to be impossible to use, there must be some set of allowed precedence considerations, particularly in the use of math operators -- otherwise we'll end up like Lisp with an endless series of )'s in the code (which reduces readability).
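As an illustration of what a precedence rule buys, the sketch below uses Python's standard ast module to show that the unparenthesised expression and its fully parenthesised form parse to the same tree. Python's own precedence table is used purely as an example; it is not a proposal for this language's rules.

```python
import ast


def tree(expression):
    """Textual dump of the parse tree for a single expression."""
    return ast.dump(ast.parse(expression, mode="eval"))


implicit = tree("2 + 3 * 4")
explicit = tree("2 + (3 * 4)")
other = tree("(2 + 3) * 4")

print(implicit == explicit)  # precedence supplies the missing parentheses
print(implicit == other)     # the alternative grouping is a different tree
```

A small, fixed table of this kind keeps arithmetic readable while still giving every expression exactly one interpretation.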

Incorruptible

No construct of the language may allow the program to behave in a manner not dictated by the standard, nor may any construct be allowed to have an undefined behaviour.

This basically means that you cannot corrupt the environment in which you are running if you stick to using the standard set of commands and library routines (obviously calling non-standard native routines is uncontrollable, so this cannot extend to those).

This is in stark contrast to C, where you can assign data to arbitrary memory locations.
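A minimal sketch of the defined-behaviour alternative, using Python lists as a stand-in: an out-of-range write raises a well-defined error instead of touching neighbouring memory. The names buffer and store are invented for the example.

```python
buffer = [0] * 4


def store(index, value):
    """Write into the buffer; report failure rather than corrupt anything."""
    try:
        buffer[index] = value
        return True
    except IndexError:
        # The out-of-range write is rejected and the buffer is left intact.
        return False


print(store(2, 7))   # in range
print(store(10, 7))  # out of range
print(buffer)
```

The cost is a bounds check on each access, which a compiler may be able to remove when static analysis proves the index is in range.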

One would like to say Java provides this, but it becomes fuzzy if you start using reflection. (It is important to say that should this language also provide reflection the same fuzziness will appear).

Allow safe parallelism

Rather than stating that threads must be supported (which are a concrete concept rather than an abstract one), we shall state that some manner of parallelism and asynchronous behaviour must be provided by the language.

It is very important that the language itself provides this support and not a standard library, for the ordering of instructions/memory is very dependent on how the code will be threaded. (Yes, there are lots of programs out there in existing languages that use threads and are not actually guaranteed to work by language specification -- the good will of the compiler not to optimize too much keeps those programs working).
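For a concrete picture of the kind of guarantee meant here, the sketch below uses Python's standard threading module to protect a shared counter with a lock. It only illustrates an explicitly synchronised update; it says nothing about how this language would expose parallelism, and the names counter, counter_lock and add_many are hypothetical.

```python
import threading

counter = 0
counter_lock = threading.Lock()


def add_many(times):
    """Increment the shared counter `times` times under the lock."""
    global counter
    for _ in range(times):
        with counter_lock:
            counter += 1


workers = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(counter)
```

Without the lock, the read-modify-write in the loop is only as safe as the runtime happens to make it, which is the reliance on good will that the paragraph above warns about.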

Allow Debugging

It seems trivial, but regardless of how "perfect" we make a language there will always be a need to debug it somehow.

Be Portable

The intrinsics of a language, and its standard library, must work on a wide variety of platforms. This could be stated more clearly as: the code must be compilable to all common targets of gcc/linux.

This prevents the use of exotic hardware or OS features which inevitably lead to lock-in and disappointment. The language, however, must nonetheless provide a manner with which to use such specialized features and a manner in which the program itself can determine whether such features exist (this hints towards some kind of compile-time, or run-time, capabilities checking).
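Run-time capability checking might look something like the following sketch, which uses Python's os module as an example. os.sched_getaffinity exists only on some platforms, so the code probes for it and falls back to the portable os.cpu_count; the wrapper name usable_cpu_count is an assumption made for this example.

```python
import os


def usable_cpu_count():
    """Prefer the platform-specific call; fall back to the portable one."""
    if hasattr(os, "sched_getaffinity"):
        # Specialised feature: CPUs this process is actually allowed to use.
        return len(os.sched_getaffinity(0))
    # Portable fallback; cpu_count() may return None, so default to 1.
    return os.cpu_count() or 1


print(usable_cpu_count())
```

The program stays portable because the specialised path is optional and the check is explicit rather than a build-time assumption.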

Integrate with existing libraries

The language must provide a manner with which to integrate with existing libraries, without needing modifications to those libraries (in the general case).

It is possible however that the program may need to run a code generator/stubbing tool, in order to interact with the other languages.
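As one existing example of integrating with an unmodified library, the sketch below calls the C library's strlen from Python through the standard ctypes module. It assumes a POSIX-like system where the C library can be located or is already loaded in the process; the wrapper name c_strlen is invented here. The hand-written argument and return types play the role that a generated stub would play.

```python
import ctypes
import ctypes.util

# Locate the C library; on POSIX a None path falls back to the
# symbols already loaded into the running process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Hand-written "stub": declare the foreign signature size_t strlen(const char *).
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Length of a NUL-terminated byte string, computed by the C library."""
    return libc.strlen(data)


print(c_strlen(b"hello"))
```

Note that a call like this is exactly the kind of non-standard native routine that falls outside the incorruptibility guarantee.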
