ugBASIC User Manual

Data types

ugBASIC offers many built-in data types, some primitive and some complex. Among them are integer, floating-point, and string variables.

Defining variables

ugBASIC supports various data types. A variable can be given a specific data type with the DIM command, with the VAR command, or with a suffix on the variable name. The default floating-point implementation also supports decimal math, and the DIM keyword can be used to build array variables as well.

Basic syntax:

DIM x AS INTEGER
VAR y AS STRING
z = "string"

Integer types

In ugBASIC, the integer types are the ones closest to what the retrocomputer hardware can manipulate directly. Integers are numbers without a decimal point. They are stored in one or more bytes, depending on the maximum value they must hold, and can be signed or unsigned.

KEYWORD                     SUFFIX   RANGE                      SIZE (bytes)
BYTE, CHAR                           0..255                     1
SIGNED BYTE                 @        -127..127                  1
WORD                                 0..65535                   2
INT, INTEGER, SIGNED WORD   %        -32767..32767              2
DWORD                                0..4294967295              4
LONG, SIGNED DWORD          &        -2147483648..2147483647    4
ADDRESS                              0..65535                   2
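
As a minimal sketch of how the keywords and suffixes listed above could be used, the following declares a few variables with DIM and then relies on suffixes instead. The variable names are illustrative only, and the suffix lines assume that a suffixed name can be assigned without a prior DIM.

```basic
DIM a AS BYTE : ' unsigned, 1 byte
DIM b AS SIGNED BYTE : ' signed, 1 byte
DIM c AS WORD : ' unsigned, 2 bytes
DIM d AS LONG : ' signed, 4 bytes
DIM e AS ADDRESS : ' 2 bytes, meant for memory addresses

f@ = -5 : ' "@" suffix: SIGNED BYTE
g% = 1000 : ' "%" suffix: INT (SIGNED WORD)
h& = 100000 : ' "&" suffix: LONG (SIGNED DWORD)
```

Picking the smallest type that covers the expected values keeps both memory use and computation time low on 8-bit targets.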

ugBASIC uses a default data type for integers whenever you do not specify a different one. Initially, the default type is INT (SIGNED WORD). However, it is possible to change the default type with the DEFINE DEFAULT TYPE command:

DEFINE DEFAULT TYPE LONG

When you write a numeric constant in the source code, such as 42, it is implicitly converted to a numeric type. The standard (wide) approach converts the number into the default data type, which normally takes 2 bytes. An optimized (narrow) approach is also available: it stores the number in the most compact type that can hold it, for example 1 byte.

The wide approach is the standard one, and it can be requested explicitly with the OPTION TYPE WIDE command. In this mode the number is converted into the default data type; with the default INT (SIGNED WORD) type, it occupies 2 bytes.

The narrow approach is selected with the OPTION TYPE NARROW command. In this mode the number is converted into the smallest data type capable of containing it, so 42 becomes a SIGNED BYTE value that occupies 1 byte.

Note that in narrow mode signed types are considered too, so the smallest suitable type may be the next larger one, because one bit is needed for the sign. For example, 150 is converted to SIGNED WORD, since it is greater than 127.

To change this behaviour, use the OPTION TYPE UNSIGNED command: 150 is then converted into a plain (unsigned) BYTE.
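
The narrow mode described above can be sketched as follows. This is only an illustration of the rules stated in the text: the variable names are hypothetical, and the comments describe the type chosen for each constant, not a verified compiler output.

```basic
OPTION TYPE NARROW

a = 42 : ' 42 fits in 7 bits plus sign: SIGNED BYTE, 1 byte
b = 150 : ' 150 is greater than 127: SIGNED WORD, 2 bytes
```

According to the text, with OPTION TYPE UNSIGNED the constant 150 would be stored as an unsigned BYTE instead, saving one byte.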

Other important facts about integer types:

  • BYTE and CHAR are the most efficient data types, since they match most closely what 8-bit CPUs handle natively. BYTE is typically used for array indices and loop counters; CHAR is used for single characters;
  • SIGNED BYTE is the signed version of BYTE: it supports negative values and boolean values (TRUE is -1 and FALSE is 0);
  • WORD offers a wider range of positive values while keeping computation fast;
  • INT (or INTEGER or SIGNED WORD) offers a wider range of values and supports negative numbers; it is fast, but not faster than WORD;
  • ADDRESS is similar to WORD, but it is meant for memory addresses, as used by the POKE and PEEK commands;
  • DWORD offers an even wider range of positive values, but it is slower than WORD;
  • SIGNED DWORD (or LONG) offers a wider range of values and supports negative numbers, although it is not faster than DWORD.

Numbers and numerals

Numbers (numeric literals) can be written in any of the following numeral systems: decimal, hexadecimal and binary. Each can be expressed in several ways. As described in the previous section, binary and hexadecimal numbers are treated as signed or unsigned depending on the default data type.

NUMERAL       FORMAT                                      EXAMPLE
Decimal       [-]...                                      42
Hexadecimal   $..., &..., 0x..., [0-9]...h, [0-9]...H     042h
Binary        %...                                        %11000110
Strings

Strings are series of ASCII characters, either static (fixed-length) or dynamic (variable-length). You define a static string by enclosing text in double quotes ("):

"this is a static string!"

You can assign a static string to a variable, and this turns it into a dynamic string:

x = "this is a dynamic string!"
MID$(x, 1, 4) = "that"
PRINT x : ' it will print "that is a dynamic string!"


Depending on the target, a dynamic string can be at most 255 or 127 characters long, and all dynamic strings together can take up to 1024 bytes. You can change this overall limit. Furthermore, ugBASIC can handle at most 256 (or 128) dynamic strings at the same time, and you can change this limit too. There is no limit for static strings.

To reduce memory usage, or to let a program handle more strings, you can set the maximum number of dynamic strings that the program can handle at once. This is done with the DEFINE STRING COUNT instruction:

DEFINE STRING COUNT 100

The maximum value is 128 on targets based on the Motorola 6809 and 256 on all other targets.

In general, each dynamic string needs 4 bytes to be fully described, in addition to the space occupied by its characters. So if the maximum number of simultaneous strings is set to 100, as in the previous example, the descriptions will occupy 400 bytes.

Likewise, you can set the maximum space allocated for the characters of dynamic strings. This is done with the DEFINE STRING SIZE instruction:

DEFINE STRING SIZE 1024

The maximum value depends on the available memory. Moreover, dynamic garbage collection requires twice the configured space. So, in the previous example, the memory actually used by strings will be 2048 bytes.
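
Putting the two settings together, a rough memory estimate can be worked out from the figures given above. The sketch below only combines the two instructions already shown; the total is an estimate under the assumption that the two areas simply add up.

```basic
DEFINE STRING COUNT 100 : ' 100 strings x 4 bytes each = 400 bytes
DEFINE STRING SIZE 1024 : ' 1024 bytes, doubled for garbage collection = 2048 bytes
```

Under that assumption, dynamic strings would take about 400 + 2048 = 2448 bytes in this configuration.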

Resources and buffers

In ugBASIC, graphical data, audio data and any other program resource are managed through specific data types that act as a "handle" for the real data. Since the language is isomorphic, it hides the meaning of these handles, so that you can only use them in an isomorphic way. The only resource that is not specialized is BUFFER.

KEYWORD     KIND
BUFFER      Generic memory area
COLOR       Color
IMAGE       Single image
IMAGES      Multiple frames on a single image
MUSIC       Music track (notes and instruments)
POSITION    Position on the screen
SEQUENCE    Multiple images as a single image
SPRITE      Sprite (hardware animable graphic)
THREAD      Thread (task) identifier
TILE        Single tile (character based)
TILES       Multiple tiles as single tile (character based)
TILEMAP     Map of tiles
TILESET     Set of tiles (character/graphic based)

The BUFFER data type represents a memory area of any kind, from spare memory to a large, specific resource. Currently, you cannot manipulate this data type other than by defining it and using it where needed. You can copy one buffer into another, assign it, compare it, or convert it into another kind of resource, such as IMAGE, but this is done by the compiler at compile time.

To define a buffer you can use the explicit method, like this:

x = #[42424242] : ' this defines a buffer of 4 bytes, each of $42 (hexadecimal)
x = #[42424242 : ' the same: the closing bracket can be omitted

Alternatively, you can define a BUFFER using a string. In that case, embedded sequences can be used to represent non-printing characters:

x = "{42}{42}{42}{42}" : ' this defines a buffer of 4 bytes, each of $42 (hexadecimal)

The advantage of strings is that they can be concatenated, and therefore split across multiple lines. Additionally, you can use auxiliary functions such as Z(...), which generates sequences of zeros.

You can also use the implicit method, loading the buffer from an external file:

x = LOAD("buffer.dat")

Now you can convert the buffer into an image or animation by "casting" it:

x = (IMAGE) LOAD("image.bin")
x = (IMAGES) LOAD("images.bin")


Note that this is not a "real" conversion, only a formal one. The data contained in "image.bin" and "images.bin" must already be valid binary image data for the current target. Since image formats differ deeply between targets, each being optimized for its hardware, a formal conversion is usually not enough and a substantial conversion is needed. This is done by the various LOAD primitives, which load and convert data at the same time, generating valid handles.

Bits

The ugBASIC language provides a set of instructions and primitives to handle the BIT data type, whose values are 0 or 1.

BIT variables can be used directly or in expressions. You can define BIT variables and arrays of BITs. Each variable actually occupies a single bit, because variables are packed together into the same byte as far as possible, while arrays are packed into the minimum number of bytes. For example:

DIM x AS BIT, y AS BIT, z AS BIT : ' x, y and z together occupy 1 byte
DIM k(38) AS BIT : ' k occupies 5 bytes (38 bits rounded up to whole bytes)


The HAS BIT / HAS NOT BIT instruction works directly on integer data types and lets you check whether a given bit of a number is set or not. It is more concise than a bitmask comparison. Two other forms perform the same check: BIT bit OF value and BIT( value, bit ).

For example:

x = y HAS BIT 4 : ' this is like (y AND 16) <> 0
x = y HAS NOT BIT 4 : ' this is like (y AND 16) = 0
x = BIT 4 OF y : ' the same check as y HAS BIT 4
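
To make the check concrete, here is a small sketch on a known value. It assumes that bits are numbered from 0, starting at the least significant bit, which the text does not state explicitly; the variable names are illustrative.

```basic
DIM y AS BYTE
y = %00010010 : ' 18 in decimal: bits 1 and 4 are set (assuming numbering from 0)

a = y HAS BIT 4 : ' expected true: bit 4 (weight 16) is set
b = y HAS NOT BIT 0 : ' expected true: bit 0 (weight 1) is clear
c = BIT 1 OF y : ' expected true: bit 1 (weight 2) is set
d = BIT( y, 3 ) : ' expected false: bit 3 (weight 8) is clear
```

All three forms read a single bit without spelling out the mask, which avoids mistakes when computing powers of two by hand.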


Any problem?

If you have found a problem, if you think there is a bug or, more simply, you would like something to be improved, write a topic on the official forum, or open an issue on GitHub.

Thank you!