25.06-98
F. Glaser |
![]() Dynamic arrays in Pascal This seems to become a new FAQ especially with programmers who upgrade from C or VB to Delphi. It is a very basic article for beginners only, dealing with type, array and pointer subjects. By Franz Glaser, Austria
Introduction In Pascal, as opposed to most other high level programming languages, data structures are so important, that they get special attention to the compiler. Niklaus Wirth decided to make the same stringent checking of data as other languages make with the code (syntax) only. Most upgraders from other languages feel that the Pascal type checking
reduced their freedom in using data structures, but this is not true. Pascal
only needs that the programmer writes the source code according to stringent
rules, declaring the data structures thoroughly before using them.
The TYPE declaration of Arrays. Data can be single numbers or strings, but often they are defined as structures (ASM86 nomenclature), as records and arrays. This is not the location where the particular structures are explained. The type declaration does not reserve memory, it is only used to explain the compiler how the data are arranged in a record or array, or in a single variable, or in a set, or in a string. Physical memory for data is reserved with the Var instruction (the global Var, not the Var in a procedure's formal parameter list). Data memory can be reserved with the Var instruction in the global data area (the DS: data segment) and in local variables within procedures, on the stack. To reserve data memory, the programmer must tell the compiler, which kind of data variable should be reserved. This is where the type declaration comes into effect. Var Counter2 : Integer; {Integer is a pre-defined
type}
As opposed to "C" in Pascal any array MUST be declared as an ARRAY,
before it is used. Arrays need not range from 0 .. nnn, but they can get
any range, eg. -20 .. +100. If the programmer first declared an enumeration
type, eg.
What is an address You should consider that in a running program the variables have no names. The names are known to the compiler at compile time only, the same applies to the type declarations. But this is nothing special, even the procedure and function names are not part of the executable code. For procedures and functions the compiler makes a code to JUMP to a particular address within the program, or more exactly, calls the piece of code, which returns upon completion to the address in the executable code stream where it was called from. The same applies to data. The compiler simply makes a reference to a particular address in the data area, in most cases using a register. Example in Assembly language: Var Price : Integer; MOV BX,offset PRICE
The "offset" is the address of the variable, which holds the price, as fixed by the compiler. The source code names it "price", in the machine code it is "3456H" or wherever it may be placed. The address of a variable is simply a number, the location of the variable in memory. It points to a particular byte. If the variable is bigger than one byte, the address points to the bottom byte. Assume a record:
ASM
Now with arrays:
Pointers A pointer is a variable that holds the address of anything in memory. Pointers in Pascal are bound to a particular data type, they "are of" the same type as the data types they point to. Pointers are variables similar to integers or words, having the same size (or double the size) but they do not contain a numerical value as usual, they hold an address in memory. Pointers in Pascal even can hold the address of a procedure or function, then they "are of a procedure type". (Turbo Pascal special). But this is not dealt with in the following explanations, they deal with pointers to data. You can consider a pointer as a means to hold a data address at runtime. It can point to different variables in turn, even to unnamed variables which are alive only for a particular time period. Assume the PersRec defined above, now I make 3 distinct persons as variables:
Var PPers : ^TPersRec;
Pointers can get a meaningful content with a few special instructions,
the most basic is Addr.
Obtaining memory from the heap Pointers are used in the vast majority of applications to obtain memory space from the heap. The heap is the large "remaining" memory until the 640kB upper limit in standard DOS programs, and almost unlimited on protected mode (DPMI) and Windows operating environments. The compiler never creates named variables on the heap, like global variables in the data area and procedure - local variables on the stack. But Pascal has a heap manager, which has the tools to obtain memory from the heap and return it when it is no longer needed. The standard procedure to get memory from the heap is the New(ptr) instruction. It gains memory bytes and returns the base address in the Ptr - pointer. The Ptr pointer is then your handle to the memory, you have nothing else. New(PPers);
Dynamic arrays There is another procedure to obtain memory from the heap, the GetMem
procedure. It has 2 parameters.
GetMem(APers,579*sizeof(TPersRec));
You can read all records from a file:
This is the most primitive way to get the dynamic array: using memory
on the heap with a pointer.
You can declare the type with a size which would never occur in reality. Because the type declaration AND the pointer declaration does not reserve physical memory (with the pointer only a few bytes), it does not matter how big you declare the array - except you violate internal limitations. Eg. the 16-bit compilers cannot reserve more than 65kB in one chunk, even in the type declaration. And you must care yourself for the range checking, since the compiler's
range checking is overridden explicitly with your program!
But this is a very primitive approach. It has one benefit: you can read
and write to the file in one single chunk, for speed. But it is tedious
to handle this huge array of data in memory.
Array of Pointers Beginners read all records from the file into memory in one single chunk. This is primitive and has some serious drawbacks. Usually there should be a means to add and remove records, but how to do that with one single memory block? It is much better to declare an array of pointers to single records: Type APers = Array[1..1999] of ^TPersRec; This needs 8kB of memory for the pointers (assumed 4 byte for each pointer). Whenever you create a new person - record, you simply
To access it use:
It is now much easyer to sort the records, since your bubble sort or quick sort procedure only needs to exchange tiny pointers in the array, the flesh - the records contents remain unmoved on their locations on the heap. If you insert a new person, you need only move pointers upwards and
insert a New(APers[nnn]) pointer in the array, the same applies to a delete
procedure.
Type TBigArr = Array[0..1023] of PElements;
Key record (indexing) In most cases it is not even necessary to fetch the whole data records
from the disk to memory. Create a key record and leave the flesh on the
disk file.
This has 2 important benefits:
A third point is important for multitasking / multiuser / networking
applications: Several users can access the data on the file simultaneously,
where each modification is written to the appropriate record immediately.
TCollection is the ideal object (class) to handle large arrays of records. It does not contain the records, they are on the heap as described above. But it handles the pointers to the records. It has insert and delete methods and it automatically extends the size of the built in pointer array if necessary. Btw.: The pointer array is maintained on the heap too, it does not need valuable data-segment space. The TSortedCollection
Look in the Objects.TPU (DOS - version) and in your manuals how to use
the TCollection. It is primarily used with Turbo Vision. A similar approach
is the TList of Delphi.
More about Pascal memory and pointers, and links to more pointer tutors: TP-memory considerations, Pointer Primer Dr.John Stockton has an issue in the Borland Pascal Extensions Page |
leileilei | Delphi considerations
For Delphi the same is valid, except that on the 32-bit environment you seem to have unlimited heap space. But consider: the heap is lost on a power fail - and your data, the work of hours or probably years, is lost too. Note that the Delphi compiler creates pointers automatically, when you make an object/class. It hides the "pointer" property even to the programmer. And Delphi has additional functionality especially for use in in procedure parameters, the SLICE function. See more in your Delphi manual and in the Math page of Earl F. Glynn below. For more "scientific" subjects look in the excellent EFG's
math FAQ
On Peter Haas' homepage you find ! dyna.zip (6kB) [mirror no maint] Subject: Re: re-dimensioning arrays (Source) Date: 1 Jul 1998 10:08:29 GMT From: Antivivisektion@t-online.de (Antivivisektion e.V.) Organization: http://Antivivisektion.base.org Newsgroups: comp.lang.pascal.borland Toby Cubitt wrote:
{$A+,B-,D+,E-,F-,I-,L+,N-,O-,R-,S-,V-,T-}
Interface
Type
Function Init (pDim,pSize: Word; Var pBound,pIndex):
Boolean;
{$IFDEF UNIT}
Function Ptr2Long(p: Pointer): LongInt; Assembler;
Function Long2Ptr(l: LongInt): Pointer; Assembler;
Function SubPtr(p: Pointer; d: LongInt): Pointer;
Function AddPtr(p: Pointer; d: LongInt): Pointer;
Function PlatauObj.Init (pDim, pSize: Word; Var pBound, pIndex):
Disp := 0;
If (MaxAvail < SizeTable^[0])
Function PlatauObj.GetAddr: Pointer;
Procedure PlatauObj.Help;
{$IFNDEF UNIT}
--
Subject: Re: Array Problem (Source) Date: 3 Jul 1998 06:57:59 GMT From: Antivivisektion@t-online.de (Antivivisektion e.V.) Organization:http://Antivivisektion.base.org Newsgroups: comp.lang.pascal.borland, de.comp.lang.pascal.misc Stephan Erlank wrote in <news:comp.lang.pascal.borland>:
The following program reads (depending on your memory) up to
It's fast. It reads, compares and sorts 36.000 lines (1,5 MB) in
{ LoadTxt 1.0 Freeware by Antivivisektion@t-online.de (C) 2. Jul
1998 }
Program LoadTxt;
Type
Var EMS: tEMSstream; TmpStr: String; Function StrCmpResult: Integer;
Function GetString(EMSpos: LongInt): pString;
Function HeapFunc (Size: Word): Integer; far;
Procedure MyExit; far;
Procedure ShowTree(Node: pNode);
Var T: Text; Line: String;
Begin
EMS.Init(1024*1024,10*1024*1024);
Assign(T,'c:\users\antivi~1\mail\inbox.'#0'c:\FILENAME.TXT');
ShowTree(Base);
--
Franz _Glaser: Delphi 4 is said to have provisions for dynamic arrays. Variant arrays (and safe arrays) are costly in terms of memory and CPU
Remember that variant arrays are of use only in special circumstances.
|