
STRING TYPES IN DELPHI
Delphi For Beginners:
Understanding and managing string data types in Delphi's Object Pascal. Learn about differences between Short, Long, Wide and null-terminated strings.

As in any programming language, variables in Delphi are placeholders used to store values; they have names and data types. The data type of a variable determines how the bits representing those values are stored in the computer's memory.

When we have a variable that will contain some array of characters, we can declare it to be of type String.
Delphi provides a healthy assortment of string operators, functions and procedures. Before assigning a String data type to a variable, we need to thoroughly understand Delphi's four string types.

   Short String
Simply put, Short String is a counted array of (ANSI) characters, with up to 255 characters in the string. The first byte of this array stores the length of the string. Since this was the main string type in Delphi 1 (16-bit Delphi), the only reason to use Short String is for backward compatibility.
To create a ShortString type variable we use:

var s: ShortString;
s := 'Delphi Programming';
//S_Length := Ord(s[0]);
//which is the same as Length(s)

The s variable is a Short String variable capable of holding up to 255 characters; its memory is a statically allocated 256 bytes (one length byte plus 255 characters). Since this is usually wasteful, as a short string will rarely grow to the maximum length, the second approach to using Short Strings is to use subtypes of ShortString, whose maximum length is anywhere from 0 to 255.
var ssmall: String[50];
ssmall := 'Short string, up to 50 characters';
This creates a variable called ssmall whose maximum length is 50 characters.

Note: When we assign a value to a Short String variable, the string is truncated if it exceeds the maximum length for the type. When we pass short strings to some of Delphi's string manipulation routines, they are converted to and from long strings.
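
To see the length byte and the truncation rule at work, here is a minimal console sketch. The program name and the String[5] subtype are made up for illustration, and it assumes a desktop Delphi compiler (the FPC line only matters if you try it in Free Pascal).

```pascal
program ShortStringDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

var
  s: ShortString;
  ssmall: String[5];  // hypothetical 5-character subtype, for illustration only
begin
  s := 'Delphi Programming';

  // The length byte at index 0 holds the same value Length returns.
  WriteLn(Ord(s[0]));       // 18
  WriteLn(Length(s));       // 18

  // SizeOf reports the static storage: one length byte plus the capacity.
  WriteLn(SizeOf(s));       // 256
  WriteLn(SizeOf(ssmall));  // 6

  // Assigning a longer value keeps only the first 5 characters.
  ssmall := s;
  WriteLn(ssmall);          // Delph
end.
```

Note that SizeOf stays the same no matter how many characters are currently stored, which is exactly why a plain ShortString is wasteful for short values.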

   String / Long / Ansi
Delphi 2 brought the Long String type to Object Pascal. A long string (AnsiString in Delphi's help) represents a dynamically allocated string whose maximum length is limited only by available memory. All 32-bit Delphi versions use long strings by default. I recommend using long strings whenever you can.

var s: String;
s := 'The s string can be of any size...';
The s variable can hold from zero to any practical number of characters. The string grows or shrinks as you assign new data to it.

We can use any string variable as an array of characters; the second character in s has the index 2. The following code

s[2]:='T';
assigns T to the second character of the s variable. Now the first few characters in s look like: TTe s str....
Don't be misled: you can't use s[0] to see the length of the string, because s is not a ShortString.
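
As a small sketch of the points above (the program name is invented, and 1-based indexing as on desktop Delphi compilers is assumed), the following shows character access, Length, and a string growing on assignment:

```pascal
program LongStringIndexDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

var
  s: String;
  before: Integer;
begin
  s := 'The s string can be of any size...';

  // Characters are indexed from 1, so index 2 is the 'h'.
  s[2] := 'T';
  WriteLn(Copy(s, 1, 9));   // TTe s str

  // Length replaces the s[0] trick used with short strings.
  before := Length(s);

  // The string grows as new data is assigned to it.
  s := s + ' and then some';
  WriteLn(Length(s) - before);  // 14
end.
```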

Reference counting, copy-on-write
Since memory allocation is done by Delphi, we don't have to worry about garbage collection. When working with Long (Ansi) Strings, Delphi uses reference counting. This way, string copying is actually faster for long strings than for short strings.
Reference counting, by example:

var s1,s2: String;
s1 := 'first string';
s2 := s1;
When we create the s1 string variable and assign some value to it, Delphi allocates enough memory for the string. When we copy s1 to s2, Delphi does not copy the string value in memory; it only increases the reference count and alters s2 to point to the same memory location as s1.

To minimize copying when we pass strings to routines, Delphi uses the copy-on-write technique. Suppose we are to change the value of the s2 string variable; Delphi copies the first string to a new memory location, since the change should affect only s2, not s1, and they are both pointing to the same memory location.
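
One way to watch this happen is to compare the addresses the two variables point to. This is only a sketch: the program name is invented, and the Pointer() casts rely on a long string variable being a pointer to its character data, which is an implementation detail. The concatenation on s1 is there so the value lives on the heap with a real reference count, since string literals can be handled differently.

```pascal
program CopyOnWriteDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

var
  s1, s2: AnsiString;
begin
  s1 := 'first';
  s1 := s1 + ' string';  // forces a heap-allocated, reference-counted value
  s2 := s1;              // no character data is copied here

  // Both variables share one buffer.
  WriteLn(Pointer(s1) = Pointer(s2));  // TRUE

  // Writing to s2 makes Delphi give s2 its own copy first.
  s2[1] := 'F';

  WriteLn(Pointer(s1) = Pointer(s2));  // FALSE
  WriteLn(s1);                         // first string
  WriteLn(s2);                         // First string
end.
```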

   Wide String
Wide strings are also dynamically allocated and managed, but they don't use reference counting or the copy-on-write semantics. Wide strings consist of 16-bit Unicode characters.

About Unicode character sets
The ANSI character set used by Windows is a single-byte character set. Unicode, as used by wide strings, stores each character in 2 bytes instead of 1. Some national languages use ideographic characters, which require more than the 256 characters supported by ANSI. With 16-bit notation we can represent 65,536 different characters. Indexing of multibyte strings is not reliable, since s[i] represents the i-th byte (not necessarily the i-th character) in s.

If you must use Wide characters, you should declare a string variable to be of the WideString type and your character variable of the WideChar type. If you want to examine a wide string one character at a time, be sure to test for multibyte characters. Delphi converts between Ansi and Wide string types automatically on assignment, but the conversion goes through the system ANSI code page, so characters that the code page cannot represent may be lost.

var s : WideString;
    c : WideChar;
	
 s := 'Delphi_ Guide';
 s[8] := 'T';
 //s='Delphi_TGuide';	
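
The same snippet can be extended into a small console sketch that shows the 2-byte character size. The program name is made up; the sizes assume WideChar is a 16-bit type, as described above.

```pascal
program WideStringDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

var
  s: WideString;
  c: WideChar;
begin
  s := 'Delphi_ Guide';
  s[8] := 'T';   // the 8th character was the space
  c := s[1];

  WriteLn(SizeOf(c));                     // 2
  WriteLn(Ord(c));                        // 68, the code of 'D'
  WriteLn(Length(s));                     // 13 characters
  WriteLn(Length(s) * SizeOf(WideChar));  // 26 bytes of character data
end.
```

Length counts 16-bit characters, not bytes, so the byte size has to be computed separately.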

   Null terminated
A null or zero terminated string is an array of characters, indexed by an integer starting from zero. Since the array has no length indicator, Delphi uses the ASCII 0 (NULL; #0) character to mark the boundary of the string.
This means there is essentially no difference between a null-terminated string and an array[0..NumberOfChars] of type Char, where the end of the string is marked by #0.

We use null-terminated strings in Delphi when calling Windows API functions. Object Pascal lets us avoid messing around with pointers to zero-based arrays when handling null-terminated strings by using the PChar type. Think of a PChar as being a pointer to a null-terminated string or to the array that represents one. For more info on pointers, check: Pointers in Delphi.
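
To make the role of the #0 terminator concrete, here is a minimal sketch that walks a PChar until it reaches the terminator and compares the result with StrLen from SysUtils. The program and variable names are invented for illustration.

```pascal
program PCharDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  s: String;
  p: PChar;
  n: Integer;
begin
  s := 'Delphi';
  p := PChar(s);  // points at the first character; the data ends with #0

  // Count characters by hand: advance until the terminator is found.
  n := 0;
  while p^ <> #0 do
  begin
    Inc(n);
    Inc(p);
  end;

  WriteLn(n);                 // 6
  WriteLn(StrLen(PChar(s)));  // 6
end.
```

This is the work every routine has to do on a null-terminated string, since the length is not stored anywhere.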

For example, the GetDriveType API function determines whether a disk drive is a removable, fixed, CD-ROM, RAM disk, or network drive. The following procedure lists all the drives and their types on a user's computer. Place one Button and one Memo component on a form and assign an OnClick handler to the Button:

procedure TForm1.Button1Click(Sender: TObject);
var
 Drive: Char;
 DriveLetter: String[4];
begin
 for Drive := 'A' to 'Z' do
 begin
  DriveLetter := Drive + ':\';
  case GetDriveType(PChar(Drive + ':\')) of
   DRIVE_REMOVABLE:
    Memo1.Lines.Add(DriveLetter + ' Floppy Drive');
   DRIVE_FIXED:
    Memo1.Lines.Add(DriveLetter + ' Fixed Drive');
   DRIVE_REMOTE:
    Memo1.Lines.Add(DriveLetter + ' Network Drive');
   DRIVE_CDROM:
    Memo1.Lines.Add(DriveLetter + ' CD-ROM Drive');
   DRIVE_RAMDISK:
    Memo1.Lines.Add(DriveLetter + ' RAM Disk');
   end;
 end;
end;

   Mixing Delphi's strings
We can freely mix all four different kinds of strings; Delphi will do its best to make sense of what we are trying to do. The assignment s:=p, where s is a string variable and p is a PChar expression, copies a null-terminated string into a long string.
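
A short sketch of the s:=p assignment follows. The buffer stands in for memory that an API call might fill; its name and size are hypothetical. Changing the buffer afterwards shows that s received its own copy of the characters.

```pascal
program MixingStringsDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

var
  buffer: array[0..15] of Char;  // hypothetical zero-based buffer
  p: PChar;
  s: String;
begin
  // Build the null-terminated text 'API' by hand.
  buffer[0] := 'A';
  buffer[1] := 'P';
  buffer[2] := 'I';
  buffer[3] := #0;

  p := @buffer[0];
  s := p;              // copies the characters up to the #0 into a long string

  buffer[0] := 'X';    // change the original data

  WriteLn(s);          // API
  WriteLn(Length(s));  // 3
  WriteLn(p);          // XPI
end.
```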

   Character types
In addition to four string data types, Delphi has three character types: Char, AnsiChar, and WideChar. A string constant of length 1, such as 'T', can denote a character value. The generic character type is Char, which is equivalent to AnsiChar. WideChar values are 16-bit characters ordered according to the Unicode character set. The first 256 Unicode characters correspond to the ANSI characters.
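
The sizes of the character types can be checked with SizeOf. In this sketch (program name invented) the last line is left without an expected value on purpose: it prints 1 where Char is equivalent to AnsiChar, as described here, and a different value on compilers where Char is a wide character.

```pascal
program CharTypesDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

var
  ac: AnsiChar;
  wc: WideChar;
begin
  // A string constant of length 1 denotes a character value.
  ac := 'T';
  wc := 'T';

  WriteLn(SizeOf(ac));  // 1
  WriteLn(SizeOf(wc));  // 2

  // The character code is the same in both types.
  WriteLn(Ord(ac));     // 84
  WriteLn(Ord(wc));     // 84

  WriteLn(SizeOf(Char));  // depends on what the generic Char maps to
end.
```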

©2014 About.com. All rights reserved.