CHAPTER 9

STRINGS


In Chapter 1 you saw how to print strings of characters, and in Chapter 3 you saw how to print a string before the prompt (?) of an INPUT. A string is just a series of characters which are stored in the Sorcerer's memory. You can have as many as 255 characters in a string, or as few as none. The string containing no characters is called a null string.

Any letter, number or symbol on the keyboard can be a part of a string -- even the graphic symbols and your own user-defined symbols. (You will learn in Chapter 13 how to define your own special characters.) You cannot directly enter the quote mark (") as a character in a string. This is because Standard BASIC uses quote marks to delimit strings. (To delimit means to tell the computer where something starts and ends, to define its limits.) If you wanted to print this statement,

He said, "Hello."

and tried to do it this way,

10 PRINT "He said, "Hello""

you would get this result when you ran the program,

He said, 0

(The Sorcerer thinks Hello is a variable, equal to 0.)

So? Now what? Well, the situation is not hopeless; you can put quote marks into a string without confusing your poor Sorcerer, as you'll see in the next section.

(By the way, because the carriage return at the end of each line is a delimiter, the Sorcerer accepts strings delimited by only one quote mark, that on the left. That is

PRINT "HELLO"
PRINT "HELLO

produce the same result. Most other BASICs insist on the use of both sets of quotes, left and right. Also, leaving off the right quote mark may confuse the Sorcerer or you or someone reading your program. For these reasons, we strongly recommend always using both sets of quote marks.)


THE CHR$ FUNCTION


Each character on the Sorcerer's keyboard has a number, called its ASCII (American Standard Code for Information Interchange) number. The first 128 of these are universal; the Sorcerer gives you 64 more, most shown in outline form on the tops of the keys. (Refer to the diagrams on pages 20 and 21 of A GUIDED TOUR OF PERSONAL COMPUTING.) You can look at these characters by holding down the GRAPHIC key and pressing each of the other keys (there are no graphics characters "under" RETURN, LINE FEED, REPEAT, ESC, CTRL, RUN STOP, SHIFT, SHIFT LOCK, HOME or the RESET keys). There are another 64 which you may define yourself; look at these by pressing the GRAPHIC key and the SHIFT key at the same time as pressing each of the other keys on the keyboard. Until you define these characters, you will see only "garbage" on the screen. You will learn how to define your own graphics characters in Chapter 12. (You may also redefine the 64 that the Sorcerer has defined for you.)

If you know the ASCII code for a particular character, you may print it with the CHR$ function. See Appendix G for the Standard ASCII set.

Examples:

PRINT CHR$(65)     prints a capital letter A on the screen
PRINT CHR$(97)     prints a small a on the screen
PRINT CHR$(32)     prints a blank on the screen
PRINT CHR$(34)     prints a quote mark (") on the screen

Aha! So, try this:

10 PRINT "He said, ";CHR$(34);"Hello";CHR$(34)
RUN
He said, "Hello"     (Sorcerer prints this.)

You can simplify this with string variables, as you will see in a little while.

Do you remember the @ sign? You use it to immediately terminate any line. Try to enter this into the machine:

10 PRINT "97 UNITS @ $3.95 EACH EQUAL:";97*3.95

Your Sorcerer will not let you get past that @. But try this:

10 PRINT "97 UNITS ";CHR$(64);" $3.95 EACH EQUAL";97*3.95

You can look at the entire keyboard with this program:

10 FOR X=32 TO 255
20 PRINT X"="CHR$(X);
30 NEXT

You can print graphics characters by putting them into a PRINT statement within quotes, or by using their ASCII numbers. These two statements produce the same results:

10 PRINT "|"     (| is just GRAPHIC 1)
10 PRINT CHR$(128)

The first 32 characters are "special" characters. Some of these are used for cursor control (see Chapter 13). CHR$(12) does the same thing as the CLEAR key, while CHR$(10) is the same as LINE FEED.

Try this program to see how strings of graphics characters can print a picture on the screen.

100 PRINT CHR$(12)
110 FOR A=10 TO 0 STEP -1
120 PRINT A:PRINT
130 FOR B=1 TO 300:NEXT B
140 NEXT A
150 FOR C=1 TO 11:READ D: PRINT CHR$(D);:NEXT
160 FOR E=1 TO 900:NEXT
170 PRINT CHR$(12)
180 FOR F=1 TO 14:PRINT CHR$(10):NEXT
190 PRINT TAB(10);CHR$(171)CHR$(172)
200 PRINT TAB(10);CHR$(128)CHR$(135)
210 PRINT TAB(10);CHR$(128)CHR$(135)
220 PRINT TAB(9);CHR$(171)CHR$(128)CHR$(135)CHR$(172)
230 FOR G=1 TO 15
240 PRINT TAB(11);"*":PRINT TAB(10);"*"
250 NEXT G
260 FOR H=1 TO 30:PRINT:NEXT
270 DATA 66,76,65,83,84,79,70,70,33,33,33

STRING CONSTANTS AND VARIABLES


We call a string of characters surrounded by a pair of quotes a string constant. It is similar to a numerical constant in having a definite value. Just as there are numerical variables, which can take on many different number values, so there are string variables which can take on many string values. Like the numerical variable, a string variable is just a name which the Sorcerer gives to a block of memory locations--but instead of storing number constants in these locations, the Sorcerer stores string constants.

You can use any combination of letters and numbers as a string variable, provided:

  1. The first character is a letter.
  2. No reserved words appear in the name.
  3. The name does not exceed the length of a line on the screen.
  4. The last character is a dollar sign ($).

This last is important; it is the $ at the end of the variable name which tells the Sorcerer it is working with a string variable, rather than a numerical variable.

As with numeric variables, even though you can use string variable names longer than two characters (not counting the $), don't do it.

Assign values to string variables just as you did with numeric variables:

LET X$="HELLO"
LET Y$=CHR$(34)

Both of these statements give the same result:

PRINT "HE GOT AN "CHR$(34)"A"CHR$(34)" ON THE TEST."
X$=CHR$(34):PRINT "HE GOT AN "X$"A"X$" ON THE TEST."

PUT AND TAKE WITH STRINGS


You can use string variables and constants in INPUT, PRINT, DATA and READ statements, just as you would use numerical variables.

Example:

10 INPUT "WHAT IS YOUR STRING";A$
20 PRINT "YOUR STRING IS "A$
30 GOTO 10

When you answer an INPUT prompt (?) with a string constant, you don't have to put quotes around the string. In the program above, if you answer

WHAT IS YOUR STRING?

by typing A, the Sorcerer replies

YOUR STRING IS A

You may also delimit your response with quotes. If you answer "A" instead of A, you get exactly the same result. You can get the Sorcerer to accept strings containing quote marks if you feed it the strings through INPUT statements (and if those quote marks are not the first element of the string).

When the Sorcerer asks

WHAT IS YOUR STRING?

try typing these inputs:


INPUT       RESULT
"A"         A
"A          A
A"          A"
A""""A      A""""A
A""""       A""""
""          (The input of one or two quotes returns a blank.)
"""         ?SN ERROR

Example:

10 PRINT
20 FOR X=1 TO 10
30 READ A$
40 PRINT A$;
50 NEXT X
60 DATA "A","B","C","D","E","F","G","H","I","J"
RUN
ABCDEFGHIJ

You can mix strings and numbers in data statements, so long as you are consistent in calling them up.

10 PRINT
20 FOR X=1 TO 10
30 READ A$,B
40 PRINT A$;B;
50 NEXT X
60 DATA "A",1,"B",2,"C",3,"D",4,"E",5
70 DATA "6",6,"7",7,"8",8,"9",9,"10",10
RUN
A 1 B 2 C 3 D 4 E 5 6 6 7 7 8 8 9 9 10 10

STRING RELATIONS AND EXPRESSIONS


You can form string expressions from string variables and constants, just as you form numerical expressions from numerical variables and constants. However, there is only one string operation, while there are six numerical operations (addition, subtraction, multiplication, division, negation and exponentiation).

The string operation is called concatenation. It uses the + sign as its symbol. The effect of concatenation is to stick the string on the right side of the + onto the end of the string on the left.

Examples:

10 PRINT "ABC"+"DEF"
20 PRINT CHR$(34)+"HI"+CHR$(34)
RUN
ABCDEF
"HI"

10 B$=""
20 FOR X=1 TO 10
30 READ A$
40 B$=B$+A$
50 PRINT B$
60 NEXT X
70 DATA "A","B","C","D","E","F","G","H","I","J"
RUN
A
AB
ABC
ABCD
ABCDE
ABCDEF
ABCDEFG
ABCDEFGH
ABCDEFGHI
ABCDEFGHIJ

While a string may be up to 255 characters in length, you cannot directly define one that long, because Standard BASIC does not allow program statements longer than one 64-character line. To prove this, try entering a string longer than one program line, in the first program in the Put and Take section. (Do not attempt to enter an extremely long string or terrible things will happen.) More about how to build long strings a little later.

The double role of the + sign doesn't confuse the Sorcerer. If it sees a + surrounded by string expressions, it says to itself "Aha! Concatenation!" If it sees a + surrounded by numerical expressions, it says "Addition." If it sees a + surrounded by one string expression and one numerical expression, it says ?TM ERROR (TYPE MISMATCH).

There are also a number of useful string functions, which are explained in the section on string functions.

In Chapter 7 you learned to form numerical relations with numerical expressions and numerical relation operators. As you might suspect, there are also string relation operators. You may use the same relation operators that you used with numerical expressions. In the case of strings, the Sorcerer compares their ASCII numbers.

You can use string relations in logical expressions and, of course, in branching expressions.

Example:

10 INPUT "WHO IS BURIED IN GRANT'S TOMB";GR$
20 PRINT
30 IF GR$="GRANT" THEN 70
40 PRINT "WRONG. TRY AGAIN."
50 PRINT
60 GOTO 10
70 PRINT "RIGHT YOU ARE. HAVE A CIGAR."
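
Here is a further sketch of how comparing ASCII numbers puts strings in order. The words in it are only illustrations, and we assume that the Sorcerer compares two strings one character at a time from the left, which is what the sorting program at the end of this chapter does by hand. Capital Z has ASCII number 90 and small a has 97, so every capital letter comes "before" every small letter.

10 IF "APPLE"<"BANANA" THEN PRINT "APPLE COMES BEFORE BANANA"
20 IF "Z"<"a" THEN PRINT "CAPITAL Z COMES BEFORE SMALL a"
RUN
APPLE COMES BEFORE BANANA
CAPITAL Z COMES BEFORE SMALL a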

By the way, if you do not know the ASCII number of a particular character, use the ASC function. First you give a value to X$ and then you ask the Sorcerer to do something with ASC(X$), such as print it, or compare it with something else. ASC(X$) always returns the ASCII number of the first character of X$. The argument of the ASC function may be a string variable or constant.

Example:

10 X$=CHR$(34):REM DOUBLE QUOTE
20 PRINT "TO END THIS PROGRAM, ENTER "X$"END IT"X$
30 PRINT
40 PRINT "GIVE ME TWO CHARACTERS"
50 INPUT A$
60 IF A$="END IT" THEN STOP
70 INPUT B$
80 PRINT
90 PRINT "THE ASCII NUMBER OF "A$" IS"ASC(A$)
100 PRINT "THE ASCII NUMBER OF "B$" IS"ASC(B$)
110 PRINT "THE RELATION IS ";
120 IF A$>B$ THEN PRINT "A$>B$"
130 IF A$<B$ THEN PRINT "B$>A$"
140 IF A$=B$ THEN PRINT "A$=B$"
150 GOTO 30

If you wish to find out the ASCII number of the blank or the comma, you must enter them as " " and ",". Standard BASIC allows you to enter control characters as inputs. To get a preview of Chapter 13, try entering CTRL W, CTRL A, CTRL S and CTRL Z. You cannot enter CTRL C because that immediately terminates the program. You also cannot enter @.


STRING FUNCTIONS


A string function is any function that involves string expressions, either as the value of the function, or as arguments.

LEFT$ and RIGHT$ give you the leftmost and rightmost characters of a string, respectively. LEFT$(A$,3) is just a three-character string made up of the leftmost three characters of A$ (unless A$ is less than three characters in length, in which case LEFT$ is the entire string). If A$ has the value "ABCDEF", then LEFT$(A$,3) is "ABC". Similarly, RIGHT$(A$,2) is a string consisting of the rightmost two characters of A$. If A$ has the value "ABCDEF", then RIGHT$(A$,2) is "EF". As before, if A$ is shorter in length than the number of characters specified by RIGHT$, then RIGHT$ is the entire string.

The general formats for these functions are:

LEFT$(<string expression>, <numerical expression>)
and
RIGHT$(<string expression>, <numerical expression>)

Run this program and experiment with as many different strings as you can think of:

10 INPUT "WHAT IS YOUR STRING";A$
20 PRINT
30 INPUT "WHAT IS YOUR NUMBER";X
40 PRINT
50 PRINT LEFT$(A$,X), RIGHT$(A$,X)
60 PRINT
70 GOTO 10

You can also take characters from the middle of a string using the function MID$. You specify the number of characters to take, and which character to start with. For instance, MID$(A$,3,6) consists of the six characters of A$ starting with the third.

The most general format for MID$ is:

MID$(<string expression>, <first numerical expression>, <second numerical expression>)

where the first numerical expression is the starting point and the second numerical expression tells how many characters are in the new string.

Examples:

MID$("ABCDEF",3,2) IS CD
MID$("ABCDEF",2,3) IS BCD
MID$("ABCDEF",3,3) IS CDE

If you ask for too many characters from the string, MID$ gives you as many as it can.

Example:

MID$("ABCDEF",5,4) IS EF

You can leave out the third argument of MID$. MID$(A$,X) returns all the rightmost characters of A$, starting with the Xth one.

Example:

MID$("ABCDEF",3) IS CDEF

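In fact, RIGHT$(A$,X) is the same as MID$(A$,LEN(A$)-X+1), provided X is no greater than the length of A$, and LEFT$(A$,X) is the same as MID$(A$,1,X). (LEN, which gives the length of a string, is described later in this section.) In the example below the string has six characters, so RIGHT$(A$,3) starts at character 6-3+1=4.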

Example:

10 A$="ABCDEF"
20 PRINT LEFT$(A$,3),MID$(A$,1,3)
30 PRINT RIGHT$(A$,3),MID$(A$,4)
RUN
ABC ABC
DEF DEF
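
As a quick check with a count that is not half the length of the string, here is a short sketch. It assumes the LEN function described later in this section, and the value 2 for X is only an illustration. With six characters in A$, the starting point for MID$ works out to 6-2+1=5.

10 A$="ABCDEF"
20 X=2
30 PRINT RIGHT$(A$,X)
40 PRINT MID$(A$,LEN(A$)-X+1)
RUN
EF
EF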

Here is a little program that demonstrates LEFT$, RIGHT$ and MID$:

10 A$="":REM INITIALIZE STRING VARIABLE
20 FOR A=65 TO 90
30 A$=A$+CHR$(A)
40 NEXT A
50 FOR X=1 TO 26
60 PRINT LEFT$(A$,X)
70 NEXT
80 FOR X=1 TO 26
90 PRINT TAB(26-X);RIGHT$(A$,X)
100 NEXT
110 FOR X=1 TO 13
120 PRINT TAB(13-X);MID$(A$,14-X,(X-1)*2+1)
130 NEXT
140 FOR X=12 TO 1 STEP -1
150 PRINT TAB(13-X);MID$(A$,14-X,(X-1)*2+1)
160 NEXT

The function STR$ gives the string representation of a number. For instance, STR$(999) is the string " 999" (a blank followed by three digits), and STR$(-1.23456E-01) is the string of letters, digits and symbols "-1.23456E-01". The difference between 999 and STR$(999) is that the Sorcerer handles the first as a number and the second as a string.

The general format is:

STR$(<numerical expression>)

You must use a numerical expression as the argument, or you get a TM error message (TYPE MISMATCH).

Notice that the Sorcerer evaluates the expression before converting it to a string: STR$(1+2), for example, is " 3" (a blank and the digit 3) and not "1+2".

This program illustrates the use of STR$, and also the difference between numerical addition (+) and string concatenation (+):

10 INPUT "WHAT ARE YOUR TWO NUMBERS";X,Y
20 PRINT
30 PRINT "THE SUM IS ";X+Y
40 PRINT
50 PRINT "THE CONCATENATION IS ";STR$(X)+STR$(Y)
60 PRINT
70 GOTO 10

Try this program with as many different numbers as you can think of, using integer, fixed point and floating point notation. Notice that the first character of a positive number is always a blank space.
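
The following short sketch makes the leading blank visible by putting an asterisk on each side of the string. The asterisks and the values 5 and -5 are only illustrations.

10 X=5
20 PRINT "*"+STR$(X)+"*"
30 X=-5
40 PRINT "*"+STR$(X)+"*"
RUN
* 5*
*-5*

The blank holds the place that the minus sign takes in a negative number.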

The numerical function LEN tells you how many characters there are in a string.

Examples:

LEN("ABCDE") is 5
LEN("5") is 1
LEN(STR$(5)) is 2, since STR$(5) has two characters: one the digit 5, and one the leading space.
LEN(X$) is whatever the length of X$ is
LEN("") is 0

The argument for LEN must be a string or a string expression, or you get the TM error message.
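
Here is a minimal sketch that puts LEN to work with MID$: it reports the length of whatever string you type and then prints the characters one to a line, each with its position. The variable names are only illustrations. Line 30 is there on the assumption that a null string should simply bring back the prompt rather than run the loop.

10 INPUT "WHAT IS YOUR STRING";A$
20 PRINT "IT HAS";LEN(A$);"CHARACTERS"
30 IF LEN(A$)=0 THEN 10
40 FOR I=1 TO LEN(A$)
50 PRINT I;MID$(A$,I,1)
60 NEXT I
70 GOTO 10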


STRING SPACE


String space is the total amount of memory which Standard BASIC reserves for strings, string variables and string arrays. Normally this is 50 bytes (characters); if the total number of characters in all your string constants, variables and arrays is too great, you get an OS error message (OUT OF STRING SPACE). It is difficult to predict exactly when you will run out of string space, but there is something you can do about it, as you will see in a moment.

Example:

10 FOR X=1 TO 100
20 LET A$=A$+"A"
30 PRINT A$;": LENGTH OF STRING=";LEN(A$)
40 NEXT

This program produces an OS error when LEN(A$)>25.

To reserve more string space, use the command

CLEAR<numerical expression>

The command CLEAR 100 produces 100 bytes of string space. If the value of <numerical expression> is negative, you get an FC error message. If the value is not a whole number, the Sorcerer ignores the fractional part. You can use CLEAR<expression> as a program statement to reserve string space during the execution of a program. However, be careful doing this--CLEAR<expression> sets all your numerical variables and arrays to zero, and kills all string variables, too. To avoid problems, make the CLEAR statement the first one in your program. Try changing your program this way, and see what happens when you run it:

5 CLEAR 999
10 FOR X=1 TO 1000
20 LET A$=A$+"A"
30 PRINT A$;": LENGTH OF STRING=";LEN(A$)
40 NEXT

STRING ARRAYS


You can define arrays of strings, just as you define arrays of numbers. Each element of a string array can have up to 255 characters, and the array can have as many dimensions as a numerical array (that is, as many as the memory allows, as discussed in Chapter 6). You specify the dimensions and highest index values of a string array with a DIM statement, just as you do for a numerical array. In fact, you can use the same DIM statement to simultaneously dimension numerical arrays and string arrays. Names for string arrays obey the same rules as names for string variables--the last character of the name must be a $ sign.

Think of string arrays the same way as numerical arrays. You have a number of pigeonholes with messages in them; in this case each message is a word. (The messages may be variable in length and may, in fact, be shorter than one word, or longer.) Consider the map company. They might refer to a particular volume of maps, and a particular page in that book, and on that page a certain location referenced by intersecting lines. At that junction, rather than a number standing for a distance, is the name of a town. In volume 5, on page 20, 3 down and 2 over you might find the sleepy little town of Cybele, California. Thus MP$(5,20,3,2)= "CYBELE".

Examples:

30 DIM A(1,7), JO$(3), S2(12,3,5), B$(3,3,4)

20 IF A$(X)="ABC" THEN 200

150 LET B4$(X)=MID$(A$(I+1),3,5)

This program uses an array to read a number of data elements, and then prints those elements out, forward first and then backward. (After running the program once, you can give the direct mode command GOTO 80 to get the same results. You need fill the array only once.)

10 DIM JO$(2,2,2)
20 FOR X=0 TO 2
30 FOR Y=0 TO 2
40 FOR Z=0 TO 2
50 READ A$
60 JO$(X,Y,Z)=A$
70 NEXT Z,Y,X
80 FOR X=0 TO 2
90 FOR Y=0 TO 2
100 FOR Z=0 TO 2
110 PRINT JO$(X,Y,Z);
120 NEXT Z,Y,X
130 PRINT
140 FOR X=2 TO 0 STEP -1
150 FOR Y=2 TO 0 STEP -1
160 FOR Z=2 TO 0 STEP -1
170 PRINT JO$(X,Y,Z)
180 NEXT Z,Y,X
190 DATA "A","B","C","D","E","F","G","H","I","J","K","L","M"
200 DATA "N","O","P","Q","R","S","T","U","V","W","X","Y","Z"
210 DATA "!"

Here is a program that sorts a list of strings alphabetically. The program works similarly to the numeric sorts of Chapter 6. The ASCII numbers of the strings are compared. Items must be entered in capitals (although it would be easy for you to convert this program so that items could also be entered in lower case; merely subtract 32 from the ASCII number of those characters whose ASCII number lies between 97 and 122, inclusive). Also, the blank is considered to be alphabetically "lower" than an A. (You could write a subroutine to ignore blanks.) To end the entry segment of the program and begin the sort, input just a carriage return (the null string).

10 CLEAR 400
20 DIM NA$(100)
30 PRINT CHR$(12)
40 PRINT:PRINT:PRINT "ENTER NAMES TO BE ALPHABETIZED";
50 PRINT " (UP TO 100)"
60 PRINT:PRINT
70 FOR X=1 TO 100
80 INPUT "NAME";NA$(X)
90 IF NA$(X)="" THEN 110
100 NEXT X
110 Y=X-1
120 FOR W=2 TO Y
130 Z=W
140 S1$=NA$(Z-1):S2$=NA$(Z)
150 GOSUB 280
160 IF NA$(Z)=S2$ THEN 200
170 NA$(Z-1)=S1$:NA$(Z)=S2$
180 IF Z<=2 THEN 200
190 Z=Z-1:GOTO 140
200 NEXT W
210 PRINT:PRINT:PRINT "IN ALPHABETICAL ORDER:"
220 PRINT:PRINT
230 FOR X=1 TO 100
240 PRINT NA$(X)
250 IF X=Y THEN 270
260 NEXT X
270 PRINT:PRINT:STOP
280 FOR X=1 TO 12
290 S3$=MID$(S1$,X,1):S4$=MID$(S2$,X,1)
300 IF S3$="" THEN RETURN
310 IF S4$="" THEN 360
320 IF ASC(S3$)<ASC(S4$) THEN RETURN
330 IF ASC(S3$)>ASC(S4$) THEN 360
340 NEXT X
350 RETURN
360 TV$=S1$
370 S1$=S2$
380 S2$=TV$
390 RETURN
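
The conversion from lower case to capitals mentioned above might be sketched as the subroutine below. This is only one possible way to do it. The line numbers and the names S$, T$, I and C are our own choices and are not part of the sorting program. Lines 10 to 40 are a small driver for trying the subroutine by itself.

10 INPUT "WHAT IS YOUR STRING";S$
20 GOSUB 500
30 PRINT S$
40 GOTO 10
500 REM CONVERT S$ TO CAPITALS
510 IF S$="" THEN RETURN
520 T$=""
530 FOR I=1 TO LEN(S$)
540 C=ASC(MID$(S$,I,1))
550 IF C>=97 AND C<=122 THEN C=C-32
560 T$=T$+CHR$(C)
570 NEXT I
580 S$=T$
590 RETURN

To try it with the sorting program, one possibility is to add lines 500 to 590 and a line such as 85 S$=NA$(X):GOSUB 500:NA$(X)=S$. Building the converted strings uses extra string space, so a larger CLEAR may be needed.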
