- COBOL metalanguage syntax diagrams
- COBOL coding rules
- Structure of COBOL programs
- Purpose of the IDENTIFICATION, ENVIRONMENT, DATA and PROCEDURE divisions
The Environment Division
- All external references, such as to devices, files, command sequences,
collating sequences, the currency symbol and the decimal point symbol,
are defined in the Environment Division.
- The Environment Division must be maintained
when a COBOL program is moved to a new machine,
or has new peripheral devices attached, or is required
to work in a different country.
COBOL has three main programming constructs - Sequence,
Iteration and Selection.
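The three constructs can be sketched in one small program. This is only an illustration: the program name ConstructsDemo, the paragraph name ShowConstructs and the data item ItemsLeft are invented for the example, and free source format is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ConstructsDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> ItemsLeft is a hypothetical one-digit data item used only in this sketch.
01 ItemsLeft PIC 9 VALUE ZEROS.
PROCEDURE DIVISION.
ShowConstructs.
*> Sequence: statements are executed one after another.
    MOVE 3 TO ItemsLeft
    DISPLAY "ItemsLeft starts at " ItemsLeft
*> Selection: one of two alternatives is chosen.
    IF ItemsLeft > 2
        DISPLAY "ItemsLeft is greater than 2"
    ELSE
        DISPLAY "ItemsLeft is 2 or less"
    END-IF
*> Iteration: a block of statements is repeated.
    PERFORM 3 TIMES
        SUBTRACT 1 FROM ItemsLeft
        DISPLAY "ItemsLeft is now " ItemsLeft
    END-PERFORM
    STOP RUN.
```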
EXAMPLE
We want to write a program which will accept two
numbers from the user's keyboard, multiply them together
and display the result on the computer screen.
Any program consists of three main things:
1. The computer statements needed to do the job.
2. Declarations for the data items that the computer statements need.
3. A plan, or algorithm, that arranges the computer statements in the program so that the computer executes them in the correct order.
Program Statements and Data items
What COBOL program statements will we need to do the job specified above,
and what data items will we need to access?
We will need a statement to take in the first number and store
it in the named memory location (a variable) Num1:
ACCEPT Num1.
We will need a statement to take in the second number and store
it in the named memory location Num2:
ACCEPT Num2.
We will need a statement to multiply the two numbers together
and to store the result in the named location Result:
MULTIPLY Num1 BY Num2 GIVING Result.
We will need a statement to display the value in the named memory
location "Result" on the computer screen:
DISPLAY "Result is = ", Result.
Getting the Algorithm right
COBOL MetaLanguage
COBOL syntax is defined using a notation sometimes called the COBOL MetaLanguage. In this notation, words in uppercase are reserved words. When underlined
they are mandatory. When not underlined they are "noise"
words, used for readability only, and are optional. Because COBOL
statements are supposed to read like English sentences there are
a lot of these "noise" words.
Words in mixed case represent names that must be devised by the
programmer (like data item names).
When material is enclosed in curly braces { }, a choice
must be made from the options within the braces. If there is only
one option then that item is mandatory.
Material enclosed in square brackets [ ] is optional,
and may be included or omitted as required.
The ellipsis symbol ... (three dots) indicates that the
preceding syntax element may be repeated at the programmer's discretion.
Some notes on syntax diagrams
To simplify the syntax diagrams and reduce the number of rules
that must be explained, special operand endings have been used
in some diagrams (note that this is my own extension; it is not standard
COBOL).
These special operand endings have the following meanings:
$i  | uses an alphanumeric data-item
$il | uses an alphanumeric data-item or a string literal
#i  | uses a numeric data-item
#il | uses a numeric data-item or a numeric literal
$#i | uses a numeric or an alphanumeric data-item
An example syntax diagram
(Figure: syntax diagram of the COMPUTE statement)
This syntax diagram may be interpreted as follows:
We must start a COMPUTE statement with the
keyword COMPUTE.
We must follow the keyword with the name (or names; note the ellipsis
symbol ...) of the numeric data item to be used to receive
the result of the expression. The #i suffix at the end of the word Result
tells us that a numeric identifier/data item must be used.
Since the ellipsis symbol is placed outside the curly
brackets we can interpret this to mean that each result field can
have its own ROUNDED phrase. In other words
we could have a COMPUTE statement like -
COMPUTE Result1 ROUNDED, Result2 = ((9 * 9) + 8) / 5
where Result1 would be assigned a value of 18 and
Result2 would be assigned a value of 17.8.
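The COMPUTE statement above can be tried out in a complete program. The sketch below assumes free source format, and the picture clauses are assumptions chosen for the example: Result1 holds whole numbers only, Result2 keeps one decimal place, and PrintResult2 is a hypothetical edited field used to show the decimal point.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. RoundedDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Result1 has no decimal places, so the ROUNDED phrase matters for it.
01 Result1      PIC 99   VALUE ZEROS.
*> Result2 keeps one decimal place, so 17.8 fits without rounding.
01 Result2      PIC 99V9 VALUE ZEROS.
*> Edited field with an explicit decimal point, used for display only.
01 PrintResult2 PIC Z9.9.
PROCEDURE DIVISION.
ShowRounding.
    COMPUTE Result1 ROUNDED, Result2 = ((9 * 9) + 8) / 5
    MOVE Result2 TO PrintResult2
    DISPLAY "Result1 = " Result1
    DISPLAY "Result2 = " PrintResult2
    STOP RUN.
```

With these declarations the expression evaluates to 17.8, so Result1 receives 18 and Result2 receives 17.8. Without the ROUNDED phrase, Result1 would be truncated to 17.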
The square brackets after the Arithmetic Expression
indicate that the items that follow are optional but, if used, we must choose
between the ON SIZE ERROR and NOT
ON SIZE ERROR phrases.
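As a rough illustration of the ON SIZE ERROR phrase, the sketch below forces a result that cannot fit in the receiving field. The program name, paragraph name and values are invented for the example, and free source format is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. SizeErrorDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 Num1   PIC 99 VALUE 50.
01 Num2   PIC 99 VALUE 40.
*> Result can hold at most 99, so 50 * 40 = 2000 will not fit.
01 Result PIC 99 VALUE ZEROS.
PROCEDURE DIVISION.
CheckSize.
    COMPUTE Result = Num1 * Num2
        ON SIZE ERROR
            DISPLAY "Result is too large for the Result field"
    END-COMPUTE
    STOP RUN.
```

Here the statement following ON SIZE ERROR runs because the product needs four digits, while Result has room for only two.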
Because the END-COMPUTE is
contained within the square brackets, it is used only when a
SIZE ERROR or NOT SIZE ERROR
phrase is used.
COBOL coding rules
Traditionally, COBOL programs were written on coding
forms and then punched on to punch cards. Although nowadays
most programs are entered directly into a computer,
some COBOL formatting conventions remain that derive
from its ancient punch-card history.
On coding forms, the first six character positions
are reserved for sequence numbers. The seventh character
position is reserved for the continuation character,
or for an asterisk that denotes a comment line.
The actual program text starts in column 8. The four
positions from 8 to 11 are known as Area A, and positions
from 12 to 72 are Area B.
Although many COBOL compilers ignore some of these
formatting restrictions, most still retain the distinction
between Area A and Area B.
When a COBOL compiler recognizes the two areas, all
division names, section names, paragraph names, FD entries
and 01 level numbers must start in Area A. All other
sentences must start in Area B.
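The column layout described above might look like this in a fixed-format source file. This is a minimal sketch: the sequence numbers, the program name FixedFormatDemo and the paragraph name ShowGreeting are invented for the example, and it assumes a compiler working in fixed (not free) source format.

```cobol
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. FixedFormatDemo.
000300* An asterisk in column 7 marks this line as a comment.
000400 PROCEDURE DIVISION.
000500 ShowGreeting.
000600     DISPLAY "Hello from Area B".
000700     STOP RUN.
```

Columns 1 to 6 hold the sequence numbers, the division headers and the paragraph name start in column 8 (Area A), and the two statements start in column 12 (Area B).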
In our example programs we use the compiler directive
$ SET SOURCEFORMAT"FREE" (available with the Micro Focus
Net Express COBOL compiler) to free us from these
formatting restrictions.
(Figure: an ancient COBOL coding form)
Name construction
All user-defined names, such as data names, paragraph names, section
names, condition names and mnemonic names, must adhere to the following
rules:
- They must contain at least one character, but not more than
30 characters.
- They must contain at least one alphabetic character.
- They must not begin or end with a hyphen.
- They must be constructed from the characters A to Z, the numbers
0 to 9, and the hyphen.
- They must not contain spaces.
- Names are not case-sensitive: TotalPay is the same as totalpay,
Totalpay or TOTALPAY.
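A short program can make these rules concrete. In this sketch the names Week-2-Bonus and ShowTotalPay and the values are invented for the example, and free source format is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. NameRulesDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Valid names: letters, digits and embedded hyphens, at most 30 characters.
01 TotalPay     PIC 9(4) VALUE ZEROS.
01 Week-2-Bonus PIC 9(3) VALUE 150.
PROCEDURE DIVISION.
ShowTotalPay.
*> TOTALPAY, totalpay and TotalPay all refer to the same data item.
    MOVE 1100 TO TOTALPAY
    ADD Week-2-Bonus TO totalpay
    DISPLAY "Total pay is " TotalPay
    STOP RUN.
```

The program displays a total of 1250, which shows that all three spellings name one data item.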
Divisions
A division is a block of code, usually containing one or more sections,
that starts where the division name is encountered and ends with
the beginning of the next division or with the end of the program
text.
Sections
A section is a block of code usually containing one or more paragraphs.
A section begins with the section name and ends where the next section
name is encountered or where the program text ends.
Section names are devised by the programmer, or defined
by the language. A section name is followed by the word SECTION
and a period.
See the two example names below -
SelectUnpaidBills SECTION.
FILE SECTION.
Paragraphs
A paragraph is a block of code made up of one or more sentences.
A paragraph begins with the paragraph name and ends with the next
paragraph or section name or the end of the program text.
A paragraph name is devised by the programmer or defined by the
language, and is followed by a period.
See the two example names below -
PrintFinalTotals.
PROGRAM-ID.
Sentences and statements
A sentence consists of one or more statements and is terminated
by a period.
For example:
MOVE .21 TO VatRate
MOVE 1235.76 TO ProductCost
COMPUTE VatAmount = ProductCost * VatRate.
A statement consists of a COBOL verb and an operand or operands.
For example:
SUBTRACT Tax FROM GrossPay GIVING NetPay
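The difference between a sentence and a statement can be seen in a complete program. The sketch below reuses the VAT example; the picture clauses, the edited field PrintVat and the paragraph name CalculateVat are assumptions made for the illustration, and free source format is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. SentenceDemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 VatRate     PIC V99     VALUE ZEROS.
01 ProductCost PIC 9(4)V99 VALUE ZEROS.
01 VatAmount   PIC 9(4)V99 VALUE ZEROS.
*> Edited field used only to display the amount with a decimal point.
01 PrintVat    PIC Z,ZZ9.99.
PROCEDURE DIVISION.
CalculateVat.
*> First sentence: three statements ended by a single period.
    MOVE .21 TO VatRate
    MOVE 1235.76 TO ProductCost
    COMPUTE VatAmount = ProductCost * VatRate.
*> Second sentence: two statements ended by a period.
    MOVE VatAmount TO PrintVat
    DISPLAY "VAT amount is " PrintVat.
*> Third sentence: a single statement.
    STOP RUN.
```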
The Four Divisions
Introduction
At the top of the COBOL hierarchy are the four divisions.
These divide the program into distinct structural elements.
Although some of the divisions may be omitted, the sequence
in which they are specified is fixed, and must follow
the order below.
IDENTIFICATION DIVISION. | Contains program information
ENVIRONMENT DIVISION.    | Contains environment information
DATA DIVISION.           | Contains data descriptions
PROCEDURE DIVISION.      | Contains the program algorithms
The IDENTIFICATION DIVISION
The IDENTIFICATION DIVISION supplies information
about the program to the programmer and the compiler.
Most entries in the IDENTIFICATION DIVISION are
directed at the programmer. The compiler treats them as comments.
The PROGRAM-ID clause is an exception to
this rule. Every COBOL program must have a PROGRAM-ID
because the name specified after this clause is used by the linker
when linking a number of subprograms into one run unit, and by the
CALL statement when transferring control
to a subprogram.
The IDENTIFICATION DIVISION has the following
structure:
IDENTIFICATION DIVISION.
PROGRAM-ID. NameOfProgram.
[AUTHOR. YourName.]
other entries here
The keywords - IDENTIFICATION DIVISION -
represent the division header, and signal the commencement of the
program text.
PROGRAM-ID is a paragraph name that must
be specified immediately after the division header.
NameOfProgram is a name devised by the programmer, and must satisfy
the rules for user-defined names.
Here's a typical program fragment:
IDENTIFICATION DIVISION.
PROGRAM-ID. SequenceProgram.
AUTHOR. Michael Coughlan.
The ENVIRONMENT DIVISION
The ENVIRONMENT DIVISION is used to describe
the environment in which the program will run.
The purpose of the ENVIRONMENT DIVISION
is to isolate in one place all aspects of the program that are dependent
upon a specific computer, device or encoding sequence.
The idea behind this is to make it easy to change the program
when it has to run on a different computer or one with different
peripheral devices.
In the ENVIRONMENT DIVISION, aliases are
assigned to external devices, files or command sequences. Other
environment details, such as the collating sequence, the currency
symbol and the decimal point symbol may also be defined here.
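As one possible illustration, the sketch below uses the SPECIAL-NAMES paragraph to make the comma act as the decimal point symbol, as is conventional in many countries. The program name and the data items Price and PrintPrice are invented for the example, and free source format is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. EnvironmentDemo.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
*> Swap the roles of the comma and the period in numbers and pictures.
    DECIMAL-POINT IS COMMA.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> With the clause above, the numeric literal is written with a comma.
01 Price      PIC 9(3)V99 VALUE 123,45.
*> In the edited picture the comma now marks the decimal point.
01 PrintPrice PIC ZZ9,99.
PROCEDURE DIVISION.
ShowPrice.
    MOVE Price TO PrintPrice
    DISPLAY "Price: " PrintPrice
    STOP RUN.
```

Moving this program to a country that uses the period as the decimal point would mean changing the SPECIAL-NAMES entry together with the literals and pictures that depend on it, which is the kind of maintenance the ENVIRONMENT DIVISION is meant to localize.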
The DATA DIVISION
As the name suggests, the DATA DIVISION
provides descriptions of the data-items processed by the program.
The DATA DIVISION has two main sections:
the FILE SECTION and the WORKING-STORAGE
SECTION. Additional sections, such as the LINKAGE
SECTION (used in subprograms) and the REPORT
SECTION (used in Report Writer based programs) may also be
required.
The FILE SECTION is used to describe most
of the data that is sent to, or comes from, the computer's peripherals.
The WORKING-STORAGE SECTION is used to describe
the general variables used in the program.
The DATA DIVISION has the following structure
and syntax:
Below is a sample program fragment -
IDENTIFICATION DIVISION.
PROGRAM-ID. SequenceProgram.
AUTHOR. Michael Coughlan.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 Num1 PIC 9 VALUE ZEROS.
01 Num2 PIC 9 VALUE ZEROS.
01 Result PIC 99 VALUE ZEROS.
The PROCEDURE DIVISION
The PROCEDURE DIVISION contains the code
used to manipulate the data described in the DATA
DIVISION. It is here that the programmer describes the algorithm.
The PROCEDURE DIVISION is hierarchical in
structure and consists of sections, paragraphs, sentences and statements.
Only the section is optional. There must be at least one paragraph,
sentence and statement in the PROCEDURE DIVISION.
Paragraph and section names in the PROCEDURE DIVISION
are chosen by the programmer and must conform to the rules for user-defined
names.
Sample Program
IDENTIFICATION DIVISION.
PROGRAM-ID. SequenceProgram.
AUTHOR. Michael Coughlan.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 Num1 PIC 9 VALUE ZEROS.
01 Num2 PIC 9 VALUE ZEROS.
01 Result PIC 99 VALUE ZEROS.
PROCEDURE DIVISION.
CalculateResult.
ACCEPT Num1.
ACCEPT Num2.
MULTIPLY Num1 BY Num2 GIVING Result.
DISPLAY "Result is = ", Result.
STOP RUN.
Some COBOL compilers require that all the divisions be present
in a program while others only require the IDENTIFICATION
DIVISION and the PROCEDURE DIVISION.
For instance, the program shown below is perfectly valid when compiled
with the Micro Focus Net Express compiler.
Minimum COBOL program
IDENTIFICATION DIVISION.
PROGRAM-ID. SmallestProgram.
PROCEDURE DIVISION.
DisplayGreeting.
DISPLAY "Hello world".
STOP RUN.