# Introduction to a simplified toy-DICOM file format III

Before we dive into the details of the DICOM standard file format, let us create a simplified version to entertain ourselves.
In the previous sections we learned about the binary representation of data. We realized that, given a long byte stream, we cannot tell its meaning without knowing the data types used. Anyone who has studied the problems given above can list the following coding problems:

1. It is important to know the different data types.
2. It is important to know the exact structure of the different data types:
   a) In the case of numeric data, we have to know how many bytes belong to one number.
   b) Without knowing the direction in which the place values increase, it would be impossible to decode the numeric values.
3. We have to know the symbol that introduces a data type.
4. It is practical to know how many data items of the same type follow the introducing symbol.
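The first two problems can be illustrated with a minimal Python sketch. The four sample bytes are chosen here only for illustration; the point is that the same bytes yield different values depending on the data type and on how many bytes are assumed to form one number.

```python
# The same four bytes, read under three different assumptions.
raw = bytes([0x41, 0x65, 0x6F, 0x6F])

# Assumption 1: Latin-2 text, one byte per character.
as_text = raw.decode("iso8859_2")

# Assumption 2: numbers of one byte each.
as_bytes = list(raw)

# Assumption 3: numbers of two bytes each. "big" is Python's name for the
# order in which the place values increase from right to left.
as_words = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]

print(as_text)   # Aeoo
print(as_bytes)  # [65, 101, 111, 111]
print(as_words)  # [16741, 28527]
```

Nothing in the bytes themselves says which of the three readings is intended, which is why a file format has to state it.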

It is best if the beginner thinks over all the issues listed above and tries to build his or her own toy-DICOM file format alone. To make the task simpler, let me define all the types that will be used in this introductory section:

• Patient Name: A character string.
• A file identifier of alphanumeric characters.
• Numeric values to encode the grey levels of an image. (The numeric values will be given as words.)
• The image. (Our simplified toy-DICOM file will contain only one image. Also, the image will contain only 25 pixels: 5 rows and 5 columns.)

We now have four different data types: the word, the image, the patient name, and the file identifier. Before introducing the symbols for the different data types, let us not forget about the interpretation of the words describing the numeric values that belong to the different grey levels. It seems practical to start our private toy-DICOM files with a signal telling which numeric interpretation will be used. Let the byte 00 signal that the place values increase from right to left, and let the byte FF signal the opposite direction. For human consumption, let 00 and FF be denoted by LittleEndian and BigEndian, respectively. Without introducing further rules, let us agree that we will use the Latin-2 character set only.

So far we have realized that a DICOM file may have a so-called header containing information about the whole file. Indeed, the header is a very important part of real DICOM files.

Having agreed on all that, let us denote the start of the patient name by 01, and let the symbol of this sign for human consumption be PN. Let the introductory byte for the file identifier be 02, and for the image let it be 03; the corresponding "edible" symbols are UID and Image. There is one more thing to agree on. Let each introductory symbol (signaling the beginning of a sequence of special data) be followed by one byte telling the length of the following data, in the units we agreed on: bytes in the case of the PN and the UID, and words for the pixels.
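The rule "introductory byte, then one length byte, then the data" can be sketched in Python as follows. The helper names `encode_text_record` and `encode_image_record` are hypothetical, and the sketch assumes that a word is two bytes long. The sample values are shortened for illustration.

```python
# Introductory bytes as agreed in the text.
PN, UID, IMAGE = 0x01, 0x02, 0x03


def encode_text_record(marker, text):
    """Hypothetical helper: marker byte, length in bytes, Latin-2 payload."""
    payload = text.encode("iso8859_2")
    if len(payload) > 255:
        raise ValueError("the length must fit into one byte")
    return bytes([marker, len(payload)]) + payload


def encode_image_record(pixels, byteorder="big"):
    """Hypothetical helper: marker byte, length in words, two-byte words.

    "big" is Python's name for the order in which the place values
    increase from right to left (signaled by the byte 00 in the toy format).
    """
    if len(pixels) > 255:
        raise ValueError("the length must fit into one byte")
    payload = b"".join(value.to_bytes(2, byteorder) for value in pixels)
    return bytes([IMAGE, len(pixels)]) + payload


name_record = encode_text_record(PN, "Aeo")
# Only four pixels here to keep the output short; a toy file always has 25.
image_record = encode_image_record([0, 1, 10, 11])

print(name_record.hex(" "))   # 01 03 41 65 6f
print(image_record.hex(" "))  # 03 04 00 00 00 01 00 0a 00 0b
```

Because the length byte counts units rather than bytes, the image record above announces 4 although 8 bytes of pixel data follow.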

Before giving a sample file to decode let us create a table to summarize what we have already said about our private file format.

| The byte | Symbolic sign | For the human interpreter | Example for one such unit in hexadecimal form | Explanation |
|---|---|---|---|---|
| 00 | LittleEndian | The words in the file have to be read from right to left, that is, the place values increase from right to left. In our toy format the name "LittleEndian" refers to that order. | 10 2A | 16*256 + 42 = 4096 + 42 = 4138 |
| FF | BigEndian | The words in the file have to be read from left to right, that is, the place values increase from left to right. | 10 2A | 16 + 256*42 = 16 + 10752 = 10768 |
| 01 | PN | The beginning of the patient name. Note that the first byte following this sign tells the number of characters in the name, that is, it is not the byte giving the first letter of the name. | 010B 41656565656565 6F6F6F6F 0300010001 | 01 tells that the patient name comes; 0B tells that the number of characters in the patient name is 0B = 11. The next 11 bytes have to be interpreted (as we agreed) according to the Latin-2 character table. Since in Latin-2 41 = A, 65 = e, and 6F = o, the patient name of 11 characters is "Aeeeeeeoooo". We do not worry about the next byte, 03, because we know that the patient name contains only 11 characters. |
| 02 | UID | The beginning of the file identifier. Note that the first byte following this sign tells the number of characters in the identifier, that is, it is not the byte giving the first character of the UID. | 02023333 | The first 02 tells that the UID comes; the second 02 tells that it has only two bytes: 3333. Since in Latin-2 33 = 3, the file identifier is 33. |
| 03 | Image | The beginning of the part of the file that describes an image. Note that the first byte following this sign will be 19 = 25, because our images are always of 5x5 pixels. | 0319 0000 0000 0000 0000 0000 0001 0001 0001 0001 0001 000A 000A 000A 000A 000A 000B 000B 000B 000B 000B 0000 0000 0000 0000 0000 | 03 tells that the image comes. 19 means 25, that is, the image contains 25 pixels whose values are given in words. We do not know how to interpret the words yet, since we have not given the first byte of the file. |

Note that these two names are private labels of our toy format. In conventional usage, the order in which 10 2A means 4138 (most significant byte first) is called big-endian, and the opposite order is called little-endian.
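The two readings of the example word 10 2A can be checked with a few lines of Python. This is only a sketch of the arithmetic. Python's `int.from_bytes` calls the order "big" when the place values increase from right to left and "little" when they increase from left to right, so its names are the reverse of the toy format's labels.

```python
word = bytes([0x10, 0x2A])

# Place values increase from right to left (byte 00 in the toy format):
# 16*256 + 42
right_to_left = int.from_bytes(word, "big")

# Place values increase from left to right (byte FF in the toy format):
# 16 + 42*256
left_to_right = int.from_bytes(word, "little")

print(right_to_left)  # 4138
print(left_to_right)  # 10768
```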

Let us choose the usual method, that is, the one we use when we interpret decimal numbers: let the place values increase from right to left. As a result, the header of our file will be 00, or LittleEndian.
Now, here is the byte-level dump of our toy-DICOM file:

00 01 0B 41 65 65 65 65 65 65 6F 6F 6F 6F 02 02
33 33 03 19 00 00 00 00 00 00 00 00 00 00 00 01
00 01 00 01 00 01 00 01 00 0A 00 0A 00 0A 00 0A
00 0A 00 0B 00 0B 00 0B 00 0B 00 0B 00 00 00 00
00 00 00 00 00 00

First, let us translate this dump into a readable format. Note that the first part of the file gives information about the file as a whole; this is why this part will be called the MetaHeader. All the other information, except the image-related part, will be called the Header:

LittleEndian (lower place values to the right)

PN (Patient name 11 characters): Aeeeeeeoooo
UID (File identifier 2 characters): 33

Image: (Pixel values 25 words, 5x5 pixels):

0, 0, 0, 0, 0
1, 1, 1, 1, 1
10, 10, 10, 10, 10
11, 11, 11, 11, 11
0, 0, 0, 0, 0

End of the toy DICOM file.
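The translation above can also be done mechanically. The following is a minimal Python sketch of a decoder, not a definitive implementation. The function name `decode_toy_file` is hypothetical, a word is assumed to be two bytes, and the 70 input bytes are written out from the format rules for the content listed above (with the identifier 33 stored as the Latin-2 characters 33 33 and the pixel count 25 as the byte 19).

```python
DUMP = """
00 01 0B 41 65 65 65 65 65 65 6F 6F 6F 6F 02 02
33 33 03 19 00 00 00 00 00 00 00 00 00 00 00 01
00 01 00 01 00 01 00 01 00 0A 00 0A 00 0A 00 0A
00 0A 00 0B 00 0B 00 0B 00 0B 00 0B 00 00 00 00
00 00 00 00 00 00
"""


def decode_toy_file(data):
    """Hypothetical helper: decode a toy-DICOM byte string into a dict."""
    order = data[0]
    if order not in (0x00, 0xFF):
        raise ValueError("unknown byte-order signal")
    # Toy "LittleEndian" (00): place values increase from right to left,
    # which Python's int.from_bytes calls "big".
    byteorder = "big" if order == 0x00 else "little"
    result = {"order": "LittleEndian" if order == 0x00 else "BigEndian"}
    pos = 1
    while pos < len(data):
        marker, count = data[pos], data[pos + 1]
        pos += 2
        if marker in (0x01, 0x02):
            # PN and UID: the length is given in bytes (Latin-2 characters).
            key = "PN" if marker == 0x01 else "UID"
            result[key] = data[pos:pos + count].decode("iso8859_2")
            pos += count
        elif marker == 0x03:
            # Image: the length is given in two-byte words, 5 pixels per row.
            words = [
                int.from_bytes(data[pos + 2 * i:pos + 2 * i + 2], byteorder)
                for i in range(count)
            ]
            result["Image"] = [words[r:r + 5] for r in range(0, count, 5)]
            pos += 2 * count
        else:
            raise ValueError(f"unknown introductory byte {marker:02X}")
    return result


data = bytes.fromhex("".join(DUMP.split()))
decoded = decode_toy_file(data)
print(decoded["order"], decoded["PN"], decoded["UID"])
for row in decoded["Image"]:
    print(row)
```

The loop relies on the length byte to find the next introductory byte, which is why the 03 inside the pixel count or the payload is never mistaken for the start of an image.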

Now, suppose we have the following grey scale:

[Figure: grayscale]

Then our toy-DICOM file with identifier 33, belonging to patient Aeeeeeeoooo, stores the following image:

[Figure: grayscale image]