The QBNews                                                     Page  1
Volume 3, Number 2                                       June 29, 1992

----------------------------------------------------------------------
         D a t a   B A S I C s   a n d   F i l e   I / O
----------------------------------------------------------------------

File Types 100 by Richard Vannoy

This article introduces beginners in computer programming and the
QuickBASIC programming language to the basics of file creation, and
gives a brief overview of the types of files generally used in
computer programs.

First, we need a few definitions that describe how data is stored.

Definitions:

Field:  A particular type of information in a file. Common field
        names would be phone number, name, address, or date.

Record: The sum of the fields for one person, place or thing.

   Field 1: Name:     Richard   <----- Together, these three
   Field 2: Phone:    777-1212  <----- fields make one
   Field 3: Birthday: 04/26/60  <----- record.

Three types of files are most commonly used today: sequential, random
access and binary.

Sequential, as the name implies, means that data is written to the
file in sequence, one item after the other. If I write the names and
phone numbers of a few friends to a sequential file, it might look
like the line below. (The commas are called delimiters. They are put
in by you or the program to separate each piece of data.)

   Sam,777-5155,George,123-4567,Bill,323-1212

Notice that all of the information (fields and records) is slammed
together and that there is no way to tell where the name Bill starts
without reading the file from the beginning. To retrieve, or read, the
items, we read them from the beginning, sequentially, until we get to
the information desired. If Richard is the 100th name in the list, we
have to READ the first 99 names/phone numbers to get to it.

In a random access file, each field is defined with a specific length
and any space not used is filled with blanks. This makes every record
exactly the same size, like this:

   Name:  10 bytes  |Richard   |
   Phone:  8 bytes  |777-1212|

Now we have a record length of 18 bytes, no matter how long the data
is, so let's write the same info as above to a file:

   Sam       777-5155George    123-4567Bill      323-1212
   |                 |                 |

Note how a new record starts every 18 bytes, and that "unused" bytes
in the records are filled with spaces, so we can predict where every
record is going to start. And we don't need separators, since we know
exactly where each record and each field starts. Not only that, if we
know that Richard's info is in the 100th record, we can go directly to
it, since the record length is constant. Because of this
predictability, which translates into SPEED when it is time to find
information, random access files are well suited to storing and
retrieving large quantities of data.

These are the two most common storage techniques, but there are many
more! One, called Indexed Sequential Access Method (ISAM), is stored
somewhat like a sequential file but is accessed through an indexing
system. This gives it one of the main advantages of sequential files
(packing info tightly) and also one of the main advantages of random
access files (FAST retrieval).

Binary files... Well, ALL files are binary files to the extent that
any DOS file can be opened in the binary mode. By binary, we generally
mean we want the ability to see, get or write to any byte in the file.
In the examples above, if we wanted to know the three digit prefix of
Bill's phone number, with both sequential and random access we would
have to read in the whole number and pull out the first three digits.
With binary, we could go right to the applicable bytes and grab just
the 323 prefix. Another common use of binary files is when we want to
open a machine language (EXE, COM) file and perhaps correct or change
just a few bytes.
Also, if you have no idea what is in a file, opening it in binary
lets you look around more easily and snoop through the file to work
out its contents or its field/record layout.

Opening any of these types of files is handled with the OPEN command.
Check your QuickBASIC reference, which will expand on the use and
syntax of the following examples.

   OPEN "DATA.FIL" FOR INPUT AS #1

This opens a file for INPUT only. You can't write to or change the
file contents. You can then read the information one variable at a
time or one line at a time.

   OPEN "DATA.FIL" FOR OUTPUT AS #1

This opens a file for output. It creates a new file, replacing any
existing file of the same name, and allows you to write information
into it.

   OPEN "DATA.FIL" FOR APPEND AS #1

The APPEND mode does not overwrite an existing file. It allows you to
add information on to the end of the file, and it creates the file
only if it does not already exist.

   OPEN "DATA.DBF" FOR RANDOM AS #3

This allows you to define (with the TYPE statement) the type and
length of the fields you want in your data file, such as:

   TYPE Information
      firstName AS STRING * 15
      lastName  AS STRING * 20
      age       AS INTEGER
      salary    AS SINGLE
   END TYPE

The TYPE defines the fields and sets the proper size so you don't have
to keep track of the byte counts. In practice you also add a LEN =
clause to the OPEN statement so that the record length matches the
TYPE (41 bytes here); without it, QuickBASIC assumes 128-byte records.

   OPEN "FIND.EXE" FOR BINARY AS #4

Now you are telling the system to open the file and allow you to
retrieve, see or write any information, from one byte to many bytes.

Each file type has its preferred uses. Data bases, where there are
many entries such as customers, employees, or items, typically use
RANDOM files. Applications where you just need to store and retrieve a
small number of items generally use SEQUENTIAL files in the INPUT or
OUTPUT mode. BINARY files have many special uses such as overlays,
graphic image files and other specialized applications where binary
data (as opposed to text or ASCII) is stored.

I hope this basic introduction will give you some insight into the
available file types and their uses.

Have fun programming!
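
To try out the three file types described in this article, here is a
minimal QuickBASIC sketch. It is an illustration only, not a finished
program: the file names FRIENDS.SEQ and FRIENDS.DAT, the type name
FriendRec and the variable names are made up for this example. It
writes the three friends to a sequential file, copies them into
18-byte random access records, jumps straight to record 3, and then
uses binary mode to grab just the prefix of Bill's phone number.

```basic
' Sketch only: FRIENDS.SEQ, FRIENDS.DAT and FriendRec are hypothetical
' names chosen for this example.
TYPE FriendRec
    friendName AS STRING * 10      ' 10 bytes, padded with spaces
    phone AS STRING * 8            '  8 bytes -> 18-byte record
END TYPE

DIM rec AS FriendRec
DIM n AS STRING, p AS STRING
DIM recNum AS INTEGER
DIM prefix AS STRING * 3

' Sequential: WRITE # quotes each string and separates items with commas.
OPEN "FRIENDS.SEQ" FOR OUTPUT AS #1
WRITE #1, "Sam", "777-5155"
WRITE #1, "George", "123-4567"
WRITE #1, "Bill", "323-1212"
CLOSE #1

' Random access: read the sequential file from the top and store each
' name/phone pair in its own fixed-length record.
OPEN "FRIENDS.SEQ" FOR INPUT AS #1
OPEN "FRIENDS.DAT" FOR RANDOM AS #2 LEN = LEN(rec)
recNum = 0
DO WHILE NOT EOF(1)
    INPUT #1, n, p
    rec.friendName = n             ' assignment pads to 10 bytes
    rec.phone = p
    recNum = recNum + 1
    PUT #2, recNum, rec
LOOP
CLOSE #1

' Go directly to record 3 without reading records 1 and 2.
GET #2, 3, rec
PRINT RTRIM$(rec.friendName); " "; rec.phone
CLOSE #2

' Binary: record 3 starts at byte (3 - 1) * 18 + 1 = 37, and the phone
' field starts 10 bytes after that, at byte 47.
OPEN "FRIENDS.DAT" FOR BINARY AS #3
GET #3, 47, prefix                 ' reads exactly 3 bytes
PRINT prefix
CLOSE #3
```

Run as written, this should print Bill's record followed by the 323
prefix. The byte arithmetic in the last step is the same calculation
QuickBASIC performs for you when you ask for record 3 of a RANDOM
file, which is why constant record lengths make lookups so quick.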