Searching Folders

Zamaster
Veteran
Posts: 174
Joined: Wed Jun 15, 2005 1:51 pm

Post by Zamaster »

How would I go about creating a program that searches every folder in a certain path and can then look at the files in each folder? Like a virus search program. So, it'd look in a path like "C:\" and look through all folders and look at the files in its current folder. Even better, let the user specify a path.
C:\DOS
C:\DOS\RUN
RUN\DOS\RUN
bungytheworm
Veteran
Posts: 288
Joined: Sat Feb 18, 2006 4:02 pm

Post by bungytheworm »

If you're in DOS (I guess this works in the XP console too, not sure), I would start with

Code: Select all

shell "DIR/S/B/AD > tree.txt"
It lists all directories in tree.txt

tree.txt will look like...
C:\DOS
C:\DOS\BACKUP
C:\QBASIC
C:\QBASIC\PROJECTS
C:\QBASIC\FINISHED
C:\GAMES
C:\GAMES\TESTDRIVE
...
...
Checking a certain directory (and its subdirectories) is then pretty easy.
Just read tree.txt in a DO ... LOOP and pick out the lines whose path begins with "C:\DOS" or whatever you want.
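
As a rough sketch of that loop in QBasic: the C:\ starting point, the want$ variable and the extra check for a trailing backslash are assumptions added for illustration, not part of the post above.

```basic
' Sketch only: list C:\DOS and everything below it from tree.txt.
want$ = "C:\DOS"                        ' hypothetical directory of interest (upper case)
SHELL "DIR C:\ /S /B /AD > tree.txt"    ' /AD = directories only
OPEN "tree.txt" FOR INPUT AS #1
DO WHILE NOT EOF(1)
  LINE INPUT #1, d$
  u$ = UCASE$(d$)
  ' Match the directory itself or anything under it, but not C:\DOSBOX.
  IF u$ = want$ OR LEFT$(u$, LEN(want$) + 1) = want$ + "\" THEN PRINT d$
LOOP
CLOSE #1
```

Comparing against want$ + "\" rather than the bare prefix keeps similarly named directories out of the result.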
moneo
Veteran
Posts: 451
Joined: Tue Jun 28, 2005 7:00 pm
Location: Mexico City, Mexico

Post by moneo »

Zamaster wrote: How would I go about creating a program that searches every folder in a certain path and can then look at the files in each folder? Like a virus search program. So, it'd look in a path like "C:\" and look through all folders and look at the files in its current folder. Even better, let the user specify a path.
First of all, since this is a QB site, I assume the program you want to write will be in QB. If not, then forget the rest of this post.

You mention the word "folder", which is a Windows term. QB runs under DOS, and "folders" in DOS are called "directories". In DOS, both filenames and directories have the following format:
FFFFFFFF.EEE
where FFFFFFFF is a filename with a maximum of 8 characters, and EEE is a file-extension with a maximum of 3 characters.

Windows allows long filenames, where its filenames and folder names can exceed the DOS maximums.

So, if you are looking through a "folder" and encounter a filename or another folder that exceeds the DOS maximum, the QB or DOS commands will cause an error. In Lurah's example, he has a directory at C:\GAMES\TESTDRIVE.
This is fine for Windows, but TESTDRIVE has more than 8 characters and QB/DOS cannot handle it.

QB has only a few statements for handling directories, including CHDIR (change default directory). However, I don't suggest you use this for your program.

What I suggest is using the DIR command to give you a list of all the filenames in the directory structure that you want to process. Your program should do a SHELL command specifying the required DIR redirecting the output to a work file.

Then read the workfile, get each filename, skip filenames that are not valid for DOS, and process each of the files according to your requirements.
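
One way to do the "skip filenames that are not valid for DOS" step is sketched below. The Is83% function is a hypothetical helper, not something built into QB, and it only checks lengths, spaces and extra dots, not every character DOS forbids.

```basic
DECLARE FUNCTION Is83% (p$)

' Sketch only: test whether every part of a path fits the 8.3 pattern.
PRINT Is83%("C:\GAMES\TESTDRIVE")   ' 0 (false): TESTDRIVE has 9 characters
PRINT Is83%("C:\DOS\EDIT.COM")      ' -1 (true)
END

FUNCTION Is83% (p$)
  Is83% = 0                                   ' assume invalid until proven otherwise
  rest$ = p$
  IF MID$(rest$, 2, 1) = ":" THEN rest$ = MID$(rest$, 3)   ' drop the drive letter
  DO WHILE LEN(rest$) > 0
    s% = INSTR(rest$, "\")
    IF s% = 0 THEN
      part$ = rest$: rest$ = ""
    ELSE
      part$ = LEFT$(rest$, s% - 1): rest$ = MID$(rest$, s% + 1)
    END IF
    IF LEN(part$) > 0 THEN
      IF INSTR(part$, " ") > 0 THEN EXIT FUNCTION
      d% = INSTR(part$, ".")
      IF d% = 0 THEN
        IF LEN(part$) > 8 THEN EXIT FUNCTION
      ELSE
        IF d% = 1 OR d% > 9 THEN EXIT FUNCTION     ' base name empty or over 8 characters
        ext$ = MID$(part$, d% + 1)
        IF LEN(ext$) > 3 OR INSTR(ext$, ".") > 0 THEN EXIT FUNCTION
      END IF
    END IF
  LOOP
  Is83% = -1
END FUNCTION
```

In the reading loop you would call Is83%(f$) on each line of the work file and skip the line when it returns 0.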

Unfortunately, my Windows machine is down so I'm running on a Linux machine, and can't test the DIR command that you will need. Get on the MSDOS commandline and do:
DIR /?
and figure out how to get the list of all the directory+filenames that you need. Do some testing. This is the critical part of your solution. The DIR has to work. BTW, the DIR will also display all Windows' long file names.

Holler if you need any help.
*****
RyanKelly
Coder
Posts: 48
Joined: Sun Jan 22, 2006 6:40 pm
Contact:

Post by RyanKelly »

Zamaster, if you are not opposed to using CALL INTERRUPT, you can use a few DOS services to enumerate the contents of a directory.

You'll need to set the current directory to where you want to search, so you may want to save the directory your program started in and restore it when you're done.


You'll need to declare a RegType variable (see the details section of CALL INTERRUPT in the help system) for use with CALL INTERRUPT.

You'll also need to define a data type to use for the Disk Transfer Area (DTA).

Code: Select all

TYPE DTAType
  Reserved AS STRING * 21 'search state used by DOS;
                          'you don't need it right
                          'now
  Attrib AS STRING * 1    'attribute byte
  FTime AS INTEGER        'packed time of last write
  FDate AS INTEGER        'packed date of last write
  FSize AS LONG           'size of file in bytes (a 4-byte field)
  FName AS STRING * 13    'null-terminated 8.3 name (no underscores: QB rejects them in names)
END TYPE
Use int &h21 service &h1A to set the DTA (this is where the other services will return information)
http://www.ctyme.com/intr/rb-2589.htm

Then use int &h21 service &h4E (Find First), to find the first file in the directory, and service &h4F (Find Next) to step through the directory one file at a time. With Find First, you'll specify what sort of files you're looking for (archives, hidden, directories) and the file name pattern (*.*, *.exe, etc).
http://www.ctyme.com/intr/rb-2977.htm
http://www.ctyme.com/intr/rb-2979.htm

So the scheme goes like this,

Change to the directory you want to search.
Set the DTA
Fill out the RegType variable with the search information.
Find the first match with the Find First service.
Store the file name returned in the DTA somewhere else.
Find and store each of the remaining files with the Find Next service.
Restore the current working directory.
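
Here is a minimal sketch of that scheme, assuming QuickBASIC 4.5 started with the QB.QLB library (QB /L), since plain QBasic 1.1 has no CALL INTERRUPT. It uses INTERRUPTX and RegTypeX from QB.BI because the DS register has to be set, searches whatever the current directory is instead of changing it, and saves and restores the old DTA with service &H2F, which is an extra precaution not mentioned above. The variable names are made up for illustration.

```basic
REM $INCLUDE: 'QB.BI'

TYPE DTAType
  Reserved AS STRING * 21
  Attrib AS STRING * 1
  FTime AS INTEGER
  FDate AS INTEGER
  FSize AS LONG
  FName AS STRING * 13
END TYPE

DIM regs AS RegTypeX
DIM dta AS DTAType
DIM spec AS STRING * 13

spec = "*.*" + CHR$(0)              ' ASCIIZ search pattern

' Service &H2F: get the current DTA address (returned in ES:BX).
regs.ax = &H2F00: regs.ds = -1: regs.es = -1
CALL INTERRUPTX(&H21, regs, regs)
oldSeg% = regs.es: oldOff% = regs.bx

' Service &H1A: point the DTA at our own record (DS:DX).
regs.ax = &H1A00
regs.ds = VARSEG(dta): regs.dx = VARPTR(dta)
CALL INTERRUPTX(&H21, regs, regs)

' Service &H4E: Find First. CX = attribute mask, DS:DX = pattern.
regs.ax = &H4E00
regs.cx = &H10                      ' normal files plus directories
regs.ds = VARSEG(spec): regs.dx = VARPTR(spec)
CALL INTERRUPTX(&H21, regs, regs)

DO WHILE (regs.flags AND 1) = 0     ' carry flag clear means a match was found
  n% = INSTR(dta.FName, CHR$(0))
  IF (ASC(dta.Attrib) AND &H10) THEN PRINT "<DIR> ";
  IF n% > 0 THEN PRINT LEFT$(dta.FName, n% - 1); dta.FSize
  regs.ax = &H4F00                  ' service &H4F: Find Next
  CALL INTERRUPTX(&H21, regs, regs)
LOOP

' Put the original DTA back.
regs.ax = &H1A00: regs.ds = oldSeg%: regs.dx = oldOff%
CALL INTERRUPTX(&H21, regs, regs)
```

To walk subdirectories you would store each directory name found (skipping "." and ".."), then CHDIR into it and repeat the search.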
Zim
Veteran
Posts: 98
Joined: Mon Dec 05, 2005 4:31 pm
Location: Wisconsin, USA
Contact:

Post by Zim »

Here's the online help from DIR/? from Win XP:

Code: Select all

Displays a list of files and subdirectories in a directory.

DIR [drive:][path][filename] [/A[[:]attributes]] [/B] [/C] [/D] [/L] [/N]
  [/O[[:]sortorder]] [/P] [/Q] [/S] [/T[[:]timefield]] [/W] [/X] [/4]

  [drive:][path][filename]
              Specifies drive, directory, and/or files to list.

  /A          Displays files with specified attributes.
  attributes   D  Directories                R  Read-only files
               H  Hidden files               A  Files ready for archiving
               S  System files               -  Prefix meaning not
  /B          Uses bare format (no heading information or summary).
  /C          Display the thousand separator in file sizes.  This is the
              default.  Use /-C to disable display of separator.
  /D          Same as wide but files are list sorted by column.
  /L          Uses lowercase.
  /N          New long list format where filenames are on the far right.
  /O          List by files in sorted order.
  sortorder    N  By name (alphabetic)       S  By size (smallest first)
               E  By extension (alphabetic)  D  By date/time (oldest first)
               G  Group directories first    -  Prefix to reverse order
  /P          Pauses after each screenful of information.
  /Q          Display the owner of the file.
  /S          Displays files in specified directory and all subdirectories.
  /T          Controls which time field displayed or used for sorting
  timefield   C  Creation
              A  Last Access
              W  Last Written
  /W          Uses wide list format.
  /X          This displays the short names generated for non-8dot3 file
              names.  The format is that of /N with the short name inserted
              before the long name. If no short name is present, blanks are
              displayed in its place.
  /4          Displays four-digit years

Switches may be preset in the DIRCMD environment variable.  Override
preset switches by prefixing any switch with - (hyphen)--for example, /-
You can try some of these options and see what data goes into which fields; then you can read in the lines and glean all kinds of directory info. I'd suggest trying several different combinations of command switches and sending the output to a file. Then check it with a text editor (Edit or QB) to see which columns the various data start in, so you know how to parse it.

Note that DIR /-N gives output much like DOS 6 and earlier. (Minus before a switch turns it off, for some switches.)
--- Zim ---
--- Time flies like an arrow, but fruit flies like a banana ---
Zamaster
Veteran
Posts: 174
Joined: Wed Jun 15, 2005 1:51 pm

Post by Zamaster »

Wow, thanks a lot guys! I have a question about the CALL INTERRUPT method mentioned by RyanKelly, errrr.... could ya post some code for me? I'm very unfamiliar with the CALL INTERRUPT statement and I'm not sure how to implement it. Thanks to moneo for the help with DIR. I tried that and it works fine! I'm assuming, however, that the CALL INTERRUPT method is faster. Anybody know how to use it?
moneo
Veteran
Posts: 451
Joined: Tue Jun 28, 2005 7:00 pm
Location: Mexico City, Mexico

Post by moneo »

Zamaster,

By now you've figured out that there are many ways to get to the directories and filenames.

Lurah's way gets you the names of all the directories. Then, for each directory, you have to do another SHELL with a DIR to get its filenames. You also need to figure out how to handle Windows long filenames and directory names.

RyanKelly's approach is the most sophisticated, but it also requires a deeper understanding and much more testing. You'll find yourself spending more time on this than on the rest of the program. As far as speed is concerned, this approach is faster than doing a SHELL with a DIR. However, since you want to read every single file, the total program time will be about the same.

The approach which I've used many times is using functions from the QuickPak Professional Library.

What I recommend to you is to do some testing of the DIR, like Zim said, until you find the right combination that gives you the file that you want to process. The /X switch looks interesting, but since I can't test, I can't see what format it gives. You need something like the /X to convert long filenames. Having to do this in the program would be a real pain.

If we didn't have to worry about long filenames, the following DIR would give you exactly what you need; that is, a list of every filename prefixed by its full path.

Code: Select all

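startpath$ = "c:\xxxx"   'xxxx is the starting path you want
'/S = include subdirectories, /B = bare full paths, /A-D = files only
SHELL "DIR " + startpath$ + " /S /B /A-D > d:\listfile"
OPEN "d:\listfile" FOR INPUT AS #1
DO WHILE NOT EOF(1)
     LINE INPUT #1, f$
     OPEN f$ FOR INPUT AS #2
     GOSUB PROCESSF
     CLOSE #2
LOOP
CLOSE #1
SYSTEM

PROCESSF:
     'process the file that is open as #2 here
RETURN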
Notice that I put the listfile on the d: drive to keep out of the way.
The PROCESSF subroutine is where you process each file according to your requirements. If you intend to write to the same file, you'd better make sure somehow that it's not a read-only file. This program of yours is full of loopholes.
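
One of those loopholes is that OPEN stops the program with an error on any name it cannot open, for example a long filename or a file that is in use. A minimal sketch of trapping that with ON ERROR follows; the C:\DOS starting path, the C:\LISTFILE.TMP work file and the ok% flag are assumptions chosen for illustration.

```basic
' Sketch only: skip files that cannot be opened instead of crashing.
startpath$ = "C:\DOS"                  ' hypothetical starting path
SHELL "DIR " + startpath$ + " /S /B /A-D > C:\LISTFILE.TMP"
OPEN "C:\LISTFILE.TMP" FOR INPUT AS #1
ON ERROR GOTO OpenFailed               ' trap run-time errors from here on
DO WHILE NOT EOF(1)
  LINE INPUT #1, f$
  ok% = -1
  OPEN f$ FOR INPUT AS #2              ' on failure the handler clears ok%
  IF ok% THEN
    PRINT "Opened:  "; f$              ' real processing would go here
    CLOSE #2
  ELSE
    PRINT "Skipped: "; f$
  END IF
LOOP
ON ERROR GOTO 0                        ' turn trapping off again
CLOSE #1
END

OpenFailed:
  ok% = 0
  RESUME NEXT
```

This runs as is inside the QB editor. As far as I know, a compiled EXE needs the /X switch on BC for RESUME NEXT to work.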
*****
Zamaster
Veteran
Posts: 174
Joined: Wed Jun 15, 2005 1:51 pm

Post by Zamaster »

awesome awesome! Thanks a bunch. I guess FreeBASIC will let me look up long file names... at least I hope so.
Seb McClouth
Veteran
Posts: 342
Joined: Wed Nov 09, 2005 7:47 am
Location: Inside the Matrix...
Contact:

Post by Seb McClouth »

If it doesn't, there's QBsource somewhere around on the net which is able to show long file names.

good luck!
QBinux is a Linux distribution with the aim of integrating the work of the vast community of free software developers at Pete's QBASIC Site in order to create a modern, performant, safe and easy to use system for system administrators and desktop users.
RyanKelly
Coder
Posts: 48
Joined: Sun Jan 22, 2006 6:40 pm
Contact:

Post by RyanKelly »

Zamaster, if I have time tomorrow I'll code an example to list the current directory using interrupts. DOS 7+ and the NT emulator provide services to retrieve a long file name once you have the DOS 8.3 alias. This requires a little more overhead if you want to write a robust program.

This technique is faster in terms of execution speed, since the DIR command uses the services I described to produce the formatted output, and it is cleaner from a disk usage perspective, but this may not be relevant. If you haven't had experience with using DOS interrupts there is a learning curve, but once you have the hang of it you'll find it no more challenging than parsing the output of the DIR command.

The difference between the two methods highlights a particular difference in programming perspective. The UNIX/LINUX approach values the use of existing tools. In the extreme case, you could write a program that takes a file name and path as a parameter and accomplish all your file system searching with DIR and GREP and bundle the entire thing in a batch file. The Microsoft (since Windows) approach leans towards tightly packaged operating system interfaces combined with highly automated programming environments. When DOS was still dominant, it was a "do it yourself" world.
In the end, it's a matter of taste.

The biggest upside of the SHELL "DIR" method is that it will work equally well with a FreeBASIC program, but with FreeBASIC you also have the option of using Win32 API functions that provide the same functionality as the DOS interrupt services.
Zamaster
Veteran
Posts: 174
Joined: Wed Jun 15, 2005 1:51 pm

Post by Zamaster »

Awesome. Thanks a lot, I'm looking into APIs for FB as we speak.