Tape Porting


OVERVIEW

Several old tapes (Exabyte and DAT) store earthquake/seismic data from the past. Both the tapes and the devices that read them are getting old, so we want to transfer the raw data they contain to a more convenient and modern storage medium (such as DVD or hard disk). We are currently porting them to hard disk on the shaker and hbseed machines. This project is being done for Jamison Steidl.

DATA TYPES AND STRUCTURING

  • The tapes that we will be porting store 2 kinds of data: TAR data and RAW data. A script (ddtape.csh, located in /local/ucsb/eqshells on fablio) automatically extracts the data from the tapes regardless of which kind is stored.
  • Tapes that contain RAW data are structured as 3 sections/files. The first file is the header, which contains information about the tape and other details that are not important for our purposes. The second file is also unimportant, as it is usually empty (0 bytes). The third file is the one we care about; it contains the data that we want to extract. (You can use the mt stat command to see which file of the tape you are currently at.) Even so, it doesn't hurt to get all three files from raw data tapes, and the ddtape.csh script should take care of this for you.
  • In general, the tapes are structured as follows: each tape contains a certain number of files (usually specified on the tape itself or on the tape case). Each file ends with an EOF marker (which can be seen with the mt stat command). After the last file, an EOT marker marks the end of the tape.
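
Once files from a tape are on disk, a quick way to tell TAR data from RAW data is to ask tar to list a file's contents. The following is a minimal sketch only; is_tar_archive is a made-up helper name and is not part of ddtape.csh. It assumes a tar that supports "tar tf" for listing, and some tar implementations also recognize other archive formats, so treat a "yes" as a hint rather than proof.

```shell
# Report whether a dumped file looks like a tar archive.
# Hypothetical helper for illustration; not part of ddtape.csh.
# Usage: is_tar_archive FILE
is_tar_archive() {
    # "tar tf" lists member names without extracting anything.
    # If no member name comes back, the file is treated as non-tar (RAW).
    [ -n "$(tar tf "$1" 2>/dev/null | head -n 1)" ]
}
```

For example, "is_tar_archive f01.tar && echo TAR || echo RAW" prints which kind the file appears to be.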

TAPE DRIVES

  • The 2 tape drives are located in Girvetz 1205F and are connected to the machines shaker and hbseed.

PROCEDURES

NOTE: The following procedures assume that tapes have already been loaded into the tape drives located in Girvetz 1205F.
IMPORTANT: you may get only 1 pass at reading the data from a tape, since the tapes are so old.

  1. Log into one of the machines (shaker or hbseed) using the ssh command.
  2. Change into the directory that corresponds to the machine name (i.e. /shaker or /hbseed).
    • you can use the df command to see the root directories and their free space
  3. cd into the appropriate directory. The directory name should reflect the earthquake/project/experiment that the tapes belong to, such as jt/ for the Joshua Tree earthquake or landers/ for the Landers earthquake.
    • if a directory does not exist for the project being ported from tape to hard disk, create one (NOTE: you may not have privileges to create directories. If so, ask Aaron to either create the new directory or give you the correct permissions)
  4. Make a new directory for the new tape with a name that follows the naming convention used (see the following note).
    • if you list the current directory's contents, you will see existing directories named T##X#. T is a literal letter, ## is the tape number (starting from 0 and increasing with each new tape that is ported), X is the machine that the tape was ported to (H for hbseed, S for shaker), and the final # is the attempt number: 0 for the first try, 1 if the first attempt was unsuccessful and a second attempt is made after cleaning the tape drive.
  5. cd into this new directory.
  6. Type ddtape.csh -status to check the current status of the tape drive. (If ddtape.csh is not recognized, then check with Aaron to make sure that your PATH is set correctly)
  7. Use ddtape.csh. This will automatically dump the data from the tape to the current directory on disk. (see the student support website for documentation or type ddtape.csh -help).
    NOTE: this process can take a LONG time. If you cannot wait for it to complete (e.g. you have to leave), you can halt it using CTRL-C. Use mt stat to get the current position on the tape (the file no.) and make a note of it. The next time you are able to, use mt bsf n and mt fsf n (see below for an explanation) together with mt stat to position the tape at the beginning of the file that was interrupted (make sure the block no. for that file no. is zero), and then use ddtape.csh -start filenum to continue reading the tape from where you left off.
  8. Check to see if the data transfer was successful. You can do this by cd'ing into the tape directory that you just ported (if not already in it), making a temp/ directory inside the tape directory, copying any of the f##.tar files to temp/, and untarring it (see below). If you get a checksum error, then try untarring another file. Another checksum error most likely means that the data is lost. Don't forget to remove the temp/ directory.
    NOTE: since some of the tar files are large, you can interrupt untar'ing by using CTRL-C. All we want to see is if data was extracted properly.
    • If the tape transfer was successful, use ddtape.csh -eject (see ddtape.csh -help). Add a green sticker to the tape case and label it with the corresponding directory name (e.g. T03H0 or T12S0). Store the tape with the others that are finished.
    • If the tape porting failed, use ddtape.csh -eject, get the cleaning tape from Aaron, and insert it into the same tape drive. After the drive has been cleaned, try reading the same tape again, this time storing its contents in the directory T##X1 (## is the same tape number as before, X is H for hbseed or S for shaker, and the final 1 marks the second attempt). If it fails again, use ddtape.csh -eject, put a red sticker on the tape, and label it with the corresponding directory names. If it was successful, follow the successful-transfer instructions in step #8. Store the tape with the other finished ones.
    NOTE: as a last resort, try reading the bad tape in the other tape drive. Create a directory for it on the corresponding machine, such as T##X1, where X is H for hbseed or S for shaker.
  9. Go back to step #3 and repeat the process for each new tape.
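
The T##X# naming convention from step 4 can be sketched as a small shell function. This is only an illustration: tape_dir_name and its arguments are made-up names, not part of ddtape.csh, and the tape number is assumed to be a plain decimal number without leading zeros.

```shell
# Build a tape directory name of the form T##X# (see step 4).
# Hypothetical helper for illustration; not part of ddtape.csh.
# Usage: tape_dir_name TAPE_NUMBER MACHINE ATTEMPT
#   TAPE_NUMBER  decimal tape number without leading zeros (0, 1, 2, ...)
#   MACHINE      hbseed or shaker
#   ATTEMPT      0 for the first try, 1 for the retry after cleaning
tape_dir_name() (
    case "$2" in
        hbseed) letter=H ;;
        shaker) letter=S ;;
        *) echo "unknown machine: $2" >&2; exit 1 ;;
    esac
    # %02d pads the tape number to two digits, e.g. 3 becomes 03
    printf 'T%02d%s%d\n' "$1" "$letter" "$3"
)
```

For example, mkdir "$(tape_dir_name 3 hbseed 0)" would create the directory T03H0.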

UN-TARRING THE EXTRACTED DATA (for reading the actual data)

  • Once the data has been transferred from tape to hard disk, it needs to be untarred, since it was extracted in tar format. Use the command tar xvf filename. If you get a "tar: directory checksum error", the data is corrupt and has been lost.
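
The test extraction described in step 8 (untar one file into a temporary directory, then remove that directory) could be wrapped up roughly as follows. This is a sketch under assumptions: test_untar is a made-up name, mktemp is assumed to be available, and the scratch directory is created next to the archive so that large files do not fill up another file system.

```shell
# Test-extract one archive into a scratch directory, then clean up.
# Hypothetical helper for illustration; not part of ddtape.csh.
# Usage: test_untar ARCHIVE      (for example: test_untar f01.tar)
# Exit status is 0 only if tar extracted the archive without an error.
test_untar() (
    archive=$1
    # Scratch directory beside the archive; plays the role of temp/ in step 8
    scratch=$(mktemp -d "$(dirname "$archive")/temp.XXXXXX") || exit 1
    if tar xf "$archive" -C "$scratch"; then
        status=0
    else
        status=1    # e.g. tar reported a checksum error
    fi
    rm -rf "$scratch"    # always remove the scratch directory
    exit "$status"
)
```

Any error message from tar is still printed, so a checksum error remains visible on the terminal.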

LIST OF USEFUL COMMANDS

df - displays free disk space and usage
du -sk * - displays disk usage (in units of 1024 bytes)
tar xvf filename - extract the data from the .tar file
ddtape.csh [-help] - a shell script that automatically reads and copies the tape data from beginning to end
mt stat (or ddtape.csh -status) - print current status of tape (location, etc.)
mt fsf n - skip forward n files
mt bsf n - skip backward n files
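
Before resuming an interrupted read with ddtape.csh -start filenum, the tape must be at block zero of that file. The sketch below checks this from the text printed by mt stat. It assumes the status text contains a line of the form "file no= 3   block no= 0", which is only a guess based on the "file no" and "block no" wording used on this page; adjust the patterns if your mt prints something different. at_file_start is a made-up helper name.

```shell
# Succeed only if the tape is at block 0 of the expected file.
# Reads mt status text on standard input.
# Hypothetical helper; the "file no= N   block no= M" format is an assumption.
# Usage: mt stat | at_file_start FILENUM
at_file_start() (
    status_text=$(cat)
    # Pull out the number that follows "file no=" and "block no="
    file_no=$(printf '%s\n' "$status_text" |
        sed -n 's/.*file no= *\([0-9][0-9]*\).*/\1/p' | head -n 1)
    block_no=$(printf '%s\n' "$status_text" |
        sed -n 's/.*block no= *\([0-9][0-9]*\).*/\1/p' | head -n 1)
    [ "$file_no" = "$1" ] && [ "$block_no" = "0" ]
)
```

For example, "mt stat | at_file_start 3 && ddtape.csh -start 3" would resume only when the tape is positioned at the start of file 3.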



created by Joe Mount 2005-08-25
modified by Joe Mount 2005-09-08