I've been building a retro computer, and it's gotten me interested in using cassettes as data storage. This poses an interesting challenge where binary information has to be converted into something that can be written to,
and reliably read from, a cassette. We have to worry about immunity to noise (tape hiss), speed fluctuations (wow/flutter), and amplitude fluctuations (dropout).
Another limitation is frequency response. Our signal has to stay safely within the range of frequencies a tape can reproduce. This range can be as narrow as 400-4,000Hz for something like a microcassette. We could send raw bits as high and low levels at a safe 2kHz, but what if we then have a very long run of all zeros (or ones)? The output would sit at one level, our signal would dip below 400Hz, and our data would be lost.
[Figure: Frequency response of my Pearlcorder L400 microcassette recorder]
One solution is to toggle our output at least once per bit. At 2,000 bits per second, two bits then make up one full cycle, which guarantees a minimum frequency of 1kHz. An additional toggle in the middle of a bit can represent a zero, and its absence a one. If every bit had the additional toggle, it would yield the maximum frequency of 2kHz. This is the basis of Differential Manchester encoding.
[Figure: Differential Manchester encoding (Wikipedia)]
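The toggling rule can be sketched in a few lines of Python. This is only an illustration: the function name and the choice to represent the signal as a list of half-bit levels are my own assumptions, not part of any standard API.

```python
def diff_manchester_encode(bits, level=False):
    """Return a list of half-bit levels (True = high, False = low).

    Convention from the text: every bit starts with a toggle; a zero adds
    a second toggle in the middle of the bit, a one does not.
    """
    halves = []
    for bit in bits:
        level = not level      # mandatory toggle at the start of each bit
        halves.append(level)   # first half of the bit period
        if not bit:
            level = not level  # extra mid-bit toggle marks a zero
        halves.append(level)   # second half of the bit period
    return halves


print(diff_manchester_encode([1, 0, 1, 1]))
# [True, True, False, True, False, False, True, True]
```

Each bit becomes two half-bit levels, so a one shows up as a repeated level and a zero as two different levels.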
Besides fitting nicely in a frequency range, Differential Manchester has other advantages. Cassette recorders rarely concern themselves with the polarity of their signal (since it doesn't affect the sound) and will sometimes invert their output relative to their input. Differential Manchester encoding only uses the presence of these "toggles" or edges, and is unaffected by being inverted.
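To see the polarity point concretely, here is a small self-contained Python sketch. The helper names `encode` and `decode` are hypothetical, and the decoder assumes it already knows where each bit starts, which a real receiver has to work out from timing.

```python
def encode(bits, level=False):
    """Differential Manchester: toggle at every bit start, extra toggle for a zero."""
    halves = []
    for bit in bits:
        level = not level
        halves.append(level)
        if not bit:
            level = not level
        halves.append(level)
    return halves


def decode(halves):
    """A bit is a one if its two halves match (no mid-bit toggle), else a zero."""
    return [1 if halves[i] == halves[i + 1] else 0
            for i in range(0, len(halves), 2)]


bits = [0, 1, 1, 0, 1, 0, 0, 1]
signal = encode(bits)
inverted = [not h for h in signal]  # what a polarity-flipping recorder would output

print(decode(signal) == bits, decode(inverted) == bits)
# True True
```

The decoder only asks whether the level changed mid-bit, so flipping every level leaves the answer untouched.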
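Also, the signal spends an equal amount of time high and low on average: a zero splits its bit period evenly, and consecutive ones alternate between high and low. This means we have no DC offset. If the offset were irregular, our signal would drift up and down, making decoding more difficult.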
It's fairly resilient in the face of speed warbles too, as we have an octave separating our ones and zeros. In other words, zero is always twice as fast as one. For comparison, one early modem standard used 1300Hz and 1700Hz for one and zero respectively.
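A rough back-of-the-envelope comparison, as a Python sketch. It assumes a simplified decoder that classifies each interval against a fixed threshold halfway between the two nominal intervals. Real FSK demodulators usually work differently, so treat the second figure as illustrative only. The function name is hypothetical.

```python
def speed_tolerance(short_interval, long_interval):
    """Return (slowest, fastest) playback-speed factors that still decode.

    Assumes a fixed threshold halfway between the two nominal intervals.
    Playing the tape k times faster divides every interval by k.
    """
    threshold = (short_interval + long_interval) / 2
    slowest = short_interval / threshold  # any slower and a short interval looks long
    fastest = long_interval / threshold   # any faster and a long interval looks short
    return slowest, fastest


# Differential Manchester: a zero's half-bit interval vs a one's full-bit interval.
print(speed_tolerance(0.5, 1.0))
# The 1300Hz / 1700Hz tone pair, compared by period (1/f).
print(speed_tolerance(1 / 1700, 1 / 1300))
```

Under these assumptions the octave-spaced signal survives speed errors of roughly a third in either direction, while the closely spaced tone pair only tolerates about 13%.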
So, Differential Manchester encoding it is! This settles how we encode individual bits, but not how we structure our data. I chose to mimic the standard serial packet structure of "8n1". This means a start bit (zero), 8 data bits, no parity bit, and one stop bit (one). This makes it easy to figure out exactly how the data is aligned when receiving.
[Figure: 8n1 frame (Wikimedia Commons)]
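As a minimal sketch of the framing, here is how one byte could be wrapped and unwrapped in Python. The function names are hypothetical. The data bits go out most significant bit first to match the encoder script in this post (a standard UART sends the least significant bit first).

```python
def frame_byte(value):
    """Wrap one byte as 10 bits: start (0), 8 data bits MSB first, stop (1)."""
    data = [(value >> i) & 1 for i in range(7, -1, -1)]
    return [0] + data + [1]


def unframe(bits):
    """Check the start/stop bits of a 10-bit frame and return the byte."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    value = 0
    for b in bits[1:9]:
        value = (value << 1) | b
    return value


print(frame_byte(0x41))
# [0, 0, 1, 0, 0, 0, 0, 0, 1, 1]
print(hex(unframe(frame_byte(0x41))))
# 0x41
```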
I opted to add a calibration tone to the beginning of my files. This gives the receiver time to detect the amplitude, and more importantly, the frequency of the signal. This tone is simply a long string of ones. The starting bit (zero) of the first byte signifies the end of the tone.
I've written a Python script that takes a binary file and outputs a Manchester-encoded audio file that can be recorded directly onto a cassette.
Python encoder:
import os
import sys

smplrate = 32000  # audio samples per second
baud = 3200       # bits per second


def out_bit(bit):
    """Append one Differential Manchester bit to buf."""
    global bit_status
    bit_status = not bit_status  # every bit starts with a toggle
    for x in range(2):  # two half-bit periods
        for y in range(smplrate // baud // 2):
            buf.append(0xD8 if bit_status else 0x28)
        if x == 0 and not bit:
            bit_status = not bit_status  # extra mid-bit toggle marks a zero


def out_byte(byte):
    """Emit one 8n1 frame: start bit (0), 8 data bits MSB first, stop bit (1)."""
    out_bit(0)
    for i in range(8):
        out_bit(bool(byte & (1 << 7)))
        byte <<= 1
    out_bit(1)


if len(sys.argv) < 2:
    print("Input file needed")
    sys.exit(2)

with open(sys.argv[1], 'rb') as fi:
    file = bytearray(fi.read())

buf = []
bit_status = False
checksum = 0

buf.extend([0x80] * smplrate)  # one second of silence
for i in range(256):           # calibration tone: a long run of ones
    out_bit(1)
for byte in file:
    checksum += byte
    out_byte(byte)
out_byte(checksum & 0xFF)      # only the low 8 bits of the sum are sent
buf.extend([0x80] * smplrate)  # one second of silence

# Write the samples as an 8-bit mono PCM WAV file
with open(os.path.splitext(sys.argv[1])[0] + ".wav", 'wb') as fo:
    fo.write(b"RIFF")
    fo.write((len(buf) + 36).to_bytes(4, byteorder='little'))  # RIFF chunk size
    fo.write(b"WAVEfmt ")
    fo.write((16).to_bytes(4, byteorder='little'))        # fmt chunk size
    fo.write((1).to_bytes(2, byteorder='little'))         # format: PCM
    fo.write((1).to_bytes(2, byteorder='little'))         # channels: mono
    fo.write((smplrate).to_bytes(4, byteorder='little'))  # sample rate
    fo.write((smplrate).to_bytes(4, byteorder='little'))  # byte rate
    fo.write((1).to_bytes(2, byteorder='little'))         # block align
    fo.write((8).to_bytes(2, byteorder='little'))         # bits per sample
    fo.write(b"data")
    fo.write(len(buf).to_bytes(4, byteorder='little'))    # data chunk size
    fo.write(bytes(buf))
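Two small helpers for bookkeeping around the encoder, as a hedged sketch. The function names are hypothetical, and the numbers assume the encoder's settings above (3200 baud, a 256-bit calibration tone, one second of silence at each end, and a one-byte checksum holding the low 8 bits of the byte sum).

```python
def recording_seconds(n_bytes, baud=3200, tone_bits=256, silence_s=1.0):
    """Estimate the length of the generated audio for a file of n_bytes."""
    frames = n_bytes + 1                  # payload plus the checksum byte
    total_bits = tone_bits + 10 * frames  # each 8n1 frame is 10 bits
    return 2 * silence_s + total_bits / baud


def checksum_ok(received):
    """True if the last byte equals the low 8 bits of the sum of the rest."""
    if not received:
        return False
    *payload, check = received
    return (sum(payload) & 0xFF) == check


print(recording_seconds(1024))
# 5.283125
print(checksum_ok(b"AB" + bytes([(65 + 66) & 0xFF])))
# True
```

A receiver could run something like `checksum_ok` on the decoded bytes to catch corrupted transfers, although a simple additive checksum will miss some error patterns.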
Hardware Interface
Now that we can store data onto a tape, we need a way to read it back. First we'll focus on the hardware required to connect the cassette recorder to a computer.
[Figure: Tape to computer interface schematic]
To read from a tape, the audio is first high-pass filtered. This reduces potential DC offset and noise from motor rumble. The audio is then amplified, bringing the tape's ~1V line-level output closer to the 5V we want for the digital signal. The amplification stage also low-pass filters the audio, reducing some hiss and noise outside the range of our signal. Next, the audio is passed through a Schmitt trigger. This transforms the smooth audio into a rigid, digital signal by comparing it to two thresholds. If the audio signal goes above the high threshold (2.6V), the output is a digital one. If it goes below the low threshold (1.5V), the output is a zero. If the signal hangs out between the two, the output does not change. This provides some noise immunity: as long as the noise doesn't swing enough to push the signal over the wrong threshold, it is simply ignored.
[Figure: Schmitt trigger input (U) and thresholds (A, B) (Wikipedia)]
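The hysteresis behaviour is easy to model in software. This Python sketch is only a simulation of the idea, using the two threshold voltages quoted above. The function name and the sample values are made up for illustration.

```python
def schmitt_trigger(samples, low=1.5, high=2.6, state=False):
    """Convert analog voltages to digital levels with hysteresis."""
    out = []
    for v in samples:
        if v > high:
            state = True   # crossed the upper threshold: output goes high
        elif v < low:
            state = False  # crossed the lower threshold: output goes low
        # between the thresholds the previous state is held
        out.append(state)
    return out


print(schmitt_trigger([0.0, 2.0, 2.7, 2.0, 1.6, 1.4, 2.0]))
# [False, False, True, True, True, False, False]
```

Note how the 2.0V samples produce different outputs depending on which threshold was crossed last.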
Now we have a digital signal, but it's still Manchester encoded. I selected an Arduino to run a proof-of-concept decoding program. It takes in the digital signal from our interface board (via pin D2) and outputs the decoded bytes over serial.
To do this, it listens to part of the calibration tone and calculates the signal's timing. It uses this timing to discern ones from zeros. Three edges close together (two short intervals) count as a zero. Two edges far apart (one long interval) count as a one.
When it detects the first zero (start bit), it begins constructing and transmitting bytes.
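Here is a rough Python model of that timing logic, handy for experimenting away from the hardware. The function names and the 0.75 factor are illustrative assumptions (the Arduino sketch in this post divides the sum of 16 tone intervals by 21, which works out to about 0.76 of one interval).

```python
def calibrate(tone_intervals, factor=0.75):
    """Pick a threshold between short and long intervals from the tone's edges."""
    return factor * sum(tone_intervals) / len(tone_intervals)


def intervals_to_bits(intervals, threshold):
    """Turn edge-to-edge intervals into bits: one long = 1, two short = 0."""
    bits = []
    half_seen = False  # True after the first short interval of a zero
    for dt in intervals:
        if dt > threshold:
            if half_seen:
                raise ValueError("long interval in the middle of a zero")
            bits.append(1)
        elif half_seen:
            bits.append(0)
            half_seen = False
        else:
            half_seen = True
    return bits


threshold = calibrate([312] * 16)  # edge spacing in microseconds during the tone
print(intervals_to_bits([312, 312, 156, 156, 312, 156, 156], threshold))
# [1, 1, 0, 1, 0]
```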
It has the ability to detect and report framing errors (incorrect start/stop bit placement), and invalid edge patterns. It's unable to recover from these errors though. It would be possible to correct framing issues by buffering bits and searching for valid frames within the buffer.
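The buffering idea could look something like this in Python. It is an untested-on-hardware sketch of one possible approach, not what the decoder in this post does: it slides along a buffer of bits until several consecutive 10-bit frames all have a valid start and stop bit. The function name and the default of three frames are arbitrary choices.

```python
def find_frame_offset(bits, frames_to_check=3):
    """Return the first offset where consecutive 10-bit frames look valid.

    A frame is valid if it starts with 0 and ends with 1. Returns None if
    no alignment satisfies this for frames_to_check frames in a row.
    """
    needed = 10 * frames_to_check
    for offset in range(len(bits) - needed + 1):
        if all(bits[offset + 10 * k] == 0 and bits[offset + 10 * k + 9] == 1
               for k in range(frames_to_check)):
            return offset
    return None


def frame(value):
    """Helper for the demo: 8n1 frame with data bits MSB first."""
    return [0] + [(value >> i) & 1 for i in range(7, -1, -1)] + [1]


stream = [1, 1, 1] + frame(0x41) + frame(0x42) + frame(0x43)
print(find_frame_offset(stream))
# 3
```

Checking more frames lowers the chance of locking onto data bits that merely look like start and stop bits, at the cost of needing a longer buffer.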
Arduino Decoder:
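const byte pulsePin = 2;  // digital signal from the interface board (D2)

int byte_count = 0;
uint32_t last_ts = 0;      // timestamp of the previous edge (us)
byte edge_count = 0;       // edges seen within the current bit
byte bit_count = 0;        // position within the current 8n1 frame
byte rec_Byte = 0;         // byte being assembled
volatile unsigned int hi_threshold = 0;  // set in the ISR, polled in setup()
unsigned int cal_count = 0;
unsigned int cal_ts = 0;
bool lead_in_done = 0;

void setup() {
  pinMode(pulsePin, INPUT);
  attachInterrupt(digitalPinToInterrupt(pulsePin), count, CHANGE);
  Serial.begin(230400);
  Serial.print("Start. ");
  while (!hi_threshold);  // wait for calibration to finish
  Serial.print("High threshold(us): ");
  Serial.println(hi_threshold);
}

void loop() { }

// Interrupt handler: runs on every edge of the input signal
void count() {
  if (!hi_threshold) {
    // Calibration: skip the first 32 edges, then sum the next 16 intervals
    if (cal_count == 32) cal_ts = 0;
    cal_ts += (micros() - last_ts);
    if (++cal_count == 48) hi_threshold = cal_ts / 21;  // about 0.76 of one interval
  } else {
    // A long interval means a one, a short interval means half of a zero
    bool bit_val = ((micros() - last_ts) > hi_threshold);
    if (!lead_in_done && !bit_val) lead_in_done = 1;  // first zero ends the tone
    if (++edge_count > 2) {
      Serial.println("Edge cnt err");
      edge_count = 1;
    }
    if ((!bit_val && edge_count == 2) || (bit_val && edge_count == 1)) {
      if (lead_in_done) bitDone(bit_val);
      edge_count = 0;
    }
  }
  last_ts = micros();
}

void bitDone(bool bit_val) {
  // Only frame positions 1-8 carry data; this also avoids a negative shift
  if (bit_val && bit_count >= 1 && bit_count <= 8) rec_Byte |= (0x80 >> (bit_count - 1));
  if (bit_count == 0 && bit_val) Serial.println("Start err");
  else if (bit_count == 9 && !bit_val) Serial.println("Stop err");
  if (++bit_count == 10) {
    Serial.print((char)rec_Byte);
    bit_count = 0;
    rec_Byte = 0;
  }
}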
Files are available on my github page.
Here are some images of my setup to read from a microcassette. I was able to use it to read data out at around 3000 baud.