Hi folks!

Why did I write modsplit? Sometimes it is really annoying that you cannot recover old data without errors or failures. Certainly there are checksums, recovery codes and all that stuff. But what if a backup medium gets lost or destroyed? The solution should be a distributable backup with (nearly) arbitrary redundancy and without too much overhead!

I gave this idea some thought and remembered an old theorem, well known for centuries, which can be applied to accomplish this goal: the Chinese Remainder Theorem states that a number can be identified by its remainders, provided the moduli have no common divisors and their product is greater than the given number. This theorem is sometimes used for proving other theorems in complexity theory. You can even use it to speed up multiplications.

So, instead of searching the web for a solution to my problem, I felt urged to write such an easy thing myself. The program takes less than 600 lines of code.

To compile modsplit under Linux, simply do

  gcc -o modsplit -O2 -Wall modsplit.c

The usage is simple:

  modsplit --split

will create six destination files out of your source. If you place them on 6 different media, possibly using network devices, you will be able to restore your source even if two of the destinations are not available!

  modsplit --restore

will do the job... You can even use pipes:

  tar cvf - | modsplit --split - ..
  modsplit --restore - | tar tvf -

Warning: If using stdin, we have no chance to get the file size on opening. On restore, the resulting file may be up to 3 ZERO-bytes longer than expected.

Some ideas...

It gets really interesting if you think about using the internet. Consider that you and five of your friends have decided to back up your hard drives. Each of you needs less than 65 GB of disk space to save 40 GB. And even if two of the systems crash, everybody can restore his/her 40 GB.
Writing a network device driver using this technique, one could even create a redundant and fast virtual RAID system. Think of a very large distributed archive of data which is mostly accessed for reading, and many people using this archive over a network to share their data. Even if they all have a low upload rate and a high download rate (such as DSL), they can access their data at nearly full download rate as long as they are not accessing it concurrently.

Cheers,
Thorsten
reinecke@thorstenreinecke.de