<article>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#article10_03_03_2148245</id>
	<title>Long-Term Storage of Moderately Large Datasets?</title>
	<author>timothy</author>
	<datestamp>1267611480000</datestamp>
<htmltext><a href="http://brock.brocktice.com/" rel="nofollow">hawkeyeMI</a> writes <i>"I have a small scientific services company, and we end up generating fairly large datasets (2-3 TB) for each customer. We don't have to ship all of that, but we do need to keep some compressed archives. The best I can come up with right now is to buy some large hard drives, use software RAID in linux to make a RAID5 set out of them, and store them in a safe deposit box. I feel like there must be a better way for a small business, but despite some research into Blu-ray, I've not been able to find a good, cost-effective alternative. A tape library would be impractical at the present time. What do you recommend?"</i></htmltext>
<tokentext>hawkeyeMI writes " I have a small scientific services company , and we end up generating fairly large datasets ( 2-3 TB ) for each customer .
We do n't have to ship all of that , but we do need to keep some compressed archives .
The best I can come up with right now is to buy some large hard drives , use software RAID in linux to make a RAID5 set out of them , and store them in a safe deposit box .
I feel like there must be a better way for a small business , but despite some research into Blu-ray , I 've not been able to find a good , cost-effective alternative .
A tape library would be impractical at the present time .
What do you recommend ?
"</tokentext>
<sentencetext>hawkeyeMI writes "I have a small scientific services company, and we end up generating fairly large datasets (2-3 TB) for each customer.
We don't have to ship all of that, but we do need to keep some compressed archives.
The best I can come up with right now is to buy some large hard drives, use software RAID in linux to make a RAID5 set out of them, and store them in a safe deposit box.
I feel like there must be a better way for a small business, but despite some research into Blu-ray, I've not been able to find a good, cost-effective alternative.
A tape library would be impractical at the present time.
What do you recommend?
"</sentencetext>
</article>
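For readers who want to try the submitter's setup, here is a minimal sketch of building a one-off archival RAID5 set with Linux software RAID. It assumes the mdadm tool and three spare drives; the device names and paths are hypothetical, so check yours with lsblk first.

    # Build a RAID5 array from three spare drives (hypothetical names).
    mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
    mkfs.ext3 /dev/md0                     # plain, widely supported filesystem
    mkdir -p /mnt/archive
    mount /dev/md0 /mnt/archive
    cp -a /data/customer123/. /mnt/archive/
    umount /mnt/archive
    mdadm --detail --scan                  # save this output; it makes
                                           # reassembly years later much easier

Stopping the array with mdadm --stop /dev/md0 before pulling the drives leaves the set clean for later reassembly.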
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342</id>
	<title>Exactly what you're doing</title>
	<author>rwa2</author>
	<datestamp>1267615140000</datestamp>
	<modclass>Informative</modclass>
	<modscore>4</modscore>
<htmltext><p>I don't think you can beat a bunch of conventional hard disks in a RAID5 for both cost-per-TB and backup/restore performance, not to mention medium-term data integrity.  Might be able to make hooking up the drives more convenient with an eSATA multi-bay enclosure, but those are kinda expensive.  But I bet your backup box already has some sort of hot-swap dock on it, like: <a href="http://www.amazon.com/Thermaltake-BlacX-eSATA-Docking-Station/dp/B001A4HAFS" title="amazon.com">http://www.amazon.com/Thermaltake-BlacX-eSATA-Docking-Station/dp/B001A4HAFS</a> [amazon.com]</p><p>I assume you already compress your data, since scientific datasets tend to compress well.  You might consider compressing to squashfs, since it will let you do transparent decompression later on so you can skip the restore step if you just need a handful of files.</p></htmltext>
<tokentext>I do n't think you can beat a bunch of conventional hard disks in a RAID5 for both cost-per-TB and backup/restore performance , not to mention medium-term data integrity .
Might be able to make hooking up the drives more convenient with an eSATA multi-bay enclosure , but those are kinda expensive .
But I bet your backup box already has some sort of hot-swap dock on it , like : http : //www.amazon.com/Thermaltake-BlacX-eSATA-Docking-Station/dp/B001A4HAFS [ amazon.com ]
I assume you already compress your data , since scientific datasets tend to compress well .
You might consider compressing to squashfs , since it will let you do transparent decompression later on so you can skip the restore step if you just need a handful of files .</tokentext>
<sentencetext>I don't think you can beat a bunch of conventional hard disks in a RAID5 for both cost-per-TB and backup/restore performance, not to mention medium-term data integrity.
Might be able to make hooking up the drives more convenient with an eSATA multi-bay enclosure, but those are kinda expensive.
But I bet your backup box already has some sort of hot-swap dock on it, like: http://www.amazon.com/Thermaltake-BlacX-eSATA-Docking-Station/dp/B001A4HAFS [amazon.com]
I assume you already compress your data, since scientific datasets tend to compress well.
You might consider compressing to squashfs, since it will let you do transparent decompression later on so you can skip the restore step if you just need a handful of files.</sentencetext>
</comment>
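For anyone curious, a minimal sketch of the squashfs idea rwa2 describes, assuming squashfs-tools is installed; the paths and file names are invented for illustration.

    # Pack a dataset into one compressed, read-only squashfs image.
    mksquashfs /data/customer123 customer123.sqsh
    # Later: mount the image and pull out single files, no full restore needed.
    mkdir -p /mnt/sqsh
    mount -o loop,ro customer123.sqsh /mnt/sqsh
    cp /mnt/sqsh/run42/results.csv /tmp/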
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31358960</id>
	<title>hah what</title>
	<author>jon3k</author>
	<datestamp>1267723500000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Tape is impractical but you're going to store RAID5 disk sets in safe deposit boxes?  How is tape impractical?  Speed?  Upfront cost?</htmltext>
<tokentext>Tape is impractical but you 're going to store RAID5 disk sets in safe deposit boxes ?
How is tape impractical ?
Speed ? Upfront cost ?</tokentext>
<sentencetext>Tape is impractical but you're going to store RAID5 disk sets in safe deposit boxes?
How is tape impractical?
Speed?  Upfront cost?</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354116</id>
	<title>Re:LTO-4?</title>
	<author>Chalex</author>
	<datestamp>1267633740000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>If I understand you correctly, you're insinuating that "drive compression" depends on the drive model or something.  And that it might not be possible to decode the data in the future.</p><p>In fact, hardware compression performed by the drive as it's writing to tape is part of the LTO spec.  So it's the same across all drives.  Think of it as standard LZW compression, performed by the drive hardware while it is writing.</p></htmltext>
<tokentext>If I understand you correctly , you 're insinuating that " drive compression " depends on the drive model or something .
And that it might not be possible to decode the data in the future .
In fact , hardware compression performed by the drive as it 's writing to tape is part of the LTO spec .
So it 's the same across all drives .
Think of it as standard LZW compression , performed by the drive hardware while it is writing .</tokentext>
<sentencetext>If I understand you correctly, you're insinuating that "drive compression" depends on the drive model or something.
And that it might not be possible to decode the data in the future.
In fact, hardware compression performed by the drive as it's writing to tape is part of the LTO spec.
So it's the same across all drives.
Think of it as standard LZW compression, performed by the drive hardware while it is writing.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353556</parent>
</comment>
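As an illustration of letting the drive do the work, a sketch assuming the mt-st utilities and a SCSI tape drive on /dev/nst0 (device and paths hypothetical):

    # Turn on the drive's own compression, then stream an archive to tape;
    # the drive compresses transparently as it writes.
    mt -f /dev/nst0 compression 1
    tar -cvf /dev/nst0 /data/customer123
    mt -f /dev/nst0 rewind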
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31359454</id>
	<title>Big Question - How critical is the data?</title>
	<author>NeumannCons</author>
	<datestamp>1267725480000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>One thing you didn't mention in your post - how critical is the data?  If you lost it what would happen:  Nothing?  Would you lose a few hours' time recreating it?  Would you go out of business?  Would you get sued for breach of contract?  Would Knees and Knuckles be paying your family a final visit?  Knowing that would make a difference in how I would store the data.<br><br>As several posters have stated and you've noticed yourself - nothing beats a hard drive for cost/byte.<br><br>But then you need to determine what to do with that - do you keep it online at all times (power, space, cooling may become issues to consider)?<br><br>How many copies of the data do you keep?  Hard drives fail.  Just because it's RAID doesn't mean it's safe.<br><br>What's your bigger plan for testing the backup media for failure and determining when to retire them?   Periodic testing has to be one of the most important parts of your plan.  You build something, test it once, and later find out that your last 2 months of backups were worthless.   Testing can help you avoid that.<br><br>Do you need to keep copies off-site?  Having 2 copies in your data-center located in your basement is no good if it floods.<br><br>How much total storage do you need now, 6 months from now, 2 years from now, etc.?  There's an interesting article from the online backup folks at backblaze.com.  They put together 4U enclosures that store 67 Terabytes for about $8,000 USD.  Complete instructions are on their site for how to do it (they don't sell them but use them for their business).  However, it's not exactly portable.  While not physically huge, it's gotta weigh a bit.  Perhaps 2 at separate locations with a network connection between them to keep them mirrored?<br><br>There's a number of firms that offer various online backup solutions where your data is automatically uploaded to their datacenter, however I suspect that you're going to exceed their usual offerings unless they have some "poweruser" or business option.  For individuals, "$5/mo unlimited storage" seems to be the norm.  However, there are 2 limiting factors to that "unlimited" - your/their available bandwidth.  If it takes 2 months to push out your dataset - is that acceptable?  Also, many firms delete your data 30 days after you delete it.  So if you move those HDs to your safety deposit box, does the backup co see them as deleted and then delete their copy?  Comparing the cost of doing it yourself vs them may be attractive, esp if they have some appropriate business plan that's not much more than their individual plan.<br><br>Does the data need to be encrypted?  If you lose those HDs on the way to your offsite location will it be merely inconvenient or life altering when someone finds it and reads it?<br><br>Finally, do you need to somehow index the data so you can find your backups?<br><br>Making backups is easy.  Doing it right so you can actually get your data back takes a little more work.<br><br>Paul</htmltext>
<tokentext>One thing you did n't mention in your post - how critical is the data ?
If you lost it what would happen : Nothing ?
Would you lose a few hours time recreating it ?
Would you go out of business ?
Would you get sued for breach of contract ?
Would Knees and Knuckles be paying your family a final visit ?
Knowing that would make a difference in how I would store the data .
As several posters have stated and you 've noticed yourself - nothing beats a hard drive for cost/byte .
But then you need to determine what to do with that - do you keep it online at all times ( power , space , cooling may become issues to consider ) ?
How many copies of the data do you keep ?
Hard drives fail .
Just because it 's RAID does n't mean it 's safe .
What 's your bigger plan for testing the backup media for failure and determining when to retire them ?
Periodic testing has to be one of the most important parts of your plan .
You build something , test it once , and later find out that your last 2 months of backups were worthless .
Testing can help you avoid that .
Do you need to keep copies off-site ?
Having 2 copies in your data-center located in your basement is no good if it floods .
How much total storage do you need now , 6 months from now , 2 years from now , etc. ?
There 's an interesting article from the online backup folks at backblaze.com .
They put together 4U enclosures that store 67 Terabytes for about $ 8,000 USD .
Complete instructions are on their site for how to do it ( they do n't sell them but use them for their business ) .
However , it 's not exactly portable .
While not physically huge , it 's got ta weigh a bit .
Perhaps 2 at separate locations with a network connection between them to keep them mirrored ?
There 's a number of firms that offer various online backup solutions where your data is automatically uploaded to their datacenter , however I suspect that you 're going to exceed their usual offerings unless they have some " poweruser " or business option .
For individuals , " $ 5/mo unlimited storage " seems to be the norm .
However , there are 2 limiting factors to that " unlimited " - your/their available bandwidth .
If it takes 2 months to push out your dataset - is that acceptable ?
Also , many firms delete your data 30 days after you delete it .
So if you move those HDs to your safety deposit box , does the backup co see them as deleted and then delete their copy ?
Comparing the cost of doing it yourself vs them may be attractive , esp if they have some appropriate business plan that 's not much more than their individual plan .
Does the data need to be encrypted ?
If you lose those HDs on the way to your offsite location will it be merely inconvenient or life altering when someone finds it and reads it ?
Finally , do you need to somehow index the data so you can find your backups ?
Making backups is easy .
Doing it right so you can actually get your data back takes a little more work .
Paul</tokentext>
<sentencetext>One thing you didn't mention in your post - how critical is the data?
If you lost it what would happen:  Nothing?
Would you lose a few hours time recreating it?
Would you go out of business?
Would you get sued for breach of contract?
Would Knees and Knuckles be paying your family a final visit?
Knowing that would make a difference in how I would store the data.
As several posters have stated and you've noticed yourself - nothing beats a hard drive for cost/byte.
But then you need to determine what to do with that - do you keep it online at all times (power, space, cooling may become issues to consider)?
How many copies of the data do you keep?
Hard drives fail.
Just because it's RAID doesn't mean it's safe.
What's your bigger plan for testing the backup media for failure and determining when to retire them?
Periodic testing has to be one of the most important parts of your plan.
You build something, test it once, and later find out that your last 2 months of backups were worthless.
Testing can help you avoid that.
Do you need to keep copies off-site?
Having 2 copies in your data-center located in your basement is no good if it floods.
How much total storage do you need now, 6 months from now, 2 years from now, etc.?
There's an interesting article from the online backup folks at backblaze.com.
They put together 4U enclosures that store 67 Terabytes for about $8,000 USD.
Complete instructions are on their site for how to do it (they don't sell them but use them for their business).
However, it's not exactly portable.
While not physically huge, it's gotta weigh a bit.
Perhaps 2 at separate locations with a network connection between them to keep them mirrored?
There's a number of firms that offer various online backup solutions where your data is automatically uploaded to their datacenter, however I suspect that you're going to exceed their usual offerings unless they have some "poweruser" or business option.
For individuals, "$5/mo unlimited storage" seems to be the norm.
However, there are 2 limiting factors to that "unlimited" - your/their available bandwidth.
If it takes 2 months to push out your dataset - is that acceptable?
Also, many firms delete your data 30 days after you delete it.
So if you move those HDs to your safety deposit box, does the backup co see them as deleted and then delete their copy?
Comparing the cost of doing it yourself vs them may be attractive, esp if they have some appropriate business plan that's not much more than their individual plan.
Does the data need to be encrypted?
If you lose those HDs on the way to your offsite location will it be merely inconvenient or life altering when someone finds it and reads it?
Finally, do you need to somehow index the data so you can find your backups?
Making backups is easy.
Doing it right so you can actually get your data back takes a little more work.
Paul</sentencetext>
</comment>
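One concrete way to do the periodic testing Paul describes is a checksum manifest written alongside the archive and re-verified on a schedule (a sketch; paths hypothetical):

    # When the archive is created, record a checksum for every file...
    find /mnt/archive -type f -exec sha256sum {} + > /mnt/archive.sha256
    # ...then at each scheduled test, re-read the media and list any mismatches.
    sha256sum -c /mnt/archive.sha256 | grep -v ': OK$'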
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31370242</id>
	<title>Re:I prefer online</title>
	<author>pnutjam</author>
	<datestamp>1267800660000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Check out <a href="http://www.unitrends.com/" title="unitrends.com">unitrends</a> [unitrends.com]. They do very good d2d backup systems, very inexpensive; they handle open files and replicate between units if you have multiple.</htmltext>
<tokentext>Check out unitrends [ unitrends.com ] .
They do very good d2d backup systems , very inexpensive ; they handle open files and replicate between units if you have multiple .</tokentext>
<sentencetext>Check out unitrends [unitrends.com].
They do very good d2d backup systems, very inexpensive; they handle open files and replicate between units if you have multiple.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351690</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357630</id>
	<title>The only real long-term solution</title>
	<author>Anonymous</author>
	<datestamp>1267716180000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Cuneiform Tablets.</p><p>They last forever.</p></htmltext>
<tokentext>Cuneiform Tablets .
They last forever .</tokentext>
<sentencetext>Cuneiform Tablets.
They last forever.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352120</id>
	<title>Re:Exactly what you're doing</title>
	<author>toastar</author>
	<datestamp>1267618860000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><div class="quote"><p>since scientific datasets tend to compress well</p></div><p>Really? The datasets I deal with are fairly Gaussian in nature; I've yet to find a good compression algorithm that works on segy.</p>
	</htmltext>
<tokentext>since scientific datasets tend to compress well
Really ?
The datasets I deal with are fairly Gaussian in nature ; I 've yet to find a good compression algorithm that works on segy .</tokentext>
<sentencetext>since scientific datasets tend to compress wellReally?
The Datasets I deal with are fairly gaussian in nature, I've yet to find a good compression algorithm that works on segy.
	</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342</parent>
</comment>
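A quick way to settle the compressibility question for any given dataset is to measure it on a sample before committing to a pipeline (a sketch; sample.segy is a hypothetical file):

    # Compare raw size to compressed size; a ratio near 1.0 means
    # the data is effectively incompressible, as toastar suggests.
    orig=$(stat -c%s sample.segy)
    comp=$(gzip -9 -c sample.segy | wc -c)
    echo "compression ratio: $(echo "scale=2; $orig / $comp" | bc)"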
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353852</id>
	<title>Holographic storage disc</title>
	<author>Anonymous</author>
	<datestamp>1267631100000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>50 year archive life, and up to 1.6TB capacity. No idea how much it costs.<br>http://www.inphase-technologies.com/products/default.asp?tnn=3</p></htmltext>
<tokentext>50 year archive life , and up to 1.6TB capacity .
No idea how much it costs .
http://www.inphase-technologies.com/products/default.asp?tnn=3</tokentext>
<sentencetext>50 year archive life, and up to 1.6TB capacity.
No idea how much it costs.
http://www.inphase-technologies.com/products/default.asp?tnn=3</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353102</id>
	<title>Re:Exactly what you're doing</title>
	<author>Vellmont</author>
	<datestamp>1267625220000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p><i><br>You're storing relatively fragile hard drives in a raid5 configuration in a lock box?<br></i><br>Relatively fragile sitting in a temperature controlled safety deposit box, completely unpowered?  Uhh..  No.</p><p>Hard drives are physical devices that fail because of physical problems.  They'll easily sit for years in an unpowered state and you'll lose no data.  A couple years ago I powered up a drive from 1991 that I stopped using in 1995.  The drive sat idle for 13 years, sitting in my basement, in hot un-airconditioned rooms, tossed around through multiple moves.  It powered up just fine, and I recovered all the data off it without incident.  There's actually another wrinkle to all this.  The filesystem was an ancient Amiga FFS, which was recognized and mounted quite nicely by Linux (amazing really).  So this idea that the data on your HD is essentially going to suddenly rot is silly.  I'd trust an unpowered HD over anything but SSD any day (and maybe even over SSD).</p><p>If anything, powering up the drive to make sure it's OK is only going to make it more likely to fail.  Frankly I'd forgo the RAID5, as you're just asking for some technological change to introduce an incompatibility between kernel versions.  If you're REALLY that paranoid about losing the data, copy it to two different hard drives formatted with straight EXT3.  That ought to be supported for decades.</p></htmltext>
<tokentext>You 're storing relatively fragile hard drives in a raid5 configuration in a lock box ?
Relatively fragile sitting in a temperature controlled safety deposit box , completely unpowered ?
Uhh .. No .
Hard drives are physical devices that fail because of physical problems .
They 'll easily sit for years in an unpowered state and you 'll lose no data .
A couple years ago I powered up a drive from 1991 that I stopped using in 1995 .
The drive sat idle for 13 years , sitting in my basement , in hot un-airconditioned rooms , tossed around through multiple moves .
It powered up just fine , and I recovered all the data off it without incident .
There 's actually another wrinkle to all this .
The filesystem was an ancient Amiga FFS , which was recognized and mounted quite nicely by Linux ( amazing really ) .
So this idea that the data on your HD is essentially going to suddenly rot is silly .
I 'd trust an unpowered HD over anything but SSD any day ( and maybe even over SSD ) .
If anything , powering up the drive to make sure it 's OK is only going to make it more likely to fail .
Frankly I 'd forgo the RAID5 , as you 're just asking for some technological change to introduce an incompatibility between kernel versions .
If you 're REALLY that paranoid about losing the data , copy it to two different hard drives formatted with straight EXT3 .
That ought to be supported for decades .</tokentext>
<sentencetext>You're storing relatively fragile hard drives in a raid5 configuration in a lock box?
Relatively fragile sitting in a temperature controlled safety deposit box, completely unpowered?
Uhh..  No.
Hard drives are physical devices that fail because of physical problems.
They'll easily sit for years in an unpowered state and you'll lose no data.
A couple years ago I powered up a drive from 1991 that I stopped using in 1995.
The drive sat idle for 13 years, sitting in my basement, in hot un-airconditioned rooms, tossed around through multiple moves.
It powered up just fine, and I recovered all the data off it without incident.
There's actually another wrinkle to all this.
The filesystem was an ancient Amiga FFS, which was recognized and mounted quite nicely by Linux (amazing really).
So this idea that the data on your HD is essentially going to suddenly rot is silly.
I'd trust an unpowered HD over anything but SSD any day (and maybe even over SSD).
If anything, powering up the drive to make sure it's OK is only going to make it more likely to fail.
Frankly I'd forgo the RAID5, as you're just asking for some technological change to introduce an incompatibility between kernel versions.
If you're REALLY that paranoid about losing the data, copy it to two different hard drives formatted with straight EXT3.
That ought to be supported for decades.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504</parent>
</comment>
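A minimal sketch of the two-plain-drives approach Vellmont suggests, with a verification pass so you know both copies actually match the source (device names and paths hypothetical):

    for dev in /dev/sdb1 /dev/sdc1; do
        mkfs.ext3 "$dev"                    # plain EXT3, no RAID metadata
        mkdir -p /mnt/copy
        mount "$dev" /mnt/copy
        rsync -a /data/archive/ /mnt/copy/
        # Re-read both sides with checksums; silence here means a clean copy.
        rsync -acin /data/archive/ /mnt/copy/
        umount /mnt/copy
    done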
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352990</id>
	<title>Re:Amazon S3</title>
	<author>durdur</author>
	<datestamp>1267624380000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><div class="quote"><p>And once your stuff is in S3, you can recycle the same disks to mail them more data.</p></div><p>I think you probably can't realistically ship them stuff every time you need to back up. A more realistic strategy would be to upload a bunch and then do incremental backups as needed. How feasible that is depends on your data size and the bandwidth of your Net connection. It could be painfully slow if you do not have a very high bandwidth link.</p>
	</htmltext>
<tokentext>And once your stuff is in S3 , you can recycle the same disks to mail them more data .
I think you probably ca n't realistically ship them stuff every time you need to back up .
A more realistic strategy would be to upload a bunch then do incremental backups as needed .
How feasible that is depends on your data size and the bandwidth of your Net connection .
It could be painfully slow if you do not have a very high bandwidth link .</tokentext>
<sentencetext>And once your stuff is in S3, you can recycle the same disks to mail them more data.
I think you probably can't realistically ship them stuff every time you need to back up.
A more realistic strategy would be to upload a bunch then do incremental backups as needed.
How feasible that is depends on your data size and the bandwidth of your Net connection.
It could be painfully slow if you do not have  a very high bandwidth link.
	</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351420</parent>
</comment>
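For the incremental part, a tool that compares local files against the bucket and uploads only the differences keeps a slow link usable (a sketch using s3cmd, one such client; the bucket name is hypothetical):

    # First run seeds the bucket; later runs push only new/changed files.
    s3cmd sync /data/archives/ s3://example-archive-bucket/archives/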
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352270</id>
	<title>Re:Tape is crap anyway.</title>
	<author>Anonymous</author>
	<datestamp>1267619580000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>"Exabyte. For those people who don't realise that their system already has a<nobr> <wbr></nobr>/dev/null." Seriously, dude, these are not reliable tape technologies.</p><p>For TB+ scales of data, I'd be talking LTO4, T10000B, 3592 generation 3, or maybe DLT-S4. I worked for IBM Global Services for a couple of years, managing the backup systems for a number of clients; one client was using a combination of 9840C, 9940B, and T10000A, another was using LTO3, and a third was using 3592 generation 1. Except for 9940B (and that only because we were shipping relatively fragile carts offsite), all of these technologies were rock solid. 9840C is only 40 GB per cart (the fourth generation, 9840D, brings it up to a whopping 75 GB). T10000A is 500 GB per cart; generation 2 (T10kB) brings it up to 1 TB. LTO4 is 800 GB. 3592 gen 3 is up to 1 TB, depending on the cartridge. DLT-S4 is 800 GB. All figures are native; manufacturer's figures will double these, with the typical assumption of 2:1 compression.</p><p>Hard drives aren't made for shipping around. Tape is. Pick a technology - of the above, unless there's a serious desire to pay for extra reliability, I'd go for LTO4 - and make two or three copies of the data, store them in different locations, and you're set. Far easier. No need for a tape library in the short term, you can just swap tapes into the drive by hand (although I'm not sure if the IBM or StorageTek technologies are available as standalone drives, which is another reason to go with LTO4.)</p></htmltext>
<tokenext>" Exabyte .
For those people who do n't realise that their system already has a /dev/null .
" Seriously , dude , these are not reliable tape technologies.For TB + scales of data , I 'd be talking LTO4 , T10000B , 3592 generation 3 , or maybe DLT-S4 .
I worked for IBM Global Services for a couple of years , managing the backup systems for a number of clients ; one client was using a combination of 9840C , 9940B , and T10000A , another was using LTO3 , and a third was using 3592 generation 1 .
Except for 9940B ( and that only because we were shipping relatively fragile carts offsite ) , all of these technologies were rock solid .
9840C is only 40 GB per cart ( the fourth generation , 9840D , brings it up to a whopping 75 GB ) .
T10000A is 500 GB per cart ; generation 2 ( T10kB ) brings it up to 1 TB .
LTO4 is 800 GB .
3592 gen 3 is up to 1 TB , depending on the cartridge .
DLT-S4 is 800 GB .
All figures are native ; manufacturer 's figures will double these , with the typical assumption of 2 : 1 compression .
Hard drives are n't made for shipping around .
Tape is .
Pick a technology - of the above , unless there 's a serious desire to pay for extra reliability , I 'd go for LTO4 - and make two or three copies of the data , store them in different locations , and you 're set .
Far easier .
No need for a tape library in the short term ; you can just swap tapes into the drive by hand ( although I 'm not sure if the IBM or StorageTek technologies are available as standalone drives , which is another reason to go with LTO4 ) .</tokentext>
<sentencetext>"Exabyte.
For those people who don't realise that their system already has a /dev/null.
" Seriously, dude, these are not reliable tape technologies.For TB+ scales of data, I'd be talking LTO4, T10000B, 3592 generation 3, or maybe DLT-S4.
I worked for IBM Global Services for a couple of years, managing the backup systems for a number of clients; one client was using a combination of 9840C, 9940B, and T10000A, another was using LTO3, and a third was using 3592 generation 1.
Except for 9940B (and that only because we were shipping relatively fragile carts offsite), all of these technologies were rock solid.
9840C is only 40 GB per cart (the fourth generation, 9840D, brings it up to a whopping 75 GB).
T10000A is 500 GB per cart; generation 2 (T10kB) brings it up to 1 TB.
LTO4 is 800 GB.
3592 gen 3 is up to 1 TB, depending on the cartridge.
DLT-S4 is 800 GB.
All figures are native; manufacturer's figures will double these, with the typical assumption of 2:1 compression.
Hard drives aren't made for shipping around.
Tape is.
Pick a technology - of the above, unless there's a serious desire to pay for extra reliability, I'd go for LTO4 - and make two or three copies of the data, store them in different locations, and you're set.
Far easier.
No need for a tape library in the short term; you can just swap tapes into the drive by hand (although I'm not sure if the IBM or StorageTek technologies are available as standalone drives, which is another reason to go with LTO4).</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351706</parent>
</comment>
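With a standalone drive as described, one archive per tape file is a simple scheme; tape files are addressed by position, so keep a log of what went where (a sketch assuming a non-rewinding device at /dev/nst0; paths hypothetical):

    tar -cvf /dev/nst0 /data/customer1     # tape file 0
    tar -cvf /dev/nst0 /data/customer2     # tape file 1
    mt -f /dev/nst0 rewind
    mt -f /dev/nst0 fsf 1                  # seek to tape file 1
    tar -tvf /dev/nst0                     # list customer2's archive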
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31409218</id>
	<title>split</title>
	<author>countach</author>
	<datestamp>1268062800000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>I don't know much about Linux RAID, but as far as I can see, if you must split across multiple disks, use the simplest approach possible so that you aren't left in some years' time with a jungle of problems trying to assemble them again. Something like a simple split of the file if it is one large archive, or if it is a bunch of individual files, just distribute them between the disks somehow.</p></htmltext>
<tokentext>I do n't know much about Linux RAID , but as far as I can see , if you must split across multiple disks , use the simplest approach possible so that you are n't left in some years ' time with a jungle of problems trying to assemble them again .
Something like a simple split of the file if it is one large archive , or if it is a bunch of individual files , just distribute them between the disks somehow .</tokentext>
<sentencetext>I don't know much about Linux RAID, but as far as I can see, if you must split across multiple disks, use the simplest approach possible so that you aren't left in some years' time with a jungle of problems trying to assemble them again.
Something like a simple split of the file if it is one large archive, or if it is a bunch of individual files, just distribute them between the disks somehow.</sentencetext>
</comment>
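A sketch of the simple-split idea for the single-large-archive case, using nothing but split and cat (sizes and names hypothetical):

    # Cut the archive into pieces sized for the target disks...
    split -b 900G archive.tar.gz archive.part.
    # ...copy archive.part.aa, archive.part.ab, ... to separate disks,
    # and years later rejoin them with plain cat:
    cat archive.part.* > archive.tar.gz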
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354122</id>
	<title>Re:LTO-4?</title>
	<author>afidel</author>
	<datestamp>1267633860000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Chest height generally screws them up; waist height is generally ok in my experience (one of only 2 failed tapes out of the last 4,800 through my library was dropped from chest height; I've dropped multiple from waist height without issue).</htmltext>
<tokentext>Chest height generally screws them up ; waist height is generally ok in my experience ( one of only 2 failed tapes out of the last 4,800 through my library was dropped from chest height ; I 've dropped multiple from waist height without issue ) .</tokentext>
<sentencetext>Chest height generally screws them up; waist height is generally ok in my experience (one of only 2 failed tapes out of the last 4,800 through my library was dropped from chest height; I've dropped multiple from waist height without issue).</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351720</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351772</id>
	<title>rotation</title>
	<author>vlm</author>
	<datestamp>1267617180000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>You have to keep rotating onto newer media, and newer media technologies.  This sounds horrible, "oh no!  I'm generating ten full drives per year".  But realize in a couple years, all those drives will fit on a USB 4.0 stick, or on a card in your cellphone.</p><p>If you haven't read it (and recopied it) in a couple years, it's probably gone.</p></htmltext>
<tokentext>You have to keep rotating onto newer media , and newer media technologies .
This sounds horrible , " oh no !
I 'm generating ten full drives per year " .
But realize in a couple years , all those drives will fit on a USB 4.0 stick , or on a card in your cellphone .
If you have n't read it ( and recopied it ) in a couple years , it 's probably gone .</tokentext>
<sentencetext>You have to keep rotating onto newer media, and newer media technologies.
This sounds horrible, "oh no!
I'm generating ten full drives per year".
But realize in a couple years, all those drives will fit on a USB 4.0 stick, or on a card in your cellphone.
If you haven't read it (and recopied it) in a couple years, it's probably gone.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354042</id>
	<title>Re:Exactly what you're doing</title>
	<author>zero0ne</author>
	<datestamp>1267632720000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>yes, because 2TB of data storage @ ~$3000 per month is such a steal.</p><p>with $3000 a month locked up in data storage, you could probably lease 3 2U servers + vSphere / vCenter licenses and end up saving money over a 2 year period.</p></htmltext>
<tokentext>yes , because 2TB of data storage @ ~ $ 3000 per month is such a steal .
with $ 3000 a month locked up in data storage , you could probably lease 3 2U servers + vSphere / vCenter licenses and end up saving money over a 2 year period .</tokentext>
<sentencetext>yes, because 2TB of data storage @ ~$3000 per month is such a steal.
with $3000 a month locked up in data storage, you could probably lease 3 2U servers + vSphere / vCenter licenses and end up saving money over a 2 year period.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354668</id>
	<title>Simply amazing!</title>
	<author>mcrbids</author>
	<datestamp>1267639740000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Loads of comments, alternating between disk and tape, and a few ridiculous, impractical options (Amazon, etc) but nobody here's found the best, cheapest, and most highly redundant method!</p><p>You take your 2 TB file, zip compress it, Encrypt it, rename it "Britney Spears Gang Bang kdiaKiS93kDw.mpeg" and stick it on Bit Torrent! It's highly redundant, very secure, costs basically nothing, and your chances of finding it again in 10 years are at least as good as finding a tape 10 years in the future!</p></htmltext>
<tokentext>Loads of comments , alternating between disk and tape , and a few ridiculous , impractical options ( Amazon , etc ) but nobody here 's found the best , cheapest , and most highly redundant method !
You take your 2 TB file , zip compress it , Encrypt it , rename it " Britney Spears Gang Bang kdiaKiS93kDw.mpeg " and stick it on Bit Torrent !
It 's highly redundant , very secure , costs basically nothing , and your chances of finding it again in 10 years are at least as good as finding a tape 10 years in the future !</tokentext>
<sentencetext>Loads of comments, alternating between disk and tape, and a few ridiculous, impractical options (Amazon, etc) but nobody here's found the best, cheapest, and most highly redundant method!
You take your 2 TB file, zip compress it, Encrypt it, rename it "Britney Spears Gang Bang kdiaKiS93kDw.mpeg" and stick it on Bit Torrent!
It's highly redundant, very secure, costs basically nothing, and your chances of finding it again in 10 years are at least as good as finding a tape 10 years in the future!</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353314</id>
	<title>Re:bzip2</title>
	<author>coolsnowmen</author>
	<datestamp>1267627260000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Sadly, even assuming the submitter has an army of slaves to print/scan this stuff at his disposal, this storage medium runs into data loss when the storage requirements get to ~20GB.  At 2-3TB, this is a technically poor solution.</p></htmltext>
<tokentext>Sadly , even assuming the submitter has an army of slaves to print/scan this stuff at his disposal , this storage medium runs into data loss when the storage requirements get to ~ 20GB .
At 2-3TB , this is a technically poor solution .</tokentext>
<sentencetext>Sadly, even assuming the submitter has an army of slaves to print/scan this stuff at his disposal, this storage medium runs into data loss when the storage requirements get to ~20GB.
At 2-3TB, this is a technically poor solution.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351344</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351566</id>
	<title>I'd encrypt the data and...</title>
	<author>Rivalz</author>
	<datestamp>1267616280000</datestamp>
	<modclass>Funny</modclass>
	<modscore>5</modscore>
	<htmltext>Label it something like complete american idol blueray collection and upload it on p2p to piratebay.
Every couple years rename it to some other horribly popular tv series. It will be a self-sustaining form of storage with an infinite number of redundant hosts.</htmltext>
<tokentext>Label it something like complete american idol blueray collection and upload it on p2p to piratebay .
Every couple years rename it to some other horribly popular tv series .
It will be a self-sustaining form of storage with an infinite number of redundant hosts .</tokentext>
<sentencetext>Label it something like complete american idol blueray collection and upload it on p2p to piratebay.
Every couple years rename it to some other horribly popular tv series.
It will be a self-sustaining form of storage with an infinite number of redundant hosts.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31358786</id>
	<title>Re:Exactly what you're doing</title>
	<author>Caffeinated Geek</author>
	<datestamp>1267722900000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>It depends on the time frame but the music industry has lost a fair number of digital masters that were kept on hard disk rather than digital tape. If I remember correctly the issue was heads sticking on drives that are left on a shelf for years but I could be wrong.  <br>

I guess it seemed like a good idea to scrap tape since the recordings in question were direct digital recordings to disk. Analog tapes that are many years old can almost always be recovered at least long enough to make a duplicate. I've heard stories about some pretty extraordinary steps being taken to recover damaged master tapes. Digital tape is a little more difficult but still much more recoverable.</htmltext>
<tokentext>It depends on the time frame but the music industry has lost a fair number of digital masters that were kept on hard disk rather than digital tape .
If I remember correctly the issue was heads sticking on drives that are left on a shelf for years but I could be wrong .
I guess it seemed like a good idea to scrap tape since the recordings in question were direct digital recordings to disk .
Analog tapes that are many years old can almost always be recovered at least long enough to make a duplicate .
I 've heard stories about some pretty extraordinary steps being taken to recover damaged master tapes .
Digital tape is a little more difficult but still much more recoverable .</tokentext>
<sentencetext>It depends on the time frame but the music industry has lost a fair number of digital masters that were kept on hard disk rather than digital tape.
If I remember correctly the issue was heads sticking on drives that are left on a shelf for years but I could be wrong.
I guess it seemed like a good idea to scrap tape since the recordings in question were direct digital recordings to disk.
Analog tapes that are many years old can almost always be recovered at least long enough to make a duplicate.
I've heard stories about some pretty extraordinary steps being taken to recover damaged master tapes.
Digital tape is a little more difficult but still much more recoverable.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351714</id>
	<title>Use a single tape drive, not a tape library</title>
	<author>linuxgurugamer</author>
	<datestamp>1267616880000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>You don't need a tape library.  Just get a single tape drive, and you will be able to store everything on 3-6 tapes.  Yes, you will have to swap tapes by hand, but it is a lot cheaper.</p><p>LTO-4 stores 800 gig per tape, uncompressed.  If you let the tape drive do the compression, you might even be able to get away with one or two tapes.  Tapes are inexpensive, and are designed for long term storage.</p></htmltext>
<tokentext>You do n't need a tape library .
Just get a single tape drive , and you will be able to store everything on 3-6 tapes .
Yes , you will have to swap tapes by hand , but it is a lot cheaper .
LTO-4 stores 800 gig per tape , uncompressed .
If you let the tape drive do the compression , you might even be able to get away with one or two tapes .
Tapes are inexpensive , and are designed for long term storage .</tokentext>
<sentencetext>You don't need a tape library.
Just get a single tape drive, and you will be able to store everything on 3-6 tapes.
Yes, you will have to swap tapes by hand, but it is a lot cheaper.
LTO-4 stores 800 gig per tape, uncompressed.
If you let the tape drive do the compression, you might even be able to get away with one or two tapes.
Tapes are inexpensive, and are designed for long term storage.</sentencetext>
</comment>
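If a single archive won't fit on one cartridge, GNU tar's multi-volume mode handles the hand-swapping linuxgurugamer mentions: it pauses at end of tape and prompts for the next one (a sketch; device and path hypothetical):

    # Write one archive spanning several tapes; tar asks when to swap.
    tar -cvM -f /dev/st0 /data/archives

Note that tar's -z compression can't be combined with -M, which is one more reason to lean on the drive's hardware compression.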
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353470</id>
	<title>Re:use a tape drive</title>
	<author>Quietlife2k</author>
	<datestamp>1267628700000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>I hope modern tape drives are more reliable and accurate than the old DDS2 days.<br>
<br>
I once had a situation where I had three IBM servers on my desk to commission.  My boss told me "test everything" - I was young and reckless so I did.<br>
Everything was great until I started testing the tape backup.  Of the three machines -<br>
"Machine A" would read it's own and "Machine B"'s backups, but not "Machine C"'s.<br>
"Machine B" was happy with tapes from any of the machines.<br>
"Machine C" could only read tapes from "Machine B" and itself.<br>
<br>
Individually no single machine actually had a reportable fault, yet in combination they proved to provide a nightmare scenario.<br>
We could not know if a particular backup from one tape drive was going to restore on a different drive.  We wound up getting IBM to "tune" our drives into compatibility with each other, even then they would "drift" over time requiring regular checks.</htmltext>
<tokentext>I hope modern tape drives are more reliable and accurate than the old DDS2 days .
I once had a situation where I had three IBM servers on my desk to commission .
My boss told me " test everything " - I was young and reckless so I did .
Everything was great until I started testing the tape backup .
Of the three machines - " Machine A " would read its own and " Machine B " 's backups , but not " Machine C " 's .
" Machine B " was happy with tapes from any of the machines .
" Machine C " could only read tapes from " Machine B " and itself .
Individually no single machine actually had a reportable fault , yet in combination they proved to provide a nightmare scenario .
We could not know if a particular backup from one tape drive was going to restore on a different drive .
We wound up getting IBM to " tune " our drives into compatibility with each other ; even then they would " drift " over time , requiring regular checks .</tokentext>
<sentencetext>I hope modern tape drives are more reliable and accurate than the old DDS2 days.
I once had a situation where I had three IBM servers on my desk to commission.
My boss told me "test everything" - I was young and reckless so I did.
Everything was great until I started testing the tape backup.
Of the three machines -
"Machine A" would read it's own and "Machine B"'s backups, but not "Machine C"'s.
"Machine B" was happy with tapes from any of the machines.
"Machine C" could only read tapes from "Machine B" and itself.
Individually no single machine actually had a reportable fault, yet in combination they proved to provide a nightmare scenario.
We could not know if a particular backup from one tape drive was going to restore on a different drive.
We wound up getting IBM to "tune" our drives into compatibility with each other; even then they would "drift" over time, requiring regular checks.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351604</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352012</id>
	<title>Re:Exactly what you're doing</title>
	<author>Anonymous</author>
	<datestamp>1267618380000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Out of curiosity, what's wrong with S3 for protecting data?</p></htmltext>
<tokentext>Out of curiosity , what 's wrong with S3 for protecting data ?</tokentext>
<sentencetext>Out of curiosity, what's wrong with S3 for protecting data?</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353130</id>
	<title>Re:Use RAID6 not RAID5</title>
	<author>Anonymous</author>
	<datestamp>1267625520000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Define "very safe".  One fire or power spike could damage the whole lot at once.  The biggest problem with RAID is that all the drives must be stored in the same physical box.  To be really safe, you need an offsite backup.  Maybe that means 2 RAID setups, but somehow you have to update from one to the other periodically... and with each update, a failure during the update could mean data loss.</p></htmltext>
<tokentext>Define " very safe " .
One fire or power spike could damage the whole lot at once .
The biggest problem with RAID is that all the drives must be stored in the same physical box .
To be really safe , you need an offsite backup .
Maybe that means 2 RAID setups , but somehow you have to update from one to the other periodically ... and with each update , a failure during the update could mean data loss .</tokentext>
<sentencetext>Define "very safe".
One fire or power spike could damage the whole lot at once.
The biggest problem with RAID is that all the drives must be stored in the same physical box.
To be really safe, you need an offsite backup.
Maybe that means 2 RAID setups, but somehow you have to update from one to the other periodically... and with each update, a failure during the update could mean data loss.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351578</parent>
</comment>
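For the periodic update between the two RAID setups, a resumable network sync limits the window where a mid-transfer failure hurts (a sketch; hostname and paths hypothetical):

    # Push changes to the offsite array; --partial keeps interrupted
    # transfers so a failed update can resume instead of starting over.
    rsync -az --partial /mnt/archive/ backup-site:/mnt/archive/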
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31366458</id>
	<title>taps for tape</title>
	<author>epine</author>
	<datestamp>1267718400000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p><a href="http://www.theregister.co.uk/2009/08/13/imation_lto5_media/" title="theregister.co.uk">Is LTO-5 the last hurrah for tape?</a> [theregister.co.uk]</p><p>What about the problem of having a working LTO-4 drive ten years from now, if the tape industry begins to wither as other solutions continue to eat away at the tape market?</p><p>I think the creativity here is negotiating the nature of the SLA contract with the clients.  My preference is just to set up a local disk array with enough spinning capacity, not promising to survive a site disaster, charging for the service only so long as the data remains live.</p><p>To complement this, send each client a master LTO-4 tape (or a disk drive) and tell them "it's up to you if you need to recover this, but I'll help you out if I'm able".</p><p>Otherwise, you get into this horrible risk calculus where the client is not thinking through the cost benefit with rational comprehension.</p><p>I would try to find some way to unpack convenience from autonomy from ultimate responsibility, because if you don't, your clients shouldn't be balking at the price of Amazon S3.  If they are balking, it's because they don't really want all three of these packed together, but the timid bureaucrats don't wish to admit this, in case the day comes when data is lost.</p><p>In economics, it is common to do net present value calculations.  It would be interesting to do a backward discount on prudence if the day comes when the shit hits the fan.  There's a lot of weird asymmetry in human psychology associated with risk.</p><p>From a business perspective, it's sometimes good to give your customers options priced at levels where you expect not to get many takers.  See the story about The Economist subscription model in Ariely's lecture:</p><p><a href="http://www.ted.com/talks/dan_ariely_on_our_buggy_moral_code.html" title="ted.com">Dan Ariely on our buggy moral code</a> [ted.com]</p><p>It's amazingly hard to find residential fire statistics on a per annum risk basis, if we're looking at personal acts of god rather than communal acts of god (hurricane, earthquake, etc.).  The firefighters meticulously count the number of times they respond, but seem not to talk to the fire insurance people about the number of structures insured.  Not one report I looked at from the UK, the US, or Canada denominated the statistics per residence.</p><p>A loose estimate for Canada in 2002 is a residential risk rate of 1 in 300 per annum.  Older building stock with plush curtains and deep fat fryers will have higher rates; recent building stock with working fire detectors and no children will have lower risk.</p><p>Another table shows me that the risk of a 45 year old male being diagnosed with cancer by age 50 is 1.5%, or about 300:1 against per annum.  Radon gas causes 15% of lung cancer, and 15% of American homes exceed recommended action levels.  How many of the stripes+parity+fail_over+hot_spare+IronMountain crowd here have bothered to purchase the $50 home radon test (excluding smokers who smoke indoors, who are in a different risk category altogether)?</p><p>The human mind seems to incorporate an instinctive Bayesian prior that if you are actively discussing a risk, the risk is immediately ten or a hundred times greater than it was five minutes ago, before the risk entered the conversation.</p><p>Likewise, any hill you are standing at the bottom of seems ten times higher or steeper than any comparable hill on the other side of the valley.</p></htmltext>
<tokentext>Is LTO-5 the last hurrah for tape ?
[ theregister.co.uk ]
What about the problem of having a working LTO-4 drive ten years from now , if the tape industry begins to wither as other solutions continue to eat away at the tape market ?
I think the creativity here is negotiating the nature of the SLA contract with the clients .
My preference is just to set up a local disk array with enough spinning capacity , not promising to survive a site disaster , charging for the service only so long as the data remains live .
To complement this , send each client a master LTO-4 tape ( or a disk drive ) and tell them " it 's up to you if you need to recover this , but I 'll help you out if I 'm able " .
Otherwise , you get into this horrible risk calculus where the client is not thinking through the cost benefit with rational comprehension .
I would try to find some way to unpack convenience from autonomy from ultimate responsibility , because if you do n't , your clients should n't be balking at the price of Amazon S3 .
If they are balking , it 's because they do n't really want all three of these packed together , but the timid bureaucrats do n't wish to admit this , in case the day comes when data is lost .
In economics , it is common to do net present value calculations .
It would be interesting to do a backward discount on prudence if the day comes when the shit hits the fan .
There 's a lot of weird asymmetry in human psychology associated with risk .
From a business perspective , it 's sometimes good to give your customers options priced at levels where you expect not to get many takers .
See the story about The Economist subscription model in Ariely 's lecture : Dan Ariely on our buggy moral code [ ted.com ]
It 's amazingly hard to find residential fire statistics on a per annum risk basis , if we 're looking at personal acts of god rather than communal acts of god ( hurricane , earthquake , etc. ) .
The firefighters meticulously count the number of times they respond , but seem not to talk to the fire insurance people about the number of structures insured .
Not one report I looked at from the UK , the US , or Canada denominated the statistics per residence .
A loose estimate for Canada in 2002 is a residential risk rate of 1 in 300 per annum .
Older building stock with plush curtains and deep fat fryers will have higher rates ; recent building stock with working fire detectors and no children will have lower risk .
Another table shows me that the risk of a 45 year old male being diagnosed with cancer by age 50 is 1.5 % , or about 300 : 1 against per annum .
Radon gas causes 15 % of lung cancer , and 15 % of American homes exceed recommended action levels .
How many of the stripes + parity + fail_over + hot_spare + IronMountain crowd here have bothered to purchase the $ 50 home radon test ( excluding smokers who smoke indoors , who are in a different risk category altogether ) ?
The human mind seems to incorporate an instinctive Bayesian prior that if you are actively discussing a risk , the risk is immediately ten or a hundred times greater than it was five minutes ago , before the risk entered the conversation .
Likewise , any hill you are standing at the bottom of seems ten times higher or steeper than any comparable hill on the other side of the valley .</tokentext>
<sentencetext>Is LTO-5 the last hurrah for tape?
[theregister.co.uk]
What about the problem of having a working LTO-4 drive ten years from now, if the tape industry begins to wither as other solutions continue to eat away at the tape market?
I think the creativity here is negotiating the nature of the SLA contract with the clients.
My preference is just to set up a local disk array with enough spinning capacity, not promising to survive a site disaster, charging for the service only so long as the data remains live.
To complement this, send each client a master LTO-4 tape (or a disk drive) and tell them "it's up to you if you need to recover this, but I'll help you out if I'm able".
Otherwise, you get into this horrible risk calculus where the client is not thinking through the cost-benefit with rational comprehension.
I would try to find some way to unpack convenience from autonomy from ultimate responsibility, because if you don't, your clients shouldn't be balking at the price of Amazon S3.
If they are balking, it's because they don't really want all three of these packed together, but the timid bureaucrats don't wish to admit this, in case the day comes when data is lost.
In economics, it is common to do net present value calculations.
It would be interesting to do a backward discount on prudence if the day comes when the shit hits the fan.
There's a lot of weird asymmetry in human psychology associated with risk.
From a business perspective, it's sometimes good to give your customers options priced at levels where you expect not to get many takers.
See the story about The Economist subscription model in Ariely's lecture:
Dan Ariely on our buggy moral code [ted.com]
It's amazingly hard to find residential fire statistics on a per-annum risk basis, if we're looking at personal acts of God rather than communal acts of God (hurricane, earthquake, etc.)
The firefighters meticulously count the number of times they respond, but seem not to talk to the fire insurance people about the number of structures insured.
Not one report I looked at from the UK, the US, or Canada denominated the statistics per residence.
A loose estimate for Canada in 2002 is a residential risk rate of 1 in 300 per annum.
Older building stock with plush curtains and deep-fat fryers will have higher rates, recent building stock with working fire detectors and no children will have lower risk.
Another table shows me that the risk of a 45-year-old male being diagnosed with cancer by age 50 is 1.5%, or about 300:1 against per annum.
Radon gas causes 15% of lung cancer, and 15% of American homes exceed recommended action levels.
How many of the stripes+parity+fail_over+hot_spare+IronMountain crowd here have bothered to purchase the $50 home radon test (excluding smokers who smoke indoors, who are in a different risk category altogether)?
The human mind seems to incorporate an instinctive Bayesian prior that if you are actively discussing a risk, the risk is immediately ten or a hundred times greater than it was five minutes ago, before the risk entered the conversation.
Likewise, any hill you are standing at the bottom of seems ten times higher or steeper than any comparable hill on the other side of the valley.</sentencetext>
</comment>
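<p>A worked version of the discounting arithmetic above, as a minimal Python sketch: the 1-in-300 annual rate is the figure cited in the comment, while the loss cost, discount rate, and horizon are invented purely for illustration.</p><pre>
# Hypothetical "backward discount on prudence": present value of the
# expected loss over a retention horizon. All numbers are assumptions.

ANNUAL_RISK = 1 / 300     # the rough per-annum residential rate cited above
LOSS_COST = 50_000.0      # assumed cost of losing / regenerating a dataset
DISCOUNT_RATE = 0.05      # assumed cost of capital
YEARS = 10                # assumed retention horizon

def pv_of_expected_loss(risk, cost, rate, years):
    """Sum the discounted expected loss for each year of the horizon."""
    return sum(risk * cost / (1 + rate) ** t for t in range(1, years + 1))

pv = pv_of_expected_loss(ANNUAL_RISK, LOSS_COST, DISCOUNT_RATE, YEARS)
print(f"PV of expected loss over {YEARS} years: ${pv:,.2f}")
# Any mitigation (tape, S3, Iron Mountain) cheaper than this PV is worth
# buying on expected value; anything pricier is paying for peace of mind.
</pre>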
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351572</id>
	<title>On the right track</title>
	<author>LoudMusic</author>
	<datestamp>1267616340000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Hard drives are by far the best route. The only thing I might change would be to use a pre-packaged external disk array system. Depending on what you get out of your compression, a two- or four-drive external system with 1TB drives would suffice. You can get them in USB, FireWire, eSATA, and SCSI. A nice USB2 (or upcoming USB3) system would seem to make lots of sense and remain usable with future systems. They tend to be fairly compact and have decent read/write speeds.</p><p>My main caveat with these things is that just because it's external does not mean it's particularly portable. The larger-capacity devices use "desktop" hard drives, which are more susceptible to movement than the smaller drives used in laptops. Give it time to power down before moving it, and move it as little as possible. I've seen too many of these things die because the user would move it while it was on or use it as a transport device. That is NOT what they are for.</p></htmltext>
<tokenext>Hard drives are by far the best route .
The only thing I might change would be to use a pre-packaged external disk array system .
Depending on what you get out of your compression , a two or four drive external system with 1TB drives would suffice .
You can get them in USB , FireWire , eSATA , and SCSI .
A nice USB2 ( or upcoming USB3 ) system would seem to make lots of sense and be acceptable by future systems .
They tend to be fairly compact and have decent read/write speeds .
My main caveat with these things is that just because it 's external does not mean it 's particularly portable .
The larger capacity devices use " desktop " hard drives which are more susceptible to movement than the smaller drives used in laptops .
Give it time to power down before moving and move it as little as possible .
I 've seen too many of these things die because the user would move it while it was on or use it as a transport device .
That is NOT what they are for .</tokentext>
<sentencetext>Hard drives are by far the best route.
The only thing I might change would be to use a pre-packaged external disk array system.
Depending on what you get out of your compression, a two or four drive external system with 1TB drives would suffice.
You can get them in USB, FireWire, eSATA, and SCSI.
A nice USB2 (or upcoming USB3) system would seem to make lots of sense and be acceptable by future systems.
They tend to be fairly compact and have decent read/write speeds.
My main caveat with these things is that just because it's external does not mean it's particularly portable.
The larger capacity devices use "desktop" hard drives which are more susceptible to movement than the smaller drives used in laptops.
Give it time to power down before moving and move it as little as possible.
I've seen too many of these things die because the user would move it while it was on or use it as a transport device.
That is NOT what they are for.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351720</id>
	<title>LTO-4?</title>
	<author>Chalex</author>
	<datestamp>1267616940000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>With easily compressible data (e.g. genomics data), I've gotten as much as 5TB onto a single LTO-4 tape using the regular drive compression.</p><p>An LTO-4 tape costs me ~$50.  It's smaller than a 3.5" SATA drive and easier to handle.  It can probably even survive a drop to the floor from chest height.</p><p>You'll need to spend some money on a drive or tape library.  So it depends on how many datasets like this you need to write.</p></htmltext>
<tokenext>With easily compressible data ( e.g. genomics data ) , I 've gotten as much as 5TB onto a single LTO-4 tape using the regular drive compression .
An LTO-4 tape costs me ~ $ 50 .
It 's smaller than a 3.5 " SATA drive and easier to handle .
It can probably even survive a drop to the floor from chest height .
You 'll need to spend some money on a drive or tape library .
So it depends on how many datasets like this you need to write .</tokentext>
<sentencetext>With easily compressible data (e.g. genomics data), I've gotten as much as 5TB onto a single LTO-4 tape using the regular drive compression.
An LTO-4 tape costs me ~$50.
It's smaller than a 3.5" SATA drive and easier to handle.
It can probably even survive a drop to the floor from chest height.
You'll need to spend some money on a drive or tape library.
So it depends on how many datasets like this you need to write.</sentencetext>
</comment>
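<p>To make the "depends on how many datasets" point concrete, here is a back-of-envelope Python sketch comparing tape against bare disks; the ~$50 cartridge price comes from the comment above, while the drive and disk prices are assumptions.</p><pre>
# Assumed prices: the break-even point, not the exact dollars, is the point.
TAPE_DRIVE = 2500.0   # one-time cost of an LTO-4 drive (assumed)
TAPE_MEDIA = 50.0     # per cartridge, per the comment
DISK = 100.0          # per 1 TB SATA drive (assumed)
TAPES_PER_SET = 2     # 2-3 TB compressed onto one or two cartridges
DISKS_PER_SET = 4     # 3 TB plus parity on 1 TB drives

for n in (1, 5, 10, 25, 50):
    tape = TAPE_DRIVE + n * TAPES_PER_SET * TAPE_MEDIA
    disk = n * DISKS_PER_SET * DISK
    print(f"{n:3d} datasets: tape ${tape:8,.0f}   disk ${disk:8,.0f}")
# With these numbers, tape pulls ahead somewhere past eight datasets.
</pre>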
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351970</id>
	<title>Re:Exactly what you're doing</title>
	<author>Sponge Bath</author>
	<datestamp>1267618200000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p> <i>Depending on how important the data is...</i></p><p>That's the key question the author needs to address. Is it important enough to throw a few thousand dollars per dataset into archiving? A few tens of thousands? The best suggestion seems to be multiple copies on multiple non-RAID hard drives stored at different physical locations, with periodic integrity checks and regularly scheduled drive replacements.</p></htmltext>
<tokenext>Depending on how important the data is ...
That 's the key question the author needs to address .
Is it important enough to throw a few thousand dollars per dataset into archiving ?
A few tens of thousands ?
The best suggestion seems to be multiple copies on multiple non-RAID hard drives stored at different physical locations , with periodic integrity checks and regularly scheduled drive replacements .</tokentext>
<sentencetext>Depending on how important the data is...
That's the key question the author needs to address.
Is it important enough to throw a few thousand dollars per dataset into archiving?
A few tens of thousands?
The best suggestion seems to be multiple copies on multiple non-RAID hard drives stored at different physical locations, with periodic integrity checks and regularly scheduled drive replacements.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504</parent>
</comment>
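<p>The "periodic integrity checks" suggested above can be as simple as a SHA-256 manifest written when the copies are made and re-verified on a schedule. A minimal sketch, with hypothetical paths:</p><pre>
import hashlib
import os
import sys

def sha256_of(path, bufsize=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(bufsize):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(root, manifest="MANIFEST.sha256"):
    # Run once, when the archive copy is made.
    with open(manifest, "w") as out:
        for dirpath, _, names in os.walk(root):
            for name in sorted(names):
                p = os.path.join(dirpath, name)
                out.write(f"{sha256_of(p)}  {p}\n")

def verify_manifest(manifest="MANIFEST.sha256"):
    # Run on every scheduled check; flags silent corruption.
    ok = True
    with open(manifest) as f:
        for line in f:
            digest, path = line.rstrip("\n").split("  ", 1)
            if sha256_of(path) != digest:
                print(f"CORRUPT: {path}", file=sys.stderr)
                ok = False
    return ok

write_manifest("archive/")   # "archive/" is a hypothetical directory
print("ok" if verify_manifest() else "verification FAILED")
</pre>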
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31358044</id>
	<title>Anonymous Coward</title>
	<author>Anonymous</author>
	<datestamp>1267718820000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>How about using a Blu-ray jukebox or some sort of WORM technology?  Google Powerfile (I have no vested interest in this company) and you might find their technology useful.  It's proprietary, but I'm assuming long-term data storage cost should go down significantly.  Obviously, there are other WORM technologies out there that might do the job just fine.</p></htmltext>
<tokenext>How about using a Blu-ray jukebox or some sort of WORM technology ?
Google Powerfile ( I have no vested interest in this company ) and you might find their technology useful .
It 's proprietary , but I 'm assuming long-term data storage cost should go down significantly .
Obviously , there are other WORM technologies out there that might do the job just fine .</tokentext>
<sentencetext>How about using a Blu-ray jukebox or some sort of WORM technology?
Google Powerfile (I have no vested interest in this company) and you might find their technology useful.
It's proprietary, but I'm assuming long-term data storage cost should go down significantly.
Obviously, there are other WORM technologies out there that might do the job just fine.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31360730</id>
	<title>Online cloud storage.</title>
	<author>barfcat</author>
	<datestamp>1267731300000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Barracuda Networks offers a backup service that scales very well with large databases. They also use a deduplication technology that examines each file part by part. If two parts are the same, then only one is kept and the other gets a marker. In most cases the deduplication ratio is at least 5x (meaning 1 TB turns into 200 GB), and sometimes it can go as high as 50x (this is mostly the case with images). After dedup, the data is encrypted and compressed (compression far outweighs the encryption in terms of size) and THEN it's sent off to the cloud.</htmltext>
<tokenext>Barracuda networks offers a backup service that scales very well with large databases .
They also use a deduplication technology that examines each file part by part .
If two parts are the same then only one is kept and the other has a marker .
In most cases the deduplication ratio is at least 5x ( meaning 1 TB turns into 200 GB ) and sometimes it can go as high as 50x ( this is mostly the case with images ) .
After dedup , the data is encrypted and compressed ( compression far outweighs the encryption in terms of size ) and THEN it 's sent off to the cloud .</tokentext>
<sentencetext>Barracuda networks offers a backup service that scales very well with large databases.
They also use a deduplication technology that examines each file part by part.
If two parts are the same then only one is kept and the other has a marker.
In most cases the deduplication ratio is at least 5x (meaning 1 TB turns into 200 GB), and sometimes it can go as high as 50x (this is mostly the case with images).
After dedup, the data is encrypted and compressed (compression far outweighs the encryption in terms of size) and THEN it's sent off to the cloud.</sentencetext>
</comment>
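<p>A toy illustration of the part-by-part deduplication described above: fixed-size chunks, one stored copy per unique chunk, and a per-file list of chunk hashes standing in for the "markers". Commercial products use variable-size chunking; the 64 KiB figure here is arbitrary.</p><pre>
import hashlib

CHUNK = 64 * 1024  # fixed 64 KiB parts (an arbitrary toy value)

def dedup(paths):
    store = {}    # chunk hash -> chunk bytes, kept exactly once
    recipes = {}  # path -> ordered list of chunk hashes ("markers")
    logical = 0
    for path in paths:
        hashes = []
        with open(path, "rb") as f:
            while chunk := f.read(CHUNK):
                logical += len(chunk)
                digest = hashlib.sha256(chunk).hexdigest()
                store.setdefault(digest, chunk)  # keep only the first copy
                hashes.append(digest)
        recipes[path] = hashes
    physical = sum(len(c) for c in store.values())
    print(f"dedup ratio: {logical / max(physical, 1):.1f}x")
    return store, recipes
</pre>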
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352046</id>
	<title>Compression could be your friend</title>
	<author>stubby326</author>
	<datestamp>1267618560000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>The question I have is: what happens when you run your data sets through a compression program like RAR? One of our standard database backups weighs in at 60 GB, but compresses nicely down to about 5 GB. If you can get your compressed data set under 2 TB, I suggest backing the compressed data set up onto a mirror set and storing it. The trick is getting it small enough to fit on a single drive. If it doesn't compress nicely, and still ends up requiring some kind of disk spanning, just stick with what you're doing. It's a solid solution.</htmltext>
<tokenext>The question I have is what happens when you run your data sets through a compression program like RAR ?
One of our standard database backups weighs in at 60 GB , but compresses nicely down to about 5 GB .
If you can get your compressed data set under 2 TB , I suggest backing the compressed data set up onto a mirror set and storing it .
The trick is getting it small enough to fit on a single drive .
If it does n't compress nicely , and still ends up requiring some kind of disk spanning , just stick with what you 're doing .
It 's a solid solution .</tokentext>
<sentencetext>The question I have is what happens when you run your data sets through a compression program like RAR?
One of our standard database backups weighs in at 60 GB, but compresses nicely down to about 5 GB.
If you can get your compressed data set under 2 TB, I suggest backing the compressed data set up onto a mirror set and storing it.
The trick is getting it small enough to fit on a single drive.
If it doesn't compress nicely, and still ends up requiring some kind of disk spanning, just stick with what you're doing.
It's a solid solution.</sentencetext>
</comment>
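<p>Before committing to RAR or anything else, it's cheap to sample the achievable ratio with a stock compressor. A sketch using Python's built-in LZMA (xz) support; the input file name and the 2 TB single-drive target are stand-ins:</p><pre>
import lzma
import os

SRC = "dataset.dump"     # hypothetical input
DST = SRC + ".xz"
TARGET = 2 * 10**12      # the 2 TB single-drive goal from above

with open(SRC, "rb") as fin, lzma.open(DST, "wb", preset=6) as fout:
    while chunk := fin.read(1 << 20):   # stream in 1 MiB pieces
        fout.write(chunk)

before, after = os.path.getsize(SRC), os.path.getsize(DST)
print(f"{before} -> {after} bytes ({before / after:.1f}x)")
print("fits on one drive" if after < TARGET else "still needs spanning")
</pre>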
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351446</id>
	<title>Go with Blu-ray</title>
	<author>sabreofsd</author>
	<datestamp>1267615740000</datestamp>
	<modclass>Interestin</modclass>
	<modscore>3</modscore>
	<htmltext>With the advent of 2TB drives, you could easily combine 3 of these with software RAID 5 as you suggested.  Depending on how long you need to keep the data, recording them to dual-layer Blu-ray discs might be a better solution.  Ya, it's a lot of discs (you can buy 100GB discs now), but they'll last longer and you don't have to worry so much about mechanical failure or needing a certain OS when you want to restore them.</htmltext>
<tokenext>With the advent of 2TB drives , you could easily combine 3 of these with software RAID 5 as you suggested .
Depending on how long you need to keep the data , recording them to dual-layer Blu-ray discs might be a better solution .
Ya , it 's a lot of discs ( you can buy 100GB discs now ) , but they 'll last longer and you do n't have to worry so much about mechanical failure or needing a certain OS when you want to restore them .</tokentext>
<sentencetext>With the advent of 2TB drives, you could easily combine 3 of these with software RAID 5 as you suggested.
Depending on how long you need to keep the data, recording them to dual-layer Blu-ray discs might be a better solution.
Ya, it's a lot of discs (you can buy 100GB discs now), but they'll last longer and you don't have to worry so much about mechanical failure or needing a certain OS when you want to restore them.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351420</id>
	<title>Amazon S3</title>
	<author>friedo</author>
	<datestamp>1267615560000</datestamp>
	<modclass>Informativ</modclass>
	<modscore>3</modscore>
	<htmltext><p>It can get a little pricey for huge datasets, but Amazon S3 now has an option where you can <a href="http://aws.amazon.com/importexport/" title="amazon.com">ship your data</a> [amazon.com] on a big set of disks directly to them, they will import everything into S3, and it will live there forever. The nice thing about S3 is unlike physical disks, it can grow essentially forever, and comes with retention and redundancy guarantees. And once your stuff is in S3, you can recycle the same disks to mail them more data.</p></htmltext>
<tokenext>It can get a little pricey for huge datasets , but Amazon S3 now has an option where you can ship your data [ amazon.com ] on a big set of disks directly to them , they will import everything into S3 , and it will live there forever .
The nice thing about S3 is unlike physical disks , it can grow essentially forever , and comes with retention and redundancy guarantees .
And once your stuff is in S3 , you can recycle the same disks to mail them more data .</tokentext>
<sentencetext>It can get a little pricey for huge datasets, but Amazon S3 now has an option where you can ship your data [amazon.com] on a big set of disks directly to them, they will import everything into S3, and it will live there forever.
The nice thing about S3 is unlike physical disks, it can grow essentially forever, and comes with retention and redundancy guarantees.
And once your stuff is in S3, you can recycle the same disks to mail them more data.</sentencetext>
</comment>
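<p>For scale, the upload side of the S3 route is only a few lines. A hedged sketch using today's boto3 client (which postdates this thread); the bucket and key names are hypothetical:</p><pre>
import boto3

s3 = boto3.client("s3")
s3.upload_file(
    Filename="customer-archive.tar.xz",   # hypothetical local file
    Bucket="my-archive-bucket",           # hypothetical bucket
    Key="customer/2010-03/archive.tar.xz",
)

# Listing what's stored is equally short:
resp = s3.list_objects_v2(Bucket="my-archive-bucket")
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
</pre>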
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351602</id>
	<title>Extra Care Required!!!</title>
	<author>mpapet</author>
	<datestamp>1267616400000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>There are going to be quite a few storage service names thrown out, as well as compression schemes.</p><p>1. Storage vendors: you run a real risk of having the data go away. There's a huge liability balancing act going this route.<br>2. Compression schemes: as someone who has lost data to compression errors, the consequences of 'just' compressing a file can be huge.  <a href="http://www.linuxquestions.org/questions/linux-software-2/recovering-files-from-corrupt-tar-archive...-326716/" title="linuxquestions.org">http://www.linuxquestions.org/questions/linux-software-2/recovering-files-from-corrupt-tar-archive...-326716/</a> [linuxquestions.org]  (not my post, but a similar story)</p><p>I would suggest building tape archives, but as I mentioned above, this can be more hazardous than it should be.  (ANY Backup Exec admin who has blindly relied on Symantec's solution without testing, testing, testing has horror stories.)</p><p>Finally, I'd probably go with WORM optical media as the final storage media, with a tape backup.  There are lots of process decisions after just recommending hardware, so you are hardly done.</p></htmltext>
<tokenext>There are going to be quite a few storage service names thrown out , as well as compression schemes .
1 . Storage vendors : you run a real risk of having the data go away .
There 's a huge liability balancing act going this route .
2 . Compression schemes : as someone who has lost data to compression errors , the consequences of ' just ' compressing a file can be huge .
http://www.linuxquestions.org/questions/linux-software-2/recovering-files-from-corrupt-tar-archive...-326716/ [ linuxquestions.org ] ( not my post , but a similar story )
I would suggest building tape archives , but as I mentioned above , this can be more hazardous than it should be .
( ANY Backup Exec admin who has blindly relied on Symantec 's solution without testing , testing , testing has horror stories . )
Finally , I 'd probably go with WORM optical media as the final storage media , with a tape backup .
There are lots of process decisions after just recommending hardware , so you are hardly done .</tokentext>
<sentencetext>There are going to be quite a few storage service names thrown out, as well as compression schemes.
1. Storage vendors: you run a real risk of having the data go away.
There's a huge liability balancing act going this route.
2. Compression schemes: as someone who has lost data to compression errors, the consequences of 'just' compressing a file can be huge.
http://www.linuxquestions.org/questions/linux-software-2/recovering-files-from-corrupt-tar-archive...-326716/ [linuxquestions.org] (not my post, but a similar story)
I would suggest building tape archives, but as I mentioned above, this can be more hazardous than it should be.
(ANY Backup Exec admin who has blindly relied on Symantec's solution without testing, testing, testing has horror stories.)
Finally, I'd probably go with WORM optical media as the final storage media, with a tape backup.
There are lots of process decisions after just recommending hardware, so you are hardly done.</sentencetext>
</comment>
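<p>The corrupt-tar story above is an argument for verifying every archive immediately after writing it. A minimal Python check that reads every member back and discards the bytes, so a bad block surfaces now rather than years later:</p><pre>
import sys
import tarfile

def verify_tar(path):
    try:
        with tarfile.open(path, "r:*") as tar:   # r:* autodetects compression
            for member in tar:
                if member.isfile():
                    f = tar.extractfile(member)
                    while f.read(1 << 20):       # read and discard
                        pass
    except (tarfile.TarError, OSError) as exc:
        print(f"{path}: FAILED ({exc})", file=sys.stderr)
        return False
    return True

if __name__ == "__main__":
    sys.exit(0 if verify_tar(sys.argv[1]) else 1)
</pre>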
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353002</id>
	<title>Not-so-safe deposit</title>
	<author>Anonymous</author>
	<datestamp>1267624440000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>While working from home several years back, I got a safe deposit box to store hard drives in this manner.<br>I never had a problem, but about six months in I read the Terms &amp; Conditions again: there was a very strongly worded statement (in the fine print, of course) along the lines that they <b>did not recommend storing magnetic data in the facility</b>.<br>When I thought about it, this made sense: this was a general safe deposit facility; laundered drug money, secret agent stuff, jewels, <i>rare-earth magnet collections...</i></p><p>Unless the facility you use (for hard disk or tape) explicitly sets out to protect you from it, you can't be sure what other people put in the box next to yours.<br>If the backup is very important, make sure you check the facility's policy and enforcement on this.</p></htmltext>
<tokenext>While working from home several years back , I got a safe deposit box to store hard drives in this manner .
I never had a problem , but about six months in I read the Terms &amp; Conditions again : there was a very strongly worded statement ( in the fine print , of course ) along the lines that they did not recommend storing magnetic data in the facility .
When I thought about it , this made sense : this was a general safe deposit facility ; laundered drug money , secret agent stuff , jewels , rare-earth magnet collections ...
Unless the facility you use ( for hard disk or tape ) explicitly sets out to protect you from it , you ca n't be sure what other people put in the box next to yours .
If the backup is very important , make sure you check the facility 's policy and enforcement on this .</tokentext>
<sentencetext>While working from home several years back, I got a safe deposit box to store hard drives in this manner.
I never had a problem, but about six months in I read the Terms &amp; Conditions again: there was a very strongly worded statement (in the fine print, of course) along the lines that they did not recommend storing magnetic data in the facility.
When I thought about it, this made sense: this was a general safe deposit facility; laundered drug money, secret agent stuff, jewels, rare-earth magnet collections...
Unless the facility you use (for hard disk or tape) explicitly sets out to protect you from it, you can't be sure what other people put in the box next to yours.
If the backup is very important, make sure you check the facility's policy and enforcement on this.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353082</id>
	<title>Questions about tape</title>
	<author>inhuman_4</author>
	<datestamp>1267625040000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>I often hear on Slashdot that tapes are much better for long-term storage. However, I have never used one myself.</p><p>What kind of experiences do Slashdot readers have with tapes? How long do they last in storage? At what size of storage do they become cost-effective?<br>I know some manufacturer somewhere will have data sheets that could answer all my questions, but I would prefer to hear first-hand accounts rather than marketing ads.</p></htmltext>
<tokenext>I often hear on Slashdot that tapes are much better for long-term storage .
However , I have never used one myself .
What kind of experiences do Slashdot readers have with tapes ?
How long do they last in storage ?
At what size of storage do they become cost-effective ?
I know some manufacturer somewhere will have data sheets that could answer all my questions , but I would prefer to hear first-hand accounts rather than marketing ads .</tokentext>
<sentencetext>I often hear on Slashdot that tapes are much better for long-term storage.
However, I have never used one myself.
What kind of experiences do Slashdot readers have with tapes?
How long do they last in storage?
At what size of storage do they become cost-effective?
I know some manufacturer somewhere will have data sheets that could answer all my questions, but I would prefer to hear first-hand accounts rather than marketing ads.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357758</id>
	<title>Preservation</title>
	<author>Mark Trade</author>
	<datestamp>1267716960000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Other than the bare physical layer, you will want to think about restoring and reusing the data at some point in the future when the software used to generate the data is no longer available. What good is a RAID 5 of a couple Gigs of data when you can't use it? Or don't know what the data actually meant?</p></htmltext>
<tokenext>Other than the bare physical layer , you will want to think about restoring and reusing the data at some point in the future when the software used to generate the data is no longer available .
What good is a RAID 5 of a couple Gigs of data when you ca n't use it ?
Or do n't know what the data actually meant ?</tokentext>
<sentencetext>Other than the bare physical layer, you will want to think about restoring and reusing the data at some point in the future when the software used to generate the data is no longer available.
What good is a RAID 5 of a couple Gigs of data when you can't use it?
Or don't know what the data actually meant?</sentencetext>
</comment>
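<p>One cheap hedge against the "what did the data actually mean?" problem is a plain-text metadata sidecar stored next to every archive. A sketch; the field names and values are only suggestions:</p><pre>
import json
import time

sidecar = {
    "dataset": "customer42-run7",                 # hypothetical identifier
    "created": time.strftime("%Y-%m-%d"),
    "container": "tar.xz",                        # whatever is actually true
    "generated_by": "in-house pipeline v3.2",     # tool and version
    "schema_notes": "units are SI; see codebook.txt inside the archive",
    "contact": "archives@example.com",
}

with open("customer42-run7.meta.json", "w") as f:
    json.dump(sidecar, f, indent=2)
</pre>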
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351626</id>
	<title>For what it's worth</title>
	<author>Anonymous</author>
	<datestamp>1267616520000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Every hard drive (7 or 8) that I've put in my safe deposit box has come out with file system errors.  It looks like random bit-flips to me, and all of them were recoverable to some extent (minor data loss).</p><p>I have no idea why this happens, and nobody I've talked with can come up with a good explanation.  There were probably 3 different brands, and not all from the same series from each brand.  Some had been used reliably for years prior to storage and others were brand new.</p><p>Bottom line:  you probably shouldn't assume that safety from physical access is the same as data safety.  For all you know, someone is storing their magnet collection above you.</p></htmltext>
<tokenext>Every hard drive ( 7 or 8 ) that I 've put in my safe deposit box has come out with file system errors .
It looks like random bit-flips to me , and all of them were recoverable to some extent ( minor data loss ) .
I have no idea why this happens , and nobody I 've talked with can come up with a good explanation .
There were probably 3 different brands and not all from the same series from each brand .
Some had been used reliably for years prior to storage and others were brand new .
Bottom line : You probably should n't assume that safety from physical access is the same as data safety .
For all you know , someone is storing their magnet collection above you .</tokentext>
<sentencetext>Every hard drive (7 or 8) that I've put in my safe deposit box has come out with file system errors.
It looks like random bit-flips to me, and all of them were recoverable to some extent (minor data loss).
I have no idea why this happens, and nobody I've talked with can come up with a good explanation.
There were probably 3 different brands and not all from the same series from each brand.
Some had been used reliably for years prior to storage and others were brand new.
Bottom line:  You probably shouldn't assume that safety from physical access is the same as data safety.
For all you know, someone is storing their magnet collection above you.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353350</id>
	<title>Data Domain De-duplicated NAS</title>
	<author>good soldier svejk</author>
	<datestamp>1267627620000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Basically it is a NAS that breaks your data into sub-block-sized chunks at ingestion, checksums them, and single-instances the chunks. Most users get 20:1 average compression. Obviously different data types de-dupe at different efficiencies. Our VM images typically hit 40:1, SQL Server DBs about 30:1, medical images 5:1.

DD OS now includes some archive-specific features, which it sounds like you might find useful. It also has a very robust replication feature. Since it was originally designed for backup, it is optimized for data integrity. DD is dead simple to configure and operate, and the support is great.

<a href="http://www.datadomain.com/products/appliances.html" title="datadomain.com">http://www.datadomain.com/products/appliances.html</a> [datadomain.com]</htmltext>
<tokenext>Basically it is a NAS that breaks your data into sub-block sized chunks at ingestion , checksums it and single instances the chunks .
Most users get 20 : 1 average compression .
Obviously different data types de-dupe at different efficiencies .
Our VM images typically hit 40 : 1 , SQL Server DBs about 30 : 1 , medical images 5 : 1 .
DDOS now includes some archive specific features , which it sounds like you might find useful .
It also has a very robust replication feature .
Since it was originally designed for backup , it is optimized for data integrity .
DD is dead simple to configure and operate and the support is great .
http : //www.datadomain.com/products/appliances.html [ datadomain.com ]</tokentext>
<sentencetext>Basically it is a NAS that breaks your data into sub-block sized chunks at ingestion, checksums it and single instances the chunks.
Most users get 20:1 average compression.
Obviously different data types de-dupe at different efficiencies.
Our VM images typically hit 40:1, SQL Server DBs about 30:1, medical images 5:1.
DDOS now includes some archive specific features, which it sounds like you might find useful.
It also has a very robust replication feature.
Since it was originally designed for backup, it is optimized for data integrity.
DD is dead simple to configure and operate and the support is great.
http://www.datadomain.com/products/appliances.html [datadomain.com]</sentencetext>
</comment>
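<p>The sub-block chunking such appliances rely on can be sketched with a rolling hash that picks content-defined cut points, so an insertion mid-file only disturbs nearby chunks. The hash below is a crude stand-in for the Rabin fingerprints real systems use, and every parameter is a toy value:</p><pre>
import hashlib

MASK = (1 << 13) - 1   # cut when low 13 bits are zero (~8 KiB average chunk)
MIN_CHUNK, MAX_CHUNK = 2048, 65536

def chunks(data):
    start, h = 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + byte) & 0xFFFFFFFF   # crude shift-add rolling hash
        length = i - start + 1
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            yield data[start:i + 1]
            start, h = i + 1, 0
    if start < len(data):
        yield data[start:]

def unique_chunks(data):
    # Single-instancing: identical chunks collapse to one dict entry.
    return {hashlib.sha256(c).hexdigest(): c for c in chunks(data)}

data = open("sample.bin", "rb").read()   # hypothetical input
store = unique_chunks(data)
print(f"{len(data)} bytes -> {sum(map(len, store.values()))} unique bytes")
</pre>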
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352752</id>
	<title>Re:Agree with the tape option..;.</title>
	<author>rrohbeck</author>
	<datestamp>1267622640000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>DAT is no fun for a couple of terabytes.</p></htmltext>
<tokenext>DAT is no fun for a couple of terabytes .</tokentext>
<sentencetext>DAT is no fun for a couple of terabytes.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351838</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354712</id>
	<title>Re:Agree with the tape option..;.</title>
	<author>PhunkySchtuff</author>
	<datestamp>1267640160000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>OMG. You're not advocating the use of DDS tapes are you? DDS tapes are notoriously unreliable and temperamental. Tape needs to be done properly - that pretty much means LTO these days.</p></htmltext>
<tokenext>OMG .
You 're not advocating the use of DDS tapes are you ?
DDS tapes are notoriously unreliable and temperamental .
Tape needs to be done properly - that pretty much means LTO these days .</tokentext>
<sentencetext>OMG.
You're not advocating the use of DDS tapes are you?
DDS tapes are notoriously unreliable and temperamental.
Tape needs to be done properly - that pretty much means LTO these days.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351838</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352776</id>
	<title>Re:GMail Drive</title>
	<author>rrohbeck</author>
	<datestamp>1267622880000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Bah. Put it out on BitTorrent and eMule, give it a few enticing names, and it's safe for decades.</p></htmltext>
<tokenext>Bah .
Put it out on BitTorrent and eMule , give it a few enticing names , and it 's safe for decades .</tokentext>
<sentencetext>Bah.
Put it out on BitTorrent and eMule, give it a few enticing names, and it's safe for decades.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351348</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352698</id>
	<title>Depends on frequency of access</title>
	<author>adosch</author>
	<datestamp>1267622280000</datestamp>
	<modclass>Informativ</modclass>
	<modscore>4</modscore>
	<htmltext><p>I work as a contractor for the <a href="http://www.usgs.gov/" title="usgs.gov">USGS</a> [usgs.gov], and the projects I've been involved with host, archive, and provide means for customers to access all our different satellite data products.  We've got a long-term archive method for tons of data products (digital and tangible), and I can honestly tell you the first thing that always comes up is:  how often will the data need to be accessed?</p><p>For the longest time (almost a decade) we used three big STK tape silos for data archive and retrieval for custom orders.  The problem with that type of design is that we used an archive in a completely wrong manner: we tried to use it as both an archive and a quasi-online retrieval system feeding a caching filesystem.   We had tape mount counts in the hundreds and thousands, constant mechanical tape issues because of the excessive use, etc.  We actually decided to move it all to online storage using enterprise RAID (EMC CLARiiON) and moved to a small LTO-4 tape unit for almost-permanent, maybe-once-in-a-great-while storage; the rest we leave completely on spinning disk and control the access to it via application-layer network protocols as needed.</p><p>IMHO, I really think it's going to depend on the access frequency of your data.  If that customer needs their data once, and maybe never again in case they lose it, put it on tape.  If it's a requirement they can get the data from you any time they want and you've got the hardware and administrative resources, power, and bandwidth, put it on some RAID.</p></htmltext>
<tokenext>I work as a contractor for the USGS [ usgs.gov ] and the projects I 've been involved with host , archive and provide means for customers to access all our different satellite data products .
We 've got a long-term archive method for tons of data products ( digital and tangible ) and I can honestly tell you the first thing that always comes up is : how often will the data need to be accessed ?
For the longest time ( almost a decade ) we used three big STK tape silos for data archive and retrieval for custom orders .
The problem with that type of design is that we used an archive in a completely wrong manner : we tried to use it as both an archive and a quasi-online retrieval system feeding a caching filesystem .
We had tape mount counts in the hundreds and thousands , constant mechanical tape issues because of the excessive use , etc .
We actually decided to move it all to online storage using enterprise RAID ( EMC CLARiiON ) and moved to a small LTO-4 tape unit for almost-permanent , maybe-once-in-a-great-while storage and the rest we leave completely on spinning disk and control the access to it via application-layer network protocols as needed .
IMHO , I really think it 's going to depend on the access frequency of your data .
If that customer needs their data once , and maybe never again in case they lose it , put it on tape .
If it 's a requirement they can get the data from you any time they want and you 've got the hardware and administrative resources , power and bandwidth , put it on some RAID .</tokentext>
<sentencetext>I work as a contractor for the USGS [usgs.gov] and the projects I've been involved with host, archive and provide means for customers to access all our different satellite data products.
We've got a long-term archive method for tons of data products (digital and tangible), and I can honestly tell you the first thing that always comes up is:  how often will the data need to be accessed?
For the longest time (almost a decade) we used three big STK tape silos for data archive and retrieval for custom orders.
The problem with that type of design is that we used an archive in a completely wrong manner: we tried to use it as both an archive and a quasi-online retrieval system feeding a caching filesystem.
We had tape mount counts in the hundreds and thousands, constant mechanical tape issues because of the excessive use, etc.
We actually decided to move it all to online storage using enterprise RAID (EMC CLARiiON) and moved to a small LTO-4 tape unit for almost-permanent, maybe-once-in-a-great-while storage; the rest we leave completely on spinning disk and control the access to it via application-layer network protocols as needed.
IMHO, I really think it's going to depend on the access frequency of your data.
If that customer needs their data once, and maybe never again in case they lose it, put it on tape.
If it's a requirement they can get the data from you any time they want and you've got the hardware and administrative resources, power, and bandwidth, put it on some RAID.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352006</id>
	<title>have you considered cloud storage?</title>
	<author>sneakyimp</author>
	<datestamp>1267618320000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>I have a bunch of old data backups on CD-R that were great for years, but they've started to degrade.  I'd be willing to bet that any magnetic disk would be even more vulnerable to data corruption over time.  I don't think your RAID 5 storage technique is a good long-term option.</p><p>This could be a ridiculous suggestion, but have you considered something like cloud storage for this? You could encrypt the data and store it in somebody's cloud and let them worry about backing everything up.</p></htmltext>
<tokenext>I have a bunch of old data backups on CD-R that were great for years but they 've started to degrade .
I 'd be willing to bet that any magnetic disk would be even more vulnerable to data corruption over time .
I do n't think your RAID 5 storage technique is a good long-term option .
This could be a ridiculous suggestion , but have you considered something like cloud storage for this ?
You could encrypt the data and store it in somebody 's cloud and let them worry about backing everything up .</tokentext>
<sentencetext>I have a bunch of old data backups on CD-R that were great for years but they've started to degrade.
I'd be willing to bet that any magnetic disk would be even more vulnerable to data corruption over time.
I don't think your RAID 5 storage technique is a good long-term option.
This could be a ridiculous suggestion, but have you considered something like cloud storage for this?
You could encrypt the data and store it in somebody's cloud and let them worry about backing everything up.</sentencetext>
</comment>
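<p>"Encrypt the data and store it in somebody's cloud" is a few lines with the <i>cryptography</i> package's Fernet recipe; keeping the key out of the cloud is the whole point. File names are hypothetical, and very large archives should be encrypted in chunks rather than read whole:</p><pre>
from cryptography.fernet import Fernet

key = Fernet.generate_key()           # store this offline, never in the cloud
with open("archive.key", "wb") as f:
    f.write(key)

fernet = Fernet(key)
with open("dataset.tar.xz", "rb") as f:
    token = fernet.encrypt(f.read())  # fine for modest files
with open("dataset.tar.xz.enc", "wb") as f:
    f.write(token)

# Later, from any machine holding the key:
# plaintext = Fernet(open("archive.key", "rb").read()).decrypt(token)
</pre>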
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352396</id>
	<title>Re:Tape is crap anyway.</title>
	<author>Anonymous</author>
	<datestamp>1267620180000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>If your experience was with Exabyte 8mm, then it's no wonder you remember headaches.</p></htmltext>
<tokenext>If your experience was with Exabyte 8mm , then it 's no wonder you remember headaches .</tokentext>
<sentencetext>If your experience was with Exabyte 8mm, then it's no wonder you remember headaches.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351706</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351706</id>
	<title>Tape is crap anyway.</title>
	<author>aussersterne</author>
	<datestamp>1267616820000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>I've never had a good experience with tape, from DC6150 SCSI linear tape at home all the way through an Exabyte library with stacks and stacks of 8mm tapes. Two decades of tape has been two decades of heartache and frustration for me and the companies I've worked with. These days I'm no longer in tech or IT (thank god), but for my personal needs I use RAID-1 for live data and DVD-RAM (as cumbersome, slow, and small as it is) for offline.</p><p>Tapes just bleed data at an alarming rate, and they are about as reliable as a drunk gambling addict living under the subway next to the OTB shop.</p><p>Do the hard drive thing, and store the drives well, with redundancy, under good conditions, and replace them often.</p></htmltext>
<tokenext>I 've never had good experience with tape , from DC6150 SCSI linear tape at home all the way through an Exabyte library with stacks and stacks of 8mm tapes .
Two decades of tape has been two decades of heartache and frustration for me and the companies I 've worked with .
These days I 'm no longer in tech or IT ( thank god ) but for my personal needs I use RAID-1 for live and DVD-RAM ( as cumbersome , slow , and small as it is ) for offline .
Tapes just bleed data at an alarming rate , and they are about as reliable as a drunk gambling addict living under the subway next to the OTB shop .
Do the hard drive thing , and store them well , with redundancy , under good conditions , and replace them often .</tokentext>
<sentencetext>I've never had good experience with tape, from DC6150 SCSI linear tape at home all the way through an Exabyte library with stacks and stacks of 8mm tapes.
Two decades of tape has been two decades of heartache and frustration for me and the companies I've worked with.
These days I'm no longer in tech or IT (thank god) but for my personal needs I use RAID-1 for live and DVD-RAM (as cumbersome, slow, and small as it is) for offline.
Tapes just bleed data at an alarming rate, and they are about as reliable as a drunk gambling addict living under the subway next to the OTB shop.
Do the hard drive thing, and store them well, with redundancy, under good conditions, and replace them often.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351636</id>
	<title>Re:Tape is your friend</title>
	<author>toastar</author>
	<datestamp>1267616580000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>This. Depending on how long you want to store it, tape lasts longer, and once the upfront cost of the drive is paid off, the per-unit cost is cheaper too. Also, dealing with offsite storage places (Iron Mountain) is easier with tape than with HDDs.</p><p>Lastly, I've been told you have to spin up the HDDs every so often or the lifetime is even less than what they are rated for. Although I'm not sure I believe that part.</p></htmltext>
<tokenext>This .
Depending on how long you want to store it , tape lasts longer , and once the upfront cost of the drive is paid off the per-unit cost is cheaper too .
Also dealing with offsite storage places ( Iron Mountain ) is easier with tape than with HDDs .
Lastly , I 've been told you have to spin up the HDDs every so often or the lifetime rating is even less than what they are rated for .
Although I 'm not sure I believe that part .</tokentext>
<sentencetext>This.
Depending on how long you want to store it, tape lasts longer, and once the upfront cost of the drive is paid off the per-unit cost is cheaper too.
Also dealing with offsite storage places (Iron Mountain) is easier with tape than with HDDs.
Lastly, I've been told you have to spin up the HDDs every so often or the lifetime rating is even less than what they are rated for.
Although I'm not sure I believe that part.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354644</id>
	<title>Re:Exactly what you're doing</title>
	<author>afabbro</author>
	<datestamp>1267639560000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><div class="quote"><p><i>You're storing relatively fragile hard drives in a raid5 configuration in a lock box?</i></p><p>Relatively fragile sitting in a temperature controlled safety deposit box, completely unpowered?  Uhh..  No.</p></div><p>As info, I don't think most banks guarantee these conditions.  They guarantee no one will break in and steal your stuff (no doubt with lots of liability-limiting clauses in the agreement you sign), but I don't think they guarantee either temperature or humidity control.  I'm sure your box won't get to 451°F or something, but a swing of 15-20 degrees during the day wouldn't surprise me, at the same humidity as the lobby.</p><p>There is storage like that available (e.g., Iron Mountain) but it's more expensive.</p></htmltext>
<tokenext>You 're storing relatively fragile hard drives in a raid5 configuration in a lock box ?
Relatively fragile sitting in a temperature controlled safety deposit box , completely unpowered ?
Uhh .. No .
As info , I do n't think most banks guarantee these conditions .
They guarantee no one will break in and steal your stuff ( no doubt with lots of liability-limiting clauses in the agreement you sign ) , but I do n't think they guarantee either temperature or humidity control .
I 'm sure your box wo n't get to 451°F or something , but a swing of 15-20 degrees during the day would n't surprise me , at the same humidity as the lobby .
There is storage like that available ( e.g. , Iron Mountain ) but it 's more expensive .</tokentext>
<sentencetext> 
You're storing relatively fragile hard drives in a raid5 configuration in a lock box?
Relatively fragile sitting in a temperature controlled safety deposit box, completely unpowered?
Uhh..  No.
As info, I don't think most banks guarantee these conditions.
They guarantee no one will break in and steal your stuff (no doubt with lots of liability-limiting clauses in the agreement you sign), but I don't think they guarantee either temperature or humidity control.
I'm sure your box won't get to 451°F or something, but a swing of 15-20 degrees during the day wouldn't surprise me, at the same humidity as the lobby.
There is storage like that available (e.g., Iron Mountain) but it's more expensive.
	</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353102</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354306</id>
	<title>Long term, infrequent access</title>
	<author>ceoyoyo</author>
	<datestamp>1267635780000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>From the description it sounds like you need long term storage and infrequent (if ever) access.  Optical is obviously bad for that.  Hard drives are too.  Storing hard drives for long periods of time without spinning them up can result in the platters seizing.  Tape really is your only reasonable option.  You can pile it above some poor post doc's desk like they do at my lab or you can store it properly, your choice, but either way it's better than the other options.</p></htmltext>
<tokenext>From the description it sounds like you need long term storage and infrequent ( if ever ) access .
Optical is obviously bad for that .
Hard drives are too .
Storing hard drives for long periods of time without spinning them up can result in the platters seizing .
Tape really is your only reasonable option .
You can pile it above some poor post doc 's desk like they do at my lab or you can store it properly , your choice , but either way it 's better than the other options .</tokentext>
<sentencetext>From the description it sounds like you need long term storage and infrequent (if ever) access.
Optical is obviously bad for that.
Hard drives are too.
Storing hard drives for long periods of time without spinning them up can result in the platters seizing.
Tape really is your only reasonable option.
You can pile it above some poor post doc's desk like they do at my lab or you can store it properly, your choice, but either way it's better than the other options.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352506</id>
	<title>Know the feeling</title>
	<author>xettera</author>
	<datestamp>1267620900000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>I have the same problem when it comes to backing up my moderately large dataset of study material for human anatomy. Granted, all of the data exists in the cloud, but I would hate to have to spend another 10,000 hours browsing porn sites to download it all again.</htmltext>
<tokenext>I have the same problem when it comes to backing up my moderately large dataset of study material for human anatomy .
Granted , all of the data exists in the cloud , but I would hate to have to spend another 10,000 hours browsing porn sites to download it all again .</tokentext>
<sentencetext>I have the same problem when it comes to backing up my moderately large dataset of study material for human anatomy.
Granted, all of the data exists in the cloud, but I would hate to have to spend another 10,000 hours browsing porn sites to download it all again.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353136</id>
	<title>cloud storage at zetta.net</title>
	<author>Anonymous</author>
	<datestamp>1267625520000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Try cloud storage at www.zetta.net</p></htmltext>
<tokenext>Try cloud storage at www.zetta.net</tokentext>
<sentencetext>Try cloud storage at www.zetta.net</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351872</id>
	<title>Re:Practice your Recovery Method</title>
	<author>PitaBred</author>
	<datestamp>1267617600000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>That's why the original poster suggested that they're already using Linux software RAID. Much more robust than depending on a single firmware version.</p></htmltext>
<tokenext>That 's why the original poster suggested that they 're already using Linux software RAID .
Much more robust than depending on a single firmware version .</tokentext>
<sentencetext>That's why the original poster suggested that they're already using Linux software RAID.
Much more robust than depending on a single firmware version.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351448</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352682</id>
	<title>Re:Amazon S3</title>
	<author>h4rr4r</author>
	<datestamp>1267622100000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>So what happens when Amazon loses your data?<br>What are they willing to pay you for doing that?</p></htmltext>
<tokenext>So what happens when Amazon loses your data ? What are they willing to pay you for doing that ?</tokentext>
<sentencetext>So what happens when Amazon loses your data?
What are they willing to pay you for doing that?</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351420</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352074</id>
	<title>Never use DVDs as long term storage.</title>
	<author>strangeattraction</author>
	<datestamp>1267618620000</datestamp>
	<modclass>Interestin</modclass>
	<modscore>4</modscore>
	<htmltext>Repeat: never use DVDs as long-term storage. I have seen them go unreadable in anywhere from 2-5 years. I have fired up disk drives 10 years later with no problems.  They are cheap, reliable, and fast. Don't try to get fancy; just compress and store data sets over multiple volumes. Don't use RAID.</htmltext>
<tokenext>Repeat : never use DVDs as long-term storage .
I have seen them go unreadable in anywhere from 2-5 years .
I have fired up disk drives 10 years later with no problems .
They are cheap , reliable and fast .
Do n't try to get fancy ; just compress and store data sets over multiple volumes .
Do n't use RAID .</tokentext>
<sentencetext>Repeat: never use DVDs as long-term storage.
I have seen them go unreadable in anywhere from 2-5 years.
I have fired up disk drives 10 years later with no problems.
They are cheap, reliable, and fast.
Don't try to get fancy; just compress and store data sets over multiple volumes.
Don't use RAID.</sentencetext>
</comment>
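A minimal sketch of the "compress and store over multiple volumes" approach above, assuming GNU tar/split and hypothetical names (dataset/, a 500G chunk size) — one way to do it, not the poster's exact method:

    # Compress and chunk to media-sized pieces, then record checksums
    tar -czf - dataset/ | split -b 500G - dataset.tgz.part-
    sha256sum dataset.tgz.part-* > dataset.sha256
    # Later: verify each volume, then reassemble and extract
    sha256sum -c dataset.sha256
    cat dataset.tgz.part-* | tar -xzf -

Keeping checksums alongside each copy is what makes multiple plain volumes safe: a bad volume is detected before the data is actually needed.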
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352438</id>
	<title>None of the above</title>
	<author>butalearner</author>
	<datestamp>1267620420000</datestamp>
	<modclass>Flamebait</modclass>
	<modscore>0</modscore>
	<htmltext><p>You obviously haven't been in business very long, and I have seen no reasonable answers in this thread at all.  Nuke the data as soon as you ship and offer to "recover" it (that is, regenerate it) for an additional fee should they require it.</p><p>Noobs.</p></htmltext>
<tokenext>You obviously have n't been in business very long , and I have seen no reasonable answers in this thread at all .
Nuke the data as soon as you ship and offer to " recover " it ( that is , regenerate it ) for an additional fee should they require it.Noobs .</tokentext>
<sentencetext>You obviously haven't been in business very long, and I have seen no reasonable answers in this thread at all.
Nuke the data as soon as you ship and offer to "recover" it (that is, regenerate it) for an additional fee should they require it.
Noobs.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351950</id>
	<title>Re:Use RAID6 not RAID5</title>
	<author>Vancorps</author>
	<datestamp>1267618080000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Or just get 4 LTO4 tapes or 2 LTO5 tapes at ~$70 a pop to achieve the same capacity with no software setups or drives that fail when they are not spun up regularly. Common hard drives are not good long-term storage. They are great for online or near-line storage, but at some point, bite the bullet and just get a tape drive. Given that the datasets are only 3TB, a single tape drive is sufficient; at that range you could have two drives backing up identical data and store the tapes in separate locations. This is much safer. The LTO4/5 drives will recoup their cost after about 4 or 5 clients when compared with solutions such as yours. I have no idea how many clients the poster has, though.</htmltext>
<tokenext>Or just get 4 LTO4 tapes or 2 LTO5 tapes at ~ $ 70 a pop to achieve the same capacity with no software setups or drives that fail when they are not spun up regularly .
Common hard drives are not good long term storage .
They are great for online or near-line storage but at some point , bite the bullet and just get a tape drive .
Given the datasets are only 3TB a single tape drive is sufficient , at that range you could have two drives backing up identical data and storing the tapes in separate locations .
This is much safer .
The LTO4/5 drives will recoup their cost after about 4 or 5 clients when compared with solutions such as yours .
I have no idea how many clients the poster had though .</tokentext>
<sentencetext>Or just get 4 LTO4 tapes or 2 LTO5 tapes at ~$70 a pop to achieve the same capacity with no software setups or drives that fail when they are not spun up regularly.
Common hard drives are not good long term storage.
They are great for online or near-line storage but at some point, bite the bullet and just get a tape drive.
Given that the datasets are only 3TB, a single tape drive is sufficient; at that range you could have two drives backing up identical data and store the tapes in separate locations.
This is much safer.
The LTO4/5 drives will recoup their cost after about 4 or 5 clients when compared with solutions such as yours.
I have no idea how many clients the poster had though.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351578</parent>
</comment>
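For the single-drive setup described above, a bare tar-to-tape run is enough; a minimal sketch, assuming a Linux host with the non-rewinding tape device at /dev/nst0 and a hypothetical archive path:

    mt -f /dev/nst0 rewind
    tar -cvf /dev/nst0 /archive/customer42/   # ~800GB native fits on one LTO4 tape
    mt -f /dev/nst0 offline                   # rewind and eject
    # With the tape reloaded, test the restore path before shelving it:
    mt -f /dev/nst0 rewind && tar -tvf /dev/nst0 > /dev/null

Running the listing pass on every tape is cheap insurance that the "two copies in two locations" scheme actually contains readable copies.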
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351896</id>
	<title>How to</title>
	<author>brennz</author>
	<datestamp>1267617780000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Buy a netapp.  Yay, RAID-DP.</p><p>That was hard!</p><p>* wipes brow *</p></htmltext>
<tokenext>Buy a netapp .
Yay , RAID-DP.That was hard !
* wipes brow *</tokentext>
<sentencetext>Buy a netapp.
Yay, RAID-DP. That was hard!
* wipes brow *</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353030</id>
	<title>Just copy everything to you unlimited drive!</title>
	<author>Anonymous</author>
	<datestamp>1267624620000</datestamp>
	<modclass>Funny</modclass>
	<modscore>2</modscore>
	<htmltext>I've been doing backups by copying everything to my unlimited drive for years now. It's amazing - it never fills up!<br> <br>Just type <br> <b>copy *.* NUL</b> <br>at your command prompt (or <b>cp * /dev/null</b> if using Unix).<br> <br>One day I'm gonna look to see how much data I have in that damn thing!</htmltext>
<tokenext>I 've been doing backups by copying everything to my unlimited drive for years now .
It 's amazing - it never fills up !
Just type copy Edit .
* NUL at your command prompt ( or cp * /dev/null if using Unix ) .
One day I 'm gon na look to see how much data I have in that damn thing !</tokentext>
<sentencetext>I've been doing backups by copying everything to my unlimited drive for years now.
It's amazing - it never fills up!
Just type copy *.* NUL at your command prompt (or cp * /dev/null if using Unix).
One day I'm gonna look to see how much data I have in that damn thing!</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354292</id>
	<title>Re:Tape is your friend</title>
	<author>cprice</author>
	<datestamp>1267635600000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Yes, but LTO tapes and drives are expensive compared to 2TB SATA drives. I'm guessing he doesn't have much of a budget, for if he did, a 4- or 8-tape LTO library attached to a SAN fabric and striping to the tapes would be a good solution.</p></htmltext>
<tokenext>yes but the lto tape and drives are expensive compared to 2tb sata drives .
I 'm guessing he doesnt have much of a budget , for if he did a 4 or 8 tape lto libray with attached to a san fabirc and striping to the tapes would be a good solution .</tokentext>
<sentencetext>Yes, but LTO tapes and drives are expensive compared to 2TB SATA drives.
I'm guessing he doesn't have much of a budget, for if he did, a 4- or 8-tape LTO library attached to a SAN fabric and striping to the tapes would be a good solution.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31356148</id>
	<title>Too bad</title>
	<author>Anonymous</author>
	<datestamp>1267701300000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p><i> A tape library would be impractical at the present time. What do you recommend?"</i> </p><p>If it's your business then you need to make it practical.  There's a reason why tape is still the industry standard for backup and offsite archiving.  If you can't find a way to fit secure backup and archiving into your business model then perhaps there is an issue with your business model?</p></htmltext>
<tokenext>A tape library would be impractical at the present time .
What do you recommend ?
" If it 's your business then you need to make it practical .
There 's a reason why tape is still the industry standard for backup and offsite archiving .
If you ca n't find a way to fit secure backup and archiving into your business model then perhaps there is an issue with your business model ?</tokentext>
<sentencetext> A tape library would be impractical at the present time.
What do you recommend?
" If it's your business then you need to make it practical.
There's a reason why tape is still the industry standard for backup and offsite archiving.
If you can't find a way to fit secure backup and archiving into your business model then perhaps there is an issue with your business model?</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357770</id>
	<title>The key is redundancy</title>
	<author>Taylor123456789</author>
	<datestamp>1267717080000</datestamp>
	<modclass>None</modclass>
	<modscore>-1</modscore>
	<htmltext><p>The key to long term backup is redundancy.  Any one solution may have problems over time.  I would do your hard drive system as you are doing.  I would also use an unlimited online backup service and explore a tape option.</p></htmltext>
<tokenext>The key to long term backup is redundancy .
Any one solution may have problems over time .
I would do your hard drive system as you are doing .
I would also use an unlimited online backup service and explore a tape option .</tokentext>
<sentencetext>The key to long term backup is redundancy.
Any one solution may have problems over time.
I would do your hard drive system as you are doing.
I would also use an unlimited online backup service and explore a tape option.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352080</id>
	<title>Another approach...</title>
	<author>namgge</author>
	<datestamp>1267618680000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>
As <em>you</em> generate the datasets, you should seriously consider archiving the method (scripts, software, <i>etc.</i>) you used to generate them rather than the output.
</p><p>
Namgge
</p></htmltext>
<tokenext>As you generate the datasets , you should seriously consider archiving the method ( scripts , software , etc .
) you used to generate them rather than the output .
Namgge</tokentext>
<sentencetext>
As you generate the datasets, you should seriously consider archiving the method (scripts, software, etc.) you used to generate them rather than the output.
Namgge
</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351960</id>
	<title>or build a couple of these...</title>
	<author>jjoelc</author>
	<datestamp>1267618140000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p><a href="http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/" title="backblaze.com">http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/</a> [backblaze.com]</p><p>I'm sure the price has come down some since this article was published...</p><p>For those too lazy or paranoid to read the link... It describes how backblaze builds "cheap" 67 TB storage boxes for use in their online backup service. All the hardware specs are open sourced and freely available. They also talk a little bit about the software for managing all of the spce they have, but not in any real detail...</p></htmltext>
<tokenext>http : //blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/ [ backblaze.com ] I 'm sure the price has come down some since this article was published...For those too lazy or paranoid to read the link... It describes how backblaze builds " cheap " 67 TB storage boxes for use in their online backup service .
All the hardware specs are open sourced and freely available .
They also talk a little bit about the software for managing all of the spce they have , but not in any real detail.. .</tokentext>
<sentencetext>http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/ [backblaze.com]
I'm sure the price has come down some since this article was published...
For those too lazy or paranoid to read the link... It describes how Backblaze builds "cheap" 67 TB storage boxes for use in their online backup service.
All the hardware specs are open-sourced and freely available.
They also talk a little bit about the software for managing all of the space they have, but not in any real detail...</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355168</id>
	<title>Sata Toaster?</title>
	<author>citrustech</author>
	<datestamp>1267645740000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>I'd be tempted by a SATA toaster <a href="http://bit.ly/bXDpcx" title="bit.ly" rel="nofollow">http://bit.ly/bXDpcx</a> [bit.ly], but frankly I'm worried about it being on disk.  What would be nice is if someone could produce an authoritative summary for everyone.  A bit like Lifehacker's hive.</htmltext>
<tokenext>I 'd be tempted by a SATA toaster http : //bit.ly/bXDpcx [ bit.ly ] , but worried about it being on disk frankly .
What would be nice isis some one could produce an authoratative summary for everyone .
Bit like lifehacker 's hive .</tokentext>
<sentencetext>I'd be tempted by a SATA toaster http://bit.ly/bXDpcx [bit.ly], but frankly I'm worried about it being on disk.
What would be nice is if someone could produce an authoritative summary for everyone.
A bit like Lifehacker's hive.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351748</id>
	<title>Moderately Large Dataset?</title>
	<author>Anonymous</author>
	<datestamp>1267617000000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext>Heh. That's what <i>she</i> said! Heh ;-)</htmltext>
<tokenext>Heh .
That 's what she said !
Heh ; - )</tokentext>
<sentencetext>Heh.
That's what she said!
Heh ;-)</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351984</id>
	<title>Use a company that specializes in it.</title>
	<author>Anonymous</author>
	<datestamp>1267618260000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext>I have a friend who works here; they might be one possibility:

<a href="http://www.ironmountain.com/digital-archiving/digital-archiving-services.html" title="ironmountain.com" rel="nofollow">http://www.ironmountain.com/digital-archiving/digital-archiving-services.html</a> [ironmountain.com]</htmltext>
<tokenext>I have a friend who works here , they might be 1 possibility ?
http : //www.ironmountain.com/digital-archiving/digital-archiving-services.html [ ironmountain.com ]</tokentext>
<sentencetext>I have a friend who works here; they might be one possibility:
http://www.ironmountain.com/digital-archiving/digital-archiving-services.html [ironmountain.com]</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355404</id>
	<title>Folks suggesting LTO-4 should also mention</title>
	<author>Anonymous</author>
	<datestamp>1267735020000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>the very high data rates (80+ MB/sec) required to keep the drives streaming without "shoe-shining" (starting and stopping a lot, causing the tape and mechanism to wear out).  You really can't just plug an LTO drive into your white-box PC and dump the hard drive file system to it, especially if you're doing other stuff with the PC at the same time.  Enterprise backup systems generally have a RAID box dedicated to backup.  You transfer the data from the PC to the RAID over your local network, then dump from the RAID to tape.  Often you use a robot to change the tapes.  Even after all that, it's still a huge pain.  The place I worked at gave up on tape after a while and uses redundant disks for everything (about 2000 computers backing each other up).</p></htmltext>
<tokenext>the very high data rates ( 80 + MB/sec ) required to keep the drives streaming without " shoe-shining " ( starting and stopping a lot , causing the tape and mechanism to wear out ) .
You really ca n't just plug an LTO drive into your white-box PC and dump the hard drive file system to it , especially if you 're doing other stuff with the PC at the same time .
Enterprise backup systems generally have a RAID box dedicated to backup .
You transfer the data from the PC to the RAID over your local network , then dump from the RAID to tape .
Often you use a robot to change the tapes .
Even after all that , it 's still a huge pain .
The place I worked at gave up on tape after a while and uses redundant disks for everything ( about 2000 computers backing each other up ) .</tokentext>
<sentencetext>the very high data rates (80+ MB/sec) required to keep the drives streaming without "shoe-shining" (starting and stopping a lot, causing the tape and mechanism to wear out).
You really can't just plug an LTO drive into your white-box PC and dump the hard drive file system to it, especially if you're doing other stuff with the PC at the same time.
Enterprise backup systems generally have a RAID box dedicated to backup.
You transfer the data from the PC to the RAID over your local network, then dump from the RAID to tape.
Often you use a robot to change the tapes.
Even after all that, it's still a huge pain.
The place I worked at gave up on tape after a while and uses redundant disks for everything (about 2000 computers backing each other up).</sentencetext>
</comment>
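A common workaround for the shoe-shining described above (a sketch under stated assumptions: mbuffer installed, data already staged on local disk, drive at /dev/nst0) is to put a large RAM buffer between tar and the drive:

    tar -cf - /staging/dataset | mbuffer -m 4G -P 90 -s 256k -o /dev/nst0
    # -m 4G  : 4 GB RAM buffer between producer and tape
    # -P 90  : don't start writing until the buffer is 90% full
    # -s 256k: write in tape-friendly block sizes

The buffer absorbs bursts and stalls from the source so the drive can keep streaming at its native rate instead of stopping and repositioning.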
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352592</id>
	<title>Torrents</title>
	<author>Aurisor</author>
	<datestamp>1267621440000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Just set up some torrents on the Pirate Bay and let the entire internet do your backup for you!</p></htmltext>
<tokenext>Just set up some torrents on the Pirate Bay and let the entire internet do your backup for you !</tokentext>
<sentencetext>Just set up some torrents on the Pirate Bay and let the entire internet do your backup for you!</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31361184</id>
	<title>Re:Tape is your friend</title>
	<author>petermgreen</author>
	<datestamp>1267733460000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p><i>DVD claims 30 to 100 years [osta.org].</i><br>Though it's too new a tech to really trust those claims. Accelerated aging is a crude approximation of real aging.</p><p><i>Neither lasts as long as necessary for archival storage.</i><br>Since, AFAICT, there is no storage tech that offers acceptable density while also having been around long enough to back up realistic lifetime claims of over 10 years or so, IMO the only real option for long-term archival storage is to do all of the following:</p><p>1: keep lots of redundancy (preferably using something like parchive)<br>2: regularly (say once a year, perhaps more often) read and test the media, replacing any that fails to read correctly by using the redundancy.<br>3: if using media with separate drives, take steps to ensure that a single misaligned drive isn't a single point of failure.<br>4: move the data onto new media when it makes sense to do so.</p></htmltext>
<tokenext>DVD claims 30 to 100 years [ osta.org ] .
Though it 's too new a tech to really trust those claims .
Accelerated aging is a crude approximation of real aging.Neither lasts as long as necessary for archival storage.Since afaict there is no storage tech that offers acceptable density while also having been arround long enough to give realistic lifetime claims over 10 years or so IMO the only real option for long term archival storage is do all of the following1 : keep lots of redundancy ( preferally using something like parchive ) 2 : regulally ( say once a year perhaps more often ) read and test the media replacing any that fails to read correctly by using the redundancy.3 : if using media with seperate drives take steps to ensure that a single misaligned drive is n't a single point of failure.4 : move the data onto new media when it makes sense to do so .</tokentext>
<sentencetext>DVD claims 30 to 100 years [osta.org].
Though it's too new a tech to really trust those claims.
Accelerated aging is a crude approximation of real aging.
Neither lasts as long as necessary for archival storage.
Since, AFAICT, there is no storage tech that offers acceptable density while also having been around long enough to back up realistic lifetime claims of over 10 years or so, IMO the only real option for long-term archival storage is to do all of the following:
1: keep lots of redundancy (preferably using something like parchive)
2: regularly (say once a year, perhaps more often) read and test the media, replacing any that fails to read correctly by using the redundancy.
3: if using media with separate drives, take steps to ensure that a single misaligned drive isn't a single point of failure.
4: move the data onto new media when it makes sense to do so.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355302</parent>
</comment>
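Points 1 and 2 above map directly onto the par2cmdline tool; a hedged sketch with placeholder file names (the 15% redundancy figure is an assumption, tune to taste):

    par2 create -r15 dataset.tar.gz.par2 dataset.tar.gz   # 15% recovery data
    # Yearly media check; rebuild from the recovery blocks only if verify fails:
    par2 verify dataset.tar.gz.par2 || par2 repair dataset.tar.gz.par2

Storing the .par2 files on a different physical volume than the archive means a single bad disk never takes out both the data and its recovery blocks.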
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351618</id>
	<title>i just got off the toilet</title>
	<author>Anonymous</author>
	<datestamp>1267616520000</datestamp>
	<modclass>Troll</modclass>
	<modscore>-1</modscore>
	<htmltext><p>i shit out an obama.<br>
&nbsp; <br>plop!</p></htmltext>
<tokenext>i shit out an obama .
  plop !</tokentext>
<sentencetext>i shit out an obama.
  plop!</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31361604</id>
	<title>Realistic Comparison - Tape vs Disk</title>
	<author>jon3k</author>
	<datestamp>1267735500000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>You can get an LTO4 internal drive for $2.5k (HP 1760 Ultrium) hooked to a cheap desktop PC if you don't mind swapping the tapes yourself -- about $3-3.5k total expenditure.  The "enterprise-y" route would be $7-$9k if you want a robot to do the swapping for you (look at the HP MSL2024, for example).  That will get you 2U with a robot and, I believe, two 12-slot magazines, all LTO4.  You can hook this up to a very inexpensive single-socket server for around $2k.  Then you've got media.  LTO4 tapes are running around $40 for 800GB-native/1.6TB-compressed tapes.  Total solution that way, figure $9k-$11k.
<br> <br>
Now compared to disk, I'd go with $180 2TB Seagate Barracuda drives as a reasonable option.  Three of them give you 3.5+ TB of usable space per customer, which meets your requirements with some breathing room.  Now the important part - you need a server to put these drives in.  So when you're comparing the cost of the infrastructure for tape, don't forget you need infrastructure for the RAID5-array-builder-machine-guy-thing.
<br> <br>
There are numerous other advantages for tape as well.  Easy encryption, easy restores, less storage space, less likely to fail, etc.  For me tape is a no-brainer.</htmltext>
<tokenext>You can get an LTO4 Internal drive for $ 2.5k ( HP 1760 Ultrium ) hooked to a cheap desktop PC if you do n't mind swapping the tapes yourself about $ 3-3.5k total expenditure .
The " enterprise-y " route would be $ 7- $ 9k if you want a robot to do the swapping for you ( look at HP MLS2024 for example ) .
And that will get you 2U with robot and I believe two 12 slot magazines all LTO4 .
You can hook this up to a very inexpensive single socket server for around $ 2k .
Then you 've got media .
LTO4 tapes are running around $ 40 for 800/1.6 tapes .
Total solution that way figure $ 9k- $ 11k .
Now compared to disk I 'd go with $ 180 2TB Seagate Barracuda drives for a reasonable option .
Three of them give you 3.5 + TB of usable space per customer which meets your requirements with some breathing room .
Now the important part - you need a server to put these drives in .
So when you 're comparing the cost of the infrastructure for tape , do n't forget you need infrastructure for the RAID5-array-builder-machine-guy-thing .
There are numerous other advantages for tape as well .
Easy encryption , easy restores , less storage space , less likely to fail , etc etc etc .
For me tape is a no brainer .</tokentext>
<sentencetext>You can get an LTO4 Internal drive for $2.5k (HP 1760 Ultrium) hooked to a cheap desktop PC if you don't mind swapping the tapes yourself about $3-3.5k total expenditure.
The "enterprise-y" route would be $7-$9k if you want a robot to do the swapping for you (look at HP MLS2024 for example).
And that will get you 2U with robot and I believe two 12 slot magazines all LTO4.
You can hook this up to a very inexpensive single socket server for around $2k.
Then you've got media.
LTO4 tapes are running around $40 for 800GB-native/1.6TB-compressed tapes.
Total solution that way figure $9k-$11k.
Now compared to disk I'd go with $180 2TB Seagate Barracuda drives for a reasonable option.
Three of them give you 3.5+ TB of usable space per customer which meets your requirements with some breathing room.
Now the important part - you need a server to put these drives in.
So when you're comparing the cost of the infrastructure for tape, don't forget you need infrastructure for the RAID5-array-builder-machine-guy-thing.
There are numerous other advantages for tape as well.
Easy encryption, easy restores, less storage space, less likely to fail, etc etc etc.
For me tape is a no brainer.</sentencetext>
</comment>
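A back-of-envelope break-even using the figures in this comment (rounded assumptions; the per-client disk cost ignores the host server both options need, so treat it as rough):

    drive_setup=3500      # LTO4 drive + cheap PC, low end of the $3-3.5k estimate
    tape_per_client=160   # 4 x LTO4 tapes @ ~$40 for ~3 TB native
    disk_per_client=540   # 3 x 2TB SATA @ ~$180
    echo "break-even after ~$(( drive_setup / (disk_per_client - tape_per_client) + 1 )) clients"

Under these numbers tape pulls ahead somewhere around ten clients, rather than the 4-5 claimed elsewhere in the thread; the gap narrows further once the disk side's server hardware is counted.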
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31358228</id>
	<title>More information?</title>
	<author>eth1</author>
	<datestamp>1267720080000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>It would be much easier to suggest something reasonable if we knew what the impact was if the data couldn't be recovered, and also why the data needs to be kept. Are you keeping it for yourself? If you're just keeping it around in case the customer comes back for something, why not give it all to them and let THEM worry about it?</p></htmltext>
<tokenext>It would be much easier to suggest something reasonable if we knew what the impact was if the data could n't be recovered , and also why the data needs to be kept .
Are you keeping it for yourself ?
If you 're just keeping it around in case the customer comes back for something , why not give it all to them and let THEM worry about it ?</tokentext>
<sentencetext>It would be much easier to suggest something reasonable if we knew what the impact was if the data couldn't be recovered, and also why the data needs to be kept.
Are you keeping it for yourself?
If you're just keeping it around in case the customer comes back for something, why not give it all to them and let THEM worry about it?</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354686</id>
	<title>Re:Amazon S3</title>
	<author>chameleon\_skin</author>
	<datestamp>1267639980000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>
One major caveat to this: storing data on S3 is not compatible with many types of compliance standards (PCI, SOX, HIPAA, etc.)  If your business is required to have any particular types of compliance, either because the government requires it of you or your partners do, make sure you are storing the data offline.  That's about the only way to do it and still meet compliance measures.<p>
I also love the idea of using S3 for storage; just keep in mind that depending on the type of data you are storing it may not be the appropriate choice.</p></htmltext>
<tokenext>One major caveat to this : storing data on S3 is not compatible with many types of compliance standards ( PCI , SOX , HIPAA , etc .
) If your business is required to have any particular types of compliance , either because the government requires it of you or your partners do , make sure you are storing the data offline .
That 's about the only way to do it and still meet compliance measures .
I also love the idea of using S3 for storage ; just keep in mind that depending on the type of data you are storing it may not be the appropriate choice .</tokentext>
<sentencetext>
One major caveat to this: storing data on S3 is not compatible with many types of compliance standards (PCI, SOX, HIPAA, etc.)  If your business is required to have any particular types of compliance, either because the government requires it of you or your partners do, make sure you are storing the data offline.
That's about the only way to do it and still meet compliance measures.
I also love the idea of using S3 for storage; just keep in mind that depending on the type of data you are storing it may not be the appropriate choice.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351420</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352190</id>
	<title>Build a BackBlaze Storage Pod</title>
	<author>Anonymous</author>
	<datestamp>1267619160000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Honestly, I don't know of any offline storage that I trust with that much data for any serious length of time.  Have you considered building your own online/nearline storage?  The backblaze storage pod is an excellent example of online storage on the cheap.</p><p>http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/</p></htmltext>
<tokenext>Honestly , I do n't know of any offline storage that I trust with that much data for any serious length of time .
Have you considered building your own online/nearline storage ?
The backblaze storage pod is an excellent example of online storage on the cheap.http : //blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/</tokentext>
<sentencetext>Honestly, I don't know of any offline storage that I trust with that much data for any serious length of time.
Have you considered building your own online/nearline storage?
The Backblaze storage pod is an excellent example of online storage on the cheap.
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355506</id>
	<title>Tiered approach</title>
	<author>Anonymous</author>
	<datestamp>1267736340000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>I would use a tiered approach: simply use a cheap SATA RAID 5/6 for things you still need access to, and once you need to archive them, clone it to tape and preferably keep 2 copies of everything stored at different locations. It doesn't add much to the cost, but it can save you a lot if one copy gets lost or doesn't work.</p></htmltext>
<tokenext>I would use a tiered approach simply use a cheap S-ATA RAID 5/6 for things you still need access for .
And once you need to archive them clone it to tape and preferably keep 2 copies of everything stored at different locations , does n't add much too the cost .
But can save you alot if one copy gets lost or does n't work .</tokentext>
<sentencetext>I would use a tiered approach: simply use a cheap SATA RAID 5/6 for things you still need access to.
And once you need to archive them, clone it to tape and preferably keep 2 copies of everything stored at different locations; it doesn't add much to the cost.
But it can save you a lot if one copy gets lost or doesn't work.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355046</id>
	<title>I have an idea</title>
	<author>Tablizer</author>
	<datestamp>1267644360000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Let's see, where'd I put that stack of 12,000 AOL disks?</p></htmltext>
<tokenext>Let 's see , where 'd I put that stack of 12,000 AOL disks ?
   </tokentext>
<sentencetext>Let's see, where'd I put that stack of 12,000 AOL disks?
   </sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353164</id>
	<title>Re:I'd encrypt the data and...</title>
	<author>Anonymous</author>
	<datestamp>1267625820000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Have you SEEN Idol this season? :|  Might as well label your data, "BEST OF CHEVY CHASE BLURAY ALSO INCLUDES VACATION MOVIES"</p></htmltext>
<tokenext>Have you SEEN Idol this season ?
: | Might as well label your data , " BEST OF CHEVY CHASE BLURAY ALSO INCLUDES VACATION MOVIES "</tokentext>
<sentencetext>Have you SEEN Idol this season?
:|  Might as well label your data, "BEST OF CHEVY CHASE BLURAY ALSO INCLUDES VACATION MOVIES"</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351566</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354358</id>
	<title>Buy a SAN</title>
	<author>Anonymous</author>
	<datestamp>1267636440000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>It'll cost you anywhere between $100K and $1 million. You can look at vendors like EMC, BlueArc, NetApp, or Sun, and buy a nice SAN. Use SATA drives for second-tier storage. You can mount everything via NFS, or do iSCSI via a fiber network. Dead easy, and the best thing is you expand it as needed with no downtime. These vendors are pretty hands-on about support; for example, you'll have a tech at your door to replace a failed disk you didn't even know about, because they monitor this stuff, or have them come in to update firmware.</p></htmltext>
<tokenext>It 'll cost you anywhere between $ 100K-1 Million , you can look at vendors like EMC , and bluearc , or Netapp , or Sun , and buy a nice SAN .
Use SATA drives for 2nd teer storage You can mount everything via NFS , or do it ISCSI via a fiber network .
Dead easy , and the best thing is you expand it as needed with no downtime .
These vendora are pretty hands on about support , for example you 'll have a tech at your door to replace a failed disk you did n't even know about because they monitor this stuff , or have them come in to update frimware .</tokentext>
<sentencetext>It'll cost you anywhere between $100K and $1 million. You can look at vendors like EMC, BlueArc, NetApp, or Sun, and buy a nice SAN.
Use SATA drives for second-tier storage. You can mount everything via NFS, or do iSCSI via a fiber network.
Dead easy, and the best thing is you expand it as needed with no downtime.
These vendors are pretty hands-on about support; for example, you'll have a tech at your door to replace a failed disk you didn't even know about, because they monitor this stuff, or have them come in to update firmware.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352846</id>
	<title>Re:Tape is crap anyway.</title>
	<author>Anonymous</author>
	<datestamp>1267623300000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Funny, DLT worked beautifully for me for a decade, and now it's LTO... and yes, I did practice the restore with each hardware set before I needed it.</p></htmltext>
<tokenext>Funny , DLT worked beautifully for me for a decade , and now it 's LTO ... and yes , I did practice the restore with each hardware set before I needed it .</tokentext>
<sentencetext>Funny, DLT worked beautifully for me for a decade, and now it's LTO ... and yes, I did practice the restore with each hardware set before I needed it.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351706</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357320</id>
	<title>Hard drives are cheap.</title>
	<author>Ihlosi</author>
	<datestamp>1267714140000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Use several sets of them, from different manufacturers (to exclude the possibility of nasty firmware bugs or manufacturing problems killing your data).</p><p><b>Do not use any kind of RAID</b> just for storing the data. That's an invitation for problems, and hard drives are _cheap_. The main purpose of RAID isn't extra safety; it's to shorten or eliminate the downtime if one disk fails. Since you're merely storing the data and therefore don't need to hot-swap, you don't need that.</p><p>My $0.02.</p></htmltext>
<tokenext>Use several sets of them , from different manufacturers ( to exclude the possibility of nasty firmware bugs or manufacturing problems killing your data ) .Do not use any kind of RAID just for storing the data .
That 's an invitation for problems , and hard drives are \ _cheap \ _ .
The main purpose of RAID is n't extra safety , it 's to shorten or eliminate the downtime if one disk fails .
Since you 're merely storing the data and therefore do n't need to hot-swap , you do n't need that.My $ 0.02 .</tokentext>
<sentencetext>Use several sets of them, from different manufacturers (to exclude the possibility of nasty firmware bugs or manufacturing problems killing your data).
Do not use any kind of RAID just for storing the data.
That's an invitation for problems, and hard drives are _cheap_.
The main purpose of RAID isn't extra safety; it's to shorten or eliminate the downtime if one disk fails.
Since you're merely storing the data and therefore don't need to hot-swap, you don't need that.
My $0.02.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352910</id>
	<title>Re:Exactly what you're doing</title>
	<author>Tmack</author>
	<datestamp>1267623780000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>RAID5 is OK for some things, but it's not something I would trust for critical backups. Rebuild times are quite high these days for larger drives, so a single failure puts your terabytes of data at risk of total loss if just one other drive fails. I would say RAID6 or mirrored RAID5 sets, or just mirrored stripes; always have a spare ready, and avoid using all the same brand/model/batch, so if one dies due to a firmware or manufacturing bug, the others are less likely to die at the same time for the same reason (yes, I've seen it happen). Still, this will not account for, correct, or even warn about in-place data corruption; you need something that actively scans the drives, comparing checksums against those written with the original data -- something ZFS can do.
<p>
Or, you can go to an external provider as others have mentioned. There are more out there than just Amazon, some specializing in data integrity (Amazon specifically does NOT guarantee your data, let alone any sort of redundancy/integrity for it <a href="http://blogs.sun.com/gbrunett/entry/amazon_s3_silent_data_corruption" title="sun.com">Link</a> [sun.com]).
</p><p>-Tm</p></htmltext>
<tokenext>raid5 is ok for some things , but its not something I would trust for critical backups .
Rebuild times are quite high these days for larger drives , so a single failure puts your terabytes of data at risk of total loss if just one other drive fails .
I would say raid6 or mirrored raid5 sets , or just mirrored stripes , and always have a spare ready , and avoid using all the same brand/model/batch , so if one dies due to firmware or manufacturing bug , the others are less likely to die at the same time for the same reason ( yes , Ive seen it happen ) .
Still , this will not account for , correct or even warn for in-place data corruption , you need something that actively scans the drives comparing checksums against those written with the original data , something ZFS can do .
Or , you can go to an external provider as others have mentioned .
There are more out there than just Amazon , some specializing in data integrity ( amazon specifically does NOT guarantee your data , let alone any sort of redundancy/integrity for it Link [ sun.com ] ) .
-Tm</tokentext>
<sentencetext>RAID5 is OK for some things, but it's not something I would trust for critical backups.
Rebuild times are quite high these days for larger drives, so a single failure puts your terabytes of data at risk of total loss if just one other drive fails.
I would say RAID6 or mirrored RAID5 sets, or just mirrored stripes; always have a spare ready, and avoid using all the same brand/model/batch, so if one dies due to a firmware or manufacturing bug, the others are less likely to die at the same time for the same reason (yes, I've seen it happen).
Still, this will not account for, correct, or even warn about in-place data corruption; you need something that actively scans the drives, comparing checksums against those written with the original data -- something ZFS can do.
Or, you can go to an external provider as others have mentioned.
There are more out there than just Amazon, some specializing in data integrity (amazon specifically does NOT guarantee your data, let alone any sort of redundancy/integrity for it Link [sun.com]).
-Tm</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342</parent>
</comment>
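A minimal sketch of the ZFS route mentioned above (pool name and device paths are hypothetical): mirrored vdevs plus a periodic scrub give exactly the checksum-verified re-read described:

    zpool create tank mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde
    zpool scrub tank        # re-reads every block, verifying checksums
    zpool status -v tank    # lists any files with unrecoverable errors

Run the scrub from cron (monthly is a common choice) so silent corruption is found while the redundant copy can still repair it.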
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351850</id>
	<title>Get a NAS with Deduplication</title>
	<author>Anonymous</author>
	<datestamp>1267617540000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>I would suggest something like this:<br><a href="http://www.netapp.com/us/products/storage-systems/fas2000/fas2000-tech-specs.html" title="netapp.com" rel="nofollow">http://www.netapp.com/us/products/storage-systems/fas2000/fas2000-tech-specs.html</a> [netapp.com]</p><p>that is, if you have a datacenter and ability to get 20A (NEMA L6-20) or 30A (NEMA L6-30) power.</p><p>Are the datasets generated using similar chunks of data?  If so, deduplication could be very very helpful (in addition to compression)</p></htmltext>
<tokenext>I would suggest something like this : http : //www.netapp.com/us/products/storage-systems/fas2000/fas2000-tech-specs.html [ netapp.com ] that is , if you have a datacenter and ability to get 20A ( NEMA L6-20 ) or 30A ( NEMA L6-30 ) power.Are the datasets generated using similar chunks of data ?
If so , deduplication could be very very helpful ( in addition to compression )</tokentext>
<sentencetext>I would suggest something like this: http://www.netapp.com/us/products/storage-systems/fas2000/fas2000-tech-specs.html [netapp.com] -- that is, if you have a datacenter and the ability to get 20A (NEMA L6-20) or 30A (NEMA L6-30) power.
Are the datasets generated using similar chunks of data?
If so, deduplication could be very very helpful (in addition to compression)</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31356906</id>
	<title>What we do (much larger system)...</title>
	<author>Anonymous</author>
	<datestamp>1267711080000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>We manage a moderate system of 5000 TB, growing to 10,000 TB in the next two years. Lots of sensor data. We get about 5 TB a day, and that is growing to 15 later this year. Next year looks worse.</p><p>We have your problem and found that how you store your data depends on many things. We use a combination of disk and tape -- about 1600 TB of online disk (for fast cache) and the rest in 1 TB tapes (ok, 930 GB after fuzzy math).</p><p>We split data sets to make them fit and then duplicate the medium. So for a routine data set of ours (3-5 TB per day), we break the data into chunks that fit within a tape and then make sure we have two copies of the tape. Disk access is all RAID (hardware-based 5 and 6, depending on vendor hardware). At any time we have at least two copies in the system, and more if you count copies at other sites.</p><p>In your case I would split your data set down into manageable chunks (each less than 1 disk in size) and then stuff it in.</p><p>Do not use RAID. We had an experiment where data sources would send us RAID data just like you suggest (Linux RAID via hard drive in big padded cases from remote locations). Not optimal. Issues with firmware, drivers, and the like are surprisingly frequent and take time to rectify.</p></htmltext>
<tokenext>We manage a moderate system of 5000 TB , growing to 10,000 TB in the next two years .
Lots of sensor data .
We get about 5 TB a day and that is growing to 15 later this year .
Next year looks worse.We have your problem and found that how you store your data is dependent on many things .
We use a combination of disk and tape -- about 1600 TB of online disk ( for fast cache ) and the rest in 1 TB tapes ( ok , 930 GB after fuzzy math ) .We split data sets to make them fit and then duplicate the medium .
So for a routine data set of our ( 3-5 TB per day ) , we break the data into chunks that fit within a tape and then make sure we have two copies of the tape .
Disk access is all RAID ( Hardware based 5 and 6 , depending on vendor hardware ) .
At any time we have at least two copies in the system , and more if you count copies at other sites.In your case I would split your data set down into manageable chunks ( each less than 1 disk in size ) and then stuff it in.Do not use RAID .
We had an experiment where data sources would send us RAID data just like you suggest ( Linux RIAD via hard drive in big padded cases from remote locations ) .
Not optimal .
Issues with firmware , drivers and the like are surprisingly frequent and take time to rectify .</tokentext>
<sentencetext>We manage a moderate system of 5000 TB, growing to 10,000 TB in the next two years.
Lots of sensor data.
We get about 5 TB a day and that is growing to 15 later this year.
Next year looks worse.
We have your problem and found that how you store your data depends on many things.
We use a combination of disk and tape -- about 1600 TB of online disk (for fast cache) and the rest in 1 TB tapes (ok, 930 GB after fuzzy math).
We split data sets to make them fit and then duplicate the medium.
So for a routine data set of ours (3-5 TB per day), we break the data into chunks that fit within a tape and then make sure we have two copies of the tape.
Disk access is all RAID (hardware-based 5 and 6, depending on vendor hardware).
At any time we have at least two copies in the system, and more if you count copies at other sites.
In your case I would split your data set down into manageable chunks (each less than 1 disk in size) and then stuff it in.
Do not use RAID.
We had an experiment where data sources would send us RAID data just like you suggest (Linux RAID via hard drive in big padded cases from remote locations).
Not optimal.
Issues with firmware, drivers and the like are surprisingly frequent and take time to rectify.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351386</id>
	<title>Amazon AWS?</title>
	<author>Anonymous</author>
	<datestamp>1267615380000</datestamp>
	<modclass>Interestin</modclass>
	<modscore>4</modscore>
	<htmltext>It might not be the cheapest option, but with <a href="http://aws.amazon.com/importexport/" title="amazon.com">Amazon's AWS</a> [amazon.com], you can snail-mail them a copy of the drive with the data and they'll store it in S3 storage buckets.</htmltext>
<tokenext>It might not be the cheapest option , but with Amazon 's AWS [ amazon.com ] , you can snail mail them a copy of the drive with the data and they 're store it in S3 storage buckets .</tokentext>
<sentencetext>It might not be the cheapest option, but with Amazon's AWS [amazon.com], you can snail-mail them a copy of the drive with the data and they'll store it in S3 storage buckets.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352166</id>
	<title>Re:Exactly what you're doing</title>
	<author>MoralHazard</author>
	<datestamp>1267619100000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p><i>Tape storage does store better.</i></p><p>CITATION MISSING</p></htmltext>
<tokenext>Tape storage does store better.CITATION MISSING</tokentext>
<sentencetext>Tape storage does store better.
CITATION MISSING</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352152</id>
	<title>Re:Exactly what you're doing</title>
	<author>Anonymous</author>
	<datestamp>1267619040000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>AFAIK hard drives don't die if they are not being used, so they store as well as tapes.</p></htmltext>
<tokenext>AFAIK harde drives do n't die if they are not being used , so store as well as tapes .</tokentext>
<sentencetext>AFAIK hard drives don't die if they are not being used, so they store as well as tapes.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352638</id>
	<title>Re:Exactly what you're doing</title>
	<author>jedidiah</author>
	<datestamp>1267621800000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Yup. RAID5 is a bad idea. Just split the dataset into ~ 2TB chunks and duplicate everything.</p><p>It ends up being more expensive but it also ends up being simpler and it yields better redundancy.</p></htmltext>
<tokenext>Yup .
RAID5 is a bad idea .
Just split the dataset into ~ 2TB chunks and duplicate everything.It ends up being more expensive but it also ends up being simpler and it yields better redundancy .</tokentext>
<sentencetext>Yup.
RAID5 is a bad idea.
Just split the dataset into ~2TB chunks and duplicate everything.
It ends up being more expensive, but it also ends up being simpler and it yields better redundancy.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354732</id>
	<title>Re:GMail Drive</title>
	<author>inKubus</author>
	<datestamp>1267640400000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>For an online backup, I highly recommend <a href="http://www.rsync.net/" title="rsync.net">rsync.net</a> [rsync.net] instead.  Starts at $0.80 a GB and goes down rapidly in bulk. Designed for storage, not mail or something.  More expensive than AWS, but it goes down every month as disk prices drop.  Plus other benefits.  You're not going to beat cheap hard drives for bulk storage, but I like LTO4 a lot for archival backups.  But of course it's all worthless if you have a fire in the server room or something.</p></htmltext>
<tokenext>For an online backup , I highly recommend rsync.net [ rsync.net ] instead .
Starts at .80 a GB , goes down rapidly in bulk .
Designed for storage , not mail or something .
More expensive than AWS but it goes down every month as disk prices drop .
Plus other benefits .
You 're not going to beat cheap hard drives for bulk storage , but I like LTO4 a lot for archival backups .
But of course it 's all worthless if you have a fire in the server room or something .</tokentext>
<sentencetext>For an online backup, I highly recommend rsync.net [rsync.net] instead.
Starts at $0.80 a GB and goes down rapidly in bulk.
Designed for storage, not mail or something.
More expensive than AWS but it goes down every month as disk prices drop.
Plus other benefits.
You're not going to beat cheap hard drives for bulk storage, but I like LTO4 a lot for archival backups.
But of course it's all worthless if you have a fire in the server room or something.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351348</parent>
</comment>
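Since rsync.net exposes a plain ssh/rsync target, the offsite push mentioned above can be a one-liner; a sketch with placeholder host and paths:

    rsync -az --partial /archive/customer42/ user@usw-s001.rsync.net:customer42/
    # Periodically pull a sample back and compare, so the restore path is tested:
    rsync -az user@usw-s001.rsync.net:customer42/sample.dat /tmp/restore-test/

Because rsync only transfers deltas, re-running the same command after small changes is cheap; the expensive part is only the initial multi-TB seed.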
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351690</id>
	<title>I prefer online</title>
	<author>michaelmalak</author>
	<datestamp>1267616760000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext>I also deal in large amounts of scientific data (though only about 20GB per week of new data), and I prefer to keep it all online in order to analyze past data.  For now, an 8TB Buffalo Linkstation will do for me.  For when I outgrow that, I have been considering -- and have not yet tried -- a Drobo.<p> <a href="http://www.drobo.com/products/droboelite.php" title="drobo.com">http://www.drobo.com/products/droboelite.php</a> [drobo.com] </p><p>Each Drobo has 8 bays of 2TB drives for a total of 16TB.  And 255 Drobos can be linked together over Ethernet to create a single virtual volume of 4 Petabytes.</p><p>To back it all up, just buy more Drobos and store those in a separate location.  They're too big for a safe deposit box, so your home is just as good as anywhere else (assuming your home is different than your office) -- or a temperature-controlled storage unit if you don't like that idea.</p><p>If you can afford it, I recommend the backup strategy I use, which involves four complete sets of the data.  The main one is online.  The second is always connected to the network and receives a backup nightly via xxcopy (yes, I manage a file server in Windows -- shoot me).  The third is on-site but not connected to the network and gets rotated with the second weekly.  The fourth is off-site and gets rotated to on-site quarterly.</p><p>If you're really wed to the idea of off-line archival storage, get these for each customer -- get two so you can have two sets of data in case one goes bad:</p><p> <a href="http://www.newegg.com/Product/Product.aspx?Item=N82E16822154428" title="newegg.com">http://www.newegg.com/Product/Product.aspx?Item=N82E16822154428</a> [newegg.com]</p></htmltext>
<tokenext>I also deal in large amounts of scientific data ( though only about 20GB per week of new data ) , and I prefer to keep it all online in order to analyze past data .
For now , an 8TB Buffalo Linkstation will do for me .
For when I outgrow that , I have been considering -- and have not yet tried -- a Drobo .
http : //www.drobo.com/products/droboelite.php [ drobo.com ] Each Drobo has 8 bays of 2TB drives for a total of 16TB .
And 255 Drobos can be linked together over Ethernet to create a single virtual volume of 4 Petabytes.To back it all up , just buy more Drobos and store those in a separate location .
They 're too big for a safe deposit box , so your home is just as good as anywhere else ( assuming your home is different than your office ) -- or a temperature-controlled storage unit if you do n't like that idea.If you can afford it , I recommend the backup strategy I use , which involves four complete sets of the data .
The main one is online .
The second is always connected to the network and receives a backup nightly via xxcopy ( yes , I manage a file server in Windows -- shoot me ) .
The third is on-site but not connected to the network and gets rotated with the second weekly .
The fourth is off-site and gets rotated to on-site quarterly.If you 're really wed to the idea of off-line archival storage , get these for each customer -- get two so you can have two sets of data in case one goes bad : http : //www.newegg.com/Product/Product.aspx ? Item = N82E16822154428 [ newegg.com ]</tokentext>
<sentencetext>I also deal in large amounts of scientific data (though only about 20GB per week of new data), and I prefer to keep it all online in order to analyze past data.
For now, an 8TB Buffalo Linkstation will do for me.
For when I outgrow that, I have been considering -- and have not yet tried -- a Drobo.
http://www.drobo.com/products/droboelite.php [drobo.com] Each Drobo has 8 bays of 2TB drives for a total of 16TB.
And 255 Drobos can be linked together over Ethernet to create a single virtual volume of 4 petabytes.
To back it all up, just buy more Drobos and store those in a separate location.
They're too big for a safe deposit box, so your home is just as good as anywhere else (assuming your home is different from your office) -- or a temperature-controlled storage unit if you don't like that idea.
If you can afford it, I recommend the backup strategy I use, which involves four complete sets of the data.
The main one is online.
The second is always connected to the network and receives a backup nightly via xxcopy (yes, I manage a file server in Windows -- shoot me).
The third is on-site but not connected to the network and gets rotated with the second weekly.
The fourth is off-site and gets rotated to on-site quarterly.
If you're really wed to the idea of off-line archival storage, get these for each customer -- get two so you can have two sets of data in case one goes bad: http://www.newegg.com/Product/Product.aspx?Item=N82E16822154428 [newegg.com]</sentencetext>
</comment>
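The nightly xxcopy step in the comment above is easy to script. Here is a minimal sketch of such a job in Python, assuming a Linux host with rsync installed; the /data and /backup paths and the log location are placeholders (the commenter's own setup used xxcopy on Windows, where xxcopy or robocopy would fill the same role).

#!/usr/bin/env python3
"""Minimal nightly mirror job. Assumes a Linux host with rsync
installed; all paths are illustrative placeholders."""
import datetime
import subprocess

SRC = "/data/"            # trailing slash: sync the contents, not the dir itself
DST = "/backup/nightly/"  # the always-connected second set

def nightly_mirror():
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    # -a preserves permissions and times; --delete keeps DST an exact mirror
    result = subprocess.run(["rsync", "-a", "--delete", SRC, DST])
    with open("/var/log/nightly-mirror.log", "a") as log:
        log.write(f"{stamp} rc={result.returncode}\n")
    result.check_returncode()

if __name__ == "__main__":
    nightly_mirror()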
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352692</id>
	<title>Re:LTO-4?</title>
	<author>Anonymous</author>
	<datestamp>1267622160000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>I figured LTO-5 would be out by now.<br>http://www-03.ibm.com/systems/storage/tape/ts3100/browse.html<br>I think you can configure that with a single drive; plus you can get encryption with no speed difference.</p><p>If you want to go cheap on the encryption, there are ways to run the EKM for free instead of purchasing the new TKLM.</p></htmltext>
<tokenext>I figured LTO-5 would be out by now .
http : //www-03.ibm.com/systems/storage/tape/ts3100/browse.html
I think you can configure that with a single drive ; plus you can get encryption with no speed difference .
If you want to go cheap on the encryption , there are ways to run the EKM for free instead of purchasing the new TKLM .</tokentext>
<sentencetext>I figured LTO-5 would be out by now.
http://www-03.ibm.com/systems/storage/tape/ts3100/browse.html
I think you can configure that with a single drive; plus you can get encryption with no speed difference.
If you want to go cheap on the encryption, there are ways to run the EKM for free instead of purchasing the new TKLM.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351720</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354756</id>
	<title>Screw RAID 5</title>
	<author>hedronist</author>
	<datestamp>1267640580000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>
Sorry, but disk is free and RAID 5 has some really nasty rebuild times for large drives. It's also important to remember that RAID is for <i>availability</i>, not backup.
</p><p>
Our largish datasets (2-10TB) run on RAID 1 mirrors for availability and are fully rsync'd to a separate system (also mirrored) for backup. The stuff we <i>really</i> care about is further rsync/rsnapshot'ed offsite.
</p><p>
Disk. Is. Free.</p></htmltext>
<tokenext>Sorry , but disk is free and RAID 5 has some really nasty rebuild times for large drives .
It 's also important to remember that RAID is for availability , not backup .
Our largish datasets ( 2-10TB ) run on RAID 1 mirrors for availability and are fully rsync 'd to a separate system ( also mirrored ) for backup .
The stuff we really care about is further rsync/rsnapshot'ed offsite .
Disk .
Is .
Free .</tokentext>
<sentencetext>
Sorry, but disk is free and RAID 5 has some really nasty rebuild times for large drives.
It's also important to remember that RAID is for availability, not backup.
Our largish datasets (2-10TB) run on RAID 1 mirrors for availability and are fully rsync'd to a separate system (also mirrored) for backup.
The stuff we really care about is further rsync/rsnapshot'ed offsite.
Disk.
Is.
Free.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342</parent>
</comment>
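For readers who haven't used rsnapshot, the trick behind the offsite step above is rsync's --link-dest option: files unchanged since the previous snapshot are stored as hard links, so each daily snapshot costs only the changed data. A rough sketch under that assumption, with hypothetical paths:

#!/usr/bin/env python3
"""Rsnapshot-style rotating snapshots via rsync --link-dest.
Paths and naming are illustrative, not taken from the comment."""
import datetime
import os
import subprocess

SRC = "/data/"
SNAPROOT = "/backup/snapshots"

def take_snapshot():
    os.makedirs(SNAPROOT, exist_ok=True)
    dest = os.path.join(SNAPROOT, datetime.date.today().isoformat())
    latest = os.path.join(SNAPROOT, "latest")
    cmd = ["rsync", "-a", "--delete", SRC, dest]
    if os.path.isdir(latest):
        # Unchanged files become hard links into the previous snapshot.
        cmd.insert(1, "--link-dest=" + os.path.realpath(latest))
    subprocess.run(cmd, check=True)
    if os.path.islink(latest):
        os.unlink(latest)
    os.symlink(dest, latest)  # repoint 'latest' at the new snapshot

if __name__ == "__main__":
    take_snapshot()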
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351574</id>
	<title>Re:Tape is your friend</title>
	<author>Icegryphon</author>
	<datestamp>1267616340000</datestamp>
	<modclass>Funny</modclass>
	<modscore>1</modscore>
	<htmltext>I laugh at your table on wiki.<br>
3.2 TB? What kind of weakling only has 3.2TB?<br>
That is like throwing Zip drives at the problem.</htmltext>
<tokenext>I laugh at your table on wiki .
3.2 TB ? What kind of weakling only has 3.2TB ?
That is like throwing Zip drives at the problem .</tokentext>
<sentencetext>I laugh at your table on wiki.
3.2 TB? What kind of weakling only has 3.2TB?
That is like throwing Zip drives at the problem.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351348</id>
	<title>GMail Drive</title>
	<author>Anonymous</author>
	<datestamp>1267615200000</datestamp>
	<modclass>Funny</modclass>
	<modscore>2</modscore>
	<htmltext><p>Unlimited space with several accounts.</p></htmltext>
<tokenext>Unlimited space with several accounts .</tokentext>
<sentencetext>Unlimited space with several accounts.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351904</id>
	<title>Active storage is the only way</title>
	<author>Overzeetop</author>
	<datestamp>1267617780000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>The problem with storing things is that they tend to degrade over time, and you never know when they'll fail.</p><p>Without being ridiculous, four sets in two locations is the best bet.  Two sets are online, and a regular parity check should be made between the two, with full data verification on a longer time scale.  One backup set gets made of each online set (an external drive which is sync'd once a week/month is likely good enough) and stored unpowered. This prevents local disaster from destroying your data, electrical damage from destroying your data, and (hopefully) bit rot from corrupting your data (two online sets provide a cross check, offline sets allow polling if the parity is off).</p><p>Mechanisms usually last longer when they are in service than when they are left unattended, though it takes power and - more importantly - a human to keep tabs on the system.  You should also have a 5-8 year migration plan so that the data is updated to current interface standards on a regular schedule. The biggest fear, short of actual data loss, is that your storage medium will be unreadable at the indeterminate point in the future when it becomes necessary to retrieve the data.</p><p>This is, no doubt, more money than you've budgeted for the storage. Whatever you do, don't use RAID5. Two failures = zero data. Better to use a RAID4 (JBOD with parity). If you lose 2 drives, you lose data, but at least you only lose 1 drive of data for each additional drive failure.</p></htmltext>
<tokenext>The problem with storing things is that they tend to degrade over time , and you never know when they 'll fail .
Without being ridiculous , four sets in two locations is the best bet .
Two sets are online , and a regular parity check should be made between the two , with full data verification on a longer time scale .
One backup set gets made of each online set ( an external drive which is sync 'd once a week/month is likely good enough ) and stored unpowered .
This prevents local disaster from destroying your data , electrical damage from destroying your data , and ( hopefully ) bit rot from corrupting your data ( two online sets provide a cross check , offline sets allow polling if the parity is off ) .
Mechanisms usually last longer when they are in service than when they are left unattended , though it takes power and - more importantly - a human to keep tabs on the system .
You should also have a 5-8 year migration plan so that the data is updated to current interface standards on a regular schedule .
The biggest fear , short of actual data loss , is that your storage medium will be unreadable at the indeterminate point in the future when it becomes necessary to retrieve the data .
This is , no doubt , more money than you 've budgeted for the storage .
Whatever you do , do n't use RAID5 .
Two failures = zero data .
Better to use a RAID4 ( JBOD with parity ) .
If you lose 2 drives , you lose data , but at least you only lose 1 drive of data for each additional drive failure .</tokentext>
<sentencetext>The problem with storing things is that they tend to degrade over time, and you never know when they'll fail.
Without being ridiculous, four sets in two locations is the best bet.
Two sets are online, and a regular parity check should be made between the two, with full data verification on a longer time scale.
One backup set gets made of each online set (an external drive which is sync'd once a week/month is likely good enough) and stored unpowered.
This prevents local disaster from destroying your data, electrical damage from destroying your data, and (hopefully) bit rot from corrupting your data (two online sets provide a cross check, offline sets allow polling if the parity is off).
Mechanisms usually last longer when they are in service than when they are left unattended, though it takes power and - more importantly - a human to keep tabs on the system.
You should also have a 5-8 year migration plan so that the data is updated to current interface standards on a regular schedule.
The biggest fear, short of actual data loss, is that your storage medium will be unreadable at the indeterminate point in the future when it becomes necessary to retrieve the data.
This is, no doubt, more money than you've budgeted for the storage.
Whatever you do, don't use RAID5.
Two failures = zero data.
Better to use a RAID4 (JBOD with parity).
If you lose 2 drives, you lose data, but at least you only lose 1 drive of data for each additional drive failure.</sentencetext>
</comment>
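The "regular parity check ... with full data verification" described above can be as simple as comparing checksum manifests of the two online sets. A minimal sketch in Python; the two archive roots are hypothetical placeholders:

#!/usr/bin/env python3
"""Cross-check two copies of an archive by comparing SHA-256 manifests.
A minimal sketch of the full-data-verification idea; roots are
hypothetical."""
import hashlib
import os

def manifest(root):
    """Map relative path -> SHA-256 hex digest for every file under root."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            h = hashlib.sha256()
            with open(full, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(full, root)] = h.hexdigest()
    return digests

a = manifest("/archive/set-a")
b = manifest("/archive/set-b")
for path in sorted(set(a) | set(b)):
    if a.get(path) != b.get(path):
        print("MISMATCH:", path)  # candidate for restore from the offline set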
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351676</id>
	<title>Re:Amazon S3</title>
	<author>Anonymous</author>
	<datestamp>1267616700000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>What do you think S3 runs on? Magical pixie clouds?</p></htmltext>
<tokenext>What do you think S3 runs on ?
Magical pixie clouds ?</tokentext>
<sentencetext>What do you think S3 runs on?
Magical pixie clouds?</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351420</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354246</id>
	<title>still don't know how much data you are keeping...</title>
	<author>Anonymous</author>
	<datestamp>1267635180000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>RE: and we end up generating fairly large datasets (2-3 TB) for each customer</p><p>How many customers do you have? 1? 2? 10? 100? 1000? 5000?</p><p>How many IT staff do you have managing this?</p></htmltext>
<tokenext>RE : and we end up generating fairly large datasets ( 2-3 TB ) for each customer
How many customers do you have ?
1 ? 2 ? 10 ? 100 ? 1000 ? 5000 ?
How many IT staff do you have managing this ?</tokentext>
<sentencetext>RE: and we end up generating fairly large datasets (2-3 TB) for each customer
How many customers do you have?
1? 2? 10? 100? 1000? 5000?
How many IT staff do you have managing this?</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31365372</id>
	<title>Re:Another Option / Definition issues</title>
	<author>jabuzz</author>
	<datestamp>1267709760000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>If you could not store 86TB of data without spending millions of dollars then you need to get out of the storage business as you have no clue. I look after over 300TB of spinning disk at work and we ain't spent millions on it as we don't have millions to begin with. For $60k I could put your 86TB on brand new unduplicated disk with brand new servers all on a five-year maintenance contract.</p></htmltext>
<tokenext>If you could not store 86TB of data without spending millions of dollars then you need to get out of the storage business as you have no clue .
I look after over 300TB of spinning disk at work and we ai n't spent millions on it as we do n't have millions to begin with .
For $ 60k I could put your 86TB on brand new unduplicated disk with brand new servers all on a five-year maintenance contract .</tokentext>
<sentencetext>If you could not store 86TB of data without spending millions of dollars then you need to get out of the storage business as you have no clue.
I look after over 300TB of spinning disk at work and we ain't spent millions on it as we don't have millions to begin with.
For $60k I could put your 86TB on brand new unduplicated disk with brand new servers all on a five-year maintenance contract.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354430</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351650</id>
	<title>LTO4</title>
	<author>Anonymous</author>
	<datestamp>1267616640000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Get a couple LTO-4 tape drives (one to use, one to keep in the safe deposit box with the archived tapes...just in case) and do it right.</p><p>You can get 1TB+ on a tape.  Each dataset will take you 2-3 tapes.  Make 2 complete copies for your archive.  If the data is truly important to you, it's worth the expense.</p><p>Magnetic tape has been proven to be the most reliable long-term storage, short of carving your data into stone monoliths.</p></htmltext>
<tokenext>Get a couple LTO-4 tape drives ( one to use , one to keep in the safe deposit box with the archived tapes...just in case ) and do it right .
You can get 1TB + on a tape .
Each dataset will take you 2-3 tapes .
Make 2 complete copies for your archive .
If the data is truly important to you , it 's worth the expense .
Magnetic tape has been proven to be the most reliable long-term storage , short of carving your data into stone monoliths .</tokentext>
<sentencetext>Get a couple LTO-4 tape drives (one to use, one to keep in the safe deposit box with the archived tapes...just in case) and do it right.
You can get 1TB+ on a tape.
Each dataset will take you 2-3 tapes.
Make 2 complete copies for your archive.
If the data is truly important to you, it's worth the expense.
Magnetic tape has been proven to be the most reliable long-term storage, short of carving your data into stone monoliths.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355140</id>
	<title>Tape... still best solution....</title>
	<author>Fallen Kell</author>
	<datestamp>1267645440000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>I know you said a tape library is impractical, but depending on how many customers you have, it may still be your best solution. An LTO-4 tape drive with a small multi-tape bay and robotic picker really isn't that expensive (in the scheme of things) anymore. You can get a small-sized LTO-4 vault/autoloader (24 slots) from Oracle/Sun for about $5k. Tapes are around $40-50 apiece and store 1.6TB each, which is a heck of a lot cheaper than hard drives for the size. So for your 2-3TB you would need two $50 tapes versus seven $70 500GB hard drives, for a savings of almost $400 on a single customer's data set. After a dozen or so data sets you've paid for your tape autoloader, and you're using a proven, safe long-term storage solution.</htmltext>
<tokenext>I know you said a tape library is impractical , but depending on how many customers you have , it may still be your best solution .
An LTO-4 tape drive with a small multi-tape bay and robotic picker really is n't that expensive ( in the scheme of things ) anymore .
You can get a small-sized LTO-4 vault/autoloader ( 24 slots ) from Oracle/Sun for about $ 5k .
Tapes are around $ 40-50 apiece and store 1.6TB each , which is a heck of a lot cheaper than hard drives for the size .
So for your 2-3TB you would need two $ 50 tapes versus seven $ 70 500GB hard drives , for a savings of almost $ 400 on a single customer 's data set .
After a dozen or so data sets you 've paid for your tape autoloader , and you 're using a proven , safe long-term storage solution .</tokentext>
<sentencetext>I know you said a tape library is impractical, but depending on how many customers you have, it may still be your best solution.
An LTO-4 tape drive with a small multi-tape bay and robotic picker really isn't that expensive (in the scheme of things) anymore.
You can get a small-sized LTO-4 vault/autoloader (24 slots) from Oracle/Sun for about $5k.
Tapes are around $40-50 apiece and store 1.6TB each, which is a heck of a lot cheaper than hard drives for the size.
So for your 2-3TB you would need two $50 tapes versus seven $70 500GB hard drives, for a savings of almost $400 on a single customer's data set.
After a dozen or so data sets you've paid for your tape autoloader, and you're using a proven, safe long-term storage solution.</sentencetext>
</comment>
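The arithmetic above checks out to within rounding; a quick sanity check using the quoted prices:

# Quick check of the disk-vs-tape arithmetic above (prices as quoted).
TAPE_PRICE, TAPES_PER_DATASET = 50, 2   # $50 LTO-4 tapes, 1.6TB each
DISK_PRICE, DISKS_PER_DATASET = 70, 7   # $70 500GB hard drives
AUTOLOADER = 5000                       # 24-slot LTO-4 autoloader

saving = DISK_PRICE * DISKS_PER_DATASET - TAPE_PRICE * TAPES_PER_DATASET
print(f"saving per dataset: ${saving}")             # $390, i.e. "almost $400"
breakeven = -(-AUTOLOADER // saving)                # ceiling division
print(f"datasets to pay off the autoloader: {breakeven}")  # 13, a dozen or so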
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504</id>
	<title>Re:Exactly what you're doing</title>
	<author>forgottenusername</author>
	<datestamp>1267616040000</datestamp>
	<modclass>Interestin</modclass>
	<modscore>5</modscore>
	<htmltext><p>I don't think it's a great solution. You're storing relatively fragile hard drives in a RAID5 configuration in a lock box? It's not like you can tell if one of the drives goes bad and needs to be replaced when it's sitting in a box. You'd have to regularly pull the data sets out, fire them up and make sure everything is still functional.</p><p>I'd at least want to do 2 complete sets of mirrored drives.</p><p>Tape storage does store better.</p><p>Depending on how important the data is, I might do something like a local mirrored drive set in storage and an online copy at something like rsync.net - stay away from S3, it's not designed to protect data, despite what AWS fans may say.</p></htmltext>
<tokenext>I do n't think it 's a great solution .
You 're storing relatively fragile hard drives in a RAID5 configuration in a lock box ?
It 's not like you can tell if one of the drives goes bad and needs to be replaced when it 's sitting in a box .
You 'd have to regularly pull the data sets out , fire them up and make sure everything is still functional .
I 'd at least want to do 2 complete sets of mirrored drives .
Tape storage does store better .
Depending on how important the data is , I might do something like a local mirrored drive set in storage and an online copy at something like rsync.net - stay away from S3 , it 's not designed to protect data , despite what AWS fans may say .</tokentext>
<sentencetext>I don't think it's a great solution.
You're storing relatively fragile hard drives in a RAID5 configuration in a lock box?
It's not like you can tell if one of the drives goes bad and needs to be replaced when it's sitting in a box.
You'd have to regularly pull the data sets out, fire them up and make sure everything is still functional.
I'd at least want to do 2 complete sets of mirrored drives.
Tape storage does store better.
Depending on how important the data is, I might do something like a local mirrored drive set in storage and an online copy at something like rsync.net - stay away from S3, it's not designed to protect data, despite what AWS fans may say.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351448</id>
	<title>Practice your Recovery Method</title>
	<author>abfan1127</author>
	<datestamp>1267615740000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>If you go RAID5, have a known method for recovering if a drive fails. Actually perform a recovery before pushing it into service. I say this because some RAID5 cards use nonstandard methods making recovery very difficult and expensive. I'd also consider a process to transfer your datasets to new drives periodically so as not to lose your data.</htmltext>
<tokenext>If you go RAID5 , have a known method for recovering if a drive fails .
Actually perform a recovery before pushing it into service .
I say this because some RAID5 cards use nonstandard methods making recovery very difficult and expensive .
I 'd also consider a process to transfer your datasets to new drives periodically so as not to lose your data .</tokentext>
<sentencetext>If you go RAID5, have a known method for recovering if a drive fails.
Actually perform a recovery before pushing it into service.
I say this because some RAID5 cards use nonstandard methods making recovery very difficult and expensive.
I'd also consider a process to transfer your datasets to new drives periodically so as not to lose your data.</sentencetext>
</comment>
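On Linux software RAID, the rehearsal recommended above can be scripted with mdadm's manage mode. A sketch to run only on a scratch array; the array and member device names are placeholders, and this covers the md/Linux case rather than the proprietary RAID cards the comment warns about:

#!/usr/bin/env python3
"""Rehearse a RAID5 member failure on Linux software RAID (mdadm).
Device names are placeholders; run only on a scratch array."""
import subprocess

ARRAY, MEMBER = "/dev/md0", "/dev/sdb1"

def run(*cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Mark a member failed, pull it, then re-add it and watch the rebuild.
run("mdadm", ARRAY, "--fail", MEMBER)
run("mdadm", ARRAY, "--remove", MEMBER)
run("mdadm", ARRAY, "--add", MEMBER)
run("mdadm", "--detail", ARRAY)  # shows rebuild progress and array state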
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352894</id>
	<title>Re:Exactly what you're doing</title>
	<author>newdsfornerds</author>
	<datestamp>1267623600000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Why is tape storage better? Cheaper than drives per GB? Certainly tapes are more rugged when it comes to g-force shock, but aside from that why are they superior?</htmltext>
<tokenext>Why is tape storage better ?
Cheaper than drives per GB ?
Certainly tapes are more rugged when it comes to g-force shock , but aside from that why are they superior ?</tokentext>
<sentencetext>Why is tape storage better?
Cheaper than drives per GB?
Certainly tapes are more rugged when it comes to g-force shock, but aside from that why are they superior?</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31370142</id>
	<title>Re:For what it's worth</title>
	<author>pnutjam</author>
	<datestamp>1267799940000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Tape would probably have fared worse.</htmltext>
<tokenext>Tape would probably have fared worse .</tokentext>
<sentencetext>Tape would probably have fared worse.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351626</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353266</id>
	<title>LTO-4 tape</title>
	<author>ediacaran</author>
	<datestamp>1267626900000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>I don't recommend using hard drives for archival purposes; they're not rated for that. Have you considered LTO-4 tapes? The price per GB is comparable, and they are intended for archival storage. You could ship them to an offsite location or make two copies and send to two locations if you're paranoid. Setup is relatively cheap if you just buy a tape drive, somewhat more if you buy fancy software and a tape robot.</p></htmltext>
<tokenext>I do n't recommend using hard drives for archival purposes ; they 're not rated for that .
Have you considered LTO-4 tapes ? The price per GB is comparable , and they are intended for archival storage .
You could ship them to an offsite location or make two copies and send to two locations if you 're paranoid .
Setup is relatively cheap if you just buy a tape drive , somewhat more if you buy fancy software and a tape robot .</tokentext>
<sentencetext>I don't recommend using hard drives for archival purposes; they're not rated for that.
Have you considered LTO-4 tapes? The price per GB is comparable, and they are intended for archival storage.
You could ship them to an offsite location or make two copies and send to two locations if you're paranoid.
Setup is relatively cheap if you just buy a tape drive, somewhat more if you buy fancy software and a tape robot.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352996</id>
	<title>RAR Compression with Parity Files?</title>
	<author>dthardcore</author>
	<datestamp>1267624380000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>I don't know exactly what you are looking for, but what about using RAR files to split up the data and using parity files to ensure the data can be restored properly? This way you would avoid having to create a RAID array to have enough space to store the files. Plus, it will limit your chances of accidentally losing the data when trying to recreate the RAID array. You could then store these files on whatever media you want (SSD, Blu-ray, HDD, tapes etc.). This would also make it easier to use multiple media sets for redundancy, say store your main copy on a couple of HDDs and then have a Blu-ray backup set just in case.

Being a network engineer myself, I still strongly believe you should be using a tape drive/library for backing up these files, as the media will last a long time. You could even opt for the cheaper LTO-2 drive to save some money, at the cost of a few more tapes. Depending on how well your dataset compresses, an LTO-2 tape will hold up to 400GB (compressed) or 200GB (uncompressed). I personally like EMC's Networker product for running backups, but since price is an issue you might want to look at a product like ARCserve or an open source product instead.</htmltext>
<tokenext>I do n't know exactly what you are looking for , but what about using RAR files to split up the data and using parity files to ensure the data can be restored properly ?
This way you would avoid having to create a RAID array to have enough space to store the files .
Plus , it will limit your chances of accidentally losing the data when trying to recreate the RAID array .
You could then store these files on whatever media you want ( SSD , Blu-ray , HDD , tapes etc . ) .
This would also make it easier to use multiple media sets for redundancy , say store your main copy on a couple of HDDs and then have a Blu-ray backup set just in case .
Being a network engineer myself , I still strongly believe you should be using a tape drive/library for backing up these files , as the media will last a long time .
You could even opt for the cheaper LTO-2 drive to save some money , at the cost of a few more tapes .
Depending on how well your dataset compresses , an LTO-2 tape will hold up to 400GB ( compressed ) or 200GB ( uncompressed ) .
I personally like EMC 's Networker product for running backups , but since price is an issue you might want to look at a product like ARCserve or an open source product instead .</tokentext>
<sentencetext>I don't know exactly what you are looking for, but what about using RAR files to split up the data and using parity files to ensure the data can be restored properly?
This way you would avoid having to create a RAID array to have enough space to store the files.
Plus, it will limit your chances of accidentally losing the data when trying to recreate the RAID array.
You could then store these files on whatever media you want (SSD, Blu-ray, HDD, tapes etc.).
This would also make it easier to use multiple media sets for redundancy, say store your main copy on a couple of HDDs and then have a Blu-ray backup set just in case.
Being a network engineer myself, I still strongly believe you should be using a tape drive/library for backing up these files, as the media will last a long time.
You could even opt for the cheaper LTO-2 drive to save some money, at the cost of a few more tapes.
Depending on how well your dataset compresses, an LTO-2 tape will hold up to 400GB (compressed) or 200GB (uncompressed).
I personally like EMC's Networker product for running backups, but since price is an issue you might want to look at a product like ARCserve or an open source product instead.</sentencetext>
</comment>
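A sketch of the split-plus-parity idea above, assuming the rar and par2 (par2cmdline) command-line tools are installed; the file names and the 10% redundancy figure are illustrative:

#!/usr/bin/env python3
"""Split a dataset into RAR volumes and add PAR2 recovery data.
Assumes 'rar' and 'par2' (par2cmdline) are installed; names and the
redundancy level are illustrative."""
import glob
import subprocess

# 1GB volumes; recent rar builds name them dataset.part1.rar, dataset.part2.rar, ...
# (older builds take the volume size in kilobytes, e.g. -v1000000k)
subprocess.run(["rar", "a", "-v1g", "dataset.rar", "dataset/"], check=True)

volumes = sorted(glob.glob("dataset.part*.rar"))
# 10% recovery data lets par2 rebuild up to 10% damaged or missing blocks.
subprocess.run(["par2", "create", "-r10", "dataset.par2", *volumes], check=True)

# Later, on the archive copy, check integrity before trusting it:
subprocess.run(["par2", "verify", "dataset.par2"], check=True)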
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351604</id>
	<title>use a tape drive</title>
	<author>Lehk228</author>
	<datestamp>1267616400000</datestamp>
	<modclass>Informativ</modclass>
	<modscore>3</modscore>
	<htmltext>You make the assertion that a tape archive would be impractical, but really it is the most practical solution.  The drive will set you back a couple thousand, but 800 gig tapes are only around 40 bucks each, and they are engineered for data storage, unlike hard drives.  This will only cost $160 per 3 TB dataset (four tapes), or $200 if you use par2 files and an extra tape to make it recoverable in case a tape does fail.</htmltext>
<tokenext>You make the assertion that a tape archive would be impractical , but really it is the most practical solution .
The drive will set you back a couple thousand , but 800 gig tapes are only around 40 bucks each , and they are engineered for data storage , unlike hard drives .
This will only cost $ 160 per 3 TB dataset ( four tapes ) , or $ 200 if you use par2 files and an extra tape to make it recoverable in case a tape does fail .</tokentext>
<sentencetext>You make the assertion that a tape archive would be impractical, but really it is the most practical solution.
The drive will set you back a couple thousand, but 800 gig tapes are only around 40 bucks each, and they are engineered for data storage, unlike hard drives.
This will only cost $160 per 3 TB dataset (four tapes), or $200 if you use par2 files and an extra tape to make it recoverable in case a tape does fail.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351524</id>
	<title>Re:Tape is your friend</title>
	<author>cruff</author>
	<datestamp>1267616100000</datestamp>
	<modclass>Informativ</modclass>
	<modscore>3</modscore>
	<htmltext><p>I agree, when the tapes are stored in proper environmental conditions.  You don't need a library, just use some stand alone tape drives.  Also look at the claimed media lifetime and recovered bit error rate figures to see if you are choosing the right tape drive/media.</p></htmltext>
<tokenext>I agree , when the tapes are stored in proper environmental conditions .
You do n't need a library , just use some stand alone tape drives .
Also look at the claimed media lifetime and recovered bit error rate figures to see if you are choosing the right tape drive/media .</tokentext>
<sentencetext>I agree, when the tapes are stored in proper environmental conditions.
You don't need a library, just use some stand alone tape drives.
Also look at the claimed media lifetime and recovered bit error rate figures to see if you are choosing the right tape drive/media.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355320</id>
	<title>Removable Disk</title>
	<author>Bondo0891</author>
	<datestamp>1267734000000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>If you have a general preference for disk vs. tape, you might want to check out a ruggedized removable disk solution such as RDX.  It can be safely handled/dropped like tape, is quickly replacing the low end tape technologies like DAT, and is sold by all the major server manufacturers (Dell (as RD1000), HP, IBM, and others).  You can find more info at <a href="http://www.rdxstorage.com/" title="rdxstorage.com" rel="nofollow">http://www.rdxstorage.com/</a> [rdxstorage.com].  If you are looking for more of a managed solution, the InfiniVault product which also uses RDX basically shows up as an infinite capacity NAS device and is designed for specifically this purpose.  You can find more at <a href="http://www.prostorsystems.com/" title="prostorsystems.com" rel="nofollow">http://www.prostorsystems.com/</a> [prostorsystems.com].</htmltext>
<tokenext>If you have a general preference for disk vs. tape , you might want to check out a ruggedized removable disk solution such as RDX .
It can be safely handled/dropped like tape , is quickly replacing the low end tape technologies like DAT , and is sold by all the major server manufacturers ( Dell ( as RD1000 ) , HP , IBM , and others ) .
You can find more info at http : //www.rdxstorage.com/ [ rdxstorage.com ] .
If you are looking for more of a managed solution , the InfiniVault product which also uses RDX basically shows up as an infinite capacity NAS device and is designed for specifically this purpose .
You can find more at http : //www.prostorsystems.com/ [ prostorsystems.com ] .</tokentext>
<sentencetext>If you have a general preference for disk vs. tape, you might want to check out a ruggedized removable disk solution such as RDX.
It can be safely handled/dropped like tape, is quickly replacing the low end tape technologies like DAT, and is sold by all the major server manufacturers (Dell (as RD1000), HP, IBM, and others).
You can find more info at http://www.rdxstorage.com/ [rdxstorage.com].
If you are looking for more of a managed solution, the InfiniVault product which also uses RDX basically shows up as an infinite capacity NAS device and is designed for specifically this purpose.
You can find more at http://www.prostorsystems.com/ [prostorsystems.com].</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351778</id>
	<title>Re:bzip2</title>
	<author>mrmeval</author>
	<datestamp>1267617180000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>It would take approximately 5242.88 pages to store 1 gigabyte (about 200 KB per page). This comes up from time to time. Laser-printed pages will not store well over time: the toner degrades, and if the pages are stacked together it can fuse them, losing all the sheets. Some inkjets have ink that will not glue the pages together, but some ink will migrate, and some is a nutrient source for bacteria.</p><p>One of the better printed codes I've seen is this: <a href="http://microglyphs.com/english/html/dataglyphs.shtml" title="microglyphs.com">http://microglyphs.com/english/html/dataglyphs.shtml</a> [microglyphs.com] As an added bonus, this coding can be printed with varying widths so that the data is encoded yet what is displayed is a photo. This is a form of steganography.</p><p>So you could get your local newspaper to print the data and still have a photo for people to view, and your data would be stored for however long that newspaper edition would be archived.</p></htmltext>
<tokenext>It would take approximately 5242.88 pages to store 1 gigabyte ( about 200 KB per page ) .
This comes up from time to time .
Laser-printed pages will not store well over time : the toner degrades , and if the pages are stacked together it can fuse them , losing all the sheets .
Some inkjets have ink that will not glue the pages together , but some ink will migrate , and some is a nutrient source for bacteria .
One of the better printed codes I 've seen is this : http : //microglyphs.com/english/html/dataglyphs.shtml [ microglyphs.com ] As an added bonus , this coding can be printed with varying widths so that the data is encoded yet what is displayed is a photo .
This is a form of steganography .
So you could get your local newspaper to print the data and still have a photo for people to view , and your data would be stored for however long that newspaper edition would be archived .</tokentext>
<sentencetext>It would take approximately 5242.88 pages to store 1 gigabyte (about 200 KB per page).
This comes up from time to time.
Laser-printed pages will not store well over time: the toner degrades, and if the pages are stacked together it can fuse them, losing all the sheets.
Some inkjets have ink that will not glue the pages together, but some ink will migrate, and some is a nutrient source for bacteria.
One of the better printed codes I've seen is this: http://microglyphs.com/english/html/dataglyphs.shtml [microglyphs.com] As an added bonus, this coding can be printed with varying widths so that the data is encoded yet what is displayed is a photo.
This is a form of steganography.
So you could get your local newspaper to print the data and still have a photo for people to view, and your data would be stored for however long that newspaper edition would be archived.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351344</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31356612</id>
	<title>Re:IT Auditor Opinion</title>
	<author>Stonefish</author>
	<datestamp>1267707720000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>I think a risk assessment would be the best way to proceed. That way you can develop a solution based upon somebody's perception of risk, appropriately skewed. The reality is that your business is the most likely failure mode, so a best-effort mechanism for storage is appropriate.</p></htmltext>
<tokenext>I think a risk assessment would be the best way to proceed .
That way you can develop a solution based upon somebody 's perception of risk , appropriately skewed .
The reality is that your business is the most likely failure mode , so a best-effort mechanism for storage is appropriate .</tokentext>
<sentencetext>I think a risk assessment would be the best way to proceed.
That way you can develop a solution based upon somebody's perception of risk, appropriately skewed.
The reality is that your business is the most likely failure mode, so a best-effort mechanism for storage is appropriate.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352838</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31361056</id>
	<title>Re:Exactly what you're doing</title>
	<author>jon3k</author>
	<datestamp>1267732800000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>We had a guy who thought the best way to back up the datacenter years ago was to take 1 of the RAID1 drives out of every server, put them into a large padded/locked Pelican case and store it offsite.  We spent weeks begging him not to do this.
<br> <br>
We lost about 30-40 Ultra320 SCSI disks in a year (riding in vans is not good for hard drives, who knew!) along with about 10 disk backplanes in HP servers (from the drives swapping in and out).
<br> <br>
I'm glad you had good luck.  I would not trust customers' data with what you've described.  I'd put an encrypted copy on two tapes and store them at two different locations.  To each his own, I suppose.</htmltext>
<tokenext>We had a guy who thought the best way to back up the datacenter years ago was to take 1 of the RAID1 drives out of every server , put them into a large padded/locked Pelican case and store it offsite .
We spent weeks begging him not to do this .
We lost about 30-40 Ultra320 SCSI disks in a year ( riding in vans is not good for hard drives , who knew ! ) along with about 10 disk backplanes in HP servers ( from the drives swapping in and out ) .
I 'm glad you had good luck .
I would not trust customers ' data with what you 've described .
I 'd put an encrypted copy on two tapes and store them at two different locations .
To each his own , I suppose .</tokentext>
<sentencetext>We had a guy who thought the best way to back up the datacenter years ago was to take 1 of the RAID1 drives out of every server, put them into a large padded/locked Pelican case and store it offsite.
We spent weeks begging him not to do this.
We lost about 30-40 Ultra320 SCSI disks in a year (riding in vans is not good for hard drives, who knew!) along with about 10 disk backplanes in HP servers (from the drives swapping in and out).
I'm glad you had good luck.
I would not trust customers' data with what you've described.
I'd put an encrypted copy on two tapes and store them at two different locations.
To each his own, I suppose.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353102</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353014</id>
	<title>Re:For what it's worth</title>
	<author>Anonymous</author>
	<datestamp>1267624440000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>You should ask the bank if the safe deposit box chassis is grounded. It's a very reasonable question nowadays.</p></htmltext>
<tokenext>You should ask the bank if the safe deposit box chassis is grounded .
It 's a very reasonable question nowadays .</tokentext>
<sentencetext>You should ask the bank if the safe deposit box chassis is grounded.
It's a very reasonable question nowadays.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351626</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354010</id>
	<title>Check out HDFS</title>
	<author>gyromastar</author>
	<datestamp>1267632540000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>I work for a small business where we needed to store ~15TB on the cheap. We decided to go with a distributed filesystem called HDFS. It's vastly cheaper than large HA RAID arrays because the filesystem itself handles the redundancy by creating 3 copies of each file across the cluster. SATA works fine for HDFS, so you don't need to pay the premium for SCSI.

Since most of our data happens to be log data, we use Facebook's Scribe to aggregate the logs and throw them onto HDFS for long-term storage. It's not the most traditional storage, nor the easiest to work with, but it's definitely easy to scale out because all you have to do is add more nodes to the cluster. Hope that helps.</htmltext>
<tokenext>I work for a small business where we needed to store ~ 15TB on the cheap .
We decided to go with a distributed filesystem called HDFS .
It 's vastly cheaper than large HA RAID arrays because the filesystem itself handles the redundancy by creating 3 copies of each file across the cluster .
SATA works fine for HDFS , so you do n't need to pay the premium for SCSI .
Since most of our data happens to be log data , we use Facebook 's Scribe to aggregate the logs and throw them onto HDFS for long-term storage .
It 's not the most traditional storage , nor the easiest to work with , but it 's definitely easy to scale out because all you have to do is add more nodes to the cluster .
Hope that helps .</tokentext>
<sentencetext>I work for a small business where we needed to store ~15TB on the cheap.
We decided to go with a distributed filesystem called HDFS.
It's vastly cheaper than large HA RAID arrays because the filesystem itself handles the redundancy by creating 3 copies of each file across the cluster.
SATA works fine for HDFS, so you don't need to pay the premium for SCSI.
Since most of our data happens to be log data, we use Facebook's Scribe to aggregate the logs and throw them onto HDFS for long-term storage.
It's not the most traditional storage, nor the easiest to work with, but it's definitely easy to scale out because all you have to do is add more nodes to the cluster.
Hope that helps.</sentencetext>
</comment>
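For reference, HDFS replication is set per file and defaults to 3. A sketch of pushing an archive in and pinning its replication, assuming a configured Hadoop client (modern clients spell the command "hdfs dfs"; the 2010-era equivalent was "hadoop fs") and a hypothetical /archive path:

#!/usr/bin/env python3
"""Push an archive into HDFS and set its replication factor.
Assumes a configured Hadoop client on the PATH; paths are hypothetical."""
import subprocess

LOCAL = "dataset-2010-03.tar.gz"
REMOTE = "/archive/dataset-2010-03.tar.gz"

subprocess.run(["hdfs", "dfs", "-mkdir", "-p", "/archive"], check=True)
subprocess.run(["hdfs", "dfs", "-put", LOCAL, REMOTE], check=True)
# -setrep -w 3: require 3 copies and wait until replication is satisfied.
subprocess.run(["hdfs", "dfs", "-setrep", "-w", "3", REMOTE], check=True)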
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31356570</id>
	<title>Big disks and ATAoE</title>
	<author>Stonefish</author>
	<datestamp>1267706940000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Tape sucks: the reality is that it doesn't last, and the drives themselves are fragile and prone to failure. Firmware changes in what appears to be the same product may make tapes unreadable, and they are expensive. Whilst the tape companies survive, they're not thriving; they're living on pure inertia. This may change if a new technology appears, but...</p><p>Hard disk size will simplify your problem in a couple of years. A 5 TB drive will give you appropriate storage, and you can mirror this and offer the other one to the client. That way they are complicit in any data loss scenario.</p><p>If you're moving those amounts of data around your network, I assume that you've got at least Gig Ethernet. You can leverage this by using ATAoE as the basis of network storage and either buy Coraid's disk enclosures or build one and export it using Qaoed or vblade http://aoetools.sourceforge.net/. In terms of bang for buck this is one of the cheapest network storage options.</p></htmltext>
<tokenext>Tape sucks : the reality is that it does n't last , and the drives themselves are fragile and prone to failure .
Firmware changes in what appears to be the same product may make tapes unreadable , and they are expensive .
Whilst the tape companies survive , they 're not thriving ; they 're living on pure inertia .
This may change if a new technology appears , but ...
Hard disk size will simplify your problem in a couple of years .
A 5 TB drive will give you appropriate storage , and you can mirror this and offer the other one to the client .
That way they are complicit in any data loss scenario .
If you 're moving those amounts of data around your network , I assume that you 've got at least Gig Ethernet .
You can leverage this by using ATAoE as the basis of network storage and either buy Coraid 's disk enclosures or build one and export it using Qaoed or vblade http : //aoetools.sourceforge.net/ .
In terms of bang for buck this is one of the cheapest network storage options .</tokentext>
<sentencetext>Tape sucks: the reality is that it doesn't last, and the drives themselves are fragile and prone to failure.
Firmware changes in what appears to be the same product may make tapes unreadable, and they are expensive.
Whilst the tape companies survive, they're not thriving; they're living on pure inertia.
This may change if a new technology appears, but...
Hard disk size will simplify your problem in a couple of years.
A 5 TB drive will give you appropriate storage, and you can mirror this and offer the other one to the client.
That way they are complicit in any data loss scenario.
If you're moving those amounts of data around your network, I assume that you've got at least Gig Ethernet.
You can leverage this by using ATAoE as the basis of network storage and either buy Coraid's disk enclosures or build one and export it using Qaoed or vblade http://aoetools.sourceforge.net/.
In terms of bang for buck this is one of the cheapest network storage options.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31417258</id>
	<title>Re:Exactly what you're doing</title>
	<author>Hurricane78</author>
	<datestamp>1268164620000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>So in essence the best thing to do is to use ZFS with regular scrubbing on mirrored drives (triple-mirrored is best, as proven in aerospace design), without ever changing your running system, so it can&rsquo;t break; then move the data out to whatever replaces it if the stream of replacement parts <em>starts</em> to run dry.</p><p>That&rsquo;s the problem with every non-live backup solution, including paper: You never know if you&rsquo;ll get your data back, unless you try. And if you do, then you can run a live system that automates it, anyway.</p></htmltext>
<tokenext>So in essence the best thing to do is to use ZFS with regular scrubbing on mirrored drives ( triple-mirrored is best , as proven in aerospace design ) , without ever changing your running system , so it ca n't break ; then move the data out to whatever replaces it if the stream of replacement parts starts to run dry .
That 's the problem with every non-live backup solution , including paper : You never know if you 'll get your data back , unless you try .
And if you do , then you can run a live system that automates it , anyway .</tokentext>
<sentencetext>So in essence the best thing to do is to use ZFS with regular scrubbing on mirrored drives (triple-mirrored is best, as proven in aerospace design), without ever changing your running system, so it can’t break; then move the data out to whatever replaces it if the stream of replacement parts starts to run dry.
That’s the problem with every non-live backup solution, including paper: You never know if you’ll get your data back, unless you try.
And if you do, then you can run a live system that automates it, anyway.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504</parent>
</comment>
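The "regular scrubbing" step above is a one-liner on ZFS and easy to put in cron. A sketch, assuming the ZFS tools are installed; the pool name "tank" is a placeholder:

#!/usr/bin/env python3
"""Kick off a ZFS scrub and report pool health. Assumes ZFS tools
are installed; the pool name is a placeholder."""
import subprocess

POOL = "tank"

# Walk every block, verify checksums, and repair from mirror copies.
subprocess.run(["zpool", "scrub", POOL], check=True)
# Scrubs run in the background; 'zpool status' shows progress and
# any repaired or unrecoverable errors found so far.
subprocess.run(["zpool", "status", POOL], check=True)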
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31358572</id>
	<title>Re:Drobo fan and user</title>
	<author>Anonymous</author>
	<datestamp>1267721820000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Drobo = closed system.  Drobo dies (or has a leap year problem) and you are screwed.</p><p>Avoid them.  They are simple solutions for people who are:<br>1) too cheap to get a good solution<br>2) not technical enough to roll their own solution<br>3) not savvy enough to know they are at risk</p><p>Me, I'm too cheap to get a good solution and too lazy to roll my own, but I know I'm at risk &amp; a bottle of single malt is cheaper than 2x2TB external drives<nobr> <wbr></nobr>:)</p></htmltext>
<tokenext>Drobo = closed system .
Drobo dies ( or has a leap year problem ) and you are screwed .
Avoid them .
They are simple solutions for people who are : 1 ) too cheap to get a good solution 2 ) not technical enough to roll their own solution 3 ) not savvy enough to know they are at risk
Me , I 'm too cheap to get a good solution and too lazy to roll my own , but I know I 'm at risk &amp; a bottle of single malt is cheaper than 2x2TB external drives : )</tokentext>
<sentencetext>Drobo = closed system.
Drobo dies (or has a leap year problem) and you are screwed.
Avoid them.
They are simple solutions for people who are: 1) too cheap to get a good solution 2) not technical enough to roll their own solution 3) not savvy enough to know they are at risk
Me, I'm too cheap to get a good solution and too lazy to roll my own, but I know I'm at risk &amp; a bottle of single malt is cheaper than 2x2TB external drives :)</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351502</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351610</id>
	<title>you need archiving and not a backup tool</title>
	<author>Anonymous</author>
	<datestamp>1267616460000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>What you need is a database archiving solution (HP offers such a tool). With this you can export your databases into XML files using the archiving tool, without actually having to keep copies of the databases themselves. The tool will also let you recreate a database from its XML files.</p></htmltext>
<tokenext>What you need is a database archiving solution ( HP offers such a tool ) .
With this you can export your databases into XML files using the archiving tool , without actually having to keep copies of the databases themselves .
The tool will also let you recreate a database from its XML files .</tokentext>
<sentencetext>What you need is a database archiving solution (HP offers such a tool).
With this you can export your databases into XML files using the archiving tool, without actually having to keep copies of the databases themselves.
The tool will also let you recreate a database from its XML files.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357846</id>
	<title>Re:use a tape drive</title>
	<author>Anonymous</author>
	<datestamp>1267717620000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>"..and they are engineered for data storage unlike hard drives."</p><p>Hm, what was my hard drive engineered for if not to store data?  Was it engineered to play music instead?</p></htmltext>
<tokenext>" ..and they are engineered for data storage unlike hard drives .
" Hm , what was my hard drive engineered for if not to store data ?
Was it engineered to play music instead ?</tokentext>
<sentencetext>"..and they are engineered for data storage unlike hard drives.
"Hm, what was my hard drive engineered for if not to store data?
Was it engineered to play music instead?</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351604</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357348</id>
	<title>I heard of...</title>
	<author>hesaigo999ca</author>
	<datestamp>1267714320000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>There was a story on Slashdot 10 years ago about a guy who developed a way of putting 50GB of data on one sheet of regular paper, based on an algorithm using shapes and colors...I am uncertain about the degradation of the paper over time, but you may today be able to buy into that technology...although I am uncertain as to where or how....</p></htmltext>
<tokenext>There was a story on Slashdot 10 years ago about a guy who developed a way of putting 50GB of data on one sheet of regular paper , based on an algorithm using shapes and colors...I am uncertain about the degradation of the paper over time , but you may today be able to buy into that technology...although I am uncertain as to where or how... .</tokentext>
<sentencetext>There was a story on Slashdot 10 years ago about a guy who developed a way of putting 50GB of data on one sheet of regular paper, based on an algorithm using shapes and colors...I am uncertain about the degradation of the paper over time, but you may today be able to buy into that technology...although I am uncertain as to where or how....</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352674</id>
	<title>Let someone else worry about it</title>
	<author>Turzyx</author>
	<datestamp>1267622040000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Rent some decent off-site storage at an established data centre and get a leased data line.
<br> <br>
Don't bother messing about with tapes; it will be a full-time job maintaining the library, and space will be an issue after a while - I presume this is why you think it is impractical. With a proper data centre, you shouldn't have to worry about drives failing or the storage medium degrading due to age; most offer multiple-site redundancy as well.
<br> <br>
Seriously, don't get clever; save yourself the hassle and your business' reputation if something goes wrong with your 'lockbox' method.</htmltext>
<tokenext>Rent some decent off-site storage at an established data centre and get a leased data line .
Do n't bother messing about with tapes ; it will be a full-time job maintaining the library , and space will be an issue after a while - I presume this is why you think it is impractical .
With a proper data centre , you should n't have to worry about drives failing or the storage medium degrading due to age ; most offer multiple-site redundancy as well .
Seriously , do n't get clever ; save yourself the hassle and your business ' reputation if something goes wrong with your 'lockbox ' method .</tokentext>
<sentencetext>Rent some decent off-site storage at an established data centre and get a leased data line.
Don't bother messing about with tapes; it will be a full-time job maintaining the library, and space will be an issue after a while - I presume this is why you think it is impractical.
With a proper data centre, you shouldn't have to worry about drives failing or the storage medium degrading due to age; most offer multiple-site redundancy as well.
Seriously, don't get clever; save yourself the hassle and your business' reputation if something goes wrong with your 'lockbox' method.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355856</id>
	<title>Cost effective is a customer decision, not yours</title>
	<author>slincolne</author>
	<datestamp>1267697700000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Talk to your customer, and ask them if you should be keeping copies of their data. <p>
If they don't want you to, problem solved.</p><p>
If they do, ask them how badly they want to (ie $$$)</p><p>
If they won't pay the costs, problem solved</p><p>
If they will, again problem solved</p><p>
My advice would be to cost out the price of a proper tape library - I know you said you don't want to do that, but honestly, if you don't want to do it properly you are taking a huge risk by cutting corners.  Tape is one of the best archival forms of storage (you've seen how many people are doing it), and doing it on disk takes more work and effort than you can probably manage.  If the customers want multiple copies kept, then charge it at a sensible price and use a commercial storage firm.</p><p>
It may sound strange, but if you do it properly while you are a small business, then you will have fewer problems when you become a large business.  If you are trying to grow your own solution then you are effectively expanding your menu of services to scientific and storage services, or potentially risking becoming a small business that failed.</p></htmltext>
<tokenext>Talk to your customer , and ask them if you should be keeping copies of their data .
If they do n't want you to , problem solved .
If they do , ask them how badly they want to ( ie $ $ $ ) . If they wo n't pay the costs , problem solved . If they will , again problem solved . My advice would be to cost out the price of a proper tape library - I know you said you do n't want to do that , but honestly if you do n't want to do it properly you are taking a huge risk by cutting corners .
Tape is one of the best archival forms of storage ( you 've seen how many people are doing it ) and to do it on disk takes more work and effort than you can probably manage .
If the customers want multiple copies kept , then charge it at a sensible price and use a commercial storage firm .
It may sound strange , but if you do it properly while you are a small business , then you will have fewer problems when you become a large business .
If you are trying to grow your own solution then you are effectively expanding your menu of services to scientific and storage services , or potentially risking becoming a small business that failed .</tokentext>
<sentencetext>Talk to your customer, and ask them if you should be keeping copies of their data.
If they don't want you to, problem solved.
If they do, ask them how badly they want to (ie $$$)
If they won't pay the costs, problem solved
If they will, again problem solved
My advice would be to cost out the price of a proper tape library - I know you said you don't want to do that, but honestly if you don't want to do it properly you are taking a huge risk by cutting corners.
Tape is one of the best archival forms of storage (you've seen how many people are doing it) and to do it on disk takes more work and effort than you can probably manage.
If the customers want multiple copies kept, then charge it at a sensible price and use a commercial storage firm.
It may sound strange, but if you do it properly while you are a small business, then you will have fewer problems when you become a large business.
If you are trying to grow your own solution then you are effectively expanding your menu of services to scientific and storage services, or potentially risking becoming a small business that failed.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353100</id>
	<title>Collecting data sets or generating data sets?</title>
	<author>Anonymous</author>
	<datestamp>1267625220000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>And under what conditions do they need to be archived?  Everyone else is busy spewing technical and product solutions.  However, archival is more dependent upon issues like "Is this historical data collected for a scientific study?"  If so, you want to chat with groups like photo astronomers as to what they do for data archival.  Whatever archival media you choose will be obsolete within a few years and you'll need to regularly migrate the data as new technology is adopted.  If it's generated through computation and simulation and is likely to be requested again in the near future, invest in the infrastructure to keep it spinning until it's cheaper to recompute it.  For cases where it is needed infrequently, tape with an offsite storage company is the better bet.  Remember that with tape you should be doing random pulls monthly to make sure you can actually retrieve the data.</p><p>And it's probably worth punting to your comptroller and legal department.  If the liability for having just discarded the data is less than the cost to archive it properly, they're going to tell you to delete it after the customer says it's good.  Plus if archival is important, they have to figure out how to drop the cost onto your customers anyways.</p><p>Now when it actually comes time to decide technology... pick the vendor that does the best job of taking you and your peers out to lunch regularly.  Their product, like everyone else's, will fail to meet expectations, but you'll get lunch.</p></htmltext>
<tokenext>And under what conditions do they need to be archived ?
Everyone else is busy spewing technical and product solutions .
However archival is more dependent upon issues like " Is this historical data collected for a scientific study ?
" If so , you want to chat with groups like photo astronomers as to what they do for data archival .
Whatever archival media you choose will be obsolete within a few years and you 'll need to regularly migrate the data as new technology is adopted .
If it 's generated through computation and simulation and is likely to be requested again in the near future , invest in the infrastructure to keep it spinning until it 's cheaper to recompute it .
For cases where it is needed infrequently , tape with an offsite storage company is the better bet .
Remember that with tape you should be doing random pulls monthly to make sure you can actually retrieve the data . And it 's probably worth punting to your comptroller and legal department .
If the liability for having just discarded the data is less than the cost to archive it properly , they 're going to tell you to delete it after the customer says it 's good .
Plus if archival is important , they have to figure out how to drop the cost onto your customers anyways . Now when it actually comes time to decide technology ... pick the vendor that does the best job of taking you and your peers out to lunch regularly .
Their product , like everyone else 's , will fail to meet expectations , but you 'll get lunch .</tokentext>
<sentencetext>And under what conditions do they need to be archived?
Everyone else is busy spewing technical and product solutions.
However archival is more dependent upon issues like "Is this historical data collected for a scientific study?
"  If so, you want to chat with groups like photo astronomers as to what they do for data archival.
Whatever archival media you choose will be obsolete within a few years and you'll need to regularly migrate the data as new technology is adopted.
If it's generated through computation and simulation and is likely to be requested again in the near future, invest in the infrastructure to keep it spinning until it's cheaper to recompute it.
For cases where it is needed infrequently, tape with an offsite storage company is the better bet.
Remember that with tape you should be doing random pulls monthly to make sure you can actually retrieve the data. And it's probably worth punting to your comptroller and legal department.
If the liability for having just discarded the data is less than the cost to archive it properly, they're going to tell you to delete it after the customer says it's good.
Plus if archival is important, they have to figure out how to drop the cost onto your customers anyways. Now when it actually comes time to decide technology... pick the vendor that does the best job of taking you and your peers out to lunch regularly.
Their product, like everyone else's, will fail to meet expectations, but you'll get lunch.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351634</id>
	<title>Re:Tape is your friend</title>
	<author>mengel</author>
	<datestamp>1267616580000</datestamp>
	<modclass>Insightful</modclass>
	<modscore>3</modscore>
	<htmltext><p>
There's some code lurking in the amanda backup package I did a while back for "RAIT" (RAID with tape instead of disk) to make a stripe-set of tapes, if you need several tapes worth of data in one set, with redundancy.
</p><p>
On the other hand, while LTO-4 tapes are about half the price ($40) of cheap 1TB disk drives ($80), the tape <b>drives</b> are about $2k apiece, so depending on how many data sets you want to keep, and for how long, the disk drives may really be cheaper...</p></htmltext>
<tokenext>There 's some code lurking in the amanda backup package I did a while back for " RAIT " ( RAID with tape instead of disk ) to make a stripe-set of tapes , if you need several tapes worth of data in one set , with redundancy .
On the other hand , while LTO-4 tapes are about half the price ( $ 40 ) of cheap 1TB disk drives ( $ 80 ) , the tape drives are about $ 2k apiece , so depending on how many data sets you want to keep , and for how long , the disk drives may really be cheaper.. .</tokentext>
<sentencetext>
There's some code lurking in the amanda backup package I did a while back for "RAIT" (RAID with tape instead of disk) to make a stripe-set of tapes, if you need several tapes worth of data in one set, with redundancy.
On the other hand, while LTO-4 tapes are about half the price ($40) of cheap 1TB disk drives ($80), the tape drives are about $2k apiece, so depending on how many data sets you want to keep, and for how long, the disk drives may really be cheaper...</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353556</id>
	<title>Re:LTO-4?</title>
	<author>turbidostato</author>
	<datestamp>1267629120000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>"I've gotten as much as 5TB onto a single LTO-4 tape using the regular drive compression."</p><p>Regular *drive* compression? I hope you use it for short them backup, not long term storage.</p></htmltext>
<tokenext>" I 've gotten as much as 5TB onto a single LTO-4 tape using the regular drive compression .
" Regular * drive * compression ?
I hope you use it for short-term backup , not long-term storage .</tokentext>
<sentencetext>"I've gotten as much as 5TB onto a single LTO-4 tape using the regular drive compression.
"Regular *drive* compression?
I hope you use it for short-term backup, not long-term storage.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351720</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353386</id>
	<title>Re:Exactly what you're doing</title>
	<author>Anonymous</author>
	<datestamp>1267627980000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Never use RAID5! Here's the conclusion from the following <a href="http://miracleas.com/BAARF/RAID5_versus_RAID10.txt" title="miracleas.com" rel="nofollow">article</a> [miracleas.com]:</p><p>"For safety and performance favor RAID10 first, RAID3 second,<br>RAID4 third, and RAID5 last!  The original reason for the RAID2-5 specs<br>was that the high cost of disks was making RAID1, mirroring, impractical.<br>That is no longer the case!  Drives are commodity priced, even the biggest<br>fastest drives are cheaper in absolute dollars than drives were then and<br>cost per MB is a tiny fraction of what it was."</p></htmltext>
<tokenext>Never use RAID5 !
Here 's the conclusion from the following article [ miracleas.com ] : " For safety and performance favor RAID10 first , RAID3 second , RAID4 third , and RAID5 last !
The original reason for the RAID2-5 specs was that the high cost of disks was making RAID1 , mirroring , impractical . That is no longer the case !
Drives are commodity priced , even the biggest fastest drives are cheaper in absolute dollars than drives were then and cost per MB is a tiny fraction of what it was .
"</tokentext>
<sentencetext>Never use RAID5!
Here's the conclusion from the following article [miracleas.com]: "For safety and performance favor RAID10 first, RAID3 second, RAID4 third, and RAID5 last!
The original reason for the RAID2-5 specs was that the high cost of disks was making RAID1, mirroring, impractical. That is no longer the case!
Drives are commodity priced, even the biggest fastest drives are cheaper in absolute dollars than drives were then and cost per MB is a tiny fraction of what it was.
"</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353022</id>
	<title>These guys offer distributed, 96-way encrypted RAI</title>
	<author>melted</author>
	<datestamp>1267624560000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>These guys offer distributed, 96-way encrypted RAID (32 parity slices, 64 data): <a href="http://www.symform.com/" title="symform.com">http://www.symform.com/</a> [symform.com]. Check them out. You will have to pony up the same amount of disk space as you consume, though; to me this does seem a heck of a lot more reliable than RAID 5 or 6, and the drives can be super cheap.</p></htmltext>
<tokenext>These guys offer distributed , 96-way encrypted RAID ( 32 parity slices , 64 data ) : http://www.symform.com/ [ symform.com ] .
Check them out .
You will have to pony up the same amount of disk space as you consume , though , but to me this does seem a heck of a lot more reliable than RAID 5 or 6 , and the drives can be super cheap .</tokentext>
<sentencetext>These guys offer distributed, 96-way encrypted RAID (32 parity slices, 64 data): http://www.symform.com/ [symform.com].
Check them out.
You will have to pony up the same amount of disk space as you consume, though, but to me this does seem a heck of a lot more reliable than RAID 5 or 6, and the drives can be super cheap.</sentencetext>
</comment>
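A quick sanity check on the (96, 64) slice scheme described in the comment above: the slice counts come from that comment, and the arithmetic below is just the generic erasure-coding trade-off, not anything specific to Symform's actual implementation.

    # Generic (n, k) erasure-code arithmetic: any k of the n slices are
    # enough to rebuild the data, so up to n - k slices can be lost.
    n, k = 96, 64
    print(f"tolerates {n - k} lost slices out of {n}")
    print(f"storage overhead: {n / k:.2f}x (vs 2.00x for a plain mirror)")

The scheme trades a 1.5x storage overhead for surviving 32 lost slices, which is where the "more reliable than RAID 5 or 6" claim comes from.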
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351956</id>
	<title>Re:Exactly what you're doing</title>
	<author>MoonBuggy</author>
	<datestamp>1267618140000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p><div class="quote"><p>Tape storage does store better.</p></div><p>Admittedly the submitter said tape would be impractical, but my nerdly curiosity has been piqued: how reliable are relatively cheap tape systems?</p><p>The price crossover point seems fairly reasonable even for a small-ish operation, if you're looking at a few TB per customer. A quick look on Google puts drives at about &pound;700 and 800GB tapes at ~&pound;20, compared to ~&pound;55 for 1TB hard drives.</p><p>Going on &pound;0.055/GB for hard drives and &pound;0.025/GB for tapes, my quick back of the envelope calculation says that the investment in the drive amortizes after around 23.3TB for 800GB tapes.</p></div>
	</htmltext>
<tokenext>Tape storage does store better . Admittedly the submitter said tape would be impractical , but my nerdly curiosity has been piqued : how reliable are relatively cheap tape systems ? The price crossover point seems fairly reasonable even for a small-ish operation , if you 're looking at a few TB per customer .
A quick look on Google puts drives at about £ 700 and 800GB tapes at ~ £ 20 , compared to ~ £ 55 for 1TB hard drives . Going on £ 0.055/GB for hard drives and £ 0.025/GB for tapes , my quick back of the envelope calculation says that the investment in the drive amortizes after around 23.3TB for 800GB tapes .</tokentext>
<sentencetext>Tape storage does store better. Admittedly the submitter said tape would be impractical, but my nerdly curiosity has been piqued: how reliable are relatively cheap tape systems? The price crossover point seems fairly reasonable even for a small-ish operation, if you're looking at a few TB per customer.
A quick look on Google puts drives at about £700 and 800GB tapes at ~£20, compared to ~£55 for 1TB hard drives. Going on £0.055/GB for hard drives and £0.025/GB for tapes, my quick back-of-the-envelope calculation says that the investment in the drive amortizes after around 23.3TB for 800GB tapes.
	</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504</parent>
</comment>
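MoonBuggy's amortization figure is easy to re-derive; a minimal Python sketch using the comment's rough circa-2010 prices in GBP (swap in your own quotes, only the formula matters):

    # Break-even volume for buying a tape drive instead of bare disks.
    drive_cost  = 700.0         # tape drive, one-off
    tape_per_gb = 20.0 / 800    # ~20 per 800GB tape -> 0.025/GB
    disk_per_gb = 55.0 / 1000   # ~55 per 1TB disk   -> 0.055/GB

    saving_per_gb = disk_per_gb - tape_per_gb
    print(f"drive pays for itself after ~{drive_cost / saving_per_gb / 1000:.1f} TB")
    # -> ~23.3 TB, matching the back-of-the-envelope figure above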
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355126</id>
	<title>Don't use raid 5</title>
	<author>Anonymous</author>
	<datestamp>1267645260000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Don't use raid 5. With Linux RAID 5, you will be at the mercy of finding a Linux box with the same exact software format as when you wrote the drives. At the very least, you must put a liveCD that works with that cluster, and configuration notes.</p><p>The only reason to use Raid 5 is if you have a single file that needs to span multiple drives (&gt; 1-ish TB). Don't do that. Change your dataset to have multiple files, and write a script that will segment your dataset out into multiple disks, and pull them back again. Write down those rules in English, not in code, and put those rules with the disks. If you do have a single large file, write a script to slice it up and put it back together again.</p><p>Then, pick a file format. I think the only two reasonable choices today are FAT32 and EXT3, with NTFS as a dark horse, but this is crystal ball territory - your judgement may vary. It'll be hard to argue against the chances of finding FAT32 support 10 years from now. If you think file format risk is greater than disk failure rate risk, make one copy in each format.</p><p>Then run your script to pack each drive full. Then make a second set. Then, if you want, make a third set.</p><p>This is ghetto RAID 10, which is the best combination of failure resistance and price, without getting in bed with a RAID format. And, you get partial restores in case of catastrophic failures.</p><p>And - realize that 2T drives have a huge price premium over 1.5T drives (2T at $300, 1.5T at $100 - 3x more price for 33% more storage). We also know that stability goes down at the edges of the performance envelope in every hardware domain. So pick the second-best.</p><p>If one were to assess risk, buggy drives are a strong contender. I would strongly suggest making a stripeset of one manufacturer, and a stripeset with another. Seagate had all those firmware bugs, maybe the next round of bugs will be WD. No way to know.</p></htmltext>
<tokenext>Do n't use raid 5 .
With Linux RAID 5 , you will be at the mercy of finding a Linux box with the same exact software format as when you wrote the drives .
At the very least , you must put a liveCD that works with that cluster , and configuration notes . The only reason to use Raid 5 is if you have a single file that needs to span multiple drives ( &gt; 1-ish TB ) .
Do n't do that .
Change your dataset to have multiple files , and write a script that will segment your dataset out into multiple disks , and pull them back again .
Write down those rules in English , not in code , and put those rules with the disks .
If you do have a single large file , write a script to slice it up and put it back together again . Then , pick a file format .
I think the only two reasonable choices today are FAT32 and EXT3 , with NTFS as a dark horse , but this is crystal ball territory - your judgement may vary .
It 'll be hard to argue against the chances of finding FAT32 support 10 years from now .
If you think file format risk is greater than disk failure rate risk , make one copy in each format . Then run your script to pack each drive full .
Then make a second set .
Then , if you want , make a third set . This is ghetto RAID 10 , which is the best combination of failure resistance and price , without getting in bed with a RAID format .
And , you get partial restores in case of catastrophic failures . And - realize that 2T drives have a huge price premium over 1.5T drives ( 2T at $ 300 , 1.5T at $ 100 - 3x more price for 33 % more storage ) .
We also know that stability goes down at the edges of the performance envelope in every hardware domain .
So pick the second-best . If one were to assess risk , buggy drives are a strong contender .
I would strongly suggest making a stripeset of one manufacturer , and a stripeset with another .
Seagate had all those firmware bugs , maybe the next round of bugs will be WD .
No way to know .</tokentext>
<sentencetext>Don't use raid 5.
With Linux RAID 5, you will be at the mercy of finding a Linux box with the same exact software format as when you wrote the drives.
At the very least, you must put a liveCD that works with that cluster, and configuration notes. The only reason to use Raid 5 is if you have a single file that needs to span multiple drives (&gt; 1-ish TB).
Don't do that.
Change your dataset to have multiple files, and write a script that will segment your dataset out into multiple disks, and pull them back again.
Write down those rules in English, not in code, and put those rules with the disks.
If you do have a single large file, write a script to slice it up and put it back together again. Then, pick a file format.
I think the only two reasonable choices today are FAT32 and EXT3, with NTFS as a dark horse, but this is crystal ball territory - your judgement may vary.
It'll be hard to argue against the chances of finding FAT32 support 10 years from now.
If you think file format risk is greater than disk failure rate risk, make one copy in each format. Then run your script to pack each drive full.
Then make a second set.
Then, if you want, make a third set. This is ghetto RAID 10, which is the best combination of failure resistance and price, without getting in bed with a RAID format.
And, you get partial restores in case of catastrophic failures. And - realize that 2T drives have a huge price premium over 1.5T drives (2T at $300, 1.5T at $100 - 3x more price for 33% more storage).
We also know that stability goes down at the edges of the performance envelope in every hardware domain.
So pick the second-best. If one were to assess risk, buggy drives are a strong contender.
I would strongly suggest making a stripeset of one manufacturer, and a stripeset with another.
Seagate had all those firmware bugs, maybe the next round of bugs will be WD.
No way to know.</sentencetext>
</comment>
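The "segment your dataset out into multiple disks" script that the comment calls for can be quite small. A minimal sketch, assuming the dataset is a directory tree of files each smaller than one disk (slice any larger file first, as the comment suggests); DISK_BYTES and the path are placeholders:

    import os

    DISK_BYTES = int(1.5e12)  # usable bytes per archive disk (assumed)

    def pack(root):
        # First-fit-decreasing: biggest files first, each into the first
        # disk with room, opening a new disk when none fits.
        files = sorted(
            ((os.path.getsize(os.path.join(d, n)), os.path.join(d, n))
             for d, _, names in os.walk(root) for n in names),
            reverse=True)
        disks = []  # each entry: [free_bytes, [paths]]
        for size, path in files:
            for disk in disks:
                if disk[0] >= size:
                    disk[0] -= size
                    disk[1].append(path)
                    break
            else:
                disks.append([DISK_BYTES - size, [path]])
        return [paths for _, paths in disks]

    for i, batch in enumerate(pack("/data/customer42"), 1):
        print(f"disk {i}: {len(batch)} files")

Per the comment's advice, the resulting per-disk file lists should then be written down in plain English and stored with the disks.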
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351540</id>
	<title>Blu-Ray</title>
	<author>Anonymous</author>
	<datestamp>1267616160000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Have you already ruled out blu-ray? 25GB per disc, make two copies per customer. Much cheaper than RAID5.</htmltext>
<tokenext>Have you already ruled out blu-ray ?
25GB per disc , make two copies per customer .
Much cheaper than RAID5 .</tokentext>
<sentencetext>Have you already ruled out blu-ray?
25GB per disc, make two copies per customer.
Much cheaper than RAID5.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354318</id>
	<title>Re:Agree with the tape option..;.</title>
	<author>Anonymous</author>
	<datestamp>1267635960000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>&gt;if anything, skip the RAID option and just store 2 copies.</p><p>If both of your copies are degraded, you have no way of automatically restoring the data set.  If they were stored as par2 or 1+1, you could have restored that.</p></htmltext>
<tokenext>&gt; f anything , skip the RAID option and just store 2 copies.If both of your copies are degraded , you have no way of automatically restore the data set .
If they were stored as par2 or 1 + 1 , you could have restored that .</tokentext>
<sentencetext>&gt;if anything, skip the RAID option and just store 2 copies. If both of your copies are degraded, you have no way of automatically restoring the data set.
If they were stored as par2 or 1+1, you could have restored that.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351838</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352888</id>
	<title>Re:Tape is your friend</title>
	<author>complete loony</author>
	<datestamp>1267623600000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>HDDs are much cheaper than $5K, and can be attached to any computer. For 3TB of data per customer, you're looking at 4 x 2TB drives to mirror the data. You can buy the drives for about $700. Compared to the tape solution, you're talking about $5K for the drive, and you only have one of them, then about $40 per 1.2TB tape. Sure, tape is about 70% cheaper per TB in the long run, but you really need to factor in the cost of tape drives (and replacement tape drives) when you recommend it for smaller-sized backup requirements.</htmltext>
<tokenext>HDDs are much cheaper than $ 5K , and can be attached to any computer .
For 3TB of data per customer , you 're looking at 4 x 2TB drives to mirror the data .
You can buy the drives for about $ 700 .
Compared to the tape solution , you 're talking about $ 5K for the drive , and you only have one of them , then about $ 40 per 1.2TB tape .
Sure tape is about 70 % cheaper per TB in the long run , but you really need to factor in the cost of tape drives ( and replacement tape drives ) when you recommend it for smaller sized backup requirements .</tokentext>
<sentencetext>HDDs are much cheaper than $5K, and can be attached to any computer.
For 3TB of data per customer, you're looking at 4 x 2TB drives to mirror the data.
You can buy the drives for about $700.
Compared to the tape solution, you're talking about $5K for the drive, and you only have one of them, then about $40 per 1.2TB tape.
Sure tape is about 70% cheaper per TB in the long run, but you really need to factor in the cost of tape drives (and replacement tape drives) when you recommend it for smaller sized backup requirements.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351562</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354902</id>
	<title>Re:Use RAID6 not RAID5</title>
	<author>Anonymous</author>
	<datestamp>1267642440000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>It might take 3 drive failures to lose data, but depending on the setup, you can easily loose data with only a single drive "missing."</p><p>And no, you don't get to claim zero loose data if you later find the drive behind a copy machine.  I'm not even sure you could form a more ironic and obvious clue as to the disposition of the drive during the period it was not under your control.</p><p>Yes, I know it was just a typo, but it's so rare that lose/loose is actually meaningfully different and both meanings relevant to the discussion.</p><p>Raid 6 might not be the best option.  It's a good deal more computationally expensive than RAID5, so it might be more practical to do something like CDs do and use layers of RAID5-level ECC.</p></htmltext>
<tokenext>It might take 3 drive failures to lose data , but depending on the set up , you can easily loose data with only a single drive " missing .
" And no , you do n't get to claim zero loose data if you later find the drive behind a copy machine .
I 'm not even sure you could form a more ironic and obvious clue as to the disposition of the drive during the period it was not under your control . Yes , I know it was just a typo , but it 's so rare that lose/loose is actually meaningfully different and both meanings relevant to the discussion . Raid 6 might not be the best option .
It 's a good deal more computationally expensive than raid5 so it might be more practical to do something like CDs do and do layers of raid5 level ecc .</tokentext>
<sentencetext>It might take 3 drive failures to lose data, but depending on the set up, you can easily loose data with only a single drive "missing.
"And no, you don't get to claim zero loose data if you later find the drive behind a copy machine.
I'm not even sure you could form a more ironic and obvious clue as to the disposition of the drive during the period it was not under your control. Yes, I know it was just a typo, but it's so rare that lose/loose is actually meaningfully different and both meanings relevant to the discussion. Raid 6 might not be the best option.
It's a good deal more computationally expensive than raid5 so it might be more practical to do something like CDs do and do layers of raid5 level ecc.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351578</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357762</id>
	<title>dont raid</title>
	<author>malcreado</author>
	<datestamp>1267716960000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>You mentioned putting them in a safety deposit box, so you are talking offline storage.  You probably don't want to RAID them.  It is just another point of failure, whether you are talking about a battery in a RAID controller or a software RAID.  I would just store them in a regular filesystem on large-capacity drives.  Simple, easy to plug in and start again.

Having said that, tape is still king, and the LTO-5 drives will be out this month (March 2010) according to both HP and Dell reps that I have heard from.

I would put one copy on hard drive and one on tape.  Good redundancy; store them in different locations.</htmltext>
<tokenext>You mentioned putting them in a safety deposit box , so you are talking off line storage .
You probably do n't want to RAID them .
It is just another point of failure , whether you are talking about a battery in a RAID controller or a software RAID .
I would just store them in a regular filesystem on large capacity drives .
Simple , easy to plug in and start again .
Having said that , tape is still king and the LTO-5 drives will be out this month ( March 2010 ) according to both HP and Dell reps that I have heard from .
I would put one copy on Hard drive and one on tape .
good redundancy ; store them in different locations .</tokentext>
<sentencetext>You mentioned putting them in a safety deposit box, so you are talking off line storage.
You probably don't want to RAID them.
It is just another point of failure, whether you are talking about a battery in a RAID controller or a software RAID.
I would just store them in a regular filesystem on large capacity drives.
Simple, easy to plug in and start again.
Having said that, tape is still king and the LTO-5 drives will be out this month (March 2010) according to both HP and Dell reps that I have heard from.
I would put one copy on Hard drive and one on tape.
good redundancy; store them in different locations.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351578</id>
	<title>Use RAID6 not RAID5</title>
	<author>jbridges</author>
	<datestamp>1267616340000</datestamp>
	<modclass>Insightful</modclass>
	<modscore>3</modscore>
	<htmltext><p>I would use RAID6 not RAID5, since 2 drive failures means data loss with RAID5, while it takes 3 drive failures to loose data on RAID6.</p><p>Linux MDADM has supported RAID6 for years, it's stable.</p><p>I would mix and match drives, not buying all the same model from one maker. One Samsung, One WD, One Hitachi, One Seagate.</p><p>That gets you 4TB in 4 drives, and unlike a RAID1, any 2 drives can fail with no dataloss.</p><p>You can further ensure no dataloss by making a second copy using different brand drives for each clone.</p><p>Eight 2TB drives is around $1500. Not bad for a very safe 4TB backup.</p></htmltext>
<tokenext>I would use RAID6 not RAID5 , since 2 drive failures means data loss with RAID5 , while it takes 3 drive failures to loose data on RAID6 . Linux MDADM has supported RAID6 for years , it 's stable . I would mix and match drives , not buying all the same model from one maker .
One Samsung , One WD , One Hitachi , One Seagate . That gets you 4TB in 4 drives , and unlike a RAID1 , any 2 drives can fail with no dataloss . You can further ensure no dataloss by making a second copy using different brand drives for each clone . Eight 2TB drives is around $ 1500 .
Not bad for a very safe 4TB backup .</tokentext>
<sentencetext>I would use RAID6 not RAID5, since 2 drive failures means data loss with RAID5, while it takes 3 drive failures to loose data on RAID6. Linux MDADM has supported RAID6 for years, it's stable. I would mix and match drives, not buying all the same model from one maker.
One Samsung, One WD, One Hitachi, One Seagate. That gets you 4TB in 4 drives, and unlike a RAID1, any 2 drives can fail with no dataloss. You can further ensure no dataloss by making a second copy using different brand drives for each clone. Eight 2TB drives is around $1500.
Not bad for a very safe 4TB backup.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31356298</id>
	<title>NearLine storage.</title>
	<author>Salsero</author>
	<datestamp>1267703220000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Hi,

If you are really talking about multiple terabytes of archiving and you are allowed to keep it online, I would look at something like the Isilon IQ 36NL or IQ 72NL Series. This gives you a redundant storage environment where you can store, depending on the size of the cluster, from 100TB up to petabytes. And it is affordable. These amounts of data are just too big to consider things like tape or Blu-ray.

JHP</htmltext>
<tokenext>Hi , If you are really talking about multiple terabytes of archiving and you are allowed to keep it online I would look at something like Isilon IQ 36NL or IQ 72NL Series .
This gives you a redundant storage environment where you can store , depending on the size of the cluster , starting at 100TB up to petabytes of storage .
And it is affordable .
These amounts of data are just too big to consider things like tape or Blu-ray .
JHP</tokentext>
<sentencetext>Hi,

If you are really talking about multiple terabytes of archiving and you are allowed to keep it online I would look at something like Isilon IQ 36NL or IQ 72NL Series.
This gives you a redundant storage environment where you can store, depending on the size of the cluster, starting at 100TB up to petabytes of storage.
And it is affordable.
These amounts of data are just too big to consider things like tape or Blu-ray.
JHP</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352118</id>
	<title>the $100,000 compression question</title>
	<author>Dare nMc</author>
	<datestamp>1267618800000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>The good thing about compression is less storage and a better chance of detecting errors.  The bad is that, at a minimum, every bad bit becomes at least a bad byte, and if it is in the header, you lose all data in that archive.<br>For example:<br><a href="http://www.gnu.org/software/automake/manual/tar/gzip.html" title="gnu.org">About corrupted compressed archives: gzip'ed files have no redundancy, for maximum compression. The adaptive nature of the compression scheme means that the compression tables are implicitly spread all over the archive. If you lose a few blocks, the dynamic construction of the compression tables becomes unsynchronized, and there is little chance that you could recover later in the archive. </a> [gnu.org]</p><p>So if the nature of the data is "one bad bit means all data is garbage anyway", then compress away.  If the nature of the backup is "I may need a couple of files out of a backup someday, who cares about the rest", then don't make a single compressed archive.</p><p>My method for filesystem backups was to do a file-by-file gzip in place on the backup server; this way I could lose a file, but not all files in a single archive.<br>I associated all .gz files with uncompress-in-place gzip on Windows machines, so users would just have to click, wait a second, then click again if we had to jump to the backup file server...</p></htmltext>
<tokenext>The good about compression is less storage and a better chance of detecting errors .
The bad is at a minimum every bad bit becomes at least a bad byte , and if it is in the header , all data in that archive . For example : About corrupted compressed archives : gzip'ed files have no redundancy , for maximum compression .
The adaptive nature of the compression scheme means that the compression tables are implicitly spread all over the archive .
If you lose a few blocks , the dynamic construction of the compression tables becomes unsynchronized , and there is little chance that you could recover later in the archive .
[ gnu.org ] So if the nature of the data is " one bad bit means all data is garbage anyway " then compress away .
If the nature of the backup is " I may need a couple files out of a backup someday , who cares about the rest " , then do n't make a single compressed archive . My method for filesystem backups was to do a file by file gzip in place on the backup server , this way I could lose a file , but not all files in a single archive . I associated all .gz files with uncompress in place gzip on windows machines , so users would just have to click , wait a second , then click again if we had to jump to the backup file server.. .</tokentext>
<sentencetext>The good about compression is less storage and a better chance of detecting errors.
The bad is at a minimum every bad bit becomes at least a bad byte, and if it is in the header, all data in that archive. For example: About corrupted compressed archives: gzip'ed files have no redundancy, for maximum compression.
The adaptive nature of the compression scheme means that the compression tables are implicitly spread all over the archive.
If you lose a few blocks, the dynamic construction of the compression tables becomes unsynchronized, and there is little chance that you could recover later in the archive.
[gnu.org] So if the nature of the data is "one bad bit means all data is garbage anyway" then compress away.
If the nature of the backup is "I may need a couple files out of a backup someday, who cares about the rest", then don't make a single compressed archive. My method for filesystem backups was to do a file by file gzip in place on the backup server; this way I could lose a file, but not all files in a single archive. I associated all .gz files with uncompress in place gzip on windows machines, so users would just have to click, wait a second, then click again if we had to jump to the backup file server...</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351602</parent>
</comment>
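The file-by-file scheme described above takes only a few lines; a minimal Python sketch (paths are examples), whose point is that one corrupt .gz member costs you one file rather than the whole archive:

    import gzip, os, shutil

    def gzip_tree(src, dst):
        # Mirror the src tree under dst, gzipping each file individually.
        for dirpath, _, names in os.walk(src):
            outdir = os.path.join(dst, os.path.relpath(dirpath, src))
            os.makedirs(outdir, exist_ok=True)
            for name in names:
                with open(os.path.join(dirpath, name), "rb") as f_in, \
                     gzip.open(os.path.join(outdir, name + ".gz"), "wb") as f_out:
                    shutil.copyfileobj(f_in, f_out)

    gzip_tree("/data/customer42", "/backup/customer42")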
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31370126</id>
	<title>Re:use a tape drive</title>
	<author>pnutjam</author>
	<datestamp>1267799820000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>My understanding is tape has very long restore times.  It is also difficult to access just a few files.  If these are offline data-sets that need to be accessed periodically, he would be way better off w/ disk IMHO.  If they are put-it-up-and-keep-it-just-in-case archives, then he is probably better off with tape.  The time limitations of tape should definitely be considered, not just the longevity.</htmltext>
<tokenext>My understanding is tape has very long restore times .
It is also difficult to access just a few files .
If these are offline data-sets that need to be accessed periodically , he would be way better off w/ disk IMHO .
If they are a put it up and keep it just in case , then he is probably better off with tape .
The time limitations of tape should definitely be considered , not just the longevity .</tokentext>
<sentencetext>My understanding is tape has very long restore times.
It is also difficult to access just a few files.
If these are offline data-sets that need to be accessed periodically, he would be way better off w/ disk IMHO.
If they are a put it up and keep it just in case, then he is probably better off with tape.
The time limitations of tape should definitely be considered, not just the longevity.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351604</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357600</id>
	<title>Let the client store the data</title>
	<author>Anonymous</author>
	<datestamp>1267715940000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>We have a very similar issue.</p><p>The solution we've adopted is to build the cost of storage devices into the quote.</p><p>When the job (analysis) is done the client gets the TBs to store how they want.</p><p>We don't end up paying for or storing anything.</p>
	</htmltext>
<tokenext>We have a very similar issue . The solution we 've adopted is to build the cost of storage devices into the quote . When the job ( analysis ) is done the client gets the TBs to store how they want . We do n't end up paying for or storing anything .</tokentext>
<sentencetext>We have a very similar issue. The solution we've adopted is to build the cost of storage devices into the quote. When the job (analysis) is done the client gets the TBs to store how they want. We don't end up paying for or storing anything.
	</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31361084</id>
	<title>Re:Agree with the tape option..;.</title>
	<author>petermgreen</author>
	<datestamp>1267732980000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p><i>if anything, skip the RAID option and just store 2 copies</i><br>Another option that may be better than keeping two copies of every drive is to use something like quickpar to generate a load of parity data.</p><p>Less chance of accidental loss (e.g. through a rebuild starting at a bad time because not all drives were detected) than RAID, and potentially more resilient to drive loss than either RAID or simple mirroring of data.</p></htmltext>
<tokenext>if anything , skip the RAID option and just store 2 copies . Another option that may be better than keeping two copies of every drive is to use something like quickpar to generate a load of parity data . Less chance of accidental loss ( e.g .
through a rebuild starting at a bad time because not all drives were detected ) than raid and potentially more resilient to drive loss than either raid or simple mirroring of data .</tokentext>
<sentencetext>if anything, skip the RAID option and just store 2 copies. Another option that may be better than keeping two copies of every drive is to use something like quickpar to generate a load of parity data. Less chance of accidental loss (e.g.
through a rebuild starting at a bad time because not all drives were detected) than raid and potentially more resilient to drive loss than either raid or simple mirroring of data.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351838</parent>
</comment>
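par2/quickpar compute Reed-Solomon parity and can survive several missing blocks; the underlying idea can be shown with plain XOR parity, which survives exactly one. A toy Python illustration, not a substitute for par2:

    def xor_blocks(blocks):
        # XOR equal-length blocks together; the result is a parity block.
        out = bytearray(len(blocks[0]))
        for block in blocks:
            for i, byte in enumerate(block):
                out[i] ^= byte
        return bytes(out)

    data = [b"first block ", b"second block", b"third block "]  # equal sizes
    parity = xor_blocks(data)
    # Any one lost block is the XOR of the surviving blocks plus parity:
    assert xor_blocks([data[0], data[2], parity]) == data[1]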
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352570</id>
	<title>RAIDPacs can store 4TB RAID 5 or 6TB RAID 0</title>
	<author>Anonymous</author>
	<datestamp>1267621260000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>They're completely self contained and hot pluggable RAID arrays.  They have a SATA connector on the back, they look just like a hard drive to the host, no driver or controller card needed.</p><p>http://www.high-rely.com/HR3/includes/RAIDFrame/RAIDFrame5Bay.php</p></htmltext>
<tokenext>They 're completely self contained and hot pluggable RAID arrays .
They have a SATA connector on the back , they look just like a hard drive to the host , no driver or controller card needed . http://www.high-rely.com/HR3/includes/RAIDFrame/RAIDFrame5Bay.php</tokentext>
<sentencetext>They're completely self contained and hot pluggable RAID arrays.
They have a SATA connector on the back, they look just like a hard drive to the host, no driver or controller card needed. http://www.high-rely.com/HR3/includes/RAIDFrame/RAIDFrame5Bay.php</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352620</id>
	<title>Re:Another approach...</title>
	<author>hawkeyeMI</author>
	<datestamp>1267621620000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Actually the more I think about this, the better it is. I can store some of the intermediary files and all of the scripts/binaries in much less space. While waiting for any follow-up or feedback from the client I'll need all of the files close at hand, but after a while I can probably pickle things like this.</htmltext>
<tokenext>Actually the more I think about this , the better it is .
I can store some of the intermediary files and all of the scripts/binaries in much less space .
While waiting for any follow-up or feedback from the client I 'll need all of the files close at hand , but after a while I can probably pickle things like this .</tokentext>
<sentencetext>Actually the more I think about this, the better it is.
I can store some of the intermediary files and all of the scripts/binaries in much less space.
While waiting for any follow-up or feedback from the client I'll need all of the files close at hand, but after a while I can probably pickle things like this.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352080</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414</id>
	<title>Tape is your friend</title>
	<author>Anonymous</author>
	<datestamp>1267615500000</datestamp>
	<modclass>Informativ</modclass>
	<modscore>5</modscore>
	<htmltext><p>LTO tape, properly stored, will outlast burned optical media and hard drives.  Great stuff and designed specifically for what you're talking about.</p><p><a href="http://en.wikipedia.org/wiki/Linear_Tape-Open" title="wikipedia.org">http://en.wikipedia.org/wiki/Linear_Tape-Open</a> [wikipedia.org]</p></htmltext>
<tokenext>LTO tape , properly stored , will outlast burned optical media and hard drives .
Great stuff and designed specifically for what you 're talking about . http://en.wikipedia.org/wiki/Linear_Tape-Open [ wikipedia.org ]</tokentext>
<sentencetext>LTO tape, properly stored, will outlast burned optical media and hard drives.
Great stuff and designed specifically for what you're talking about. http://en.wikipedia.org/wiki/Linear_Tape-Open [wikipedia.org]</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355302</id>
	<title>Re:Tape is your friend</title>
	<author>sunderland56</author>
	<datestamp>1267733880000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>The <a href="http://nle.ch/dl/LTO.pdf" title="nle.ch">spec on LTO</a> [nle.ch] is only 15-30 years.
<br> <br>
DVD <a href="http://www.osta.org/technology/dvdqa/dvdqa11.htm" title="osta.org">claims 30 to 100 years</a> [osta.org].
<br> <br>
Neither lasts as long as necessary for archival storage.</htmltext>
<tokenext>The spec on LTO [ nle.ch ] is only 15-30 years .
DVD claims 30 to 100 years [ osta.org ] .
Neither lasts as long as necessary for archival storage .</tokentext>
<sentencetext>The spec on LTO [nle.ch] is only 15-30 years.
DVD claims 30 to 100 years [osta.org].
Neither lasts as long as necessary for archival storage.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31356678</id>
	<title>Do what CERN do for the LHC: Tape</title>
	<author>Anonymous</author>
	<datestamp>1267708440000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Tape is designed for this and it's how the LHC stores its petabytes of data (with a HDD based "cache" in front of it).</p><p>Disks are not designed to be used as a long term (by that I mean decades) storage platform. You also need to consider the stability and longevity of the formats you store on whatever media you use. Everything from the file format to the partition format to the volume manager format. These days even the magic the disks do to map to the physical sectors is important (e.g. 512byte to 4k block mapping in new drives).</p></htmltext>
<tokenext>Tape is designed for this and it 's how the LHC stores its petabytes of data ( with a HDD based " cache " in front of it ) . Disks are not designed to be used as a long term ( by that I mean decades ) storage platform .
You also need to consider the stability and longevity of the formats you store on whatever media you use .
Everything from the file format to the partition format to the volume manager format .
These days even the magic the disks do to map to the physical sectors is important ( e.g .
512byte to 4k block mapping in new drives ) .</tokentext>
<sentencetext>Tape is designed for this and it's how the LHC stores its petabytes of data (with a HDD based "cache" in front of it). Disks are not designed to be used as a long term (by that I mean decades) storage platform.
You also need to consider the stability and longevity of the formats you store on whatever media you use.
Everything from the file format to the partition format to the volume manager format.
These days even the magic the disks do to map to the physical sectors is important (e.g.
512byte to 4k block mapping in new drives).</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352254</id>
	<title>Why store RAID arrays?</title>
	<author>tlhIngan</author>
	<datestamp>1267619520000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Why store a RAID array? Your data set isn't too big (2-3TB should fit nicely into a couple of today's 2TB hard drives), so it seems somewhat roundabout to store the array itself. Tarball it up and if the compression can squeeze it onto a 2TB hard disk, then you're golden. If not, use split to break it up. Then handle the tarball like any other file. Restoring is just as easy and if you use split, the opposite is our friend cat.</p><p>Store the array for instant access, but use the hard drives (you'll make copies) as medium term backups. If you need long term storage, then go with tape.</p><p>And I'm sure by the end of the year, you can just use one hard disk for your entire dataset backup. Which means you can duplicate it multiple times for safety. And just plug it into any Linux PC and access it immediately without having to reinitialize the array.</p></htmltext>
<tokenext>Why store a RAID array ?
Your data set is n't too big ( 2-3TB should fit nicely into a couple of today 's 2TB hard drives ) , so it seems somewhat roundabout to store the array itself .
Tarball it up and if the compression can squeeze it onto a 2TB hard disk , then you 're golden .
If not , use split to break it up .
Then handle the tarball like any other file .
Restoring is just as easy and if you use split , the opposite is our friend cat . Store the array for instant access , but use the hard drives ( you 'll make copies ) as medium term backups .
If you need long term storage , then go with tape . And I 'm sure by the end of the year , you can just use one hard disk for your entire dataset backup .
Which means you can duplicate it multiple times for safety .
And just plug it into any Linux PC and access it immediately without having to reinitialize the array .</tokentext>
<sentencetext>Why store a RAID array?
Your data set isn't too big (2-3TB should fit nicely into a couple of today's 2TB hard drives), so it seems somewhat roundabout to store the array itself.
Tarball it up and if the compression can squeeze it onto a 2TB hard disk, then you're golden.
If not, use split to break it up.
Then handle the tarball like any other file.
Restoring is just as easy and if you use split, the opposite is our friend cat. Store the array for instant access, but use the hard drives (you'll make copies) as medium term backups.
If you need long term storage, then go with tape. And I'm sure by the end of the year, you can just use one hard disk for your entire dataset backup.
Which means you can duplicate it multiple times for safety.
And just plug it into any Linux PC and access it immediately without having to reinitialize the array.</sentencetext>
</comment>
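The tar-plus-split-plus-cat route above works as-is; if you also want per-piece verification before reassembly, a sketch like this (chunk size and filename are placeholders) records a SHA-256 for each chunk as it is written:

    import hashlib

    CHUNK = 1 << 30  # 1 GiB per piece; pick to suit your media

    def split_file(path):
        # Cut the file into CHUNK-sized pieces, printing a checksum per piece.
        with open(path, "rb") as f:
            i = 0
            while True:
                data = f.read(CHUNK)
                if not data:
                    break
                name = f"{path}.{i:04d}"
                with open(name, "wb") as out:
                    out.write(data)
                print(name, hashlib.sha256(data).hexdigest())
                i += 1

    split_file("dataset.tar.gz")
    # reassemble later with: cat dataset.tar.gz.* > dataset.tar.gz

(Watch the glob on reassembly: run the cat in a directory that contains only the pieces, or list them explicitly.)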
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31356664</id>
	<title>Rename as "Avatar full BluRay rip.iso"</title>
	<author>Rogerborg</author>
	<datestamp>1267708320000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Then let bittorrent take care of your distributed storage for you.</htmltext>
<tokenext>Then let bittorrent take care of your distributed storage for you .</tokentext>
<sentencetext>Then let bittorrent take care of your distributed storage for you.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352602</id>
	<title>An archival way to back up</title>
	<author>Anonymous</author>
	<datestamp>1267621500000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Just translate it to binary and print out the data.</p></htmltext>
<tokenext>Just translate it to binary and print out the data .</tokentext>
<sentencetext>Just translate it to binary and print out the data.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352100</id>
	<title>Re:Tape is crap anyway.</title>
	<author>Anonymous</author>
	<datestamp>1267618800000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p><div class="quote"><p>Tapes just bleed data at an alarming rate, and they are about as reliable as a drunk gabling addict living under the subway next to the OTB shop.</p></div><p>Your statements contradict all scientific evidence...</p></div>
	</htmltext>
<tokenext>Tapes just bleed data at an alarming rate , and they are about as reliable as a drunk gambling addict living under the subway next to the OTB shop . Your statements contradict all scientific evidence.. .</tokentext>
<sentencetext>Tapes just bleed data at an alarming rate, and they are about as reliable as a drunk gambling addict living under the subway next to the OTB shop. Your statements contradict all scientific evidence...
	</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351706</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353492</id>
	<title>Can you reduce the size pre-storage?</title>
	<author>twoDigitIq</author>
	<datestamp>1267628820000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext>Let me preface this by saying that I don't know a couple of things: <p>
1. How are you generating this 2-3 TB of data? <br>
2. Are you compressing it to its absolute minimum size? <br>
3. Actually I don't know way more than a couple of things, but I digress. </p><p>

Now, having spent most of the last year trying to determine the best method for storing huge amounts of data with limited space while retaining every important detail in a complex system that changes every few seconds --(deep breath)-- I can say that in OUR case, we found some very efficient ways to reduce the storage requirement at the front end instead of throwing money at the backend. Before I start rambling I will repeat the disclaimer regarding my total ignorance of your data situation, and your environment in general.</p><p>

1. When you generate 2-3 TB of data per customer, what creates this data? Is there some type of progression or versioning in play (like a new value for a particular datapoint every millisecond or some such thing?) If so then whatever is creating the data may benefit from some type of delta checking before it saves data. Less data generated means less to compress. If you were to post a sample of what your uncompressed data values look like without revealing any sensitive information you may see replies with meaningful suggestions on how to change the data itself.</p><p>

2. How exactly are you compressing the data? We've found that doing everything possible in step 1 above results in much less data meaning smaller source size and a smaller need for compression. But if whatever is compressing is dumb then you can turn a 500 GB archive into 1 TB which ends up costing you way more than rethinking step 1.</p><p>

Again, I may be completely missing your point here and you may have already thought of these things many times over. Or you may be at the mercy of coders that don't care how much data they generate. But I think, lacking the answers to the above questions, no one here could give you the solutions that have the best bang for the buck.</p></htmltext>
<tokenext>Let me preface this by saying that I do n't know a couple of things :
1. How are you generating this 2-3 TB of data ?
2. Are you compressing it to its absolute minimum size ?
3. Actually I do n't know way more than a couple of things , but I digress .
Now , having spent most of the last year trying to determine the best method for storing huge amounts of data with limited space while retaining every important detail in a complex system that changes every few seconds -- ( deep breath ) -- I can say that in OUR case , we found some very efficient ways to reduce the storage requirement at the front end instead of throwing money at the backend .
Before I start rambling I will repeat the disclaimer regarding my total ignorance of your data situation , and your environment in general .
1. When you generate 2-3 TB of data per customer , what creates this data ?
Is there some type of progression or versioning in play ( like a new value for a particular datapoint every millisecond or some such thing ? )
If so then whatever is creating the data may benefit from some type of delta checking before it saves data .
Less data generated means less to compress .
If you were to post a sample of what your uncompressed data values look like without revealing any sensitive information you may see replies with meaningful suggestions on how to change the data itself .
2. How exactly are you compressing the data ?
We 've found that doing everything possible in step 1 above results in much less data meaning smaller source size and a smaller need for compression .
But if whatever is compressing is dumb then you can turn a 500 GB archive into 1 TB which ends up costing you way more than rethinking step 1 .
Again , I may be completely missing your point here and you may have already thought of these things many times over .
Or you may be at the mercy of coders that do n't care how much data they generate .
But I think , lacking the answers to the above questions , no one here could give you the solutions that have the best bang for the buck .</tokentext>
<sentencetext>Let me preface this by saying that I don't know a couple of things:
1. How are you generating this 2-3 TB of data?
2. Are you compressing it to its absolute minimum size?
3. Actually I don't know way more than a couple of things, but I digress.
Now, having spent most of the last year trying to determine the best method for storing huge amounts of data with limited space while retaining every important detail in a complex system that changes every few seconds --(deep breath)-- I can say that in OUR case, we found some very efficient ways to reduce the storage requirement at the front end instead of throwing money at the backend.
Before I start rambling I will repeat the disclaimer regarding my total ignorance of your data situation, and your environment in general.
1. When you generate 2-3 TB of data per customer, what creates this data?
Is there some type of progression or versioning in play (like a new value for a particular datapoint every millisecond or some such thing?)
If so then whatever is creating the data may benefit from some type of delta checking before it saves data.
Less data generated means less to compress.
If you were to post a sample of what your uncompressed data values look like without revealing any sensitive information you may see replies with meaningful suggestions on how to change the data itself.
2. How exactly are you compressing the data?
We've found that doing everything possible in step 1 above results in much less data meaning smaller source size and a smaller need for compression.
But if whatever is compressing is dumb then you can turn a 500 GB archive into 1 TB which ends up costing you way more than rethinking step 1.
Again, I may be completely missing your point here and you may have already thought of these things many times over.
Or you may be at the mercy of coders that don't care how much data they generate.
But I think, lacking the answers to the above questions, no one here could give you the solutions that have the best bang for the buck.</sentencetext>
</comment>
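To make the delta-checking idea concrete, here is a small sketch (awk; the one-numeric-sample-per-line format is an assumption, not from the thread). Storing successive differences instead of raw values often makes a slowly changing series far more compressible:
<pre>
# Emit the first value as-is, then only the change from the previous sample.
awk 'NR==1 {prev=$1; print $1; next} {print $1-prev; prev=$1}' raw.txt > deltas.txt

# Deltas of a smooth series are mostly small, repetitive numbers,
# so the compressor has much more redundancy to exploit.
gzip -9 deltas.txt
</pre>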
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352838</id>
	<title>IT Auditor Opinion</title>
	<author>Anonymous</author>
	<datestamp>1267623240000</datestamp>
	<modclass>Informativ</modclass>
	<modscore>1</modscore>
	<htmltext><p>As usual, the "Ask Slashdot" doesn't have enough information to make a proper recommendation. There are a number of factors that need to be considered.</p><p>1. How valuable is the data?<br>2. How far into the future must you be able to produce the data for your customer?<br>3. What resources do you have at your disposal to solve this problem?<br>4. How secure do you need this to be?<br>5. How quickly do you need to be able to produce the data?</p><p>Once you answer those questions, you'll have a lot more insight into how to proceed. For example, if you suspect that a customer might want you to do something with their data in the next year or two, then your current solution seems reasonable.</p><p>On the other hand, if you are contractually obligated to provide copies of the data upon request for the next 10 or 20 years, you would need to invest in proper future-proof archiving technologies and duplicate environmentally controlled storage in geographically disparate locations. You may even have to go so far as to include schemas or transform the data to more universal formats.</p><p>As a related aside, if you incur big expenses to secure a client's data, you should be charging a premium for this service. At the very least, you should be using this as a selling point.</p></htmltext>
<tokenext>As usual , the " Ask Slashdot " does n't have enough information to make a proper recommendation .
There are a number of factors that need to be considered .
1. How valuable is the data ?
2. How far into the future must you be able to produce the data for your customer ?
3. What resources do you have at your disposal to solve this problem ?
4. How secure do you need this to be ?
5. How quickly do you need to be able to produce the data ?
Once you answer those questions , you 'll have a lot more insight into how to proceed .
For example , if you suspect that a customer might want you to do something with their data in the next year or two , then your current solution seems reasonable .
On the other hand , if you are contractually obligated to provide copies of the data upon request for the next 10 or 20 years , you would need to invest in proper future-proof archiving technologies and duplicate environmentally controlled storage in geographically disparate locations .
You may even have to go so far as to include schemas or transform the data to more universal formats .
As a related aside , if you incur big expenses to secure a client 's data , you should be charging a premium for this service .
At the very least , you should be using this as a selling point .</tokentext>
<sentencetext>As usual, the "Ask Slashdot" doesn't have enough information to make a proper recommendation.
There are a number of factors that need to be considered.
1. How valuable is the data?
2. How far into the future must you be able to produce the data for your customer?
3. What resources do you have at your disposal to solve this problem?
4. How secure do you need this to be?
5. How quickly do you need to be able to produce the data?
Once you answer those questions, you'll have a lot more insight into how to proceed.
For example, if you suspect that a customer might want you to do something with their data in the next year or two, then your current solution seems reasonable.
On the other hand, if you are contractually obligated to provide copies of the data upon request for the next 10 or 20 years, you would need to invest in proper future-proof archiving technologies and duplicate environmentally controlled storage in geographically disparate locations.
You may even have to go so far as to include schemas or transform the data to more universal formats.
As a related aside, if you incur big expenses to secure a client's data, you should be charging a premium for this service.
At the very least, you should be using this as a selling point.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351994</id>
	<title>Beware RAID</title>
	<author>meburke</author>
	<datestamp>1267618260000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>As far as I know the 2TB Raid problem hasn't been fixed. <a href="http://blogs.zdnet.com/storage/?p=162" title="zdnet.com">http://blogs.zdnet.com/storage/?p=162</a> [zdnet.com] If anyone knows differently, please let me know.<br>I've been using a drive docking station and splitting my backups for large databases.</p></htmltext>
<tokenext>As far as I know the 2TB Raid problem has n't been fixed .
http : //blogs.zdnet.com/storage/ ? p = 162 [ zdnet.com ] If anyone knows differently , please let me know .
I 've been using a drive docking station and splitting my backups for large databases .</tokentext>
<sentencetext>As far as I know the 2TB Raid problem hasn't been fixed.
http://blogs.zdnet.com/storage/?p=162 [zdnet.com] If anyone knows differently, please let me know.
I've been using a drive docking station and splitting my backups for large databases.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351344</id>
	<title>bzip2</title>
	<author>Anonymous</author>
	<datestamp>1267615140000</datestamp>
	<modclass>Funny</modclass>
	<modscore>5</modscore>
	<htmltext><p>And optar:</p><p><a href="http://ronja.twibright.com/optar/" title="twibright.com">http://ronja.twibright.com/optar/</a> [twibright.com]</p><p>You know it makes sense.</p></htmltext>
<tokenext>And optar : http : //ronja.twibright.com/optar/ [ twibright.com ] You know it makes sense .</tokentext>
<sentencetext>And optar: http://ronja.twibright.com/optar/ [twibright.com] You know it makes sense.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31360282</id>
	<title>Thanks everyone!</title>
	<author>hawkeyeMI</author>
	<datestamp>1267729020000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>Thank you all for your replies, for allowing me to benefit from your collective experience. Both the straight-up answers to my question, and questions requesting further clarification were quite helpful.</htmltext>
<tokenext>Thank you all for your replies , for allowing me to benefit from your collective experience .
Both the straight-up answers to my question , and questions requesting further clarification were quite helpful .</tokentext>
<sentencetext>Thank you all for your replies, for allowing me to benefit from your collective experience.
Both the straight-up answers to my question, and questions requesting further clarification were quite helpful.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357382</id>
	<title>Re:LTO Tapes</title>
	<author>klocwerk</author>
	<datestamp>1267714500000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>I have to agree.<br>Investigated this for my last job, we did in fact end up doing SATA 1TB disks in a fireproof safe in the server room, but we had a lot less data to deal with than you do.<br>LTO5 should be out this year with 1.5TB native space, and it compresses very well. You could probably get one of your clients per tape.</p><p>LTO's got a long lifespan, and is readable with newer LTO tech for a few generations. There's a reason it's the industry standard backup these days.</p></htmltext>
<tokenext>I have to agree .
Investigated this for my last job , we did in fact end up doing SATA 1TB disks in a fireproof safe in the server room , but we had a lot less data to deal with than you do .
LTO5 should be out this year with 1.5TB native space , and it compresses very well .
You could probably get one of your clients per tape .
LTO 's got a long lifespan , and is readable with newer LTO tech for a few generations .
There 's a reason it 's the industry standard backup these days .</tokentext>
<sentencetext>I have to agree.
Investigated this for my last job, we did in fact end up doing SATA 1TB disks in a fireproof safe in the server room, but we had a lot less data to deal with than you do.
LTO5 should be out this year with 1.5TB native space, and it compresses very well.
You could probably get one of your clients per tape.
LTO's got a long lifespan, and is readable with newer LTO tech for a few generations.
There's a reason it's the industry standard backup these days.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351648</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351660</id>
	<title>Re:Exactly what you're doing</title>
	<author>TheMeld</author>
	<datestamp>1267616640000</datestamp>
	<modclass>Informativ</modclass>
	<modscore>3</modscore>
<htmltext><p>The other thing to do if you want longish-term reliability is to add redundancy to whatever you're storing with a tool like par2; <a href="http://www.par2.net/" title="par2.net" rel="nofollow">http://www.par2.net/</a> [par2.net] and <a href="http://www.quickpar.org.uk/" title="quickpar.org.uk" rel="nofollow">http://www.quickpar.org.uk/</a> [quickpar.org.uk] are your friends.</p><p>Raid5 will help you if you lose a whole drive (e.g. one seizes up from sitting still for a long time); the par2 data will both allow you to verify that the data hasn't been corrupted, and if it has (e.g. a couple of sectors go bad), it will let you recover the data.</p></htmltext>
<tokenext>The other thing to do if you want longish-term reliability is to add redundancy to whatever you 're storing with a tool like par2 ; http : //www.par2.net/ [ par2.net ] and http : //www.quickpar.org.uk/ [ quickpar.org.uk ] are your friends .
Raid5 will help you if you lose a whole drive ( e.g. one seizes up from sitting still for a long time ) ; the par2 data will both allow you to verify that the data has n't been corrupted , and if it has ( e.g. a couple of sectors go bad ) , it will let you recover the data .</tokentext>
<sentencetext>The other thing to do if you want longish-term reliability is to add redundancy to whatever you're storing with a tool like par2; http://www.par2.net/ [par2.net] and http://www.quickpar.org.uk/ [quickpar.org.uk] are your friends.
Raid5 will help you if you lose a whole drive (e.g. one seizes up from sitting still for a long time); the par2 data will both allow you to verify that the data hasn't been corrupted, and if it has (e.g. a couple of sectors go bad), it will let you recover the data.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342</parent>
</comment>
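A minimal sketch of that par2 workflow (par2cmdline syntax; the 10% redundancy level and file names are illustrative):
<pre>
# Create recovery blocks with ~10% redundancy alongside the archive.
par2 create -r10 archive.tar.gz.par2 archive.tar.gz

# Years later: verify integrity, and repair if a few sectors have rotted.
par2 verify archive.tar.gz.par2
par2 repair archive.tar.gz.par2
</pre>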
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351938</id>
	<title>Don't use raid</title>
	<author>myforwik</author>
	<datestamp>1267618020000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
<htmltext>I don't know why people would even suggest using raid for backing up.

Hard disks that sit around for long periods of time tend to fail outright because of the bearing lubricant. 2-3TB isn't much. Why have the hassle of raid? You want it future-proof. You don't want to be tied to a specific raid card, plugged in the right way, etc., just to get data off; you want to be able to slap the drive in anywhere and read it. I would just make 3 copies on 3 different brands of hard disks and store them at 3 different places.</htmltext>
<tokenext>I do n't know why people would even suggest using raid for backing up .
Hard disks that sit around for long periods of time tend to fail outright because of the bearing lubricant .
2-3TB is n't much .
Why have the hassle of raid ?
You want it future-proof .
You do n't want to be tied to a specific raid card , plugged in the right way , etc. , just to get data off ; you want to be able to slap the drive in anywhere and read it .
I would just make 3 copies on 3 different brands of hard disks and store them at 3 different places .
<sentencetext>I don't know why people would even suggest using raid for backing up.
Hard disks that sit around for long periods of time tend to fail outright because of the bearing lubricant.
2-3TB isn't much.
Why have the hassle of raid?
You want it future-proof.
You don't want to be tied to a specific raid card, plugged in the right way, etc., just to get data off; you want to be able to slap the drive in anywhere and read it.
I would just make 3 copies on 3 different brands of hard disks and store them at 3 different places.
</comment>
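A sketch of that three-independent-copies approach with verification bolted on (rsync and sha256sum; the mount points are placeholders, not from the comment):
<pre>
# Build a checksum manifest of the archive set once.
cd /archive
find . -type f -print0 | xargs -0 sha256sum > /tmp/manifest.sha256

# Clone to three separately stored drives and verify each copy in full.
for dest in /mnt/driveA /mnt/driveB /mnt/driveC; do
    rsync -a /archive/ "$dest"/
    (cd "$dest" && sha256sum -c /tmp/manifest.sha256)
done
</pre>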
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31358296</id>
	<title>May be the best solution,Business/Enterprise class</title>
	<author>Anonymous</author>
	<datestamp>1267720440000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
<htmltext><p>http://www.permabit.com/products/roi-calculator.asp</p><p>Not sure of your budget, but I used to work at Permabit.<br>Archive storage, always online, with in-line de-duplication and RAIN technology (250X more reliable than RAID 6), all at less than $1 per GB -- that's right, less than the cost of a medium coffee per day. Not happy that I lost my job, but hey, their stuff works, and is quite impressive (MIT startup, in Cambridge, MA).<br>The technology is hardware agnostic; new 1rmu units can be added/removed as technology improves or storage needs increase. A new node's capacity is automatically added to the system once plugged in! Data remains accessible throughout.</p></htmltext>
<tokenext>http : //www.permabit.com/products/roi-calculator.asp
Not sure of your budget , but I used to work at Permabit .
Archive storage , always online , with in-line de-duplication and RAIN technology ( 250X more reliable than RAID 6 ) , all at less than $ 1 per GB , that 's right , less than the cost of a medium coffee per day .
Not happy that I lost my job , but hey , their stuff works , and is quite impressive ( MIT startup , in Cambridge , MA ) .
The technology is hardware agnostic ; new 1rmu units can be added/removed as technology improves or storage needs increase .
A new node 's capacity is automatically added to the system once plugged in !
Data remains accessible throughout .</tokentext>
<sentencetext>http://www.permabit.com/products/roi-calculator.asp
Not sure of your budget, but I used to work at Permabit.
Archive storage, always online, with in-line de-duplication and RAIN technology (250X more reliable than RAID 6), all at less than $1 per GB, that's right, less than the cost of a medium coffee per day.
Not happy that I lost my job, but hey, their stuff works, and is quite impressive (MIT startup, in Cambridge, MA).
The technology is hardware agnostic; new 1rmu units can be added/removed as technology improves or storage needs increase.
A new node's capacity is automatically added to the system once plugged in!
Data remains accessible throughout.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351924</id>
	<title>Re:bzip2</title>
	<author>Anonymous</author>
	<datestamp>1267617960000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>This would be even better combined with microfiche technology, but bypass the print to paper.  Just print straight onto film.</p></htmltext>
<tokenext>This would be even better combined with microfiche technology , but bypass the print to paper .
Just print straight onto film .</tokentext>
<sentencetext>This would be even better combined with microfiche technology, but bypass the print to paper.
Just print straight onto film.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351344</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352260</id>
	<title>Hard Disks are best density</title>
	<author>physburn</author>
	<datestamp>1267619520000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
<htmltext>Hard disks have risen in density so far that tapes and optical drives just haven't kept up. I would use a RAID 6 array of the disks with the best bits-per-buck score, probably 500GB right now. If it's very important, consider having a copy server with the disks colocated at a rack space provider.
<p>
---
</p><p>
<a href="http://www.feeddistiller.com/blogs/Data\%20Integrity/feed.html" title="feeddistiller.com">Data Integrity</a> [feeddistiller.com] Feed @ <a href="http://www.feeddistiller.com/" title="feeddistiller.com">Feed Distiller</a> [feeddistiller.com]</p></htmltext>
<tokenext>Hard disks have risen in density so far that tapes and optical drives just have n't kept up .
I would use a RAID 6 array of the disks with the best bits-per-buck score , probably 500GB right now .
If it 's very important , consider having a copy server with the disks colocated at a rack space provider .
--- Data Integrity [ feeddistiller.com ] Feed @ Feed Distiller [ feeddistiller.com ]</tokentext>
<sentencetext>Hard disks have risen in density so far that tapes and optical drives just haven't kept up.
I would use a RAID 6 array of the disks with the best bits-per-buck score, probably 500GB right now.
If it's very important, consider having a copy server with the disks colocated at a rack space provider.
---

Data Integrity [feeddistiller.com] Feed @ Feed Distiller [feeddistiller.com]</sentencetext>
</comment>
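For reference, a minimal mdadm sketch of the software-RAID6 suggestion (device names and the six-disk layout are placeholders):
<pre>
# Create a six-disk RAID6 array (survives any two drive failures).
mdadm --create /dev/md0 --level=6 --raid-devices=6 /dev/sd[b-g]
mkfs.ext4 /dev/md0

# On any other Linux box later, the array reassembles itself from the
# metadata stored on the member disks -- no reinitialization needed.
mdadm --assemble --scan
</pre>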
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351454</id>
	<title>IBM Information Archive</title>
	<author>Anonymous</author>
	<datestamp>1267615740000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Depending on what you're doing, you could consider using a basic version of IBM Information Archive: <a href="http://www.ibm.com/systems/storage/disk/archive/" title="ibm.com" rel="nofollow">http://www.ibm.com/systems/storage/disk/archive/</a> [ibm.com]<br>It scales up to 304 TB (Raw Capacity)</p></htmltext>
<tokenext>Depending on what you 're doing , you could consider using a basic version of IBM Information Archive : http : //www.ibm.com/systems/storage/disk/archive/ [ ibm.com ] It scales up to 304 TB ( Raw Capacity )</tokentext>
<sentencetext>Depending on what you're doing, you could consider using a basic version of IBM Information Archive: http://www.ibm.com/systems/storage/disk/archive/ [ibm.com]
It scales up to 304 TB (Raw Capacity)</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353454</id>
	<title>Re:Use RAID6 not RAID5</title>
	<author>turbidostato</author>
	<datestamp>1267628520000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>"the biggest problem with RAID is that all the drives must be stored in the same physical box."</p><p>Because?  Say you have a RAID5 with five disks.  You make your copy, turn off the system and send each disk to a different continent.  Where's the problem?</p></htmltext>
<tokenext>" the biggest problem with RAID is that all the drives must be stored in the same physical box. " Because ?
Say you have a RAID5 with five disks .
You make your copy , turn off the system and send each disk to a different continent .
Where 's the problem ?</tokentext>
<sentencetext>"the biggest problem with RAID is that all the drives must be stored in the same physical box."Because?
Say you have a RAID5 with five disks.
You make your copy, turn off the system and send each disk to a different continent.
Where's the problem?</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353130</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353080</id>
	<title>Paper tape.</title>
	<author>RyuuzakiTetsuya</author>
	<datestamp>1267625040000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Never goes bad, never has a bad bit.</p></htmltext>
<tokenext>Never goes bad , never has a bad bit .</tokentext>
<sentencetext>Never goes bad, never has a bad bit.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31359462</id>
	<title>Recheck the drives</title>
	<author>DrYak</author>
	<datestamp>1267725540000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p><div class="quote"><p>This way if there is a problem with a particular batch of drives it won't ruin everything.</p></div><p>Data that can be digitally copied is pretty much immortal.<br>Mechanical hard-drive are not.</p><p>I would go for :<br>- RAID, too. Preferably RAID6 (better data survival and check-ability in case of failure)<br>- Some way to check sum that data themselves (in check summed file systems or archives).<br>- Periodically get the harddrives out of storage and do I full check on them (check drive's SMART status, check driver's surface readability, check's array's parity, check files checksums).<br>- Every now and then, replace the drives (Either copy and swap one drive at a time inside the array, or migrate the whole content to a bigger array, while checking everything at the same time).</p></div>
	</htmltext>
<tokenext>This way if there is a problem with a particular batch of drives it wo n't ruin everything .
Data that can be digitally copied is pretty much immortal .
Mechanical hard drives are not .
I would go for :
- RAID , too . Preferably RAID6 ( better data survival and checkability in case of failure ) .
- Some way to checksum the data itself ( in checksummed file systems or archives ) .
- Periodically get the hard drives out of storage and do a full check on them ( check the drive 's SMART status , check the drive 's surface readability , check the array 's parity , check the files ' checksums ) .
- Every now and then , replace the drives ( either copy and swap one drive at a time inside the array , or migrate the whole content to a bigger array , while checking everything at the same time ) .</tokentext>
<sentencetext>This way if there is a problem with a particular batch of drives it won't ruin everything.
Data that can be digitally copied is pretty much immortal.
Mechanical hard drives are not.
I would go for:
- RAID, too. Preferably RAID6 (better data survival and checkability in case of failure).
- Some way to checksum the data itself (in checksummed file systems or archives).
- Periodically get the hard drives out of storage and do a full check on them (check the drive's SMART status, check the drive's surface readability, check the array's parity, check the files' checksums).
- Every now and then, replace the drives (either copy and swap one drive at a time inside the array, or migrate the whole content to a bigger array, while checking everything at the same time).
	</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351410</parent>
</comment>
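A sketch of that periodic recheck routine as commands (smartmontools, badblocks, and Linux md assumed; device names are placeholders):
<pre>
# 1. SMART health summary plus a long self-test.
smartctl -H /dev/sdb
smartctl -t long /dev/sdb

# 2. Non-destructive read-only surface scan.
badblocks -sv /dev/sdb

# 3. Have md read and verify parity across the whole array.
echo check > /sys/block/md0/md/sync_action

# 4. Re-verify file checksums against the manifest made at archive time.
sha256sum -c manifest.sha256
</pre>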
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354430</id>
	<title>Another Option / Definition issues</title>
	<author>bruciferofbrm</author>
	<datestamp>1267637160000</datestamp>
	<modclass>Interestin</modclass>
	<modscore>3</modscore>
<htmltext><p>A problem I have here is the definition of 'long term'.  To each of us it means something different.</p><p>In my job I have to archive 1.6 terabytes of data per day, and keep it around for 45 days (which, BTW, is not my definition of LONG TERM). For this task I utilize Data Domain storage, which utilizes data deduplication techniques for massive compression.</p><p>What you find is that at the block level your data may in fact be incredibly deduplicatable. In my case it is very much the situation. I am currently storing 86 terabytes of rolling archives within 2.5 terabytes of physical disk space.</p><p>The problem with any technology you use for 'long term' storage is the ability to read those archives later. Assuming the media doesn't self degrade inside of the time frame you call 'long term', you must have the tools to read that media again. If you use BluRay, then you must store a compatible drive with it. (Nothing says Sony will not change the standard in two years and make all current drives obsolete, so no one makes them any more). Tape is worse, in that within two major model revisions, drives won't be able to read your media because its density is too low for the new drive head technology. Hardware-based disk raid has the issue that the controller the raid was built with needs to stay with that raid. Another controller from the same manufacturer, with the same model number but a different firmware revision, may not be able to figure out the raid, and may declare the drives empty. Software raid is a little easier to deal with as long as you keep a copy of the OS you used to create it with in the same box. But then, during your defined 'long term' period, will you still have access to a system you can even plug these drives into, or run the OS on?</p><p>What you end up dealing with in reality is that as an archivist, you either ignore these facts, or you invest in a constant media / technology refresh and spend large amounts of time keeping your archives on the latest storage available.</p><p>Of course, all this falls apart if your definition of 'long term' isn't as long as some will project. In my case, my archives roll over every 45 days. I could easily keep that data alive for years on a live piece of hardware with a service contract. If I do not trust that hardware enough, I can buy two and replicate between them. (Which, actually, I am, for disaster recovery purposes.)</p><p>With deduplication, my (acknowledged) high initial investment quickly pays for itself compared to single-purpose drives holding one copy and wasting unused space. My purchase cost was less than $60k, but if I had to store all of that data in its raw form, my costs would be in the millions. However, if the data is not deduplicatable, then of course it is a moot point.</p><p>Each answer has its flaws. You decide which risks are acceptable, plan your best to deal with obsolescence, and define your definition of 'long term'. You also have to be ready to change your solution when the one you choose today fails to be the right solution for your needs in 5 years.</p></htmltext>
<tokenext>A problem I have here is the definition of 'long term' .
To each of us it means something different .
In my job I have to archive 1.6 terabytes of data per day , and keep it around for 45 days ( which , BTW , is not my definition of LONG TERM ) .
For this task I utilize Data Domain storage , which utilizes data deduplication techniques for massive compression .
What you find is that at the block level your data may in fact be incredibly deduplicatable .
In my case it is very much the situation .
I am currently storing 86 terabytes of rolling archives within 2.5 terabytes of physical disk space .
The problem with any technology you use for 'long term ' storage is the ability to read those archives later .
Assuming the media does n't self degrade inside of the time frame you call 'long term ' , you must have the tools to read that media again .
If you use BluRay , then you must store a compatible drive with it .
( Nothing says Sony will not change the standard in two years and make all current drives obsolete , so no one makes them any more ) .
Tape is worse , in that within two major model revisions , drives wo n't be able to read your media because its density is too low for the new drive head technology .
Hardware-based disk raid has the issue that the controller the raid was built with needs to stay with that raid .
Another controller from the same manufacturer , with the same model number but a different firmware revision , may not be able to figure out the raid , and may declare the drives empty .
Software raid is a little easier to deal with as long as you keep a copy of the OS you used to create it with in the same box .
But then , during your defined 'long term ' period , will you still have access to a system you can even plug these drives into , or run the OS on ?
What you end up dealing with in reality is that as an archivist , you either ignore these facts , or you invest in a constant media / technology refresh and spend large amounts of time keeping your archives on the latest storage available .
Of course , all this falls apart if your definition of 'long term ' is n't as long as some will project .
In my case , my archives roll over every 45 days .
I could easily keep that data alive for years on a live piece of hardware with a service contract .
If I do not trust that hardware enough , I can buy two and replicate between them .
( Which , actually , I am , for disaster recovery purposes . )
With deduplication , my ( acknowledged ) high initial investment quickly pays for itself compared to single-purpose drives holding one copy and wasting unused space .
My purchase cost was less than $ 60k , but if I had to store all of that data in its raw form , my costs would be in the millions .
However , if the data is not deduplicatable , then of course it is a moot point .
Each answer has its flaws .
You decide which risks are acceptable , plan your best to deal with obsolescence , and define your definition of 'long term' .
You also have to be ready to change your solution , when the one you choose today , fails to be the right solution for your needs in 5 years .</tokentext>
<sentencetext>A problem I have here is the definition of 'long term'.
To each of us it means something different.
In my job I have to archive 1.6 terabytes of data per day, and keep it around for 45 days (which, BTW, is not my definition of LONG TERM).
For this task I utilize Data Domain storage, which utilizes data deduplication techniques for massive compression.
What you find is that at the block level your data may in fact be incredibly deduplicatable.
In my case it is very much the situation.
I am currently storing 86 terabytes of rolling archives within 2.5 terabytes of physical disk space.
The problem with any technology you use for 'long term' storage is the ability to read those archives later.
Assuming the media doesn't self degrade inside of the time frame you call 'long term', you must have the tools to read that media again.
If you use BluRay, then you must store a compatible drive with it.
(Nothing says Sony will not change the standard in two years and make all current drives obsolete, so no one makes them any more).
Tape is worse, in that within two major model revisions, drives won't be able to read your media because its density is too low for the new drive head technology.
Hardware-based disk raid has the issue that the controller the raid was built with needs to stay with that raid.
Another controller from the same manufacturer, with the same model number but a different firmware revision, may not be able to figure out the raid, and may declare the drives empty.
Software raid is a little easier to deal with as long as you keep a copy of the OS you used to create it with in the same box.
But then, during your defined 'long term' period, will you still have access to a system you can even plug these drives into, or run the OS on?
What you end up dealing with in reality is that as an archivist, you either ignore these facts, or you invest in a constant media / technology refresh and spend large amounts of time keeping your archives on the latest storage available.
Of course, all this falls apart if your definition of 'long term' isn't as long as some will project.
In my case, my archives roll over every 45 days.
I could easily keep that data alive for years on a live piece of hardware with a service contract.
If I do not trust that hardware enough, I can buy two and replicate between them.
(Which, actually, I am, for disaster recovery purposes.)
With deduplication, my (acknowledged) high initial investment quickly pays for itself compared to single-purpose drives holding one copy and wasting unused space.
My purchase cost was less than $60k, but if I had to store all of that data in its raw form, my costs would be in the millions.
However, if the data is not deduplicatable, then of course it is a moot point.
Each answer has its flaws.
You decide which risks are acceptable, plan your best to deal with obsolescence, and define your definition of 'long term'.
You also have to be ready to change your solution, when the one you choose today, fails to be the right solution for your needs in 5 years.</sentencetext>
</comment>
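A rough way to test whether your own archives are "incredibly deduplicatable" before buying anything (a hedged sketch; the 4 KiB fixed block size is an assumption, and real dedup engines use smarter variable-size blocking):
<pre>
# Chop a sample file into 4 KiB chunks, hash each chunk, and count duplicates.
mkdir -p /tmp/chunks
split -b 4096 -d -a 8 sample.dat /tmp/chunks/chunk.
sha256sum /tmp/chunks/chunk.* | awk '{print $1}' | sort | uniq -c | sort -rn | head

# Many high counts means many identical blocks, i.e. good dedup potential.
</pre>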
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352222</id>
	<title>Hard Copies</title>
	<author>proc\_tarry</author>
	<datestamp>1267619340000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
<htmltext>Print it out on a tractor feed dot-matrix printer to make sure the data set stays collated.  Maybe even use carbon copy paper to have a backup around.  Store it in an Iron Mountain (or under your desk).  Seriously, who's going to know 100 years from now what a Blu-Ray is?</htmltext>
<tokenext>Print it out on a tractor feed dot-matrix printer to make sure the data set stays collated .
Maybe even use carbon copy paper to have a backup around .
Store it in an Iron Mountain ( or under your desk ) .
Seriously , who 's going to know 100 years from now what a Blu-Ray is ?</tokentext>
<sentencetext>Print it out on a tractor feed dot-matrix printer to make sure the data set stays collated.
Maybe even use carbon copy paper to have a backup around.
Store it in an Iron Mountain (or under your desk).
Seriously, who's going to know 100 years from now what a Blu-Ray is?</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351648</id>
	<title>LTO Tapes</title>
	<author>Anonymous</author>
	<datestamp>1267616640000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>The only answer here is LTO tape stored at a contracted record archival facility.  Optical media degrades and is easily damaged, hard drives fail ALL THE TIME and will have obsolete interfaces in a few years.  Tape has very long shelf life when stored properly -- it is time tested and trusted.  It is not that expensive to get one tape drive and a few carts for each customer.</p></htmltext>
<tokenext>The only answer here is LTO tape stored at a contracted record archival facility .
Optical media degrades and is easily damaged , hard drives fail ALL THE TIME and will have obsolete interfaces in a few years .
Tape has very long shelf life when stored properly -- it is time tested and trusted .
It is not that expensive to get one tape drive and a few carts for each customer .</tokentext>
<sentencetext>The only answer here is LTO tape stored at a contracted record archival facility.
Optical media degrades and is easily damaged, hard drives fail ALL THE TIME and will have obsolete interfaces in a few years.
Tape has very long shelf life when stored properly -- it is time tested and trusted.
It is not that expensive to get one tape drive and a few carts for each customer.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352178</id>
	<title>Re:Tape is your friend</title>
	<author>hawkeyeMI</author>
	<datestamp>1267619160000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext>This may be my best option. As you mention, a tape changer (the only place I've ever seen/dealt with LTO drives) is out, but a drive and tapes sound like a good option.</htmltext>
<tokenext>This may be my best option .
As you mention , a tape changer ( the only place I 've ever seen/dealt with LTO drives ) is out , but a drive and tapes sound like a good option .</tokentext>
<sentencetext>This may be my best option.
As you mention, a tape changer (the only place I've ever seen/dealt with LTO drives) is out, but a drive and tapes sound like a good option.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351562</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357684</id>
	<title>Re:Exactly what you're doing</title>
	<author>Anonymous</author>
	<datestamp>1267716480000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
<htmltext><p>Check the MTBF ratings on the drives you are using, and you will find that RAID5 is woefully inadequate for data protection. As data sizes scale upwards, you'll need double or triple parity. I'm used to calling it RAIDZ as I'm into Solaris/OpenSolaris big time - and would say using RAIDZ2/RAIDZ3 plus the benefits of the ZFS filesystem itself ought to offer you all the protection you need.</p></htmltext>
<tokenext>Check the MTBF ratings on the drives you are using , and you will find that RAID5 is woefully inadequate for data protection .
As data sizes scale upwards , you 'll need double or triple parity .
I 'm used to calling it RAIDZ as I 'm into Solaris/OpenSolaris big time - and would say using RAIDZ2/RAIDZ3 plus the benefits of the ZFS filesystem itself ought to offer you all the protection you need .</tokentext>
<sentencetext>Check the MTBF ratings on the drives you are using, and you will find that RAID5 is woefully inadequate for data protection.
As data sizes scale upwards, you'll need double or triple parity.
I'm used to calling it RAIDZ as I'm into Solaris/OpenSolaris big time - and would say using RAIDZ2/RAIDZ3 plus the benefits of the ZFS filesystem itself ought to offer you all the protection you need.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353110</id>
	<title>Re:Amazon AWS?</title>
	<author>Mad Merlin</author>
	<datestamp>1267625340000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
<htmltext><blockquote><div><p>For comparison, this week I bought a 1TB USB 2.0 external HD for under $100, so a DIY RAID should save you money in the long run.</p></div></blockquote><p>Don't even think about using USB hard drives for this; the performance will be atrocious.</p>
	</htmltext>
<tokenext>For comparison , this week I bought a 1TB USB 2.0 external HD for under $ 100 , so a DIY RAID should save you money in the long run .
Do n't even think about using USB hard drives for this ; the performance will be atrocious .</tokentext>
<sentencetext>For comparison, this week I bought a 1TB USB 2.0 external HD for under $100, so a DIY RAID should save you money in the long run.
Don't even think about using USB hard drives for this; the performance will be atrocious.
	</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351598</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354226</id>
	<title>Roll your own or SATA SAN, then store offsite</title>
	<author>scum-o</author>
	<datestamp>1267635060000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
<htmltext><p>LTO tape is $50/400GB ($125/TB) - PAINFULLY SLOW, SEMI CHEAP<br>Disk is approx $150/2TB ($75/TB) - CHEAPEST AND PRETTY FAST<br>S3 is $153/TB/mo + xfer fees - SLOW, NOT CHEAP</p><p>Buying your own hard drives and storing them yourself is the cheapest option and probably will have very good retention.  Since you're doing this frequently, it seems that it might be worth it to buy a SATA SAN that you can mount several drives in and a *bunch* of SATA drives.  Put in 3-4 drives, raid them, copy your data (might take a while).  Put the drives somewhere safe.  If this is customer data, you can charge them a fee for data retention, so you don't have to eat the whole cost, but you'll have to put some money into the platform to begin with.  If you roll your own, ZFS might be a better option instead of Linux's software raid because you can turn on compression and move data around if you need to.  Getting something going with hot-plug drives (a PC chassis or SAN or whatnot) might also be a good investment.  You may be able to re-use the drives after a while.  Drive costs will also drop over time and you'll be able to buy 4TB and 8TB drives for the same price in a year or less too.  After a year or two, we're talking a few bucks to store this amount of data on your own, fast media.</p><p>As far as storage, a safe deposit box will only work for so many drives.  Might look into <a href="http://www.ironmountain.com/" title="ironmountain.com" rel="nofollow">http://www.ironmountain.com/</a> [ironmountain.com] for secure off-site storage, or just encrypt the data and take it home (off-site) with you or one of your employees so it's physically in more than one location.</p></htmltext>
<tokenext>LTO tape is $ 50/400GB ( $ 125/TB ) - PAINFULLY SLOW , SEMI CHEAP
Disk is approx $ 150/2TB ( $ 75/TB ) - CHEAPEST AND PRETTY FAST
S3 is $ 153/TB/mo + xfer fees - SLOW , NOT CHEAP
Buying your own hard drives and storing them yourself is the cheapest option and probably will have very good retention .
Since you 're doing this frequently , it seems that it might be worth it to buy a SATA SAN that you can mount several drives in and a * bunch * of SATA drives .
Put in 3-4 drives , raid them , copy your data ( might take a while ) .
Put the drives somewhere safe .
If this is customer data , you can charge them a fee for data retention , so you do n't have to eat the whole cost , but you 'll have to put some money into the platform to begin with .
If you roll your own , ZFS might be a better option instead of Linux 's software raid because you can turn on compression and move data around if you need to .
Getting something going with hot-plug drives ( a PC chassis or SAN or whatnot ) might also be a good investment .
You may be able to re-use the drives after a while .
Drive costs will also drop over time and you 'll be able to buy 4TB and 8TB drives for the same price in a year or less too .
After a year or two , we 're talking a few bucks to store this amount of data on your own , fast media .
As far as storage , a safe deposit box will only work for so many drives .
Might look into http : //www.ironmountain.com/ [ ironmountain.com ] for secure off-site storage , or just encrypt the data and take it home ( off-site ) with you or one of your employees so it 's physically in more than one location .</tokentext>
<sentencetext>LTO tape is $50/400GB ($125/TB) - PAINFULLY SLOW, SEMI CHEAP
Disk is approx $150/2TB ($75/TB) - CHEAPEST AND PRETTY FAST
S3 is $153/TB/mo + xfer fees - SLOW, NOT CHEAP
Buying your own hard drives and storing them yourself is the cheapest option and probably will have very good retention.
Since you're doing this frequently, it seems that it might be worth it to buy a SATA SAN that you can mount several drives in and a *bunch* of SATA drives.
Put in 3-4 drives, raid them, copy your data (might take a while).
Put the drives somewhere safe.
If this is customer data, you can charge them a fee for data retention, so you don't have to eat the whole cost, but you'll have to put some money into the platform to begin with.
If you roll your own, ZFS might be a better option instead of Linux's software raid because you can turn on compression and move data around if you need to.
Getting something going with hot-plug drives (a PC chassis or SAN or whatnot) might also be a good investment.
You may be able to re-use the drives after a while.
Drive costs will also drop over time and you'll be able to buy 4TB and 8TB drives for the same price in a year or less too.
After a year or two, we're talking a few bucks to store this amount of data on your own, fast media.
As far as storage, a safe deposit box will only work for so many drives.
Might look into http://www.ironmountain.com/ [ironmountain.com] for secure off-site storage, or just encrypt the data and take it home (off-site) with you or one of your employees so it's physically in more than one location.</sentencetext>
</comment>
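A minimal sketch of the encrypt-before-it-leaves-the-building step (gpg in symmetric mode; file names are placeholders):
<pre>
# Encrypt the archive so an off-site or at-home copy is safe if lost or stolen.
gpg --symmetric --cipher-algo AES256 -o customer42.tar.gz.gpg customer42.tar.gz

# Decrypt when the data is needed again (prompts for the passphrase).
gpg -d -o customer42.tar.gz customer42.tar.gz.gpg
</pre>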
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352322</id>
	<title>ZFS?</title>
	<author>Anonymous</author>
	<datestamp>1267619820000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
<htmltext><p>Seriously, multi-petabyte disk array on RAIDZ w/ ZFS. Use OpenSolaris instead of Solaris and it's as cheap as, and technically superior to, a Linux RAID solution. This is a no brainer. Wasn't there a slashdot story recently about how rackspace builds their disk arrays? (hardware)</p></htmltext>
<tokenext>Seriously , multi-petabyte disk array on RAIDZ w/ ZFS .
Use OpenSolaris instead of Solaris and it 's as cheap as , and technically superior to , a Linux RAID solution .
This is a no brainer .
Was n't there a slashdot story recently about how rackspace builds their disk arrays ?
( hardware )</tokentext>
<sentencetext>Seriously, multi-petabyte disk array on RAIDZ w/ ZFS.
Use OpenSolaris instead of Solaris and it's as cheap as, and technically superior to, a Linux RAID solution.
This is a no brainer.
Wasn't there a slashdot story recently about how rackspace builds their disk arrays?
(hardware)</sentencetext>
</comment>
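A hedged sketch of the RAIDZ suggestion (ZFS command syntax; the pool name, disk count, and Solaris-style device names are placeholders):
<pre>
# Double-parity pool across six disks, with transparent compression on top.
zpool create tank raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0
zfs set compression=on tank

# Periodic integrity pass: scrub reads and checksums every block in the pool.
zpool scrub tank
zpool status tank
</pre>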
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353382</id>
	<title>Re:Amazon AWS?</title>
	<author>Unequivocal</author>
	<datestamp>1267627980000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
<htmltext><p>Not to mention he'll need a fat symmetric pipe to push TB's of data up to amazon and that's not free either. A 10 Mbps pipe takes around 12 days to push 1,000 GB (about 9.3 days at the raw line rate, more with overhead), if I'm doing the math right.</p></htmltext>
<tokenext>Not to mention he 'll need a fat symmetric pipe to push TB 's of data up to amazon and that 's not free either .
A 10 Mbps pipe takes around 12 days to push 1,000 GB ( about 9.3 days at the raw line rate , more with overhead ) , if I 'm doing the math right .</tokentext>
<sentencetext>Not to mention he'll need a fat symmetric pipe to push TB's of data up to amazon and that's not free either.
A 10 Mbps pipe takes around 12 days to push 1,000 GB (about 9.3 days at the raw line rate, more with overhead), if I'm doing the math right.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351598</parent>
</comment>
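Checking that estimate with shell arithmetic (raw line rate, no protocol overhead):
<pre>
# 1,000 GB = 8,000,000 megabits; at 10 Mbps that is 800,000 seconds.
echo "scale=1; 1000 * 8 * 1000 / 10 / 86400" | bc   # prints 9.2 (days)

# So ~9.3 days is the theoretical floor; 12 days with real-world overhead
# and less-than-full link utilization is entirely plausible.
</pre>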
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357774</id>
	<title>Data de-duplication, compression, plus tape</title>
	<author>Anonymous</author>
	<datestamp>1267717080000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>You should consider some new and novel ways to de-duplicate your data at the blocklet level.  Say you generate a few terabytes per customer.  I betcha there's a fair amount of that data that overlaps, but not at the file level.  Let a well managed de-duplication engine reduce it, then save the greatly reduced data on disk, or even tape.  Here's one such product, backed by a company that has dedicated itself to this new market:</p><p>http://www.quantum.com/Products/Disk-BasedBackup/DXi6500/Index.aspx</p></htmltext>
<tokenext>You should consider some new and novel ways to de-duplicate your data at the blocklet level .
Say you generate a few terabytes per customer .
I betcha there 's a fair amount of that data that overlaps , but not at the file level .
Let a well managed de-duplication engine reduce it , then save the greatly reduced data on disk , or even tape .
Here 's one such product , backed by a company that has dedicated itself to this new market : http : //www.quantum.com/Products/Disk-BasedBackup/DXi6500/Index.aspx</tokentext>
<sentencetext>You should consider some new and novel ways to de-duplicate your data at the blocklet level.
Say you generate a few terabytes per customer.
I betcha there's a fair amount of that data that overlaps, but not at the file level.
Let a well managed de-duplication engine reduce it, then save the greatly reduced data on disk, or even tape.
Here's one such product, backed by a company that has dedicated itself to this new market: http://www.quantum.com/Products/Disk-BasedBackup/DXi6500/Index.aspx</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352470</id>
	<title>Re:Tape is your friend</title>
	<author>BikeHelmet</author>
	<datestamp>1267620600000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p><div class="quote"><p>Seriously, you're not going to find a reasonable way of storing that much data anywhere else.</p></div><p>HDDs certainly are looking appealing. 4x1.5TB for $400.</p><p>Just don't let them sit for more than 5 years without copying stuff off to new ones.</p><p>I'd probably just keep them in a working but offline storage computer. You can build one for cheap($400) and stick 12+ drives in. If you ever need data off, plug in the gigabit ethernet. Of course, if the data has to be locked away in a safety deposit box, that completely rules out this solution - but it should be alright if your office or servers are secured. Keep the storage systems in a locked room and encrypt the drives to protect against theft.</p><p>Disclaimer: I try to take security seriously, but I've never been in a situation where it was absolutely required. Someone here can probably poke holes in my suggestion.<nobr> <wbr></nobr>:P</p></div>
	</htmltext>
<tokenext>Seriously , you 're not going to find a reasonable way of storing that much data anywhere else .
HDDs certainly are looking appealing .
4x1.5TB for $ 400 .
Just do n't let them sit for more than 5 years without copying stuff off to new ones .
I 'd probably just keep them in a working but offline storage computer .
You can build one for cheap ( $ 400 ) and stick 12 + drives in .
If you ever need data off , plug in the gigabit ethernet .
Of course , if the data has to be locked away in a safety deposit box , that completely rules out this solution - but it should be alright if your office or servers are secured .
Keep the storage systems in a locked room and encrypt the drives to protect against theft.Disclaimer : I try to take security seriously , but I 've never been in a situation where it was absolutely required .
Someone here can probably poke holes in my suggestion .
: P</tokentext>
<sentencetext>Seriously, you're not going to find a reasonable way of storing that much data anywhere else.HDDs certainly are looking appealing.
4x1.5TB for $400.Just don't let them sit for more than 5 years without copying stuff off to new ones.I'd probably just keep them in a working but offline storage computer.
You can build one for cheap($400) and stick 12+ drives in.
If you ever need data off, plug in the gigabit ethernet.
Of course, if the data has to be locked away in a safety deposit box, that completely rules out this solution - but it should be alright if your office or servers are secured.
Keep the storage systems in a locked room and encrypt the drives to protect against theft.Disclaimer: I try to take security seriously, but I've never been in a situation where it was absolutely required.
Someone here can probably poke holes in my suggestion.
:P
	</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351562</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357694</id>
	<title>s3 or jungledisk or ???</title>
	<author>peril</author>
	<datestamp>1267716540000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>Take a look at S3 or jungledisk and see if you can somehow have that make sense. Mozy might not be a bad idea either. The big tradeoff is:</p><p>1) limited SLAs (privacy / latency / bandwidth) for getting to your data; once you host with a provider and accept their physical / logical storage footprint, you are constrained to living in their hosting model<br>vs.<br>2) providing quick access to your dataset b/c you have special sauce.</p><p>The timing / requirements for getting back to the data are the things that should drive your behavior. It might make sense to turn the data over to the customer - so when they need you to work on it, they provide it back.</p><p>I think you might be looking at this all wrong - why not redo the analysis and charge for the whole thing again? (According to the RIAA/MPAA, isn't that what should happen when we scratch our movies or music?)</p><p>--Adrian</p></htmltext>
<tokenext>take a look at S3 or jungledisk and see if you can somehow have that make sense .
Mozy might not be a bad idea either .
The big tradeoff is1 ) limited SLAs ( privacy / latency / bandwidth ) for getting to your data ; once you host with a provider and accept their physical / logical storage footprint - you are constrained to living in their hosting modelvs2 ) providing quick access to your dataset b/c you have special sauceThe timing / requirements to get back to the data are the things that should drive your behavior .
It might make sense to turn the data over to the customer - so when they need you to work on it- they provide it back.I think you might be looking at this all wrong - why not redo the analysis and charge for the whole thing again ?
( According to the RIAA/MPAA is n't that what should happen when we scratch our movies or music ?
) --Adrian</tokentext>
<sentencetext>take a look at S3 or jungledisk and see if you can somehow have that make sense.
Mozy might not be a bad idea either.
The big tradeoff is1) limited SLAs (privacy / latency / bandwidth) for getting to your data; once you host with a provider and accept their physical / logical storage footprint - you are constrained to living in their hosting modelvs2) providing quick access to your dataset b/c you have special sauceThe timing / requirements to get back to the data are the things that should drive your behavior.
It might make sense to turn the data over to the customer - so when they need you to work on it-  they provide it back.I think you might be looking at this all wrong - why not redo the analysis and charge for the whole thing again?
(According to the RIAA/MPAA isn't that what should happen when we scratch our movies or music?
)--Adrian</sentencetext>
</comment>
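For anyone weighing the S3 side of that tradeoff, the upload mechanics are nearly a one-liner with Amazon's current boto3 SDK (which postdates this thread). A minimal sketch; the bucket, key, and filename are hypothetical, and credentials are assumed to come from the usual AWS config or environment:

```python
import boto3

# upload_file switches to multipart upload automatically for large
# archives, which is what you want for multi-GB compressed datasets.
s3 = boto3.client("s3")
s3.upload_file(
    Filename="customer42_archive.tar.gz",    # hypothetical local archive
    Bucket="my-archive-bucket",              # hypothetical bucket name
    Key="archives/customer42/2010-03.tar.gz",
)
```

The "constrained to their hosting model" point above is the flip side: getting the data back out goes through the same API, at the provider's transfer prices and speeds.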
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353338</id>
	<title>Re:Exactly what you're doing</title>
	<author>Unequivocal</author>
	<datestamp>1267627500000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>I got the exact same advice from a watch expert about how to store my great grandfather's pocket watch. I thought maybe I should wind it every now and then to "keep it lubricated" or something. He said the best way to prevent mechanical wear and failure is to not use the machine. Makes perfect sense in hindsight.</p></htmltext>
<tokenext>I got the exact same device from a watch expert about how to store my great grandfather 's pocket watch .
I thought maybe I should wind it every now and then to " keep it lubricated " or something .
He said the best way to prevent mechanical wear and failure is to not use the machine .
Makes perfect sense in hindsight .</tokentext>
<sentencetext>I got the exact same device from a watch expert about how to store my great grandfather's pocket watch.
I thought maybe I should wind it every now and then to "keep it lubricated" or something.
He said the best way to prevent mechanical wear and failure is to not use the machine.
Makes perfect sense in hindsight.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353102</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351562</id>
	<title>Re:Tape is your friend</title>
	<author>Saint Aardvark</author>
	<datestamp>1267616280000</datestamp>
	<modclass>Informativ</modclass>
	<modscore>5</modscore>
	<htmltext><p>Couldn't agree more.  A tape library (as in autochanger) might be out of your budget, but a simple tape drive wouldn't be too much -- say $5000 for an LTO4.  Media is $50-$100 or so depending on where you shop.  Seriously, you're not going to find a reasonable way of storing that much data anywhere else.</p><p>BTW, if you're not a member of <a href="http://www.lopsa.org/" title="lopsa.org">LOPSA</a> [lopsa.org], you may want to seriously consider it.  Even if you're not a sysadmin, this is definitely a sysadmin-type question, and their mailing lists are second to none.  It's an excellent resource.</p></htmltext>
<tokenext>Could n't agree more .
A tape library ( as in autochanger ) might be out of your budget , but a simple tape drive would n't be too much -- say $ 5000 for an LTO4 .
Media is $ 50- $ 100 or so depending on where you shop .
Seriously , you 're not going to find a reasonable way of storing that much data anywhere else.BTW , if you 're not a member of LOPSA [ lopsa.org ] , you may want to seriously consider it .
Even if you 're not a sysadmin , this is definitely a sysadmin-type question , and their mailing lists are second to none .
It 's an excellent resource .</tokentext>
<sentencetext>Couldn't agree more.
A tape library (as in autochanger) might be out of your budget, but a simple tape drive wouldn't be too much -- say $5000 for an LTO4.
Media is $50-$100 or so depending on where you shop.
Seriously, you're not going to find a reasonable way of storing that much data anywhere else.BTW, if you're not a member of LOPSA [lopsa.org], you may want to seriously consider it.
Even if you're not a sysadmin, this is definitely a sysadmin-type question, and their mailing lists are second to none.
It's an excellent resource.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414</parent>
</comment>
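For a single drive without an autochanger, the write path can be as simple as streaming a tar archive at the tape device. A minimal sketch, assuming a Linux SCSI tape at the usual non-rewinding device node; the paths are hypothetical, and a real setup would add buffering (e.g. mbuffer) and a verify pass:

```python
import tarfile

DATASET = "/data/customer42"   # hypothetical dataset path
TAPE_DEV = "/dev/nst0"         # common non-rewinding tape node (assumption)

# "w|" writes an uncompressed tar stream with no seeking,
# which is exactly the access pattern tape wants.
with tarfile.open(TAPE_DEV, mode="w|") as tape:
    tape.add(DATASET, arcname="customer42")
```

LTO4's hardware compression makes compressing in software optional; streaming uncompressed keeps the drive fed at full speed.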
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352598</id>
	<title>a petabyte class cluster file system ?</title>
	<author>TravisHein</author>
	<datestamp>1267621500000</datestamp>
	<modclass>Informativ</modclass>
	<modscore>2</modscore>
	<htmltext>How about a cluster file system like <a href="http://www.moosefs.com/" title="moosefs.com" rel="nofollow">http://www.moosefs.com/</a> [moosefs.com]? It works well on commodity hardware, even hardware that is end-of-life by current standards. Redundancy is achieved through several chunk server nodes in the system. Performance and size can be dynamically scaled over time by adding more nodes or disks to the system.
<br>
The project is GPL licensed, requires very little effort to set up, and likely much less effort over time to maintain and administer than a RAID or tape system would, with the benefit of being able to choose what data is online or offline (by whether the chunk servers holding it are connected at the time).</htmltext>
<tokenext>how about a cluster file system like http : //www.moosefs.com/ [ moosefs.com ] , where it works well on commodity or even end of life by current standards of hardware .
Redundancy is achieved through several chunk server nodes in the system .
Performance and size can be dynamically scaled over time by adding more nodes or disks to the system .
Their project is GPL licensed , requires very little effort to set up , and likely much less effort over time to maintain and administer than a RAID or Tape system likely would , with the benefit of being able to choose what data is online or offline ( by its presence on chunk servers being connected at the time ) .</tokentext>
<sentencetext>how about a cluster file system like http://www.moosefs.com/ [moosefs.com] , where it works well on commodity or even end of life by current standards of hardware.
Redundancy is achieved through several chunk server nodes in the system.
Performance and size can be dynamically scaled over time by adding more nodes or disks to the system.
Their project is GPL licensed, requires very little effort to set up, and likely much less effort over time to maintain and administer than a RAID or Tape system likely would, with the benefit of being able to choose what data is online or offline (by its presence on chunk servers being connected at the time).</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352436</id>
	<title>Are you legally responsible for storing this data?</title>
	<author>Anonymous</author>
	<datestamp>1267620420000</datestamp>
	<modclass>Interestin</modclass>
	<modscore>1</modscore>
	<htmltext><p>Or are you keeping it just in case? If your customers are paying you to keep this data in archive form for them for some set period of time, then you need a solution that will meet that. Taking hard drives offline and hoping for the best is not a reliable long-term way to store data. If you leave them online then at least you can detect disk failures, but then you are at risk from accidental or malicious deletion. If you are serious about protecting your customers' data, then it needs to be written to real offline media (tape, optical, cloud), preferably in 2 copies: one on your site, one at a managed 3rd-party site that the customer can access if your business model doesn't pan out. If the data contains any sensitive information then it should be written in an encrypted format, and you should provide your customer a way to decrypt that data if you are unable to. The cloud providers have some great options for these services, but the data sizes you are talking about should be much cheaper to manage with 1-2 LTO4 drives direct-attached to a system of your choice, and a bonus is that LTO4 supports hardware encryption and is widely adopted. You may be a small business, but multi-TB data archives are normally a mid-to-large-size business problem, so the small-business online solutions that I have seen are not priced to solve your problem (but there are new ones all the time). Finally, LTO-5 will be out soon, so LTO4 prices should be coming down some over the next few months.<br>If you are not storing this data for your customers' benefit, but for your own - to reuse in future work etc. - then just decide what it's worth to you. I still think tape or maybe Blu-ray will be more reliable/dependable/repeatable than hard drives, but if it's your data, do what you want with it.</p></htmltext>
<tokenext>Or are you keeping it just in case .
If your customers are paying you to keep this data in archive form for them for some set period of time , then you need a solution that will meet that .
Taking hard drives offline and hoping for the best is not a reliable long term way to store data .
If you leave them online then at least you can detect disk failures , but then you are at risk from accidental or malicious deletion .
If you are serious about protecting your customer 's data , then it needs to be written to a real offline media ( tape , optical , cloud ) , and preferably 2 copies one on your site , one at a managed 3rd party site that the customer can access if your business model does n't pan out .
If the data contains any sensitive information then it should be written in an encrypted format , and you should provide your customer a way to unencrypt that data if you are unable .
The cloud providers have some great options for these services , but the data sizes you are talking about should be much cheaper to manage with 1-2 LTO4 drives direct attached to a system of your choice , and bonus is that LTO4 supports hardware encryption and is widely adopted .
You may be a small business but multi-TB data archives are normally a mid to large size business problem .
So , the small business online solutions that I have seen are not priced to solve your problem ( but there are new ones all the time ) .
Finally , LTO-5 will be out soon , so LTO4 prices should be coming down some over the next few months.If you are not storing this data for your customers benefit , but for your own - to reuse in future work etc - then just decide what its worth to you .
I still think tape or maybe bluray will be a more reliable / dependable / repeatable than hard drives , but if it 's your data do what you want with it .</tokentext>
<sentencetext>Or are you keeping it just in case.
If your customers are paying you to keep this data in archive form for them for some set period of time, then you need a solution that will meet that.
Taking hard drives offline and hoping for the best is not a reliable long term way to store data.
If you leave them online then at least you can detect disk failures, but then you are at risk from accidental or malicious deletion.
If you are serious about protecting your customer's data, then it needs to be written to a real offline media (tape, optical, cloud), and preferably 2 copies one on your site, one at a managed 3rd party site that the customer can access if your business model doesn't pan out.
If the data contains any sensitive information then it should be written in an encrypted format, and you should provide your customer a way to unencrypt that data if you are unable.
The cloud providers have some great options for these services, but the data sizes you are talking about should be much cheaper to manage with 1-2 LTO4 drives direct attached to a system of your choice, and bonus is that LTO4 supports hardware encryption and is widely adopted.
You may be a small business but multi-TB data archives are normally a mid to large size business problem.
So, the small business online solutions that I have seen are not priced to solve your problem (but there are new ones all the time).
Finally, LTO-5 will be out soon, so LTO4 prices should be coming down some over the next few months.If you are not storing this data for your customers benefit, but for your own - to reuse in future work etc - then just decide what its worth to you.
I still think tape or maybe bluray will be a more reliable / dependable / repeatable than hard drives, but if it’s your data do what you want with it.</sentencetext>
</comment>
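The "encrypt it, and give the customer a way to decrypt it" advice above is straightforward to sketch. A minimal illustration using the `cryptography` package's Fernet recipe (an assumption on my part, not what LTO4 hardware encryption does); filenames are hypothetical, and a real tool would stream in chunks rather than reading a large archive into memory:

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Generate a key and escrow a copy with the customer, so the data
# survives even if your business doesn't.
key = Fernet.generate_key()
fernet = Fernet(key)

with open("customer42_archive.tar.gz", "rb") as f:    # hypothetical archive
    ciphertext = fernet.encrypt(f.read())             # whole-file for brevity;
                                                      # chunk/stream for big data
with open("customer42_archive.tar.gz.enc", "wb") as f:
    f.write(ciphertext)
```

The same key decrypts with `fernet.decrypt(ciphertext)`, so the escrow copy is all the customer needs besides the encrypted media.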
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355146</id>
	<title>Re:Use RAID6 not RAID5</title>
	<author>Anonymous</author>
	<datestamp>1267645440000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>I would definitely use RAID 6 and sleep easier at night. As disks increase in size the chances of a read error increase. It won't be long before RAID 6 is obsolete and we will be using triple parity RAID.</p><p><a href="http://queue.acm.org/detail.cfm?id=1670144" title="acm.org" rel="nofollow">http://queue.acm.org/detail.cfm?id=1670144</a> [acm.org]</p></htmltext>
<tokenext>I would definitely use RAID 6 and sleep easier at night .
As disks increase in size the chances of a read error increase .
It wo n't be long before RAID 6 is obsolete and we will be using triple parity RAID.http : //queue.acm.org/detail.cfm ? id = 1670144 [ acm.org ]</tokentext>
<sentencetext>I would definitely use RAID 6 and sleep easier at night.
As disks increase in size the chances of a read error increase.
It won't be long before RAID 6 is obsolete and we will be using triple parity RAID.http://queue.acm.org/detail.cfm?id=1670144 [acm.org]</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351578</parent>
</comment>
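The "bigger disks, more read errors" point is worth quantifying, since it is the whole argument for the second parity stripe. A back-of-the-envelope calculation using the commonly quoted consumer-drive spec of one unrecoverable read error (URE) per 1e14 bits (an assumption; check your drive's datasheet):

```python
# Rebuilding a degraded RAID 5 means reading every surviving disk;
# a single URE during that read loses the array. RAID 6's second
# parity stripe covers exactly this case.
URE_RATE = 1e-14          # errors per bit (typical consumer-drive spec)
ARRAY_TB = 3              # data read during a rebuild, in TB

bits_read = ARRAY_TB * 1e12 * 8
p_clean_rebuild = (1 - URE_RATE) ** bits_read
print(f"P(at least one URE) = {1 - p_clean_rebuild:.1%}")   # ~21% for 3 TB
```

At the submitter's 2-3 TB per customer the failure odds during a rebuild are already uncomfortable, and they grow with every disk-size bump.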
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355714</id>
	<title>Re:use a tape drive</title>
	<author>Anonymous</author>
	<datestamp>1267695780000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Par2'ing 3TB would take a hell of a long time. Also, the current par2 implementations are not numerically stable: you can create a par2 set which cannot actually be used for repair.</p></htmltext>
<tokenext>Par2'ing 3TB would take a hell of a long time .
Also , the current par2 implementations are not numerically stable : you can create a par2 set which can not actually be used for repair .</tokentext>
<sentencetext>Par2'ing 3TB would take a hell of a long time.
Also, the current par2 implementations are not numerically stable: you can create a par2 set which cannot actually be used for repair.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351604</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351838</id>
	<title>Agree with the tape option...</title>
	<author>klubar</author>
	<datestamp>1267617480000</datestamp>
	<modclass>Informativ</modclass>
	<modscore>3</modscore>
	<htmltext><p>Tape is probably your best option.  You can buy a DAT-5 (or even a DAT-4) tape drive for not very much.  The tapes cost about $10 to $30 each (depending on what tape option you choose).  Make 3 copies of the data set: store one onsite, store another offsite in a secure/climate-controlled facility, and send the 3rd to the client. Buy a spare tape drive and use both to make writing across tapes easier.  There is a wide variety of software to write to the tape; we use the aging Retrospect.</p><p>The disk option is just way too complex; if anything, skip the RAID option and just store 2 copies.  Putting the RAID sets back together and finding the RAID software will be nearly impossible in a couple of years.  Use some standard formatting on the drives (FAT, NTFS, etc.) and you'll be good to go for the next 15 years.</p></htmltext>
<tokenext>Tape is probably your best option .
You can buy at DAT-5 ( or even a DAT-4 ) tape drive for not very much .
The tapes cost about $ 10 to $ 30 each ( depending on what tape option you choose ) .
Make 3 copies of the data set , store one onsite , store another offsite in a secure/climate controlled facility and send the 3rd to the client .
Buy a spare tape drive and use both to make writing across tapes easier .
There is a wide variety of software to write to the tape ; we use the aging Retrospect.The disk options is just way too complex ; if anything , skip the RAID option and just store 2 copies .
Putting the RAID sets back together and finding the RAID software will be nearly impossible in a couple of years .
Use some standard formatting on the drives ( FAT , NTFS , etc .
) and you 'll be good to go for the next 15 years .</tokentext>
<sentencetext>Tape is probably your best option.
You can buy at DAT-5 (or even a DAT-4) tape drive for not very much.
The tapes cost about $10 to $30 each (depending on what tape option you choose).
Make 3 copies of the data set, store one onsite, store another offsite in a secure/climate controlled facility and send the 3rd to the client.
Buy a spare tape drive and use both to make writing across tapes easier.
There is a wide variety of software to write to the tape; we use the aging Retrospect.The disk options is just way too complex; if anything, skip the RAID option and just store 2 copies.
Putting the RAID sets back together and finding the RAID software will be nearly impossible in a couple of years.
Use some standard formatting on the drives (FAT, NTFS, etc.
) and you'll be good to go for the next 15 years.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352206</id>
	<title>Re:Exactly what you're doing</title>
	<author>HeronBlademaster</author>
	<datestamp>1267619280000</datestamp>
	<modclass>Interestin</modclass>
	<modscore>2</modscore>
	<htmltext><p><div class="quote"><p>stay away from s3, it's not designed to protect data, despite what AWS fans may say.</p></div><p>Just curious... S3 stores all of your data at multiple, geographically separate data centers.  How exactly does that <i>not</i> protect your data?  What else would you want it to do in terms of protection?  It even gives you md5 sums of your files if you want to verify them (check the ETag attribute of each object).</p><p>So, honest question: what do you think they're missing to make S3 <i>really</i> protect data?</p></div>
	</htmltext>
<tokenext>stay away from s3 , it 's not designed to protect data , despite what AWS fans may say.Just curious... S3 stores all of your data at multiple , geographically separate data centers .
How exactly does that not protect your data ?
What else would you want it to do in terms of protection ?
It even gives you md5 sums of your files if you want to verify them ( check the ETag attribute of each object ) .So , honest question : what do you think they 're missing to make S3 really protect data ?</tokentext>
<sentencetext>stay away from s3, it's not designed to protect data, despite what AWS fans may say.Just curious... S3 stores all of your data at multiple, geographically separate data centers.
How exactly does that not protect your data?
What else would you want it to do in terms of protection?
It even gives you md5 sums of your files if you want to verify them (check the ETag attribute of each object).So, honest question: what do you think they're missing to make S3 really protect data?
	</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504</parent>
</comment>
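The ETag check the parent mentions is easy to automate on restore. A minimal sketch; the caveat is that only objects uploaded in a single PUT have an ETag equal to the object's MD5 (multipart uploads use a different scheme), and the filename and ETag value here are hypothetical:

```python
import hashlib

def md5_hex(path, chunk=1 << 20):
    """Hex MD5 of a file, read in 1 MiB chunks so large files fit in RAM."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

expected_etag = "9e107d9d372bb6826bd81d3542a419d6"  # hypothetical value from S3
ok = md5_hex("restored_archive.tar.gz") == expected_etag
print("restore verified" if ok else "checksum mismatch -- download again")
```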
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351892</id>
	<title>Re:I'd encrypt the data and...</title>
	<author>Anonymous</author>
	<datestamp>1267617780000</datestamp>
	<modclass>None</modclass>
	<modscore>0</modscore>
	<htmltext><p>Entertaining when using public computers... how about using private computers?</p><p>Throw together a bunch of computers with 10 TB or so via JBOD/RAID1... p2p between 'em.</p><p>"Official solutions" can include MS DFS, Hadoop, etc.</p></htmltext>
<tokenext>entertaining when using public computers... how about using private computers ? throw together a bunch of computers with 10tb or so via JBOD/RAID1... p2p between 'em " official solutions " can include MS DFS , Hadoop , etc .</tokentext>
<sentencetext>entertaining when using public computers... how about using private computers?throw together a bunch of computers with 10tb or so via JBOD/RAID1... p2p between 'em"official solutions" can include MS DFS, Hadoop, etc.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351566</parent>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351502</id>
	<title>Drobo fan and user</title>
	<author>Lvdata</author>
	<datestamp>1267615980000</datestamp>
	<modclass>Interestin</modclass>
	<modscore>3</modscore>
	<htmltext>You might look at www.drobo.com, which offers 4-, 5-, and 8-drive enclosures. 1 TB disks give you 3 TB of usable space with a 2-drive failure tolerance. I have the older 4-bay Drobo (2 for myself, and 2 at separate clients' offices). It is much simpler to use, will scale to your 2-3 TB use, and allows mismatched drives that normal RAID will not use. Get one enclosure to start with, and then, financing permitting, get a 2nd for Drobo redundancy. Not the fastest or cheapest, but reasonably good on both counts, and simple to use.</htmltext>
<tokenext>You might look at a www.drobo.com as a set of 4,5 and 8 drive enclosures .
1 TB disks gives you 3 TB usable space with a 2 drive failure tolerance .
I have the older 4 bay drobo ( 2 for myself , and 2 at separate clients offices ) .
It is much simpler to use , and will scale to your 2-3tb use and allow mismatched drives that normal raid will not use .
Get a enclosure to start with , and then financing permitting , get a 2nd for Drobo redundancy .
Not the fastest or cheapest , but reasonably good by both accounts , and simple to use .</tokentext>
<sentencetext>You might look at a www.drobo.com as a set of 4,5 and 8 drive enclosures.
1 TB disks gives you 3 TB usable space with a 2 drive failure tolerance.
I have the older 4 bay drobo (2 for myself, and 2 at separate clients offices).
It is much simpler to use, and will scale to your 2-3tb use and allow mismatched drives that normal raid will not use.
Get a enclosure to start with, and then financing permitting, get a 2nd for Drobo redundancy.
Not the fastest or cheapest, but reasonably good by both accounts, and simple to use.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352882</id>
	<title>Re:Exactly what you're doing</title>
	<author>dhobbit</author>
	<datestamp>1267623540000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>I just solved this same problem at work.  We generate ~14 TB every 12 days, and after looking at a bunch of backup solutions I went with good old-fashioned tape.  LTO4 delivers 800GB raw and ~1.6TB with modest compression, is fast (a minimum of 120MB/s), cheap (~$34 a tape), and easy to store, and there's lots of great FOSS backup software.</p></htmltext>
<tokenext>I just solved this same problem at work .
We generate ~ 14 TB every 12 days , and after looking at a bunch of backup solutions I went with good old fashion tape .
LTO4 delivers 800GB raw and ~ 1.6 with modest compression , is fast up to min of 120MB/s , cheap ~ $ 34 a tape , easy to store , and there 's lots of great FOSS backup software .</tokentext>
<sentencetext>I just solved this same problem at work.
We generate ~14 TB every 12 days, and after looking at a bunch of backup solutions I went with good old fashion tape.
LTO4 delivers 800GB raw and ~1.6 with modest compression, is fast up to min of 120MB/s, cheap ~$34 a tape, easy to store, and there's lots of great FOSS backup software.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342</parent>
</comment>
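Plugging the parent's LTO4 numbers into the submitter's 2-3 TB-per-customer workload shows why tape scales here. Quick arithmetic using native capacity and speed only; compression would roughly halve both the tape count and the time:

```python
# How many tapes and how many hours per customer archive on one LTO-4 drive?
TB = 1e12
dataset  = 3 * TB        # worst case from the article
tape_raw = 800e9         # LTO-4 native capacity, bytes
speed    = 120e6         # LTO-4 native throughput, bytes/sec

tapes = -(-dataset // tape_raw)      # ceiling division
hours = dataset / speed / 3600
print(f"{int(tapes)} tapes, ~{hours:.1f} h per set")   # 4 tapes, ~6.9 h
```

At ~$34 a tape that is roughly $136 of media per 3 TB archive, written overnight.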
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351598</id>
	<title>Re:Amazon AWS?</title>
	<author>vrmlguy</author>
	<datestamp>1267616400000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>According to <a href="http://aws.amazon.com/s3/#pricing" title="amazon.com">http://aws.amazon.com/s3/#pricing</a> [amazon.com], S3 will cost you about $150/month per TB.  OTOH, it appears that all data transfers into S3 are free until June 30th, 2010, after which transfer fees will be about $100/TB.  So if you want to do it, do it now.  Be prepared to spend to get your data back out, if you ever need it.</p><p>For comparison, this week I bought a 1TB USB 2.0 external HD for under $100, so a DIY RAID should save you money in the long run.</p><p>I do have to ask one question:  Exactly how is a tape library more impractical than storing a RAID set in a safe deposit box?</p></htmltext>
<tokenext>According to http : //aws.amazon.com/s3/ # pricing [ amazon.com ] , S3 will cost you about $ 150/month per TB .
OTOH , it appears that all data transfers into S3 are free until June 30th , 2010 , after which transfer fees will be about $ 100/TB .
So if you want to do it , do it now .
Be prepared to spend to get your data back out , if you ever need it.For comparison , this week I bought a 1TB USB 2.0 external HD for under $ 100 , so a DIY RAID should save you money in the long run.I do have to ask one question : Exactly how is a tape library more impractical than storing a RAID set in a safe deposit box ?</tokentext>
<sentencetext>According to http://aws.amazon.com/s3/#pricing [amazon.com], S3 will cost you about $150/month per TB.
OTOH, it appears that all data transfers into S3 are free until June 30th, 2010, after which transfer fees will be about $100/TB.
So if you want to do it, do it now.
Be prepared to spend to get your data back out, if you ever need it.For comparison, this week I bought a 1TB USB 2.0 external HD for under $100, so a DIY RAID should save you money in the long run.I do have to ask one question:  Exactly how is a tape library more impractical than storing a RAID set in a safe deposit box?</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351386</parent>
</comment>
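The parent's pricing makes the disk-versus-S3 break-even easy to estimate. A rough comparison using only the figures quoted above, ignoring redundancy, drive replacement, and the safe-deposit-box fee:

```python
# At ~$150/TB-month on S3 versus ~$100 per 1 TB external drive,
# purchased disks pay for themselves in well under a month per TB.
s3_per_tb_month = 150.0
drive_per_tb    = 100.0
tb              = 3

months_to_break_even = drive_per_tb / s3_per_tb_month
print(f"{tb} TB on S3: ${tb * s3_per_tb_month:.0f}/month; "
      f"drives pay off in ~{months_to_break_even:.1f} months")
```

That lopsided math is why most of the thread keeps circling back to owned media (disks or tape) for archives this size.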
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351410</id>
	<title>Different manufacturers</title>
	<author>idiot900</author>
	<datestamp>1267615500000</datestamp>
	<modclass>Insightful</modclass>
	<modscore>4</modscore>
	<htmltext><p>Hard drives are ridiculously cheap these days, especially for how much data you are storing. You may wish to consider buying drives from different manufacturers but of the same size to put in a single mirrored set. This way if there is a problem with a particular batch of drives it won't ruin everything.</p></htmltext>
<tokenext>Hard drives are ridiculously cheap these days , especially for how much data you are storing .
You may wish to consider buying drives from different manufacturers but of the same size to put in a single mirrored set .
This way if there is a problem with a particular batch of drives it wo n't ruin everything .</tokentext>
<sentencetext>Hard drives are ridiculously cheap these days, especially for how much data you are storing.
You may wish to consider buying drives from different manufacturers but of the same size to put in a single mirrored set.
This way if there is a problem with a particular batch of drives it won't ruin everything.</sentencetext>
</comment>
<comment>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351982</id>
	<title>Re:Tape is your friend</title>
	<author>Anonymous Cowpat</author>
	<datestamp>1267618200000</datestamp>
	<modclass>None</modclass>
	<modscore>1</modscore>
	<htmltext><p>$2000 is one not-particularly-brilliant workstation. If he's running a business that is heavily computation-oriented (as multi-TB datasets imply), then $2000 is not a large one-time outlay.</p></htmltext>
<tokenext>$ 2000 is one not-particularly-brilliant workstation .
If he 's running a business which is heavily computation-oriented ( which multi-TB datasets implies that it is ) then $ 2000 is not a large one-time outlay .</tokentext>
<sentencetext>$2000 is one not-particularly-brilliant workstation.
If he's running a business which is heavily computation-oriented (which multi-TB datasets implies that it is) then $2000 is not a large one-time outlay.</sentencetext>
	<parent>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351634</parent>
</comment>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_40</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354668
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_42</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353102
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354644
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_65</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351562
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352888
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_67</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351562
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352178
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_70</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351420
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354686
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_7</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355302
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31361184
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_66</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353386
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_29</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351660
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_57</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351344
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353314
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_60</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357684
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_34</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351720
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353556
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354116
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_19</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351348
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354732
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_10</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351706
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352270
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_2</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351956
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_24</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351648
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357382
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_58</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351344
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351924
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_31</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351604
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31370126
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_52</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351420
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351676
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_55</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353102
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353338
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_16</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351578
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353130
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353454
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_18</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351566
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353164
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_32</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351626
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31370142
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_1</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351720
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352692
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_23</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351636
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_46</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351420
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352682
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_0</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353102
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31361056
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_22</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351604
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357846
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_13</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351566
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351892
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_50</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351448
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351872
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_64</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351838
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354712
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_38</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351578
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354902
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_71</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351604
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355714
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_14</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352166
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_6</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351706
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352396
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_45</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352206
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_28</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351578
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351950
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_21</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351386
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351598
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353382
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_44</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352638
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_35</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351634
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351982
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_11</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352152
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_69</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352080
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352620
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_72</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354292
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_63</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351838
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352752
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_59</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351348
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352776
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_62</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351604
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353470
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_53</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31358786
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_36</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352838
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31356612
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_5</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351578
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355146
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_27</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351344
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351778
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_43</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354042
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_4</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351524
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_26</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351626
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353014
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_17</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354756
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_33</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354430
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31365372
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_56</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351502
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31358572
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_61</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351410
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31359462
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_49</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352910
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_51</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351690
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31370242
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_3</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351562
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352470
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_25</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351838
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31361084
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_48</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352894
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_39</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351602
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352118
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_30</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351420
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352990
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_15</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351838
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354318
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_20</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351970
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_54</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351720
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354122
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_9</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351706
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352846
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_68</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351386
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351598
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353110
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_73</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352120
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_8</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31417258
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_47</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351706
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352100
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_41</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352012
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_12</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351574
</commentlist>
</thread>
<thread>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#thread_10_03_03_2148245_37</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352882
</commentlist>
</thread>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.26</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31356678
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.2</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31361604
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.31</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351604
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357846
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31370126
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353470
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355714
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.8</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352438
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.11</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351566
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351892
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353164
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.3</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351348
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354732
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352776
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.32</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351850
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.17</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351448
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351872
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.9</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351994
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.22</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31360282
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.12</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351386
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351598
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353382
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353110
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.25</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353080
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.4</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353030
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.33</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31356570
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.5</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355404
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.34</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357600
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.19</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351410
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31359462
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.13</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351420
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352682
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352990
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351676
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354686
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.21</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352074
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.14</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351938
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.27</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351578
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354902
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351950
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355146
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353130
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353454
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.24</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351342
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352910
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351660
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357684
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354756
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352882
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353386
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352120
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351504
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352894
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351956
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31417258
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352166
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353102
---http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353338
---http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354644
---http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31361056
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352638
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351970
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352012
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352206
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31358786
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354042
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352152
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354668
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.0</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353852
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.6</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352838
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31356612
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.35</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351602
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352118
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.18</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351626
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31370142
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353014
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.1</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351690
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31370242
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.30</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351720
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352692
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354122
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353556
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354116
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.15</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351344
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351924
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31353314
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351778
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.28</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351540
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.7</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351414
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351838
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354712
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354318
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31361084
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352752
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31355302
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31361184
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351524
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354292
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351574
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351634
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351982
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351636
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351562
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352178
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352888
--http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352470
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.36</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31354430
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31365372
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.10</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351706
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352270
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352396
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352846
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352100
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.23</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351446
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.20</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351648
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31357382
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.16</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31351502
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31358572
</commentlist>
</conversation>
<conversation>
	<id>http://www.semanticweb.org/ontologies/ConversationInstances.owl#conversation10_03_03_2148245.29</id>
	<commentlist>http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352080
-http://www.semanticweb.org/ontologies/ConversationInstances.owl#comment10_03_03_2148245.31352620
</commentlist>
</conversation>
