The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

Re: [vox-tech] RAID systems

> > Measuring a real world workload in real world conditions.  Short
> > of that I'd recommend bonnie++ and "PostMark: A New File System
> > Benchmark"
> 
> Right now all I have been doing is cron'ing iostat to give me snapshots
> every few minutes.

Very reasonable, although that's just a snapshot.  Running "iostat 60" or
"iostat 600" continuously will give you a more complete picture (24/7
totals instead of occasional snapshots).
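One way to keep those continuous totals going, sketched as a hypothetical
crontab entry (assumes sysstat's iostat; the log path and schedule are
arbitrary placeholders):

```shell
# Hypothetical crontab entry: start a fresh day-long iostat log at midnight.
# 1440 one-minute samples = 24 hours of extended (-x) device statistics.
# Note: % must be escaped as \% inside a crontab command field.
0 0 * * * /usr/bin/iostat -x 60 1440 >> /var/log/iostat-$(date +\%F).log 2>&1
```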

> Yea, the worst is always what I plan for with these sorts of things,
> but I guess no system is foolproof or failsafe.

Indeed, but offsite offline backups are a great place to start.

> The best idea I have of the population of files that will be stored is:
>  random.  I have general statistics, but they can change on even a
> daily basis.  Most of the storage would be for millions of <64k text
> files, but not always.

I like to run something like:
	http://broadley.org/bill/dirstat.pl

[root@localhost perl]# time ./dirstat.pl /
scanning /
 
Total directories =    25807
Total files       =   389283
Total size        =    98441.5 MB
Average Directory =       15.1 files and  3906.08 KB
Maximum Directory =     7522 files //dev
Average filesize  =      258.95 KB
 
real    0m21.077s
user    0m5.128s
sys     0m10.775s
[root@localhost perl]#

So, things to look for:
* large directories: these might call for application changes (splitting
  into smaller dirs), ext3 htree indexes, reiserfs, or another filesystem
  with good large-directory support.
* average file size (for inode allocation)
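If you'd rather not grab the script, here is a rough sketch of the same
census with GNU find and awk (the tree to scan is a parameter; per-directory
stats are omitted):

```shell
#!/bin/sh
# Rough dirstat-style census: file count, total size, and average file
# size under one tree.  GNU find is assumed for -printf; -xdev keeps the
# scan on a single filesystem, like a per-volume report.
TREE=${1:-/var}
find "$TREE" -xdev -type f -printf '%s\n' 2>/dev/null |
  awk '{ n++; s += $1 }
       END { if (n) printf "files=%d total=%.1fMB avg=%.1fKB\n",
                           n, s / 1048576, s / n / 1024 }'
```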

> > I believe ext3 will allocate additional inodes as needed, no need to
> > preallocate.
> > 
> 
> One of the previous raid systems (scsi hardware raid) that we had ran
> out of inodes (it was formatted ufs and ran in solaris) in the first
> month or two that we used it for production.  I just don't want to make
> the same mistake twice...

Ugh, indeed.  I must have misremembered, or I'm remembering the wrong
filesystem.  Never allocate more than one inode per block, though; the
extras will just go to waste.

> As mentioned before, pretty randomized populations, and there's a high
> degree of variance between projects.  Basically, we are sent huge
> populations of data, we process the data into different formats, and
> return it.  The input data are mostly correspondence (email, word docs,
> spreadsheets, etc), but that is generally just a rule of thumb...  The
> populations are simply moving targets that vary widely from each
> project, and that is all that I have to go on... :)

If you are ever stuck with a lack of inodes, you can make a filesystem in
a file and loop-mount it.
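A minimal sketch of that trick, with placeholder sizes and paths (-N picks
the inode count by hand; the mount step itself needs root):

```shell
#!/bin/sh
# Build a filesystem inside an ordinary file with a deliberately high
# inode count, then loop-mount it where the small files will live.
dd if=/dev/zero of=/var/tmp/inodes.img bs=1M count=64
mkfs.ext3 -q -F -N 60000 /var/tmp/inodes.img   # -F: target is a file, not a device
# The mount itself needs root:
# mount -o loop /var/tmp/inodes.img /mnt/smallfiles
```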

> For some projects, there can be 3 million files where 99% are less than
> 4k in size.  For others there can be 3000 files where all are more than

mkfs.ext3 -T news will make one inode per 4kb block.

> 128k.  Most fall somewhere in between.  Knowing exact numbers would
> mean that I could tell the future and know what would be coming in the
> door (which would be cool...).

Heh.
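The "-T news" suggestion above is shorthand for a 4096 bytes-per-inode
ratio; you can check what any mkfs invocation actually produced on a
scratch image file, no root required (sizes and paths here are arbitrary):

```shell
#!/bin/sh
# -i 4096 is the explicit form of -T news: one inode per 4KB of space.
# tune2fs -l reads the superblock, which works on a plain image file.
dd if=/dev/zero of=/var/tmp/news.img bs=1M count=32
mkfs.ext3 -q -F -i 4096 /var/tmp/news.img
tune2fs -l /var/tmp/news.img | grep -E 'Inode count|Block count'
```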

> Again, here is my dilemma.  I just chose something that would hopefully
> be "good enough(tm)" to use every day, and something that would handle 30
> gazillion 2k files (I for-sure know there will be gazillions of emails,
> most of which are less than 2k, what I don't know is the ratio of
> smaller files to larger files).

Files smaller than the blocksize aren't coalesced, AFAIK, so you might
need another filesystem if you need that; on the other hand, you can use
1k or 2k blocks instead.
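Choosing the smaller blocks is just a mkfs-time flag; a quick sketch on a
scratch image (paths and sizes are arbitrary):

```shell
#!/bin/sh
# 1KB blocks waste far less slack space when most files are ~2KB emails,
# at some cost in throughput for the occasional large file.
dd if=/dev/zero of=/var/tmp/small.img bs=1M count=32
mkfs.ext3 -q -F -b 1024 /var/tmp/small.img
tune2fs -l /var/tmp/small.img | grep 'Block size'
```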

> I have a triple supply on the drive cabinet and a double supply on the
> box, all fed by UPS.  

Nice.

-- 
Bill Broadley
Computational Science and Engineering
UC Davis
_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech


