FreeNAS: use your SSD efficiently
ZIL and Cache
Not open for discussion; I think it is a complete waste of resources to use a 120 or 250GB SSD for logs, let alone cache, as FreeNAS will (and should!) use RAM for that. So I searched and found a way to create two partitions on a single SSD, and expose them to the pool as ZIL (ZFS Intent Log) and cache.
Mind you - there are performance tests around questioning the efficiency of ZIL and cache. I am not going to test this; I will just add ZIL and cache, since I have an SSD especially for this purpose. I am just mentioning it because the use case might differ for you.
The pool?
Erhm, in the meantime I have created a pool, consisting of two virtual devices (VDEVs) of six hard disks each. According to several sources on the interweb, this will deliver resilience as well as performance.
I set this as a goal to begin with. In the FreeNAS GUI, it looks like this:
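For reference, from the command line the same layout boils down to a single zpool create with two raidz2 VDEVs. This is only a sketch with placeholder device names - FreeNAS builds the pool from the GUI, partitions the disks and references them by gptid - but it shows the shape:
root@store1:~ # zpool create tank1 \
      raidz2 da0 da1 da2 da3 da4 da5 \
      raidz2 da6 da7 ada0 ada1 ada2 ada3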
First of all, start the command line interface (CLI). You may opt for a remote SSH session, use the IPMI, or use the "Shell" link in the GUI. I'd use something that allows copy/paste from the screen - a remote SSH session would allow for that.
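Assuming the SSH service is enabled and allows root login, that is simply, from any machine on the network:
ssh root@store1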
Now, to find your SSD:
root@store1:~ # camcontrol devlist
<ATA TOSHIBA MQ03ABB3 0U> at scbus0 target 0 lun 0 (pass0,da0)
<ATA TOSHIBA MQ03ABB3 0U> at scbus0 target 1 lun 0 (pass1,da1)
<ATA TOSHIBA MQ03ABB3 0U> at scbus0 target 2 lun 0 (pass2,da2)
<ATA TOSHIBA MQ03ABB3 0U> at scbus0 target 3 lun 0 (pass3,da3)
<ATA TOSHIBA MQ03ABB3 0U> at scbus0 target 4 lun 0 (pass4,da4)
<ATA TOSHIBA MQ03ABB3 0U> at scbus0 target 5 lun 0 (pass5,da5)
<ATA TOSHIBA MQ03ABB3 0U> at scbus0 target 6 lun 0 (pass6,da6)
<ATA TOSHIBA MQ03ABB3 0U> at scbus0 target 7 lun 0 (pass7,da7)
<TOSHIBA DT01ACA300 MX6OABB0> at scbus1 target 0 lun 0 (pass8,ada0)
<TOSHIBA DT01ACA300 MX6OABB0> at scbus2 target 0 lun 0 (pass9,ada1)
<TOSHIBA DT01ACA300 MX6OABB0> at scbus3 target 0 lun 0 (pass10,ada2)
<TOSHIBA DT01ACA300 MX6OABB0> at scbus4 target 0 lun 0 (pass11,ada3)
<Samsung SSD 850 EVO 250GB EMT02B6Q> at scbus5 target 0 lun 0 (pass12,ada4)
<TOSHIBA DT01ACA300 MX6OABB0> at scbus6 target 0 lun 0 (pass13,ada5)
<Kingston DT microDuo PMAP> at scbus8 target 0 lun 0 (pass14,da8)
<USB Flash Disk 1100> at scbus9 target 0 lun 0 (pass15,da9)
OK - my SSD is at ada4. Check whether it is formatted or partitioned:
root@store1:~ # gpart show ada4
gpart: No such geom: ada4.
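Had gpart reported an existing partition table here, it would have to be wiped first. A destructive one-liner for that - only run it if nothing on the disk is needed anymore:
root@store1:~ # gpart destroy -F ada4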
It isn't, so there is nothing to destroy. Now create GPT-based partitions: align them on 4k, leave the first 128 sectors (64KB) of the disk alone so BSD/ZFS can do their magic there, and finally, make them the freebsd-zfs partition type:
root@store1:~ # gpart create -s gpt ada4
ada4 created
root@store1:~ # gpart add -a 4k -b 128 -t freebsd-zfs -s 20G ada4
ada4p1 added
root@store1:~ # gpart add -a 4k -t freebsd-zfs -s 80G ada4
ada4p2 added
Note that I only specify the starting block once, on the first partition to be created; gpart places the second partition directly after the first.
Also, I used one 20GB partition and one of 80GB, still leaving about 133GB untouched. That's overprovisioning for ya!
The 20GB will be LOG, and the 80GB will be cache.
Now, let's find the GUIDs of the partitions, in order to add them to the pool:
root@store1:~ # gpart list ada4
Geom name: ada4
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 488397127
first: 40
entries: 128
scheme: GPT
Providers:
1. Name: ada4p1
Mediasize: 21474836480 (20G)
Sectorsize: 512
Stripesize: 4096
Stripeoffset: 0
Mode: r0w0e0
rawuuid: 3d70bd91-5a52-11e7-ab6a-d05099c1356a
rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
label: (null)
length: 21474836480
offset: 65536
type: freebsd-zfs
index: 1
end: 41943167
start: 128
2. Name: ada4p2
Mediasize: 85899345920 (80G)
Sectorsize: 512
Stripesize: 4096
Stripeoffset: 0
Mode: r0w0e0
rawuuid: 6b11bc08-5a52-11e7-ab6a-d05099c1356a
rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
label: (null)
length: 85899345920
offset: 21474902016
type: freebsd-zfs
index: 2
end: 209715327
start: 41943168
Consumers:
1. Name: ada4
Mediasize: 250059350016 (233G)
Sectorsize: 512
Stripesize: 4096
Stripeoffset: 0
Mode: r0w0e0
I will use the rawuuid values, so a session with copy-and-paste functionality comes in handy... Let's check the short form (note the offset of 128 sectors):
root@store1:~ # gpart show ada4
=> 40 488397088 ada4 GPT (233G)
40 88 - free - (44K)
128 41943040 1 freebsd-zfs (20G)
41943168 167772160 2 freebsd-zfs (80G)
209715328 278681800 - free - (133G)
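If you would rather not pick the rawuuid values out of the gpart list output by hand, the matching gptid labels can also be listed directly as a quick cross-check:
root@store1:~ # glabel status | grep ada4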
Now, let's add 20GB LOG to the tank1 volume, and 80GB CACHE:
root@store1:~ # zpool add tank1 log gptid/3d70bd91-5a52-11e7-ab6a-d05099c1356a
root@store1:~ # zpool add tank1 cache gptid/6b11bc08-5a52-11e7-ab6a-d05099c1356a
Now, the pool looks like this:
root@store1:~ # zpool list -v
NAME SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
freenas-boot 7.06G 744M 6.34G - - 10% 1.00x ONLINE -
mirror 7.06G 744M 6.34G - - 10%
da8p2 - - - - - -
da9p2 - - - - - -
tank1 32.5T 1.70M 32.5T - 0% 0% 1.00x ONLINE /mnt
raidz2 16.2T 1.09M 16.2T - 0% 0%
gptid/04919b88-5a4d-11e7-ab6a-d05099c1356a - - - - - -
gptid/0541d551-5a4d-11e7-ab6a-d05099c1356a - - - - - -
gptid/05f6e0ac-5a4d-11e7-ab6a-d05099c1356a - - - - - -
gptid/072ed4f7-5a4d-11e7-ab6a-d05099c1356a - - - - - -
gptid/08553e1a-5a4d-11e7-ab6a-d05099c1356a - - - - - -
gptid/0994cc2f-5a4d-11e7-ab6a-d05099c1356a - - - - - -
raidz2 16.2T 624K 16.2T - 0% 0%
gptid/3a8f5a91-5a4f-11e7-ab6a-d05099c1356a - - - - - -
gptid/3b38fc02-5a4f-11e7-ab6a-d05099c1356a - - - - - -
gptid/3c30f8f7-5a4f-11e7-ab6a-d05099c1356a - - - - - -
gptid/3d2d5c9c-5a4f-11e7-ab6a-d05099c1356a - - - - - -
gptid/3e2fff05-5a4f-11e7-ab6a-d05099c1356a - - - - - -
gptid/3f3aafe4-5a4f-11e7-ab6a-d05099c1356a - - - - - -
log - - - - - -
gptid/3d70bd91-5a52-11e7-ab6a-d05099c1356a 19.9G 0 19.9G - 0% 0%
cache - - - - - -
gptid/6b11bc08-5a52-11e7-ab6a-d05099c1356a 80.0G 1K 80.0G - 0% 0%
So - there it is: 32TB of raw storage in 2 RAIDZ2 VDEVs, leaving about 21TB of usable space.
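To see whether the log and cache devices actually get used once clients start reading and writing, zpool iostat can break the statistics out per vdev - for example, refreshing every five seconds:
root@store1:~ # zpool iostat -v tank1 5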
The next entry will be about datasets, snapshots and performance.
2 comments:
Do you see any value in mirroring both/either of log & cache?
Dan,
I did not perform any performance testing, but did a lot of reading. General consensus seems to be to have:
- 5 or 6 disks minimum per RAIDZ2 VDEV
- no more than 18 disks per VDEV
- as many VDEVs in a volume as possible (performance)
- cache and log, mirrored, on separate, fast devices.
I do not mirror, nor do I have separate devices for cache and log; I have one 250GB Samsung 850 EVO SSD for both.
This will probably bite me once I experience a power outage in the middle of a write (or is the SSD backed by large capacitors?). On the other hand, where I live we seldom suffer from power outages, and I have a solar system.
So, yes, I do see value in mirroring log and cache, but it is limited. In a serious production environment I would go for it, performance-wise, but only after testing.
Some tests reveal no benefit from cache and/or log at all.
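For what it is worth: only the log vdev can be mirrored - ZFS does not support mirrored cache devices, and losing an L2ARC device is harmless anyway. With a partition on a second SSD available, a mirrored log could be added along these lines (the gptid values are placeholders):
root@store1:~ # zpool add tank1 log mirror gptid/<uuid-ssd1-log> gptid/<uuid-ssd2-log>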
The fact that SMB crashes (FreeNAS-11.0-U2 (e417d8aa5)) is more of a problem than anything else - as is the attitude towards resolving it.