Tuesday, August 15, 2017

Storage Server: datasets, snapshots and performance

Datasets, snapshots and performance

This is a long post, but with lots of pictures. Kind of a management overview ;)

Datasets and snapshots


As may have become clear from a previous post, I have one volume with (to date) a single dataset (ds1). This was not the result of experience or deep thought; it was just copied from Benjamin Bryan, who did an entry on ZFS hierarchy.
Makes sense to me, so I copied the approach.


Benjamin also has a clever snapshotting regime (scroll down to chapter 15 in the linked article...).
As snapshots only store the differences over time, it's quite an efficient way of allowing human errors to be reverted.
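
To make the "only the differences" point concrete, here is a toy sketch in Python. It is not how ZFS is implemented; it only assumes that every write lands in a fresh block and that a snapshot is nothing more than a frozen list of block references. The class `ToyDataset` and the snapshot name `monday` are made up for illustration.

```python
class ToyDataset:
    """Toy copy-on-write model: each file name points at an immutable block id."""

    def __init__(self):
        self.live = {}         # file name -> block id currently in use
        self.snapshots = {}    # snapshot name -> frozen copy of that mapping
        self._next_block = 0

    def write(self, name):
        """Writing never overwrites a block; it allocates a fresh one."""
        self.live[name] = self._next_block
        self._next_block += 1

    def delete(self, name):
        self.live.pop(name, None)

    def snapshot(self, snap):
        """Record the block references only; no block contents are copied."""
        self.snapshots[snap] = dict(self.live)

    def held_only_by(self, snap):
        """Blocks the snapshot still references but the live dataset dropped."""
        return set(self.snapshots[snap].values()) - set(self.live.values())

    def rollback(self, snap):
        """Point the live dataset back at the blocks the snapshot references."""
        self.live = dict(self.snapshots[snap])


ds1 = ToyDataset()
for name in ("a.txt", "b.txt", "c.txt"):
    ds1.write(name)

ds1.snapshot("monday")                      # hypothetical snapshot name
print(len(ds1.held_only_by("monday")))      # 0: nothing has changed yet

ds1.write("a.txt")                          # edit one file
ds1.delete("b.txt")                         # the human error
print(len(ds1.held_only_by("monday")))      # 2: only the changed blocks cost space

ds1.rollback("monday")
print(sorted(ds1.live))                     # ['a.txt', 'b.txt', 'c.txt']
```

In this toy model the snapshot costs nothing until something changes, and afterwards it only pins the blocks that were replaced or deleted.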


Now for the performance of this energy-efficient storage box. After all, what can one expect of consumer-class 2.5" 5400rpm disks? Must be horrible!
Well, it turns out it performs quite well compared to an XPEnology DS3615:
For small files, the storage server (mapped drive: S:) slightly outperforms the DS/Synology Hybrid RAID (mapped drive: O:), but look at the larger block sizes:

Now, for large files (4GB):
Look at the storage server go - the XPEnology/SHR slows to a crawl!
Nice, steady performance. That would be about maxing out my 1Gbps network.
But wait! What if... yep - let's do a 10Gbps (10GbE) network.

Performance over 10GbE

Does it make any sense to invest? Well, yes - a simple test shows the Storage Server is capable of delivering well over 30Gbps (that is gigabits per second, or more than 3.75GB/s in bytes - just to be clear on the abbreviations):
So - there you have it: one client, and one server session from/to localhost. No interfaces involved (but for the local loopback).

But is the file system capable? Some basic tests show the slowest result is about 677MB/s (709424878 bytes/s). The politically correct term nowadays seems to be 677MiB/s, to indicate binary mega, not decimal... I started out with a 6802, so to me it's natural: kilobytes are 1024 bytes, kilograms are 1000 grams. The filesystem handles about 2.40GiB/s tops (2576744016 bytes/s). And it seems to like large block sizes (64k) better than small ones (4k, 16k).
[root@store1 ~]# dd if=/dev/zero of=/mnt/tank1/removeme bs=4k count=1M
1048576+0 records out
4294967296 bytes transferred in 5.256461 secs (817083508 bytes/sec)
[root@store1 ~]# dd if=/dev/zero of=/mnt/tank1/removeme bs=16k count=1M
1048576+0 records in
1048576+0 records out
17179869184 bytes transferred in 9.612885 secs (1787170973 bytes/sec)
[root@store1 ~]# dd if=/dev/zero of=/mnt/tank1/removeme bs=64k count=1M
1048576+0 records in
1048576+0 records out
68719476736 bytes transferred in 26.669113 secs (2576744016 bytes/sec)
[root@store1 ~]# dd if=/dev/zero of=/mnt/tank1/removeme bs=4k count=100k
102400+0 records in
102400+0 records out
419430400 bytes transferred in 0.591226 secs (709424878 bytes/sec)
[root@store1 ~]# dd if=/dev/zero of=/mnt/tank1/removeme bs=64k count=100k
102400+0 records in
102400+0 records out
6710886400 bytes transferred in 2.648660 secs (2533691585 bytes/sec)
[root@store1 ~]# dd if=/dev/zero of=/mnt/tank1/removeme bs=16k count=100k
102400+0 records in
102400+0 records out
1677721600 bytes transferred in 0.856529 secs (1958745359 bytes/sec)
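
Since the decimal/binary and bits/bytes prefixes are easy to mix up, here is a small Python sketch that converts the bytes/sec figures dd reports into the other units used in this post. The function name `describe` is my own; the two input rates are the slowest and fastest dd results from the runs above.

```python
def describe(bytes_per_sec):
    """Express a bytes-per-second rate in decimal, binary and bit-based units."""
    return {
        "MB/s": bytes_per_sec / 1000**2,        # decimal megabytes
        "MiB/s": bytes_per_sec / 1024**2,       # binary mebibytes
        "GiB/s": bytes_per_sec / 1024**3,       # binary gibibytes
        "Gbit/s": bytes_per_sec * 8 / 1000**3,  # network-style gigabits
    }


runs = [
    ("bs=4k  count=100k", 709424878),    # slowest dd run
    ("bs=64k count=1M", 2576744016),     # fastest dd run
]

for label, rate in runs:
    units = describe(rate)
    line = ", ".join(f"{value:.2f} {unit}" for unit, value in units.items())
    print(f"{label}: {line}")
```

In network terms the slowest run works out to roughly 5.7 Gbit/s and the fastest to roughly 20.6 Gbit/s, which is the relevant comparison when asking whether the filesystem can keep a 10GbE link busy.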

Intel X540 dual port converged 10GbE (RJ45)

I managed to purchase three X540-compatible boards from a local firm, uptimed. Very affordable.
The idea is to create a triangle: my workstation connected to the VM machine and the storage server. The storage server connected to my PC and the VM machine. The VM machine connected to... well - you get the idea. All fixed IP addresses, on different subnets. No switch; too bloody expensive! This setup is (a lot!) cheaper than three X520s (single port) and a switch. Anyway - after some routing exercises, I managed to get the storage server to talk to my workstation.
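
For what it's worth, the triangle can be sketched as an addressing plan: three machines, three cables, one small subnet per cable. The Python below is only an illustration; the host names and the 10.10.10.0/24 range are assumptions, not the addresses I actually use.

```python
import ipaddress
from itertools import combinations

# Hypothetical host names for the three machines in the triangle.
hosts = ["workstation", "vm-host", "storage"]

# Hypothetical private range, carved into one /30 per point-to-point cable.
pool = ipaddress.ip_network("10.10.10.0/24").subnets(new_prefix=30)

links = {}                            # (host_a, host_b) -> subnet for that cable
addresses = {h: [] for h in hosts}    # host -> list of (peer, interface address)

for (a, b), net in zip(combinations(hosts, 2), pool):
    ip_a, ip_b = net.hosts()          # a /30 has exactly two usable addresses
    links[(a, b)] = net
    addresses[a].append((b, ipaddress.ip_interface(f"{ip_a}/{net.prefixlen}")))
    addresses[b].append((a, ipaddress.ip_interface(f"{ip_b}/{net.prefixlen}")))

for host, ports in addresses.items():
    for peer, iface in ports:
        print(f"{host:12s} port to {peer:12s} {iface}")
```

Each machine ends up with two addresses, one per port of the dual-port card. Because every cable is its own subnet, each machine reaches the other two directly without a switch.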

The TCP performance over the 1Gbps interface: maxed out at 933 Mbps multi-threaded, but a single thread is sufficient (the 939 Mbps entry). That is pretty much saturating a 1Gbps link.
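
Why do 933 to 939 Mbps count as saturated? A rough back-of-the-envelope sketch in Python, assuming standard Ethernet framing, a 1500-byte MTU, IPv4 and no VLAN tag (none of which I verified on my own setup), gives the ceiling for TCP payload on the wire:

```python
# Per-frame overhead on the wire, in bytes:
# preamble + start delimiter (8), MAC header (14), frame check sequence (4),
# and the mandatory inter-frame gap (12).
ETH_OVERHEAD = 8 + 14 + 4 + 12
IP_HEADER = 20     # IPv4 header without options
TCP_HEADER = 20    # TCP header without options


def tcp_goodput(line_rate_bps, mtu=1500, tcp_option_bytes=0):
    """Upper bound on TCP payload throughput for a given line rate."""
    payload = mtu - IP_HEADER - TCP_HEADER - tcp_option_bytes
    on_wire = mtu + ETH_OVERHEAD
    return line_rate_bps * payload / on_wire


plain = tcp_goodput(1e9)
with_timestamps = tcp_goodput(1e9, tcp_option_bytes=12)  # TCP timestamps option

print(f"no TCP options:  {plain / 1e6:.1f} Mbps")
print(f"with timestamps: {with_timestamps / 1e6:.1f} Mbps")
```

Under these assumptions the ceiling is about 949 Mbps, or about 941 Mbps with TCP timestamps enabled, so the measured figures are within a couple of percent of what a 1Gbps link can carry.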
Quite some difference with the 1Gbps interface: the 10GbE link maxes out at about 5Gbps. Not what I paid for...
I had to use multiple parallel streams to saturate the 10GbE connection:
That's more like it! Is this reflected in the speed at which I can copy from my Windows machine? What impact does this have on the DiskMark tests?

Cool! Two FTP connections, each running at about 1Gbps! But not saturated...

Ah! Better. 1GB/s is 1 gigabyte per second, or 8 gigabits per second, which is close to the 10Gbps line rate.

CrystalDiskMark results

Results for the local hard disk (3TB Toshiba DT01ACA300 - drive E:), a Crucial 750GB SSD (CT750MX300SSD1 - drive C:) and an SMB mount on the storage server via 10GbE (drive S:):

Now for the large block IO tests:


I think it is fair to say that the "slow" 2.5" hard disks perform outstandingly. The storage server outperforms a local hard disk by a factor of 3 or more (watch the QD=32 differences!). Of course, it is no match for a locally attached SSD, but that wasn't the question.

1 comment:

Frank said...

I'll comment on my own entry: the current FreeNAS (9.10.2.U4) introduced a nasty bug, causing this kind of error:
pid 94431 (smbd), uid 0: exited on signal 6 (core dumped)
There's a bug report (https://bugs.freenas.org/issues/24342), and it's marked resolved, but I am waiting for 9.10.2.U5, in which the fix is embedded.