Sun 7000 FAQ
This FAQ was compiled from the questions we have received about the Sun 7000. If you have a question that is not answered here, there is a form to submit a new question at the bottom of this page. Thanks to the CTI engineering team for their help with the lab testing. Thanks also go out to the Fishworks team at Sun for tolerating my non-stop questions and RFEs.
Also check out Peter Galvin’s article on the Sun 7000 Analytics.
Thanks,
- Jesse
Miscellaneous
Caching/SSD/Flash
Analytics
Disaster Recovery/Replication
Performance
The simulator
Miscellaneous
What is Fishworks?
Fishworks is the name of the group that developed the Sun 7000 technology at Sun. The Fishworks name is a tribute to the Skunk Works at Lockheed Martin. The team keeps a blog at http://blogs.sun.com/fishworks/ Bryan Cantrill writes about the origin of the group in his blog at http://blogs.sun.com/bmc/entry/fishworks_now_it_can_be

What is Amber Road?
Amber Road is the Sun internal codename for the Sun 7000 when it was in development. It was originally the codename for the next version of the Sun 5000 NAS. When development began on the 7000, the goal was to keep it under wraps for as long as possible. What better way to be ignored by the storage industry than to carry on the codename of the less-than-spectacular Sun 5000 product?

What is a BUI?
A BUI is a Browser User Interface.

Does the Sun 7000 support deduplication?
No. Although Sun did indicate that deduplication was a "future" at a recent Sun 7000 roadshow event. (2/2009)

How are external disk trays attached to the Sun 7000 controllers?
The disk trays connect to the controllers with SAS. I have a habit of referring to these as loops, but they are in fact not loops. (My fibre channel history has imprinted the term loop into my brain.) If you connect SAS in a loop, it will not work properly. Sun refers to the external disk connections as SAS chains.

Does adding additional disk shelves to the system require downtime?
As long as the new disk shelf can be added to an existing SAS disk chain or an existing SAS HBA in the controller, there is no downtime required. If you are out of SAS ports, the controller will need to be taken offline to install additional SAS ports. In a cluster, the cluster failover could be leveraged to minimize the impact.

What is the max SAS cable length between the 7310/7410 and the SIM in a JBOD?
The max cable length is 8 meters. Each SIM contains a repeater, so there should be no signal strength issues between components in the SAS chain.

What does SIM stand for?
SAS I/O Module. The SIM is the 'controller' in the J4400 disk shelf that provides the disk connectivity.

What does "AK" stand for? It appears throughout the documentation.
Appliance Kit.

Is there a scripting language that can be used to automate tasks?
Yes. The scripting language is ECMAScript 3. It is very similar to JavaScript. For more information than you wanted to know about how the development team ended up here, check out Bryan Cantrill's blog at http://blogs.sun.com/bmc/entry/on_modalities_and_misadventures

Can a storage pool scrub be scheduled from the BUI or CLI?
No. In order to automate this operation, it would need to be triggered from another system. The easiest way to automate it is to install SSH keys and put it in cron on another system.

What is the latest version of firmware and where can it be downloaded?
A history of software version releases is kept here: http://wikis.sun.com/display/FishWorks/Software+Updates The page also provides links to download the firmware updates.

Does the Sun 7000 provide per user quotas?
Yes. User quotas were introduced in the 2009 Q3 release. Eric Schrock has a blog on the topic here.

How many pools does the Sun 7000 support?
The system supports a single pool on each controller. In a clustered configuration, each controller supports a single pool under normal operation. In a failover situation, a single controller can run with two pools.

What is the maximum size of a pool?
A single pool (or aggregate) can contain all of the disks in the system. There is effectively no limit on the number of spindles that can be placed in a pool.

What RAID levels are supported?
A pool can be configured to use RAID-Z2 (similar to RAID 6 or RAID-DP), RAID-Z (similar to RAID 5), RAID 0 (striping), or RAID 1+0 (striped mirrors). Triple parity RAID was added in the 2009 Q3 software release.

How many different RAID levels are supported on a single controller?
Each pool uses a single RAID type and each controller supports a single pool during normal system operations. So, each controller runs a single RAID level during normal operations.

Does the Sun 7000 require phone home?
No. The phone home functionality does not need to be configured.

What happens if the Sun 7000 stops phoning home?
The system phones home automatically every 24 hours. If Sun support does not see a phone home for 50 hours, a new case is automatically opened.

Can I change the RAID type of the pool?
In order to change the RAID type, the pool will need to be removed and recreated. This will destroy any data in the current pool.

Can I change the zfs record size of an existing share or LUN?
The record size can be changed for a share, but not a LUN. Changing the zfs record size will only affect future writes. It will not change any of the data already written to the share. Anything that causes the file to be rewritten will cause the new record size to be used. For example, if you open a file and save it, the new copy will reflect the new block size.

How much usable capacity will be available for a specific configuration?
Sun has written a capacity calculator. It is a Python application that provides the RAID and usable capacity options for a given hardware configuration. It needs to communicate with a 7000 system to calculate the result. While this is not ideal, it works with the simulator, so you do not need to have an actual appliance. Adam Leventhal wrote the original version of the calculator, but it has since been updated by Ryan Matthews. It is available here. In response to a few requests, we have created an online capacity calculator.
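For the scrub scheduling question above, here is a minimal Python sketch of the "SSH keys plus cron on another system" approach. It only wraps a non-interactive ssh call. The host name, user, key path and script path are hypothetical, and the appliance CLI command that starts a scrub is deliberately left as a placeholder because this FAQ does not give it; look it up in the appliance CLI documentation before using anything like this.

```python
import subprocess


def build_ssh_argv(host, user, remote_command, key_path=None):
    """Build the argument list for a non-interactive, key-based ssh call."""
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which is what you want when the call is made from cron.
    argv = ["ssh", "-o", "BatchMode=yes"]
    if key_path:
        argv += ["-i", key_path]
    argv += [f"{user}@{host}", remote_command]
    return argv


def run_remote(host, user, remote_command, key_path=None, timeout=60):
    """Run remote_command over ssh and return the CompletedProcess."""
    argv = build_ssh_argv(host, user, remote_command, key_path)
    return subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=False
    )


# Placeholder only: the real scrub command is NOT given in this FAQ.
SCRUB_COMMAND = "REPLACE_WITH_APPLIANCE_SCRUB_COMMAND"

# Hypothetical crontab entry on the other system (Sundays at 02:00):
#   0 2 * * 0 /usr/bin/python3 /path/to/scrub_job.py
```

A job script would call something like run_remote("nas1.example.com", "root", SCRUB_COMMAND) and check the returncode of the result, so that a failed login or a rejected command shows up in the cron mail.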
Caching/SSD/Flash
What is a Hybrid Storage Pool?
A Hybrid Storage Pool is Sun's name for a storage pool that combines SATA drives with Readzillas and Logzillas. This allows the system to be sized to meet capacity (disk), read performance (Readzilla), and write performance (Logzilla) requirements. For a more detailed explanation, visit Adam Leventhal's blog at http://blogs.sun.com/ahl/entry/fishworks_launch

What is the ARC?
The ARC is the Adaptive Replacement Cache. This is the name for the DRAM cache in the 7000.

What is the L2ARC?
The L2ARC is the second level Adaptive Replacement Cache. The L2ARC acts very much like a victim cache for the primary cache. The 7000 moves cached data and metadata that is about to be evicted from the primary cache to the Readzilla. It is important to note that it moves data that is about to be evicted, not data that was just evicted. This makes a major difference in system performance.

What is a Readzilla?
A Readzilla is a read-optimized flash-based SSD. The Sun 7000 uses these devices to create a very large read cache.

How big is a Readzilla device?
A Readzilla device is 100GB. The usable cache capacity is 93.2GB.

What is a Logzilla?
A Logzilla is a flash-based SSD that contains a pretty big DRAM cache backed by a supercapacitor so that the cache can effectively be treated as nonvolatile. This means the Sun 7000 does not use a stack of batteries to flush DRAM to disk or NVRAM. The Logzilla provides a very fast access device for storing synchronous writes.

How big is a Logzilla device?
A Logzilla device is 18GB. It provides 17GB of usable capacity, but capacity is less relevant than the number of IOPS it provides. Each Logzilla is able to deliver 9-10K write IOPS. Systems should include sufficient Logzillas to satisfy the need for write IOPS.

Will the write performance be slow without a logzilla device? Do I need a logzilla in my configuration?
The 7000 will not acknowledge a synchronous write to a client system until it has been written to non-volatile storage. The system memory (DRAM) cannot be used as a non-volatile destination for writes *. This means that without a Logzilla device, synchronous writes will go all the way to hard disk before the client is acknowledged. Most environments will consider this to be too slow.
As with everything, there are exceptions. Your workload may consist primarily of reads or asynchronous writes. In that case, running with no Logzilla may provide adequate performance.
* - There is a setting that will allow iSCSI devices to write to DRAM and acknowledge to the client, but I recommend using it extremely cautiously.

How many Readzillas are supported in a 7110?
Readzillas are not supported in a 7110.

How many Logzillas are supported in a 7110?
Logzillas are not supported in the 7110.

How many Readzillas are supported in a 7210?
The 7210 does not support Readzillas.

How many Logzillas are supported in a 7210?
The 7210 is supported with 0, 1, or 2 Logzillas.

How many Readzillas are supported in a 7310?
6 Readzillas are supported in each 7310 controller. This means a total of 12 in a 7310 cluster.

How many Logzillas are supported in a 7310?
There are a total of 4 supported Logzillas in a 7310 (this is true for both single controllers and clusters). They must be in the first tray in the chain, but keep in mind that due to the dual connectivity there are two first trays in each chain. The ‘first’ and ‘last’ trays are both the first tray in the chain.

How many Readzillas are supported in a 7410?
6 Readzillas are supported in each 7410 controller. This means a total of 12 in a 7410 cluster.

How many Logzillas are supported in a 7410?
There are a total of 8 supported Logzillas in a 7410 or 16 in a 7410 cluster. They must be in the first tray in the chain, but keep in mind that due to the dual connectivity there are two first trays in each chain. The ‘first’ and ‘last’ trays are both the first tray in the chain. If the Logzillas are mirrored, a total of 8 (or 4 mirrored pairs) can be used on each node. If they are striped, then only 6 can be used on any given node. So, if the goal is to stripe the Logzillas, only install 6 per controller.

Will a streaming workload 'cache bust' the L2ARC?
No. When the system is looking at blocks to move to the L2ARC, prefetched blocks are not eligible. So, a streaming workload will benefit from additional ARC size, but not from a larger L2ARC.

How is the L2ARC utilized when a cluster is operating in degraded mode?
The L2ARC is utilized only by the volumes in the pool that is native to the controller. This means that the volumes from the failed controller will not be able to use the L2ARC. All volumes will have access to the ARC during a failover.

Does the L2ARC retain data through a reboot?
No. The L2ARC will be empty after a reboot.

Why are there no reads from the logzilla?
The Logzilla contains the ZFS Intent Log (ZIL). Under normal operating conditions, the writes are still in the ARC and are flushed to disk from DRAM. If there is a node failure, then the ZIL on the Logzilla will be read and the transactions will be replayed. This is the only time you should see reads from a Logzilla device.
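To make the L2ARC answers above a little more concrete, here is a toy Python model of a two-level cache. It is only a sketch under loud assumptions: the "ARC" here is a plain LRU list, not the real Adaptive Replacement Cache algorithm, the class and method names are made up for illustration, and the "L2ARC" has no size limit. It shows just two behaviors described above: a block is copied to the second level before it is evicted from the first, and blocks brought in by prefetch are never copied.

```python
from collections import OrderedDict


class ToyHybridCache:
    """Toy model only: an LRU 'ARC' in front of an unbounded 'L2ARC'."""

    def __init__(self, arc_capacity):
        if arc_capacity < 1:
            raise ValueError("arc_capacity must be at least 1")
        self.arc_capacity = arc_capacity
        self.arc = OrderedDict()  # block id -> prefetched flag, oldest first
        self.l2arc = set()        # block ids held on the 'Readzilla'

    def _make_room(self):
        while len(self.arc) >= self.arc_capacity:
            # Look at the block that is ABOUT to be evicted.
            victim, prefetched = next(iter(self.arc.items()))
            if not prefetched:
                # Copy it to the second level before it leaves the first.
                self.l2arc.add(victim)
            del self.arc[victim]

    def read(self, block_id, prefetch=False):
        """Return where the block was served from: 'arc', 'l2arc' or 'disk'."""
        if block_id in self.arc:
            self.arc.move_to_end(block_id)
            return "arc"
        source = "l2arc" if block_id in self.l2arc else "disk"
        self._make_room()
        self.arc[block_id] = prefetch
        return source
```

With a capacity of 2, reading blocks a, b and c in turn pushes a out of the toy ARC and into the toy L2ARC, so the next read of a is served from "l2arc". A run of reads with prefetch=True afterwards cycles through the ARC without adding anything to the L2ARC, which is the "no cache busting" behavior from the streaming question.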
Analytics
Is there a way to select every item in the left panel to highlight all of the entries for a given chart?
Yes. If you hold down the Shift key while clicking on the Drill Down button, each item in the left panel is highlighted. The development team calls this feature Brendan's Rainbow in honor of Brendan Gregg.

Is it possible to zoom in on the vertical axis of a chart?
Yes, but it is not obvious. NFS operations broken down by latency is a perfect example of where this makes sense. When viewing all of the data, it may not be possible to see the desired level of detail. In order to zoom, click on the low value of the desired zoom range in the left panel, then Shift-click on the highest value of the desired zoom range. Click on the Outliers icon to zoom in. The Outliers icon is on the far right of the chart icon row.

What do the different types of ARC hits and misses mean?

Can I get access to additional statistics in Analytics?
Yes. By default the 'Add statistic...' drop down menu shows only the most frequently used statistics. Under Configuration -> Preferences, 'Make available advanced analytics statistics' can be enabled. This will add additional options to the 'Add statistic...' menu.

How much Analytics history is saved?
There is not a specific amount of time saved. It is dependent on the available disk capacity. The historical data is saved on the mirrored boot drives. In order to provide the maximum retention period, the data is saved to a compressed filesystem. The system can easily store a year of historical data.

Can the data be exported from Analytics?
Yes. There is an export button on each chart in a Worksheet. This will export the data in a comma separated value (CSV) format.
Disaster Recovery/Replication
Can the Sun 7000 replicate to a Solaris system or vice versa?
No. The Sun 7000 can only replicate to another Sun 7000 system. There is no support for replicating to Solaris 10 or Open Solaris.

Can a Sun 7000 system replicate to itself?
No. The two nodes in a cluster can replicate to each other, but a single node can not replicate to itself.

Is replication traffic secure?
Yes. The replication stream is encrypted using SSL.

Can a replicated share be mounted read only at the destination?
The destination volume itself can not be mounted. A clone of the destination share will need to be created in order for it to be mounted.

Can a replicated share be mounted read/write at the destination?
The destination volume itself can not be mounted. A clone of the destination share will need to be created in order for it to be mounted read/write.
Performance
How many Logzillas do I need?
Logzillas are sized based on performance, not capacity. Each Logzilla is able to deliver 9-10K write IOPS. Systems should include sufficient Logzillas to satisfy the need for write IOPS.

What are the published SPEC SFS benchmark results for the Sun 7000?
There are none. According to some members of the development team there will never be any. The SPEC SFS benchmark is not very representative of a real world workload, and Sun suggests their system is designed for the real world, not a benchmark. Bryan Cantrill has an overview in his blog where he shares his thoughts on the SPEC SFS benchmark suite. The post generated a bit of buzz in the blogosphere, and Bryan responds to the buzz in a separate post.

How fast is the Sun 7000?
There is no simple answer to that question. Like any storage device, your IO profile and working set will have a huge impact on the performance you get from the system. The best place to find performance results for the system is in Brendan Gregg's blog. He does a great job of finding the absolute limits of the system. Check here for 7410 results and here for 7310 results.

Does the Sun 7000 do read ahead/prefetch on sequential reads?
Yes. The 7000 will prefetch sequential reads (both forward and reverse). It will also detect strided reads (both forward and reverse). So, multiple processes reading from the same file will get the cache benefit of that prefetched data. The system has multiple prefetch streams. This allows it to detect sequential reads on several files and prefetch for all of them. That provides for very good streaming read performance from the ARC. Any read that was brought into the ARC by a prefetch will not use the L2ARC. See the SSD section for more details.
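The Logzilla sizing rule from the first question in this section is simple arithmetic, and a rough Python sketch of it follows. The assumptions are mine: it uses the low end of the 9-10K write IOPS range, it treats Logzillas as striped so that each device contributes its full IOPS (mirroring is not modelled), and the per-controller limits are taken from the Caching/SSD/Flash section, with 6 for the 7410 because that is the striped limit. The function and table names are made up for the example.

```python
import math

# Low end of the 9-10K write IOPS range quoted for one Logzilla.
WRITE_IOPS_PER_LOGZILLA = 9_000

# Striped Logzilla limits per controller, from the Caching/SSD/Flash section.
MAX_STRIPED_LOGZILLAS = {"7110": 0, "7210": 2, "7310": 4, "7410": 6}


def logzillas_needed(sync_write_iops, model,
                     iops_per_device=WRITE_IOPS_PER_LOGZILLA):
    """Return (devices needed, whether that fits the model's limit)."""
    if sync_write_iops < 0:
        raise ValueError("sync_write_iops must not be negative")
    # Round up: a fraction of a device still means one more device.
    needed = math.ceil(sync_write_iops / iops_per_device)
    return needed, needed <= MAX_STRIPED_LOGZILLAS[model]
```

For example, a requirement of 25,000 synchronous write IOPS works out to 3 Logzillas, which fits in a 7410, while 50,000 works out to 6, which is more than a 7310 supports. Treat the result as a starting point for a conversation about the workload, not as a guarantee.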
The simulator
Where can I download the Sun 7000 simulator?
You can download the simulator for either VMware or VirtualBox here. Sorry, but Sun is going to make you register to download it.

What is the difference between the simulator and the real appliance?
The simulator runs the same software as the appliance. I have even seen people set up replication between two different laptops. The only feature that is not available in the simulator is clustering. The appliance uses a cluster card and that hardware does not exist in the simulator.