2009/11/09

IT Reporting

A recent "Ask Slashdot" asked what information a sysadmin should take to an executive. Here's what I think. I've picked this up from a variety of sources, including a very skilled manager.

--------------------------
There are three key things that executives want to hear:

1) What has the department done in the past? The core of this point is to answer the question "Does the past justify continued investment?" and its corollary "We've sunk so much money into IT; what have we gotten from it?" This is where usage statistics (website hits, business transaction data, dollars-per-downtime and Nines, return on cost-saving measures, etc.) are presented. Keep it in high-level terms, with drill-down slides available but only presented on request. Focus on the trends of service delivery vs. IT budget and/or headcount.

2) What is the department doing now? Here we focus on what is happening with their current business. This is where a primary element of capacity planning comes in: the headroom metric. How much additional user load can we support on our current systems and network before the service is degraded? In concrete terms, ignoring everything except CPU: if you're delivering 100 pages per second and using 40% of the server's CPU, the hardware tops out around 250 pp/s, so you have a headroom of 150 additional pp/s. Extrapolate this to the business need: say the marketing department has launched 5 campaigns this year; the current systems may be able to support 10, but should not be expected to support 20 without additional investment. Note that this headroom metric must look at end-to-end utilization - disk, memory, network, and most importantly administration effort - to be accurate.
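
In shell terms, the back-of-the-envelope math for that CPU example looks like this (the numbers are the hypothetical ones above, and the linear-scaling assumption gets optimistic as the CPU gets busy):

    # headroom = (throughput / utilization) - throughput, CPU only
    rate=100   # pages/sec currently delivered
    util=40    # current CPU utilization, percent
    awk -v r="$rate" -v u="$util" \
        'BEGIN { cap = r * 100 / u; printf "capacity %.0f pp/s, headroom %.0f pp/s\n", cap, cap - r }'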

3) What will the department do in the future? What are the business-focused projects that the department is working on? How will the investment in these projects result in money coming into or staying in the business? What is the Return on Capital, Return on Investment?

As far as timing, there should be at least an annual "full report" on the state of IT. Depending on the dynamics of the business and the scope of the team's projects, quarterly updates should be sufficient, unless something changes significantly. You don't want to face this with a "we haven't done anything since the last report" status. But it's also important to reconnect with the executives regularly so that they don't forget what you're doing, and so that you can react and change to meet their changing business plans.

The most important thing we in IT can do is to be aligned to the business. This means focusing on the things that matter: delivering the product or service in exchange for money. Everything else is overhead. And the better your IT department is at aligning itself, the better you look when an outsourcer tries to talk your executives into cutting everything except the "core competencies".

--Joe

2009/09/29

Idle curiosity about iLOM

Why does the service processor on our brand new Sun T5240 server have a SPARC 885 processor, and run Linux? Why not (Open)Solaris?

Kinda ironic that Sun boots its latest servers with Linux.

Maybe it's the fact that it has 32MB of flash to work with, and only 128MB of RAM. But that should be enough to run Solaris.

--Joe

2009/08/27

FSF Windows 7 sins

I don't normally post political messages here, but this one's important, I think.

The Free Software Foundation has posted 7 Windows 7 "sins" at http://windows7sins.org/, and I think they left out what in my mind is the most important issue. It's sorta covered in "Corrupting Education" and "Lock-In", but not really:

With Windows 7 (and Office 2007 before that, and Vista before that, and XP before that, and Windows 9x/W2K before that) users will have to retire all of their existing training in the Windows user interface in favor of the newest cosmetic decisions Microsoft has made for its products.

I don't argue that there aren't significant productivity benefits to the current Windows shell (vs. Program Manager in NT and 3.x) or in the improvements from '95 to XP. I haven't seen much of Vista's Aero or the new Windows 7 UI, and I'm sure all of the changes have been run past major interface testers.

But when I switched from Office 2000 to Office 2007, I had a rather steep learning curve to deal with the "Ribbon" UI. Even though I taught Office 97 to Computers 101 users in grad school (and was able to take that through to O2K), I was lost with the new "where the h*** did the menu go" interface. (OK, if I were an Excel developer, would I consider search & replace a General (Home) thing or a Data thing? It used to be in the Edit menu...)

But I relearned. And I was able to relearn because, as I was growing up, the UI changed dramatically (from Write on my Apple ][+ to PC/Word Perfect to WPfW to vim/TeX and on to MS Office*). But for someone who's used to, and has memorized, the keystrokes/mouse clicks to insert a text box, this is a whole new ballgame.

When I was applying for jobs after college, for example, one of the companies asked that I take an "aptitude test" which included things like typing speed and accuracy, formatting documents, generating mail merges, etc. This computer-based test was graded on whether you clicked the right menu option first. If you picked "Edit" instead of "Tools" (or if you right-clicked and chose "Format") you got the question wrong. Not that this was a good test, but it's typical for the industry. And the answers completely changed when 2K7 came out.

Of course, in my line of work, we're more concerned about the OS than about the Office apps. So it's things like the changes in networking that annoy me about Vista. Wow, the way I set up a dialup connection has changed. Hmm, I wonder what happens if I right-click here... etc. So I have to learn a whole new way to fix things that go wrong. Not to mention that Vista Home is quite different, interface-wise, from Vista Business.

And I'd expect that the various Windows 7 editions will look different too. After all, would the wizard that helps gramma connect to the wireless internet at Starbucks be the best way for IT professionals to diagnose an 802.1x authentication problem? If I learn how to do it with my home PC, will that apply to the real business world?

--Joe

2009/08/19

Netapp - Waster of space

We have a NetApp that we use to provide Tier-2 LUNs to our SAN. It was price-competitive on raw disk space, but I didn't realize at the time just how much overhead this appliance had.

The obvious overhead is RAID-DP parity and hot-spare drives. Easily calculated: 1 hot spare per 30 drives of each size, and DP is 2 parity drives per RAID group, so that's 6 wasted drives out of the 28 in two shelves, leaving 22 * 266GB drives usable = 5.7TB.

I'd heard that space is reserved for OS and bad-block overhead (about 10%), which brings us down to 5.2TB usable.

Well, the web interface shows the aggregate as 4.66TB. So that's 600GB I haven't accounted for. But still, 4.66 TB is a good amount of space.

From the aggregate, we create a flexvol (note that by default this sets aside 20% of the volume as inaccessible snap reserve space). On the flexvol, we create LUNs and present them to our servers. And here's where the space consumption gets nasty:

By default, if you create a 1TB LUN, ONTAP reserves 1TB of disk blocks in the volume. That's nice, and exactly what I'd expect, although in practice we use thin provisioning (lun create -o noreserve) for most of our LUNs.

What I didn't expect going in was that the first time you create a snapshot, ONTAP reserves ANOTHER 1TB for that LUN. And interestingly enough, that 1TB is never touched until there's no other space left in the volume.

OK, that ensures there's room to hold every changed block even if you overwrite the ENTIRE LUN after you take a snapshot. But it reduces the usable size for LUN allocation to 2.33TB. And if you have multiple snapshots, those don't seem to come out of the snap reserve, but rather are in addition to the 2*LUNsize that is already allocated.

So out of a raw disk capacity of 28 * 266GB = 7.2TB (which is quoted as 28 * 300GB disks = 8.2TB), we get just over 2TB of space that can be used for holding actual system data.
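
To put the whole waterfall in one place (same numbers as above, using the ~266GB right-sized drive figure; the last step is the snapshot double-reservation just described, and small differences from the prose are just rounding):

    echo '28 * 266'        | bc     # 7448 GB raw
    echo '22 * 266'        | bc     # 5852 GB after RAID-DP parity + hot spares
    echo '22 * 266 * 0.9'  | bc     # ~5267 GB after ~10% OS/bad-block reserve
    echo '4.66 / 2'        | bc -l  # 2.33 TB of LUNs once each carries a full snapshot reserve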

Wow.

Now, there are non-default settings that can change that, but they're only available at the CLI, not the web interface:

# snap reserve <vol> 0 - this sets the volume's snap reserve from the default 20% to 0%, which is recommended for volumes that hold only LUNs.
# vol options <vol> fractional_reserve <pct> - this changes the percentage of LUN size that is reserved when a LUN snapshot is taken.

It is not entirely clear what happens to a LUN when its delta becomes larger than the fractional_reserve. Some documentation says ONTAP may take the LUN offline, but I would hope that would only happen if there's no remaining space in the volume (like what happens with snapshot overflow in traditional NAS usage). But it's not clear.

As far as I can tell, the current best practice is to set the snap reserve to the amount of change you expect in the volume, set the fractional_reserve to the amount of change you expect in the LUN, and set up volume auto-grow and/or snapshot auto-delete to make sure you have free space when things get full.
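
For what it's worth, here's roughly what those knobs look like from a 7-mode console. The volume and LUN names are made up, and the percentages and sizes are placeholders - size them against your own rate of change, not mine:

    snap reserve tier2vol 0                        # no snap reserve on a LUN-only volume
    vol options tier2vol fractional_reserve 20     # reserve 20% of LUN size for post-snapshot overwrites
    vol autosize tier2vol -m 6t -i 100g on         # let the volume grow before it runs out of space
    snap autodelete tier2vol on                    # sacrifice old snapshots rather than filling the volume
    lun create -s 1t -t solaris -o noreserve /vol/tier2vol/data.lun   # thin-provisioned LUN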

On the gripping hand, the default options make sure that you have to buy a lot of disks to get the storage you need.

--Joe

2009/07/13

SCSI disk identifiers

To whoever thought they'd be cute and put the VT100 "clear screen" escape sequence in their disk identifier: I want to buy you a drink.

The probe-scsi-all output wasn't nice.

Hemlock. Your choice of flavors.

--Joe

2009/04/03

Discovering R for performance analysis

I've seen references at various conferences and on performance blogs to the "R" statistical analysis package and how it can be used to mine system performance data. I'm going to learn it.

Fun.

2009/03/06

Firewall project

A big consumer of my time this week (and last week) is building a pilot implementation of a new internet-facing DMZ. Well, that's understating the requirements a bit. Corporate requires a special "reverse proxy" system to sit in the internet-facing parts, so we have to make some major changes anyway, but I wasn't happy with just having a DMZ out there; it needs to be reliable. Preferably more reliable than our internet feed. We have more than one datacenter, with more than one internet provider, so why not take advantage of that?

Basically, the goal is to have a single IP address (for www.dom.ain) that is internet-routed through both datacenter ISPs, and have Linux do some magic so that packets can come or go through whichever pipe. Apparently there are companies that make such magic happen for lots of $$$, but in this economy they aren't an option. And since Linux is free (and my time is already paid for), here's a chance to save the company money. That's what I sold to management, anyway.

It should be simple enough: advertise that magic netblock out both pipes, put a Linux router on each link as the gateway for that block, NAT the magic.xxx address of www to the internal IP address of the apache server, and toss out-of-state packets over to the peer so that the firewalls between these boxes and the apache server never see them.

In ascii:

Internet --- Linux ---- FW --+-- LAN --- apache
              ^-v            |
Internet --- Linux ---- FW --+


(We've assumed that the WAN is important enough internally that if it's down, our external site is going to have problems anyway. Which is true, unfortunately. WAN outages between our 2 main datacenters tend to break everything even for local users.)

So far I've gotten 3/4 of the packet-handling stuff working for a single system using just iptables. A nat-table PREROUTING DNAT rule rewrites the magic.xxx address to apache's address, a POSTROUTING MASQUERADE rule gives apache something routable to return the packets to, and I can see the entries in /proc/net/ip_conntrack. Unfortunately, I can't seem to find how nat is supposed to de-masquerade the reply packets according to the state entry that created them.
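
For the record, the working pieces amount to something like this (the interface names and the port-80 match are my assumptions; the addresses are the ones in the example below):

    # eth0 faces the internet, eth1 faces the firewall/LAN side (names assumed)
    echo 1 > /proc/sys/net/ipv4/ip_forward

    # rewrite the advertised (magic) address to the real apache server
    iptables -t nat -A PREROUTING -i eth0 -d 192.168.1.13 -p tcp --dport 80 \
             -j DNAT --to-destination 192.168.6.13

    # source-NAT to this box's inside address so apache has something routable to answer
    iptables -t nat -A POSTROUTING -o eth1 -d 192.168.6.13 -j MASQUERADE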

I have a packet coming in from 10.0.0.5 (client) -> 192.168.1.13 (www) (the magic block is 192.168.1.0/24). It leaves my box as 192.168.5.182 (lx-int) -> 192.168.6.13 (www-web0). www-web0 gets the SYN and sends its SYN+ACK back, 192.168.6.13 -> 192.168.5.182. I see those packets on the wire, and it's what I'd expect.

What I don't see is a way to take that SYN+ACK, look it up in the connection tracking table to find the original client, and rewrite it as 192.168.1.13 -> 10.0.0.5.

--Joe

2009/02/19

Photo Archiving

This is in response to BenR's post at http://www.cuddletech.com/blog/pivot/entry.php?id=1016 which I can't seem to get past his comment-spam filter.



As a fellow father and sys/storage admin, I have similar questions. Have you made the jump to video already? A MiniDV tape at LP quality (90 minutes, a little less than DVD quality but with worse compression) eats up 15GB of disk space when I dump the AVI stream. Not to mention the gigabytes of SD and CF cards from the camera.

I'm confident in my 3-tier archiving scheme: An active in-the-house full-quality copy on simple disk, a "thumbnail" (screen-resolution or compressed video) version on S3, and two copies of the original format on DVD - one onsite and one offsite.

I expect to have to migrate off of DVD media periodically, but I can put that off until the higher-capacity disk wars play out. Every file on the DVDs is md5sum'd, and I know I can use ddrescue to pull data blocks off either wafer if S3 and my home drive die, assuming a scratch doesn't hit both discs in the same place. It'd be nice to have an automatic system to track which file is on which DVD, but I haven't implemented such an HSM yet.
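
The checksum-and-salvage part is nothing fancy - roughly this, with placeholder paths and filenames:

    # build a manifest when the disc is burned, and re-check it on a schedule
    find /mnt/dvd -type f -exec md5sum {} + > archive-2009-02.md5
    md5sum -c archive-2009-02.md5

    # if a disc starts going bad, pull whatever blocks are still readable
    ddrescue -r3 /dev/sr0 dvd-image.iso dvd-image.log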

I'm enough of a pack rat to keep a DVD drive and probably a computer that can read it essentially forever, and if not, there's always eBay.

The biggest problem I face is not deleting all of the content from a card (or tape) before popping it back into the camera and adding more. So when I copy a card's contents into the "system," I might end up with duplicate copies of some pictures. I'd love to be able to deduplicate those and store only one copy (plus links to it). Even better would be a content-aware dedup that could tell that x.jpg is the same picture as Y.raw... (and that song_64kvbr.mp3 can be derived from song.flac).
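
The exact-duplicate half would be simple enough to script - something along these lines, which only catches bit-identical files, not the x.jpg/Y.raw case (the /archive path is a placeholder):

    # group files that share an md5 checksum; GNU uniq's -w32 compares only the hash column
    find /archive -type f -exec md5sum {} + | sort | uniq -w32 --all-repeated=separate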

But I haven't put that together yet, either.

--Joe

2009/02/18

VMware View 3.0 and proxies

Oops, I haven't blogged the first part of this story. Oh well, maybe later. In brief, we have VMware VDM to satisfy das corporate security. It was working for people on our LAN and on the corporate network, and I got it to work from the internet (while requiring a valid smartcard (SSL user certificate) before letting a user in). That was a cool project I'll have to document here some time.

Well, time moves on and VMware View Manager 3.0 (nee VDM 3.0) was released and implemented in this environment.

The first problem we noticed started when a home user upgraded their View client to 3.0, as they were prompted to on the login page. That's when the smartcard authentication from the internet stopped working. A little investigation (watching network traffic, decrypting with Wireshark, etc.) showed that while the old client would send an HTTPS POST just like IE, the new client didn't send the user's SSL certificate. But since VMware never supported this sort of setup, I just worked through it (another cool solution I'll have to post later). A little bit of rearchitecture, and I was still able to protect enough of the View environment to make me feel secure and to convince the security people that it was sufficient.

Now I've got a similar error from the corporate network. Same message: "Connection to View server could not be established". But WTF? This is on the LAN; there shouldn't be a proxy problem. IE works just fine*, but View can't connect.

* That is to say, IE worked fine with the proxy. But the proxy requires user authentication, which is cached for the browser session, and I didn't think of that until later.

So I fired up Wireshark again, and once again the first couple of CONNECT :443 requests happily sent the Proxy-Authorization: header just like IE, but the last one tried to do a CONNECT without that header, and was tossed back a Squid 407 Proxy Authentication Required.

Ah, that's a relatively easy one to fix, if only I could get the proxy admin to turn off authentication (nope, that's verboten), or do the same sort of magic as I did on the outside firewall deployment (eww, that'd be messy), or maybe bypass the proxy entirely for this? I mean, they're on the LAN. Luckily, VMware apparently thought of this and implemented an undocumented registry key, HKEY_LOCAL_MACHINE\SOFTWARE\VMware, Inc.\VMware VDM\ProxyBypass, which holds a REG_MULTI_SZ list of names or IPs that View should connect to directly instead of going through the proxy.
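
Setting it from a command prompt looks something like this (the server name and address are placeholders; use your own View broker entries):

    rem ProxyBypass is a REG_MULTI_SZ; separate multiple entries with \0
    reg add "HKLM\SOFTWARE\VMware, Inc.\VMware VDM" /v ProxyBypass /t REG_MULTI_SZ /d "view.example.com\0192.0.2.25"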

Did I mention that all of this new behavior is undocumented? And that what I'd been doing in the first place was both unsupported and completely WORKING?

I'd guess that the new View client switched from a standard MS HttpRequest method to something they threw together without the nice functionality that IE bundles into its method. Oh well. It's working again now.

--Joe