<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" ><channel><title>ipHouse Blog &#187; Nick Gasper</title> <atom:link href="http://blogs.iphouse.net/author/nick/feed/" rel="self" type="application/rss+xml" /><link>http://blogs.iphouse.net</link> <description>A friendly, local ISP with a view.</description> <lastBuildDate>Sat, 04 Feb 2012 04:14:51 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.1</generator> <item><title>Here, There Be Storage Related Dragons&#8230;</title><link>http://blogs.iphouse.net/2012/02/03/here-there-be-storage-related-dragons/</link> <comments>http://blogs.iphouse.net/2012/02/03/here-there-be-storage-related-dragons/#comments</comments> <pubDate>Fri, 03 Feb 2012 21:31:46 +0000</pubDate> <dc:creator>Nick Gasper</dc:creator> <category><![CDATA[Opinion]]></category> <category><![CDATA[System Administrators]]></category> <category><![CDATA[geeky]]></category> <category><![CDATA[Virtualization]]></category><guid isPermaLink="false">http://blogs.iphouse.net/?p=2395</guid> <description><![CDATA[I&#8217;m venturing into territory that I don&#8217;t understand; disk scheduling algorithms in Linux. If you know more about this than I then please feel free to disabuse me of any mistaken notions, fundamental errors, or unfortunate statements that I may make in the blog post for future updates. 
This is something that I barely grasp <a href="http://blogs.iphouse.net/2012/02/03/here-there-be-storage-related-dragons/" class="more-link">More &#62;</a>]]></description> <content:encoded><![CDATA[<p>I&#8217;m venturing into territory that I don&#8217;t understand; disk scheduling algorithms in Linux. If you know more about this than I then please feel free to disabuse me of any mistaken notions, fundamental errors, or unfortunate statements that I may make in the blog post for future updates. This is something that I barely grasp but I like to explore and learn. So at the risk of my professional pride, and with the help of Wikipedia, here I go!</p><p>Changing your <a href="http://en.wikipedia.org/wiki/I/O_scheduling#Common_disk_I.2FO_scheduling_disciplines">disk scheduler</a> on a Linux virtual machine to increase performance.</p><p><strong><span id="more-2395"></span>First some background of what we do with storage at <a title="ipHouse" href="http://www.iphouse.com/">ipHouse</a> in our VMware environments.</strong></p><p>We really like <a href="http://en.wikipedia.org/wiki/Network_File_System_%28protocol%29">NFS</a>. Architecturally it&#8217;s simpler than block based storage; you just need a good local area network and a storage system that can export a file based protocol. There&#8217;s no need for specialized hardware or intelligent host bus adapters, just let the storage array handle the storage. Virtualization lends itself to file based storage quite well. VMDKs are just files after all. I kind of snickered when <a href="http://en.wikipedia.org/wiki/VMware">VMware</a> first came out with their <a href="http://www.vmware.com/products/vstorage-apis-for-array-integration/overview.html">VAAI</a> storage extensions. 
It seemed, to me, like they were enhancing block-level storage devices to do a lot of what <a href="http://en.wikipedia.org/wiki/Network-attached_storage">NAS</a> based storage already does.</p><p>While I was taking my VCP4 class my colleagues, most of whom were from big companies, snickered when I mentioned that our storage was on a NAS. A &#8220;filer&#8221; for them was a place for document sharing and storage. There was &#8220;no way&#8221; it would ever be fast enough, or good enough to backend their virtualized infrastructure. I&#8217;ve seen that notion fade more and more as <a href="http://en.wikipedia.org/wiki/ZFS">ZFS</a> has opened the doors for storage startups; and the big players are fighting back with their own specialized NAS devices. There are some really cool ideas floating around: NAS devices that are scale-out, that are optimized for virtualization, and that can do in-line <a href="http://en.wikipedia.org/wiki/Data_deduplication#In-line_deduplication">deduplication</a> of data.</p><p><strong>That being said&#8230;</strong></p><p>I have learned that there are some OS level tweaks that <em>can</em> enhance performance on virtual machines. Most x86 operating systems seem to be optimized for single disks, or internal RAID setups. Understandable as that has traditionally been the bulk of their install base. This means that the OS can manage disk queuing better than the dumb RAID card, or the dumber hard drive. <a href="http://en.wikipedia.org/wiki/CFQ">CFQ</a>, the default disk scheduler as of kernel 2.6.18, does this. As I understand it, CFQ places synchronous read/write requests into per-process queues, and assigns <a href="http://en.wikipedia.org/wiki/Preemption_%28computing%29">timeslices</a> to each queue, weighted by IO priority. The effect is that higher priority processes get longer timeslices, which keeps IO requests from the same process close together. Great idea when the OS has direct access and is managing the storage. 
Not so great when the storage is handled remotely; the array on the other side is doing the scheduling. All of that optimization is effectively ignored. So for a virtual machine it&#8217;s better to switch to a simpler algorithm and let the storage array handle the write queuing.</p><p>From my reading (and testing), it&#8217;s better to switch to the <a href="http://en.wikipedia.org/wiki/Noop_scheduler">noop</a> scheduler. Noop simply shoves all requests into a first-in-first-out (FIFO) queue and can merge requests. It is simple, fast, and is great for flash storage (no mechanical latency) or for situations where optimization is handled by another device. Like a NAS! Perfect for virtualization!</p><p>I discovered this after getting a snippet of a shell script to try from Mike (who got it from a potential vendor that is a big storage geek). This wasn&#8217;t new information as Mike had mentioned this almost 18 months ago in passing but neither he nor I ever tested it. After giving me the info, again, he suggested that I &#8220;test this out, and let me know if it works.&#8221;</p><p>I&#8217;m still testing it, so caveat emptor, but I thought I&#8217;d share it with you.</p><p><span style="text-decoration: underline;">***WARNING DO NOT DO THIS ON A VM WITH SNAPSHOTS***</span></p><pre>
#!/bin/sh

grep '' /sys/block/sd*/queue/scheduler
for d in /sys/block/sd*; do
echo noop &gt; $d/queue/scheduler
done
grep '' /sys/block/sd*/queue/scheduler
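
# A minimal sketch added for illustration (not part of the vendor snippet):
# the grep output marks the active scheduler with [brackets], for example
# "noop deadline [cfq]". active_sched is a hypothetical helper name; it
# prints just the bracketed entry from one such line.
active_sched() {
    printf '%s\n' "$1" | sed 's/.*\[\(.*\)\].*/\1/'
}
# Possible use, assuming a disk named sda:
#   active_sched "$(cat /sys/block/sda/queue/scheduler)"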
</pre><p>This switches the scheduler from cfq to noop on all &#8220;SCSI&#8221; disks in the virtual machine. The change lasts until the next reboot.</p><p>He also added the following tweak, which first prints what the OS has mounted and then increases the read-ahead from 256 sectors to 10000 sectors (blockdev counts in 512-byte sectors), caching more disk data for faster read times.</p><pre>
#!/bin/sh

mount
blockdev --getra /dev/sd?
blockdev --setra 10000 /dev/sd?
blockdev --getra /dev/sd?
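
# A minimal sketch added for illustration (not part of the vendor snippet):
# blockdev counts read-ahead in 512-byte sectors, so this hypothetical
# helper, ra_kib, converts a sector count to KiB.
ra_kib() {
    echo $(( $1 * 512 / 1024 ))
}
# Possible use, assuming a disk named sda:
#   ra_kib "$(blockdev --getra /dev/sda)"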
</pre><p>Again, I&#8217;m still testing this on my personal stuff, but, qualitatively, things feel a lot faster. If nothing else, I haven&#8217;t crashed my Linux systems.</p><p>Anyways, I hope that helps!</p> ]]></content:encoded> <wfw:commentRss>http://blogs.iphouse.net/2012/02/03/here-there-be-storage-related-dragons/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>The Value and Cost of Persistent Data</title><link>http://blogs.iphouse.net/2012/01/27/the-value-and-cost-of-persistent-data/</link> <comments>http://blogs.iphouse.net/2012/01/27/the-value-and-cost-of-persistent-data/#comments</comments> <pubDate>Fri, 27 Jan 2012 18:33:27 +0000</pubDate> <dc:creator>Nick Gasper</dc:creator> <category><![CDATA[Opinion]]></category> <category><![CDATA[System Administrators]]></category> <category><![CDATA[Virtual Machines]]></category> <category><![CDATA[Hosting]]></category> <category><![CDATA[Storage]]></category> <category><![CDATA[Virtualization]]></category> <category><![CDATA[vmForge]]></category><guid isPermaLink="false">http://blogs.iphouse.net/?p=2236</guid> <description><![CDATA[Most 'cloud' type systems don't offer persistent data by default, so it ends up being an extra cost item.]]></description> <content:encoded><![CDATA[<p>I&#8217;ve been cleaning out my house recently. There&#8217;s a lot of crud that&#8217;s just been lying around, collected through the years. My wife describes me as a level 2 hoarder; she says that I would be a shoo-in for that <a href="http://en.wikipedia.org/wiki/Hoarders">A&amp;E show</a>. Going through many, many boxes that I&#8217;ve collected in the basement, I pick through each cord and think &#8220;I might need that.&#8221; I won&#8217;t need it though, so with a small mental push, I put it in the trash bag. Persistent data is a lot like that. 
A lot of companies have, either through policy or inertia, tons of useless information sitting on disks, or tapes, or CDs, that may be useful one day, but probably will not ever be.</p><p><span id="more-2236"></span></p><p>I look at many cloud providers and I see the opposite. Their services were designed for expedience instead of permanence. They make it hard and, at times, very expensive to actually keep data around. Usually you have to attach a &#8220;disk&#8221; (or &#8220;volume&#8221;) to any machine that has data you want to keep and you have to pay for that privilege. You&#8217;d also better have backups because you have no idea about the underlying storage or <a href="http://en.wikipedia.org/wiki/Data_retention">data retention policies</a>.</p><p>Any data that you absolutely need could mean you&#8217;re paying two or three times what you&#8217;d expect in order to keep it.</p><p>To my hoarder eyes the cloud is one big data furnace. It&#8217;s a dangerous place for your information to stay.</p><p>Enterprise data storage is expensive. I&#8217;ve often joked that <a href="http://en.wikipedia.org/wiki/Virtualization">virtualization</a> is a scheme to sell storage arrays. It&#8217;s a tricky game of performance, space, and <a href="http://en.wikipedia.org/wiki/RAID">redundancy</a>. Disks fail, <a href="http://en.wikipedia.org/wiki/Flash_memory">flash</a> is expensive, you never have enough RAM or CPU. There are dozens of types of arrays for hundreds of applications, retention policies, regulations; it&#8217;s a mess! When you have a service that has hundreds of thousands of customers then it may make sense that you discourage persistent data. You want people to consume your resources, pay their bill, and move on. Expedience instead of permanence. I&#8217;ve often been asked: Why is online storage so expensive when hard drives are so cheap? 
Well, this is why.</p><p>We built the <a title="ipHouse" href="http://www.iphouse.com/">ipHouse</a> <a title="ipHouse vmForge Products, virtual data centers or individual virtual machines" href="http://www.iphouse.com/vmforge/" target="_blank">vmForge</a> product with the idea that a virtual data center (VDC) replaces co-located infrastructure. The storage is persistent from the get-go. Is it any wonder that Mike has been loath to call it a &#8216;cloud service&#8217;?</p><p>This means that there are severe implications for any storage array that we put in place. We have to make sure that anything we put in place not only performs well but also goes the distance. It&#8217;s still a very good idea to do backups, though they probably will not be nearly as large, as most customers just need to back up a few key files or the database dumps that happen regularly. (you are backing up your database, right?)</p><p>Well, that&#8217;s my opinion anyways. Now I&#8217;m going to go back home and work on my basement.</p> ]]></content:encoded> <wfw:commentRss>http://blogs.iphouse.net/2012/01/27/the-value-and-cost-of-persistent-data/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Clone-tastic!</title><link>http://blogs.iphouse.net/2012/01/20/clone-tastic/</link> <comments>http://blogs.iphouse.net/2012/01/20/clone-tastic/#comments</comments> <pubDate>Fri, 20 Jan 2012 21:51:17 +0000</pubDate> <dc:creator>Nick Gasper</dc:creator> <category><![CDATA[ipHouse Products]]></category> <category><![CDATA[Opinion]]></category> <category><![CDATA[System Administrators]]></category> <category><![CDATA[Virtual Machines]]></category> <category><![CDATA[Hosting]]></category> <category><![CDATA[Virtualization]]></category> <category><![CDATA[vmForge]]></category><guid isPermaLink="false">http://blogs.iphouse.net/?p=2174</guid> <description><![CDATA[One of the many cool things about virtualization is the ability to clone virtual machines. It&#8217;s really cool! 
Unfortunately, after you work with virtualization for a while you start to take it for granted. I can&#8217;t tell you how many times I roll out a new physical machine and sigh because I can&#8217;t simply clone it. Well, <a href="http://blogs.iphouse.net/2012/01/20/clone-tastic/" class="more-link">More &#62;</a>]]></description> <content:encoded><![CDATA[<p>One of the many cool things about virtualization is the ability to clone virtual machines. It&#8217;s really cool! Unfortunately, after you work with virtualization for a while you start to take it for granted. I can&#8217;t tell you how many times I roll out a new physical machine and sigh because I can&#8217;t simply clone it. Well, I can but that&#8217;s a discussion for another day.<br /> <span id="more-2174"></span> Virtual machines are a set of files that are interpreted by a hypervisor. Since they are just files they can then be copied and/or edited. That&#8217;s all cloning is: the system is just copying the VMDKs (the &#8220;hard drive&#8221; files) and editing the VMX file (the config file) to change things like the MAC address of a NIC and the virtual machine&#8217;s name.</p><p>You can even do it by hand if you have access to the backend storage. Mike once one-upped me by piping the VMX through sed. That&#8217;s cheating but all&#8217;s fair I guess. Cheater.</p><p>The vmForge VDC allows you to clone vApps and the individual machines contained therein. It automatically edits the config, can handle numbering the machine, and makes everything nice and easy. This is a killer feature in my book.</p><p>A lot of cloud providers are instance based. You select the operating system, push it out, and rely on automated services to configure them for you. Most of the time, you don&#8217;t get persistent storage. If you do, it&#8217;s usually a volume you attach to the instance and has nothing to do with its operating system. By using a vmForge VDC you can do the opposite. 
You can create a machine, configure it how you like, and then clone it. Configure once, and be done. Then you can keep a copy of it in your catalog for later deployments. Each clone is exactly that: a complete copy of your original system.</p><p>You may think that&#8217;s really cool! But wait, there&#8217;s more! (sorry, couldn&#8217;t resist)</p><p>When you build virtual machines in your VDC you are building them in vApps. A vApp is a logical container that holds virtual machines, internal networks, and can do things like set boot/shutdown order and power-down semantics.</p><p>When creating a vApp you also have the option to &#8220;fence&#8221; it. Fencing isolates the layer-2 networks within the vApp from any outside network. This means you can have internally consistent IP addressing inside the vApp. You can then &#8220;template&#8221; the vApp by moving it to your catalog and deploy it over and over and over again. That means that your preconfigured, multi-server application can be redeployed with a few mouse clicks!</p><p>Ultimately, cloning is about saving time. You get to use conventional tools to set up multiple machines quickly and easily. You don&#8217;t have to learn any arcane scripting language, nor trust and maintain a complicated configuration service like Chef or Puppet. 
You just set up servers, push them out, and start to use them.</p><p>So, clone away!</p> ]]></content:encoded> <wfw:commentRss>http://blogs.iphouse.net/2012/01/20/clone-tastic/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>FreeBSD 9 and ZFS version 28, THANK YOU!</title><link>http://blogs.iphouse.net/2012/01/18/freebsd-9-and-zfs-version-28-thank-you/</link> <comments>http://blogs.iphouse.net/2012/01/18/freebsd-9-and-zfs-version-28-thank-you/#comments</comments> <pubDate>Wed, 18 Jan 2012 21:05:37 +0000</pubDate> <dc:creator>Nick Gasper</dc:creator> <category><![CDATA[System Administrators]]></category> <category><![CDATA[technology]]></category> <category><![CDATA[Virtualization]]></category> <category><![CDATA[vmForge]]></category><guid isPermaLink="false">http://blogs.iphouse.net/?p=2121</guid> <description><![CDATA[I have great excitement to share about FreeBSD 9 and ZFS version 28 being released. Read my thoughts in this blog post.]]></description> <content:encoded><![CDATA[<p>I&#8217;m excited! My favorite operating system, <a title="FreeBSD - the power to serve!" href="http://www.freebsd.org/" target="_blank">FreeBSD</a>, has gotten an upgrade! There are a lot of small changes but the big one (the one that I&#8217;m excited about) is getting <a title="ZFS - the zettabyte filesystem" href="http://en.wikipedia.org/wiki/ZFS" target="_blank">ZFS</a> version 28 into the kernel.</p><p>ZFS Version 28 adds some of the more important features of ZFS: Deduplication, triple parity RAIDZ3, and RAIDZ. This means that I can have full featured storage devices, running ZFS natively, via FreeBSD.</p><p>As a bonus I don&#8217;t have to learn Solaris.</p><p><span id="more-2121"></span>You can run ZFS in Linux but you would have to either run it via <a title="Filesystem in Userspace" href="http://en.wikipedia.org/wiki/Filesystem_in_Userspace" target="_blank">FUSE</a>, which is file system emulation in user-space, not in the kernel. 
Or download it and build it yourself. In my opinion, both of those options are idiotic. I&#8217;m not willing to jump through those kinds of hoops just to run a filesystem in Linux. I&#8217;d rather have native, in kernel support for it. Until now, your choice was either run Solaris (or a fork of Solaris) or run an outdated version of ZFS via FreeBSD.</p><p>One project that should directly benefit from this: FreeNAS. FreeNAS is a customized installation of FreeBSD designed to operate as a NAS and iSCSI SAN. It has a pretty slick ajax/web interface as of version 8 but so far had missed out on key ZFS features.</p><p>One reason I want to run FreeBSD 9 and ZFS is to better learn ZFS troubleshooting and administration. FreeNAS aside, there are a lot of vendor supported storage devices that are coming into the market based on ZFS. I want to troubleshoot those devices on a lower level. Before this, I would have to install Solaris. This means that I would actually have to navigate to Oracle&#8217;s Website. No thank you.</p><p>In-line deduplication is one of my favorite impractical features of all time. Unfortunately, it requires gobs of memory (8 GB RAM for every 1 TB of storage, if memory serves). Hopefully, someone smart will figure out how to do it on flash, in a practical way, as rebuilding those tables after a power failure would suck. 
(see Mike&#8217;s <a title="Searching for Storage: Tegile" href="http://blogs.iphouse.net/mike/2012/01/searching-for-storage-tegile/" target="_blank">post</a> about Tegile &#8211; a company actually doing such in production today)</p><p>Obviously I don&#8217;t know a lot about ZFS yet which is why I&#8217;m glad I get to learn via FreeBSD.</p><p>If only I could convince ipHouse to give me a little more storage space on my personal VDC&#8230;hint hint!</p> ]]></content:encoded> <wfw:commentRss>http://blogs.iphouse.net/2012/01/18/freebsd-9-and-zfs-version-28-thank-you/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Monitoring, a journey</title><link>http://blogs.iphouse.net/2012/01/09/monitoring-a-journey/</link> <comments>http://blogs.iphouse.net/2012/01/09/monitoring-a-journey/#comments</comments> <pubDate>Mon, 09 Jan 2012 16:55:38 +0000</pubDate> <dc:creator>Nick Gasper</dc:creator> <category><![CDATA[ipHouse Products]]></category> <category><![CDATA[Opinion]]></category> <category><![CDATA[Virtual Machines]]></category> <category><![CDATA[IPv6]]></category> <category><![CDATA[Monitoring]]></category> <category><![CDATA[technology]]></category> <category><![CDATA[Virtualization]]></category> <category><![CDATA[vmForge]]></category><guid isPermaLink="false">http://blogs.iphouse.net/?p=2080</guid> <description><![CDATA[Or &#8220;How I Stopped Worrying and Learned to Love SaaS&#8221; I touched on monitoring in an earlier post but I thought that I would expand on my thoughts. Let me just get this out there: LogicMonitor (company site) is awesome. It&#8217;s not perfect (what is?), but it&#8217;s amazing, simple, straightforward, and it works. 
It combines effective monitoring with graphing <a href="http://blogs.iphouse.net/2012/01/09/monitoring-a-journey/" class="more-link">More &#62;</a>]]></description> <content:encoded><![CDATA[<p>Or &#8220;How I Stopped Worrying and Learned to Love SaaS&#8221;</p><p>I touched on monitoring in an earlier <a title="Infrastructure and Other Games, Part 4" href="http://blogs.iphouse.net/2011/12/08/infrastructure-and-other-games-part-4/">post</a> but I thought that I would expand on my thoughts.</p><p>Let me just get this out there: <a title="ipHouse monitoring service powered by LogicMonitor" href="http://www.iphouse.com/monitoring.html">LogicMonitor</a> (<a title="LogicMonitor - ipHouse likes it!" href="http://www.logicmonitor.com/">company site</a>) is awesome. It&#8217;s not perfect (what is?), but it&#8217;s amazing, simple, straightforward, and it works. It combines effective monitoring with graphing (metrics); it&#8217;s easy to understand and customize and it works.</p><p>Repeat: It works.<br /> <span id="more-2080"></span><br /> I&#8217;ve done some work with other monitoring and graphing/measurement solutions; mostly <a title="Zabbix agent-based monitoring" href="http://www.zabbix.com/">Zabbix</a>, <a title="Nagios, commercial and open source monitoring tools" href="http://www.nagios.org/">Nagios</a>, and <a title="Cacti - open source measurement tool" href="http://www.cacti.net/">Cacti</a>. They all have their strengths and weaknesses. LogicMonitor also has its pluses and minuses, but all in all it works amazingly well and the minuses are few.</p><p>Nagios has, in my opinion, the best monitoring engine. The automatic back off and flap detection combined with per-host customization that can happen in Nagios has not been matched yet. However, configuring Nagios is a nightmare. I got really good at it and I don&#8217;t want to ever do it again. Looking at a blank Nagios setup makes me cringe. 
Tools like <a title="NagioSQL is an open source web based editor for Nagios configuration" href="http://www.nagiosql.org/">NagioSQL</a> help but it&#8217;s still ridiculous. Using Nagios as a customer facing solution would take up too much time and my time is precious to me and our business.</p><p>Cacti is not a monitoring system but it is a great graphing solution, unless your <a title="RRDtool is a data storage type used by many open source tools" href="http://oss.oetiker.ch/rrdtool/">RRD</a> data gets corrupted or lost. Now, that doesn&#8217;t happen much, but when it does, it&#8217;s annoying.</p><p>Zabbix is a great all in one system with a horrible interface. I hate to quibble, I still use Zabbix but I get headaches every time I try to do something. The top down task selection with a history at the bottom is counterintuitive. Getting Zabbix to send out alerts is a chore. It also requires per-host agents for different operating systems, while the SNMP interface works well only if the device you are monitoring fits within the very small pre-configured templates that come with the package. Yes, I can build new templates, repeatedly, but LogicMonitor does this without requiring extra time.</p><p>With our recently launched <a title="ipHouse vmForge virtualization services for virtual data centers and individual virtual machines" href="http://www.iphouse.com/vmforge/">vmForge</a> service offering, we wanted to add an excellent and easy to implement monitoring solution. It was something that we wanted to be able to set up for customers easily while also offering something that they could set up and manage themselves.</p><p><a title="Mike Horwath's articles on blogs.iphouse.net" href="http://blogs.iphouse.net/author/mike/">Mike</a> did quite a bit of digging but didn&#8217;t find anything that fit the bill entirely. Until he stumbled on LogicMonitor.</p><p>It initially attracted our attention because it was network agent based. 
This allows us to put agents behind firewalls and NAT configurations without worrying about all of the details. The agent just requires outbound connectivity over HTTPS.</p><p>We decided to give it a try and we were instantly impressed! It automatically detects available datasources and adds threshold points and instrumentation graphing of operations in a single view. We can add rules and chains for alerting the engineering staff. It has a lot of features laid out in an easy to understand way. It uses SNMP, vendor APIs, and WMI depending on the target host.</p><p>It makes sense so we fired up an evaluation and not long after signed up for services for our own use.</p><p>The developers of LogicMonitor have been great to work with. They have been open to feedback, excited to test things that they haven&#8217;t come across before. We receive queries on how a specific type of device should be measured and bug reports are handled professionally and efficiently.</p><p>The only thing that I don&#8217;t like is that the agent requires Java but that&#8217;s the cost of convenience.</p><p>The only things missing right now are support for IPv6 (which can&#8217;t come too soon) and a back off ability with flap detection. 
(spouses are happier when not woken up to dropped detection events)</p><p>Oh well, it&#8217;s still better than editing Nagios files!</p><p>I&#8217;m looking forward to working with LogicMonitor further and I highly recommend them.</p> ]]></content:encoded> <wfw:commentRss>http://blogs.iphouse.net/2012/01/09/monitoring-a-journey/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Found Script: Backing up data with BASH and XZ</title><link>http://blogs.iphouse.net/2012/01/05/found-script-backing-up-data-with-bash-and-xz/</link> <comments>http://blogs.iphouse.net/2012/01/05/found-script-backing-up-data-with-bash-and-xz/#comments</comments> <pubDate>Thu, 05 Jan 2012 16:41:49 +0000</pubDate> <dc:creator>Nick Gasper</dc:creator> <category><![CDATA[Opinion]]></category> <category><![CDATA[System Administrators]]></category> <category><![CDATA[technology]]></category><guid isPermaLink="false">http://blogs.iphouse.net/?p=1998</guid> <description><![CDATA[Quick post this week as I&#8217;m a little distracted by the impending weekend. Here&#8217;s a script that I use to backup my Maildir. Normally I write my own, but I was feeling lazy. So in the interest of not re-inventing the wheel, I decided to copy one. Unfortunately, I forgot where I got it. Normally <a href="http://blogs.iphouse.net/2012/01/05/found-script-backing-up-data-with-bash-and-xz/" class="more-link">More &#62;</a>]]></description> <content:encoded><![CDATA[<p>Quick post this week as I&#8217;m a little distracted by the impending weekend.</p><p>Here&#8217;s a script that I use to backup my Maildir. Normally I write my own, but I was feeling lazy. So in the interest of not re-inventing the wheel, I decided to copy one. Unfortunately, I forgot where I got it. Normally I&#8217;m good about attributing things that I copy/utilize. So, if this is your script, thank you! If you want attribution, please let me know.</p><p>This is just a bash script. 
It creates an xz&#8217;d tar file in my /home/nick/Archives directory.</p><p>First, check for input.</p><pre>#----script to backup files

if [ $# -lt 1 ]
then
   echo Usage: backup.sh Directory
   exit 1
fi</pre><p>Next, create the date stamp.</p><pre>JJJ=`date '+%d%m%y'`</pre><p>This is the main loop. It runs once for each directory in the argument list ($# counts the arguments that remain, and shift drops one per pass).</p><pre>while [ $# -gt 0 ]
do</pre><p>Print the directory we&#8217;re currently working on, then set the DIR variable to that directory&#8217;s path.</p><pre>   echo $1
   DIR=$1</pre><p>Strip the leading path from the DIR variable, keeping only the directory&#8217;s name.</p><pre>   MODDIR=`basename "$DIR"`</pre><p>Verify the directory exists before doing anything.</p><pre>   #----check directory exists
   if [ ! -d "$DIR" ]
   then
      echo Error: Directory \'$MODDIR\' not found!!</pre><p>Check to see if the destination folder exists. This is where I hard coded it to be /home/user/Archives. If it doesn&#8217;t exist, create the backup in the current directory.</p><pre>   else
      if [ -d ~/Archives ]
      then
        DESTTAR=~/Archives/$MODDIR.$JJJ.tar.xz
      else
        DESTTAR=$MODDIR.$JJJ.tar.xz
      fi</pre><p>Check to see if the destination file exists. If it does, complain and exit.</p><pre>    if [ -e "$DESTTAR" ]
        then
                echo Error: File \'$DESTTAR\' already exists!!
                exit 1</pre><p>Tar the archive using the <a href="http://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Markov_chain_algorithm">xz compression method</a> (the &#8220;J&#8221; option in the tar command)</p><pre>    fi
      tar -cJf "$DESTTAR" "$DIR"
   fi
   shift
done</pre><p>And we&#8217;re done! At some point I&#8217;ll add an auto-pruning portion to the script.</p><p>I just stick the script in my personal bin folder, make it executable, edit my crontab and put an entry in it pointing to my Maildir, and I get nightly backups of all my mail.</p> ]]></content:encoded> <wfw:commentRss>http://blogs.iphouse.net/2012/01/05/found-script-backing-up-data-with-bash-and-xz/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Kickstart your Linux install</title><link>http://blogs.iphouse.net/2011/12/30/kickstart-your-linux-install/</link> <comments>http://blogs.iphouse.net/2011/12/30/kickstart-your-linux-install/#comments</comments> <pubDate>Fri, 30 Dec 2011 19:54:20 +0000</pubDate> <dc:creator>Nick Gasper</dc:creator> <category><![CDATA[Opinion]]></category> <category><![CDATA[System Administrators]]></category> <category><![CDATA[Virtual Machines]]></category> <category><![CDATA[Hosting]]></category> <category><![CDATA[Security]]></category> <category><![CDATA[Virtualization]]></category> <category><![CDATA[vmForge]]></category><guid isPermaLink="false">http://blogs.iphouse.net/?p=1984</guid> <description><![CDATA[I&#8217;ll admit it, I&#8217;m not a huge fan of Red Hat Enterprise Linux. I&#8217;ll administer it, I&#8217;ve worked with it. It&#8217;s a good distribution. I just have a bad taste for RPM based distributions based on my first forays into Linux back in my Mandrake days. I also first started to professionally work with Linux <a href="http://blogs.iphouse.net/2011/12/30/kickstart-your-linux-install/" class="more-link">More &#62;</a>]]></description> <content:encoded><![CDATA[<p>I&#8217;ll admit it, I&#8217;m not a huge fan of Red Hat Enterprise Linux. I&#8217;ll administer it, I&#8217;ve worked with it. It&#8217;s a good distribution. I just have a bad taste for RPM based distributions based on my first forays into Linux back in my Mandrake days. 
I also first started to professionally work with Linux during the last couple of years of RHEL 5, when things were getting long in the tooth. Red Hat&#8217;s release schedule also conflicts with what most of my users want and expect; it&#8217;s far more suited to a corporate environment where having the latest features is not nearly as important as having consistent software versions. That being said, Red Hat has some fantastic tools; Anaconda and Kickstart being my favorites. So I was overjoyed when I discovered Ubuntu had support for Kickstart files! The Ubuntu installer can take Debian style preseed directives but that format, in my opinion, is overly complicated.</p><p>A Kickstart file basically answers the questions that pop up in the installer as the installer goes, removing the need for human interaction. If a question isn&#8217;t answered, the installer pops up with the proper dialog, takes user input, and continues. I can pick and choose what information I want to populate automatically and which information dialogs I want the customer to answer. In my auto install ISOs I prompt the customer for a username and password as I want the users to enter that information.</p><p>When I was tasked with making an auto installing ISO for our customers I was able to create one quickly by using a kickstart file.<br /> <span id="more-1984"></span></p><p>The process of making a CD is a bit verbose, and better handled by some of the how-tos out there.</p><p>But I&#8217;ll take you through my Kickstart file.</p><p>First is some basic information about the system. These settings are fairly self-explanatory.</p><pre>platform=AMD64
#System language
lang en_US
#Language modules to install
langsupport en_US
#System keyboard
keyboard us
#System mouse
mouse none
#System timezone
timezone America/Chicago</pre><p>I disable root, to reflect the Ubuntu default. You can enable it by removing the next line, and setting it with the second.</p><pre>rootpw --disabled
#rootpw jpDhuZtql4of4rfq</pre><p>I do not automatically add a user, but you can with the next line.</p><pre>#user johndoe --fullname "John Doe" --password changeme</pre><p>I don&#8217;t think this does much in an Ubuntu Server install but I put it in anyways.</p><pre>#Use text mode install
text</pre><p>We&#8217;re installing not upgrading.</p><pre>#Install OS instead of upgrade
install</pre><p>Use the CD-ROM.</p><pre>#Use CDROM installation media
cdrom</pre><p>Where are we going to put the bootloader?</p><pre>#System bootloader configuration
bootloader --location=mbr</pre><p>Get rid of any existing partitions.</p><pre>#Partition clearing information
clearpart --all --initlabel</pre><p>Partition the disks using Ubuntu defaults (512MB swap, etc.). This allows the ISO to work on whatever size disk you want. Linux isn&#8217;t great about using swap anyways, so 512MB is plenty.</p><pre>#Disk partitioning information
part /boot --fstype ext3 --size=200 --ondisk=hda
part swap --recommended
part / --fstype ext4 --size 1 --grow</pre><p>Passwd information. I know&#8230; MD5&#8230; You can use something more secure if you wish.</p><pre>#System authorization information
auth  --useshadow  --enablemd5</pre><p>We need DHCP for some of the following steps, as I have no idea what type of network this will be run on. You can specify other info here if you want.</p><pre>#Network information
network --bootproto=dhcp --device=eth0</pre><p>My customers hate having UFW on. I don&#8217;t think this actually works yet in Ubuntu, so I also do it in a later script.</p><pre>#Firewall configuration
firewall --disabled</pre><p>X-Windows on a Server? No thanks.</p><pre>#Do not configure the X Window System
skipx</pre><p>And finally, we want to reboot after installing. This isn&#8217;t actually done, as we&#8217;re going to run a post-install script.</p><pre>#Reboot after installation
reboot</pre><p>Add additional packages to install. I install the fewest here, as I update in a later script, so why install a bunch of stuff only to update it later?</p><pre>%packages
@dns-server
@openssh-server
gcc
build-essential</pre><p>Here comes a post-install script.</p><pre>%post</pre><p>Mount the CD again, as there&#8217;s data we want off of the CD.</p><pre>echo Making CD Mountpoint
mkdir -p /mnt/cdrom
echo Mounting CD
mount -t iso9660 /dev/sr0 /mnt/cdrom</pre><p>Copy over a script that I&#8217;ve written that does updates and additional installs when the virtual machine is first booted.</p><pre>echo Copying Firstboot Script
cp /mnt/cdrom/firstboot /etc/init.d/
chmod +x /etc/init.d/firstboot</pre><p>Update the init structure to run the firstboot script on boot.</p><pre>update-rc.d firstboot defaults
echo Adding new Crontab</pre><p>Add a custom crontab with some randomized sleep values.</p><pre>cp /mnt/cdrom/crontab-template /etc/crontab</pre><p>A script that I wrote that edits resolv.conf to point to the local bind server.</p><pre>echo Copying resolvfix init script
cp /mnt/cdrom/resolvfix /etc/init.d/
chmod +x /etc/init.d/resolvfix
update-rc.d resolvfix start 99 2 3 4 5 .</pre><p>An updated sources.list with a closer mirror.</p><pre>echo Copying Apt Sources
cp /mnt/cdrom/geeks-org-sources.list /etc/apt/sources.list</pre><p>A new dhclient with the local bind server seeded.</p><pre>echo Copying dhclient.conf
cp /mnt/cdrom/dhclient.conf /etc/dhcp3/</pre><p>A new named.conf.options with some useful defaults.</p><pre>echo Copying named.conf.options
cp /mnt/cdrom/named.conf.options /etc/bind/</pre><p>Moving over vmware-tools for installation upon first boot.</p><pre>mkdir /vmware
cd /vmware
echo Extracting Tools
tar zxf /mnt/cdrom/VMwareTools-*.tar.gz</pre><p>Ejecting the CD.</p><pre>echo Unmounting CD
umount /mnt/cdrom</pre><p>Update the system.</p><pre>echo Updating
apt-get update
apt-get -y dist-upgrade</pre><p>And finally, reboot the system (sync for good luck ;) ).</p><pre>echo Rebooting
sync
reboot</pre><p>Now, as I mentioned before, there&#8217;s a firstboot script that I run that does quite a bit of work before the machine is finished. It does things like wipe out the SSH keys, install VMware Tools, remove and purge old kernels, and install applications like MySQL and Apache as required.</p><p>Well, that&#8217;s one of the tricks I have tucked up my sleeve. I hope it helps!</p> ]]></content:encoded> <wfw:commentRss>http://blogs.iphouse.net/2011/12/30/kickstart-your-linux-install/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>SysAdmin Golf: Use dd and netcat to clone a Linux machine</title><link>http://blogs.iphouse.net/2011/12/09/sysadmin-golf-use-dd-and-netcat-to-clone-a-linux-machine/</link> <comments>http://blogs.iphouse.net/2011/12/09/sysadmin-golf-use-dd-and-netcat-to-clone-a-linux-machine/#comments</comments> <pubDate>Fri, 09 Dec 2011 20:52:42 +0000</pubDate> <dc:creator>Nick Gasper</dc:creator> <category><![CDATA[ipHouse Products]]></category> <category><![CDATA[Opinion]]></category> <category><![CDATA[System Administrators]]></category> <category><![CDATA[Virtual Machines]]></category> <category><![CDATA[SysAdmin Golf]]></category> <category><![CDATA[Virtualization]]></category> <category><![CDATA[vmForge]]></category><guid isPermaLink="false">http://blogs.iphouse.net/?p=1811</guid> <description><![CDATA[So, we&#8217;ve been working real hard here at ipHouse to figure out ways to help customers move machines into our vmForge VDC product. VMware Converter works for Windows machines, (allegedly, I&#8217;m going to test it soon) but isn&#8217;t so helpful with Linux machines. 
After wracking my brain, I thought about the various tools used to clone Linux <a href="http://blogs.iphouse.net/2011/12/09/sysadmin-golf-use-dd-and-netcat-to-clone-a-linux-machine/" class="more-link">More &#62;</a>]]></description> <content:encoded><![CDATA[<p>So, we&#8217;ve been working real hard here at ipHouse to figure out ways to help customers move machines into our <a href="http://www.iphouse.com/vmforge/vdc.html">vmForge VDC</a> product. <a href="http://www.vmware.com/products/converter/">VMware Converter</a> works for <a href="http://windows.microsoft.com/en-US/windows/home">Windows</a> machines, (allegedly, I&#8217;m going to test it soon) but isn&#8217;t so helpful with Linux machines. After wracking my brain, I thought about the various tools used to clone Linux boxes. I&#8217;m familiar with dd, a block level disk copying tool, and tried to find a way to use dd to create a VMDK that I could then convert into an OVF and upload. <span id="more-1811"></span></p><p>Then I stumbled on this link (<a href="http://conshell.net/wiki/index.php/Linux_P2V">conshell.net</a>) which explains how to use dd and netcat to copy a disk over a network.</p><p>It was so simple, it verged on genius! But did it work?</p><p>The steps are easy:</p><p>1) Create a virtual machine with a disk about the same size or larger than your source (not smaller).</p><p>Pick an arbitrary port (9001 in this example) and set up your firewall or VSE to allow that port to the target machine.</p><p>2) Boot that new VM into a rescue environment or use a <a href="http://en.wikipedia.org/wiki/Live_CD">live cd</a>.</p><p>3) Use the following commands:</p><p>On the VM: <code>nc -l -p 9001 | dd of=/dev/sda</code></p><p>On your source machine: <code>dd if=/dev/sda | nc &lt;address of the new VM&gt; 9001</code></p><p>4) Wait a long time&#8230; I averaged around 15Mbps from my test machine to my new VM; it ranged from 30Mbps down to 7Mbps. I&#8217;m sure that had more to do with my network than anything. 
Still, this can take a while.</p><p>5) Once the dd has completed (dd will dump summary information), reboot the machine back into the live-cd environment, check the partitions with <code>e2fsck</code>, and re-size them. (I cheated and used <code>gparted</code>.)</p><p>6) At this point you can either mount the filesystem and remove the udev rules (in /etc/udev/rules.d/) or boot into your VM and remove them via the console. Either way, you have to reboot after the udev rules are removed.</p><p>7) Reboot, and voilà!</p><p>The live cd I used was <a href="http://www.cdlinux.info/wiki/doku.php/">CDLinux</a>. It&#8217;s a small Linux distribution that runs <a href="http://www.xfce.org/">XFCE</a>, and fits in an 80MB ISO. It also includes an SSH server, so you can set up an ssh tunnel, and use netcat against that rather than use an arbitrary port. It also has the VMware paravirtual scsi drivers.</p><p>Anyways, this worked. Wow did it work. I didn&#8217;t bother to zero out the remaining space on the disk. It took me about 2.5 hours to move 8GB worth of data, but I was greeted with a familiar prompt in a new place as soon as I booted it up.</p><p>Now a couple of caveats. I did this on a running system, with no prep work. I would recommend trimming unnecessary data and shutting down as many services as you can. It&#8217;s best to do this when the machine is &#8220;down,&#8221; not doing anything beyond facilitating the copy. However, it does work on a live system. 
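</p><p>If the source can be kept quiet during the copy (for example, by booting it from a live cd as well), one way to check the result is to compare a checksum of the same number of bytes on both sides. This is only a sketch, not something from the test above: <code>prefix_sum</code> is a hypothetical helper name, it assumes <code>head -c</code> and <code>sha256sum</code> are available in the rescue environment, and the two hashes will not match if the source was written to while dd was running.</p><pre># Hypothetical helper: print the SHA-256 of the first N bytes of a file or device.
# Usage: prefix_sum PATH BYTES
prefix_sum() {
    head -c "$2" "$1" | sha256sum | cut -d' ' -f1
}</pre><p>Run it on both machines with the byte size of the source disk (on Linux, <code>blockdev --getsize64 /dev/sda</code> prints it) and compare the two hashes. Limiting the read to the source size matters because the target disk may be larger than the source.</p><p>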
Still, if I were moving a production system, I would follow the advice in the linked article above.</p><p>But, it was my system, in a test environment, so I didn&#8217;t really care.</p><p>Still, isn&#8217;t it amazing what a couple of UNIX pipes can do?</p> ]]></content:encoded> <wfw:commentRss>http://blogs.iphouse.net/2011/12/09/sysadmin-golf-use-dd-and-netcat-to-clone-a-linux-machine/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Infrastructure and Other Games, Part 4</title><link>http://blogs.iphouse.net/2011/12/08/infrastructure-and-other-games-part-4/</link> <comments>http://blogs.iphouse.net/2011/12/08/infrastructure-and-other-games-part-4/#comments</comments> <pubDate>Thu, 08 Dec 2011 20:22:40 +0000</pubDate> <dc:creator>Nick Gasper</dc:creator> <category><![CDATA[ipHouse Products]]></category> <category><![CDATA[Opinion]]></category> <category><![CDATA[System Administrators]]></category> <category><![CDATA[DNS]]></category> <category><![CDATA[Monitoring]]></category> <category><![CDATA[Virtualization]]></category> <category><![CDATA[vmForge]]></category><guid isPermaLink="false">http://blogs.iphouse.net/?p=1747</guid> <description><![CDATA[Part 4: The Other Stuff Thanks for reading my series on moving from my single all-in-one server and my small ESXi server to ipHouse&#8217;s vmForge VDC product. I previously discussed moving my websites to a virtual webcluster, and moving email to a virtual mailcluster. Now I just had to move three small servers, and install <a href="http://blogs.iphouse.net/2011/12/08/infrastructure-and-other-games-part-4/" class="more-link">More &#62;</a>]]></description> <content:encoded><![CDATA[<h3>Part 4: The Other Stuff</h3><p>Thanks for reading my series on moving from my single all-in-one server and my small ESXi server to ipHouse&#8217;s <a title="vmForge Virtual Data Center" href="http://www.iphouse.com/vmforge/">vmForge VDC</a> product. 
I previously discussed moving my websites to a virtual webcluster, and moving email to a virtual mailcluster. Now I just had to move three small servers, and install a fourth.</p><p>The first server I moved was a small experimental VM used for testing various network, web and other items. I like to have a dedicated testing environment for every operating system that I professionally run. This server was responsible for my personal <a href="http://en.wikipedia.org/wiki/Teredo_tunneling">Teredo</a> tunneling, and was the one I put my CGI testing on a while ago. I could have easily moved it, but I wanted to see how the export/import from ESXi to vmForge worked. I stopped the machine on my ESXi server, downloaded it as an OVF and uploaded it, via my Windows machine, to my catalog. It imported it as a template. I then deployed the template and deleted the server. It worked flawlessly! All I had to do was renumber the machine and I was done.<span id="more-1747"></span></p><p>The next server was a little more complicated. It was originally a CounterStrike:Source server that I had converted into an Apache Tomcat JSP host. Because it already had a working Java setup, I added an <a href="http://www.igniterealtime.org/projects/openfire/">OpenFire</a> Jabber server, and a <a href="http://www.logicmonitor.com/">LogicMonitor</a> agent to it. This gave me the ability to monitor my internal network from LogicMonitor, a monitoring solution that we&#8217;re looking into. The triple Java duties of this machine, unfortunately, put a big crunch on its RAM, so it took a lot of tweaking on the application level to get them to play nicer with each other.</p><p>The next server was a monitoring server that I had set up running <a href="http://www.zabbix.com/">Zabbix</a>. I had previously gotten Nagios working on it, but it was too burdensome for me to maintain. I also liked having graphing and service level alerting as well as agent based checks, both active and passive. 
The biggest problem with Zabbix was getting it initially set up to send alerts, so it was nicer to import this machine, which had a working base, than to start from scratch. LogicMonitor does pretty much everything that Zabbix does, and better, but why not have two monitoring solutions? I also set up that machine to be a centralized logging server in case I ever want to install a log analyzer like <a href="http://www.splunk.com/">Splunk</a>. I set it to copy the logs to a MySQL database, and to run php-logcon, but that didn&#8217;t scale past a few thousand entries.</p><p>Next was installing a FreeBSD server to act as a centralized tool, mail environment, and storage space for myself and my friends. I love FreeBSD; the only reason I set up my other servers as Linux boxes was pure laziness on my part, which I&#8217;ll pay for later in administration time. Also, they are mostly single purpose appliances, and it&#8217;s nice to have some of the Debian style scripting for web built-in. I try to stay fairly OS agnostic, but I do have preferences.</p><p>My shell server would have the most exposure to the internet, so I wanted a relatively secure system. Also, I would be spending most of my time in that server, so I decided to go with the OS I love. That would also bring things full circle, as my pfSense box and Shell server are both FreeBSD.</p><p>I decided on installing FreeBSD 8.2 stable. I sliced my disks like this:</p><pre>/           512MB
swap        1GB (1x Memory)
/usr        5GB
/var        10GB (Modest space for DB and info)
/home       140GB (An egregious space for storing files)</pre><p>I installed the OS and ports. Having switched from <code>cvsup</code> to <code>csup</code> a while ago, I updated my ports-supfile and stable-supfile to point to a local(ish) mirror, and checked out /usr/src and /usr/ports. I then updated my kernel config (tip: compile without debugging if you want it to fit in 512MB), reinstalled, and rebooted. Voila! A new FreeBSD system. I&#8217;ll probably go into doing a comprehensive FreeBSD install in a later post.</p><p>I installed Postfix and Dovecot2 for local mail, Apache 2 for user directories, and migrated my users&#8217; information, passwords, and home directories from my old server. Everything went surprisingly smoothly. I installed Mutt for myself, Alpine for one of my users, and a few other pieces of software, and I had a fully running shell server. I was going to run <a href="http://www.powerdns.com/content/home-powerdns.html">PowerDNS</a> and PowerAdmin on one of my Linux boxes, but I decided to stick with BIND on the FreeBSD server, as it was more efficient for me to edit text files than use a web interface. Weird, I know. Now that my shell server was done, and everything was migrated, I could turn off my old FreeBSD box. I admit that I did feel a little bad as I typed <code>halt</code> into its shell for the last time. 
It served me well over the last four years.</p><p>Now my infrastructure migration was complete, running fully virtualized, lowering my power consumption, gaining redundancy, and boosting performance for a fraction of the cost of having physical infrastructure.</p><p>I Win!</p><p>Game Over.</p> ]]></content:encoded> <wfw:commentRss>http://blogs.iphouse.net/2011/12/08/infrastructure-and-other-games-part-4/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>Infrastructure and Other Games, Part 3</title><link>http://blogs.iphouse.net/2011/11/29/infrastucture-and-other-games-part-3/</link> <comments>http://blogs.iphouse.net/2011/11/29/infrastucture-and-other-games-part-3/#comments</comments> <pubDate>Tue, 29 Nov 2011 22:24:37 +0000</pubDate> <dc:creator>Nick Gasper</dc:creator> <category><![CDATA[Opinion]]></category> <category><![CDATA[Virtualization]]></category> <category><![CDATA[vmForge]]></category><guid isPermaLink="false">http://blogs.iphouse.net/?p=1669</guid> <description><![CDATA[Part 3: The Mailcluster Last week I discussed creating a web cluster to take over hosting websites from my decommissioned server. I wrote about setting up the layout, load balancing inside of the firewall and performance testing. This week is about setting up a load balanced mail cluster. Now, last week I described mail as <a href="http://blogs.iphouse.net/2011/11/29/infrastucture-and-other-games-part-3/" class="more-link">More &#62;</a>]]></description> <content:encoded><![CDATA[<h3>Part 3: The Mailcluster</h3><p>Last week I discussed creating a web cluster to take over hosting websites from my decommissioned server. I wrote about setting up the layout, load balancing inside of the firewall and performance testing. This week is about setting up a load balanced mail cluster.</p><p><span id="more-1669"></span>Now, last week I described mail as being finicky. It is. 
The biggest problem with email is that it was designed during a time when the Internet was a small place, where everyone basically trusted everyone else. The core protocol is very trusting and open. Unfortunately, this makes it very easy to exploit. The big problem is that people will want to send your users spam, and will want your server to send their spam to someone else. That means you need to layer security on top of any email solution to protect your server. Email consists of multiple services that each need to be protected. This makes each server rather complicated.</p><p>For the design of the mail cluster, I had the choice of going either very complex, with multiple servers each performing one role, or very simple, with one server performing multiple roles. The former is geared towards performance, and requires a couple of things that I don&#8217;t have: an internal load balancer, and a lot more RAM and CPU. The latter is something that I&#8217;ve done, and frankly, is hard to support and upgrade. I might write about my ideal cluster in another post, but for now, this is what I created.</p><p>My mail cluster consists of three servers. Much like my web cluster, there&#8217;s an NFS/Database server, and a couple of front-ends. Each front-end does multiple things: it greylists incoming mail, scans incoming mail for spam content, scans all mail for viruses, handles POP/IMAP connections, and acts as a webmail client. The load balancer manages all connections coming from the external network, and balances the connections between the front-ends. The NFS/Database server acts as the mailstore and has the account information, passwords, and common configuration files.</p><p>Setting up the backend server was a lot like setting up my web cluster&#8217;s NFS and databases. I would need a directory structure for storing mail using the Maildir format. I needed a database for Maia, another for PostfixAdmin/User information, and a third for SQLGrey. 
Easy as pie.</p><p>The first question for me was what SMTP server to use. I&#8217;m very familiar with Postfix, I like the way it handles queuing, and its configuration files are fairly straightforward. I&#8217;m also a big fan of Dovecot, as it does POP3, IMAP, SASL and can act as a delivery agent with sieve rules. For greylisting, I chose SQLGrey. Again, it&#8217;s something that I&#8217;m familiar with, it has a very low overhead (unlike Policyd) and it just works. For mail content scanning, I used an Amavisd based software called Maia Mailguard and ClamAV. Maia acts as a front-end to SpamAssassin, giving my users individual configuration settings, quarantine support and a few other things.</p><p>Setting up Postfix was very similar to my decommissioned server&#8217;s configuration. I enabled SASL, and followed, more or less, the settings from the Dovecot wiki for an SASL setup.</p><p>I added a couple of RBLs that I find *very* effective. An RBL is a dynamic list that someone maintains of servers that you shouldn&#8217;t accept mail from. Many RBLs are too aggressive, or have draconian criteria for removal that should not be encouraged (SORBS and Barracuda, to name a couple). The only one that I currently have enabled is zen.spamhaus.org. This is a very effective RBL with a very, very low rate of false-positives, and it is managed fairly and effectively. There are a couple more that I have run in the past, but most illicit mail is caught by Zen.</p><p>Next is a parameter that checks SQLGrey to see if the message should be deferred by greylisting. Greylisting is a technique that temporarily rejects messages from a new source, asking the sender to retry later. This prevents poorly implemented mailservers from sending mail because they usually don&#8217;t have a properly implemented queue, and just move on to the next target. Unfortunately, this can delay mail from legitimate servers, as there&#8217;s no telling how long it will take them to send re-queued mail. 
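</p><p>The decision itself is simple. As a rough shell sketch of the idea (this is not SQLGrey&#8217;s code; <code>greylist_decision</code>, its arguments, and the 300 second delay are assumptions made up for illustration):</p><pre># Hypothetical illustration of a greylisting decision, not SQLGrey's implementation.
# $1 = epoch time this client/sender/recipient combination was first seen (empty if never)
# $2 = current epoch time
# $3 = minimum delay in seconds before a retry is accepted
greylist_decision() {
    if [ -z "$1" ]; then
        echo defer      # never seen: answer with a temporary failure and remember it
    elif [ $(( $2 - $1 )) -ge "$3" ]; then
        echo accept     # retried after the delay: behaves like a real queueing server
    else
        echo defer      # retried too quickly: keep deferring
    fi
}
greylist_decision "" 1000 300      # first attempt
greylist_decision 1000 1400 300    # retry 400 seconds later</pre><p>The first call defers and the second accepts; a sender that never retries simply never gets past the first answer.</p><p>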
Also, some servers that are not RFC compliant may react unpredictably to greylisting. However, it works as expected greater than 90% of the time, and cuts down on a lot of what gets filtered by Amavisd/Maia/SpamAssassin. That filtering is CPU expensive, so greylisting allows me to handle a lot more mail than I could without it, with the resources I have available to me.</p><p>The next change I add is a &#8220;sleep 7&#8221; to smtpd_client_restrictions. This adds a delay before one of the &#8220;250 OK&#8221; responses, which helps ensure that an smtp server is interactive and RFC compliant, and not some flat script that is designed to blast mail. Again, this winnows the amount of incoming mail.</p><p>The next change is a content_filter parameter that pipes incoming mail to Amavisd. This shuffles mail to the content filters. As I stated before, Maia uses SpamAssassin for content filtering. This gives a set of static rules that are editable, and allows for Bayesian based filtering. Basically, Bayesian filtering works like this: every message is broken into &#8220;tokens&#8221; consisting of the content between any two spaces. Each token is dynamically weighed: messages that are marked as good contain good tokens, and messages considered spam contain bad tokens. The tokens in a message are added together, and if the number of bad tokens is a certain level above the number of good tokens, the message is marked as spam, and the tokens are reassessed. Over time, good tokens get *really* good, and bad tokens get *really* bad, but the bulk of a message will be made up of fairly neutral tokens. Maia, unfortunately, needs a lot of Perl packages. There&#8217;s a fairly useful script out there for installing Maia on Ubuntu (which I had to modify slightly, as it was for an older Ubuntu version). You can Google it. I might be convinced to write an in depth post on how to install Maia in the future.</p><p>Virus scanning is a bit tricky, because you *must* set up Amavisd to use a ClamAV socket. 
Otherwise, as a backup, Amavisd will run a clamscan process on every message, and your server will be brought to its proverbial knees (fork, exec again ;)). My frontends, which can normally handle a few hundred messages at a time, see a steep drop (down to maybe a dozen at a time) when clamscan is used.</p><p>Going back to the Postfix setup, the next thing to configure is the database settings. The database contains mail account user information. I use PostfixAdmin as a front-end for managing my users, and it has a fairly comprehensive set of setup instructions. I use a read-only user for Dovecot and a separate one for Postfix. I also put in proxy_read_maps to share the database lookups between Postfix processes.</p><p>After getting Postfix squared away, I moved to setting up Dovecot. Now, a lot winds up going through Dovecot: it handles authentication, incoming mail, managesieve, and delivery.</p><p>I set up authentication to first refer to a passwd-file. This is used to proxy IMAP/POP3/Sieve connections to another server that I have set up for users that want to have a shell environment. Yes, Dovecot can act as a proxy for the services it supports. This allows me to filter mail inside my frontends, while giving them access to the mail features of the cluster. Unfortunately, I have to manually keep the file updated, but this is much easier than setting up LDAP. It would have to be LDAP because I need to add extra fields that are not in a traditional passwd file, so, as far as I know, I could not use NIS.</p><p>I then set up Dovecot to check my PostfixAdmin database. This is a simple SQL query, with a small customization. I added a recipient delimiter of a &#8220;+&#8221; so that my users could sign up for things using something like bob+spamlist@example.com. 
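</p><p>The mailbox lookup has to ignore that extension. A minimal sketch of the normalization in shell, for illustration only (<code>strip_extension</code> is a hypothetical name, and the real work happens in the SQL query rather than in a script):</p><pre># Hypothetical helper: drop a "+extension" from the local part of an address.
strip_extension() {
    printf '%s\n' "$1" | sed 's/+[^@]*@/@/'
}
strip_extension bob+spamlist@example.com    # prints bob@example.com</pre><p>An address without a &#8220;+&#8221; passes through unchanged.</p><p>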
The full &#8220;to&#8221; address is preserved in headers so you can filter against it, but it would mess up the SQL query, so I have the query edit out the &#8220;+&#8221; and anything thereafter until you get to the &#8220;@&#8221; sign. My virtual users all sign in with their full email address. This is the way PostfixAdmin formats things, and it avoids name collision.</p><p>SASL is handled via a socket: Postfix queries the socket with the username and password, and Dovecot returns whether the account info is valid or not. This, again, allows for SMTP authentication.</p><p>Incoming mail is handled by POP3 and IMAP4. I enabled SSL and TLS, and added a self-signed certificate. This was relatively straightforward, though I had to tweak the mail location, as I added namespace support. This allowed me to set up a &#8220;Public&#8221; mail folder for each domain, so that they could share mail without replicating too much data.</p><p>On the Dovecot side, mail delivery is fairly mindless. I had to add parameters to support quotas for virtual mail, and sieve filtering.</p><p>Once I had the front-ends and backend set up, I had to configure the firewall and load balancer. There are a lot more TCP ports to worry about. Instead of just 80 (HTTP) and 443 (HTTPS), you have POP3 (110), IMAP4 (143), POP3-SSL (995), IMAP4-SSL (993), SMTP (25), SMTP-SSL (465), SUBMISSION (587), and sieve (2000).</p><p>Because of the way Relayd handles load balancing, each port required a virtual server and a pool of servers to attach to. Then, they had to be doubled, one for IPv4 and one for IPv6. This added up to a ton of configuration overhead.</p><p>Once I got all the pieces together, I had a fairly powerful little mailcluster with content scanning, and some ability to scale. 
It took awhile, but it was worth doing.</p><p>Now I just had a few other application/custom servers to set up.</p><p>Next week: The Other Stuff.</p> ]]></content:encoded> <wfw:commentRss>http://blogs.iphouse.net/2011/11/29/infrastucture-and-other-games-part-3/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> </channel> </rss>
