<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Roobyz Ramblings]]></title><description><![CDATA[A responsive site for sharing thoughts and lessons learned on data science, programming, and technology.]]></description><link>https://roobyz.github.io</link><image><url>/images/background.jpg</url><title>Roobyz Ramblings</title><link>https://roobyz.github.io</link></image><generator>RSS for Node</generator><lastBuildDate>Tue, 30 May 2017 03:35:13 GMT</lastBuildDate><atom:link href="https://roobyz.github.io/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[PVE: Virtual Windows]]></title><description><![CDATA[<div class="sect1">
<h2 id="_create_a_windows_image">Create a Windows Image&#8230;&#8203;</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Imagine running Windows 7 in a VM, with near-native Windows performance and the ability to switch easily between Windows and Linux. Sit back and relax, because in this post we will tackle the first part of that wish!</p>
</div>
<div id="toc" class="toc">
<div id="toctitle" class="title">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#_create_a_windows_image">Create a Windows Image&#8230;&#8203;</a></li>
<li><a href="#_install_windows_7">Install Windows 7</a>
<ul class="sectlevel2">
<li><a href="#_step_1_create_the_vm">Step 1: Create the VM</a>
<ul class="sectlevel3">
<li><a href="#_initial_setup">Initial Setup</a></li>
<li><a href="#_setting_tweaks">Setting Tweaks</a></li>
</ul>
</li>
<li><a href="#_step_2_begin_installation">Step 2: Begin Installation</a></li>
<li><a href="#_step_3_post_installation">Step 3: Post Installation</a>
<ul class="sectlevel3">
<li><a href="#_update_windows">Update Windows!</a></li>
<li><a href="#_install_additional_drivers">Install Additional Drivers</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#_checkpoint">Checkpoint</a>
<ul class="sectlevel2">
<li><a href="#_backup">Backup</a></li>
<li><a href="#_cloning">Cloning</a></li>
</ul>
</li>
<li><a href="#_next_steps">Next Steps</a></li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_install_windows_7">Install Windows 7</h2>
<div class="sectionbody">
<div class="paragraph">
<p>We will complete this process in three steps. First, we create the VM. Second, we install Windows into the VM. Third, we tweak the VM and Windows environment for improved performance.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/important.png" alt="important.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>Note</strong>: This post took a bit longer than planned, a result of the beta status of the PVE version we are using combined with the newness of the Ryzen platform, which means that not all of the kinks have been sorted out yet. As a result, there were some interesting issues with UEFI, ZFS, and the backup/restore process that needed to be resolved. That said, we have some workarounds and can continue with our journey.</p>
</div></div></td>
</tr>
</tbody>
</table>
<div class="sect2">
<h3 id="_step_1_create_the_vm">Step 1: Create the VM</h3>
<div class="sect3">
<h4 id="_initial_setup">Initial Setup</h4>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>Place the ISO images into the templates folder.</p>
</li>
<li>
<p>Set up our VM using OVMF (UEFI BIOS).</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>At the end of this process, we will have a PVE configuration file (<a href="https://pve.proxmox.com/pve-docs/chapter-qm.html" target="_blank">/etc/pve/qemu-server/&lt;VMID&gt;.conf</a>) with our Windows VM settings. The VMID for this example is 100, so the file would be called <code>/etc/pve/qemu-server/100.conf</code> and will contain something like the following:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>bios: ovmf
bootdisk: virtio0
cores: 4
cpu: host
hotplug: disk,usb
ide0: zfs-templates:iso/virtio-win-0.1.126.iso,media=cdrom,size=152204K
ide2: zfs-templates:iso/Win7_HomePrem_SP1_English_x64.iso,media=cdrom
memory: 16384
name: Windoze
numa: 0
ostype: win7
scsihw: virtio-scsi-pci
sockets: 1
vga: qxl
virtio0: vm-disks:vm-100-disk-1,cache=writeback,size=128G</code></pre>
</div>
</div>
</div>
<div class="sect3">
<h4 id="_setting_tweaks">Setting Tweaks</h4>
<div class="paragraph">
<p>We need to make some manual updates to this configuration.</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>Add the <code>args</code> line to enable sound and make our Ryzen processor <em>appear</em> as a Haswell processor in Windows. This should allow for continued Windows updates with balanced compatibility and performance.</p>
</li>
<li>
<p>Update the <code>cpu</code> line to include a <code>hidden</code> parameter that makes Windows appear <em>as if</em> it isn&#8217;t running inside a virtual machine. This should help avoid issues with GPU passthrough on Nvidia cards.</p>
</li>
<li>
<p>Add the <code>hostpci</code> lines to include the GPU IDs identified in our prior post: <a href="/2017/05/03/Server-Virtualization-Management-Part3.html#_configure_the_vfio_modules" target="_blank">Configure the VFIO Modules</a>. This will allow Windows to recognize the GPU for driver setup. We will update this again later.</p>
</li>
<li>
<p>Update the <code>ide</code> lines to <code>sata</code> lines.</p>
</li>
<li>
<p>Ensure the <code>machine</code> line is set to <code>q35</code>.</p>
</li>
</ol>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>args: -cpu Haswell-noTSX,hv_vendor_id=TowerVM,kvm=off -device intel-hda,id=sound5,bus=pcie.0,addr=0x18 -device hda-micro,id=sound5-codec0,bus=sound5.0,cad=0 -device hda-duplex,id=sound5-codec1,bus=sound5.0,cad=1
cpu: host,hidden=1
hostpci0: host=28:00.0,pcie=1
hostpci1: host=28:00.1,pcie=1
sata0: zfs-templates:iso/virtio-win-0.1.126.iso,media=cdrom,size=152204K
sata2: zfs-templates:iso/Win7_HomePrem_SP1_English_x64.iso,media=cdrom
machine: q35</code></pre>
</div>
</div>
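<div class="paragraph">
<p>The ide-to-sata rename above is easy to get wrong by hand, so it can be scripted. A minimal sed sketch, using a local <code>100.conf</code> stand-in for the real <code>/etc/pve/qemu-server/100.conf</code>:</p>
</div>

```shell
# Stand-in for /etc/pve/qemu-server/100.conf; point conf at the real file on the PVE host.
conf=100.conf
cat > "$conf" <<'EOF'
ide0: zfs-templates:iso/virtio-win-0.1.126.iso,media=cdrom,size=152204K
ide2: zfs-templates:iso/Win7_HomePrem_SP1_English_x64.iso,media=cdrom
EOF
# Rewrite only lines that begin with "ide<digit>:" so no other settings are touched.
sed -i 's/^ide\([0-9]\):/sata\1:/' "$conf"
cat "$conf"   # the two lines now read sata0: ... and sata2: ...
```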
<div class="paragraph">
<p>We should now be ready to install Windows.</p>
</div>
</div>
</div>
<div class="sect2">
<h3 id="_step_2_begin_installation">Step 2: Begin Installation</h3>
<div class="paragraph">
<p>Before watching this great <a href="https://www.youtube.com/watch?v=thVmhIw4-jU" target="_blank">PVE Installation Video</a> from ProxMox, make note of the following key points:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>An important first step: quickly open the Spice console and hit the spacebar when prompted after the UEFI boot screen, to launch the Windows installation process. Then initiate the Windows Setup. Windows will complain that "no drives were found".</p>
</li>
<li>
<p>Select "Custom (advanced)" and then:</p>
<div class="olist loweralpha">
<ol class="loweralpha" type="a">
<li>
<p>Install NetKVM drivers:</p>
<div class="olist lowerroman">
<ol class="lowerroman" type="i">
<li>
<p>click <mark>Load Driver</mark></p>
</li>
<li>
<p>click <mark>Browse</mark> and then select the drive with the "virtio-win" drivers.</p>
</li>
<li>
<p>select the folder: NetKVM &#8594; w7 &#8594; amd64</p>
</li>
<li>
<p>click <mark>OK</mark></p>
</li>
<li>
<p>click <mark>Next</mark></p>
</li>
</ol>
</div>
</li>
<li>
<p>Install viostor drivers:</p>
<div class="olist lowerroman">
<ol class="lowerroman" type="i">
<li>
<p>click <mark>Load Driver</mark></p>
</li>
<li>
<p>click <mark>Browse</mark> and then select the drive with the "virtio-win" drivers.</p>
</li>
<li>
<p>select the folder: viostor &#8594; w7 &#8594; amd64</p>
</li>
<li>
<p>click <mark>OK</mark></p>
</li>
<li>
<p>click <mark>Next</mark></p>
</li>
</ol>
</div>
</li>
</ol>
</div>
</li>
</ol>
</div>
<div class="paragraph">
<p>At this point, the installer will recognize the virtual disk and can continue with the installation. In addition, Windows should automatically recognize and configure the network card. We can skip the registration process until later. Select our "Home" network.</p>
</div>
</div>
<div class="sect2">
<h3 id="_step_3_post_installation">Step 3: Post Installation</h3>
<div class="sect3">
<h4 id="_update_windows">Update Windows!</h4>
<div class="paragraph">
<p>After installing Windows we should immediately check for Windows updates. This will likely take the most time of this process and require many reboots. Keep checking for updates until Windows says there are no more.</p>
</div>
<div class="paragraph">
<p>Afterward, we should go back to our PVE Web GUI, select our Windows VM, select <code>Hardware</code>, and then for each <code>CD/DVD Drive</code> click <code>Edit</code>. In the edit window, we should select "Do not use any media" and then click <code>OK</code>.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/important.png" alt="important.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>Note</strong>: Although Windows 7 supports UEFI boot, Microsoft designed it under the assumption that it is running alone on real hardware. It is likely that when you try to run our Windows VM it will bail out to the <em>UEFI Interactive Shell</em>. If you get stuck there, on the PVE Web GUI click <code>Shutdown &#8594; PowerOff</code> for the Windows VM. We can download <a href="http://www.supergrubdisk.org/" target="_blank">Super Grub Disk</a> and then upload the iso to our zfs-templates folder. Once we have assigned it to our first "CD/DVD", we can start our VM, which will boot into the Grub bootloader. Press &lt;enter&gt; on <code>Detect and show boot methods</code> to quickly identify the boot partitions, select the first entry <code>(hd0,gpt1)/efi/Boot/bootx64.efi</code>, and press &lt;enter&gt; again to boot into Windows.</p>
</div></div></td>
</tr>
</tbody>
</table>
</div>
<div class="sect3">
<h4 id="_install_additional_drivers">Install Additional Drivers</h4>
<div class="paragraph">
<p>After updating, we should install the <a href="https://www.spice-space.org/download.html" target="_blank">Spice Guest Tools</a> for Windows. This will include drivers that will speed up the VM and make it nicer to use with Virt-Viewer. Also, we should install the <a href="http://www.geforce.com/drivers" target="_blank">Nvidia Geforce graphics drivers</a>.</p>
</div>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_checkpoint">Checkpoint</h2>
<div class="sectionbody">
<div class="sect2">
<h3 id="_backup">Backup</h3>
<div class="paragraph">
<p>This is a great time to save our work. In case something goes haywire, we should have a backup so that we don&#8217;t go through that whole process again, right? In addition, wouldn&#8217;t it be nice to have a clean slate in case of a virus or worm? We are talking about Windows after all.</p>
</div>
<div class="paragraph">
<p>Backing up our VM is easy to do with PVE. First, we shut down Windows. Next, on our PVE Web GUI, we select the "Summary" tab of our Windows VM and confirm the Status is "Stopped". Finally, we select the "Backup" tab, click "Backup Now", and then click "Backup". Then we need to be patient until it completes.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>INFO: transferred 137438 MB in 515 seconds (266 MB/s)
INFO: stopping kvm after backup task
INFO: archive file size: 13.89GB
INFO: Finished Backup of VM 100 (00:08:43)</code></pre>
</div>
</div>
<div class="paragraph">
<p>After backing up our VM, we may want to verify the integrity of our backup. PVE uses a new backup format called <a href="https://pve.proxmox.com/wiki/VMA" target="_blank">VMA</a>.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/important.png" alt="important.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>Note</strong>: While trying to restore, the PVE process generated an error and failed. Unfortunately, this process removes the original VM disk image before trying to restore the backup, which meant that I had to repeat the installation process multiple times. However, this should be resolved when the production version of PVE is released. In the meantime, we can use the following manual process to back up and restore.</p>
</div></div></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>We can also back up the drive image directly from the command line. There are multiple ways to do this. Using the Unix command <code><a href="https://en.wikipedia.org/wiki/Dd_(Unix)" target="_blank">dd</a></code>, we will duplicate the emulated zvol block device as a raw file in our zfs pool.</p>
</div>
<div class="listingblock">
<div class="title">Backing up our VM disk image</div>
<div class="content">
<pre class="highlight"><code># Install "pixz" for parallel compression of our disk image
apt-get install pixz

# Backup ("dd") and compress ("pixz") our VM disk image
# Note: it is a good idea to add a date to the filename in case
#       you want to have multiple backup versions.
dd if=/dev/zvol/tank/vm-disks/vm-100-disk-1 | pixz -2 &gt; /tank/100-windows.raw.xz

# 137438953472 bytes (137 GB, 128 GiB) copied, 603.995 s, 228 MB/s
# 4.4G 100-windows.raw.xz

# Note: we could back up our image without compression and let zfs
# handle the compression by default (lz4). Although lz4 is
# faster for interactive I/O, it isn't as space efficient as xz. In
# addition, since pixz can run in parallel, it is more than twice as
# fast as the default compression. Compare this performance to above.
# dd if=/dev/zvol/tank/vm-disks/vm-100-disk-1 of=/tank/100-windows.raw
# 137438953472 bytes (137 GB, 128 GiB) copied, 1480.54 s, 92.8 MB/s
# 128G 100-windows.raw</code></pre>
</div>
</div>
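<div class="paragraph">
<p>Since pixz writes standard xz-compatible archives, plain <code>xz -t</code> can check the backup&#8217;s integrity without decompressing it to disk. A sketch on a throwaway stand-in file; substitute the real backup path:</p>
</div>

```shell
# Create a tiny stand-in "image" so the commands run anywhere.
dd if=/dev/zero of=demo.raw bs=1M count=2 status=none
xz -2 -k -f demo.raw                     # stand-in for the pixz-compressed backup
xz -t demo.raw.xz && echo "archive OK"   # -t tests integrity, writes nothing
```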
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: A zvol is an emulated <a href="https://en.wikipedia.org/wiki/Device_file#Block_devices" target="_blank">block device</a> provided by ZFS. By default, PVE creates one zvol for each VM-disk. Once we create our VM we can see our disks by running: <code>zfs list -t all -r tank</code>. The disks would be hidden in the tank folder, however, they would be mapped as block devices which we can see if we run: <code>ll /sys/block/zd*</code>. It is important to note that PVE saves the disk size in our configuration file, however, it also runs a command similar to <code>zfs create -V 128G tank/vm-disks/vm-100-disk-1</code> which also sets the zvol block device size in our zpool. Ensure you update your drive size in the PVE Web GUI so that they match up.</p>
</div></div></td>
</tr>
</tbody>
</table>
<div class="listingblock">
<div class="title">Restoring our backup:</div>
<div class="content">
<pre class="highlight"><code># Check to ensure our VM disk is still enabled:
zfs list -t all -r tank

# We should see something like:
# tank/vm-disks/vm-100-disk-1   132G  1.72T  21.7G  -
#
# if something bad happened and we cannot see the zvol, we can
# manually recreate it with the following command:
zfs create -V 128G tank/vm-disks/vm-100-disk-1

# Restore our compressed VM disk image.
pixz -d -i /tank/100-windows.raw.xz | dd of=/dev/zvol/tank/vm-disks/vm-100-disk-1

# 137438953472 bytes (137 GB, 128 GiB) copied, 799.285 s, 172 MB/s</code></pre>
</div>
</div>
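<div class="paragraph">
<p>To be confident the manual backup/restore is lossless, we can checksum the image before and after a round trip. A sketch on a throwaway stand-in image; substitute your zvol path and backup file for the real thing:</p>
</div>

```shell
# Throwaway "disk image" standing in for /dev/zvol/tank/vm-disks/vm-100-disk-1
dd if=/dev/urandom of=image.raw bs=1M count=2 status=none
before=$(sha256sum image.raw | cut -d' ' -f1)
xz -2 -k -f image.raw                 # stand-in for: dd ... | pixz -2 > backup.xz
xz -dc image.raw.xz > restored.raw    # stand-in for: pixz -d ... | dd of=/dev/zvol/...
after=$(sha256sum restored.raw | cut -d' ' -f1)
[ "$before" = "$after" ] && echo "restore verified"
```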
<div class="paragraph">
<p>Congratulations! We can now back up and restore our disk images. In case of emergency, we can recreate our image to a known good state. In addition, we have the opportunity to move our disk image to another computer.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: If we already have a VM disk image that we want to restore to PVE, we can use a similar process to our backup and restore process. The key point to consider: whatever format the disk image is in (qcow2 or otherwise), we first need to convert it to "raw" format. For example, we could run something like: <code>qemu-img convert -O raw windoze.qcow2 windoze.raw</code>. Once converted to raw format we might want to compress it like: <code>pixz -2 windoze.raw windoze.raw.xz</code>. Finally, we need to create a VM with a disk size equal to our original image, and then we can restore similar to above.</p>
</div></div></td>
</tr>
</tbody>
</table>
</div>
<div class="sect2">
<h3 id="_cloning">Cloning</h3>
<div class="paragraph">
<p>Another cool feature of PVE is cloning. We can clone our existing Windows VM to get an alternate installation to experiment with. For example, maybe we want to install some experimental software or make hardware changes without risking our standard VM.</p>
</div>
<div class="paragraph">
<p>Depending on how large the VM disk image is, the cloning process may take a while, so be patient until it completes. To check progress, we can run from the command line: <code>zfs list -t all -r tank</code>. In this example, we can see that <strong>vm-101-disk-1</strong> REFER size is only 20.1G compared to 22.1G for <strong>vm-100-disk-1</strong>, which we are cloning:</p>
</div>
<div class="listingblock">
<div class="title">Example</div>
<div class="content">
<pre class="highlight"><code>NAME                          USED  AVAIL  REFER  MOUNTPOINT
tank                          169G  1.59T  16.7G  /tank
tank/vm-disks                 152G  1.59T    96K  /tank/vm-disks
tank/vm-disks/vm-100-disk-1   132G  1.70T  22.1G  -
tank/vm-disks/vm-101-disk-1  20.1G  1.59T  20.1G  -</code></pre>
</div>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_next_steps">Next Steps</h2>
<div class="sectionbody">
<div class="paragraph">
<p>At this point, we should have our Windows installation running on our Ryzen server as if it were running on a Haswell machine, at near-native performance. Our Nvidia drivers should be installed; however, the "Display Adapter" status will likely show as: "This device cannot find enough free resources that it can use. (Code 12)".</p>
</div>
<div class="paragraph">
<p>Considering that Ryzen is a new platform, the system bios and Linux Kernel support are not quite at 100% yet. AMD recently released an <a href="https://community.amd.com/community/gaming/blog/2017/05/25/community-update-4-lets-talk-dram" target="_blank">AGESA Update</a> that should make it into a bios update over the next few weeks. This should improve virtualization support for PCI Express Access Control Services (ACS). The ACS support also needs to make it into the Linux kernel. After these updates are ready, we should be able to finish enabling full GPU Passthrough to our VM.</p>
</div>
</div>
</div>]]></description><link>https://roobyz.github.io/2017/05/29/Windows-on-Prox-Mox-PVE.html</link><guid isPermaLink="true">https://roobyz.github.io/2017/05/29/Windows-on-Prox-Mox-PVE.html</guid><category><![CDATA[Blog]]></category><category><![CDATA[Open_Source]]></category><category><![CDATA[Technology]]></category><category><![CDATA[ProxMox]]></category><dc:creator><![CDATA[Roberto Rivera]]></dc:creator><pubDate>Mon, 29 May 2017 00:00:00 GMT</pubDate></item><item><title><![CDATA[PVE: Virtualization for Work and Play (Part 3)]]></title><description><![CDATA[<div class="sect1">
<h2 id="_system_optimization">System Optimization&#8230;&#8203;</h2>
<div class="sectionbody">
<div class="paragraph">
<p>In the <a href="/2017/04/25/Server-Virtualization-Management-Part2.html">previous post</a> we installed ProxMox Virtual Environment (PVE) and configured our ZFS ZPool storage system. Let&#8217;s tweak our system to improve performance.</p>
</div>
<div id="toc" class="toc">
<div id="toctitle" class="title">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#_system_optimization">System Optimization&#8230;&#8203;</a></li>
<li><a href="#_zfs_tune_up">ZFS Tune-up</a></li>
<li><a href="#_graphics_processing_unit_gpu_passthrough">Graphics Processing Unit (GPU) Passthrough</a>
<ul class="sectlevel2">
<li><a href="#_enable_the_vfio_modules">Enable the VFIO Modules:</a></li>
<li><a href="#_configure_the_vfio_modules">Configure the VFIO Modules</a>
<ul class="sectlevel3">
<li><a href="#_identify_passthrough_device">Identify Passthrough Device</a></li>
<li><a href="#_enable_passthrough_device">Enable Passthrough Device</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#_update_boot_settings">Update Boot Settings</a></li>
<li><a href="#_final_thoughts">Final Thoughts</a></li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_zfs_tune_up">ZFS Tune-up</h2>
<div class="sectionbody">
<div class="paragraph">
<p><a href="https://www.openindiana.org/documentation/faq/#what-is-openindiana" target="_blank">Solaris-based UNIX</a> and Linux treat <a href="https://en.wikipedia.org/wiki/Extended_file_attributes" target="_blank">extended attributes</a> differently, which has some performance implications. By default, ZFS on Linux is set to <code>xattr=on</code>, which causes the extended attributes to be stored in separate hidden directories. Changing this property to "<em>system attributes</em>" improves performance significantly under Linux as a result of extended attributes being stored more efficiently on disk.</p>
</div>
<div class="paragraph">
<p>To check the attribute run <code>zfs get -r xattr <mark>tank</mark></code> (update tank with your zpool name) and we&#8217;ll see something like the following:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>NAME                         PROPERTY  VALUE  SOURCE
tank                         xattr     on     default
tank/vm-disks                xattr     on     default</code></pre>
</div>
</div>
<div class="paragraph">
<p>To update the property run <code>zfs set xattr=sa <mark>tank</mark></code></p>
</div>
<div class="paragraph">
<p>Linux has extended attributes called <a href="https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.0/html/Administration_Guide/ch09s05.html" target="_blank">POSIX ACLs</a> that are not functional on other platforms. To check the attribute, run <code>zfs get -r acltype <mark>tank</mark></code> and we should get something like:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>NAME                         PROPERTY  VALUE     SOURCE
tank                         acltype   off       local
tank/vm-disks                acltype   off       inherited from tank</code></pre>
</div>
</div>
<div class="paragraph">
<p>To update the property run <code>zfs set acltype=posixacl <mark>tank</mark></code>. Also, so that ACLs get passed to files created within a directory, we need to run <code>zfs set aclinherit=passthrough <mark>tank</mark></code> as well.</p>
</div>
<div class="paragraph">
<p>Linux has two parameters related to the times when files are accessed. The first is <code>atime</code>, which tracks the “last” access time. This creates a lot of overhead, because every read of a file triggers a write to disk to record the access time. The second is <code>relatime</code>, which updates the access time only when it is older than the modify time (or at most once a day), so it writes far less often. We can check both of them with <code>zfs get -r atime tank &amp;&amp; zfs get -r relatime tank</code> and we should see something like:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>NAME                         PROPERTY  VALUE  SOURCE
tank                         atime     on     default
tank/vm-disks                atime     on     default

NAME                         PROPERTY  VALUE     SOURCE
tank                         relatime  off       default
tank/vm-disks                relatime  off       default</code></pre>
</div>
</div>
<div class="paragraph">
<p>If we needed to track access times, "relatime" would be preferable; however, since "relatime" is already disabled, we can disable both simply by running <code>zfs set atime=off <mark>tank</mark></code>.</p>
</div>
<div class="paragraph">
<p>We can set additional options for reliability. Run <code>zpool set autoreplace=on <mark>tank</mark></code> so that ZFS can automatically switch to an available hot spare if hardware errors are detected on online disks. Run <code>zpool set autoexpand=on <mark>tank</mark></code> to allow the pool to grow when all VDEVs have been replaced with larger ones. This must be set before any drives are replaced, so we may as well set it now.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: Many ZFS properties are not retroactive. To apply to existing files, we would need to replace the files. In other words, if you already have files or data stored on your ZFS pool, you would need to move them somewhere else (i.e. backup) and then move them back so that the changes in properties are applied correctly.</p>
</div></div></td>
</tr>
</tbody>
</table>
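<div class="paragraph">
<p>The property changes above can be collected into a small script for review before anything is applied. A sketch that writes the sequence to a hypothetical <code>tuneup.sh</code>; the pool name <code>tank</code> is an assumption, so substitute your own:</p>
</div>

```shell
# Generate the tune-up sequence for review; apply it later with: sh tuneup.sh
POOL=tank   # assumption -- substitute your zpool name
cat > tuneup.sh <<EOF
zfs set xattr=sa $POOL
zfs set acltype=posixacl $POOL
zfs set aclinherit=passthrough $POOL
zfs set atime=off $POOL
zpool set autoreplace=on $POOL
zpool set autoexpand=on $POOL
EOF
cat tuneup.sh
```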
</div>
</div>
<div class="sect1">
<h2 id="_graphics_processing_unit_gpu_passthrough">Graphics Processing Unit (GPU) Passthrough</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Passthrough allows our Virtual Machine (VM) to access GPU hardware for games, graphics, and heavy computation (i.e. deep learning). We must enable <a href="https://en.wikipedia.org/wiki/Input%E2%80%93output_memory_management_unit" target="_blank">IOMMU ("Input–output memory management unit")</a> drivers, which allocate device-visible <em>virtual</em> addresses to the actual physical addresses. IOMMU enables our VM to communicate with our GPU using the virtual addresses <em>as if</em> it were directly communicating to the GPU.</p>
</div>
<div class="paragraph">
<p><a href="https://www.kernel.org/doc/Documentation/vfio.txt" target="_blank">VFIO ("Virtual Function I/O")</a> modules are part of an IOMMU device-agnostic framework for exposing direct device access to userspace, in a <em>secure</em> IOMMU protected environment.  In other words, they provide access to non-privileged, low-overhead userspace drivers.</p>
</div>
<div class="sect2">
<h3 id="_enable_the_vfio_modules">Enable the VFIO Modules:</h3>
<div class="paragraph">
<p>Run <code>nano /etc/modules</code> and add the following:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd</code></pre>
</div>
</div>
<div class="paragraph">
<p>Save and exit: press CTRL+X, Y for yes, and ENTER.</p>
</div>
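<div class="paragraph">
<p>If this setup may be re-run (for example from a provisioning script), the lines can be appended idempotently so <code>/etc/modules</code> never accumulates duplicates. A sketch using a local <code>./modules</code> file as a stand-in:</p>
</div>

```shell
modfile=./modules   # stand-in for /etc/modules
touch "$modfile"
for m in vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do
  # -x matches whole lines, -F matches literally; append only when missing
  grep -qxF "$m" "$modfile" || echo "$m" >> "$modfile"
done
wc -l < "$modfile"   # stays at 4 no matter how often the loop runs
```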
</div>
<div class="sect2">
<h3 id="_configure_the_vfio_modules">Configure the VFIO Modules</h3>
<div class="sect3">
<h4 id="_identify_passthrough_device">Identify Passthrough Device</h4>
<div class="paragraph">
<p>To identify the GPU to passthrough run <code>lspci -nn | grep VGA</code>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>21:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107 [GeForce GTX 745] [10de:1382] (rev a2)
28:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)</code></pre>
</div>
</div>
<div class="paragraph">
<p>Identify the GPU slot IDs (first pair of numbers separated by a colon):</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>My GPU Slot ID for passthrough is: <code><mark>28:00</mark></code></p>
</li>
<li>
<p>My GPU Slot ID for the host is: <code>21:00</code></p>
</li>
</ol>
</div>
<div class="paragraph">
<p>Identify the vendor ID for passthrough: <code>lspci -nns <mark>28:00</mark> | cut -d "(" -f 1 | cut -d ":" -f 3,4</code></p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code> NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80]
 NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0]</code></pre>
</div>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>Vendor ID for GPU VGA device: <code>10de:1b80</code></p>
</li>
<li>
<p>Vendor ID for GPU Audio device: <code>10de:10f0</code></p>
</li>
</ol>
</div>
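<div class="paragraph">
<p>The vendor:device pair can also be pulled out with a single grep over the lspci output. A sketch on a captured sample line (the text stands in for live <code>lspci -nns</code> output):</p>
</div>

```shell
line='28:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)'
# Match the [xxxx:xxxx] vendor:device bracket ([0300] has no colon, so it is skipped)
printf '%s\n' "$line" | grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]'
# -> 10de:1b80
```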
</div>
<div class="sect3">
<h4 id="_enable_passthrough_device">Enable Passthrough Device</h4>
<div class="paragraph">
<p>To enable <a href="https://pve.proxmox.com/wiki/Pci_passthrough" target="_blank">passthrough</a>, add the following module options (including the comma-separated vendor IDs identified in the prior step). This loads options for the vfio-pci kernel module, which maps memory regions from the PCI bus to the VM, and activates support for IOMMU groups.</p>
</div>
<div class="paragraph">
<p>Run <code>nano /etc/modprobe.d/kvm.conf</code> and add <mark><em>some</em></mark> of the following options (see Table 1 for details):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># uncomment the first option if required for your system.
#options vfio_iommu_type1 allow_unsafe_interrupts=1
options vfio-pci         ids=10de:1b80,10de:10f0
options vfio-pci         disable_vga=1
options kvm-amd          npt=0
options kvm              ignore_msrs=1</code></pre>
</div>
</div>
<div class="paragraph">
<p>Save and exit: press CTRL+X, Y for yes, and ENTER.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<caption class="title">Table 1. Module option details</caption>
<colgroup>
<col style="width: 30.7692%;">
<col style="width: 69.2308%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top">Option</th>
<th class="tableblock halign-left valign-top">Details</th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">allow_unsafe_interrupts=1</p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>This workaround is for platforms without interrupt remapping support, which provides device isolation. It removes protection against <a href="http://invisiblethingslab.com/resources/2011/Software%20Attacks%20on%20Intel%20VT-d.pdf" target="_blank">MSI-based interrupt injection attacks</a> by guests.  Only trusted guests and drivers should be run with this configuration.</p>
</div></div></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">ids=<mark>10de:1b80,10de:10f0</mark></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>Assign desired GPU to the virtual pci for use in our VM.</p>
</div></div></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">disable_vga=1</p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>Opt the devices out of VGA arbitration where possible.</p>
</div></div></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">npt=0</p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>Disable Nested Page Tables (NPT) if VM performance is very slow. Linux guests with Q35 and OVMF may work with NPT on or off; however, a Linux guest with i440fx only works with NPT disabled.</p>
</div></div></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">ignore_msrs=1</p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>Prevent some Nvidia applications from crashing the VM.</p>
</div></div></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_update_boot_settings">Update Boot Settings</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Configure IOMMU and VFIO to load first so that framebuffer drivers don’t grab the GPU while booting. Then commit the changes to GRUB and generate a new boot image.</p>
</div>
<div class="paragraph">
<p>Run <code>nano /etc/default/grub</code> and change <code>GRUB_CMDLINE_LINUX_DEFAULT="quiet"</code> as follows:</p>
</div>
<div class="paragraph">
<p><code>GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on kvm_amd.avic=1 rd.driver.pre=vfio-pci video=efifb:off"</code></p>
</div>
<div class="paragraph">
<p>Save and exit: press CTRL+X, Y for yes, and ENTER.</p>
</div>
<div class="paragraph">
<p>Afterward, run:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>update-grub          # update boot loader
update-initramfs -u  # update boot image
reboot               # reboot machine</code></pre>
</div>
</div>
<div class="paragraph">
<p>After our computer reboots, run <code>lspci -nnks <mark>28:00</mark></code> to check that the driver loaded correctly. If everything went well, for each device we should see <code>vfio-pci</code> for our "Kernel driver in use".</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>28:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)
	Subsystem: ZOTAC International (MCO) Ltd. GP104 [GeForce GTX 1080] [19da:1451]
	Kernel driver in use: vfio-pci
	Kernel modules: nvidiafb, nouveau
28:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
	Subsystem: ZOTAC International (MCO) Ltd. GP104 High Definition Audio Controller [19da:1451]
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel</code></pre>
</div>
</div>
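<div class="paragraph">
<p>The same check can be scripted. A small sketch, shown here against a canned copy of the output above; on the live system you would pipe <code>lspci -nnks 28:00</code> into it instead:</p>
</div>

```shell
# Report whether every "Kernel driver in use" line shows vfio-pci.
check_vfio() {
    awk '/Kernel driver in use:/ && $NF != "vfio-pci" { bad=1 }
         END { print (bad ? "NOT all on vfio-pci" : "all on vfio-pci") }'
}

sample='28:00.0 VGA compatible controller
	Kernel driver in use: vfio-pci
28:00.1 Audio device
	Kernel driver in use: vfio-pci'

printf '%s\n' "$sample" | check_vfio   # all on vfio-pci
```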
<div class="paragraph">
<p>Also, run <code>dmesg | grep -e AMD-Vi -e vAPIC</code> to check our IOMMU settings.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>[    0.893699] AMD-Vi: IOMMU performance counters supported
[    0.895145] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[    0.895146] AMD-Vi: Extended features (0xf77ef22294ada):
[    0.895146]  PPR NX GT IA GA PC GA_vAPIC
[    0.895148] AMD-Vi: Interrupt remapping enabled
[    0.895149] AMD-Vi: virtual APIC enabled
[    0.895257] AMD-Vi: Lazy IO/TLB flushing enabled</code></pre>
</div>
</div>
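<div class="paragraph">
<p>It can also be useful to see how the devices were grouped, since devices in the same IOMMU group must be passed through together. A sketch that walks the sysfs tree; the function takes an optional base directory (defaulting to the live <code>/sys/kernel/iommu_groups</code>) so it can be tried against any path:</p>
</div>

```shell
# List every PCI device per IOMMU group.
list_iommu_groups() {
    base="${1:-/sys/kernel/iommu_groups}"
    for link in "$base"/*/devices/*; do
        [ -e "$link" ] || continue   # skip when the glob matched nothing
        group=$(basename "$(dirname "$(dirname "$link")")")
        printf 'group %s: %s\n' "$group" "$(basename "$link")"
    done
}

list_iommu_groups
```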
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: AMD Virtual Interrupt Controller (AVIC) virtualizes local <a href="https://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller" target="_blank">APIC</a> registers of each vCPU via the virtual APIC (vAPIC) backing page. This allows guest access to certain APIC registers without needing to emulate the hardware behavior, and should speed up workloads that generate large amount of interrupts.</p>
</div></div></td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="sect1">
<h2 id="_final_thoughts">Final Thoughts</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Congratulations! We have our PVE server configured and ready to use. We can now begin <a href="https://pve.proxmox.com/wiki/VM_Templates_and_Clones" target="_blank">creating Virtual Machines (VMs)</a> or <a href="https://pve.proxmox.com/wiki/Linux_Container" target="_blank">Containers</a>. In future posts, we&#8217;ll consider additional opportunities for enhancing performance and security for our server, VMs, and Containers.</p>
</div>
<div class="paragraph">
<p>Although we have configured passthrough on the server, our VMs still need updates to leverage that feature. Because Nvidia reserves passthrough <em>support</em> for its commercial GPU line (Quadro), it actively tries to inhibit passthrough on its consumer line (GeForce). We will have to consider potential workarounds to enable that functionality, which may involve future tweaks to our server settings.</p>
</div>
</div>
</div>]]></description><link>https://roobyz.github.io/2017/05/03/Server-Virtualization-Management-Part3.html</link><guid isPermaLink="true">https://roobyz.github.io/2017/05/03/Server-Virtualization-Management-Part3.html</guid><category><![CDATA[Blog]]></category><category><![CDATA[Open_Source]]></category><category><![CDATA[Technology]]></category><category><![CDATA[ProxMox]]></category><dc:creator><![CDATA[Roberto Rivera]]></dc:creator><pubDate>Wed, 03 May 2017 00:00:00 GMT</pubDate></item><item><title><![CDATA[PVE: Virtualization for Work and Play (Part 2)]]></title><description><![CDATA[<div class="sect1">
<h2 id="_getting_started">Getting Started&#8230;&#8203;</h2>
<div class="sectionbody">
<div class="paragraph">
<p>In the <a href="/2017/04/23/Server-Virtualization-Management.html">previous post</a>, we learned about ProxMox Virtual Environment (PVE) and outlined the plan to build a powerful "bang for the buck" home server for games <em>and</em> other system-intensive pursuits. Before proceeding, you should feel a little comfortable with the <a href="http://linuxcommand.org/lc3_learning_the_shell.php" target="_blank">Linux CLI (command line interface)</a>. Now let&#8217;s begin.</p>
</div>
<div id="toc" class="toc">
<div id="toctitle" class="title">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#_getting_started">Getting Started&#8230;&#8203;</a></li>
<li><a href="#_installation">Installation</a>
<ul class="sectlevel2">
<li><a href="#_pve_drive_options">PVE Drive Options</a></li>
<li><a href="#_zfs_partitions">ZFS Partitions</a></li>
<li><a href="#_zfs_setup">ZFS Setup</a></li>
</ul>
</li>
<li><a href="#_post_installation">Post-Installation</a>
<ul class="sectlevel2">
<li><a href="#_adjusting_the_pve_repositories">Adjusting the PVE Repositories</a></li>
<li><a href="#_update_pve">Update PVE</a></li>
<li><a href="#_update_the_pve_storage_system">Update the PVE Storage System:</a></li>
</ul>
</li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_installation">Installation</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The <a href="https://pve.proxmox.com/wiki/Quick_installation" target="_blank">PVE Quick Installation</a> guide does a wonderful job of highlighting the key installation points and showing how simple the process really is. If you follow the default settings, PVE will install to <em>local disks</em> and should take about ten minutes or less. If you have a large single drive that you want to use, then you can skip to the post-installation section below, since that guide is all that you would need.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 10%;">
<col style="width: 80%;">
<col style="width: 10%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>PVE Installation Process
<span class="image"><img src="https://roobyz.github.io/images/Server-Virtualization-Management/pve-installation.gif" alt="pve-install"></span></p>
</div></div></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
</tbody>
</table>
<div class="sect2">
<h3 id="_pve_drive_options">PVE Drive Options</h3>
<div class="paragraph">
<p>For our setup, we will disconnect all the drives except for our boot drive, and follow most of the default installation options with one exception. Have a peek at the <a href="https://pve.proxmox.com/wiki/Installation" target="_blank">ProxMox PVE installation</a> guide for background on these options. Since we only want to use half of the boot drive (512GB NVMe), our <em>hard disk options</em> are as follows:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><strong><code>ext4</code></strong> <em>filesystem</em>: Standard Linux filesystem is the safe bet.</p>
</li>
<li>
<p><strong><code>176.0</code></strong> <em>hdsize</em>: Shooting for half of our 512GB drive: the values below should add up to 256GB, and 256GB minus the 80GB <em>minfree</em> leaves 176GB.</p>
</li>
<li>
<p><strong><code>64.0</code></strong> <em>swapsize</em>: Linux swap file size (equal to our RAM size). Be sure to set vm.swappiness to a low value if you have your swap file on an SSD! It&#8217;ll increase RAM usage a bit, but will be easier on our SSD.</p>
</li>
<li>
<p><strong><code>96.0</code></strong> <em>maxroot</em>: / root file partition</p>
</li>
<li>
<p><strong><code>80.0</code></strong> <em>minfree</em>: This should equal our ZFS log (16GB) plus our ZFS cache (64GB).</p>
</li>
<li>
<p><strong><code>16.0</code></strong> <em>maxvz</em>: This is the pve-data partition.</p>
</li>
</ul>
</div>
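<div class="paragraph">
<p>The <code>vm.swappiness</code> tweak mentioned above can be made persistent with a sysctl drop-in. A sketch; the filename is arbitrary, and 10 is simply a commonly used low value for SSD-backed swap:</p>
</div>

```
# /etc/sysctl.d/99-swappiness.conf (hypothetical filename)
# Prefer keeping pages in RAM over swapping to the SSD.
vm.swappiness = 10
```

<div class="paragraph">
<p>Apply it without rebooting via <code>sysctl --system</code>.</p>
</div>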
<div class="paragraph">
<p>When you get to the <em>Installation Successful</em> step of the PVE install, click the "Reboot" button.</p>
</div>
</div>
<div class="sect2">
<h3 id="_zfs_partitions">ZFS Partitions</h3>
<div class="paragraph">
<p>After rebooting you can log in via the PVE Web GUI or through the command line using SSH. Logging in via ssh:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>ssh root@10.10.1.10

#The authenticity of host '10.10.1.10 (10.10.1.10)' can't be established.
#ECDSA key fingerprint is SHA256:2ExP+SHaCo+9ZOt+sk90DPLAafdHFJTHPyeU1qtFXIg.
#Are you sure you want to continue connecting (yes/no)?</code></pre>
</div>
</div>
<div class="paragraph">
<p>Type "yes" and then enter the password set during installation.</p>
</div>
<div class="paragraph">
<p>After logging into our new PVE installation, we want to add two more partitions: a ZFS log (16GB) and a ZFS cache (64GB). The combined allocation will then be 256GB, leaving us with half of our NVMe for other options like dual-boot, additional storage, etc.</p>
</div>
<div class="paragraph">
<p>After logging in, run (update for your drive): <code>cfdisk /dev/sda</code></p>
</div>
<div class="paragraph">
<p>Go down to the "free space" line in green and add a 16GB partition. Move down to the next "free space" line in green and add a 64GB partition. Then select <code>write</code>, confirm, select <code>quit</code>, and we are done.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 10%;">
<col style="width: 80%;">
<col style="width: 10%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>Creating ZFS Log and Cache Partitions
<span class="image"><img src="https://roobyz.github.io/images/Server-Virtualization-Management/pve-cfdisk-process.gif" alt="pve-cfdisk"></span></p>
</div></div></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>Once these two partitions are added, we can shut down PVE from the command line:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>shutdown -h now</code></pre>
</div>
</div>
</div>
<div class="sect2">
<h3 id="_zfs_setup">ZFS Setup</h3>
<div class="paragraph">
<p>Once PVE has shut down, we can reconnect the remaining drives and restart our system. Before setting up our ZFS storage, we must back up any data that we want to keep.</p>
</div>
<div class="paragraph">
<p>Let&#8217;s start our <a href="http://open-zfs.org/wiki/Performance_tuning" target="_blank">ZFS configuration</a>. As mentioned in our previous post, we are configuring ZFS as striped-mirrored storage. Since we have a 2TB spinning disk that we want to use for backup, we will mirror it as an <em>automatic</em> backup.</p>
</div>
<div class="paragraph">
<p>Our drives should all be the same size; otherwise, we will lose storage capacity. Since our SSD drives are 1TB each, we need to partition our 2TB spinning disk into two 1TB partitions. Before partitioning, identify the correct drives; run <code>lsblk</code> to get the list of block devices:</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 10%;">
<col style="width: 80%;">
<col style="width: 10%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>List of block devices
<span class="image"><img src="https://roobyz.github.io/images/Server-Virtualization-Management/pve-lsblk.png" alt="pve-lsblk"></span></p>
</div></div></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>In my example, the 2TB (1.8T) drive is <code>/dev/sdc</code>. The following commands will give the drive a new GPT partition table and create the two partitions:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># Install parted
apt-get install parted

# remove everything on the /dev/sdc drive and
# replace with two empty equal-sized partitions
parted /dev/sdc --script mklabel gpt \
       mkpart primary 0% 50% \
       mkpart primary 50% 100% \
       print</code></pre>
</div>
</div>
<div class="paragraph">
<p>After partitioning, we can mirror and stripe the drives. When we create the drive mirrors, ZFS creates virtual devices (vdevs). We can then connect the vdevs together into zpools. For example, we can <em>mirror</em> two 1TB drives and we end up with a 1TB vdev that will automatically replicate our data across both drives. Then we combine the two 1TB mirrored vdevs and end up with 2TB of storage.</p>
</div>
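<div class="paragraph">
<p>The capacity arithmetic is worth making explicit. A small sketch for the striped-mirror layout described above:</p>
</div>

```shell
# Each mirror halves its raw capacity; striping sums the vdevs.
drive_tb=1                         # each mirror member is 1TB
vdevs=2                            # two mirrored vdevs, striped together
raw_tb=$((drive_tb * 2 * vdevs))   # four 1TB members in total
usable_tb=$((drive_tb * vdevs))    # 1TB per mirror, summed across vdevs
echo "raw: ${raw_tb}TB, usable: ${usable_tb}TB"   # raw: 4TB, usable: 2TB
```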
<div class="paragraph">
<p>Since the zpool read/write transactions are balanced across the two vdevs, we can actually get an increase in drive performance, with transactions happening in parallel across two physical drives. We can also compress the data on the zpool: because our CPU can compress and decompress data much faster than the drives can read and write it, the smaller transactions can improve effective drive performance even further.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># Creating our ZFS mirrored storage pool
zpool create -o ashift=12 tank \
      mirror /dev/sda /dev/sdc1 \
      mirror /dev/sdb /dev/sdc2 \
      log   /dev/nvme0n1p4 \
      cache /dev/nvme0n1p5</code></pre>
</div>
</div>
<table class="tableblock frame-all grid-all spread">
<caption class="title">Table 1. Creating our ZFS Storage Pool</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><div><div class="ulist">
<ul>
<li>
<p>zpool create -o ashift=12 tank \</p>
</li>
<li>
<p>mirror /dev/sda /dev/sdc1 \</p>
</li>
<li>
<p>mirror /dev/sdb /dev/sdc2 \</p>
</li>
<li>
<p>log   /dev/nvme0n1p4 \</p>
</li>
<li>
<p>cache /dev/nvme0n1p5</p>
</li>
</ul>
</div></div></td>
<td class="tableblock halign-left valign-top"><div><div class="ulist">
<ul>
<li>
<p>pool called tank with 4k sectors</p>
</li>
<li>
<p>first vdev</p>
</li>
<li>
<p>second vdev</p>
</li>
<li>
<p>16GB log partition</p>
</li>
<li>
<p>64GB cache partition</p>
</li>
</ul>
</div></div></td>
</tr>
</tbody>
</table>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>zfs set compression=lz4 tank  # lz4 pool compression
zfs create tank/vm-disks      # ZFS layer to store VM images</code></pre>
</div>
</div>
<div class="paragraph">
<p>Once that&#8217;s done, we can run the following commands:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>zpool list          # verify that our pool has been created
zpool status tank   # check pool status and configuration
pvesm zfsscan       # list available ZFS file systems</code></pre>
</div>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_post_installation">Post-Installation</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The PVE open-source license allows for testing and non-production use. If we would like to use PVE for production or we want commercial support, we can purchase a subscription, enter our key through the web interface, and skip to the "Update PVE" section.</p>
</div>
<div class="sect2">
<h3 id="_adjusting_the_pve_repositories">Adjusting the PVE Repositories</h3>
<div class="paragraph">
<p>The <a href="https://pve.proxmox.com/wiki/Package_Repositories" target="_blank">PVE Package Repositories</a> can be configured depending on your usage goals. Let&#8217;s include the non-commercial list of repositories.</p>
</div>
<div class="paragraph">
<p>Run <code>nano /etc/apt/sources.list</code> and update as follows:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># main debian repo
deb http://ftp.us.debian.org/debian stretch main contrib

# security updates
deb http://security.debian.org stretch/updates main contrib</code></pre>
</div>
</div>
<div class="paragraph">
<p>Save and exit: press CTRL+X, Y for yes, and ENTER.</p>
</div>
<div class="paragraph">
<p>Comment-out the PVE commercial repository.</p>
</div>
<div class="paragraph">
<p>Run <code>nano /etc/apt/sources.list.d/pve-enterprise.list</code> and update as follows:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># non-subscription repo (manual update)
deb http://download.proxmox.com/debian/pve stretch pve-no-subscription
#deb https://enterprise.proxmox.com/debian/pve stretch pve-enterprise</code></pre>
</div>
</div>
<div class="paragraph">
<p>Save and exit: press CTRL+X, Y for yes, and ENTER.</p>
</div>
</div>
<div class="sect2">
<h3 id="_update_pve">Update PVE</h3>
<div class="paragraph">
<p>Edit our <em>resume</em> settings: run <code>nano /etc/initramfs-tools/conf.d/resume</code> and add:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>RESUME=none</code></pre>
</div>
</div>
<div class="paragraph">
<p>Save and exit: press CTRL+X, Y for yes, and ENTER.</p>
</div>
<div class="paragraph">
<p>Update the software packages, boot loader, and system image. From the PVE command line, type:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>apt-get update &amp;&amp; apt-get upgrade -y
update-grub
update-initramfs -u</code></pre>
</div>
</div>
</div>
<div class="sect2">
<h3 id="_update_the_pve_storage_system">Update the PVE Storage System:</h3>
<div class="paragraph">
<p>Once we create our ZFS storage, we can go to the PVE Web GUI and add it to our setup. Being sure to use <em>HTTPS</em>, open <a href="https://machine-ip-address:8006" class="bare">https://machine-ip-address:8006</a> in a web browser. When we get the <em>certificate warning</em> message, we should proceed anyway. This happens because the machine does not have a certificate signed by a third party. Our goal is to end up with four storage volumes:</p>
</div>
<table class="tableblock frame-all grid-all spread">
<caption class="title">Table 2. PVE storage volumes.</caption>
<colgroup>
<col style="width: 27.2727%;">
<col style="width: 72.7273%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><div><div class="olist arabic">
<ol class="arabic">
<li>
<p>vm-disks</p>
</li>
<li>
<p>zfs-backups</p>
</li>
<li>
<p>zfs-containers</p>
</li>
<li>
<p>zfs-templates</p>
</li>
</ol>
</div></div></td>
<td class="tableblock halign-left valign-top"><div><div class="ulist">
<ul>
<li>
<p>Stores RAW disk images more efficiently</p>
</li>
<li>
<p>Stores VZDump backups of virtual machines</p>
</li>
<li>
<p>Stores LXC container filesystems</p>
</li>
<li>
<p>Stores ISOs and container templates</p>
</li>
</ul>
</div></div></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>Once logged in, we go to Datacenter &gt; Storage, and:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>click <strong>Add</strong> &gt; <strong>ZFS</strong>, then enter "<strong><em>vm-disks</em></strong>" for ID, and select <em>tank/vm-disks</em> for pool, choose only <em>Disk Image</em> for content, and finally tick the <em>Thin Provision</em> checkbox and select <strong>Add</strong>.</p>
</li>
<li>
<p>click <strong>Add</strong> &gt; <strong>ZFS</strong>, then enter "<strong><em>zfs-containers</em></strong>" for ID, and select <em>tank</em> for pool, and <em>Container</em> for content, and select <strong>Add</strong>.</p>
</li>
<li>
<p>click <strong>Add</strong> &gt; <strong>Directory</strong>, then enter "<strong><em>zfs-backups</em></strong>" for ID, enter "<em>/tank</em>" (/our-zfs-pool) for directory, and choose only <em>VZDump backup files</em> for content, then select <strong>Add</strong>.</p>
</li>
<li>
<p>click <strong>Add</strong> &gt; <strong>Directory</strong>, then enter "<strong><em>zfs-templates</em></strong>" for ID, enter "<em>/tank</em>" (/our-zfs-pool) for directory, and choose both <em>container templates</em> and <em>ISO images</em> for content, then select <strong>Add</strong>.</p>
</li>
</ol>
</div>
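<div class="paragraph">
<p>For reference, those clicks end up as entries in <code>/etc/pve/storage.cfg</code>. Roughly as follows, assuming PVE 5.x syntax and the names used in the steps above (shown for orientation only; the GUI remains the safer way to make these changes):</p>
</div>

```
zfspool: vm-disks
        pool tank/vm-disks
        content images
        sparse

zfspool: zfs-containers
        pool tank
        content rootdir

dir: zfs-backups
        path /tank
        content backup

dir: zfs-templates
        path /tank
        content vztmpl,iso
```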
<div class="paragraph">
<p>After adding our new storage options, we can disable the local storage:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>select <strong>local-lvm</strong>, click <strong>Edit</strong>, untick the <em>Enable</em> checkbox, and click "OK".</p>
</li>
<li>
<p>select <strong>local</strong>, click <strong>Edit</strong>, untick the <em>Enable</em> checkbox, add "1" for <em>Max Backups</em>, and then click "OK".</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>Afterward, if we select the arrow next to pve in the <em>Server View</em>, we will see only four enabled storage options.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 10%;">
<col style="width: 80%;">
<col style="width: 10%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>PVE Storage Volume Setup
<span class="image"><img src="https://roobyz.github.io/images/Server-Virtualization-Management/pve-zfs-setup.gif" alt="pve-zfs-setup"></span></p>
</div></div></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>We made it! With only one storage volume for each type of content, there&#8217;s no way to accidentally misplace something. Creating containers and VMs should function as expected.</p>
</div>
<div class="paragraph">
<p>Our machine is ready to go; however, this is only part 2 of our multipart tutorial. Our next installment will cover some opportunities for <em>System Optimization</em>.</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="/2017/05/03/Server-Virtualization-Management-Part3.html">Part 3: System Optimization</a></p>
</li>
</ul>
</div>
</div>
</div>
</div>]]></description><link>https://roobyz.github.io/2017/04/25/Server-Virtualization-Management-Part2.html</link><guid isPermaLink="true">https://roobyz.github.io/2017/04/25/Server-Virtualization-Management-Part2.html</guid><category><![CDATA[Blog]]></category><category><![CDATA[Open_Source]]></category><category><![CDATA[Technology]]></category><category><![CDATA[ProxMox]]></category><dc:creator><![CDATA[Roberto Rivera]]></dc:creator><pubDate>Tue, 25 Apr 2017 00:00:00 GMT</pubDate></item><item><title><![CDATA[PVE: Virtualization for Work and Play (Part 1)]]></title><description><![CDATA[<div class="sect1">
<h2 id="_the_plan">The Plan&#8230;&#8203;</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Say we want a powerful "bang for the buck" home server for games <em>and</em> other system-intensive pursuits. We may want to run powerful analytics applications which would undoubtedly require Linux, but we may also want to run Windows applications. We may want near-native 2D and 3D graphics performance inside the guest operating system (OS) while making dual-booting obsolete. Finally, we may want to do all of that from the comfort of our couch using a Windows, Linux, or Mac laptop. Let&#8217;s do it!</p>
</div>
<div id="toc" class="toc">
<div id="toctitle" class="title">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#_the_plan">The Plan&#8230;&#8203;</a></li>
<li><a href="#_introduction">Introduction</a></li>
<li><a href="#_hardware_considerations">Hardware Considerations</a></li>
<li><a href="#_software_considerations">Software Considerations</a></li>
<li><a href="#_next_steps">Next Steps&#8230;&#8203;</a></li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_introduction">Introduction</h2>
<div class="sectionbody">
<div class="paragraph">
<p><a href="https://en.wikipedia.org/wiki/X86_virtualization" target="_blank">Hardware virtualization</a> allows multiple operating systems to simultaneously share processor resources. With the <a href="https://opensource.org/" target="_blank">open source</a> server management solution, <a href="https://www.proxmox.com/en/" target="_blank">Proxmox Virtual Environment (PVE)</a>, we can  leverage hardware virtualization to achieve our goals. PVE enables the creation of multiple virtual OS "servers" via a Web GUI; as many as our hardware setup will allow. This guide will document the setup of PVE on the following hardware:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>AMD Ryzen 7 1600 (8 cores, 16 threads @ 3.7GHz)</p>
</li>
<li>
<p>64GB 2400MHz DDR4</p>
</li>
<li>
<p>Boot Drive: 1x 512GB NVMe SSD</p>
</li>
<li>
<p>Storage (Striped-mirrored ZFS):</p>
<div class="ulist">
<ul>
<li>
<p>2x 1TB SATA SSD (striped)</p>
</li>
<li>
<p>1x 2TB 7200rpm mechanical drive (mirrored)</p>
</li>
</ul>
</div>
</li>
</ul>
</div>
<div class="paragraph">
<p>So what&#8217;s the difference between a VM and a container anyway, and how do we choose between them? A VM is computer software that emulates a particular computer hardware system and requires an OS to function. In other words, VMs "pretend" to be an actual computer of the type that <em>we</em> specify and need a Guest OS like Windows or Linux running. Containers instead emulate the host OS environment, enabling software to run predictably.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 10%;">
<col style="width: 80%;">
<col style="width: 10%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>Diagram 1: Comparison of a VM &amp; Container on One Machine
<span class="image"><img src="https://roobyz.github.io/images/Server-Virtualization-Management/vms-and-containers.png" alt="vms-cnt"></span></p>
</div></div></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>If we want to run multiple applications on one server, to have increased security, or to run an operating system that is different from our host system, then a VM is our choice. To run different versions of an application (e.g., RStudio) and validate reproducibility and reliability, we want to use containers. Compared to VMs, containers are quicker, "lighter weight", and more transient, so they can be readily packaged, shared, and moved to other hardware.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_hardware_considerations">Hardware Considerations</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Our CPU and motherboard must support "virtualization" (SVM) and IOMMU, which need to be enabled in firmware for resource sharing. Also, we should have 32GB of RAM or more, so that we can reserve at least 16GB for a single virtual machine (VM) and still have enough memory left over for PVE and potentially other VMs running simultaneously.</p>
</div>
<div class="paragraph">
<p>While most of our computer hardware can be shared between multiple VMs, the graphics card (GPU) may not readily be shared, so we&#8217;ll need at least two GPUs:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>One GPU for PVE (the host);</p>
</li>
<li>
<p>One powerful GPU for our VMs (the guests: Windows, Linux, etc.).</p>
</li>
</ol>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_software_considerations">Software Considerations</h2>
<div class="sectionbody">
<div class="paragraph">
<p><a href="https://jannikjung.me/proxmox-ve-5-0-beta1/" target="_blank">PVE 5.0</a> is based on <a href="https://wiki.debian.org/DebianStretch" target="_blank">Debian Linux (Stretch)</a>. Since our Ryzen hardware is rather new, our host system needs Linux kernel version 4.10 or later. Although in beta at the time of this writing, PVE 5.0 has better support for Ryzen than PVE 4.4.</p>
</div>
<div class="paragraph">
<p>PVE natively supports both <a href="https://www.linux-kvm.org/page/Main_Page" target="_blank">KVM</a> for hardware virtualization and <a href="https://linuxcontainers.org/lxc/introduction/" target="_blank">LXC containers</a> for Linux system virtualization. Since the guest systems can run under hardware virtualization, we get some added bonuses. For example, we can benefit from Ryzen hardware and still get <a href="http://www.pcworld.com/article/3189990/windows/microsoft-blocks-kaby-lake-and-ryzen-pcs-from-windows-7-81-updates.html" target="_blank">Windows 7 updates</a>. We would need to identify our Windows <a href="https://www.nextofwindows.com/the-best-way-to-uniquely-identify-a-windows-machine" target="_blank">Universally Unique Identifier (UUID)</a> so that we can set the same UUID on our VM. Otherwise, Microsoft may think that we have a new copy of Windows that needs to be registered.</p>
</div>
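<div class="paragraph">
<p>For reference, PVE exposes the VM&#8217;s SMBIOS UUID in the guest configuration, so once identified it can be pinned there. A sketch; both the VM ID (100) and the UUID value are placeholders:</p>
</div>

```
# /etc/pve/qemu-server/100.conf (excerpt; vmid and UUID are placeholders)
smbios1: uuid=12345678-1234-1234-1234-123456789abc
```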
<div class="paragraph">
<p>We will use <a href="https://github.com/zfsonlinux/zfs/wiki/faq" target="_blank">ZFS</a>, a storage platform that encompasses the functionality of traditional filesystems, volume managers, and more, with consistent reliability and performance. Our ZFS installation will be compressed and striped: our two SSD drives will run in parallel and require less storage space, which improves read/write performance. In addition, our ZFS will be mirrored: our SSD drives will be cloned so that we have a backup in case of drive failure.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: KVM supports multiple disk formats: raw images, the native QEMU format (qcow2), the VMware format, and many more. When working with ZFS on PVE, we need to use raw images. It may not seem obvious at first, but we can easily convert an existing KVM file from one format to a raw image. Near the end of this guide, we&#8217;ll cover the process to convert a qcow2 format to the required PVE raw image.</p>
</div></div></td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="sect1">
<h2 id="_next_steps">Next Steps&#8230;&#8203;</h2>
<div class="sectionbody">
<div class="paragraph">
<p>This is Part 1 of a multipart tutorial. The next two parts will cover the installation of PVE and server tweaks we can make to improve the performance of our VMs and containers:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="/2017/04/25/Server-Virtualization-Management-Part2.html">Part 2: Getting Started</a></p>
</li>
<li>
<p><a href="/2017/05/03/Server-Virtualization-Management-Part3.html">Part 3: System Optimization</a></p>
</li>
</ul>
</div>
</div>
</div>]]></description><link>https://roobyz.github.io/2017/04/23/Server-Virtualization-Management.html</link><guid isPermaLink="true">https://roobyz.github.io/2017/04/23/Server-Virtualization-Management.html</guid><category><![CDATA[Blog]]></category><category><![CDATA[Open_Source]]></category><category><![CDATA[Technology]]></category><category><![CDATA[ProxMox]]></category><dc:creator><![CDATA[Roberto Rivera]]></dc:creator><pubDate>Sun, 23 Apr 2017 00:00:00 GMT</pubDate></item><item><title><![CDATA[Titanic: Learning Data Science with RStudio]]></title><description><![CDATA[<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>So we are aspiring data scientists and want to dip our toes into <a href="http://rmarkdown.rstudio.com/" target="_blank">RStudio</a>. How do we get started? We dive into the waters of the <a href="https://www.kaggle.com/c/titanic" target="_blank">Kaggle Titanic "Competition"</a>, of course!</p>
</div>
<div id="toc" class="toc">
<div id="toctitle" class="title">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#_our_objective">Our Objective</a></li>
<li><a href="#_kaggle_basics">Kaggle Basics</a></li>
<li><a href="#_titanic_history_lesson">Titanic History Lesson</a></li>
<li><a href="#_next_steps">Next Steps&#8230;&#8203;</a></li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_our_objective">Our Objective</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Use this Kaggle exercise to:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>learn to reason and problem solve like a data scientist</p>
</li>
<li>
<p>get somewhat comfortable with RStudio</p>
</li>
<li>
<p>predict whether a passenger would survive the sinking of the <a href="https://en.wikipedia.org/wiki/RMS_Titanic" target="_blank">Titanic</a></p>
</li>
<li>
<p>enter a Kaggle submission file for evaluation</p>
</li>
<li>
<p>have fun!</p>
</li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_kaggle_basics">Kaggle Basics</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Kaggle is a community of data scientists and a platform for facilitating data science journeys. One way to participate is by entering data science competitions. As in its other competitions, Kaggle provides two Titanic datasets containing passenger attributes:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>a <em>training set</em>, complete with the outcome (target) variable for training our predictive model(s)</p>
</li>
<li>
<p>a <em>test set</em>, for predicting the unknown outcome variable based on the passenger attributes provided in both datasets.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>After training and validating our predictive model(s), we can then enter the submission file to Kaggle for evaluation. As we iterate, we can submit more files and assess our progress on the leaderboard. Subtle model improvements can lead to significant leaps on the leaderboard.</p>
</div>
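<div class="paragraph">
<p>Although this tutorial itself will use RStudio, the train/predict/submit loop is language-agnostic. Below is a minimal sketch in Python; the tiny inline CSVs and the majority-per-gender baseline are illustrative assumptions, not Kaggle&#8217;s actual data or a recommended model:</p>
</div>

```python
import csv
import io
from collections import Counter

# Toy stand-ins for Kaggle's Titanic files: the training set includes the
# Survived target; the test set does not. (The real files have many more columns.)
train_csv = "PassengerId,Survived,Sex\n1,0,male\n2,1,female\n3,1,female\n4,0,male\n"
test_csv = "PassengerId,Sex\n5,male\n6,female\n"

train = list(csv.DictReader(io.StringIO(train_csv)))
test = list(csv.DictReader(io.StringIO(test_csv)))

# Deliberately naive baseline "model": predict the majority outcome
# per gender observed in the training set.
outcomes_by_sex = {}
for row in train:
    outcomes_by_sex.setdefault(row["Sex"], Counter())[row["Survived"]] += 1
model = {sex: c.most_common(1)[0][0] for sex, c in outcomes_by_sex.items()}

# Apply the model to the test set to build Kaggle-style submission rows
# (PassengerId plus the predicted Survived value).
submission = [(row["PassengerId"], model[row["Sex"]]) for row in test]
print(submission)  # [('5', '0'), ('6', '1')]
```

Each refinement of the model changes only the middle step; the surrounding load-and-submit scaffolding stays the same from iteration to iteration.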
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: Predictive models are trained using attributes (variables), right? How does that work?</p>
</div>
<div class="paragraph">
<p>Some attributes are correlated: as one varies, others may vary with it to some degree. Machine learning leverages that interdependence to model the predicted outcomes. For accurate model performance, we need to:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>maximize the explanatory power of our variables: favor those that are correlated with the outcome variable, and</p>
</li>
<li>
<p>compensate for the correlation of explanatory variables to each other (<a href="https://en.wikipedia.org/wiki/Multicollinearity" target="_blank">multicollinearity</a>).</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>In other words, we need to find the smallest set of variables that explains almost everything going on with the outcome we want to predict.</p>
</div></div></td>
</tr>
</tbody>
</table>
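<div class="paragraph">
<p>To make the correlation idea concrete, here is a small Python sketch with made-up numbers: one attribute correlated with the outcome, and a second attribute that is nearly a duplicate of the first, which is the multicollinearity case described in the tip above:</p>
</div>

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Made-up attributes: a fare-like variable, a near-duplicate of it,
# and a binary outcome.
fare = [7.0, 8.0, 26.0, 71.0, 8.0, 53.0]
cabin_cost = [7.5, 8.2, 25.0, 70.0, 9.0, 52.0]  # carries almost the same information
survived = [0, 0, 1, 1, 0, 1]

print(pearson(fare, survived))    # substantial: useful explanatory variable
print(pearson(fare, cabin_cost))  # near 1.0: a redundant (collinear) pair
```

Keeping both <code>fare</code> and <code>cabin_cost</code> would add almost no new information to a model, which is why we prefer the smallest set of mutually independent explanatory variables.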
</div>
</div>
<div class="sect1">
<h2 id="_titanic_history_lesson">Titanic History Lesson</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The Titanic was a British passenger liner that sank after colliding with an iceberg in the Atlantic on its maiden voyage en route to New York City. It was the largest ship of its time with 10 decks, 8 of which were for passengers.</p>
</div>
<div class="paragraph">
<p>There were 2,224 passengers and crew aboard. Of the 1,317 passengers, there were: 324 in First Class (including some of the wealthiest people of the time), 284 in Second Class, and 709 in Third Class. Of these, 869 (66%) were male and 447 (34%) female. There were 107 children aboard, the largest number of which were in Third Class.</p>
</div>
<div class="paragraph">
<p>The ship had enough lifeboats for about 1,100 people, and more than 1,500 died. Due to the "women and children first" protocol, men were disproportionately left aboard. Also, not all lifeboats were completely filled during the evacuation. The 705 surviving passengers were rescued by the RMS Carpathia around 2 hours after the catastrophe.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: So the different variables like gender and class could influence whether someone survived, correct?</p>
</div>
<div class="paragraph">
<p>Yes. For example, someone may not have been able to reach the lifeboats because they were on a lower deck, which is correlated with Third Class. Also, children were disproportionately in Third Class, but they were also favored by the "children first" protocol.</p>
</div></div></td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="sect1">
<h2 id="_next_steps">Next Steps&#8230;&#8203;</h2>
<div class="sectionbody">
<div class="paragraph">
<p>We&#8217;ll approach this in multiple parts. This is still a work in progress, but roughly speaking it should look like:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Part 1: Basic Setup</p>
</li>
<li>
<p>Part 2: Feature Engineering</p>
</li>
<li>
<p>Part 3: Prediction</p>
</li>
<li>
<p>Part 4: Conclusion</p>
</li>
</ul>
</div>
</div>
</div>]]></description><link>https://roobyz.github.io/2017/04/16/Predict-Survival-Propensity-of-Titanic-Passengers.html</link><guid isPermaLink="true">https://roobyz.github.io/2017/04/16/Predict-Survival-Propensity-of-Titanic-Passengers.html</guid><category><![CDATA[Blog]]></category><category><![CDATA[Open_Source]]></category><category><![CDATA[Machine_Learning]]></category><category><![CDATA[Analytics]]></category><category><![CDATA[Data_Science]]></category><dc:creator><![CDATA[Roberto Rivera]]></dc:creator><pubDate>Sun, 16 Apr 2017 00:00:00 GMT</pubDate></item><item><title><![CDATA[Qualitative Prediction of Weight-Lifting Exercises]]></title><description><![CDATA[<div class="paragraph">
<p>One day your company tells your team that they should switch from a proprietary analytics platform like <a href="https://www.sas.com/" target="_blank">SAS</a> to something <a href="https://opensource.org/" target="_blank">open source</a> like <a href="http://rmarkdown.rstudio.com/" target="_blank">RStudio</a>. Undoubtedly, some analysts may become fascinated, while others grow anxious. How would you get them excited about the switch?</p>
</div>
<div class="paragraph">
<p>My approach was to demonstrate the potential of the new open source platform. How easily could RStudio generate reproducible research and facilitate storytelling with data? How could we weave together narrative text and code to seamlessly produce and deliver elegantly formatted analyses to multiple audiences?</p>
</div>
<div class="paragraph">
<p>Leveraging Human Activity Recognition (HAR) data provided from a <a href="http://groupware.les.inf.puc-rio.br/har#ixzz3de67BWZU" target="_blank">Groupware@LES</a> study, a machine learning use-case was born. HAR data has become ubiquitous with the advent of devices like the Fitbit, Nike FuelBand, and even smartphones. Although users of these devices tend to quantify how much they participate in an activity, they rarely consider how <em>well</em> they perform the activity.</p>
</div>
<div class="paragraph">
<p>The provided multiclass variable was generated by participants wearing HAR devices, and is relatively balanced (equally distributed). This simplifies the analysis somewhat since we don&#8217;t need to consider tactics to combat <a href="http://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/" target="_blank">imbalanced classes</a>. In addition, this lets us focus primarily on the other major analytics steps required in most machine learning projects.</p>
</div>
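<div class="paragraph">
<p>A quick sanity check before modeling is to tabulate the class frequencies of the target. Here is a minimal Python sketch, using a made-up label vector standing in for the study&#8217;s exercise-quality classes (the values and their proportions are illustrative, not the actual HAR data):</p>
</div>

```python
from collections import Counter

# Hypothetical stand-in for a multiclass target: one "performed correctly"
# class plus several common-mistake classes.
labels = ["A", "B", "A", "C", "E", "D", "B", "A", "C", "D", "E", "B"]

counts = Counter(labels)
total = len(labels)
for cls in sorted(counts):
    print(f"{cls}: {counts[cls]:2d} ({counts[cls] / total:.0%})")
```

When the resulting proportions are roughly equal, as here, we can skip resampling or class-weighting tactics and move straight to feature selection and model training.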
<div class="paragraph">
<p>Machine learning project goals:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>use the multiclass variable to build a predictor that distinguishes participants who correctly completed fitness exercises from those who did not, and identifies what their mistakes may have been.</p>
</li>
<li>
<p>demonstrate the feasibility of using RStudio for delivering reproducible research via a webpage.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The <a href="https://cdn.rawgit.com/roobyz/PredictiveML/c0297e0d771e39633436b3cff87707f0c5f4b851/ml_activity_success.html" target="_blank">project results</a>, and <a href="https://raw.githubusercontent.com/roobyz/PredictiveML/master/ml_activity_success.Rmd" target="_blank">code</a> are available on my <a href="https://github.com/roobyz/PredictiveML" target="_blank">GitHub Repository</a>.</p>
</div>]]></description><link>https://roobyz.github.io/2017/04/15/Identifying-the-Successful-Completion-of-Weight-Lifting-Exercises.html</link><guid isPermaLink="true">https://roobyz.github.io/2017/04/15/Identifying-the-Successful-Completion-of-Weight-Lifting-Exercises.html</guid><category><![CDATA[Blog]]></category><category><![CDATA[Open_Source]]></category><category><![CDATA[Machine_Learning]]></category><category><![CDATA[Analytics]]></category><category><![CDATA[Data_Science]]></category><dc:creator><![CDATA[Roberto Rivera]]></dc:creator><pubDate>Sat, 15 Apr 2017 00:00:00 GMT</pubDate></item></channel></rss>