<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Roobyz Ramblings]]></title><description><![CDATA[A responsive site for sharing thoughts and lessons learned on data science, programming, and technology.]]></description><link>https://roobyz.github.io</link><image><url>/images/background.jpg</url><title>Roobyz Ramblings</title><link>https://roobyz.github.io</link></image><generator>RSS for Node</generator><lastBuildDate>Tue, 30 May 2017 03:35:13 GMT</lastBuildDate><atom:link href="https://roobyz.github.io/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[PVE: Virtual Windows]]></title><description><![CDATA[<div class="sect1">
<h2 id="_create_a_windows_image">Create a Windows Image&#8230;&#8203;</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Imagine running Windows 7 in a VM, with near-native Windows performance and the ability to switch easily between Windows and Linux. Sit back and relax, because in this post we will tackle the first part of that wish!</p>
</div>
<div id="toc" class="toc">
<div id="toctitle" class="title">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#_create_a_windows_image">Create a Windows Image&#8230;&#8203;</a></li>
<li><a href="#_install_windows_7">Install Windows 7</a>
<ul class="sectlevel2">
<li><a href="#_step_1_create_the_vm">Step 1: Create the VM</a>
<ul class="sectlevel3">
<li><a href="#_initial_setup">Initial Setup</a></li>
<li><a href="#_setting_tweaks">Setting Tweaks</a></li>
</ul>
</li>
<li><a href="#_step_2_begin_installation">Step 2: Begin Installation</a></li>
<li><a href="#_step_3_post_installation">Step 3: Post Installation</a>
<ul class="sectlevel3">
<li><a href="#_update_windows">Update Windows!</a></li>
<li><a href="#_install_additional_drivers">Install Additional Drivers</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#_checkpoint">Checkpoint</a>
<ul class="sectlevel2">
<li><a href="#_backup">Backup</a></li>
<li><a href="#_cloning">Cloning</a></li>
</ul>
</li>
<li><a href="#_next_steps">Next Steps</a></li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_install_windows_7">Install Windows 7</h2>
<div class="sectionbody">
<div class="paragraph">
<p>We will complete this process in three steps. First, we create the VM. Second, we install Windows into the VM. Third, we tweak the VM and Windows environment for improved performance.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/important.png" alt="important.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>Note</strong>: This post took a bit longer than planned, a result of the beta status of the PVE version we are using combined with the newness of the Ryzen platform, which means that not all of the kinks have been sorted out yet. As a result, there were some interesting issues with UEFI, ZFS, and the backup/restore process that needed to be resolved. That said, we have some workarounds and can continue with our journey.</p>
</div></div></td>
</tr>
</tbody>
</table>
<div class="sect2">
<h3 id="_step_1_create_the_vm">Step 1: Create the VM</h3>
<div class="sect3">
<h4 id="_initial_setup">Initial Setup</h4>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>Place the ISO images into the templates folder.</p>
</li>
<li>
<p>Set up our VM using OVMF (UEFI BIOS).</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>At the end of this process, we will have a PVE configuration file (<a href="https://pve.proxmox.com/pve-docs/chapter-qm.html" target="_blank">/etc/pve/qemu-server/&lt;VMID&gt;.conf</a>) with our Windows VM settings. The VMID for this example is 100, so the file would be called <code>/etc/pve/qemu-server/100.conf</code> and will contain something like the following:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>bios: ovmf
bootdisk: virtio0
cores: 4
cpu: host
hotplug: disk,usb
ide0: zfs-templates:iso/virtio-win-0.1.126.iso,media=cdrom,size=152204K
ide2: zfs-templates:iso/Win7_HomePrem_SP1_English_x64.iso,media=cdrom
memory: 16384
name: Windoze
numa: 0
ostype: win7
scsihw: virtio-scsi-pci
sockets: 1
vga: qxl
virtio0: vm-disks:vm-100-disk-1,cache=writeback,size=128G</code></pre>
</div>
</div>
</div>
<div class="sect3">
<h4 id="_setting_tweaks">Setting Tweaks</h4>
<div class="paragraph">
<p>We need to make some manual updates to this configuration.</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>Add the <code>args</code> line to enable sound and make our Ryzen processor <em>appear</em> as a Haswell processor in Windows. This should allow for continued Windows updates with balanced compatibility and performance.</p>
</li>
<li>
<p>Update the <code>cpu</code> line to include a <code>hidden</code> parameter that makes Windows appear <em>as if</em> it isn&#8217;t running inside a virtual machine. This should help avoid issues with GPU passthrough on Nvidia cards.</p>
</li>
<li>
<p>Add the <code>hostpci</code> lines to include the GPU IDs identified in our prior post: <a href="/2017/05/03/Server-Virtualization-Management-Part3.html#_configure_the_vfio_modules" target="_blank">Configure the VFIO Modules</a>. This will allow Windows to recognize the GPU for driver setup. We will update this again later.</p>
</li>
<li>
<p>Update the <code>ide</code> lines to <code>sata</code> lines.</p>
</li>
<li>
<p>Ensure the <code>machine</code> line is set to <code>q35</code>.</p>
</li>
</ol>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>args: -cpu Haswell-noTSX,hv_vendor_id=TowerVM,kvm=off -device intel-hda,id=sound5,bus=pcie.0,addr=0x18 -device hda-micro,id=sound5-codec0,bus=sound5.0,cad=0 -device hda-duplex,id=sound5-codec1,bus=sound5.0,cad=1
cpu: host,hidden=1
hostpci0: host=28:00.0,pcie=1
hostpci1: host=28:00.1,pcie=1
sata0: zfs-templates:iso/virtio-win-0.1.126.iso,media=cdrom,size=152204K
sata2: zfs-templates:iso/Win7_HomePrem_SP1_English_x64.iso,media=cdrom
machine: q35</code></pre>
</div>
</div>
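<div class="paragraph">
<p>The ide-to-sata rename above is easy to get wrong by hand, so it can be scripted. A minimal sed sketch, using a local <code>100.conf</code> stand-in for the real <code>/etc/pve/qemu-server/100.conf</code>:</p>
</div>

```shell
# Stand-in for /etc/pve/qemu-server/100.conf; point conf at the real file on the PVE host.
conf=100.conf
cat > "$conf" <<'EOF'
ide0: zfs-templates:iso/virtio-win-0.1.126.iso,media=cdrom,size=152204K
ide2: zfs-templates:iso/Win7_HomePrem_SP1_English_x64.iso,media=cdrom
EOF
# Rewrite only lines that begin with "ide<digit>:" so no other settings are touched.
sed -i 's/^ide\([0-9]\):/sata\1:/' "$conf"
cat "$conf"   # the two lines now read sata0: ... and sata2: ...
```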
<div class="paragraph">
<p>We should now be ready to install Windows.</p>
</div>
</div>
</div>
<div class="sect2">
<h3 id="_step_2_begin_installation">Step 2: Begin Installation</h3>
<div class="paragraph">
<p>Before watching this great <a href="https://www.youtube.com/watch?v=thVmhIw4-jU" target="_blank">PVE Installation Video</a> from ProxMox, make note of the following key points:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>An important first step: quickly open the Spice console and hit the spacebar when prompted after the UEFI boot screen, to launch the Windows installation process. Then initiate the Windows Setup. Windows will complain that "no drives were found".</p>
</li>
<li>
<p>Select "Custom (advanced)" and then:</p>
<div class="olist loweralpha">
<ol class="loweralpha" type="a">
<li>
<p>Install NetKVM drivers:</p>
<div class="olist lowerroman">
<ol class="lowerroman" type="i">
<li>
<p>click <mark>Load Driver</mark></p>
</li>
<li>
<p>click <mark>Browse</mark> and then select the drive with the "virtio-win" drivers.</p>
</li>
<li>
<p>select the folder: NetKVM &#8594; w7 &#8594; amd64</p>
</li>
<li>
<p>click <mark>OK</mark></p>
</li>
<li>
<p>click <mark>Next</mark></p>
</li>
</ol>
</div>
</li>
<li>
<p>Install viostor drivers:</p>
<div class="olist lowerroman">
<ol class="lowerroman" type="i">
<li>
<p>click <mark>Load Driver</mark></p>
</li>
<li>
<p>click <mark>Browse</mark> and then select the drive with the "virtio-win" drivers.</p>
</li>
<li>
<p>select the folder: viostor &#8594; w7 &#8594; amd64</p>
</li>
<li>
<p>click <mark>OK</mark></p>
</li>
<li>
<p>click <mark>Next</mark></p>
</li>
</ol>
</div>
</li>
</ol>
</div>
</li>
</ol>
</div>
<div class="paragraph">
<p>At this point, the installer will recognize the virtual disk and can continue with the installation. In addition, Windows should automatically recognize and configure the network card. We can skip the registration process until later. Select our "Home" network.</p>
</div>
</div>
<div class="sect2">
<h3 id="_step_3_post_installation">Step 3: Post Installation</h3>
<div class="sect3">
<h4 id="_update_windows">Update Windows!</h4>
<div class="paragraph">
<p>After installing Windows we should immediately check for Windows updates. This will likely take the most time of this process and require many reboots. Keep checking for updates until Windows says there are no more.</p>
</div>
<div class="paragraph">
<p>Afterward, we should go back to our PVE Web GUI, select our Windows VM, select <code>Hardware</code>, and then for each <code>CD/DVD Drive</code> click <code>Edit</code>. In the edit window, we should select "Do not use any media" and then click <code>OK</code>.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/important.png" alt="important.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>Note</strong>: Although Windows 7 supports UEFI boot, Microsoft designed it under the assumption that it is running alone on real hardware. It is likely that when you try to run our Windows VM it will bail out to the <em>UEFI Interactive Shell</em>. If you get stuck there, on the PVE Web GUI click <code>Shutdown &#8594; PowerOff</code> for the Windows VM. We can download <a href="http://www.supergrubdisk.org/" target="_blank">Super Grub Disk</a> and then upload the iso to our zfs-templates folder. Once we have assigned it to our first "CD/DVD", we can start our VM, which will boot into the Grub bootloader. Press &lt;enter&gt; on <code>Detect and show boot methods</code> to quickly identify the boot partitions, select the first entry <code>(hd0,gpt1)/efi/Boot/bootx64.efi</code>, and press &lt;enter&gt; again to boot into Windows.</p>
</div></div></td>
</tr>
</tbody>
</table>
</div>
<div class="sect3">
<h4 id="_install_additional_drivers">Install Additional Drivers</h4>
<div class="paragraph">
<p>After updating, we should install the <a href="https://www.spice-space.org/download.html" target="_blank">Spice Guest Tools</a> for Windows. This will include drivers that will speed up the VM and make it nicer to use with Virt-Viewer. Also, we should install the <a href="http://www.geforce.com/drivers" target="_blank">Nvidia Geforce graphics drivers</a>.</p>
</div>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_checkpoint">Checkpoint</h2>
<div class="sectionbody">
<div class="sect2">
<h3 id="_backup">Backup</h3>
<div class="paragraph">
<p>This is a great time to save our work. In case something goes haywire, we should have a backup so that we don&#8217;t go through that whole process again, right? In addition, wouldn&#8217;t it be nice to have a clean slate in case of a virus or worm? We are talking about Windows after all.</p>
</div>
<div class="paragraph">
<p>Backing up our VM is easy to do with PVE. First, we shut down Windows. Next, on our PVE Web GUI, we select the "Summary" tab of our Windows VM and confirm the Status is "Stopped". Finally, we select the "Backup" tab, click "Backup Now", and then click "Backup". Then we need to be patient until it completes.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>INFO: transferred 137438 MB in 515 seconds (266 MB/s)
INFO: stopping kvm after backup task
INFO: archive file size: 13.89GB
INFO: Finished Backup of VM 100 (00:08:43)</code></pre>
</div>
</div>
<div class="paragraph">
<p>After backing up our VM, we may want to verify the integrity of our backup. PVE uses a new backup format called <a href="https://pve.proxmox.com/wiki/VMA" target="_blank">VMA</a>.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/important.png" alt="important.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>Note</strong>: While trying to restore, the PVE process generated an error and failed. Unfortunately, this process removes the original VM disk image before trying to restore the backup, which meant that I had to repeat the installation process multiple times. However, this should be resolved when the production version of PVE is released. In the meantime, we can use the following manual process to back up and restore.</p>
</div></div></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>We can also back up the drive image directly from the command line. There are multiple ways to do this. Using the Unix command <code><a href="https://en.wikipedia.org/wiki/Dd_(Unix)" target="_blank">dd</a></code>, we will duplicate the emulated zvol block device as a raw file in our zfs pool.</p>
</div>
<div class="listingblock">
<div class="title">Backing up our VM disk image</div>
<div class="content">
<pre class="highlight"><code># Install "pixz" for parallel compression of our disk image
apt-get install pixz

# Backup ("dd") and compress ("pixz") our VM disk image
# Note: it is a good idea to add a date to the filename in case
#       you want to have multiple backup versions.
dd if=/dev/zvol/tank/vm-disks/vm-100-disk-1 | pixz -2 &gt; /tank/100-windows.raw.xz

# 137438953472 bytes (137 GB, 128 GiB) copied, 603.995 s, 228 MB/s
# 4.4G 100-windows.raw.xz

# Note: we could back up our image without compression and let zfs
# handle the compression by default (lz4). Although lz4 is
# faster for interactive I/O, it isn't as space efficient as xz. In
# addition, since pixz can run in parallel, it is more than twice as
# fast as the default compression. Compare this performance to above.
# dd if=/dev/zvol/tank/vm-disks/vm-100-disk-1 of=/tank/100-windows.raw
# 137438953472 bytes (137 GB, 128 GiB) copied, 1480.54 s, 92.8 MB/s
# 128G 100-windows.raw</code></pre>
</div>
</div>
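<div class="paragraph">
<p>Since pixz writes standard xz-compatible archives, plain <code>xz -t</code> can check the backup&#8217;s integrity without decompressing it to disk. A sketch on a throwaway stand-in file; substitute the real backup path:</p>
</div>

```shell
# Create a tiny stand-in "image" so the commands run anywhere.
dd if=/dev/zero of=demo.raw bs=1M count=2 status=none
xz -2 -k -f demo.raw                     # stand-in for the pixz-compressed backup
xz -t demo.raw.xz && echo "archive OK"   # -t tests integrity, writes nothing
```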
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: A zvol is an emulated <a href="https://en.wikipedia.org/wiki/Device_file#Block_devices" target="_blank">block device</a> provided by ZFS. By default, PVE creates one zvol for each VM-disk. Once we create our VM we can see our disks by running: <code>zfs list -t all -r tank</code>. The disks would be hidden in the tank folder, however, they would be mapped as block devices which we can see if we run: <code>ll /sys/block/zd*</code>. It is important to note that PVE saves the disk size in our configuration file, however, it also runs a command similar to <code>zfs create -V 128G tank/vm-disks/vm-100-disk-1</code> which also sets the zvol block device size in our zpool. Ensure you update your drive size in the PVE Web GUI so that they match up.</p>
</div></div></td>
</tr>
</tbody>
</table>
<div class="listingblock">
<div class="title">Restoring our backup:</div>
<div class="content">
<pre class="highlight"><code># Check to ensure our VM disk is still enabled:
zfs list -t all -r tank

# We should see something like:
# tank/vm-disks/vm-100-disk-1   132G  1.72T  21.7G  -
#
# if something bad happened and we cannot see the zvol, we can
# manually recreate it with the following command:
zfs create -V 128G tank/vm-disks/vm-100-disk-1

# Restore our compressed VM disk image.
pixz -d -i /tank/100-windows.raw.xz | dd of=/dev/zvol/tank/vm-disks/vm-100-disk-1

# 137438953472 bytes (137 GB, 128 GiB) copied, 799.285 s, 172 MB/s</code></pre>
</div>
</div>
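<div class="paragraph">
<p>To be confident the manual backup/restore is lossless, we can checksum the image before and after a round trip. A sketch on a throwaway stand-in image; substitute your zvol path and backup file for the real thing:</p>
</div>

```shell
# Throwaway "disk image" standing in for /dev/zvol/tank/vm-disks/vm-100-disk-1
dd if=/dev/urandom of=image.raw bs=1M count=2 status=none
before=$(sha256sum image.raw | cut -d' ' -f1)
xz -2 -k -f image.raw                 # stand-in for: dd ... | pixz -2 > backup.xz
xz -dc image.raw.xz > restored.raw    # stand-in for: pixz -d ... | dd of=/dev/zvol/...
after=$(sha256sum restored.raw | cut -d' ' -f1)
[ "$before" = "$after" ] && echo "restore verified"
```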
<div class="paragraph">
<p>Congratulations! We can now back up and restore our disk images. In case of emergency, we can recreate our image to a known good state. In addition, we have the opportunity to move our disk image to another computer.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: If we already have a VM disk image that we want to restore to PVE, we can use a similar process to our backup and restore process. The key point to consider: whatever format the disk image is in (qcow2 or otherwise), we first need to convert it to "raw" format. For example, we could run something like: <code>qemu-img convert -O raw windoze.qcow2 windoze.raw</code>. Once converted to raw format we might want to compress it like: <code>pixz -2 windoze.raw windoze.raw.xz</code>. Finally, we need to create a VM with a disk size equal to our original image, and then we can restore similar to above.</p>
</div></div></td>
</tr>
</tbody>
</table>
</div>
<div class="sect2">
<h3 id="_cloning">Cloning</h3>
<div class="paragraph">
<p>Another cool feature of PVE is cloning. We can clone our existing Windows VM to get an alternate installation to experiment with. For example, maybe we want to install some experimental software or make hardware changes without risking our standard VM.</p>
</div>
<div class="paragraph">
<p>Depending on how large the VM disk image is, the cloning process may take a while, so be patient until it completes. To check progress, we can run from the command line: <code>zfs list -t all -r tank</code>. In this example, we can see that <strong>vm-101-disk-1</strong> REFER size is only 20.1G compared to 22.1G for <strong>vm-100-disk-1</strong>, which we are cloning:</p>
</div>
<div class="listingblock">
<div class="title">Example</div>
<div class="content">
<pre class="highlight"><code>NAME                          USED  AVAIL  REFER  MOUNTPOINT
tank                          169G  1.59T  16.7G  /tank
tank/vm-disks                 152G  1.59T    96K  /tank/vm-disks
tank/vm-disks/vm-100-disk-1   132G  1.70T  22.1G  -
tank/vm-disks/vm-101-disk-1  20.1G  1.59T  20.1G  -</code></pre>
</div>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_next_steps">Next Steps</h2>
<div class="sectionbody">
<div class="paragraph">
<p>At this point, we should have our Windows installation running on our Ryzen server as if it were running on a Haswell machine, at near-native performance. Our Nvidia drivers should be installed; however, the "Display Adapter" status will likely show as: "This device cannot find enough free resources that it can use. (Code 12)".</p>
</div>
<div class="paragraph">
<p>Considering that Ryzen is a new platform, the system bios and Linux Kernel support are not quite at 100% yet. AMD recently released an <a href="https://community.amd.com/community/gaming/blog/2017/05/25/community-update-4-lets-talk-dram" target="_blank">AGESA Update</a> that should make it into a bios update over the next few weeks. This should improve virtualization support for PCI Express Access Control Services (ACS). The ACS support also needs to make it into the Linux kernel. After these updates are ready, we should be able to finish enabling full GPU Passthrough to our VM.</p>
</div>
</div>
</div>]]></description><link>https://roobyz.github.io/2017/05/29/Windows-on-Prox-Mox-PVE.html</link><guid isPermaLink="true">https://roobyz.github.io/2017/05/29/Windows-on-Prox-Mox-PVE.html</guid><category><![CDATA[Blog]]></category><category><![CDATA[Open_Source]]></category><category><![CDATA[Technology]]></category><category><![CDATA[ProxMox]]></category><dc:creator><![CDATA[Roberto Rivera]]></dc:creator><pubDate>Mon, 29 May 2017 00:00:00 GMT</pubDate></item><item><title><![CDATA[PVE: Virtualization for Work and Play (Part 3)]]></title><description><![CDATA[<div class="sect1">
<h2 id="_system_optimization">System Optimization&#8230;&#8203;</h2>
<div class="sectionbody">
<div class="paragraph">
<p>In the <a href="/2017/04/25/Server-Virtualization-Management-Part2.html">previous post</a> we installed ProxMox Virtual Environment (PVE) and configured our ZFS ZPool storage system. Let&#8217;s tweak our system to improve performance.</p>
</div>
<div id="toc" class="toc">
<div id="toctitle" class="title">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#_system_optimization">System Optimization&#8230;&#8203;</a></li>
<li><a href="#_zfs_tune_up">ZFS Tune-up</a></li>
<li><a href="#_graphics_processing_unit_gpu_passthrough">Graphics Processing Unit (GPU) Passthrough</a>
<ul class="sectlevel2">
<li><a href="#_enable_the_vfio_modules">Enable the VFIO Modules:</a></li>
<li><a href="#_configure_the_vfio_modules">Configure the VFIO Modules</a>
<ul class="sectlevel3">
<li><a href="#_identify_passthrough_device">Identify Passthrough Device</a></li>
<li><a href="#_enable_passthrough_device">Enable Passthrough Device</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#_update_boot_settings">Update Boot Settings</a></li>
<li><a href="#_final_thoughts">Final Thoughts</a></li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_zfs_tune_up">ZFS Tune-up</h2>
<div class="sectionbody">
<div class="paragraph">
<p><a href="https://www.openindiana.org/documentation/faq/#what-is-openindiana" target="_blank">Solaris-based UNIX</a> and Linux treat <a href="https://en.wikipedia.org/wiki/Extended_file_attributes" target="_blank">extended attributes</a> differently, which has some performance implications. By default, ZFS on Linux is set to <code>xattr=on</code>, which causes the extended attributes to be stored in separate hidden directories. Changing this property to "<em>system attributes</em>" improves performance significantly under Linux as a result of extended attributes being stored more efficiently on disk.</p>
</div>
<div class="paragraph">
<p>To check the attribute run <code>zfs get -r xattr <mark>tank</mark></code> (update tank with your zpool name) and we&#8217;ll see something like the following:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>NAME                         PROPERTY  VALUE  SOURCE
tank                         xattr     on     default
tank/vm-disks                xattr     on     default</code></pre>
</div>
</div>
<div class="paragraph">
<p>To update the property run <code>zfs set xattr=sa <mark>tank</mark></code></p>
</div>
<div class="paragraph">
<p>Linux has extended attributes called <a href="https://access.redhat.com/documentation/en-US/Red_Hat_Storage/2.0/html/Administration_Guide/ch09s05.html" target="_blank">POSIX ACLs</a> that are not functional on other platforms. To check the attribute, run <code>zfs get -r acltype <mark>tank</mark></code> and we should get something like:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>NAME                         PROPERTY  VALUE     SOURCE
tank                         acltype   off       local
tank/vm-disks                acltype   off       inherited from tank</code></pre>
</div>
</div>
<div class="paragraph">
<p>To update the property run <code>zfs set acltype=posixacl <mark>tank</mark></code>. Also, so that ACLs get passed to files created within a directory, we need to run <code>zfs set aclinherit=passthrough <mark>tank</mark></code> as well.</p>
</div>
<div class="paragraph">
<p>Linux has two parameters related to the times when files are accessed. The first is <code>atime</code>, which tracks the “last” access time. This creates a lot of overhead, because every read of a file triggers a write to disk to record the access time. The second is <code>relatime</code>, which updates the access time only when it is older than the modify time (or at most once a day), so it writes far less often. We can check both of them with <code>zfs get -r atime tank &amp;&amp; zfs get -r relatime tank</code> and we should see something like:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>NAME                         PROPERTY  VALUE  SOURCE
tank                         atime     on     default
tank/vm-disks                atime     on     default

NAME                         PROPERTY  VALUE     SOURCE
tank                         relatime  off       default
tank/vm-disks                relatime  off       default</code></pre>
</div>
</div>
<div class="paragraph">
<p>If we needed to track access times, "relatime" would be preferable; however, since "relatime" is already disabled, we can disable both simply by running <code>zfs set atime=off <mark>tank</mark></code>.</p>
</div>
<div class="paragraph">
<p>We can set additional options for reliability. Run <code>zpool set autoreplace=on <mark>tank</mark></code> so that ZFS can automatically switch to an available hot spare if hardware errors are detected on online disks. Run <code>zpool set autoexpand=on <mark>tank</mark></code> to allow the pool to grow when all VDEVs have been replaced with larger ones. This must be set before any drives are replaced, so we may as well set it now.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: Many ZFS properties are not retroactive. To apply to existing files, we would need to replace the files. In other words, if you already have files or data stored on your ZFS pool, you would need to move them somewhere else (i.e. backup) and then move them back so that the changes in properties are applied correctly.</p>
</div></div></td>
</tr>
</tbody>
</table>
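<div class="paragraph">
<p>The property changes above can be collected into a small script for review before anything is applied. A sketch that writes the sequence to a hypothetical <code>tuneup.sh</code>; the pool name <code>tank</code> is an assumption, so substitute your own:</p>
</div>

```shell
# Generate the tune-up sequence for review; apply it later with: sh tuneup.sh
POOL=tank   # assumption -- substitute your zpool name
cat > tuneup.sh <<EOF
zfs set xattr=sa $POOL
zfs set acltype=posixacl $POOL
zfs set aclinherit=passthrough $POOL
zfs set atime=off $POOL
zpool set autoreplace=on $POOL
zpool set autoexpand=on $POOL
EOF
cat tuneup.sh
```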
</div>
</div>
<div class="sect1">
<h2 id="_graphics_processing_unit_gpu_passthrough">Graphics Processing Unit (GPU) Passthrough</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Passthrough allows our Virtual Machine (VM) to access GPU hardware for games, graphics, and heavy computation (i.e. deep learning). We must enable <a href="https://en.wikipedia.org/wiki/Input%E2%80%93output_memory_management_unit" target="_blank">IOMMU ("Input–output memory management unit")</a> drivers, which allocate device-visible <em>virtual</em> addresses to the actual physical addresses. IOMMU enables our VM to communicate with our GPU using the virtual addresses <em>as if</em> it were directly communicating to the GPU.</p>
</div>
<div class="paragraph">
<p><a href="https://www.kernel.org/doc/Documentation/vfio.txt" target="_blank">VFIO ("Virtual Function I/O")</a> modules are part of an IOMMU device-agnostic framework for exposing direct device access to userspace, in a <em>secure</em> IOMMU protected environment.  In other words, they provide access to non-privileged, low-overhead userspace drivers.</p>
</div>
<div class="sect2">
<h3 id="_enable_the_vfio_modules">Enable the VFIO Modules:</h3>
<div class="paragraph">
<p>Run <code>nano /etc/modules</code> and add the following:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd</code></pre>
</div>
</div>
<div class="paragraph">
<p>Save and exit: press CTRL+X, Y for yes, and ENTER.</p>
</div>
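<div class="paragraph">
<p>If this setup may be re-run (for example from a provisioning script), the lines can be appended idempotently so <code>/etc/modules</code> never accumulates duplicates. A sketch using a local <code>./modules</code> file as a stand-in:</p>
</div>

```shell
modfile=./modules   # stand-in for /etc/modules
touch "$modfile"
for m in vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do
  # -x matches whole lines, -F matches literally; append only when missing
  grep -qxF "$m" "$modfile" || echo "$m" >> "$modfile"
done
wc -l < "$modfile"   # stays at 4 no matter how often the loop runs
```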
</div>
<div class="sect2">
<h3 id="_configure_the_vfio_modules">Configure the VFIO Modules</h3>
<div class="sect3">
<h4 id="_identify_passthrough_device">Identify Passthrough Device</h4>
<div class="paragraph">
<p>To identify the GPU to passthrough run <code>lspci -nn | grep VGA</code>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>21:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107 [GeForce GTX 745] [10de:1382] (rev a2)
28:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)</code></pre>
</div>
</div>
<div class="paragraph">
<p>Identify the GPU slot IDs (first pair of numbers separated by a colon):</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>My GPU Slot ID for passthrough is: <code><mark>28:00</mark></code></p>
</li>
<li>
<p>My GPU Slot ID for the host is: <code>21:00</code></p>
</li>
</ol>
</div>
<div class="paragraph">
<p>Identify the vendor ID for passthrough: <code>lspci -nns <mark>28:00</mark> | cut -d "(" -f 1 | cut -d ":" -f 3,4</code></p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code> NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80]
 NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0]</code></pre>
</div>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>Vendor ID for GPU VGA device: <code>10de:1b80</code></p>
</li>
<li>
<p>Vendor ID for GPU Audio device: <code>10de:10f0</code></p>
</li>
</ol>
</div>
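<div class="paragraph">
<p>The vendor:device pair can also be pulled out with a single grep over the lspci output. A sketch on a captured sample line (the text stands in for live <code>lspci -nns</code> output):</p>
</div>

```shell
line='28:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)'
# Match the [xxxx:xxxx] vendor:device bracket ([0300] has no colon, so it is skipped)
printf '%s\n' "$line" | grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]'
# -> 10de:1b80
```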
</div>
<div class="sect3">
<h4 id="_enable_passthrough_device">Enable Passthrough Device</h4>
<div class="paragraph">
<p>To enable <a href="https://pve.proxmox.com/wiki/Pci_passthrough" target="_blank">passthrough</a>, add the following module options (including the comma-separated vendor IDs identified in the prior step). This loads options for the vfio-pci kernel module, which maps memory regions from the PCI bus to the VM, and activates support for IOMMU groups.</p>
</div>
<div class="paragraph">
<p>Run <code>nano /etc/modprobe.d/kvm.conf</code> and add <mark><em>some</em></mark> of the following options (see Table 1 for details):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># uncomment the first option if required for your system.
#options vfio_iommu_type1 allow_unsafe_interrupts=1
options vfio-pci         ids=10de:1b80,10de:10f0
options vfio-pci         disable_vga=1
options kvm-amd          npt=0
options kvm              ignore_msrs=1</code></pre>
</div>
</div>
<div class="paragraph">
<p>Save and exit: press CTRL+X, Y for yes, and ENTER.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<caption class="title">Table 1. Module option details</caption>
<colgroup>
<col style="width: 30.7692%;">
<col style="width: 69.2308%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top">Option</th>
<th class="tableblock halign-left valign-top">Details</th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">allow_unsafe_interrupts=1</p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>This workaround is for platforms without interrupt remapping support, which provides device isolation. It removes protection against <a href="http://invisiblethingslab.com/resources/2011/Software%20Attacks%20on%20Intel%20VT-d.pdf" target="_blank">MSI-based interrupt injection attacks</a> by guests.  Only trusted guests and drivers should be run with this configuration.</p>
</div></div></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">ids=<mark>10de:1b80,10de:10f0</mark></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>Assign desired GPU to the virtual pci for use in our VM.</p>
</div></div></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">disable_vga=1</p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>Opt the devices out of VGA arbitration where possible.</p>
</div></div></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">npt=0</p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>Disable Nested Page Tables (NPT) if VM performance is very slow. Linux guests with Q35 and OVMF may work with NPT on or off; however, a Linux guest with i440fx only works with NPT disabled.</p>
</div></div></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock">ignore_msrs=1</p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>Prevent some Nvidia applications from crashing the VM.</p>
</div></div></td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_update_boot_settings">Update Boot Settings</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Configure IOMMU and VFIO to load first so that framebuffer drivers don’t grab the GPU while booting. Then commit the changes to GRUB and generate a new boot image.</p>
</div>
<div class="paragraph">
<p>Run <code>nano /etc/default/grub</code> and change <code>GRUB_CMDLINE_LINUX_DEFAULT="quiet"</code> as follows:</p>
</div>
<div class="paragraph">
<p><code>GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on kvm_amd.avic=1 rd.driver.pre=vfio-pci video=efifb:off"</code></p>
</div>
<div class="paragraph">
<p>Save and exit: press CTRL+X, Y for yes, and ENTER.</p>
</div>
<div class="paragraph">
<p>Afterward, run:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>update-grub          # update boot loader
update-initramfs -u  # update boot image
reboot               # reboot machine</code></pre>
</div>
</div>
<div class="paragraph">
<p>After our computer reboots, run <code>lspci -nnks <mark>28:00</mark></code> to check that the driver loaded correctly. If everything went well, for each device we should see <code>vfio-pci</code> for our "Kernel driver in use".</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>28:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)
	Subsystem: ZOTAC International (MCO) Ltd. GP104 [GeForce GTX 1080] [19da:1451]
	Kernel driver in use: vfio-pci
	Kernel modules: nvidiafb, nouveau
28:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
	Subsystem: ZOTAC International (MCO) Ltd. GP104 High Definition Audio Controller [19da:1451]
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel</code></pre>
</div>
</div>
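<div class="paragraph">
<p>The same check can be scripted. A small sketch, shown here against a canned copy of the output above; on the live system you would pipe <code>lspci -nnks 28:00</code> into it instead:</p>
</div>

```shell
# Report whether every "Kernel driver in use" line shows vfio-pci.
check_vfio() {
    awk '/Kernel driver in use:/ && $NF != "vfio-pci" { bad=1 }
         END { print (bad ? "NOT all on vfio-pci" : "all on vfio-pci") }'
}

sample='28:00.0 VGA compatible controller
	Kernel driver in use: vfio-pci
28:00.1 Audio device
	Kernel driver in use: vfio-pci'

printf '%s\n' "$sample" | check_vfio   # all on vfio-pci
```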
<div class="paragraph">
<p>Also, run <code>dmesg | grep -e AMD-Vi -e vAPIC</code> to check our IOMMU settings.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>[    0.893699] AMD-Vi: IOMMU performance counters supported
[    0.895145] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[    0.895146] AMD-Vi: Extended features (0xf77ef22294ada):
[    0.895146]  PPR NX GT IA GA PC GA_vAPIC
[    0.895148] AMD-Vi: Interrupt remapping enabled
[    0.895149] AMD-Vi: virtual APIC enabled
[    0.895257] AMD-Vi: Lazy IO/TLB flushing enabled</code></pre>
</div>
</div>
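<div class="paragraph">
<p>It can also be useful to see how the devices were grouped, since devices in the same IOMMU group must be passed through together. A sketch that walks the sysfs tree; the function takes an optional base directory (defaulting to the live <code>/sys/kernel/iommu_groups</code>) so it can be tried against any path:</p>
</div>

```shell
# List every PCI device per IOMMU group.
list_iommu_groups() {
    base="${1:-/sys/kernel/iommu_groups}"
    for link in "$base"/*/devices/*; do
        [ -e "$link" ] || continue   # skip when the glob matched nothing
        group=$(basename "$(dirname "$(dirname "$link")")")
        printf 'group %s: %s\n' "$group" "$(basename "$link")"
    done
}

list_iommu_groups
```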
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: AMD Virtual Interrupt Controller (AVIC) virtualizes local <a href="https://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller" target="_blank">APIC</a> registers of each vCPU via the virtual APIC (vAPIC) backing page. This allows guest access to certain APIC registers without needing to emulate the hardware behavior, and should speed up workloads that generate large amount of interrupts.</p>
</div></div></td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="sect1">
<h2 id="_final_thoughts">Final Thoughts</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Congratulations! We have our PVE server configured and ready to use. We can now begin <a href="https://pve.proxmox.com/wiki/VM_Templates_and_Clones" target="_blank">creating Virtual Machines (VMs)</a> or <a href="https://pve.proxmox.com/wiki/Linux_Container" target="_blank">Containers</a>. In future posts, we&#8217;ll consider additional opportunities for enhancing performance and security for our server, VMs, and Containers.</p>
</div>
<div class="paragraph">
<p>Although we have configured passthrough on the server, our VMs still need updates to leverage that feature. Because Nvidia reserves passthrough <em>support</em> for its commercial GPU line (Quadro), it actively tries to inhibit passthrough on its consumer line (GeForce). We will have to consider potential workarounds to enable that functionality, which may involve future tweaks to our server settings.</p>
</div>
</div>
</div>]]></description><link>https://roobyz.github.io/2017/05/03/Server-Virtualization-Management-Part3.html</link><guid isPermaLink="true">https://roobyz.github.io/2017/05/03/Server-Virtualization-Management-Part3.html</guid><category><![CDATA[Blog]]></category><category><![CDATA[Open_Source]]></category><category><![CDATA[Technology]]></category><category><![CDATA[ProxMox]]></category><dc:creator><![CDATA[Roberto Rivera]]></dc:creator><pubDate>Wed, 03 May 2017 00:00:00 GMT</pubDate></item><item><title><![CDATA[PVE: Virtualization for Work and Play (Part 2)]]></title><description><![CDATA[<div class="sect1">
<h2 id="_getting_started">Getting Started&#8230;&#8203;</h2>
<div class="sectionbody">
<div class="paragraph">
<p>In the <a href="/2017/04/23/Server-Virtualization-Management.html">previous post</a>, we learned about ProxMox Virtual Environment (PVE) and outlined the plan to build a powerful "bang for the buck" home server for games <em>and</em> other system-intensive pursuits. Before proceeding, you should feel a little comfortable with the <a href="http://linuxcommand.org/lc3_learning_the_shell.php" target="_blank">Linux CLI (command line interface)</a>. Now let&#8217;s begin.</p>
</div>
<div id="toc" class="toc">
<div id="toctitle" class="title">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#_getting_started">Getting Started&#8230;&#8203;</a></li>
<li><a href="#_installation">Installation</a>
<ul class="sectlevel2">
<li><a href="#_pve_drive_options">PVE Drive Options</a></li>
<li><a href="#_zfs_partitions">ZFS Partitions</a></li>
<li><a href="#_zfs_setup">ZFS Setup</a></li>
</ul>
</li>
<li><a href="#_post_installation">Post-Installation</a>
<ul class="sectlevel2">
<li><a href="#_adjusting_the_pve_repositories">Adjusting the PVE Repositories</a></li>
<li><a href="#_update_pve">Update PVE</a></li>
<li><a href="#_update_the_pve_storage_system">Update the PVE Storage System:</a></li>
</ul>
</li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_installation">Installation</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The <a href="https://pve.proxmox.com/wiki/Quick_installation" target="_blank">PVE Quick Installation</a> guide does a wonderful job of highlighting the key installation points and showing how simple the process really is. If you follow the default settings, PVE will install to <em>local disks</em> and should take about ten minutes or less. If you have a large single drive that you want to use, then you can skip to the post-installation section below, since that guide is all that you would need.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 10%;">
<col style="width: 80%;">
<col style="width: 10%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>PVE Installation Process
<span class="image"><img src="https://roobyz.github.io/images/Server-Virtualization-Management/pve-installation.gif" alt="pve-install"></span></p>
</div></div></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
</tbody>
</table>
<div class="sect2">
<h3 id="_pve_drive_options">PVE Drive Options</h3>
<div class="paragraph">
<p>For our setup, we will disconnect all the drives except for our boot drive, and follow most of the default installation options with one exception. Have a peek at the <a href="https://pve.proxmox.com/wiki/Installation" target="_blank">ProxMox PVE installation</a> guide for background on these options. Since we only want to use half of the boot drive (512GB NVMe), our <em>hard disk options</em> are as follows:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><strong><code>ext4</code></strong> <em>filesystem</em>: Standard Linux filesystem is the safe bet.</p>
</li>
<li>
<p><strong><code>176.0</code></strong> <em>hdsize</em>: Shooting for half of our 512GB drive: the values below should add up to 256GB, and 256GB minus the 80GB <em>minfree</em> leaves 176GB.</p>
</li>
<li>
<p><strong><code>64.0</code></strong> <em>swapsize</em>: Linux swap file size (equal to our RAM size). Be sure to set vm.swappiness to a low value if you have your swap file on an SSD! It&#8217;ll increase RAM usage a bit, but will be easier on our SSD.</p>
</li>
<li>
<p><strong><code>96.0</code></strong> <em>maxroot</em>: / root file partition</p>
</li>
<li>
<p><strong><code>80.0</code></strong> <em>minfree</em>: This should equal our ZFS log (16GB) plus our ZFS cache (64GB).</p>
</li>
<li>
<p><strong><code>16.0</code></strong> <em>maxvz</em>: This is the pve-data partition.</p>
</li>
</ul>
</div>
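<div class="paragraph">
<p>The <code>vm.swappiness</code> tweak mentioned above can be made persistent with a sysctl drop-in. A sketch; the filename is arbitrary, and 10 is simply a commonly used low value for SSD-backed swap:</p>
</div>

```
# /etc/sysctl.d/99-swappiness.conf (hypothetical filename)
# Prefer keeping pages in RAM over swapping to the SSD.
vm.swappiness = 10
```

<div class="paragraph">
<p>Apply it without rebooting via <code>sysctl --system</code>.</p>
</div>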
<div class="paragraph">
<p>When you get to the <em>Installation Successful</em> step of the PVE install, click the "Reboot" button.</p>
</div>
</div>
<div class="sect2">
<h3 id="_zfs_partitions">ZFS Partitions</h3>
<div class="paragraph">
<p>After rebooting you can log in via the PVE Web GUI or through the command line using SSH. Logging in via ssh:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>ssh root@10.10.1.10

#The authenticity of host '10.10.1.10 (10.10.1.10)' can't be established.
#ECDSA key fingerprint is SHA256:2ExP+SHaCo+9ZOt+sk90DPLAafdHFJTHPyeU1qtFXIg.
#Are you sure you want to continue connecting (yes/no)?</code></pre>
</div>
</div>
<div class="paragraph">
<p>Type "yes" and then enter the password set during installation.</p>
</div>
<div class="paragraph">
<p>After logging into our new PVE installation, we want to add two more partitions: a ZFS log (16GB) and a ZFS cache (64GB). The combined allocation will then be 256GB, leaving us with half of our NVMe for other options like dual-boot, additional storage, etc.</p>
</div>
<div class="paragraph">
<p>After logging in, run (update for your drive): <code>cfdisk /dev/sda</code></p>
</div>
<div class="paragraph">
<p>Go down to the "free space" line in green and add a 16GB partition. Move down to the next "free space" line in green and add a 64GB partition. Then select <code>write</code>, confirm, select <code>quit</code>, and we are done.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 10%;">
<col style="width: 80%;">
<col style="width: 10%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>Creating ZFS Log and Cache Partitions
<span class="image"><img src="https://roobyz.github.io/images/Server-Virtualization-Management/pve-cfdisk-process.gif" alt="pve-cfdisk"></span></p>
</div></div></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>Once these two partitions are added, we can shut down PVE from the command line:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>shutdown -h now</code></pre>
</div>
</div>
</div>
<div class="sect2">
<h3 id="_zfs_setup">ZFS Setup</h3>
<div class="paragraph">
<p>Once PVE has shut down, we can reconnect the remaining drives and restart our system. Before setting up our ZFS storage, we must back up any data that we want to keep.</p>
</div>
<div class="paragraph">
<p>Let&#8217;s start our <a href="http://open-zfs.org/wiki/Performance_tuning" target="_blank">ZFS configuration</a>. As mentioned in our previous post, we are configuring ZFS as striped-mirrored storage. Since we have a 2TB spinning disk that we want to use for backup, we will mirror it as an <em>automatic</em> backup.</p>
</div>
<div class="paragraph">
<p>Our drives should all be the same size; otherwise, we will lose storage capacity. Since our SSD drives are 1TB each, we need to partition our 2TB spinning disk into two 1TB partitions. Before partitioning, identify the correct drives; run <code>lsblk</code> to get the list of block devices:</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 10%;">
<col style="width: 80%;">
<col style="width: 10%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>List of block devices
<span class="image"><img src="https://roobyz.github.io/images/Server-Virtualization-Management/pve-lsblk.png" alt="pve-lsblk"></span></p>
</div></div></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>In my example, the 2TB (1.8T) drive is <code>/dev/sdc</code>. The following commands will give the drive a new GPT partition table and create the two partitions:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># Install parted
apt-get install parted

# remove everything on the /dev/sdc drive and
# replace with two empty equal-sized partitions
parted /dev/sdc --script mklabel gpt \
       mkpart primary 0% 50% \
       mkpart primary 50% 100% \
       print</code></pre>
</div>
</div>
<div class="paragraph">
<p>After partitioning, we can mirror and stripe the drives. When we create the drive mirrors, ZFS creates virtual devices (vdevs). We can then connect the vdevs together into zpools. For example, we can <em>mirror</em> two 1TB drives and we end up with a 1TB vdev that will automatically replicate our data across both drives. Then we combine the two 1TB mirrored vdevs and end up with 2TB of storage.</p>
</div>
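<div class="paragraph">
<p>The capacity arithmetic is worth making explicit. A small sketch for the striped-mirror layout described above:</p>
</div>

```shell
# Each mirror halves its raw capacity; striping sums the vdevs.
drive_tb=1                         # each mirror member is 1TB
vdevs=2                            # two mirrored vdevs, striped together
raw_tb=$((drive_tb * 2 * vdevs))   # four 1TB members in total
usable_tb=$((drive_tb * vdevs))    # 1TB per mirror, summed across vdevs
echo "raw: ${raw_tb}TB, usable: ${usable_tb}TB"   # raw: 4TB, usable: 2TB
```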
<div class="paragraph">
<p>Since the zpool read/write transactions are balanced across the two vdevs, we can actually get an increase in drive performance, with transactions happening in parallel across two physical drives. We can also compress the data on the zpool: because our CPU can compress and decompress data much faster than the drives can read and write it, the smaller transactions can improve effective drive performance even further.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># Creating our ZFS mirrored storage pool
zpool create -o ashift=12 tank \
      mirror /dev/sda /dev/sdc1 \
      mirror /dev/sdb /dev/sdc2 \
      log   /dev/nvme0n1p4 \
      cache /dev/nvme0n1p5</code></pre>
</div>
</div>
<table class="tableblock frame-all grid-all spread">
<caption class="title">Table 1. Creating our ZFS Storage Pool</caption>
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><div><div class="ulist">
<ul>
<li>
<p>zpool create -o ashift=12 tank \</p>
</li>
<li>
<p>mirror /dev/sda /dev/sdc1 \</p>
</li>
<li>
<p>mirror /dev/sdb /dev/sdc2 \</p>
</li>
<li>
<p>log   /dev/nvme0n1p4 \</p>
</li>
<li>
<p>cache /dev/nvme0n1p5</p>
</li>
</ul>
</div></div></td>
<td class="tableblock halign-left valign-top"><div><div class="ulist">
<ul>
<li>
<p>pool called tank with 4k sectors</p>
</li>
<li>
<p>first vdev</p>
</li>
<li>
<p>second vdev</p>
</li>
<li>
<p>16GB log partition</p>
</li>
<li>
<p>64GB cache partition</p>
</li>
</ul>
</div></div></td>
</tr>
</tbody>
</table>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>zfs set compression=lz4 tank  # lz4 pool compression
zfs create tank/vm-disks      # ZFS layer to store VM images</code></pre>
</div>
</div>
<div class="paragraph">
<p>Once that&#8217;s done, we can run the following commands:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>zpool list          # verify that our pool has been created
zpool status tank   # check pool status and configuration
pvesm zfsscan       # list available ZFS file systems</code></pre>
</div>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_post_installation">Post-Installation</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The PVE open-source license allows for testing and non-production use. If we would like to use PVE for production or we want commercial support, we can purchase a subscription, enter our key through the web interface, and skip to the "Update PVE" section.</p>
</div>
<div class="sect2">
<h3 id="_adjusting_the_pve_repositories">Adjusting the PVE Repositories</h3>
<div class="paragraph">
<p>The <a href="https://pve.proxmox.com/wiki/Package_Repositories" target="_blank">PVE Package Repositories</a> can be configured depending on your usage goals. Let&#8217;s include the non-commercial list of repositories.</p>
</div>
<div class="paragraph">
<p>Run <code>nano /etc/apt/sources.list</code> and update as follows:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># main debian repo
deb http://ftp.us.debian.org/debian stretch main contrib

# security updates
deb http://security.debian.org stretch/updates main contrib</code></pre>
</div>
</div>
<div class="paragraph">
<p>Save and exit: press CTRL+X, Y for yes, and ENTER.</p>
</div>
<div class="paragraph">
<p>Comment-out the PVE commercial repository.</p>
</div>
<div class="paragraph">
<p>Run <code>nano /etc/apt/sources.list.d/pve-enterprise.list</code> and update as follows:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code># non-subscription repo (manual update)
deb http://download.proxmox.com/debian/pve stretch pve-no-subscription
#deb https://enterprise.proxmox.com/debian/pve stretch pve-enterprise</code></pre>
</div>
</div>
<div class="paragraph">
<p>Save and exit: press CTRL+X, Y for yes, and ENTER.</p>
</div>
</div>
<div class="sect2">
<h3 id="_update_pve">Update PVE</h3>
<div class="paragraph">
<p>Edit our <em>resume</em> settings: run <code>nano /etc/initramfs-tools/conf.d/resume</code> and add:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>RESUME=none</code></pre>
</div>
</div>
<div class="paragraph">
<p>Save and exit: press CTRL+X, Y for yes, and ENTER.</p>
</div>
<div class="paragraph">
<p>Update the software packages, boot loader, and system image. From the PVE command line, type:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="highlight"><code>apt-get update &amp;&amp; apt-get upgrade -y
update-grub
update-initramfs -u</code></pre>
</div>
</div>
</div>
<div class="sect2">
<h3 id="_update_the_pve_storage_system">Update the PVE Storage System:</h3>
<div class="paragraph">
<p>Once we create our ZFS storage, we can go to the PVE Web GUI and add it to our setup. Being sure to use <em>HTTPS</em>, open <a href="https://machine-ip-address:8006" class="bare">https://machine-ip-address:8006</a> in a web browser. When we get the <em>certificate warning</em> message, we should proceed anyway. This happens because the machine does not have a certificate signed by a third party. Our goal is to end up with four storage volumes:</p>
</div>
<table class="tableblock frame-all grid-all spread">
<caption class="title">Table 2. PVE storage volumes.</caption>
<colgroup>
<col style="width: 27.2727%;">
<col style="width: 72.7273%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><div><div class="olist arabic">
<ol class="arabic">
<li>
<p>vm-disks</p>
</li>
<li>
<p>zfs-backups</p>
</li>
<li>
<p>zfs-containers</p>
</li>
<li>
<p>zfs-templates</p>
</li>
</ol>
</div></div></td>
<td class="tableblock halign-left valign-top"><div><div class="ulist">
<ul>
<li>
<p>Stores RAW disk images more efficiently</p>
</li>
<li>
<p>Stores VZDump backups of virtual machines</p>
</li>
<li>
<p>Stores LXC container filesystems</p>
</li>
<li>
<p>Stores ISOs and container templates</p>
</li>
</ul>
</div></div></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>Once logged in, we go to Datacenter &gt; Storage, and:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>click <strong>Add</strong> &gt; <strong>ZFS</strong>, then enter "<strong><em>vm-disks</em></strong>" for ID, and select <em>tank/vm-disks</em> for pool, choose only <em>Disk Image</em> for content, and finally tick the <em>Thin Provision</em> checkbox and select <strong>Add</strong>.</p>
</li>
<li>
<p>click <strong>Add</strong> &gt; <strong>ZFS</strong>, then enter "<strong><em>zfs-containers</em></strong>" for ID, and select <em>tank</em> for pool, and <em>Container</em> for content, and select <strong>Add</strong>.</p>
</li>
<li>
<p>click <strong>Add</strong> &gt; <strong>Directory</strong>, then enter "<strong><em>zfs-backups</em></strong>" for ID, enter "<em>/tank</em>" (/our-zfs-pool) for directory, and choose only <em>VZDump backup files</em> for content, then select <strong>Add</strong>.</p>
</li>
<li>
<p>click <strong>Add</strong> &gt; <strong>Directory</strong>, then enter "<strong><em>zfs-templates</em></strong>" for ID, enter "<em>/tank</em>" (/our-zfs-pool) for directory, and choose both <em>container templates</em> and <em>ISO images</em> for content, then select <strong>Add</strong>.</p>
</li>
</ol>
</div>
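<div class="paragraph">
<p>For reference, those clicks end up as entries in <code>/etc/pve/storage.cfg</code>. Roughly as follows, assuming PVE 5.x syntax and the names used in the steps above (shown for orientation only; the GUI remains the safer way to make these changes):</p>
</div>

```
zfspool: vm-disks
        pool tank/vm-disks
        content images
        sparse

zfspool: zfs-containers
        pool tank
        content rootdir

dir: zfs-backups
        path /tank
        content backup

dir: zfs-templates
        path /tank
        content vztmpl,iso
```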
<div class="paragraph">
<p>After adding our new storage options, we can disable the local storage:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>select <strong>local-lvm</strong>, click <strong>Edit</strong>, untick the <em>Enable</em> checkbox, and click "OK".</p>
</li>
<li>
<p>select <strong>local</strong>, click <strong>Edit</strong>, untick the <em>Enable</em> checkbox, add "1" for <em>Max Backups</em>, and then click "OK".</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>Afterward, if we select the arrow next to pve in the <em>Server View</em>, we will see only four enabled storage options.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 10%;">
<col style="width: 80%;">
<col style="width: 10%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>PVE Storage Volume Setup
<span class="image"><img src="https://roobyz.github.io/images/Server-Virtualization-Management/pve-zfs-setup.gif" alt="pve-zfs-setup"></span></p>
</div></div></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>We made it! With only one storage volume for each type of content, there&#8217;s no way to accidentally misplace something. Creating containers and VMs should function as expected.</p>
</div>
<div class="paragraph">
<p>Our machine is ready to go; however, this is only part 2 of our multipart tutorial. Our next installment will cover some opportunities for <em>System Optimization</em>.</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="/2017/05/03/Server-Virtualization-Management-Part3.html">Part 3: System Optimization</a></p>
</li>
</ul>
</div>
</div>
</div>
</div>]]></description><link>https://roobyz.github.io/2017/04/25/Server-Virtualization-Management-Part2.html</link><guid isPermaLink="true">https://roobyz.github.io/2017/04/25/Server-Virtualization-Management-Part2.html</guid><category><![CDATA[Blog]]></category><category><![CDATA[Open_Source]]></category><category><![CDATA[Technology]]></category><category><![CDATA[ProxMox]]></category><dc:creator><![CDATA[Roberto Rivera]]></dc:creator><pubDate>Tue, 25 Apr 2017 00:00:00 GMT</pubDate></item><item><title><![CDATA[PVE: Virtualization for Work and Play (Part 1)]]></title><description><![CDATA[<div class="sect1">
<h2 id="_the_plan">The Plan&#8230;&#8203;</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Say we want a powerful "bang for the buck" home server for games <em>and</em> other system-intensive pursuits. We may want to run powerful analytics applications which would undoubtedly require Linux, but we may also want to run Windows applications. We may want near-native 2D and 3D graphics performance inside the guest operating system (OS) while making dual-booting obsolete. Finally, we may want to do all of that from the comfort of our couch using a Windows, Linux, or Mac laptop. Let&#8217;s do it!</p>
</div>
<div id="toc" class="toc">
<div id="toctitle" class="title">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#_the_plan">The Plan&#8230;&#8203;</a></li>
<li><a href="#_introduction">Introduction</a></li>
<li><a href="#_hardware_considerations">Hardware Considerations</a></li>
<li><a href="#_software_considerations">Software Considerations</a></li>
<li><a href="#_next_steps">Next Steps&#8230;&#8203;</a></li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_introduction">Introduction</h2>
<div class="sectionbody">
<div class="paragraph">
<p><a href="https://en.wikipedia.org/wiki/X86_virtualization" target="_blank">Hardware virtualization</a> allows multiple operating systems to simultaneously share processor resources. With the <a href="https://opensource.org/" target="_blank">open source</a> server management solution, <a href="https://www.proxmox.com/en/" target="_blank">Proxmox Virtual Environment (PVE)</a>, we can  leverage hardware virtualization to achieve our goals. PVE enables the creation of multiple virtual OS "servers" via a Web GUI; as many as our hardware setup will allow. This guide will document the setup of PVE on the following hardware:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>AMD Ryzen 7 1600 (8 cores, 16 threads @ 3.7GHz)</p>
</li>
<li>
<p>64GB 2400MHz DDR4</p>
</li>
<li>
<p>Boot Drive: 1x 512GB NVMe SSD</p>
</li>
<li>
<p>Storage (Striped-mirrored ZFS):</p>
<div class="ulist">
<ul>
<li>
<p>2x 1TB SATA SSD (striped)</p>
</li>
<li>
<p>1x 2TB 7200rpm mechanical drive (mirrored)</p>
</li>
</ul>
</div>
</li>
</ul>
</div>
<div class="paragraph">
<p>So what&#8217;s the difference between a VM and a container anyway, and how do we choose between them? A VM is computer software that emulates a particular computer hardware system and requires an OS to function. In other words, VMs "pretend" to be an actual computer of the type that <em>we</em> specify and need a Guest OS like Windows or Linux running. Containers instead emulate the host OS environment, enabling software to run predictably.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 10%;">
<col style="width: 80%;">
<col style="width: 10%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p>Diagram 1: Comparison of a VM &amp; Container on One Machine
<span class="image"><img src="https://roobyz.github.io/images/Server-Virtualization-Management/vms-and-containers.png" alt="vms-cnt"></span></p>
</div></div></td>
<td class="tableblock halign-left valign-top"></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>If we want to run multiple applications on one server, to have increased security, or to run an operating system that is different from our host system, then a VM is our choice. To run different versions of an application (e.g., RStudio) and validate reproducibility and reliability, we want to use containers. Compared to VMs, containers are quicker, "lighter weight", and more transient, so they can be readily packaged, shared, and moved to other hardware.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_hardware_considerations">Hardware Considerations</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Our CPU and motherboard must support "virtualization" (SVM) and IOMMU, which need to be enabled in firmware for resource sharing. Also, we should have 32GB of RAM or more, so that we can reserve at least 16GB for a single virtual machine (VM) and still have enough memory left over for PVE and potentially other VMs running simultaneously.</p>
</div>
<div class="paragraph">
<p>While most of our computer hardware can be shared between multiple VMs, the graphics card (GPU) may not readily be shared, so we&#8217;ll need at least two GPUs:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>One GPU for PVE (the host);</p>
</li>
<li>
<p>One powerful GPU for our VMs (the guests: Windows, Linux, etc.).</p>
</li>
</ol>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_software_considerations">Software Considerations</h2>
<div class="sectionbody">
<div class="paragraph">
<p><a href="https://jannikjung.me/proxmox-ve-5-0-beta1/" target="_blank">PVE 5.0</a> is based on <a href="https://wiki.debian.org/DebianStretch" target="_blank">Debian Linux (Stretch)</a>. Since our Ryzen hardware is rather new, our host system needs Linux kernel version 4.10 or later. Although in beta at the time of this writing, PVE 5.0 has better support for Ryzen than PVE 4.4.</p>
</div>
<div class="paragraph">
<p>PVE natively supports both <a href="https://www.linux-kvm.org/page/Main_Page" target="_blank">KVM</a> for hardware virtualization and <a href="https://linuxcontainers.org/lxc/introduction/" target="_blank">LXC containers</a> for Linux system virtualization. Since the guest systems can run under hardware virtualization, we get some added bonuses. For example, we can benefit from Ryzen hardware and still get <a href="http://www.pcworld.com/article/3189990/windows/microsoft-blocks-kaby-lake-and-ryzen-pcs-from-windows-7-81-updates.html" target="_blank">Windows 7 updates</a>. We would need to identify our Windows <a href="https://www.nextofwindows.com/the-best-way-to-uniquely-identify-a-windows-machine" target="_blank">Universally Unique Identifier (UUID)</a> so that we can set the same UUID on our VM. Otherwise, Microsoft may think that we have a new copy of Windows that needs to be registered.</p>
</div>
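<div class="paragraph">
<p>For reference, PVE exposes the VM&#8217;s SMBIOS UUID in the guest configuration, so once identified it can be pinned there. A sketch; both the VM ID (100) and the UUID value are placeholders:</p>
</div>

```
# /etc/pve/qemu-server/100.conf (excerpt; vmid and UUID are placeholders)
smbios1: uuid=12345678-1234-1234-1234-123456789abc
```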
<div class="paragraph">
<p>We will use <a href="https://github.com/zfsonlinux/zfs/wiki/faq" target="_blank">ZFS</a>, a storage platform that encompasses the functionality of traditional filesystems, volume managers, and more, with consistent reliability and performance. Our ZFS installation will be compressed and striped: our two SSD drives will run in parallel and require less storage space, which improves read/write performance. In addition, our ZFS will be mirrored: our SSD drives will be cloned so that we have a backup in case of drive failure.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: KVM supports multiple disk formats: raw images, the native QEMU format (qcow2), the VMware format, and many more. When working with ZFS on PVE, we need to use raw images. It may not seem obvious at first, but we can easily convert an existing KVM file from one format to a raw image. Near the end of this guide, we&#8217;ll cover the process to convert a qcow2 format to the required PVE raw image.</p>
</div></div></td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="sect1">
<h2 id="_next_steps">Next Steps&#8230;&#8203;</h2>
<div class="sectionbody">
<div class="paragraph">
<p>This is Part 1 of a multipart tutorial. The next two parts will cover the installation of PVE and server tweaks we can make to improve the performance of our VMs and containers:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="/2017/04/25/Server-Virtualization-Management-Part2.html">Part 2: Getting Started</a></p>
</li>
<li>
<p><a href="/2017/05/03/Server-Virtualization-Management-Part3.html">Part 3: System Optimization</a></p>
</li>
</ul>
</div>
</div>
</div>]]></description><link>https://roobyz.github.io/2017/04/23/Server-Virtualization-Management.html</link><guid isPermaLink="true">https://roobyz.github.io/2017/04/23/Server-Virtualization-Management.html</guid><category><![CDATA[Blog]]></category><category><![CDATA[Open_Source]]></category><category><![CDATA[Technology]]></category><category><![CDATA[ProxMox]]></category><dc:creator><![CDATA[Roberto Rivera]]></dc:creator><pubDate>Sun, 23 Apr 2017 00:00:00 GMT</pubDate></item><item><title><![CDATA[Titanic: Learning Data Science with RStudio]]></title><description><![CDATA[<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>So we are aspiring data scientists and want to dip our toes into <a href="http://rmarkdown.rstudio.com/" target="_blank">RStudio</a>. How do we get started? We dive into the waters of the <a href="https://www.kaggle.com/c/titanic" target="_blank">Kaggle Titanic "Competition"</a>, of course!</p>
</div>
<div id="toc" class="toc">
<div id="toctitle" class="title">Table of Contents</div>
<ul class="sectlevel1">
<li><a href="#_our_objective">Our Objective</a></li>
<li><a href="#_kaggle_basics">Kaggle Basics</a></li>
<li><a href="#_titanic_history_lesson">Titanic History Lesson</a></li>
<li><a href="#_next_steps">Next Steps&#8230;&#8203;</a></li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_our_objective">Our Objective</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Use this Kaggle exercise to:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>learn to reason and problem solve like a data scientist</p>
</li>
<li>
<p>get somewhat comfortable with RStudio</p>
</li>
<li>
<p>predict whether a passenger would survive the sinking of the <a href="https://en.wikipedia.org/wiki/RMS_Titanic" target="_blank">Titanic</a></p>
</li>
<li>
<p>enter a Kaggle submission file for evaluation</p>
</li>
<li>
<p>have fun!</p>
</li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="_kaggle_basics">Kaggle Basics</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Kaggle is a community of data scientists and a platform for facilitating data science journeys. One way to participate is by entering data science competitions. As in its other competitions, Kaggle provides two Titanic datasets containing passenger attributes:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>a <em>training set</em>, complete with the outcome (target) variable for training our predictive model(s)</p>
</li>
<li>
<p>a <em>test set</em>, for predicting the unknown outcome variable based on the passenger attributes provided in both datasets.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>After training and validating our predictive model(s), we can then enter the submission file to Kaggle for evaluation. As we iterate, we can submit more files and assess our progress on the leaderboard. Subtle model improvements can lead to significant leaps on the leaderboard.</p>
</div>
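<div class="paragraph">
<p>Although this tutorial itself will use RStudio, the train/predict/submit loop is language-agnostic. Below is a minimal sketch in Python; the tiny inline CSVs and the majority-per-gender baseline are illustrative assumptions, not Kaggle&#8217;s actual data or a recommended model:</p>
</div>

```python
import csv
import io
from collections import Counter

# Toy stand-ins for Kaggle's Titanic files: the training set includes the
# Survived target; the test set does not. (The real files have many more columns.)
train_csv = "PassengerId,Survived,Sex\n1,0,male\n2,1,female\n3,1,female\n4,0,male\n"
test_csv = "PassengerId,Sex\n5,male\n6,female\n"

train = list(csv.DictReader(io.StringIO(train_csv)))
test = list(csv.DictReader(io.StringIO(test_csv)))

# Deliberately naive baseline "model": predict the majority outcome
# per gender observed in the training set.
outcomes_by_sex = {}
for row in train:
    outcomes_by_sex.setdefault(row["Sex"], Counter())[row["Survived"]] += 1
model = {sex: c.most_common(1)[0][0] for sex, c in outcomes_by_sex.items()}

# Apply the model to the test set to build Kaggle-style submission rows
# (PassengerId plus the predicted Survived value).
submission = [(row["PassengerId"], model[row["Sex"]]) for row in test]
print(submission)  # [('5', '0'), ('6', '1')]
```

Each refinement of the model changes only the middle step; the surrounding load-and-submit scaffolding stays the same from iteration to iteration.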
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: Predictive models are trained using attributes (variables), right? How does that work?</p>
</div>
<div class="paragraph">
<p>Some attributes are correlated: as one varies, others may vary with it to some degree. Machine learning leverages that interdependence to model the predicted outcomes. For accurate model performance, we need to:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>maximize the explanatory power of our variables: favor those that are correlated with the outcome variable, and</p>
</li>
<li>
<p>compensate for the correlation of explanatory variables to each other (<a href="https://en.wikipedia.org/wiki/Multicollinearity" target="_blank">multicollinearity</a>).</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>In other words, we need to find the smallest set of variables that explains almost everything going on with the outcome we want to predict.</p>
</div></div></td>
</tr>
</tbody>
</table>
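<div class="paragraph">
<p>To make the correlation idea concrete, here is a small Python sketch with made-up numbers: one attribute correlated with the outcome, and a second attribute that is nearly a duplicate of the first, which is the multicollinearity case described in the tip above:</p>
</div>

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Made-up attributes: a fare-like variable, a near-duplicate of it,
# and a binary outcome.
fare = [7.0, 8.0, 26.0, 71.0, 8.0, 53.0]
cabin_cost = [7.5, 8.2, 25.0, 70.0, 9.0, 52.0]  # carries almost the same information
survived = [0, 0, 1, 1, 0, 1]

print(pearson(fare, survived))    # substantial: useful explanatory variable
print(pearson(fare, cabin_cost))  # near 1.0: a redundant (collinear) pair
```

Keeping both <code>fare</code> and <code>cabin_cost</code> would add almost no new information to a model, which is why we prefer the smallest set of mutually independent explanatory variables.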
</div>
</div>
<div class="sect1">
<h2 id="_titanic_history_lesson">Titanic History Lesson</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The Titanic was a British passenger liner that sank after colliding with an iceberg in the Atlantic on its maiden voyage en route to New York City. It was the largest ship of its time with 10 decks, 8 of which were for passengers.</p>
</div>
<div class="paragraph">
<p>There were 2,224 passengers and crew aboard. Of the 1,317 passengers, there were: 324 in First Class (including some of the wealthiest people of the time), 284 in Second Class, and 709 in Third Class. Of these, 869 (66%) were male and 447 (34%) female. There were 107 children aboard, the largest number of which were in Third Class.</p>
</div>
<div class="paragraph">
<p>The ship had enough lifeboats for about 1,100 people, and more than 1,500 died. Due to the "women and children first" protocol, men were disproportionately left aboard. Also, not all lifeboats were completely filled during the evacuation. The 705 surviving passengers were rescued by the RMS Carpathia around 2 hours after the catastrophe.</p>
</div>
<table class="tableblock frame-all grid-all spread">
<colgroup>
<col style="width: 11.1111%;">
<col style="width: 88.8889%;">
</colgroup>
<tbody>
<tr>
<td class="tableblock halign-center valign-middle"><p class="tableblock"><span class="image"><img src="/images/icons/lightbulb.png" alt="lightbulb.png" width="56"></span></p></td>
<td class="tableblock halign-left valign-top"><div><div class="paragraph">
<p><strong>About That</strong>: So the different variables like gender and class could influence whether someone survived, correct?</p>
</div>
<div class="paragraph">
<p>Yes. For example, someone may not have been able to reach the lifeboats because they were on a lower deck, which is correlated with Third Class. Also, children were disproportionately in Third Class, but they were also favored by the "children first" protocol.</p>
</div></div></td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="sect1">
<h2 id="_next_steps">Next Steps&#8230;&#8203;</h2>
<div class="sectionbody">
<div class="paragraph">
<p>We&#8217;ll approach this in multiple parts. This is still a work in progress, but roughly speaking it should look like:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Part 1: Basic Setup</p>
</li>
<li>
<p>Part 2: Feature Engineering</p>
</li>
<li>
<p>Part 3: Prediction</p>
</li>
<li>
<p>Part 4: Conclusion</p>
</li>
</ul>
</div>
</div>
</div>]]></description><link>https://roobyz.github.io/2017/04/16/Predict-Survival-Propensity-of-Titanic-Passengers.html</link><guid isPermaLink="true">https://roobyz.github.io/2017/04/16/Predict-Survival-Propensity-of-Titanic-Passengers.html</guid><category><![CDATA[Blog]]></category><category><![CDATA[Open_Source]]></category><category><![CDATA[Machine_Learning]]></category><category><![CDATA[Analytics]]></category><category><![CDATA[Data_Science]]></category><dc:creator><![CDATA[Roberto Rivera]]></dc:creator><pubDate>Sun, 16 Apr 2017 00:00:00 GMT</pubDate></item><item><title><![CDATA[Qualitative Prediction of Weight-Lifting Exercises]]></title><description><![CDATA[<div class="paragraph">
<p>One day your company tells your team that they should switch from a proprietary analytics platform like <a href="https://www.sas.com/" target="_blank">SAS</a> to something <a href="https://opensource.org/" target="_blank">open source</a> like <a href="http://rmarkdown.rstudio.com/" target="_blank">RStudio</a>. Undoubtedly, some analysts may become fascinated, while others grow anxious. How would you get them excited about the switch?</p>
</div>
<div class="paragraph">
<p>My approach was to demonstrate the potential of the new open source platform. How easily could RStudio generate reproducible research and facilitate storytelling with data? How could we weave together narrative text and code to seamlessly produce and deliver elegantly formatted analyses to multiple audiences?</p>
</div>
<div class="paragraph">
<p>Leveraging Human Activity Recognition (HAR) data provided from a <a href="http://groupware.les.inf.puc-rio.br/har#ixzz3de67BWZU" target="_blank">Groupware@LES</a> study, a machine learning use-case was born. HAR data has become ubiquitous with the advent of devices like the Fitbit, Nike FuelBand, and even smartphones. Although users of these devices tend to quantify how much they participate in an activity, they rarely consider how <em>well</em> they perform the activity.</p>
</div>
<div class="paragraph">
<p>The provided multiclass variable was generated by participants wearing HAR devices, and is relatively balanced (equally distributed). This simplifies the analysis somewhat since we don&#8217;t need to consider tactics to combat <a href="http://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/" target="_blank">imbalanced classes</a>. In addition, this lets us focus primarily on the other major analytics steps required in most machine learning projects.</p>
</div>
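<div class="paragraph">
<p>A quick sanity check before modeling is to tabulate the class frequencies of the target. Here is a minimal Python sketch, using a made-up label vector standing in for the study&#8217;s exercise-quality classes (the values and their proportions are illustrative, not the actual HAR data):</p>
</div>

```python
from collections import Counter

# Hypothetical stand-in for a multiclass target: one "performed correctly"
# class plus several common-mistake classes.
labels = ["A", "B", "A", "C", "E", "D", "B", "A", "C", "D", "E", "B"]

counts = Counter(labels)
total = len(labels)
for cls in sorted(counts):
    print(f"{cls}: {counts[cls]:2d} ({counts[cls] / total:.0%})")
```

When the resulting proportions are roughly equal, as here, we can skip resampling or class-weighting tactics and move straight to feature selection and model training.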
<div class="paragraph">
<p>Machine learning project goals:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>use the multiclass variable to build a predictor that distinguishes participants who correctly completed fitness exercises from those who did not, and identifies what their mistakes may have been.</p>
</li>
<li>
<p>demonstrate the feasibility of using RStudio for delivering reproducible research via a webpage.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The <a href="https://cdn.rawgit.com/roobyz/PredictiveML/c0297e0d771e39633436b3cff87707f0c5f4b851/ml_activity_success.html" target="_blank">project results</a>, and <a href="https://raw.githubusercontent.com/roobyz/PredictiveML/master/ml_activity_success.Rmd" target="_blank">code</a> are available on my <a href="https://github.com/roobyz/PredictiveML" target="_blank">GitHub Repository</a>.</p>
</div>]]></description><link>https://roobyz.github.io/2017/04/15/Identifying-the-Successful-Completion-of-Weight-Lifting-Exercises.html</link><guid isPermaLink="true">https://roobyz.github.io/2017/04/15/Identifying-the-Successful-Completion-of-Weight-Lifting-Exercises.html</guid><category><![CDATA[Blog]]></category><category><![CDATA[Open_Source]]></category><category><![CDATA[Machine_Learning]]></category><category><![CDATA[Analytics]]></category><category><![CDATA[Data_Science]]></category><dc:creator><![CDATA[Roberto Rivera]]></dc:creator><pubDate>Sat, 15 Apr 2017 00:00:00 GMT</pubDate></item></channel></rss>