Create Linux swap on external USB 3 hard drive

There are a number of things to consider, and maybe test, bearing in mind your setup and use case.

Suggested SWAP partition locations

If the USB HDD is not a good idea, where should I put some swap?

Short answer: yes, you can create a swap partition on the USB3 HDD, but the 2x750GB HDD array is possibly the safest place to put the swap.

However, you could also spread your swap across all disks with varied priorities to try to maximise performance and swap capacity. If you like over-optimising like me, I'd recommend trying something like the following (which requires tinkering with /etc/fstab; see the example entries after this list):

  • Allocate a little swap partition space on the 2x SSD array, e.g. 4GB, with high priority (limited SSD space and paranoia over SSD lifespan are reasons some people avoid this).
  • Allocate more swap partition space on the 2x HDD array, e.g. 8GB, with medium priority.
  • Allocate even more swap space in a swap file on the USB3 HDD, e.g. 16GB with low priority.

That way, if system RAM gets crushed with lots of processes begging for RAM and being swapped out, the load is distributed across all disk devices. Note also that the swap priorities above are based on the performance of the underlying disk systems.
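
For illustration, a minimal sketch of what the /etc/fstab entries could look like (the device names and swap file path are hypothetical; substitute your own, and note a higher pri= value means higher priority):

# hypothetical /etc/fstab swap entries
/dev/md0p3         none  swap  sw,pri=10  0  0
/dev/md1p3         none  swap  sw,pri=5   0  0
/mnt/usb/swapfile  none  swap  sw,pri=1   0  0

The USB swap file itself would need to be created first, e.g.:

dd if=/dev/zero of=/mnt/usb/swapfile bs=1M count=16384
chmod 600 /mnt/usb/swapfile
mkswap /mnt/usb/swapfile
swapon -a
swapon -s   # verify all areas are active with the expected priorities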

Next, I'll try to go over some detailed reasoning.

Storage speed is probably much more important

You've probably read the recommendation to place swap on a less busy or dedicated drive, but that only applies in an apples-to-apples comparison; it isn't an accurate rule for a more complex system mixing different storage media (SSDs vs HDDs) and interfaces (SATA vs USB3). For your case, the guiding principle should be to balance the I/O load types and allocate swap to the interfaces and drives that have the most spare random I/O throughput. That could be the SSDs, but with a caveat...

USB3 HDD for SWAP

You mentioned in a comment that the USB3 option didn't perform too well, and indeed, the reasons could be:

  • Your USB3 drive is probably a single disk system, whereas your 2x SSD and 2x HDD with RAID should have better performance, given:
    • RAID 0 nearly doubles both read and write performance.
    • RAID 1 nearly doubles just read performance and can degrade write performance by a marginal amount.
    • So, assuming similar individual drive performance, the USB3 HDD would only be better if, on average, the 2x HDD SATA array was busy 50% of the time and the USB3 1x HDD was 0% busy.
  • And even more so if you compare swapping on one HDD to 2x SSDs: there's no contest; the SATA SSDs would have to be 95%+ busy before a single dormant USB HDD might begin to compare...
  • USB3 will have more latency than SATA, and low latency is a key factor in memory access performance and responsiveness.
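
A quick back-of-the-envelope check of the 50% point above, assuming (purely for illustration) that each 7200rpm HDD of this class manages roughly 100 random IOPS: an idle 2x HDD striped array offers ~200 IOPS of spare random throughput; at 50% busy, that drops to ~100 IOPS, which is all a single fully idle USB3 HDD could offer anyway, before you even count USB protocol overhead and extra latency.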

Internal HDDs array for swap

As above, the 2x HDDs for swap should be better than just 1 HDD hanging off USB3 and, as will be explained, should be safe to use for swap.

  • The 2x HDDs are best suited to large data sets which would tend to have sequential access patterns, e.g. media files (music/video/images).
  • I'm not sure about the Intel RAID setup, but with Linux RAID (mdadm) I know you have options (see the sketch after this list), e.g.:
    • you could share the same disks, but make a RAID 0 for swap and a RAID 1 for VM images/data
    • you could avoid RAID overhead entirely and directly configure a swap partition at the start of each individual drive, while configuring mdadm to create the array out of the 2nd partition on each drive
  • HDD magnetic media is supposed to have better write longevity compared to SSDs (if they don't suffer other types of premature failure...)
  • If a system swaps a lot, it implies a lot of writes.
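
A rough sketch of the first mdadm option, assuming two hypothetical drives sdb and sdc, each pre-partitioned with a small 1st partition for swap and a large 2nd partition for data:

# striped (RAID 0) array for swap, built from the 1st partitions
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb1 /dev/sdc1
# mirrored (RAID 1) array for VM images/data, built from the 2nd partitions
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb2 /dev/sdc2
mkswap /dev/md0
swapon -p 5 /dev/md0   # medium priority, per the scheme above

For the second option, skip md0 and instead mkswap/swapon sdb1 and sdc1 directly with equal priority; the kernel round-robins between equal-priority swap areas, which approximates RAID 0 striping without the md layer.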

SSDs for swap

2x SSD 120GB would be great for swap performance, but SSD lifespan is a factor to look out for.

  • SSDs are more like RAM compared to rotating disks, and have much better random I/O support.
  • If lots of VMs and processes are running and your RAM is heavily utilised, the page-fault (read) access patterns to the swap partition/file are going to end up random.
    • Memory page allocation units are small, i.e. 4KB
    • I assume the Linux kernel is smart about 'swapout' (freeing some RAM by taking pages out and putting them on disk) and does it in batches to optimize for more sequential writes to disk.
    • For 'swapin' (when a process page-faults on data that's not in RAM but in swap), access could be quite random, and that's where an SSD can excel.
    • The Windows 7 Engineering MSDN blog recommends SSDs, given reads outnumber writes by about 40 to 1 (hopefully Linux is similar in principle), alleviating the concern about too much writing to the SSD.
  • Even if your SSDs are used to store your main OS and some VM images, there's probably plenty of headroom for swap operations too. I have 2x 128GB Crucial M4s in RAID0 and they get awesome sequential I/O (almost 1000MB/s), plus fairly good random read/write performance too (I measured close to 5000 IOPS and 50MB/s on a nasty mix of random reads with mixed sizes, mostly 4K and 16K blocks, but up to 256K).
  • Enterprise class SSD, i.e. based on more robust SLC tech, can handle more erase-write cycles and should be okay for swap.
  • Consumer SSDs, i.e. based on cheaper, higher-density MLC, might suffer a worse-than-expected lifespan if swap usage gets very heavy very often (I'm assuming you have consumer SSDs given the budget comments you made). However, at least in normal desktop workload scenarios, it sounds like swap on an SSD isn't an issue.
  • When SSDs get fully utilised, write performance degrades, and the write-wear and lifespan issues become even worse.
  • You can potentially mitigate the erase-write limits and write performance issues of the SSD array by under-provisioning, leaving more headroom for SSD garbage collection to free up contiguous write blocks for better write performance and longevity (see the sketch after this list).
    • Assuming you previously used the SSDs to full capacity, an ATA secure erase operation might help refresh them so the wear-levelling algorithms see the full SSD as unallocated.
    • Simply partition only 80 to 90% of capacity and leave the end of the SSD space free.
  • RAID type? If you have more faith in the reliability of SSDs and can afford the time to restore from a backup, I recommend RAID 0. Note that RAID 1 on 2 SSDs will technically have double the impact on write lifespan compared to RAID 0 (as it duplicates every write). So maybe steer clear of RAID 1...
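
A minimal sketch of the under-provisioning idea, assuming a hypothetical SSD at /dev/sdb (the secure erase DESTROYS ALL DATA, and drives in a 'frozen' security state may need a suspend/resume cycle before hdparm will accept the commands):

# optional: ATA secure erase to reset the wear-levelling state
hdparm --user-master u --security-set-pass p /dev/sdb
hdparm --user-master u --security-erase p /dev/sdb
# then partition only ~85% of the capacity, leaving the tail unallocated
parted /dev/sdb mklabel gpt
parted /dev/sdb mkpart primary 1MiB 85%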

Other tweaks

There are also several other tweaks and options you should consider given the concerns of supporting multiple VMs, etc.

Linux loves more RAM for caching I/O, and virtualisation hates disk I/O

Potential pitfalls:

  • Don't allocate all of your RAM to guest operating systems; keep some back for caching I/O.
  • Find the sweet spot for 'swappiness' (see the sysctl example below). Swapping should leave some room in RAM to cache disk I/O, but too much swapping will cause processes to be swapped out too soon and hurt general multitasking.
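
A small sketch of tuning swappiness (the value 10 below is only an illustrative guess; the right number depends on your workload):

cat /proc/sys/vm/swappiness                   # default is usually 60
sysctl vm.swappiness=10                       # apply immediately
echo 'vm.swappiness=10' >> /etc/sysctl.conf   # persist across reboots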

Modern CPUs have good hardware support for virtualising the CPU and memory resources, but when it comes to sharing disk storage, virtualisation workloads often bottleneck. Linux (and Windows) can improve I/O performance by using RAM to cache I/O operations while the SSD or HDD devices are still busy 'catching up'. Therefore, your extra RAM might not just be useful for running multiple OSes, but also for caching virtual machine I/O.

Virtual guest pagefile location

It would be a great solution if I could also use the same location for the Windows vbox clients' to move swap from C: to there!

I'm not sure about this, but my hunch is:

  • rather allocate enough (or more) RAM per VM and let Linux swap the VirtualBox process's pages in and out on the host as and when needed; also look at using the VirtualBox memory ballooning control (see the sketch after this list)
    • after double-checking, it sounds like VirtualBox locks and hogs the RAM, so the host OS can't page it in and out
    • so you'll still need some swap for virtual guests
  • Having enough RAM for each guest and using memory ballooning should be faster/better compared to each individual VM guest doing its own swapping via virtual I/O, which carries a performance penalty
  • also explore the option of installing the virtio drivers for Windows (VirtualBox supports this now and RedHat has these drivers)
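
A hedged sketch of the ballooning controls (the VM name "winserver" is hypothetical, and ballooning requires the Guest Additions installed in the guest):

VBoxManage modifyvm "winserver" --guestmemoryballoon 1024   # max balloon size in MB (VM powered off)
VBoxManage controlvm "winserver" guestmemoryballoon 512     # inflate on a running VM, returning ~512MB to the host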

Compress swapped storage

If your virtual host has a fair number of spare CPU cores, then something like zswap could work well:

  • Could give a good performance boost if using the 2x HDD for swap space.
  • Might not help performance that much when swapping to the 2x SSD, but compression would imply fewer write cycles.
  • And it implies more virtual memory capacity from less storage.

Anyhow, this may not be worth the effort, as it requires a newer kernel, and Debian is notorious for sticking with older, tried-and-tested kernels, so it's not an easy option unless you backport a kernel or look at a different distro: e.g. Ubuntu 14.04 or CentOS 7 should offer more recent kernels.
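
If you do end up on a kernel that ships zswap (3.11 or later), enabling it is just a matter of kernel parameters; a sketch, with illustrative values:

# append to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, then run update-grub:
#   zswap.enabled=1 zswap.compressor=lzo zswap.max_pool_percent=20
# or toggle at runtime for a quick test:
echo 1 > /sys/module/zswap/parameters/enabled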

Benchmarking Experience

On my own workstation (Windows 7), I used fio (http://www.bluestop.org/fio/) to mimic the random read and random write I/O trends mentioned in the MSDN blog. Anyone else wanting to test what various storage options can offer under swap/page file workloads could try something similar.

In looking at telemetry data from thousands of traces and focusing on pagefile reads and writes, we find that

  • Pagefile.sys reads outnumber pagefile.sys writes by about 40 to 1,
  • Pagefile.sys read sizes are typically quite small, with 67% less than or equal to 4 KB, and 88% less than 16 KB. Pagefile.sys writes are relatively large, with 62% greater than or equal to 128 KB and 45% being exactly 1 MB in size.

Benchmark Setup

This is the fio job file I used:

[global]
description="test random read and write to estimate suitability for page file use"
filename=fakeswap
numjobs=1
iodepth=1
; direct + sync bypass the page cache, closer to how swap hits the device
direct=1
sync=1
filesize=2048m

[pageout]
rw=randwrite
; weighted random write block sizes (proportions guessed from the MSDN stats)
bssplit=64k/38:256K/15:1024K/45:4096k/2

[pagein]
rw=randread
; weighted random read block sizes (proportions guessed from the MSDN stats)
bssplit=4K/67:16K/21:64K/10:256K/2

Since the MSDN blog post only briefly mentioned a few stats, I made some educated guesses about block sizes and the proportions of I/Os at those sizes. I used the bssplit option to weight the different block sizes. My guesses were hopefully not too bad, given the final ratio of random read vs write I/Os I got was 38.5:1, which is quite close to the 40:1 mentioned by the blog post.
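
As a rough sanity check (assuming bssplit weights by I/O count): the weighted average read block works out to 0.67x4K + 0.21x16K + 0.10x64K + 0.02x256K ≈ 17.6K, and the weighted average write block to 0.38x64K + 0.15x256K + 0.45x1024K + 0.02x4096K ≈ 605K. Pushing the same 2048MB through each job should therefore take roughly 605/17.6 ≈ 34x as many read I/Os as write I/Os, in the same ballpark as the 38.5:1 fio actually reported.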

I ran the benchmarks on an AMD SB850 based storage chipset and compared them to the performance of a RAM drive.

  • DDR3 Dual Channel @ 1600MHz with 2G RAMDisk (using DataRAM RAMDisk product)
  • SSDx2 RAID 0 (Crucial M4 128GB), NTFS
  • HDDx4 RAID 10 (Seagate 7200.14 3TB), NTFS
  • ADATA UV150 USB3 Flash Drive 32GB, FAT32

Note, I executed the random read and random write benchmarks independently (not mixed, though a real system may see mixed patterns; I was interested in comparing read/pagein versus write/pageout, hence I separated them). E.g. the commands I used were:

fio --section=pageout --output raid10_hdd4_pageout_2G.txt page2g.fio
fio --section=pagein --output raid10_hdd4_pagein_2G.txt page2g.fio

Benchmark Results

The benchmarks confirmed my own suspicion that a USB3 flash drive (note: not a hard disk on USB3) could perform fairly well with small random I/O. It turns out, however, that it isn't that good at larger random write blocks, showing very erratic latency times.

The following graph shows the time taken to page out and page back in 2G of swap space with the representative/estimated random I/O patterns for paging:

[Graph: time taken to page out and page back in 2G of swap space with a representative random I/O pattern]

I also looked at average throughput and compared it to that of RAM - it gives an idea of how bad things get when the system has to use swap ;-)

[Table: comparing storage options for swap space and page files]

Further observations

  • Random Read I/O matters more than Random Write because of the smaller block sizes and larger number of IOs. Proportionally, pagein is more painful than pageout...
  • SSDx2 RAID 0 was about 10x slower than RAM
  • HDDx4 RAID10 looks to be terrible at pagein - about 300x slower than RAM and 30x slower than SSD.
  • However, HDDx4 RAID10 looks like it'll do relatively better at pageout - about 40x slower than RAM and only about 4x slower than SSD
  • The USB3 flash drive was much better at small random reads compared to the HDD RAID (~9x faster), so much so, that it made up for how poor it was at random write (~7x slower). Even when plugged into a USB 2 port, overall, it beats the HDD RAID.

WARNING - I'm not recommending putting swap / page files on a USB flash drive

  • A USB flash drive's NAND and controller could lack robust wear-levelling and garbage collection implementations (e.g. it can't benefit from the SSD ATA TRIM command), making it more likely that, if used for swap space/a page file, it'll suffer a short lifespan and performance degradation over time. My tests were on a fresh/new flash drive; maybe after 6 months of swapping to and from it, it won't keep up the performance and will die prematurely.

Last few notes

  • My SSDs and HDDs have fairly large caches, 256MB and 64MB respectively per device, so this presumably gives them a boost, whereas the USB flash drive probably lacks this.
  • I'm not sure how well the observation M$ made about Windows page file use applies to a Linux swap partition or file, but I'd bet it's not far off...

References

More reading (sorry, would've posted more links, but I've just signed up and superuser doesn't trust me yet)


Comments

  • arch-abit
    arch-abit almost 2 years

    I have an external USB 3 hard drive on a USB 3 port on Debian, and I would like to create a swap partition on that drive, which also holds the partitions for databases and other high-volume stuff.

    My system is Debian 7.7.0 on a TS140 box with 16 GB RAM, 2x120GB SSD and 2x750GB HDD via Intel BIOS RAID 1. If the USB hard drive is not a good idea, where should I put some swap? I have been using the same box with Debian 7.6 and a single HDD for a while, and the way I use it never needed to swap, so I am not even sure I need a swap partition. I also hope to upgrade the RAM to 32 GB soon.

    Due to the storage upgrade there is going to be more load on the server (more VirtualBox clients), so I would like to prepare a permanent location, or locations, for swaps during the install. It would be a great solution if I could also use the same location for the Windows vbox clients' to move swap from C: to there!

    As of now I do not know how many VirtualBox clients I am going to install, and how many of those I need to run at the same time. I only know that I need a Windows server, a Windows 7 workstation, and a Linux workstation as a minimum.

    EDIT: 11/15/2014 16:20

    JPvRiel

    What you wrote is a gold-mine of info, thank you sooo much for taking the time!! I am not even going to question anything you wrote; at least 99% of it makes sense to me so far - and I have only read through it twice so far, and much more than twice in parts.

    I guess the config I am trying to make work (no optimization concerns or such so far, just to make it work) is a darn-near-perfect reflection of your proposals. So far...

    • Yes, US $40-60 apiece SSD drives are not to sweat over - how, when and why they wear out when used as swap. I thought this was very-very-very well understood by me already and never bothered me, so thank YOU! I would have never-ever considered boosting my small box's 32GB RAM limit to where these RAID0 120GB commodity drives could take it - WHEN, again - WHEN the need emerges. Considering it is going to be never, or maybe for a few seconds here and there, the SSDs are never going to wear out - but some 240GB of additional swap on SSD is going to make a HUUUUGE difference in those rare WHENs! You are 100% correct in that! My system designs are never going to look the same again after reading this. RAID0 on SSD as RAM - you are priceless! You, Madam or Sir, have given me not only a solution but also a prospect!

    I am still reading and trying to comprehend the most out of your post, and again - thank you!

    I am working on this mostly for my employer, who not so recently purchased and put to work a number of the then-newest and, by our use, proven-to-be-the-best tech on the planet, but no-one in the plant, including myself, has the slightest clue how to deal with the nature and the volume of the data those machines generate every (split)second in operation. So far the tech's systems and hardware work very well and produce swell; we just cannot make them report, and I have no idea what I should build to make them do so - and KEEP IT! I am (still) very confused about this project.

    While most posters here worry about millions of records in one database, I worry about hundreds and hundreds of hundreds of databases created and populated by automated optical inspection machines each hour of the day, and I need to find a way to read and keep what is important to my employer before the databases are erased and the inspection cycle begins anew.

    Yeah, my kinda engineers are really just a frustrated gang of sorts... Again, truly, thanks for your help! Sandor

    ADD: 12-03-2014

    My frustration levels were going up & up until I finally decided to cut to the chase and focus on the essentials. And here they are; I hope you all may benefit: if you cannot afford a box with 'unlimited' RAM for your purposes, opt for some cheap SSD to store Linux/UNIX swap and Windows page files. Budget to wear those SSDs out until they fail, so purchase them in identical batches, and pay no attention to reviews of the SSD of your choice unless your plan is to go certifiably insane. Whatever you buy, the fact is there is always something inherently wrong with SSD products, and at this point keeping your sanity should be the priority, yes no?
    Once you make some hard decisions along these lines, the rest is easy... This works best for experiments - if you are building a real production load around it, or if you are considering using it with high-workload workstations, you ARE on your own; do not quote me. Never consider putting the boot MBR or /boot on SSD or on any RAID. Do not put /home on the same SSD as swap - oh boy! Use them, wear those SSDs down so you get your money's worth! C: drive, /boot & /home on SSD - faster boot, yes, but for any experimental server that is straight out of the Twilight Zone! I have USB3 HDD replicators, Linux boot doctors, software to cover from a-z, and all work with and are trusted to work with 'spinning rust'. I admit they are 'rusting trust' but... protect your C:'s and /home's. Put them on spinning rust. Do not let new tech fool you. Please at least consider trusting spinning rust one more time! There is not much else to tell, but good luck to you all! I consider this thread to be closed. - Sandor

    • Admin
      Admin over 9 years
      With 16GB of RAM you shouldn't worry too much about the swap (especially if you're doubling it soon!) but, ideally, keep it on the same disk as the OS or on a similar spec disk. It's like Windows pagefile - for the most part you should let the system manage it. I wouldn't put it on USB because, out of the 3 types of disk you have, this is the slowest technology. If you want a separate disk then use another SSD or SATA...
    • Admin
      Admin over 9 years
      @BigChris Unfortunately my situ is way more complex.. I need to simulate a PCB production environment as a proposal for data interchange between waay too many old and new techs. I have one metal box to cram all this into and I have no idea how big the vbox client list is going to grow. If, as I am installing Debian again for yet one more time, I put the swap on a USB drive, should that be OK? - no matter how slow it is. I may connect another dozen or more USB HDD if I need to beg borrow or steal. I am not an IT pro. Thanks for your response!
    • Admin
      Admin over 9 years
      Obviously there is little focus to my questions... This is the first time I need to deal with planning ahead of a complete IT system growth, and the reality of the facts are overwhelming me. I do this on my own time and budget, and my employer also has little experience in this particular issue - they are in manufacturing, not in IT. We know how to build PCBs and all sort of machines - we just do not know how to purpose them.
    • Admin
      Admin over 9 years
      After some testing, any swap on any kind of USB turns out to be a bad idea; the delays during mount/dismount and other USB activities render swap practically meaningless. The otherwise impressive transfer speeds are, in this case, also meaningless.
    • Admin
      Admin over 9 years
      @JPvRiel - Again, thanks! Your understanding of the issue and your capacity to test are far beyond my reach, and I appreciate the facts. I have a number of posts on superuser related to this project of mine, which I am going to conclude - thanks in large part to your responses. My server is now stable NOT using any RAID, running XUbuntu (not Debian after all), and all swap and page files for the host and for the guests are on the SSD. From here on I am working to put all this to work and finally utilize the server for the purposes it was built for...
  • Prasanna
    Prasanna over 9 years
    A very detailed answer! Welcome to SU !
  • JPvRiel
    JPvRiel over 9 years
    @Prasanna, thanks. Premature optimization and tinkering is a habit I can't kick. My friends call me Captain '-vv' :-)
  • arch-abit
    arch-abit over 9 years
    More reading (sorry, would've posted more links, but I've just signed up and superuser doesn't trust me yet) - no need to trust anyone, the posts decide?
  • JPvRiel
    JPvRiel over 9 years
    @arch-abit thanks. There may be one or two mistakes (just spotted and corrected my err about RAID1, it'll actually have double the impact on shortening SSD lifespan compared to RAID0...). By the way, I'm busy testing and benchmarking this stuff on RAID0 SSDx2, RAID10 HDDx4 and a 32G Flash Drive. Will share results soon.
  • JPvRiel
    JPvRiel over 9 years
    @arch-abit. Note, corrected yet another err in my answer, as it sounds like VirtualBox locks the memory you allocate to VM guests, and so that can't be swapped out and in by the host OS. But also updated the answer to include a link to Memory Ballooning which more or less works around this issue... one curse of a long answer is the higher probability of getting several things wrong :-)
  • arch-abit
    arch-abit over 9 years
    @JPvRiel - If this post appears to be a duplicate, forgive me; I am posting from a vbox client that keeps crashing. Your understanding and capacity are far beyond my reach; the facts are appreciated. I now need to close a number of posts here on superuser as my server seems to be stable - much due to your posts. I decided to use XUbuntu in place of Debian for the server software, and to use the SSD only for swap and for Windows pagefiles (drive G:). I have started putting the server to use. On my end it is all good, thanks and much regards! - Sandor
  • flarn2006
    flarn2006 about 3 years
    One potential concern I don't think you mentioned is what happens if the removable drive is disconnected for whatever reason, and the system attempts to access swap. Would this result in a kernel panic? If so, that's a possibility that should be taken into account.