Copy large folders using command-line with progress indication

Many sources cite cp as the fastest way to copy large file sets. Its only fault is that it offers no easy way to see the progress of the copy. This post discusses various ways of working around this problem. Keep in mind that this post is (again) QNAP-optimized, meaning that it mostly focuses on solutions available on my old, lame QNAP TS-210 NAS. Methods that are not available on this very limited edition of Linux are only pointed out.

Introduction

Of course, the easiest way is to use some GUI-like program, even in the command line, for example mc (Midnight Commander), which provides a graphical progress display for each copy process. But assuming this is not an option (for example, when using the screen command, you won’t be able to see the running mc when you attach to the copy screen from another session), we need to look for other solutions.

The other option is to open another session and compare the size of the destination dir (or file) with the size of the source dir (or file). Many sources mention this, but it breaks down when you’re dealing with a complex folder structure, because recursively computing the total size of the destination folder can seriously slow down the copy running in the other session, or even the entire system. If you are interested in this option anyway, a simple watch ls -lh DIR (in another session) could be a solution. See here for more information.
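The size comparison described above can be sketched as a small shell function. This is only an illustration: copy_percent is a hypothetical helper name, it assumes du -sk and awk are available on your system, and every call walks both directory trees, so it carries exactly the slowdown risk mentioned above and should be run rarely, not in a tight loop.

```shell
# Hypothetical helper: rough percentage of SOURCE already present in DEST.
# Assumes du -sk and awk exist; both trees are scanned on every call.
copy_percent() {
    src_kb=$(du -sk "$1" | awk '{print $1}')   # source size in KiB
    dst_kb=$(du -sk "$2" | awk '{print $1}')   # destination size in KiB
    [ "$src_kb" -gt 0 ] || return 1            # avoid dividing by zero
    awk -v d="$dst_kb" -v s="$src_kb" 'BEGIN { printf "%d%%\n", d * 100 / s }'
}

# Usage (in a second session, while cp runs in the first one):
#   copy_percent /path/to/source /path/to/destination
```

The result is only an estimate, because du counts allocated blocks and the two filesystems may allocate space differently.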

If you’re an IT geek interested in technical details, then here you’ll find a very nice explanation of why cp (and similar commands) has no progress bar or similar functionality implemented by default.

Before you start

You can of course (as with many, many command-line tools) use the -v option to make cp more verbose. This will not give you a percentage, the copy speed or an estimated time to completion, but it will at least print every file as it is being copied. So, this is the easiest “progress-like” solution when copying entire folders:

[code language=”shell”]
cp -rv /path/to/source/ /path/to/destination/
[/code]

In this case (as in the example above), remember to include the -r option (recursive copy) and to end each path with / (or else some strange symlinks may appear instead of a real copy).
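If a bare list of file names is not enough, the verbose output can be turned into a rough counter. The sketch below is an assumption-laden illustration and not something taken from the sources mentioned in this post: count_progress is a hypothetical helper name, and it assumes that cp -v prints one line per copied entry, which is typical but may differ between cp builds.

```shell
# Hypothetical helper: counts lines arriving on stdin and prints "N/TOTAL".
# Assumes each line of cp -v output corresponds to one copied entry.
count_progress() {
    total=$1
    n=0
    while IFS= read -r line; do
        n=$((n + 1))
        printf '\r%d/%d' "$n" "$total"   # overwrite the same screen line
    done
    printf '\n'
}

# Usage: count the entries first, then pipe the verbose output through.
#   total=$(find /path/to/source/ | wc -l)
#   cp -rv /path/to/source/ /path/to/destination/ | count_progress "$total"
```

Counting the entries up front costs one extra pass over the source tree, and the final number can be slightly off if cp does not report directories the same way find does.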

If using cp is not mandatory, you may consider other commands for copying large file sets that provide a visual indication of copy progress.

This AskUbuntu answer gives you a few alternatives to the default cp command, including pv, gcp, dd, rsync and even curl. Out of these, I found only the last three (dd, rsync and curl) available on the QNAP TS-210.

Using curl for local file transfers (quite contrary to what it was designed for, don’t you think?) is discussed here. Although this command is available on the TS-210, I didn’t manage to get it to work. It seems to fail when you specify file and folder paths containing spaces, even if you wrap them in "". On the other hand, if you just want to copy one large file, this could be an interesting idea, as curl gives you a lot of information about the copy process (including three percentages, three speeds: current, average download and average upload, and three times: total, spent and left).
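One possible cause of the problem with spaces (an assumption, not verified on the TS-210) is that curl reads a local file through a file:// URL, and URLs expect spaces to be percent-encoded as %20 no matter how the shell quotes them. A minimal sketch of that idea follows; file_url is a hypothetical helper name and it encodes only the percent sign and the space, not every special character.

```shell
# Hypothetical helper: turns an absolute path into a file:// URL.
# Only '%' and ' ' are encoded here; other special characters are left alone.
file_url() {
    printf 'file://%s\n' "$(printf '%s\n' "$1" | sed 's/%/%25/g; s/ /%20/g')"
}

# Usage (the destination is a plain path, so normal quoting is enough):
#   curl -o "/path/to/destination/big file.iso" \
#        "$(file_url "/path/to/source/big file.iso")"
```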

dd is available on QNAP, but the example given here uses pv (which is missing on QNAP) and is clearly oriented towards copying one large file rather than a folder structure, so I skipped this part.

pv isn’t available on QNAP, but if you consider using it on your system, keep in mind that this method loses the files’ permissions and ownership. Files copied this way will have the same permissions as if you had created them yourself, and will belong to you. If this is not acceptable, rsync (described below) is suggested as an alternative to pv.

More methods, examples and solutions can be found in the AskUbuntu answer mentioned above.

Using rsync

As mentioned here, rsync may be the best option, as it shows progress, copy speed and similar details. Use for example:

[code language=”shell”]
rsync -a --append --progress /path/to/source/ /path/to/destination/
[/code]

to get detailed information about each file being copied from /path/to/source/ to /path/to/destination/. Again, remember to end both paths with /.

Side note: rsync always shows the progress of the file currently being copied, not the overall progress. This is mainly because rsync does not scan the entire directory tree before copying (as mc would do), so it does not know the total size or number of files it is about to copy. Instead, it updates this information as it progresses through the directories. You can see it while observing the on-screen results:

[code language=”shell”]
(xfer#2744, to-check=1016/3858)
[/code]

where the number after xfer# is the number of files transferred so far, and to-check=XXXX/YYYY shows how many files are still left to check out of the total number of files queued so far. If you observe the copy process for some time, you can easily see that the last number (total files to process) increases as rsync progresses through your directory structure.
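The to-check counter can be turned into a rough percentage with a few lines of shell. This is a sketch, not a feature of rsync: rsync_percent is a hypothetical helper name, it assumes the to-check=LEFT/TOTAL format shown above, and because the total keeps growing while rsync discovers more files, the number it prints can move backwards.

```shell
# Hypothetical helper: reads rsync --progress output on stdin and prints
# an estimate for every line that carries a to-check=LEFT/TOTAL counter.
rsync_percent() {
    while IFS= read -r line; do
        case $line in
            *to-check=*/*)
                rest=${line##*to-check=}     # e.g. "1016/3858)"
                left=${rest%%/*}             # files still to check
                total=${rest#*/}
                total=${total%%[!0-9]*}      # drop the trailing ")"
                case $left in ''|*[!0-9]*) continue ;; esac
                [ -n "$total" ] && [ "$total" -gt 0 ] || continue
                echo "$(( (total - left) * 100 / total ))% of files found so far"
                ;;
        esac
    done
}

# Usage:
#   rsync -a --append --progress /path/to/source/ /path/to/destination/ | rsync_percent
```

For the sample line above, 3858 - 1016 = 2842 files are done out of 3858 known so far, which the function reports as 73%.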

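So, there is basically no way to get this rsync to display the progress of the entire process (rsync 3.1.0 and later add an --info=progress2 option for this, but the output format above suggests an older release). If you need that, then getting away from the command line and using tools like Midnight Commander might turn out to be the only option. Especially on QNAP! :]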

After you start

OK, but what about checking the progress of a cp command that has already been started? Well, that is also possible in several ways and, surprisingly, one of them is even available on QNAP! :]

Many of them are discussed in this Unix & Linux Stack Exchange question, and especially in this answer and this one. Of the three methods mentioned there, the lsof method seems to work on QNAP.

All you need is the PID of the running cp command, which you can get by executing top, immediately stopping it with Ctrl+C and reading the generated table, matching the first (PID) column against the last (COMMAND) column. cp should usually be near the top, or even listed first, because by default top orders processes by CPU consumption, and copying a large set of folders is surely one of the most CPU-consuming tasks. If cp is not in this list, the copy process has most likely already finished.

Note: Normal Linux users would most likely use pgrep -x cp or something similar instead of playing with top. But since we’re on QNAP, which means we’re not normal Linux users, we simply don’t have the pgrep command, right? :]
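Where pgrep is missing, a similar lookup can be approximated by walking /proc. The sketch below is an illustration under assumptions: find_pids is a hypothetical helper name, it assumes a Linux-style /proc with a cmdline file per process and a tr command, and it matches only on the program name in the first argument.

```shell
# Hypothetical pgrep -x stand-in: prints PIDs whose program name equals $1.
# Assumes /proc/PID/cmdline exists and holds NUL-separated arguments.
find_pids() {
    name=$1
    for d in /proc/[0-9]*; do
        [ -r "$d/cmdline" ] || continue
        # argv[0] is everything before the first NUL byte
        arg0=$(tr '\0' '\n' 2>/dev/null < "$d/cmdline" | head -n 1)
        if [ "${arg0##*/}" = "$name" ]; then
            echo "${d#/proc/}"
        fi
    done
}

# Usage:
#   find_pids cp
```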

Having cp PID allows you to execute:

[code language=”shell”]
lsof -p [cp’s PID]
[/code]

For example:

[code language=”shell”]
lsof -p 30773
[/code]

This will let you know which file is being copied at the moment. You can also check (among a bunch of other more or less useful information) the OFFSET column to see how many bytes cp has already read and written for the current file.
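On Linux, the same two pieces of information (the open file and the current offset) can usually also be read directly from /proc, which makes it possible to print a percentage per file. This is a sketch of an alternative to lsof, not the method from the answers mentioned above: fd_progress is a hypothetical helper name, and it assumes that /proc/PID/fdinfo exists on your kernel and that readlink, ls and awk are available.

```shell
# Hypothetical helper: for every regular file opened by PID, prints the
# current offset, the file size and the ratio between them.
# Assumes /proc/PID/fd and /proc/PID/fdinfo are available (Linux only).
fd_progress() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        target=$(readlink "$fd") || continue
        [ -f "$target" ] || continue                      # regular files only
        pos=$(awk '/^pos:/ {print $2}' "/proc/$pid/fdinfo/${fd##*/}" 2>/dev/null)
        size=$(ls -ln "$target" | awk '{print $5}')       # 5th column is the size
        if [ -n "$pos" ] && [ -n "$size" ] && [ "$size" -gt 0 ]; then
            pct=$(awk -v p="$pos" -v s="$size" 'BEGIN { printf "%d", p * 100 / s }')
            echo "$target: $pos of $size bytes ($pct%)"
        fi
    done
}

# Usage:
#   fd_progress 30773
```

For a running cp, the line for the source file is the interesting one. The destination file grows as it is written, so its offset and size stay almost equal and it will show close to 100% all the time.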

More solutions and examples, most of which will probably not work on QNAP, can be found here and here.

See this answer, or the few examples given in this forum here, here and here, for solutions using Bash and/or other scripting languages. They can give you any information you want, but they are far more coding-heavy than the everyday solutions presented above.
