Starbeamrainbowlabs


Pipes, /dev/shm, or a TCP socket: Which is faster?

I've been busy patching HAIL-CAESAR (a simplified 2D flood simulation program designed for HPC supercomputers) to make it more suitable for the scale of my PhD project. As part of this, I'm trying to use the standard input & output where possible to speed up data transfer for the pre- and post-processing steps, since I need to convert the data to and from different formats.

While working on this, it crossed my mind that there are actually a number of different ways of getting data in and out of a program, so I decided to do a quick (relatively informal) test to see which was fastest.

In my actual project, I'm going to be doing the following data transfers:

That's a lot of transferring. In particular the output of HAIL-CAESAR, which I'm currently writing directly to disk, appears to be absolutely enormous - due mainly to the hugely inefficient storage format used.

Anyway, the 3 mechanisms I'm putting to the test here are a pipe, /dev/shm, and a TCP socket.

If anyone can think of any other mechanisms for rapid inter-process communication, please do get in touch by leaving a comment below.

Pipe

I'm testing a pipe with the following command:

timeout --signal=SIGINT 30s dd if=/dev/zero status=progress | cat >/dev/null

The timeout --signal=SIGINT 30s bit lets it run for 30 seconds before stopping it with a SIGINT (the same as Ctrl + C). I'm reading from /dev/zero here, because I want to test the performance of the pipe and not be limited by the speed of random number generation if I were to use /dev/urandom.

Running this on my laptop resulted in a speed of ~396 MB/s.
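One caveat worth keeping in mind: dd uses 512-byte blocks by default, so the figure above probably includes a fair amount of per-call overhead. A minimal sketch of a variant with a larger block size, which pushes a fixed amount of data through the pipe and counts what comes out the other end, might look like this. The bs=1M and count=64 values are picked purely for illustration (they are not the settings used in the test above), and the bs=1M syntax assumes GNU dd:

```shell
#!/usr/bin/env bash
# Push a fixed amount of data through a pipe and count the bytes that arrive.
# bs=1M and count=64 are illustrative values (assumption), giving 64 MiB in total.
bytes_received=$(dd if=/dev/zero bs=1M count=64 2>/dev/null | wc -c)

echo "Received ${bytes_received} bytes through the pipe"
```

Wrapping the dd pipeline in time would give a rough throughput figure to compare against the one above.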

/dev/shm

/dev/shm is the shared memory area on Linux - and is usually backed by a tmpfs file system (i.e. an in-memory ramdisk).
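Since "usually" is doing some work in that sentence, it may be worth checking what actually backs /dev/shm on a given machine. A small sketch of such a check, assuming GNU stat is available (fs_type is just a helper name made up for this example):

```shell
#!/usr/bin/env bash
# Print the filesystem type backing a given path.
# Assumes GNU stat; fs_type is a hypothetical helper name for this example.
fs_type() {
    stat -f -c %T "$1"
}

# Fall back to a placeholder if /dev/shm doesn't exist on this system
shm_type=$(fs_type /dev/shm 2>/dev/null || echo "unavailable")
echo "/dev/shm is backed by: ${shm_type}"
```

If this prints tmpfs, then files written there live in memory (or swap) rather than on a physical disk.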

Here are the commands I'm using to test this:

dd if=/dev/zero of=/dev/shm/test-1gb bs=1024 count=1000000
dd if=/dev/shm/test-1gb of=/dev/null bs=1024 count=1000000

This writes a 1GB file to /dev/shm, and then reads it back again (to be consistent with the pipe test). To calculate the overall MB/s speed, we need to know the time it took to do the read and write operations. I observed the following:

Operation  Speed     Time
Write      692 MB/s  1.4788s
Read       890 MB/s  1.1501s

...so that's 2.6289s in total. Dividing 1GB by the total time then gives an overall transfer speed of ~380 MB/s. This seemed quite variable though: when I ran the same test the other day, I only got ~273 MB/s.
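The arithmetic above can be reproduced in the shell. This is only a sketch using the timings from the table; the variable names and the mb_per_s helper are made up for the example, and the first figure treats 1GB as 1000 MB:

```shell
#!/usr/bin/env bash
# Timings taken from the write/read table above
write_s=1.4788
read_s=1.1501

# Total time for the write followed by the read
total_s=$(LC_ALL=C awk -v w="$write_s" -v r="$read_s" 'BEGIN { printf "%.4f", w + r }')

# mb_per_s is a hypothetical helper: megabytes divided by seconds, to 1 decimal place
mb_per_s() {
    LC_ALL=C awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.1f", mb / s }'
}

# Treating the file as a round 1000 MB
speed=$(mb_per_s 1000 "$total_s")

# Counting every byte: bs=1024 count=1000000 is 1,024,000,000 bytes, i.e. 1024 MB
exact_speed=$(mb_per_s 1024 "$total_s")

echo "Total: ${total_s}s, ${speed} MB/s (round 1GB), ${exact_speed} MB/s (exact byte count)"
```

Strictly speaking, bs=1024 count=1000000 writes 1,024,000,000 bytes rather than a round 1GB, so counting every byte nudges the figure up slightly. Either way, the difference is much smaller than the run-to-run variation.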

TCP Socket

Finally, to test a TCP socket, I devised the following:

nc -l 8888 >/dev/null &
timeout --signal=SIGINT 30s dd status=progress if=/dev/zero | nc 127.0.0.1 8888

The first line sets up the listener, and the second line is the sender. As with the pipe test, I'm stopping it after 30 seconds. It took a moment to stabilise, but towards the end it levelled off at ~360 MB/s.

Conclusion

After running the 3 tests, the results were as follows:

Test        Speed
Pipe        396 MB/s
/dev/shm    380 MB/s
TCP Socket  360 MB/s

According to this, the pipe (i.e. writing to the standard output and reading from the standard input) is the fastest. This isn't particularly surprising (since the other methods have overhead), but interesting to test all the same. Here's a quick graph of that:

[Figure: A quick bar chart of the above data]
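To put the gaps in perspective, the relative differences can be worked out from the figures in the results table. A quick sketch (percent_slower is a helper name invented for this example):

```shell
#!/usr/bin/env bash
# Speeds in MB/s, taken from the results table
pipe=396
shm=380
tcp=360

# percent_slower is a hypothetical helper: how much slower $2 is than $1, as a percentage
percent_slower() {
    LC_ALL=C awk -v base="$1" -v other="$2" 'BEGIN { printf "%.1f", (base - other) / base * 100 }'
}

shm_gap=$(percent_slower "$pipe" "$shm")
tcp_gap=$(percent_slower "$pipe" "$tcp")

echo "/dev/shm is ${shm_gap}% slower than the pipe"
echo "The TCP socket is ${tcp_gap}% slower than the pipe"
```

Given that the /dev/shm result alone varied between ~273 MB/s and ~380 MB/s across runs, gaps of this size should be treated as indicative rather than conclusive.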

Of course, there are other considerations to take into account. For example, if you need scalable multi-core processing, then /dev/shm or TCP sockets might be a better option, despite the additional overhead. TCP sockets are especially suited to this, since Linux has a special mechanism (the SO_REUSEPORT socket option) that lets multiple processes listen on the same port and load-balances incoming connections between them.

Other CPU architectures may affect the results too, due to different CPU instructions being available - I ran these tests on Ubuntu 19.10 on the Intel Core i7-7500U in my laptop.

As of yet I'm unsure how much post-processing the data coming from HAIL-CAESAR will require, and whether it will need multiple processes to handle the load. I hope not: HAIL-CAESAR is written in C++, and TCP sockets would be awkward and messy to implement there (you would probably have to use the low-level socket API, and I don't have any experience with networking in C++ yet). On top of that, the HPC in question doesn't appear to have inotifywait installed, which would have made listening for file writes on disk easier.
