A quick sanity testing of PostgreSQL parallel query on Arm64

Query parallelism is supported in PostgreSQL since quite a while now. People in the PG community call it "Parallel query", but by now it is not limited to just SELECT queries. Index build leverages multiple cores; and even utilities like VACUUM now make use of parallelism. Furthermore, community is working on parallelizing COPY and INSERTs.                                                                                  I was interested to do kind-of "sanity" check of this capability specifically on ARM64 platform. Let's see how it goes. And also at the same time, we will try to understand little bit of how to interpret the parallelism part of the plan output. Subqueries and partitions are not covered in this blog; probably I will add it in another blog.                                                                                  For running the queries I generated a scale-5 TPC-H benchmark schema with the help of scripts taken from

Need for external compression methods in PostgreSQL

 Every modern database system has some way to compress its data at some level. The obvious reason for this feature is to reduce the size of it's database, especially in today's world where the data is growing exponentially. The less obvious reason is to improve query performance; the idea is: smaller data size means less data pages to scan, which means lesser disk i/o and faster data access. So, in any case, data de-compression should be fast enough so as not to hamper the query performance, if not improve it. Compression is offered at different levels : page compression, row compression, column compression, etc. Columnar databases have the advantage of a very high compression ratio of its column because of presence of a repetetive pattern of contiguous data in a column. Another case is when, in a row oriented database, the column values are so large that it makes sense to compress individual values of the column. Such values can even be kept separately if they do not fit in a

Backtraces in PostgreSQL

PostgreSQL 13 has introduced a simple but extremely useful capability to log a stack trace into the server logs when an error is reported. Let's see the details. There is a GUC to enable stacktrace generation : backtrace_functions. Set it to a comma-separated function names. SET backtrace_functions TO 'func1,func2'; If the error is thrown from one of these functions, a backtrace will be generated and logged into the server log. Note that only superusers can set the backtrace_functions GUC. It can be set locally in a session, or can be included in postgresql.conf file to globally set it. It's easy to see how it would help in a situation where a customer reports an error message. We can find from where it came from by grep'ing for it in the source code. But beyond that, it was all guess work. Not anymore. Now, you can ask the customer to set backtrace_functions to all such functions which are emitting this error message, and get the stack trace. In most cases, the roo

Leveraging SIMD Vectorization

With the advent of column store databases, there was an urge to make use of SIMD vector processing. It naturally fits into the way table data is arranged. Let's first briefly check what is SIMD. It stands for Single Instruction Multiple Data. Today, CPU instructions support this kind of mechanism where the same instruction can be executed simultaneously on multiple data elements. E.g. Say, you want to double all the column values. Or remove the red component of the RGB values of pixels of an image. For large data, these operations are CPU bottlenecks. So SIMD cuts the CPU time significantly by operating simultaneously on 2, 4, 8, 16 or 32 (or more) data elements depending on the size of each data element. So suppose we want to do "arr[i] *= 2" for each element of "int32 arr[]". Normally we would iterate through each of the elements for doing this operation. In the generated assembly code, MUL instruction will be run on each of the elements. With SIMD, we would

PostgreSQL on ARM

In my last blog , I wrote that applications that have been running on x86 might need to undergo some adaptation if they are to be run on a different architecture such as ARM. Let's see what it means exactly. Recently I have been playing around with PostgreSQL RDBMS using an ARM64 machine. A few months back, I even didn't know whether it can be compiled on ARM, being oblivious of the fact that we already have a regular build farm member for ARM64 for quite a while . And now even the PostgreSQL apt repository has started making PostgreSQL packages available for ARM64 architecture. But the real confidence on the reliability of PostgreSQL-on-ARM came after I tested it with different kinds of scenarios. I started with read-only pgbench tests and compared the results on the x86_64 and the ARM64 VMs available to me. The aim was not to compare any specific CPU implementation. The idea was to find out scenarios where PostgreSQL on ARM does not perform in one scenario as good as

ARM: Points to be noted

The story of ARM began in 1993 with a joint venture of Apple with ARM (then Acorn RISC Machines) to launch the "Apple Newton" handheld PC. And the story continues today with news that Apple is going to switch their MACs to ARM processors.  What has not changed in the story is ARM's reputation as a power-efficient processor. This is the primary reason why it is so popular in smarthphones, and why it has made its way into smart cars, drones and other internet-of-things devices where it is crucial to preserve battery life and minimize heat generation. Today even data centers can run on ARM. Due to such widespread market disruption happening, I thought about putting some specific points which I think are good-to-know for users and software developers who have just begun using the ARM ecosystem ... The reason why ARM power consumption is less has to do with the inherent nature of RISC architecture on which ARM is based. RISC instructions are so simple that each of them req