## Thursday, September 6, 2018

### What makes posits so good?

Computational science and all its applied forms, statistics, Deep Learning, computational finance, computational biology, etc., depend on a discrete model of the equations that describe the system under study, and some operator to extract information from that model. The vast majority of these operators are expressed in linear algebra form, and the dominant sub-operation in these operators is a dot product between two vectors. With the advent of big data and Deep Learning, tensor expressions have become widespread, but the basic operation remains squarely the dot product, as can be observed in the key operator that DL accelerators like the Google TPU implement in hardware.

Let's take a closer look at the numerical properties of that dot product. Let:

vector a = ( 3.2e8, 1.0, -1.0, 8.0e7)

vector b = ( 4.0e7, 1.0, -1.0, -1.6e8)

What would be the value of dot(a, b)? Simple inspection confirms that a₁b₁ (= 1.28e16) cancels a₄b₄ (= -1.28e16), and thus the answer is 2.

Interestingly enough, neither 32-bit single nor 64-bit double IEEE floating point produces the correct answer when calculating this dot product in order. In contrast, even a very small posit, a 16-bit posit with 2 exponent bits, is able to produce the correct result. What gives?
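This is easy to reproduce: Python's built-in float is a 64-bit IEEE double, so a straightforward in-order evaluation of the dot product shows the failure (a sketch for illustration, no posit library involved):

```python
# In-order dot product of the example vectors in 64-bit IEEE doubles
# (Python's built-in float). The exact mathematical answer is 2.
a = [3.2e8, 1.0, -1.0, 8.0e7]
b = [4.0e7, 1.0, -1.0, -1.6e8]

s = 0.0
for x, y in zip(a, b):
    s += x * y  # rounds after every multiply and every add

print(s)  # 0.0 -- the two small +1 contributions are lost
```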

The individual element values of the vectors are relatively ordinary numbers. But in dot products it is the dynamic range of the intermediate products that the number system needs to capture, and this is where IEEE floating point fails us. IEEE floating point rounds after each multiplication and each addition, and thus the order of the computation can change the result. This is particularly troubling for concurrent systems, such as multi-core and many-core, where the programmer typically gives up control over evaluation order.

In our particular example, the products a₁b₁ and a₄b₄ represent 1.28e16 and -1.28e16 respectively. Representing the intermediate sum 1.28e16 + 1 exactly requires 54 bits of significand, one more than the 53 bits a 64-bit IEEE double provides. The small terms are therefore absorbed by rounding, and once the two large products cancel, what remains is an incorrect sum.
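Both effects, absorption and order dependence, can be demonstrated directly in any IEEE-754 double environment (Python floats shown here):

```python
# 1.28e16 + 1 needs 54 significand bits; a double has 53, so the 1 is absorbed.
print(1.28e16 + 1.0 == 1.28e16)  # True

# Because every addition rounds, the summation order changes the answer.
products = [1.28e16, 1.0, 1.0, -1.28e16]   # the four products a_i * b_i
in_order  = ((products[0] + products[1]) + products[2]) + products[3]
reordered = ((products[0] + products[3]) + products[1]) + products[2]
print(in_order)   # 0.0
print(reordered)  # 2.0 -- the mathematically correct result
```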

Posits, on the other hand, have access to a fused dot product that uses a quire: a super-accumulator that enables the computation to defer rounding until the end of the dot product. This makes it possible for a 16-bit posit to beat a 64-bit IEEE double.
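Standard Python has no quire, but `math.fsum`, which also defers rounding until the end of a summation, serves as a convenient software analogue of what the posit fused dot product does in hardware:

```python
import math

a = [3.2e8, 1.0, -1.0, 8.0e7]
b = [4.0e7, 1.0, -1.0, -1.6e8]

# Each individual product happens to be exact in double precision here;
# fsum then accumulates them without intermediate rounding, like a quire.
products = [x * y for x, y in zip(a, b)]
print(math.fsum(products))  # 2.0
```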

In general, posits with their finer control over precision and dynamic range, present an improvement over IEEE floating point for computational science applications. In particular, business intelligence and decision support systems benefit from posits and their quires, as statistics algorithms tend to aggregate lots of data to render an assessment on which a decision depends. As the example above demonstrates, that assessment can be very wrong when using the wrong approach.

## Wednesday, September 5, 2018

### Posit Number System: replacing IEEE floating point

The leaders in computing are now profiting from their investments in new number systems initiated half a decade ago. NVIDIA transformed the AI ecosystem with its 16-bit, half-precision float, providing a boost over Intel and enabling it to gain market share in the valuable data center market. Google designed the Tensor Processing Unit to accelerate AI cloud services; the TPU uses an 8-bit integer format to create a 100x benefit over its competitors. Microsoft is using an 8-bit floating point with 2-bit exponents for its Project Brainwave. And China’s Huawei is using a fixed-point format for its 4G/5G base stations to gain a performance-per-Watt benefit over its US competitors who still use IEEE floating point.

All these companies realized that Moore’s Law and Dennard scaling have reached a plateau, and the efficiency of computation is now a direct limiter on performance, scaling, and power. For Deep Learning specifically, and High-Performance Computing in general, IEEE floating point has shown its deficiencies in silicon efficiency, information density, and even mathematical accuracy.

The posit number system is positioned as a replacement for IEEE floating point, and offers significant improvements in performance per Watt, information density, and reproducibility. The posit number system is a tapered floating point with a very efficient encoding of real numbers. It has only two exceptional values: zero and NaR (not-a-real). The posit encoding improves precision compared to floats of the same bit-width, which leads to higher performance and lower cost for big-data applications. Furthermore, the posit standard defines rules for reproducibility in concurrent environments, enabling higher productivity and lower cost for software application development for multi-core and many-core deployments.
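To make the tapered encoding concrete, here is a minimal decoder sketch in Python, written from the published posit definition (sign bit, a run-length-encoded regime, up to es exponent bits, remaining bits as fraction); it is an illustration, not a production codec or any particular library's API:

```python
import math

def decode_posit(bits, nbits=8, es=2):
    """Decode an nbits-wide posit bit pattern (as an unsigned int) to a float.
    Sketch for illustration only."""
    mask = (1 << nbits) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (nbits - 1):            # NaR: not-a-real
        return float('nan')
    sign = 1.0
    if bits >> (nbits - 1):                 # negative: take two's complement
        sign = -1.0
        bits = (-bits) & mask
    rest = bits & ((1 << (nbits - 1)) - 1)  # everything after the sign bit
    # Regime: a run of identical bits, terminated by the opposite bit.
    first = (rest >> (nbits - 2)) & 1
    run, i = 0, nbits - 2
    while i >= 0 and ((rest >> i) & 1) == first:
        run += 1
        i -= 1
    regime = (run - 1) if first else -run
    i -= 1                                  # skip the regime terminator
    # Exponent: up to es bits; bits that run off the end read as zero.
    e, got = 0, 0
    while got < es and i >= 0:
        e = (e << 1) | ((rest >> i) & 1)
        got += 1
        i -= 1
    e <<= (es - got)
    # Fraction: whatever bits remain, with hidden bit 1.
    f, nf = 0, 0
    while i >= 0:
        f = (f << 1) | ((rest >> i) & 1)
        nf += 1
        i -= 1
    frac = 1.0 + f / (1 << nf)
    useed = 2.0 ** (2 ** es)                # 16 when es = 2
    return sign * useed ** regime * 2.0 ** e * frac

print(decode_posit(0b01000000))  # 1.0
print(decode_posit(0b01001000))  # 2.0
print(decode_posit(0b11000000))  # -1.0
```

Note how the regime run length buys dynamic range at the cost of fraction bits: patterns near 1.0 keep the most fraction bits (highest precision), while the extremes trade them away, which is exactly the tapering described above.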

In the following blog posts, we will introduce the posit number system format, and report on experiments comparing IEEE floats to posits in real applications. Here is a reference to a full software environment for you tinkerers: http://stillwater-sc.com/assets/content/stillwater-universal-sw.html