# Paired single

Paired singles are a unique part of the Gekko/Broadway processors used in the GameCube and Wii. They provide fast vector math by keeping two single-precision floating-point numbers (PS0 and PS1) in a single floating-point register and doing math on both at once. This page demonstrates how these instructions work.

## Quantization and Dequantization

Values are converted on their way into and out of paired-single registers: loading dequantizes and storing quantizes. To allow greater flexibility when converting from non-float types, the conversion includes a form of scaling. All quantization is controlled by the GQRs (Graphics Quantization Registers), which are 32-bit registers containing the conversion types and scaling factors for loading and storing.

GQR layout:

| Bits | Access | Field |
|---|---|---|
| 31-30 | U | - |
| 29-24 | R/W | L_Scale |
| 23-19 | U | - |
| 18-16 | R/W | L_Type |
| 15-14 | U | - |
| 13-8 | R/W | S_Scale |
| 7-3 | U | - |
| 2-0 | R/W | S_Type |

| Field | Description |
|---|---|
| L_* | Values for dequantization. |
| S_* | Values for quantization. |
| Scale | Signed. During dequantization, divide the number by 2^scale. During quantization, multiply the number by 2^scale. |
| Type | 0: Float (no scaling during quantization or dequantization), 4: Unsigned 8-bit, 5: Unsigned 16-bit, 6: Signed 8-bit, 7: Signed 16-bit. |
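
To make the field layout concrete, here is a minimal Python sketch that splits a raw GQR value into its four fields. It assumes L_Scale sits in bits 29-24, L_Type in bits 18-16, S_Scale in bits 13-8 and S_Type in bits 2-0, and that the scale is a 6-bit two's-complement number. The names `decode_gqr` and `sign_extend_6` are hypothetical helpers, not part of any SDK.

```python
def sign_extend_6(value):
    """Interpret a 6-bit field as a two's-complement signed integer."""
    return value - 64 if value & 0x20 else value


def decode_gqr(gqr):
    """Split a 32-bit GQR value into its four fields (assumed bit layout)."""
    return {
        "l_scale": sign_extend_6((gqr >> 24) & 0x3F),
        "l_type": (gqr >> 16) & 0x7,
        "s_scale": sign_extend_6((gqr >> 8) & 0x3F),
        "s_type": gqr & 0x7,
    }


# Load as signed 16-bit with scale 8; store as float with no scaling.
fields = decode_gqr(0x08070000)
```

With this value, a load would divide each signed 16-bit input by 2^8, while a store would write plain floats.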

To load and store paired singles, use the psq_l and psq_st instructions respectively, or one of their variants.

### psq_l

```psq_l      frD, d(rA), W, I
```

This instruction dequantizes values from memory at address d+(rA|0) and places them in PS0 and PS1 of frD. If W is 1, it dequantizes only one value and places it in PS0; PS1 is then always set to 1.0. I selects the GQR that supplies the dequantization parameters. The two values are read from consecutive memory locations, whatever their size. For example, if the GQR specifies a u16 load, d+(rA|0) should point to a two-element array of u16s.
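
The load behaviour described above can be modelled in a few lines of Python. This is a reference sketch rather than Gekko code: the function takes the already-decoded L_Type and L_Scale instead of a GQR index, assumes big-endian memory, and uses Python doubles where the hardware holds single-precision values. The name `psq_l` is reused only for readability.

```python
import struct

# struct format and element size for each GQR type code
FORMATS = {0: (">f", 4), 4: (">B", 1), 5: (">H", 2), 6: (">b", 1), 7: (">h", 2)}


def psq_l(memory, address, w, l_type, l_scale):
    """Return (ps0, ps1) as a dequantizing load would produce them."""
    fmt, size = FORMATS[l_type]

    def load(addr):
        (raw,) = struct.unpack_from(fmt, memory, addr)
        # Floats are not scaled; integers are divided by 2**scale.
        return raw if l_type == 0 else raw / 2.0 ** l_scale

    ps0 = load(address)
    # With W set, only one element is read and PS1 becomes 1.0.
    ps1 = 1.0 if w else load(address + size)
    return ps0, ps1


data = struct.pack(">2H", 512, 768)   # a two-element u16 array
pair = psq_l(data, 0, 0, 5, 8)        # both elements, each divided by 2**8
single = psq_l(data, 0, 1, 5, 8)      # first element only, PS1 forced to 1.0
```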

##### psq_lx
```psq_lx     frD, rA, rB, W, I
```

This instruction acts exactly like psq_l, except instead of (rA) being offset by d, it is offset by (rB).

##### psq_lu
```psq_lu     frD, d(rA), W, I
```

This instruction acts exactly like psq_l, except rA cannot be 0, and d+(rA) is placed back into rA.

##### psq_lux
```psq_lux    frD, rA, rB, W, I
```

This instruction acts exactly like psq_lx, except rA cannot be 0, and (rA)+(rB) is placed back into rA.
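
The four addressing forms differ only in where the offset comes from and whether rA is written back. A small Python sketch of the address calculation, where `effective_address` is a hypothetical helper and `regs` stands in for the 32 general-purpose registers:

```python
def effective_address(regs, ra, d=0, rb=None, update=False):
    """Compute the address used by the plain, x, u and ux forms.

    regs is a list of 32 integer GPR values, d the displacement for the
    non-indexed forms, rb the index register number for the 'x' forms.
    """
    if update and ra == 0:
        raise ValueError("update forms require rA != 0")
    base = regs[ra] if ra != 0 else 0            # this is the (rA|0) rule
    offset = regs[rb] if rb is not None else d
    ea = (base + offset) & 0xFFFFFFFF
    if update:
        regs[ra] = ea                            # write the address back to rA
    return ea


regs = [0] * 32
regs[3], regs[4] = 0x80001000, 8
plain = effective_address(regs, 3, d=4)                  # psq_l style
indexed = effective_address(regs, 3, rb=4, update=True)  # psq_lux style
```

After the second call, `regs[3]` holds the new address, which is what makes the update forms convenient for walking through an array.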

### psq_st

```psq_st     frD, d(rA), W, I
```

This instruction quantizes the paired singles in frD and writes them to memory at address d+(rA|0). If W is 1, it quantizes and writes only PS0. I selects the GQR that supplies the quantization parameters. The two values are written to consecutive memory locations, whatever their size. For example, if the GQR specifies a u16 store, d+(rA|0) is treated as a two-element array of u16s.
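
A Python sketch of the store side, for comparison with the description above. It takes the decoded S_Type and S_Scale directly and assumes big-endian memory. Two details are assumptions that the text here does not confirm: that the scaled value is truncated toward zero, and that out-of-range results saturate to the limits of the target type.

```python
import struct

# struct format, minimum and maximum for each integer GQR type code
INT_TYPES = {4: (">B", 0, 255), 5: (">H", 0, 65535),
             6: (">b", -128, 127), 7: (">h", -32768, 32767)}


def quantize(value, s_type, s_scale):
    """Return the bytes a quantizing store would write for one lane."""
    if s_type == 0:
        return struct.pack(">f", value)            # floats are stored unscaled
    fmt, lo, hi = INT_TYPES[s_type]
    scaled = int(value * 2.0 ** s_scale)           # assumption: truncate toward zero
    return struct.pack(fmt, max(lo, min(hi, scaled)))  # assumption: saturate


def psq_st(memory, address, ps0, ps1, w, s_type, s_scale):
    """Write PS0 (and PS1 unless w is set) into a bytearray."""
    out = quantize(ps0, s_type, s_scale)
    if not w:
        out += quantize(ps1, s_type, s_scale)
    memory[address:address + len(out)] = out


buf = bytearray(4)
psq_st(buf, 0, 2.0, 3.0, 0, 5, 8)   # writes the u16 values 512 and 768
```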

##### psq_stx
```psq_stx    frD, rA, rB, W, I
```

This instruction acts exactly like psq_st, except instead of (rA) being offset by d, it is offset by (rB).

##### psq_stu
```psq_stu    frD, d(rA), W, I
```

This instruction acts exactly like psq_st, except rA cannot be 0, and d+(rA) is placed back into rA.

##### psq_stux
```psq_stux   frD, rA, rB, W, I
```

This instruction acts exactly like psq_stx, except rA cannot be 0, and (rA)+(rB) is placed back into rA.

## Single Parameter Operations

These instructions take a single source FPR, frB, and write the result to frD.

### ps_abs

```ps_abs     frD, frB
```
```frD(ps0) = abs(frB(ps0))
frD(ps1) = abs(frB(ps1))
```

### ps_mr

```ps_mr      frD, frB
```
```frD(ps0) = frB(ps0)
frD(ps1) = frB(ps1)
```

### ps_nabs

```ps_nabs    frD, frB
```
```frD(ps0) = -abs(frB(ps0))
frD(ps1) = -abs(frB(ps1))
```

### ps_neg

```ps_neg     frD, frB
```
```frD(ps0) = -frB(ps0)
frD(ps1) = -frB(ps1)
```

### ps_res

```ps_res     frD, frB
```
```frD(ps0) = 1/frB(ps0)
frD(ps1) = 1/frB(ps1)
```

The result is an estimate, accurate to within 1 part in 4096.

### ps_rsqrte

```ps_rsqrte  frD, frB
```
```frD(ps0) = 1/sqrt(frB(ps0))
frD(ps1) = 1/sqrt(frB(ps1))
```

The result is an estimate, accurate to within 1 part in 4096.

## Basic Math

Simple everyday math.

### ps_add

```ps_add     frD, frA, frB
```
```frD(ps0) = frA(ps0) + frB(ps0)
frD(ps1) = frA(ps1) + frB(ps1)
```

### ps_div

```ps_div     frD, frA, frB
```
```frD(ps0) = frA(ps0) / frB(ps0)
frD(ps1) = frA(ps1) / frB(ps1)
```

### ps_mul

```ps_mul     frD, frA, frC
```
```frD(ps0) = frA(ps0) * frC(ps0)
frD(ps1) = frA(ps1) * frC(ps1)
```

### ps_sub

```ps_sub     frD, frA, frB
```
```frD(ps0) = frA(ps0) - frB(ps0)
frD(ps1) = frA(ps1) - frB(ps1)
```

## Comparison

### ps_cmpo0

```ps_cmpo0   crfD, frA, frB
ps_cmpu0   crfD, frA, frB
```
```crfD = frA(ps0) compare frB(ps0)
```

### ps_cmpo1

```ps_cmpo1   crfD, frA, frB
ps_cmpu1   crfD, frA, frB
```
```crfD = frA(ps1) compare frB(ps1)
```
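
The descriptions above do not spell out what "compare" writes into crfD. The Python sketch below assumes the usual PowerPC floating-point compare encoding, a four-bit field ordered LT, GT, EQ, UN, which this page does not confirm for the paired-single forms. `ps_cmp` is a hypothetical helper. The sketch does not distinguish the ordered (ps_cmpo*) and unordered (ps_cmpu*) forms, which on PowerPC conventionally differ only in how NaN operands raise exceptions.

```python
import math


def ps_cmp(a, b):
    """Return the assumed 4-bit CR field (LT, GT, EQ, UN) for one lane."""
    if math.isnan(a) or math.isnan(b):
        return 0b0001          # UN: at least one operand is NaN
    if a < b:
        return 0b1000          # LT
    if a > b:
        return 0b0100          # GT
    return 0b0010              # EQ


def ps_cmpu0(fra, frb):
    """Compare the PS0 lanes of two (ps0, ps1) tuples."""
    return ps_cmp(fra[0], frb[0])


def ps_cmpu1(fra, frb):
    """Compare the PS1 lanes of two (ps0, ps1) tuples."""
    return ps_cmp(fra[1], frb[1])


lane0 = ps_cmpu0((1.0, 5.0), (2.0, 5.0))   # 1.0 < 2.0, so LT is set
lane1 = ps_cmpu1((1.0, 5.0), (2.0, 5.0))   # 5.0 == 5.0, so EQ is set
```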

## Complex Multiply

These instructions combine a multiplication with an addition or subtraction, or multiply both lanes by a single scalar.

### ps_madd

```ps_madd    frD, frA, frC, frB
```
```frD(ps0) = frA(ps0) * frC(ps0) + frB(ps0)
frD(ps1) = frA(ps1) * frC(ps1) + frB(ps1)
```

### ps_madds0

```ps_madds0  frD, frA, frC, frB
```
```frD(ps0) = frA(ps0) * frC(ps0) + frB(ps0)
frD(ps1) = frA(ps1) * frC(ps0) + frB(ps1)
```

### ps_madds1

```ps_madds1  frD, frA, frC, frB
```
```frD(ps0) = frA(ps0) * frC(ps1) + frB(ps0)
frD(ps1) = frA(ps1) * frC(ps1) + frB(ps1)
```

### ps_msub

```ps_msub    frD, frA, frC, frB
```
```frD(ps0) = frA(ps0) * frC(ps0) - frB(ps0)
frD(ps1) = frA(ps1) * frC(ps1) - frB(ps1)
```

### ps_muls0

```ps_muls0   frD, frA, frC
```
```frD(ps0) = frA(ps0) * frC(ps0)
frD(ps1) = frA(ps1) * frC(ps0)
```

### ps_muls1

```ps_muls1   frD, frA, frC
```
```frD(ps0) = frA(ps0) * frC(ps1)
frD(ps1) = frA(ps1) * frC(ps1)
```

### ps_nmadd

```ps_nmadd   frD, frA, frC, frB
```
```frD(ps0) = -(frA(ps0) * frC(ps0) + frB(ps0))
frD(ps1) = -(frA(ps1) * frC(ps1) + frB(ps1))
```

### ps_nmsub

```ps_nmsub   frD, frA, frC, frB
```
```frD(ps0) = -(frA(ps0) * frC(ps0) - frB(ps0))
frD(ps1) = -(frA(ps1) * frC(ps1) - frB(ps1))
```
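
As an illustration of why the scalar variants are useful, the following Python sketch models ps_muls0 and ps_madds1 on (ps0, ps1) tuples and chains them to multiply a 2x2 matrix by a 2-vector in two operations. The column-major layout and the name `mat2_mul_vec2` are choices made for this example only, and Python doubles stand in for single-precision values.

```python
def ps_muls0(a, c):
    """Multiply both lanes of a by PS0 of c."""
    return (a[0] * c[0], a[1] * c[0])


def ps_madds1(a, c, b):
    """Multiply both lanes of a by PS1 of c, then add b lane by lane."""
    return (a[0] * c[1] + b[0], a[1] * c[1] + b[1])


def mat2_mul_vec2(col0, col1, v):
    """Multiply a column-major 2x2 matrix by the vector v."""
    partial = ps_muls0(col0, v)          # col0 * v.x
    return ps_madds1(col1, v, partial)   # col1 * v.y + partial


# The matrix [[1, 2], [3, 4]] stored as columns, times the vector (5, 6)
result = mat2_mul_vec2((1.0, 3.0), (2.0, 4.0), (5.0, 6.0))
```

Each register holds one matrix column, so the vector components act as the scalars and no lane shuffling is needed.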

## Miscellaneous

Whatever doesn't fit into the other categories.

### ps_merge00

```ps_merge00 frD, frA, frB
```
```frD(ps0) = frA(ps0)
frD(ps1) = frB(ps0)
```

### ps_merge01

```ps_merge01 frD, frA, frB
```
```frD(ps0) = frA(ps0)
frD(ps1) = frB(ps1)
```

### ps_merge10

```ps_merge10 frD, frA, frB
```
```frD(ps0) = frA(ps1)
frD(ps1) = frB(ps0)
```

### ps_merge11

```ps_merge11 frD, frA, frB
```
```frD(ps0) = frA(ps1)
frD(ps1) = frB(ps1)
```
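
The merge instructions are the lane shuffles of the instruction set. A short Python sketch, modelling registers as (ps0, ps1) tuples, shows two common uses: swapping the lanes of one register and transposing a 2x2 matrix held in two registers. The variable names are illustrative only.

```python
def ps_merge00(a, b):
    """Take PS0 of a and PS0 of b."""
    return (a[0], b[0])


def ps_merge10(a, b):
    """Take PS1 of a and PS0 of b."""
    return (a[1], b[0])


def ps_merge11(a, b):
    """Take PS1 of a and PS1 of b."""
    return (a[1], b[1])


x = (1.0, 2.0)
swapped = ps_merge10(x, x)        # passing the same register twice swaps its lanes

row0, row1 = (1.0, 2.0), (3.0, 4.0)
col0 = ps_merge00(row0, row1)     # first column of the matrix
col1 = ps_merge11(row0, row1)     # second column of the matrix
```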

### ps_sel

```ps_sel     frD, frA, frC, frB
```
```if (frA(ps0) >= 0)
    frD(ps0) = frC(ps0)
else
    frD(ps0) = frB(ps0)
if (frA(ps1) >= 0)
    frD(ps1) = frC(ps1)
else
    frD(ps1) = frB(ps1)
```
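
Because ps_sel chooses per lane based on the sign of frA, it can replace a branch. The Python sketch below combines it with a subtraction to get a lane-wise maximum. `ps_max` is a hypothetical helper, not an instruction.

```python
def ps_sub(a, b):
    """Lane-wise a - b."""
    return (a[0] - b[0], a[1] - b[1])


def ps_sel(a, c, b):
    """Per lane: pick c where a >= 0, otherwise pick b."""
    return tuple(cv if av >= 0 else bv for av, cv, bv in zip(a, c, b))


def ps_max(x, y):
    """Lane-wise maximum: x - y >= 0 means x is at least as large as y."""
    return ps_sel(ps_sub(x, y), x, y)


largest = ps_max((1.0, 8.0), (4.0, 2.0))
```

Swapping the last two arguments of `ps_sel` in `ps_max` would give a lane-wise minimum instead.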

### ps_sum0

```ps_sum0    frD, frA, frC, frB
```
```frD(ps0) = frA(ps0) + frB(ps1)
frD(ps1) = frC(ps1)
```

### ps_sum1

```ps_sum1    frD, frA, frC, frB
```
```frD(ps0) = frC(ps0)
frD(ps1) = frA(ps0) + frB(ps1)
```
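
One typical use of ps_sum0 is a horizontal add, for example to finish a dot product. The Python sketch below models ps_mul and ps_sum0 on (ps0, ps1) tuples. `dot2` is a hypothetical helper, and Python doubles stand in for single-precision values.

```python
def ps_mul(a, c):
    """Lane-wise a * c."""
    return (a[0] * c[0], a[1] * c[1])


def ps_sum0(a, c, b):
    """PS0 = a.ps0 + b.ps1, PS1 = c.ps1."""
    return (a[0] + b[1], c[1])


def dot2(u, v):
    """Dot product of two 2-vectors, returned from the PS0 lane."""
    products = ps_mul(u, v)                           # (u.x * v.x, u.y * v.y)
    return ps_sum0(products, products, products)[0]   # add the two products


value = dot2((1.0, 2.0), (3.0, 4.0))
```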