View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000328 | JVT JM H.264/AVC reference software | decoder | public | 2013-07-11 13:33 | 2013-09-02 21:22 |
Reporter | Dag Ågren | Assigned To | Karsten Suehring | ||
Priority | normal | Severity | crash | Reproducibility | always |
Status | resolved | Resolution | fixed | ||
Platform | ARM, DSP | OS | Bare metal | ||
Product Version | JM 18.5 | ||||
Fixed in Version | JM 18.6 | ||||
Summary | 0000328: Loop filter relies on undefined behavior | ||||
Description | The loop filter does a large number of casts from byte pointers to int and int64 pointers. It relies on ints containing four bytes in little-endian order, and unaligned reads and writes being possible. This is not the case on several platforms. It causes crashes on ARM, and incorrect decoding on TI TMS320C64+ DSPs. In many places, the code casts a byte pointer to an int pointer in order to check four bytes in parallel. This does not work if an int does not contain exactly four bytes. This seems to be the case at least on the TI C64+ DSP platform, where this will cause incorrect decoding. There are a few similar cases with casts to int64, to process eight bytes in parallel. In a few places, the code casts a byte pointer to an int pointer in order to write four bytes at a time. This also does not work under the same circumstances as above. And in a few other places, the parallel writing code mentioned above has apparently been incorrectly copied into places where only a single byte needs to be written. This causes the three following bytes to be zeroed, which will write outside the bounds of the buffer. It will also not work on a big-endian platform, where it will just fill the buffer with zeroes. Furthermore, it will do unaligned 32-bit writes, which ARM does nor support, causing a crash. (This crash seems not to happen when running under Linux on ARM, but only when we tried running on the bare metal. It may be that Linux is catching the exceptions caused by unaligned reads and emulating the reads in software. This would be very, very slow, however.) | ||||
Additional Information | A diff is included that removes all pointer casts, and also removes the use of memset and memcpy. We have not tested conclusively whether memset and memcpy cause problems, but these were changed just to be on the safe side. The code is also somewhat clearer without them, and a good compiler would optimize those anyway. The comments for some of the corrected lines were also incorrect, and we have attempted to fix them, too. | ||||
Tags | No tags attached. | ||||
|
loopfilter.diff (16,097 bytes)
diff -r 33318c0be44f ldecod/src/loopFilter.c --- a/ldecod/src/loopFilter.c Thu Jul 11 13:28:39 2013 +0300 +++ b/ldecod/src/loopFilter.c Thu Jul 11 14:11:06 2013 +0300 @@ -203,7 +203,6 @@ int edge; byte Strength[16]; - int64 *p_Strength64 = (int64 *) Strength; short mb_x, mb_y; int filterNon8x8LumaEdgesFlag[4] = {1,1,1,1}; @@ -266,7 +265,10 @@ // Strength for 4 blks in 1 stripe get_strength_ver_MBAff(Strength, MbQ, edge << 2, mvlimit, p); - if ((p_Strength64[0]) || (p_Strength64[1])) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] !=0 || + Strength[4] != 0 || Strength[5] != 0 || Strength[6] != 0 || Strength[7] !=0 || + Strength[8] != 0 || Strength[9] != 0 || Strength[10] != 0 || Strength[11] !=0 || + Strength[12] != 0 || Strength[13] != 0 || Strength[14] != 0 || Strength[15] !=0 ) // only if one of the 16 Strength bytes is != 0 { if (filterNon8x8LumaEdgesFlag[edge]) { @@ -312,7 +314,10 @@ // Strength for 4 blks in 1 stripe get_strength_hor_MBAff(Strength, MbQ, edge << 2, mvlimit, p); - if ((p_Strength64[0]) || (p_Strength64[1])) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] !=0 || + Strength[4] != 0 || Strength[5] != 0 || Strength[6] != 0 || Strength[7] !=0 || + Strength[8] != 0 || Strength[9] != 0 || Strength[10] != 0 || Strength[11] !=0 || + Strength[12] != 0 || Strength[13] != 0 || Strength[14] != 0 || Strength[15] !=0 ) // only if one of the 16 Strength bytes is != 0 { if (filterNon8x8LumaEdgesFlag[edge]) { @@ -554,7 +559,7 @@ { byte *Strength = MbQ->strength_ver[edge]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the 4 first Strength bytes is != 0 { if (filterNon8x8LumaEdgesFlag[edge]) { @@ -599,7 +604,10 @@ { byte *Strength = MbQ->strength_hor[edge]; - if ((*((int64 *) Strength)) || ((*(((int64 *) Strength) + 1)))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] !=0 || + Strength[4] != 0 || Strength[5] != 0 || Strength[6] != 0 || Strength[7] !=0 || + Strength[8] != 0 || Strength[9] != 0 || Strength[10] != 0 || Strength[11] !=0 || + Strength[12] != 0 || Strength[13] != 0 || Strength[14] != 0 || Strength[15] !=0 ) // only if one of the 16 Strength bytes is != 0 { if (filterNon8x8LumaEdgesFlag[edge]) { diff -r 33318c0be44f ldecod/src/loop_filter_mbaff.c --- a/ldecod/src/loop_filter_mbaff.c Thu Jul 11 13:28:39 2013 +0300 +++ b/ldecod/src/loop_filter_mbaff.c Thu Jul 11 14:11:06 2013 +0300 @@ -89,7 +89,7 @@ short blkP, blkQ, idx; //short blk_x, blk_x2, blk_y, blk_y2 ; - int StrValue; + int StrValue, i; short mb_x, mb_y; Macroblock *MbP; @@ -126,7 +126,7 @@ //printf("idx %d %d %d %d %d\n", idx, pixP.x, pixP.y, pixP.pos_x, pixP.pos_y); // Start with Strength=3. or Strength=4 for Mb-edge StrValue = (edge == 0) ? 4 : 3; - memset(Strength, (byte) StrValue, MB_BLOCK_SIZE * sizeof(byte)); + for( i = 0; i < BLOCK_SIZE; i ++ ) Strength[i] = StrValue; } else { @@ -189,7 +189,10 @@ StrValue = 1; } } - *(int*)(Strength+idx) = StrValue * 0x01010101; + Strength[idx] = StrValue; + Strength[idx + 1] = StrValue; + Strength[idx + 2] = StrValue; + Strength[idx + 3] = StrValue; pixP.y += 4; pixP.pos_y += 4; } @@ -295,7 +298,7 @@ short blkP, blkQ, idx; short blk_x, blk_x2, blk_y, blk_y2 ; - int StrValue; + int StrValue, i; int xQ, yQ = (edge < MB_BLOCK_SIZE ? edge : 1); short mb_x, mb_y; @@ -320,7 +323,10 @@ StrValue = (edge == 0 && (!MbP->mb_field && !MbQ->mb_field)) ? 4 : 3; - *(int*)(Strength+idx) = StrValue * 0x01010101; + Strength[idx] = StrValue; + Strength[idx+1] = StrValue; + Strength[idx+2] = StrValue; + Strength[idx+3] = StrValue; } } else @@ -333,7 +339,7 @@ if (MbQ->is_intra_block == TRUE || MbP->is_intra_block == TRUE) { StrValue = (edge == 0 && (!MbP->mb_field && !MbQ->mb_field)) ? 4 : 3; - memset(Strength, (byte) StrValue, MB_BLOCK_SIZE * sizeof(byte)); + for( i = 0; i < BLOCK_SIZE; i ++ ) Strength[i] = StrValue; } else { @@ -412,7 +418,10 @@ } } } - *(int*)(Strength+idx) = StrValue * 0x01010101; + Strength[idx] = StrValue; + Strength[idx + 1] = StrValue; + Strength[idx + 2] = StrValue; + Strength[idx + 3] = StrValue; } } } diff -r 33318c0be44f ldecod/src/loop_filter_normal.c --- a/ldecod/src/loop_filter_normal.c Thu Jul 11 13:28:39 2013 +0300 +++ b/ldecod/src/loop_filter_normal.c Thu Jul 11 14:11:06 2013 +0300 @@ -94,14 +94,14 @@ { byte *Strength = MbQ->strength_ver[edge]; Slice *currSlice = MbQ->p_Slice; - int StrValue; + int StrValue, i; BlockPos *PicPos = MbQ->p_Vid->PicPos; if ((currSlice->slice_type==SP_SLICE)||(currSlice->slice_type==SI_SLICE) ) { // Set strength to either 3 or 4 regardless of pixel position StrValue = (edge == 0) ? 4 : 3; - memset(Strength, (byte) StrValue, BLOCK_SIZE * sizeof(byte)); + for( i = 0; i < BLOCK_SIZE; i ++ ) Strength[i] = StrValue; } else { @@ -116,7 +116,7 @@ { if (edge && (currSlice->slice_type == P_SLICE && MbQ->mb_type == PSKIP)) { - memset(Strength, 0, BLOCK_SIZE * sizeof(byte)); + for( i = 0; i < BLOCK_SIZE; i ++ ) Strength[i] = 0; } else if (edge && ((MbQ->mb_type == P16x16) || (MbQ->mb_type == P16x8))) { @@ -202,14 +202,14 @@ { // Start with Strength=3. or Strength=4 for Mb-edge StrValue = (edge == 0) ? 4 : 3; - memset(Strength, (byte) StrValue, BLOCK_SIZE * sizeof(byte)); + for( i = 0; i < BLOCK_SIZE; i ++ ) Strength[i] = StrValue; } } else { // Start with Strength=3. or Strength=4 for Mb-edge StrValue = (edge == 0) ? 4 : 3; - memset(Strength, (byte) StrValue, BLOCK_SIZE * sizeof(byte)); + for( i = 0; i < BLOCK_SIZE; i ++ ) Strength[i] = StrValue; } } } @@ -223,7 +223,7 @@ static void get_strength_hor(Macroblock *MbQ, int edge, int mvlimit, StorablePicture *p) { byte *Strength = MbQ->strength_hor[edge]; - int StrValue; + int StrValue, i; Slice *currSlice = MbQ->p_Slice; BlockPos *PicPos = MbQ->p_Vid->PicPos; @@ -231,7 +231,7 @@ { // Set strength to either 3 or 4 regardless of pixel position StrValue = (edge == 0 && (((p->structure==FRAME)))) ? 4 : 3; - memset(Strength, (byte) StrValue, BLOCK_SIZE * sizeof(byte)); + for( i = 0; i < BLOCK_SIZE; i ++ ) Strength[i] = StrValue; } else { @@ -248,7 +248,7 @@ { if (edge && (currSlice->slice_type == P_SLICE && MbQ->mb_type == PSKIP)) { - memset(Strength, 0, BLOCK_SIZE * sizeof(byte)); + for( i = 0; i < BLOCK_SIZE; i ++ ) Strength[i] = 0; } else if (edge && ((MbQ->mb_type == P16x16) || (MbQ->mb_type == P8x16))) { @@ -264,7 +264,7 @@ else StrValue = 0; // if internal edge of certain types, we already know StrValue should be 0 - *(int*)(Strength + idx) = StrValue; + Strength[idx] = StrValue; } } else @@ -329,7 +329,7 @@ else StrValue = 1; } - *(int*)(Strength + idx) = StrValue; + Strength[idx] = StrValue; } } } @@ -337,14 +337,14 @@ { // Start with Strength=3. or Strength=4 for Mb-edge StrValue = (edge == 0 && (p->structure == FRAME)) ? 4 : 3; - memset(Strength, (byte) StrValue, BLOCK_SIZE * sizeof(byte)); + for( i = 0; i < BLOCK_SIZE; i ++ ) Strength[i] = StrValue; } } else { // Start with Strength=3. or Strength=4 for Mb-edge StrValue = (edge == 0 && (p->structure == FRAME)) ? 4 : 3; - memset(Strength, (byte) StrValue, BLOCK_SIZE * sizeof(byte)); + for( i = 0; i < BLOCK_SIZE; i ++ ) Strength[i] = StrValue; } } } @@ -983,7 +983,7 @@ { byte *Strength = MbQ->strength_ver[edge]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_ver( PLANE_Y, imgY, Strength, MbQ, edge << 2); edge_loop_luma_ver(PLANE_U, imgUV[0], Strength, MbQ, edge << 2); @@ -1009,7 +1009,7 @@ { byte *Strength = MbQ->strength_hor[edge]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if (Strength[0]!=0 || Strength[1]!=0 || Strength[2]!=0 || Strength[3]!=0) // only if one of the 16 Strength bytes is != 0 { edge_loop_luma_hor( PLANE_Y, imgY, Strength, MbQ, edge << 2, p) ; edge_loop_luma_hor(PLANE_U, imgUV[0], Strength, MbQ, edge << 2, p); @@ -1039,7 +1039,7 @@ { byte *Strength = MbQ->strength_ver[edge]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_ver( PLANE_Y, imgY, Strength, MbQ, edge << 2); edge_loop_luma_ver(PLANE_U, imgUV[0], Strength, MbQ, edge << 2); @@ -1067,7 +1067,7 @@ { byte *Strength = MbQ->strength_hor[edge]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_hor( PLANE_Y, imgY, Strength, MbQ, edge << 2, p) ; edge_loop_luma_hor(PLANE_U, imgUV[0], Strength, MbQ, edge << 2, p); @@ -1117,7 +1117,7 @@ { byte *Strength = MbQ->strength_ver[edge]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_ver( PLANE_Y, imgY, Strength, MbQ, edge << 2); @@ -1141,7 +1141,7 @@ { byte *Strength = MbQ->strength_hor[edge]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_hor( PLANE_Y, imgY, Strength, MbQ, edge << 2, p) ; @@ -1167,7 +1167,7 @@ { byte *Strength = MbQ->strength_ver[0]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_ver( PLANE_Y, imgY, Strength, MbQ, 0); @@ -1188,7 +1188,7 @@ { byte *Strength = MbQ->strength_hor[0]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_hor( PLANE_Y, imgY, Strength, MbQ, 0, p) ; @@ -1211,7 +1211,7 @@ { byte *Strength = MbQ->strength_ver[0]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_ver( PLANE_Y, imgY, Strength, MbQ, 0); @@ -1233,7 +1233,7 @@ { byte *Strength = MbQ->strength_hor[edge]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_hor( PLANE_Y, imgY, Strength, MbQ, edge << 2, p) ; @@ -1260,7 +1260,7 @@ { byte *Strength = MbQ->strength_ver[edge]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_ver( PLANE_Y, imgY, Strength, MbQ, edge << 2); @@ -1282,7 +1282,7 @@ { byte *Strength = MbQ->strength_hor[0]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_hor( PLANE_Y, imgY, Strength, MbQ, 0, p) ; @@ -1307,7 +1307,7 @@ { byte *Strength = MbQ->strength_ver[edge]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_ver( PLANE_Y, imgY, Strength, MbQ, edge << 2); @@ -1331,7 +1331,7 @@ { byte *Strength = MbQ->strength_hor[edge]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_hor( PLANE_Y, imgY, Strength, MbQ, edge << 2, p) ; @@ -1358,7 +1358,7 @@ { byte *Strength = MbQ->strength_ver[edge]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_ver( PLANE_Y, imgY, Strength, MbQ, edge << 2); @@ -1382,7 +1382,7 @@ { byte *Strength = MbQ->strength_hor[edge]; - if ((*((int *) Strength))) // only if one of the 16 Strength bytes is != 0 + if ( Strength[0] != 0 || Strength[1] != 0 || Strength[2] != 0 || Strength[3] != 0 ) // only if one of the first 4 Strength bytes is != 0 { edge_loop_luma_hor( PLANE_Y, imgY, Strength, MbQ, edge << 2, p) ; |
|
I'm inclined to apply the patch to prefer platform independence over coding efficiency. The patch seems to be for the decoder only. I guess the same should be done for the encoder. (ideally the deblocking filter code should be there only once, but that's a different issue) |
|
I applied the patch. It seems there was only one related 64-bit cast in the encoder. |
Date Modified | Username | Field | Change |
---|---|---|---|
2013-07-11 13:33 | Dag Ågren | New Issue | |
2013-07-11 13:33 | Dag Ågren | File Added: loopfilter.diff | |
2013-09-02 21:04 | Karsten Suehring | Note Added: 0000592 | |
2013-09-02 21:04 | Karsten Suehring | Assigned To | => Karsten Suehring |
2013-09-02 21:04 | Karsten Suehring | Status | new => acknowledged |
2013-09-02 21:22 | Karsten Suehring | Note Added: 0000593 | |
2013-09-02 21:22 | Karsten Suehring | Status | acknowledged => resolved |
2013-09-02 21:22 | Karsten Suehring | Fixed in Version | => JM 18.6 |
2013-09-02 21:22 | Karsten Suehring | Resolution | open => fixed |