05292015, 11:30 PM  #1 (permalink) 
Guest
Posts: n/a

array of records of N bits, N not a multiple of 8
I have run into a case where a program needs to keep an array of M
records of N bits each in memory, and be able to read or write any particular record. The tricky parts: N is not usually a multiple of 8 and its value isn't known until run time. For this application N would typically be in the range 3335 bits  it would never be as much as 64 bits. Each record will hold one or more boolean values (as single bits) followed by one or more signed or unsigned integers of peculiar sizes (like 34 bits). Doing this using structs containing bitfields will result in an array with wasted space in it, since the structs will be padded to whole bytes. One could also explicitly add the requisite extra number of bits to the struct, in which case all the bits would be easily accessed, but again, wasting space on unneeded bits. It isn't a huge waste: 33>40 adds 7 extra bits, 7/40 is 17.5% wasted space. If need be I can live with that. However, since N isn't known until run time I don't see how one could use a structure definition at all, unless C now has some sort of dynamic structure method. That is, with this struct rec_type { unsigned char val1 : 1; unsigned char val2 : 1; unsigned long long count : 30; } one could later allocate space for an array of these and go on normally. But since N isn't known at compile time, the struct cannot be defined. The only workaround I could think of was to up a large set of structs (with count at 30,31,32...56, for instance), and then conditionally use the code path with the correct size structs in it. Anyway, the code for efficiently implementing this sort of "arbitrary bit length record" array must be out there somewhere. Unfortunately searches for "bits" and "records" and "not a multiple of 8" didn't turn up what I was looking for. Anybody know where one might find source code for doing this sort of thing? For now it only need run on x86 derivatives. Thanks, David Mathog 
05292015, 11:30 PM  #2 (permalink) 
Guest
Posts: n/a

array of records of N bits, N not a multiple of 8
mathog <dmathog@gmail.com> writes:
>Anyway, the code for efficiently implementing this sort of "arbitrary >bit length record" array must be out there somewhere. Unfortunately >searches for "bits" and "records" and "not a multiple of 8" didn't turn >up what I was looking for. Anybody know where one might find source >code for doing this sort of thing? For now it only need run on x86 >derivatives. For N entries with M bit, you allocate (N*M1)/CHAR_BIT+1 char objects. We first see this as an array of bits. The char object for bit #i is i/CHAR_BIT and the offset of bit #i within its char object is i%CHAR_BIT. So you can read it via a[ i/CHAR_BIT ]& (1<<(i%CHAR_BIT)). To set or clear such a bit you can »« with (1<<(i%CHAR_BIT)) or »&« with ~(1<<(i%CHAR_BIT)), respectively. Now, you have access to an array of N*M bits. The entry of size M at your offset j, 0 <= j < N, starts at the bit position # M*j and ands at bit position # M*j+(M1). So you can read or write to it with M bit operations as described in the preceding paragraph. 
05302015, 01:00 AM  #3 (permalink) 
Guest
Posts: n/a

array of records of N bits, N not a multiple of 8
ram@zedat.fuberlin.de (Stefan Ram) writes:
>mathog <dmathog@gmail.com> writes: >>Anyway, the code for efficiently implementing this sort of "arbitrary >>bit length record" array must be out there somewhere. Unfortunately >>searches for "bits" and "records" and "not a multiple of 8" didn't turn >>up what I was looking for. Anybody know where one might find source >>code for doing this sort of thing? For now it only need run on x86 >>derivatives. >For N entries with M bit, you allocate (N*M1)/CHAR_BIT+1 >char objects. >We first see this as an array of bits. The char object for >bit #i is i/CHAR_BIT and the offset of bit #i within its char >object is i%CHAR_BIT. So you can read it via a[ i/CHAR_BIT ]& >(1<<(i%CHAR_BIT)). To set or clear such a bit you can »« >with (1<<(i%CHAR_BIT)) or »&« with ~(1<<(i%CHAR_BIT)), >respectively. >Now, you have access to an array of N*M bits. The entry of >size M at your offset j, 0 <= j < N, starts at the bit >position # M*j and ands at bit position # M*j+(M1). So you can >read or write to it with M bit operations as described in the >preceding paragraph. I missed the word »efficiently« in your (mathog's) post. I did not want to claim that my suggestion was especially efficient! 
05302015, 01:00 AM  #4 (permalink) 
Guest
Posts: n/a

array of records of N bits, N not a multiple of 8
mathog wrote:
<snips> > Anyway, the code for efficiently implementing this sort of "arbitrary > bit length record" array must be out there somewhere. Unfortunately > searches for "bits" and "records" and "not a multiple of 8" didn't turn > up what I was looking for. Anybody know where one might find source > code for doing this sort of thing? For now it only need run on x86 > derivatives. If you are will to use it, the C++ library has a specialisation for vectors of bool, which appears to be exactly what you are looking for.  Ian Collins 
05302015, 01:00 AM  #5 (permalink) 
Guest
Posts: n/a

array of records of N bits, N not a multiple of 8
ram@zedat.fuberlin.de (Stefan Ram) wrote:
> mathog <dmathog@gmail.com> writes: >> Anyway, the code for efficiently implementing this sort of "arbitrary >> bit length record" array must be out there somewhere. Unfortunately >> searches for "bits" and "records" and "not a multiple of 8" didn't turn >> up what I was looking for. Anybody know where one might find source >> code for doing this sort of thing? For now it only need run on x86 >> derivatives. > > For N entries with M bit, you allocate (N*M1)/CHAR_BIT+1 > char objects. > > We first see this as an array of bits. The char object for > bit #i is i/CHAR_BIT and the offset of bit #i within its char > object is i%CHAR_BIT. So you can read it via a[ i/CHAR_BIT ]& > (1<<(i%CHAR_BIT)). To set or clear such a bit you can »« > with (1<<(i%CHAR_BIT)) or »&« with ~(1<<(i%CHAR_BIT)), > respectively. That much I understood. The problem is that it isn't going to be very efficient. To read out a 34 bit integer value it will take 34 separate bit tests followed by shifts and ORs into the integer one is building up. What I was looking for was code that would do this with many fewer operations. For now let's ignore the boolean fields and just say that N holds a single unsigned integer. It's bits fall in memory such that there is a core of C full bytes, a leading byte (sometimes) holding less than 8 bits (L, value 0 or 1), and a trailing byte sometimes holding less than 8 bits (T, value 0 or 1). So that N is stored across C+L+T bytes. Presumably there is code out there that could retrieve that integer efficiently using a small set of operations along the lines of: unsigned long long value; memcpy(&value,array+offset,C+L+T); // bit twiddling with a shift and mask // value is now set writing would be less efficient since it needs to do much more: unsigned long long store_this; unsigned long long temp; memcpy(&temp,array+offset,C+L+T); // mask temp to clear the N bits // inverse bit twiddling acting on store_this to set // corresponding N bits which were masked in previous temp = store_this memcpy(array+offset,&temp,C+L+T); L, T, and C vary with the record number. Regards, David Mathog 
05302015, 01:00 AM  #6 (permalink) 
Guest
Posts: n/a

array of records of N bits, N not a multiple of 8
mathog <dmathog@gmail.com> writes:
> I have run into a case where a program needs to keep an array of M > records of N bits each in memory, and be able to read or write any > particular record. The tricky parts: N is not usually a multiple of 8 > and its value isn't known until run time. For this application N > would typically be in the range 3335 bits  it would never be as much > as 64 bits. Each record will hold one or more boolean values (as > single bits) followed by one or more signed or unsigned integers of > peculiar sizes (like 34 bits). The trouble here is that I have no information to help me decide on the tradeoffs that inevitably come into play with such a design. You can get zero wasted space at the expense of some rather fussy bit shifting to access the data, or you can get trivial access if you are prepared to waste a more space. But I'll make one observation: you don't need to store the data as a contiguous record, and that often makes the access simpler. For example: unsigned uint8_t *extra_data = malloc(M); unsigned uint32_t *main_data = malloc(M * sizeof *main_data); can store M single bits and and M values of up to 39 bits with relatively simply access: int get_flag(size_t n) { return extra_data[n] >> 7; } uint64_t get_value(size_t n) { return ((uint64_t)(extra_data[n] & 0x7f) << 32) + main_data[n]; } <snip>  Ben. 
05302015, 01:00 AM  #7 (permalink) 
Guest
Posts: n/a

array of records of N bits, N not a multiple of 8
mathog wrote:
> struct rec_type { > unsigned char val1 : 1; > unsigned char val2 : 1; > unsigned long long count : 30; > } It seems this doesn't work, or at least not well. I'm probably looking in the wrong place, but the standard appears to say that the longest integer type that may be used with a bit field is a long. While gcc accepts this struct it sets the size of the type to 8, not the expected 4. (Perhaps that can be reduced with some command line switches?) 
05302015, 06:30 AM  #8 (permalink) 
Guest
Posts: n/a

array of records of N bits, N not a multiple of 8
On Fri, 29 May 2015 16:01:38 0700, mathog <dmathog@gmail.com> wrote:
>I have run into a case where a program needs to keep an array of M >records of N bits each in memory, and be able to read or write any >particular record. The tricky parts: N is not usually a multiple of 8 >and its value isn't known until run time. For this application N would >typically be in the range 3335 bits  it would never be as much as 64 >bits. Each record will hold one or more boolean values (as single bits) >followed by one or more signed or unsigned integers of peculiar sizes >(like 34 bits). Let C = (N+7)/8. Then an array of C char will hold N bits and never waste more than 7. In your case, an average of 6. So char **array = malloc(M * sizeof *array); for ( i = 0; i < M; i++) array[i] = malloc(C); will allocate your array. You access the char containing bit n of row m with array[m][n/8] You turn the bit on with: array[m][n/8] = (0x80 >> (n%8)) There are similar expressions for turning the bit off, flipping the bit, and testing the bit. Some have even created macros with names like BITON, BITOFF, etc.  Remove del for email 
05302015, 08:30 AM  #9 (permalink) 
Guest
Posts: n/a

array of records of N bits, N not a multiple of 8
mathog <dmathog@gmail.com> writes:
> mathog wrote: >> struct rec_type { >> unsigned char val1 : 1; >> unsigned char val2 : 1; >> unsigned long long count : 30; >> } > > It seems this doesn't work, or at least not well. > > I'm probably looking in the wrong place, but the standard appears to > say that the longest integer type that may be used with a bit field is > a long. > > While gcc accepts this struct it sets the size of the type to 8, not > the expected 4. (Perhaps that can be reduced with some command line > switches?) The only types that the standard guarantees can be used a bit fields are _Bool, signed int, and unsigned int. It's implementationdefined whether a bit field declared as type int is actually signed int or unsigned int. Compilers are permitted, but not required, to support bit fields of other integer types. The width of a bit field is the number specified after the ":". Apparently some ABIs (perhaps most or all of them) require that the alignment of a struct containing a bit field depends on the bit field's declared type. There is, as far as I know, no hint of such a requirement in the C standard. So, for example, struct { int bf:1; } might have 4byte alignment, whereas struct { long long bf:1; } might have 8byte alignment, corresponding to the alignments of int and long long. If you want your structure above to have a size of 4 bytes, you should make count an unsigned int  and for portability, you should probably define it something like: struct rec_type { _Bool val1 : 1; _Bool val2: 1; unsigned int count: 30; }; Or you might make val1 and val2 unsigned ints, particularly if you care about portability to preC99 implementations.  Keith Thompson (The_Other_Keith) kstu@mib.org <http://www.ghoti.net/~kst> Working, but not speaking, for JetHead Development, Inc. "We must do something. This is something. Therefore, we must do this."  Antony Jay and Jonathan Lynn, "Yes Minister" 
05302015, 09:30 AM  #10 (permalink) 
Guest
Posts: n/a

array of records of N bits, N not a multiple of 8
mathog <dmathog@gmail.com> wrote:
> ram@zedat.fuberlin.de (Stefan Ram) wrote: >> mathog <dmathog@gmail.com> writes: >>> Anyway, the code for efficiently implementing this sort of "arbitrary >>> bit length record" array must be out there somewhere. Unfortunately >>> searches for "bits" and "records" and "not a multiple of 8" didn't turn >>> up what I was looking for. Anybody know where one might find source >>> code for doing this sort of thing? For now it only need run on x86 >>> derivatives. >> For N entries with M bit, you allocate (N*M1)/CHAR_BIT+1 >> char objects. >> We first see this as an array of bits. The char object for >> bit #i is i/CHAR_BIT and the offset of bit #i within its char >> object is i%CHAR_BIT. So you can read it via a[ i/CHAR_BIT ]& >> (1<<(i%CHAR_BIT)). To set or clear such a bit you can »« >> with (1<<(i%CHAR_BIT)) or »&« with ~(1<<(i%CHAR_BIT)), >> respectively. > That much I understood. The problem is that it isn't going to be very > efficient. To read out a 34 bit integer value it will take 34 separate > bit tests followed by shifts and ORs into the integer one is building up. No, you don't do it in 34 operations, unless your memory is one bit wide. With L bit wide memory access (and registers) you need, in most cases, int(M/L)+1 memory words, shifts, bitwise AND, and bitwise OR operations, along with simple calculations of the shifts and masks for each one. As one popular example, the LZW compression algorithms write 9 to 16 bit codes to an 8 bit byte file. > What I was looking for was code that would do this with many fewer > operations. For now let's ignore the boolean fields and just say that N > holds a single unsigned integer. It's bits fall in memory such that > there is a core of C full bytes, a leading byte (sometimes) holding less > than 8 bits (L, value 0 or 1), and a trailing byte sometimes holding > less than 8 bits (T, value 0 or 1). So that N is stored across C+L+T > bytes. Presumably there is code out there that could retrieve that > integer efficiently using a small set of operations along the lines of: Yes. You load the leading byte, intermediate bytes, and most often trailing byte, and shift each appropriately, AND if needed (usually not), then OR the result together. > unsigned long long value; > memcpy(&value,array+offset,C+L+T); > // bit twiddling with a shift and mask > // value is now set Note that the reason many processors require appropriate alignment is that otherwise they have to do this bit shifting on every memory access. Well, byte shifting in most cases, but it is still a lot of work on the critical path. > writing would be less efficient since it needs to do much more: > unsigned long long store_this; > unsigned long long temp; > memcpy(&temp,array+offset,C+L+T); > // mask temp to clear the N bits > // inverse bit twiddling acting on store_this to set > // corresponding N bits which were masked in previous > temp = store_this > memcpy(array+offset,&temp,C+L+T); > L, T, and C vary with the record number. Most processors now are pretty efficient with shifts. Many years ago, shifts were done one bit at a time, so the time was proportional to the shift amount. The 8086 allows shifts of 0 to 255 bits, and actually counts off each bit. Later processors use a barrel shifter, after masking off the appropriate bit count. Many 32 bit processors only use five bits of the shift count, such that 0 to 31 bits can be shifted. The C shift operations allow for this. In some cases, it might be more efficient to select a code sequence based on M, optimized for that case. So, for example, LZW compression algorithms take one byte of input data, generate a 9 to 16 bit output codeword, when needed, does appropriate shift and OR, and writes it out. Pretty fast. The JBIG2 algorithm, on the other hand, does process the input one bit at a time, and runs somewhat slower. But for appropriate input, it compresses much better.  glen 