SortTables Error Log
Inspired by Knuth’s example of “The Errors of TeX” in Software – Practice and Experience. This is an attempt to capture the errors I made in SortTables software; my most common problem is carelessness, especially in reorganizing data structures.
Date: date
Symptom: the-symptom
Cause: the-cause
Fix: the-fix
Lesson: the-lesson
Date: 8-2-95
Symptom: System printed endless series of blank lines.
Running in the debugger showed it failed in PatternIndex when it
tried to close the pattern file.
Cause: Didn’t note.
Fix: Didn’t note.
Lesson: Didn’t learn:)
Date: 7-31-95
Symptom: Assertion violation (iterator had pos==-1). Came
about after cutting one or more columns in a ListResult.
Cause: Had started to move from managing the ListResult as
an array of buckets to a tree of buckets. To do this, had an array
twice the size needed, with the leaves starting at
array[num_buckets]. Unfortunately, one of the routines was relying
on looking at the array starting at array[0].
Fix: Make it consistently start at 0. Put off tree stuff for
now.
Lesson: Don’t make partial changes in data structure –
either finish the new implementation or don’t put it in.
Date: 7-29-95
Symptom: “And”-ing together several lists showed the list
always being left all true. At first it looked like the data might
be that way, but checking more carefully showed that certain bits
should be left cleared.
Cause: The andWith() method was working properly to
clear bits but not set them. The value (0 or 1) was being
left-shifted and negated for and-ing in, but when the value was 0,
there was no bit set to shift, so the negated mask was all ones and no bit was ever cleared.
Fix: Change formula to use “1” rather than “value”.
Lesson:
- Watch those formulas. Simple hand-testing of both cases would
have exposed this.
- QA was left out. The BitVector module had a sample
main() routine, but that was never updated when the
andWith() method was added. The simplest test would have
exposed this.
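A minimal sketch of the corrected formula, assuming a hypothetical free function andBit() over one 32-bit word (the real BitVector class is not shown in this log). The point is only that the mask is built from the constant 1, not from the value being and-ed in.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: AND a single bit `value` (0 or 1) into bit `pos` of `word`.
// Shifting `value` itself fails for value == 0: (0 << pos) is 0, its negation is
// all ones, and and-ing with all ones clears nothing.
inline std::uint32_t andBit(std::uint32_t word, unsigned pos, unsigned value) {
    assert(pos < 32 && value <= 1);
    if (value == 0) {
        word &= ~(std::uint32_t(1) << pos);  // mask built from 1, not from value
    }
    return word;  // value == 1 leaves the bit as it was
}
```

Hand-testing both cases (value 0 and value 1) on a word like 0xFF is enough to exercise the formula.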
Date: 7-17-95
Symptom: SortTables locks up – apparent infinite loop.
Cause: It was an infinite loop. Had added a
cList_freeObjects() method, that looked at the last object, for as
long as there were objects. Since it looked but didn’t remove the
object, list never shrank, and it just kept looking.
Fix: Don’t just look at the last object, remove it from the
list.
Lesson: Watch use of functions with similar names (here
cList_lastObject() vs. cList_removeLastObject().)
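A sketch of the fixed loop, using std::vector as a stand-in for cList. The functions lastObject() and removeLastObject() below are assumed equivalents of cList_lastObject() and cList_removeLastObject(); the real cList signatures are not given in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Obj { int id; };

// Looks at the last object without removing it (assumed behavior).
inline Obj* lastObject(const std::vector<Obj*>& list) {
    return list.empty() ? NULL : list.back();
}

// Removes the last object from the list and returns it (assumed behavior).
inline Obj* removeLastObject(std::vector<Obj*>& list) {
    assert(!list.empty());
    Obj* o = list.back();
    list.pop_back();
    return o;
}

// Frees every object and returns how many were freed.
inline std::size_t freeObjects(std::vector<Obj*>& list) {
    std::size_t freed = 0;
    while (!list.empty()) {
        // Calling lastObject() here instead would never shrink the list.
        delete removeLastObject(list);
        ++freed;
    }
    return freed;
}
```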
Date: 6-23-95
Symptom: After switching from “big” to “small”, the items in
the small set aren’t ordered properly.
Cause: Reset the column # in the new set to 0; should have
left it at same column # as old set (otherwise throws off display
and others about where things are; change in rep should be
transparent to clients).
Fix: Update that number to proper value.
Lesson: Make sure every data member is set in constructors!
Date: 6-23-95
Symptom: After shift in representation from “big” to
“small”, items in the small set were not in order. Moving
cursor-down jumped around rather than going one at a time.
Investigation revealed the table structure in TableResult had
_ptable[0]==_ptable[1] (not good for a permutation). Added a check
for this, so it would die the first time that happened. However, in
the debugger, when TableResult’s ResultSet-based constructor is
called, it ends up with suspicious pointers.
Cause:
- TableResult had been using the cList class; modified to use a
simple array. Had assumed that a circular list was necessary, but
in fact this was never the case. Removed that assumption, so now it
just assumes an arbitrary starting point.
- In the new constructor for TableResult – had converted it from
a for-loop-based iterator to a while-loop one. Had the “next” at
the bottom of the loop rather than the top.
Fix:
- Simplified _ptable to simple list.
- Structure the while-loop properly.
Lesson: Need to be very careful on manual code
transformations.
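A sketch of the while-loop shape, assuming an advance-then-read iterator that starts before the first item. RecordIter and buildTable() are hypothetical names; the real ResultSet iterator interface is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical iterator: starts before the first record, so next() must be
// called before the first current().
class RecordIter {
public:
    explicit RecordIter(const std::vector<int>& recs) : _recs(recs), _pos(-1) {}
    bool next() {
        ++_pos;
        return _pos < static_cast<long>(_recs.size());
    }
    int current() const {
        assert(_pos >= 0 && _pos < static_cast<long>(_recs.size()));
        return _recs[static_cast<std::size_t>(_pos)];
    }
private:
    const std::vector<int>& _recs;
    long _pos;
};

// Copies every record into a table, in order.
inline std::vector<int> buildTable(const std::vector<int>& recs) {
    std::vector<int> table;
    RecordIter it(recs);
    while (it.next()) {               // "next" at the top of the loop
        table.push_back(it.current());
    }
    return table;
}
```

With this protocol, moving next() to the bottom of the loop would read current() before the iterator is positioned, which the assertion catches.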
Date: 6-22-95
Symptom: Screen repeated blank lines (memory trap in
debugger).
Cause: Had set up system to start shifting representations
when size changed dramatically, by applying Coplien’s
letter-envelope idiom. Had changed uses of ResultSet so it was an
envelope. However – the result set has an associated iterator that
must be changed as well.
Fix: “Defined the problem away” by making a rule that Cut or
Match operations (the destructive ones) invalidate existing
iterators, thus pushing the problem onto their clients.
Lesson: Don’t switch things halfway. Remember that iterators
and collections are closely tied together. Be careful with
envelope-letter idiom; it’s what’s needed here because we’re
switching types, but it brings a lot of baggage. Probably better
off with a separate “rep” hierarchy.
Date: 6-21-95
Symptom: Had converted a set of classes to Coplien’s
envelope-letter structure. After trying to do a reload of the data
(within running program), just started printing blank lines
“forever”. Debugger showed this as dying in the base class
destructor.
Cause: Needed to initialize _rep to NULL in “letter”
classes. Otherwise, it has garbage value even though it isn’t
used.
Fix: Set _rep=NULL.
Lesson: Watch effects on all parts of class data when
re-organizing.
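A minimal envelope-letter sketch showing why the letter must null out _rep. Only the _rep name comes from this log; the class layout, count(), and SmallSet are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical base: acts as the envelope when constructed with a letter.
class ResultSet {
public:
    ResultSet() : _rep(NULL) {}                              // used by letters
    explicit ResultSet(ResultSet* letter) : _rep(letter) {}  // envelope owns letter
    virtual ~ResultSet() { delete _rep; }  // runs for letters too, so _rep must be valid
    virtual int count() const { return _rep ? _rep->count() : 0; }
private:
    ResultSet(const ResultSet&);             // copying left out of this sketch
    ResultSet& operator=(const ResultSet&);
    ResultSet* _rep;
};

// Hypothetical letter: never uses _rep, but still inherits it.
class SmallSet : public ResultSet {
public:
    explicit SmallSet(int n) : _n(n) {}      // base default ctor sets _rep = NULL
    virtual int count() const { return _n; }
private:
    int _n;
};
```

The base destructor runs for the letter as well, so a garbage _rep there gets deleted even though the letter never touches it.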
Date: 6-19-95
Symptom: SortTables: deleting records before a point didn’t
delete any records.
Cause:
- Trying to move from a simple array (where deleting required
copying the array) to an array treated as a circular list. Made
“deleteBefore” parallel to “deleteAfter” only it updated the start
rather than the count. BUT – deleteBefore changes both start and
count where deleteAfter really does only affect count.
- Also – when fetching the i-th item, didn’t take into account
the starting position.
- Also when sorting the list.
Fix: Adjust count as well; return proper position; sort
proper list.
Lesson: Could have probably caught this with some invariant.
BUT – really need to walk through all class data and make sure of
which things an operation affects.
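A sketch of the start/count bookkeeping with an invariant check, assuming a hypothetical RecordWindow class and assuming that “before a point” excludes the point itself. It is not the SortTables code, only an illustration of which fields each operation touches.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class RecordWindow {
public:
    explicit RecordWindow(std::vector<int> data)
        : _data(std::move(data)), _start(0), _count(_data.size()) {}

    std::size_t count() const { return _count; }

    // i-th live item: must be offset by the starting position.
    int at(std::size_t i) const {
        assert(i < _count);
        return _data[(_start + i) % _data.size()];
    }

    // Keep items 0..i: affects the count only.
    void deleteAfter(std::size_t i) {
        assert(i < _count);
        _count = i + 1;
        checkInvariant();
    }

    // Drop items 0..i-1: affects BOTH start and count.
    void deleteBefore(std::size_t i) {
        assert(i < _count);
        _start = (_start + i) % _data.size();
        _count -= i;
        checkInvariant();
    }

private:
    void checkInvariant() const {
        assert(_count <= _data.size());
        assert(_data.empty() || _start < _data.size());
    }
    std::vector<int> _data;
    std::size_t _start;
    std::size_t _count;
};
```

The invariant alone would not have caught a stale count here; a test that checks count() and at(0) after deleteBefore() would.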
Date: 6-16-95
Symptom: Died in memory allocation. Said the “record_set”
field had a bad value. Examining *this showed it as being ok.
Cause: Had moved a field from child to parent, but forgot to
delete it from child.
Fix: Delete child version.
Lesson: Watch editing habits!!
Date: 6-16-95
Symptom: First record wasn’t shown, last record was blank.
When moving to other columns, items were scrambled.
Cause: Had messed up ListIterator to return record numbers
starting with 1 rather than 0, because the display routine used a
list that was designed for pointers and couldn’t store items with
value 0.
Fix: Not ideal, but moved the “encoding” up to the display
routine that wanted to use the list. Ideally, we’d use a list that
handled integers properly.
Lesson: Keep hacking isolated to the place where it’s
needed. Use proper tools. (Don’t want to change the List class
because it works as documented for the NeXT list class; but it
would be better to have a “real” list class for integers.)
Date: 6-15-95
Symptom: Reading records by looking at offset+length in
file. What should have been a short field was a long string
containing more than one record.
Cause: When writing the file originally, had cut-and-paste
code to write a proper-endian number. But – the value to be written
was in a different variable.
Fix: Change variable name to match.
Lesson: Bad symptom – spreading knowledge of endianness
around. Watch cut-and-paste.
Date: 6-15-95
Symptom: Debugging output – label appeared but rest of
information didn’t.
Cause: Left out a format specifier in printf.
Fix: Put it in.
Lesson: Watch printf (as usual). Surprising the compiler
didn’t catch it – it was a constant string, compiled with -Wall.
Date: 5-29-95
Symptom: Re-loading a file caused a trap.
Cause: Had created a bitvector from static storage to reduce
costs of initializing it to all 0’s. (static data defaults to 0.)
Had intended to re-clear this storage when object was deleted
(rather than created) – but never did.
Fix: Let “BitVector” object allocate the storage
dynamically; profile to see if it matters. (If so, go back to
“clear on delete”.)
Lesson: Don’t put off today… Profile before worrying about
efficiency. (Although I think I may have tagged that at one point
as being a valid efficiency concern.) Keep notes – use TBD comments
to record intentions.
Date: 5-12-95
Symptom: Getting wrong values for min & max
Cause: Moved min/max init. code around, but the routine that sets
initial values didn’t properly change columns while setting them.
Some lower-level code relied on actual column number to know what
to fetch, so everything was based on column 0.
Fix: Add next_column() and reset_column() calls.
Lesson: Watch initialization.
Date: 5-8-95
Symptom: Triggered assertion violation in display.cc.
Cause: Added _col_is_restricted to decide whether I had an
exact or approximate count, set true whenever _bounds_xxx[]
changed, and tested on each loop over them. BUT: this precluded
adjusting the bounds when a column changed.
Fix: Change so that counting adjusted columns is independent
of bounds[] setting. Now _col_is_restricted[] tells whether there
was an explicit restriction on THIS column.
Lesson: Things keep getting caught in display.cc; but the
problem is happening in the interaction between ListResult.cc and
ListIterator.cc. Need something that can move the assertions back
down there to catch problem closer to point of call.
Date: 5-8-95
Symptom: Complaints about undefined externs
“read/open/etc”.
Cause: Choosing header files is tricky. Part of the problem
is inconsistency & non-standard choices.
Fix: Ensure that “Universal_include.h” is the first
inclusion in each .cc.
Date: 5-8-95
Symptom: Triggered an assertion violation in display.cc.
Cause: Adding PositionPool: set up a default size of
LISTRESULT_MAX_DIMS for size of cached items. Unfortunately, file
was organized by size being the current actual number of
columns.
Fix: Make PositionPool request FileArray of the proper
size.
Lesson: Tough one. Need fixed-size entities for performance
reasons, but variable-sized ones out on disk. Need to think this
through.
Date: 5-3-95
Symptom: Moved system to Alpha, didn’t work. Problems on
very large data set on ei.cs.
Cause: Same problem again: had a couple “long”s in place.
The real cause is poor configuration management: changes on fir/ST5
were not put back into jingluo/ST5, and jingluo/ST5 became
jingluo/ST6 with “old” code.
Fix: Redo repairs; check diffs.
Lesson: Better config mgmt.
Date: 3-2-95
Symptom: Moved system to Alpha (64-bit int), didn’t
work.
Cause: Used “long” and “int” for types w/assumed 32-bit
sizes. Had also used “raw” machine order to write binary files.
Fix: Define length-specific types depending on system. Use
ntohl() and htonl() to define FROM_BIG_INT32() and TO_BIG_INT32()
macros. (Probably not good names).
Lesson: Ongoing problem with C types. Almost always get pain
from depending on raw types. Need smoother handling of binary I/O.
(Changed to convert on fly here, probably shouldn’t do unless
necessary.)
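A sketch of length-specific types plus explicit big-endian conversion. It swaps in byte shifts for the ntohl()/htonl() macros named above, so it does not depend on any platform header; toBigInt32() and fromBigInt32() are hypothetical names standing in for TO_BIG_INT32() and FROM_BIG_INT32().

```cpp
#include <cassert>
#include <cstdint>

// Write a 32-bit value as 4 bytes, most significant byte first,
// regardless of the host's native byte order or the size of "long".
inline void toBigInt32(std::uint32_t v, unsigned char out[4]) {
    out[0] = static_cast<unsigned char>((v >> 24) & 0xFF);
    out[1] = static_cast<unsigned char>((v >> 16) & 0xFF);
    out[2] = static_cast<unsigned char>((v >> 8) & 0xFF);
    out[3] = static_cast<unsigned char>(v & 0xFF);
}

// Read 4 big-endian bytes back into a 32-bit value.
inline std::uint32_t fromBigInt32(const unsigned char in[4]) {
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
            static_cast<std::uint32_t>(in[3]);
}
```

Keeping the conversion in one pair of functions also limits how far knowledge of endianness spreads through the code.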
Date: 2-27-95
Symptom: Tried to substitute a “word wrapping” window for
item display in sorttables. Random words appeared everywhere in
window.
Cause: Had put a “break” in for debugging; this caused only
1 line to be attempted for display. Tried to display line via a
count “MAXLINES”, but didn’t check actual length of screen, so was
grabbing random characters.
Fix: Remove break. Check length first.
Lesson: Be careful with debugging code. Maybe always add a
comment “// DEBUG” to the end of the line. Be a little more ready
to use the debugger rather than dropping into code mods.
Date: 2-24-95
Symptom: Deleted all but those records starting with “c”.
Was ok. Moved to the end. Died when moving directly back to the
front.
Cause: “prevFor” routine computed a loop moving back through
list but tested the original argument rather than the working value
that was being modified in the loop. (Lucky it wasn’t an infinite
loop.)
Fix: Test right thing in loop.
Lesson: This came from leaving “dead” variables around from
the parameter. I’ve tended to treat parameters as “read-only”.
Maybe I should revise that. This is sort of a const-correctness
issue too – the parameter in question wasn’t marked const though
there was no intent to change it. Finally – it’s funny the compiler
didn’t notice and warn that loop control could never change.
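A sketch of the loop shape, under the assumption that prevFor walks backward to the nearest record still kept in the list; the real routine's signature and data are not given here. The parameter is marked const and the loop tests the working variable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical version: index of the nearest kept record before `from`,
// or -1 if there is none.
inline int prevFor(const std::vector<bool>& kept, const int from) {
    assert(from >= 0 && from <= static_cast<int>(kept.size()));
    int pos = from - 1;  // working value, modified in the loop
    while (pos >= 0 && !kept[static_cast<std::size_t>(pos)]) {  // test pos, not from
        --pos;
    }
    return pos;
}
```

Marking the parameter const keeps it read-only, while the distinct working variable makes it harder to test the wrong name.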
Date: 2-24-95
Symptom: After deleting lines, the approximate factor went
up by a huge amount.
Cause: Used “%” instead of “/” to round off.
Fix: Use “/”.
Lesson: Check math functions manually.
Date: 2-24-95
Symptom: Searching for “1” in the year column, with a bunch
of blank entries first, it didn’t move until “199” was typed even
though “1888” was in the list first.
Cause: Comparison function in binary search – took min of
two lengths, when it really should have considered the whole
pattern.
Fix: Don’t take min.
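One way to compare against the whole pattern, sketched with a hypothetical comparePrefix() on C strings (the actual field representation in SortTables is not shown). Limiting the comparison to the shorter of the two lengths lets an empty entry compare equal to any pattern, since comparing zero characters always yields 0.

```cpp
#include <cassert>
#include <cstring>

// Returns <0, 0, >0 as `entry` sorts before, matches, or sorts after the
// typed `pattern`, where "matches" means entry starts with the whole pattern.
inline int comparePrefix(const char* entry, const char* pattern) {
    // strncmp stops at entry's terminator, so a short or blank entry
    // compares as less than a longer pattern instead of as equal.
    return std::strncmp(entry, pattern, std::strlen(pattern));
}
```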
Date: 2-24-95
Symptom: Final “statistics” output showed objects not
deleted. This happened when “@” was used to re-load a file.
Cause: The re-load routine created a new result set and
iterator but didn’t delete any old ones.
Fix: Use calloc of initial space to ensure empty pointers,
make re-load delete old before creating new.
Date: 2-13-95
Symptom: Computing min and max for a series of buckets which
are permutations of record numbers. But when running, got an error
about finding a record which was already in an unchanged list.
Checked min-max list, and found that after the first column’s worth
of buckets, all buckets had same min-max values. (Shouldn’t be
possible since it’s a permutation.)
Cause: Computed bucket number wrong (should have been
i*columns not i*columns*capacity).
Fix: Change computation.
Date: 2-13-95
Symptom: Computing min and max for a series of buckets which
are permutations of record numbers. But when running, got an error
about finding a record which was already in an unchanged list.
Checked min-max list, and found that after the first column’s worth
of buckets, all buckets had same min-max values. (Shouldn’t be
possible since it’s a permutation.)
Cause: Realized that if I had a bucket only partially
filled, I would be trying to look up bucket #-1 which can’t
exist.
Fix: Added assertion code to FileArray to ensure only valid
indexes used, changed calling code to detect “-1” and skip that
non-record. BUT: This doesn’t affect the original problem.
Date: 2-13-95
Symptom: Getting a bucket for which max < min; bucket is
completely filled with -1 values.
Cause: In figuring left-over “-1” values to write to bucket,
didn’t consider case that last bucket filled in perfectly.
Fix: Test for exactly-full bucket.
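A sketch of the padding computation with the exactly-full case handled, assuming a hypothetical paddingFor() helper. The naive form, capacity - records % capacity, yields a full bucket's worth of “-1” padding when the last bucket is exactly full.

```cpp
#include <cassert>
#include <cstddef>

// Number of "-1" filler slots needed to complete the last bucket.
inline std::size_t paddingFor(std::size_t records, std::size_t capacity) {
    assert(capacity > 0);
    const std::size_t used = records % capacity;  // slots used in the last bucket
    return used == 0 ? 0 : capacity - used;       // exactly full: no padding
}
```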
Date: 2-3-95
Symptom: Destructor didn’t run right.
Cause: Had added a new parent class with a definition
“virtual ~EnvoyFactory () = 0;”. But a destructor should not be left
purely abstract: every derived-class destructor calls the base one,
so it needs a body.
Fix: Change “=0;” to “{}”.
Lesson: Spend a little more time learning C++. Would have
been nice to have a compiler warning on that one.
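A sketch of the fixed declaration. EnvoyFactory is the name from the entry above; make() and ConcreteFactory are hypothetical additions so the example has something to call and destroy.

```cpp
#include <cassert>

class EnvoyFactory {
public:
    virtual ~EnvoyFactory() {}     // virtual, with an empty body instead of "= 0"
    virtual int make() const = 0;  // hypothetical abstract operation
};

class ConcreteFactory : public EnvoyFactory {
public:
    explicit ConcreteFactory(bool* destroyed) : _destroyed(destroyed) {}
    virtual ~ConcreteFactory() { *_destroyed = true; }  // then the base destructor runs
    virtual int make() const { return 1; }
private:
    bool* _destroyed;
};
```

C++ also allows keeping “= 0” on the destructor as long as a definition is supplied separately (EnvoyFactory::~EnvoyFactory() {}), which keeps the class abstract even without other pure virtual functions.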
Date: 2-2-95
Symptom: Line put on screen didn’t get put in list of
displayed lines. This caused the system to allow backward scrolling
(“haven’t hit that line yet”).
Cause: List object was designed for pointer objects, and
assumed NULL could never be put in list. Using it for int objects,
where value 0 was legal and occurred.
Fix: Don’t let 0 occur; add 1 to value before letting it out
to clients, subtract 1 on use.
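A sketch of the add-1/subtract-1 encoding kept behind one small wrapper, so clients never see the shifted values. LineList is a hypothetical name, and std::vector<int> stands in for the pointer-style list that cannot hold 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class LineList {
public:
    // Store line + 1 so that line 0 never appears as the reserved value 0.
    void add(int line) {
        assert(line >= 0);
        _slots.push_back(line + 1);
    }
    // Decode on the way out.
    int at(std::size_t i) const {
        assert(i < _slots.size());
        return _slots[i] - 1;
    }
    bool contains(int line) const {
        for (std::size_t i = 0; i < _slots.size(); ++i) {
            if (_slots[i] == line + 1) return true;
        }
        return false;
    }
private:
    std::vector<int> _slots;  // stand-in for the underlying list; 0 is reserved
};
```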
Date: 1-20-95
Symptom: Screen just keeps dumping blank lines.
Cause: Trying to introduce a new “iter” variable. Had
prototyped it with a simple global. Then tried to pass right iter
around, but left it off an argument list, so it picked up the
defunct (un-initialized) global one.
Date: 1-20-95
Symptom: Lines not displaying right when moving just off
screen. Should scroll one line but didn’t scroll at all until two
down-arrows; might have corrupted some of screen data with wrong
line, too.
Cause: Deleted old line from list, used “count()-1” to tell
where new line should be, then added new line to list.
Fix: Since list was reduced in size already, should have
used “count()” instead.
Date: 1-4-95
Symptom: Program seemed to run OK but complained when
deleting an array at the end.
Cause: The sort comparison for a qsort was backwards; result
of the sort was compared assuming “ascending” but sort yielded
“descending”. This was causing storage to get trashed.
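A sketch of an ascending comparison function for qsort on plain ints; the element type and the names compareAscending() and sortAscending() are assumptions, since the sorted data in SortTables is not described here. Code that later assumes “ascending” needs the comparison to return a negative value when the first argument is smaller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// qsort contract: negative if a < b, zero if equal, positive if a > b.
inline int compareAscending(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);  // avoids the overflow risk of x - y
}

inline void sortAscending(int* values, std::size_t n) {
    std::qsort(values, n, sizeof(int), compareAscending);
}
```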
Date: 1-4-95
Symptom: Sorted items appeared out of order.
Cause: Sorting was done on an illegal field. The field
counter should have been wrapped to 0 when it reached the maximum,
but was incremented instead.
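A sketch of a wrapping field counter, with a hypothetical nextField() helper; the assertion rejects an illegal field number at the point where it would be created.

```cpp
#include <cassert>

// Advance to the next field, wrapping to 0 after the last one.
inline int nextField(int field, int numFields) {
    assert(numFields > 0);
    assert(field >= 0 && field < numFields);
    return (field + 1) % numFields;
}
```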