TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
SUBTITLE: I WISH I WAS AN OSCCAL MYER WINNER
UPDATE APR 5/06: C VERSION ONLINE!
Many thanks to ARod (alejmrm), who ported the following ASM OSCCAL
routine to C and posted it. There are two versions on the third page,
fourth post down. Thanks a million, ARod!
PURPOSE:
Discuss methods used to reduce the size of a typical OSCCAL routine using AVR assembly language.
Quote:
Your code will not properly calibrate the oscillator!
Quote:
It is clear from this that your code does not do the same operations
that the designers of the Butterfly thought were required for accurate
calibration!
Quote:
What value does TMP have when entering the fragment?
Quote:
My main question also revolves around TMP, indirectly.
Quote:
Seems to be an interest in the way that you get the OSCCAL value, and I
am interested in the details... do you mind to move it to a new thread
to talk about it? As some body said, what kind of asumptions are you
taking?, etc...
Thanks.. really clever ideas!
PREAMBLE:
When Giorgos announced the latest version of his Butterfly Bootloader,
his first complaint about condensing its size was the length of the
OSCCAL routine. Since I had to deal with the same problem myself when
writing bootloaders, I thought I might help him out by giving him my
much smaller routine. At the time I did not expect the critical
reaction or the wide interest in the routine.
Because the many comments about the routine and how it works were
messing up his original thread, and because of private mail requests, I
started this thread to answer all those questions without hijacking
his. I hope you find this discussion informative as well as
entertaining.
CAVEAT LECTOR:
Some of the techniques contained herein are outside normal coding practices and may be offensive to some programmers.
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
INTRODUCTION: THE HEART OF THE OSCCAL ROUTINE
The heart of the OSCCAL routine is very simple. The basic idea is to
run a counter from the Oscillator and, after a fixed amount of time,
examine the counter to see if the Oscillator is running fast or slow.
You then adjust the OSCCAL register and go back and check again. You
repeat this process until the Oscillator is running within an
acceptable range.
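For readers who think better in C than in assembler, the loop just described can be sketched on a PC. This is only a model: measure_count() is a made-up stand-in for the timer measurement (a hypothetical straight-line oscillator, not real Butterfly data), and the limits are whatever window you choose to pass in.

```c
#include <stdint.h>

/* Hypothetical oscillator model (an assumption, not measured data):
 * the count seen in one timing window rises linearly with OSCCAL. */
static uint16_t measure_count(uint8_t osccal)
{
    return (uint16_t)(5400u + 12u * osccal);
}

/* Classic two-way search: step OSCCAL up when slow, down when fast.
 * Assumes at least one OSCCAL value lands inside lo..hi. */
uint8_t calibrate_two_way(uint8_t osccal, uint16_t lo, uint16_t hi)
{
    for (;;) {
        uint16_t count = measure_count(osccal);
        if (count < lo)
            osccal++;          /* too slow: raise the frequency */
        else if (count > hi)
            osccal--;          /* too fast: lower the frequency */
        else
            return osccal;     /* within the acceptable range   */
    }
}
```

With this toy model, calibrate_two_way(20, 6053, 6153) climbs to 55 and calibrate_two_way(100, 6053, 6153) falls to 62, the two edges of the acceptable band.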
Because the Oscillator rate can change with temperature and other
conditions, I go back and re-calibrate it each time the Butterfly
wakes, something the other Bootloaders seem to have missed.
I am going to dig up an older bootloader with the original OSCCAL
routine, compare it to my current condensed version, and try to
remember the steps and logic I used to get from one to the other.
CONTENTS AND CODE FRAGMENTS:
For the purpose of this discussion I am only focusing on the actual OSCCAL adjustment loop and not the pre-amble or set-up.
As you examine the code fragments and techniques used, please keep
in mind that the goal is to produce the smallest code possible and many
traditional and accepted coding practices may go out-the-window.
FINAL WARNING:
Make sure that all the necessary precautions have been taken... you are about to enter the mind of a certified Hack.
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
Here's a typical OSCCAL routine. This one clipped from George Kolovos' dis-assembly of the Atmel *.hex file.
The first thing that jumps-out-at-me are the two routines at the bottom
that adjust the OSCCAL register. To me they stand out like two warts on
an otherwise straight-forward routine. So I would focus on those first.
Code:
; //////////////////////////////////////////////////////////////////////////////
; // Description: An enhanced bootloader for the ATMEL Butterfly unit //
; // Copyright (C) 2006 George Kolovos //
; // //
; //////////////////////////////////////////////////////////////////////////////
-------------------------------------------------
; Test the internal oscillator speed
OSC_Test: ser tmp1 ; Reset any TC1 and TC2 flags
out TIFR1,tmp1
out TIFR2,tmp1
sts TCNT1H,Zero ; Reset TCNT1
sts TCNT1L,Zero
sts TCNT2,Zero ; Reset TCNT2
ldi tmp1,1 ; Start TC1
sts TCCR1B,tmp1
; Wait for the TC2 compare-match
sbis TIFR2,OCF2A ; 200 * 1/32768 = 6103.52us
rjmp PC-1
sts TCCR1B,Zero ; Time-out: Stop the TC1
sbic TIFR1,TOV1 ; TC1 overflowed?
rjmp OSC_Too_Fast
lds XL,TCNT1L ; X = TCNT1
lds XH,TCNT1H
; Is the oscillator slow?
cpi XL,Byte1(OSC_Lo) ; cpi TCNT1,6120
ldi tmp1,Byte2(OSC_Lo)
cpc XH,tmp1
brlo OSC_Too_Slow
; Is the oscillator fast?
cpi XL,Byte1(OSC_Hi) ; cpi TCNT1,6251
ldi tmp1,Byte2(OSC_Hi)
cpc XH,tmp1
brsh OSC_Too_Fast
; The oscillator frequency is within the acceptable limmits
OSC_Done: ret
; Decrease the oscillator frequency
OSC_Too_Fast: lds tmp1,OSCCAL ; OSCCAL--;
dec tmp1
sts OSCCAL,tmp1
rjmp OSC_Test
; Increase the oscillator frequency
OSC_Too_Slow: lds tmp1,OSCCAL ; OSCCAL++;
inc tmp1
sts OSCCAL,tmp1
rjmp OSC_Test
---------------------------------------------------------
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
Here are the two program "warts" that bothered me.
I capitalized them for my own benefit.
Code:
OSC_Too_Fast:
LDS TMP1,OSCCAL ;DECREASE FREQ
DEC TMP1
STS OSCCAL,tmp1
RJMP OSC_Test
OSC_Too_Slow:
LDS TMP1,OSCCAL ;INCREASE FREQ
INC TMP1
STS OSCCAL,TMP1
RJMP OSC_Test
The first thing that strikes me is that, with the exception of the INC
and DEC instructions, the two routines are identical. Perhaps we can
combine them somehow and turn two warts into just one.
We'll start by combining the two writes into one by using a programming technique I call "Sharing-a-Piece-of-Arse!"
Code:
OSC_FAST: LDS TMP1,OSCCAL
DEC TMP1 ;<=== STS OSCCAL REMOVED
RJMP OSC_ADJ ;<=== RE-ADJUSTED
OSC_SLOW: LDS TMP1,OSCCAL
INC TMP1
OSC_ADJ: STS OSCCAL,TMP1 ;<=== SHARED ARSE-END
RJMP OSC_TEST
The next thing I notice is that there are two reads of the OSCCAL
register. If we move that read back into the main program BEFORE these
routines are called we can eliminate another line of code:
Code:
OSC_FAST: DEC TMP1 ;<=== LDS REMOVED
RJMP OSC_ADJ
OSC_SLOW: INC TMP1 ;<=== LDS REMOVED
OSC_ADJ: STS OSCCAL,TMP1
RJMP OSC_TEST
Now we've really shrunk those two routines down into one rather small extension of the main program.
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
Next I'd like to move that write (STS OSCCAL) out of this program
"extension" and back to the main routine. There's no saving in terms
of memory space for doing this, so you'll just have to trust me for
the moment; I'm half-way to doing something that will save space later.
Since the write is the last thing we do before jumping back to the
start of our calibration loop, the ideal place to move it would be
right at the start of that loop.
Remember the LDS of OSCCAL that we removed earlier? A good place for
it would be just before we enter our main calibration loop, but we'll
need to switch to another unused register so we can hold that value
undisturbed. Since tmp1 gets used inside the routine, I'll call this
new register TMP and make it equal to R0.
So after moving the write out and making the other changes,
re-adjusting the RJMPs and cleaning up, this is what we have:
Code:
; MAIN OSCCAL CALIBRATION LOOP
LDS TMP,OSCCAL ;<===MOVED HERE
OSC_TEST:
STS OSCCAL,TMP ;<===MOVED HERE
RET
; OSCCAL ADJUSTMENT ROUTINES
OSC_FAST: DEC TMP
RJMP OSC_TEST ;<===RE-ADJUSTED
OSC_SLOW: INC TMP
RJMP OSC_TEST ;<===RE-ADJUSTED
At this point I think I've answered some of the questions that were
quoted at the start of this thread concerning the initial value of TMP
prior to entering the calibration loop:
Quote:
What value does TMP have when entering the fragment?
Quote:
My main question also revolves around TMP, indirectly.
Quote:
As some body said, what kind of asumptions are you taking?
Well obviously from the code, the value of TMP prior to entering
the main loop is the current value of OSCCAL that we are going to
adjust within the routine. So I hope that answers the above questions
to everyone's satisfaction.
Code:
LDS TMP,OSCCAL
;MAIN OSCCAL CALIBRATION LOOP
OSC_TEST:
I should have posted this one extra line. I didn't expect to be
cross-examined on the routine; I was posting it to help someone who
knew exactly what TMP's value would be.
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
Well, the two "warts" are each down to a single line with a jump back
to the main routine. We've taken two warts, combined them into a single
wart, then turned it into two small blemishes.
I don't think there is much more we can do with them at this point.
We're gonna' have to go back and re-examine the main routine. We'll
start by looking at where these two routines are called from:
Code:
sts TCCR1B,Zero
sbic TIFR1,TOV1
rjmp OSC_Too_Fast ;<== CALL TOO FAST
lds XL,TCNT1L
lds XH,TCNT1H
cpi XL,Byte1(OSC_Lo)
ldi tmp1,Byte2(OSC_Lo)
cpc XH,tmp1
brlo OSC_Too_Slow ;<== CALL TOO SLOW
cpi XL,Byte1(OSC_Hi)
ldi tmp1,Byte2(OSC_Hi)
cpc XH,tmp1
brsh OSC_Too_Fast ;<== CALL TOO FAST
The sequence is TOO_FAST / too_slow / TOO_FAST.
For reasons that will become apparent shortly I want the sequence
to change from 2FAST/2slow/2FAST to 2FAST/2FAST/2slow. We can do this
easily by just switching the last two tests around.
If you'd like to see an actual example of the routine we've worked
out so far, even though we're only half-way into my next reduction,
Giorgos somehow got this incomplete version into his latest Bootloader.
The following is a snippet from his latest source code.
I've highlighted the areas of interest for us (assume TMP=R0).
Notice that the sequence is now 2FAST/2FAST/2slow and that it contains
all the modifications we've made so far.
Code:
lds r0,OSCCAL ;<=== FETCHING OSCCAL PRIOR TO ENTERING LOOP
OSC_Test: sts OSCCAL,r0 ;<=== SETTING OSCCAL AT START OF LOOP
ser tmp1
out TIFR1,tmp1
out TIFR2,tmp1
sbis TIFR2,TOV2
rjmp PC-1
lds XL,TCNT1L
lds XH,TCNT1H
sts TCNT1H,Zero
sts TCNT1L,Zero
sbic TIFR1,TOV1
rjmp OSC_Too_Fast ;<============== TOO_FAST
cpi XL,Byte1(Upper_Limmit)
ldi tmp1,Byte2(Upper_Limmit)
cpc XH,tmp1
brsh OSC_Too_Fast ;<============ TOO_FAST
cpi XL,Byte1(Lower_Limmit)
ldi tmp1,Byte2(Lower_Limmit)
cpc XH,tmp1
brlo OSC_Too_Slow ;<============ TOO_SLOW
OSC_Done: sts TCCR1B,Zero
sts TCCR2A,Zero
ret
OSC_Too_Fast: dec r0 ;<========== LDS & STS REMOVED
rjmp OSC_Test
OSC_Too_Slow: inc r0 ;<========== LDS & STS REMOVED
rjmp OSC_Test
TITLE: DANGEROUS DAN's SHRINKAGE OF THE OSCCAL ROUTINE
One programming "Trick" when you have an A-or-B option like we have
above with the TOO_FAST-or-TOO_SLOW option, is to ASSUME that one of
them is true, and later if it turns out not to be true, you re-adjust.
Code:
DANGEROUS DAN PROGRAM TIP: USE ASSUMPTIONS TO SIMPLIFY YOUR CODE
Obviously it's best to assume the case that will be true most
often. Here we can't know which that is, but we do have TWO calls to
the TOO_FAST routine and only one to TOO_SLOW, so let's choose to
ASSUME that our oscillator is always running too fast.
When the Oscillator is too fast we decrement TMP=R0, so I add that
DEC as the first line of our routine. Now when the Oscillator
actually turns out to be too fast, our assumption is correct, so we just
jump back to the start of the main loop and totally by-pass the old
TOO_FAST routine.
Making this change and removing the TOO_FAST routine saves another program line and streamlines our routine.
While we're at it, another trick on the AVRs, with their many
registers, is to pre-load some of them with values that you might find
handy. Just about everyone has a ZERO, but I also define ONE, TWO,
THREE, FOUR, V128 and FF=255 because they come in handy.
I notice that in the 2nd line of the following code segment Giorgos
is setting TMP1 to 255 using the SER instruction and writing it to the
TIFRn ports. If we use my pre-defined FF register we can knock off
another word from this routine.
Code:
DANGEROUS DAN PROGRAM TIP: USE PRE-DEFINED REGISTERS
Code:
OSC_TEST: DEC R0 ;<======= NEW: ASSUME TOO FAST
STS OSCCAL,R0
;------------------------------
; ser TMP1
; out TIFR1,TMP1
; out TIFR2,TMP1
;------------------------------
OUT TIFR1,FF ;<======== CHANGED
OUT TIFR2,FF ;<======== CHANGED
sbis TIFR2,TOV2
rjmp PC-1
lds XL,TCNT1L
lds XH,TCNT1H
sts TCNT1H,Zero
sts TCNT1L,Zero
sbic TIFR1,TOV1
RJMP OSC_TEST ;<=== RE-ADJUSTED
cpi XL,Byte1(Upper_Limmit)
ldi tmp1,Byte2(Upper_Limmit)
cpc XH,tmp1
BRSH OSC_TEST ;<=== RE-ADJUSTED
cpi XL,Byte1(Lower_Limmit)
ldi tmp1,Byte2(Lower_Limmit)
cpc XH,tmp1
brlo OSC_Too_Slow
OSC_Done: sts TCCR1B,Zero
sts TCCR2A,Zero
RET
;--------------------------------------
; OSC_Too_Fast: dec r0 ;<=== UN-NEEDED!
; rjmp OSC_Test ;<=== UN-NEEDED!
;--------------------------------------
OSC_Too_Slow: inc r0
rjmp OSC_Test
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
We're assuming that the oscillator is running fast and DECrementing R0
without permission so if it turns out that we're wrong we need to
adjust for this.
The simplest solution is to add one to get us back to where we
should be; then we add another one to increment the OSCCAL value. My
first reaction is to simply add another INC R0 to the TOO_SLOW routine.
But we have a register called TWO=2 so instead of adding any more
program steps I simply change the INC R0 to ADD R0,TWO.
Now let's sweep away the discarded code fragment and see where we're at.
Code:
OSC_TEST: DEC R0
STS OSCCAL,R0
OUT TIFR1,FF
OUT TIFR2,FF
sbis TIFR2,TOV2
rjmp PC-1
lds XL,TCNT1L
lds XH,TCNT1H
sts TCNT1H,Zero
sts TCNT1L,Zero
SBIC TIFR1,TOV1
RJMP OSC_TEST
cpi XL,Byte1(Upper_Limmit)
ldi tmp1,Byte2(Upper_Limmit)
CPC XH,tmp1
BRSH OSC_TEST
cpi XL,Byte1(Lower_Limmit)
ldi tmp1,Byte2(Lower_Limmit)
cpc XH,tmp1
brlo OSC_Too_Slow
sts TCCR1B,Zero
sts TCCR2A,Zero
RET
OSC_Too_Slow: ADD R0,TWO ;<=== MODIFIED!
RJMP OSC_Test
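The "assume too fast, fix it up later" idea can be tried out in C on a PC. As a sketch only: measure_count() is a hypothetical straight-line oscillator model (not real Butterfly data), and tmp plays the part of R0.

```c
#include <stdint.h>

/* Hypothetical oscillator model (an assumption, not measured data). */
static uint16_t measure_count(uint8_t osccal)
{
    return (uint16_t)(5400u + 12u * osccal);
}

/* Decrement first on the assumption that we are too fast. If the
 * guess was wrong (too slow), add two: one to undo the decrement that
 * was just made and one to cancel the decrement at the top of the
 * next pass, so the next value tried is one higher. */
uint8_t calibrate_assume_fast(uint8_t tmp, uint16_t lo, uint16_t hi)
{
    for (;;) {
        tmp--;                         /* DEC R0: assume too fast   */
        uint16_t count = measure_count(tmp);
        if (count > hi)
            continue;                  /* guess was right, go again */
        if (count < lo) {
            tmp += 2;                  /* ADD R0,TWO: guess wrong   */
            continue;
        }
        return tmp;                    /* value that passed the test */
    }
}
```

Note that the first value actually tested is one below the starting value. With this toy model, a start of 21 ends at 55 and a start of 100 ends at 62.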
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
Earlier I wanted the structure of the main routine to be switched from
2FAST/2slow/2FAST to 2FAST/2FAST/2slow. This next move will explain
why.
If you remember, we moved the DEC R0 out of our old OSC_TOO_FAST routine
and stuck it in the main-line to eliminate the entire routine. We can do
the same with the ADD R0,TWO in the TOO_SLOW routine and remove that
routine also. This simplifies our code and eliminates another program
statement.
The reason I wanted the 2FAST/2FAST/2slow structure was to make
this move. If we had left it as it was, we'd have to INC R0, then later
ADD R0,TWO, then SUB R0,TWO.
Code:
OSC_TEST: DEC R0
STS OSCCAL,R0
OUT TIFR1,FF
OUT TIFR2,FF
sbis TIFR2,TOV2
rjmp PC-1
lds XL,TCNT1L
lds XH,TCNT1H
sts TCNT1H,Zero
sts TCNT1L,Zero
SBIC TIFR1,TOV1
RJMP OSC_TEST
cpi XL,Byte1(Upper_Limmit)
ldi tmp1,Byte2(Upper_Limmit)
CPC XH,tmp1
BRSH OSC_TEST
ADD R0,TWO ;<=== NEW ADDITION!
cpi XL,Byte1(Lower_Limmit)
ldi tmp1,Byte2(Lower_Limmit)
cpc XH,tmp1
BRLO OSC_TEST ;<=== RE-ADJUSTED
sts TCCR1B,Zero
sts TCCR2A,Zero
RET
So we've finally removed those two blemishes and have a fairly decent piece of code now.
There are two final things I'd like to do. The first is to remove the two lines that shut off the timers, since I deal with that in another part of my code.
The other is to get rid of that ugly rjmp PC-1 on line six.
No programmer I know ever uses this nomenclature for relative jumps. It's
a sure sign that this was a dis-assembly of someone's HEX file.
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
What we have so far looks like this:
Code:
OSC_TEST: DEC R0 ;SET-UP TIMERS
STS OSCCAL,R0
OUT TIFR1,FF
OUT TIFR2,FF
W6103: SBIS TIFR2,TOV2 ;WAIT
RJMP W6103
lds XL,TCNT1L ;READ TIMER
lds XH,TCNT1H
sts TCNT1H,Zero
sts TCNT1L,Zero
SBIC TIFR1,TOV1 ;CHECK TOO FAST
RJMP OSC_TEST
cpi XL,Byte1(Upper_Limmit)
ldi tmp1,Byte2(Upper_Limmit)
CPC XH,tmp1
BRSH OSC_TEST
ADD R0,TWO ;CHECK TOO SLOW
cpi XL,Byte1(Lower_Limmit)
ldi tmp1,Byte2(Lower_Limmit)
cpc XH,tmp1
BRLO OSC_TEST
RET ;RETURN IF WE'RE JUST RIGHT
PRELIMINARY CONCLUSION:
So far we've taken a fairly large piece of code, straightened it out by
removing two ugly routines hanging off the end, and reduced it from
about 32 lines to 22, about 2/3rds of its original size.
Even if you're not hell-bent on reducing a routine's size, reducing
its complexity is always a good thing. Bugs are directly proportional
to some power of the length and complexity of the code. With Microshaft
Windoze being millions of lines of code, is it any wonder it crashes on
a regular basis?
With less "going on" in the silicon, there's less chance of
anything going awry. Also, trying to debug a "flat, smooth" routine is
always far easier and much faster than trying to sort out some lumpy,
bent and twisted piece of "spaghetti code" from an unskilled
code-smithy.
Code:
BUGS ~= [ SIZE x COMPLEXITY ]**P, where P >1
PROGRAMMING TIP: REDUCE SIZE & COMPLEXITY OF CODE
Now that we've removed all the ugliness from the code and reduced
it in size, most people would expect me to stop at this point.
Obviously they don't know me very well, because this is the exact point
where I start pulling out my bag of "dirty" tricks and try to squeeze
the program down even further. I'm not happy until I've beaten a
routine down so far it changes from Coal to Diamond.
TO BE CONTINUED AFTER THE INTERMISSION...
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
THE SAGA CONTINUES: REDUCTIO AD ABSURDUM
From the time we are children, we are taught integers with the visual
aid of a ruler. So we normally think of a single computer integer byte
as running from the value of 0 at one end to "255" at the other end
in a straight line, like a ruler.
However, your microprocessor thinks of integers in a different way
than we humans do. To a digital MCU a byte wraps around on itself. The
"distance" between 0 and 255 is not 255 but 1: if you subtract one from
zero you get 255, and if you add one to 255 you get zero. So it's best
to think of unsigned integers as little circles that wrap around on
themselves, the way a "digital brain" does.
Code:
DAN's PROGRAM TIP: THINK OF UNSIGNED INTEGERS AS LITTLE CIRCLES
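If you want to see the "little circle" for yourself, a few lines of C on a PC will do it. This is just an illustration; the function names are made up for the example.

```c
#include <stdint.h>

/* Unsigned bytes wrap: 0 - 1 gives 255 and 255 + 1 gives 0. */
uint8_t step_down(uint8_t v) { return (uint8_t)(v - 1u); }
uint8_t step_up(uint8_t v)   { return (uint8_t)(v + 1u); }

/* Number of single decrements needed to get from 'from' to 'to'
 * when travelling only downward around the circle. */
uint8_t steps_down_only(uint8_t from, uint8_t to)
{
    return (uint8_t)(from - to);
}
```

Going from 12 down to 10 takes 2 steps, as you would expect, while going from 10 "down" to 12 takes 254 steps the long way around. Either way you get there.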
In our routine we are adjusting the OSCCAL value, which is a single
byte that will wrap around on itself. So instead of incrementing it
when we are too slow and decrementing it when we are too fast,
perhaps we can just take it in ONE direction, knowing it will
eventually wrap around to the value we need. Sure, it will take a little
longer at the microprocessor level, but will that translate into any
real difference on a human scale?
Code:
OSC_TEST: DEC TMP ;SET-UP TIMERS
STS OSCCAL,TMP
OUT TIFR1,FF
OUT TIFR2,FF
W6103: SBIS TIFR2,TOV2 ;WAIT
RJMP W6103
lds XL,TCNT1L ;READ TIMER
lds XH,TCNT1H
sts TCNT1H,ZERO
sts TCNT1L,ZERO
SBIC TIFR1,TOV1 ;CHECK TOO FAST?
RJMP OSC_TEST
cpi XL,Byte1(Upper_Limmit)
ldi tmp1,Byte2(Upper_Limmit)
CPC XH,tmp1 ;CHECK TOO FAST?
BRSH OSC_TEST
ADD TMP,TWO ;<=============== CAN WE REMOVE THIS?
cpi XL,Byte1(Lower_Limmit)
ldi tmp1,Byte2(Lower_Limmit)
cpc XH,tmp1 ;CHECK TOO SLOW?
BRLO OSC_TEST
RET ;OSCCAL IS FINE
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
My latest Bootloader, the [CRICKET], will "chirp" when reset and
"chirp" again after the oscillator's been calibrated, then it waits a
while for you to start your upload and if nothing happens it will give
a final chirp before going to sleep. I did this because I got tired of
having to press on the joystick each time I wanted to upload a new
program.
The time between the two initial "chirps" is the time that it takes to
calibrate the oscillator. So I loaded a version of the Bootloader with
the ADD TMP,TWO included into a Butterfly and another without it into
another and compared results. The one with it missing was slightly
slower, but without the audible clues, no one would ever notice. So on
a human scale there's not much difference.
By safely removing the ADD TMP, TWO line we can save another program step:
Code:
OSC_TEST: DEC TMP ;SET-UP TIMERS
STS OSCCAL,TMP
OUT TIFR1,FF
OUT TIFR2,FF
W6103: SBIS TIFR2,TOV2 ;WAIT
RJMP W6103
lds XL,TCNT1L ;READ TIMER
lds XH,TCNT1H
sts TCNT1H,ZERO
sts TCNT1L,ZERO
SBIC TIFR1,TOV1 ;CHECK TOO FAST?
RJMP OSC_TEST
cpi XL,Byte1(Upper_Limmit)
ldi tmp1,Byte2(Upper_Limmit)
CPC XH,tmp1 ;CHECK TOO FAST?
BRSH OSC_TEST
;<===== ADD TMP,TWO REMOVED!
cpi XL,Byte1(Lower_Limmit)
ldi tmp1,Byte2(Lower_Limmit)
cpc XH,tmp1 ;CHECK TOO SLOW?
BRLO OSC_TEST
RET ;OSCCAL IS FINE
So the logic of our routine has changed: instead of incrementing or
decrementing based on whether the oscillator is fast or slow, it now
only knows that the oscillator is out of range, and decrements the
OSCCAL register. If we happen to be moving it in the "wrong" direction,
no problem, because once it hits ZERO it will wrap to 255 and start
working downward towards the correct setting. In fact the high bit is
not used, so it effectively "wraps" at 127 and the entire process is
very fast.
The fact that it takes a tiny bit longer is actually a bonus,
because when first started, it's best to let the oscillator "stabilize",
and the more time that passes, the better. So not only have we removed
a program step, we've actually improved on the "standard" routine.
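To get a feel for how many extra trips around the loop the one-way search costs, here is a small PC-side model in C. It is a sketch under stated assumptions: measure_count7() is a hypothetical linear oscillator that looks only at the low 7 bits of OSCCAL, and the window passed in is up to you.

```c
#include <stdint.h>

/* Hypothetical model (an assumption): only the low 7 bits of OSCCAL
 * matter, and the count rises linearly with them. */
static uint16_t measure_count7(uint8_t osccal)
{
    return (uint16_t)(5700u + 8u * (osccal & 0x7Fu));
}

/* One-way search: always decrement and rely on wrap-around.
 * *steps receives the number of trial values that were tested.
 * Assumes at least one value lands inside lo..hi. */
uint8_t calibrate_down_only(uint8_t tmp, uint16_t lo, uint16_t hi,
                            unsigned *steps)
{
    unsigned n = 0;
    uint16_t count;
    do {
        tmp--;                         /* DEC TMP (0 wraps to 255)  */
        n++;
        count = measure_count7(tmp);   /* STS OSCCAL,TMP and timing */
    } while (count < lo || count > hi);
    *steps = n;
    return (uint8_t)(tmp & 0x7Fu);     /* effective 7-bit setting   */
}
```

In this model, with a window of 6063 to 6143, starting just above the target (60) settles on 55 after 5 trials. Starting below it (40) goes the long way around and settles on the same 55 after 113 trials, which at about 6.1 ms per trial is roughly 0.7 seconds.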
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
Since the entire logic has changed, we should re-write the test part
of the routine. Now that we're not concerned with whether we're running
fast or slow, but only with whether we're outside the acceptable range,
maybe there's some savings to be had.
Since the Upper and Lower Limits are both constants, perhaps we can
calculate the difference at assembly time. Then we just read the
Oscillator count, subtract the bottom of the range (the ideal count of
6103 less the tolerance) and see if the difference is within that range.
At this point I really did not expect to see much savings, since a
compare is almost the same as a subtraction: instead of testing the
Oscillator reading against an Upper_Limit and a Lower_Limit, we're
subtracting the Lower_Limit and comparing the result to the
difference between the Upper_Limit and the Lower_Limit.
Both approaches would take about the same number of program steps: two
16-bit compares are going to be the same as a 16-bit subtraction and a
16-bit compare. So I stopped here for a while.
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
All that leaves us with is that first test for the Oscillator running
too fast. I wonder if we could do something with these lines of code
that check if our timer has overflowed.
Code:
SBIC TIFR1,TOV1
RJMP OSC_TEST
I've seen OSCCAL routines with 6103 +/- 100uS, but most seem to use +/- 50uS and I use a "tighter" +/- 40uS. (The ideal count is 6103 over a 6103.52uS window, so one timer count is about 1uS.)
This means that our "correct" range is 80uS long. What are the
chances that when the timer overflows it will coincidentally fall
within this range and give us a "False Positive?"
Well, based on pure randomness it would be:
Code:
Probability = Correct_Range/Total_Range x 100%
Probability = 80/65,536 x 100%
Probability = 0.12%
One in a thousand... Hmm, not too bad. However, the real probability is much, much lower than this.
There's a small chance that the Oscillator can be so fast that it's
outside our range test by 10 counts. There's an even smaller
chance that it will be over by 100. There's only a micro-chance that
it will be out by 1000.
For an overflowed timer to read inside our range, it would have to
count all the way past 65,535 and then another 6,100 or so, about
71,600 counts in total, which is nearly twelve times the expected
6,103. The probability that the oscillator could be that far out AND
still fall within my small range to give a false positive is
slim-to-none. We can safely eliminate this test from the code and save
ourselves two more program lines.
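The arithmetic behind that argument is easy to check in C on a PC. This is only a back-of-envelope sketch; the function names are invented for the example, and the "uniformly random" figure is the pessimistic case, not a claim about how real oscillators misbehave.

```c
#include <stdint.h>

/* Chance, in percent, that a uniformly random 16-bit count lands
 * inside a window that is 'width' counts wide. */
double random_hit_percent(unsigned width)
{
    return 100.0 * (double)width / 65536.0;
}

/* What a 16-bit timer reports after counting 'true_count' ticks. */
uint16_t wrapped(uint32_t true_count)
{
    return (uint16_t)(true_count & 0xFFFFu);
}

/* Smallest true count that overflows once and still reads 'lo'. */
uint32_t first_false_positive(uint16_t lo)
{
    return 65536u + (uint32_t)lo;
}
```

random_hit_percent(80) comes out at about 0.12%, and first_false_positive(6063) is 71,599, so the oscillator would have to run at nearly twelve times its nominal speed before the overflow could fool the range test.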
Code:
OSC_TEST: DEC TMP ;SET-UP TIMERS
STS OSCCAL,TMP
OUT TIFR1,FF
OUT TIFR2,FF
W6103: SBIS TIFR2,TOV2 ;WAIT
RJMP W6103
lds XL,TCNT1L ;READ TIMER
lds XH,TCNT1H
sts TCNT1H,ZERO
sts TCNT1L,ZERO
;<=== TIMER OVERFLOW TEST REMOVED!
cpi XL,Byte1(Upper_Limmit)
ldi tmp1,Byte2(Upper_Limmit)
CPC XH,tmp1 ;OSCILLATOR OUT?
BRSH OSC_TEST
cpi XL,Byte1(Lower_Limmit)
ldi tmp1,Byte2(Lower_Limmit)
cpc XH,tmp1 ;OSCILLATOR OUT?
BRLO OSC_TEST
RET ;OSCCAL IS FINE
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
Then it dawned on me...
The reason that re-writing the comparison section wasn't a great idea
was because a 16-bit subtraction and a 16-bit compare are essentially
the same as two 16-bit compares.
However, from the calculations I just made I realize that my
"correct range" is only 80, and that will fit into a byte, so that
translates into a 16x8 bit compare, not a 16x16. This may save us
another line of code.
So the "new" concept was to subtract the lower limit from our reading and check if the result fell within our range:
Code:
TST_OSC: DEC TMP ;SET-UP TIMERS
STS OSCCAL,TMP
OUT TIFR1,FF
OUT TIFR2,FF
W6103: SBIS TIFR2,TOV2 ;WAIT
RJMP W6103
lds XL,TCNT1L ;READ TIMER
lds XH,TCNT1H
sts TCNT1H,ZERO
sts TCNT1L,ZERO
SUBI XL,LOW(6103-40) ;CALC SAFE-RANGE
SBCI XH,HIGH(6103-40)
CPI XL,LOW(UP_LIMIT-LO_LIMIT)
CPC XH,ZERO ;WITHIN RANGE?
BRPL TST_OSC
RET
The above code looks great: nice and small. Then I realize that if the
clock reading is less than 6,103 - 40 then I've got to deal with a
"negative" number. I like to avoid "signed" integers whenever I can
because they have a habit of coming back to bite-you-in-the-butt!
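For what it's worth, the "negative" case can be sidestepped if the whole thing is treated as unsigned, which fits the little-circles idea: a reading below the lower limit wraps around to a very large number and fails the same compare that catches readings above the upper limit. Here is a C sketch of that idea. It is not the route this thread takes, and on the AVR it would assume an unsigned branch (BRSH/BRLO) after the compare rather than BRPL.

```c
#include <stdint.h>

/* In-range test using one subtraction and one unsigned compare.
 * A count below 'lo' wraps to a large unsigned value, so it fails
 * the same compare that rejects counts above 'hi'. */
int in_window(uint16_t count, uint16_t lo, uint16_t hi)
{
    return (uint16_t)(count - lo) <= (uint16_t)(hi - lo);
}
```

With lo = 6063 and hi = 6143, a reading of 6062 becomes 65535 after the subtraction, which is far bigger than 80, so it is rejected without any signed arithmetic.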
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
A SMALL DEVIATION: TOWARDS A SUPER-FAST CALIBRATION ROUTINE
This approach of checking the size of the "Error" of the Oscillator,
rather than using the traditional method of just seeing if it falls
between an upper and lower boundary and incrementing or decrementing
the OSCCAL register, might be useful to some.
If you wanted a super-fast calibration, you could measure the size
of the error and adjust the OSCCAL register accordingly, rather than in
small increments/decrements of one, which is the method that seems to
be used by most.
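As a rough idea of what such a proportional adjustment might look like, here is a C sketch. Everything in it is an assumption made for illustration: counts_per_step (how many timer counts one OSCCAL step is worth) is device-dependent and would have to be measured, and the 0 to 127 clamp assumes a 7-bit OSCCAL.

```c
#include <assert.h>
#include <stdint.h>

/* Jump OSCCAL by roughly error / counts_per_step instead of by one.
 * counts_per_step is an assumed, device-dependent figure. */
uint8_t adjust_proportional(uint8_t osccal, uint16_t count,
                            uint16_t ideal, uint16_t counts_per_step)
{
    assert(counts_per_step != 0);
    int32_t error = (int32_t)count - (int32_t)ideal;  /* + means fast */
    int32_t steps = error / (int32_t)counts_per_step;
    if (steps == 0 && error != 0)
        steps = (error > 0) ? 1 : -1;  /* always move at least one   */
    int32_t next = (int32_t)osccal - steps;
    if (next < 0)
        next = 0;
    if (next > 127)
        next = 127;                    /* 7-bit OSCCAL assumed       */
    return (uint8_t)next;
}
```

The caller would still check the reading against its window first and only call this when the reading is out of range, otherwise the "move at least one" rule would never let it settle.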
Since this particular routine is done shortly after power-up (or
reset) the longer it takes to adjust the OSCCAL register, the better,
because it gives the oscillator more time to "settle."
Okay, I think I've been a "deviant" long enough, time to get back
to the main topic of this thread...the shrinking of the OSCCAL Routine.
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
I went back and looked at the actual typical values of the Upper and Lower Limits.
Typical Lower-Limit:
Code:
LOWER = IDEALTIME - 50uS
LOWER = 6103 - 50
LOWER = 6053 = $17:A5
Typical Upper-Limit:
Code:
UPPER = IDEAL + 50uS
UPPER = 6103 + 50
UPPER = 6153 = $18:09
Notice that the high bytes differ only by one, and that if I reduced
the upper value by 10, then the high byte would drop to $17 and both
high bytes would be the same:
My Lower-Limit:
Code:
LOWER = IDEALTIME - 40uS
LOWER = 6103 - 40
LOWER = 6063 = $17:AF
My Upper-Limit:
Code:
UPPER = IDEAL + 40uS
UPPER = 6103 + 40
UPPER = 6143 = $17:FF
This means that if we use +/- 40uS instead of +/- 50uS we can
simplify part of our routine by simply checking if the high byte equals
$17. You'll learn in the next post why it is important that I use
exactly +/- 40uS.
Code:
TSTOSC: DEC TMP ;SET-UP TIMERS
STS OSCCAL,TMP
OUT TIFR1,FF
OUT TIFR2,FF
W6103: SBIS TIFR2,TOV2 ;WAIT
RJMP W6103
LDS XL,TCNT1L ;READ TIMER
LDS XH,TCNT1H
STS TCNT1H,ZERO
STS TCNT1L,ZERO
CPI XH,23 ;<===== CHECK IF HIGH BYTE IS $17
BRNE TSTOSC
(INCOMPLETE AS YET)
RET
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
Now that we've "taken-care-of" the high byte, all we need to do is
check if the lower byte is above or below our selected range. That
amounts to just two single-byte compares.
However, if we can "set-things-up" so that our "range" begins or
ends at ZERO or 255 then we need only check in "one direction." That
means a single compare statement instead of two.
You might have missed it first-time-around, so look again at the Lower-Byte value of the Upper-Limit in hex notation. It's $FF:
My Upper-Limit:
Code:
UPPER = IDEAL + 40uS
UPPER = 6103 + 40
UPPER = 6143 = $17:FF
I chose to use +40uS so that the test for our lower-byte would fall
on the byte's upper "boundary" at $FF=255. This means that all we need
to do is compare our lower-byte against our lower-boundary value ($AF),
and if we fall below it then our oscillator is out-of-range.
Let me expand on this "trick" in case some readers don't get it, because it is a little hard to follow...
Remember earlier I said that single unsigned bytes are actually
like little circles and not little rulers. Well, suppose we compare our
timer against our range, and the range just happens to end at $FF=255.
If the count goes over $17:FF, the low byte "wraps-around" to ZERO (and
the high byte becomes $18), so now we're actually UNDER, and that
counts as a failure.
Also, if we compare against our range and we actually do fall under,
then that's a failure too. So we've magically combined two tests, one
for over and another for under, into a single test for being under,
because the over will "wrap" to being under.
So our final program looks like this:
Code:
TSTOSC: DEC TMP ;SET-UP TIMERS
STS OSCCAL,TMP
OUT TIFR1,FF
OUT TIFR2,FF
W6103: SBIS TIFR2,TOV2 ;WAIT
RJMP W6103
LDS XL,TCNT1L ;READ TIMER
LDS XH,TCNT1H
STS TCNT1H,ZERO
STS TCNT1L,ZERO
CPI XH,23 ;<===== CHECK IF HIGH BYTE IS $17
BRNE TSTOSC
CPI XL,175 ;<===== CHECK IF UNDER $AF
BRLO TSTOSC ;<=== ALSO SNEAKY TEST IF OVER $FF
RET
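If you don't quite trust the two-compare test at the end of the routine above, it can be checked by brute force on a PC. The C sketch below (function names invented for the example) runs every possible 16-bit timer value through both the byte-wise test and a plain 6103 +/- 40 window and confirms they never disagree.

```c
#include <stdint.h>

/* The two-compare test: the high byte must be $17 and the low byte
 * must not be below $AF. */
int byte_test(uint16_t count)
{
    uint8_t hi = (uint8_t)(count >> 8);
    uint8_t lo = (uint8_t)(count & 0xFFu);
    return hi == 0x17u && lo >= 0xAFu;
}

/* Plain 16-bit window for comparison: 6103 - 40 to 6103 + 40. */
int window_test(uint16_t count)
{
    return count >= 6063u && count <= 6143u;
}

/* Returns 1 when both tests agree for every possible 16-bit count. */
int tests_agree(void)
{
    for (uint32_t c = 0; c <= 0xFFFFu; c++)
        if (byte_test((uint16_t)c) != window_test((uint16_t)c))
            return 0;
    return 1;
}
```

tests_agree() returns 1, so the pair of single-byte compares accepts exactly the counts from 6063 ($17:AF) to 6143 ($17:FF) and nothing else.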
TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE
So after we add a few lines of set-up the final routine looks like the
below, which should be fairly close to the routine I posted in Giorgos'
thread:
Code:
LDS TMP,OSCCAL
TSTOSC: DEC TMP ;SET-UP TIMERS
STS OSCCAL,TMP
OUT TIFR1,FF
OUT TIFR2,FF
STS TCNT1H,ZERO
STS TCNT1L,ZERO
STS TCCR1B,ONE
STS TCNT2,ZERO
W6103: SBIS TIFR2,1 ;WAIT
RJMP W6103
LDS XL,TCNT1L ;READ TIMER
LDS XH,TCNT1H
CPI XH,23 ;<=== HIGH BYTE = $17?
BRNE TSTOSC
CPI XL,175 ;<=== LOW BETWEEN $AF-$FF?
BRLO TSTOSC
RET
IN CONCLUSION:
Well, I hope I've answered all your questions about my condensed
OSCCAL Routine: how it works, why it works and how it got to its
present form.
I certainly hope you found the trip entertaining as well as informative.
I've used this routine now for hundreds of uploads to Butterflies and
have not experienced a single problem. There have been hundreds of my
Bootloaders downloaded which use this OSCCAL Routine and I have yet to
receive any reports of problems.
CONTINUING EDUCATION:
To learn more about the AVR Butterfly in general, you can visit the Butterfly & Beginner's Web Site at:
or you can visit the Butterfly & Beginners Forum at: or the AVR Assembler Site at:
REQUEST FOR FEEDBACK:
If you found this tutorial discussion interesting and/or entertaining
and would like to see more like it, please let the moderator(s) know.
Thank you for your time and consideration.
Have a wonderful day!