2008-03-26 09:11:41

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

SUBTITLE: I WISH I WAS AN OSCCAL MYER WINNER

UPDATE APR 5/06: C VERSION ONLINE!

Many thanks to ARod (alejmrm), who ported and posted a "C" version of the following ASM OSCCAL routine. There are two versions on the third page, fourth post down. Thanks a million ARod!

PURPOSE:

Discuss the methods used to reduce the size of a typical OSCCAL routine using AVR machine language.

Quote:
Your code will not properly calibrate the oscillator!

Quote:
It is clear from this that your code does not do the same operations that the designers of the Butterfly thought were required for accurate calibration!

Quote:
What value does TMP have when entering the fragment?

Quote:
My main question also revolves around TMP, indirectly.

Quote:
Seems to be an interest in the way that you get the OSCCAL value, and I am interested in the details... do you mind to move it to a new thread to talk about it? As some body said, what kind of asumptions are you taking?, etc...

Thanks.. really clever ideas!



PREAMBLE:


When Giorgos announced the latest version of his Butterfly Bootloader, his first complaint about condensing its size was the length of the OSCCAL routine. Since I had to deal with the same problem myself when writing bootloaders, I thought I might help him out by giving him my much smaller routine. At the time I did not expect the critical reaction and wide interest in the routine.


Because the many comments about the routine and how it works were messing up his original thread, and because of private mail requests, I started this thread to answer all those questions without hijacking his. I hope you find this discussion informative as well as entertaining.


CAVEAT LECTOR:


Some of the techniques contained herein are outside normal coding practices and may be offensive to some programmers.




TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

INTRODUCTION: THE HEART OF THE OSCCAL ROUTINE


The heart of the OSCCAL routine is very simple. The basic idea is to set up a counter clocked by the Oscillator; after a fixed amount of time you examine the counter to see whether the Oscillator is running fast or slow. You then adjust the OSCCAL register and go back and check again. We repeat this process until the Oscillator is running within an acceptable range.
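
The loop just described can be modelled off-target. What follows is a minimal Python sketch, not the AVR code: `count_for` is a hypothetical stand-in for the oscillator (it assumes the count rises linearly with the OSCCAL value, which a real part only approximates), and the `counts_per_step` and `centre` figures are invented for illustration. The ideal count of 6103 and the +/- 50 limits are the typical values quoted elsewhere in this thread.

```python
# Off-target model of the basic calibration loop (a sketch, not the AVR code).
IDEAL = 6103                       # counts expected in one reference window
LOWER, UPPER = IDEAL - 50, IDEAL + 50

def count_for(osccal, counts_per_step=30, centre=64):
    """Hypothetical oscillator: the count grows linearly with OSCCAL."""
    return IDEAL + (osccal - centre) * counts_per_step + 7

def calibrate(osccal):
    """Nudge osccal one step at a time until the count is in range."""
    steps = 0
    while True:
        count = count_for(osccal)
        if count < LOWER:
            osccal += 1            # too slow: speed the oscillator up
        elif count > UPPER:
            osccal -= 1            # too fast: slow the oscillator down
        else:
            return osccal, steps   # within the acceptable range
        steps += 1

final, steps = calibrate(100)
print(f"settled at OSCCAL={final} after {steps} adjustments")
```

Starting from either side of the target, the model settles as soon as the count enters the window, which is all the real routine has to achieve.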


Because the Oscillator rate can change with temperature and other conditions, I go back and re-calibrate it each time the Butterfly wakes, something the other Bootloaders seem to have missed.


I am going to dig up an older bootloader with the original OSCCAL routine, compare it to my current condensed version, and try to remember the steps and logic I used to get from one to the other.


CONTENTS AND CODE FRAGMENTS:


For the purpose of this discussion I am only focusing on the actual OSCCAL adjustment loop and not the pre-amble or set-up.


As you examine the code fragments and techniques used, please keep in mind that the goal is to produce the smallest code possible and many traditional and accepted coding practices may go out-the-window.


FINAL WARNING:


Make sure that all the necessary precautions have been taken... you are about to enter the mind of a certified Hack.



TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Here's a typical OSCCAL routine. This one is clipped from George Kolovos' dis-assembly of the Atmel *.hex file.


The first thing that jumps-out-at-me are the two routines at the bottom that adjust the OSCCAL register. To me they stand out like two warts on an otherwise straight-forward routine. So I would focus on those first.

Code:

; //////////////////////////////////////////////////////////////////////////////
; //    Description:   An enhanced bootloader for the ATMEL Butterfly unit    //
; //    Copyright (C) 2006  George Kolovos       //
; //                                                                          //
; //////////////////////////////////////////////////////////////////////////////

-------------------------------------------------


      ; Test the internal oscillator speed
 OSC_Test:   ser   tmp1      ; Reset any TC1 and TC2 flags
       out   TIFR1,tmp1
       out   TIFR2,tmp1
      sts   TCNT1H,Zero   ; Reset TCNT1
       sts   TCNT1L,Zero
      sts   TCNT2,Zero   ; Reset TCNT2
      ldi   tmp1,1      ; Start TC1
       sts   TCCR1B,tmp1
      ; Wait for the TC2 compare-match
      sbis   TIFR2,OCF2A   ; 200 *   1/32768   = 6103.52us
       rjmp   PC-1
      sts   TCCR1B,Zero   ; Time-out: Stop the TC1
      sbic   TIFR1,TOV1   ; TC1 overflowed?
       rjmp   OSC_Too_Fast
      lds   XL,TCNT1L   ; X = TCNT1
       lds   XH,TCNT1H
      ;  Is the oscillator slow?
      cpi   XL,Byte1(OSC_Lo) ; cpi TCNT1,6120
       ldi   tmp1,Byte2(OSC_Lo)
       cpc   XH,tmp1
       brlo   OSC_Too_Slow
      ;  Is the oscillator fast?
      cpi   XL,Byte1(OSC_Hi) ; cpi TCNT1,6251
       ldi   tmp1,Byte2(OSC_Hi)
       cpc   XH,tmp1
       brsh   OSC_Too_Fast
      ; The oscillator frequency is within the acceptable limits
 OSC_Done:   ret

      ; Decrease the oscillator frequency
 OSC_Too_Fast:   lds   tmp1,OSCCAL   ; OSCCAL--;
       dec   tmp1
       sts   OSCCAL,tmp1
      rjmp   OSC_Test

      ; Increase the oscillator frequency
 OSC_Too_Slow:   lds   tmp1,OSCCAL   ; OSCCAL++;
       inc   tmp1
       sts   OSCCAL,tmp1
      rjmp   OSC_Test

---------------------------------------------------------
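
The comment on the wait loop gives the reference window as 200 * 1/32768 s. A quick Python check of that arithmetic, and of the timer count it implies, is sketched below. The 1 MHz figure is an assumption inferred from the limits of roughly 6100 counts used in the listing; the names are illustrative only.

```python
# Reference-window arithmetic (a sketch; CPU_HZ is an assumed target).
RTC_HZ = 32768          # watch crystal ticks per second
COMPARE_TICKS = 200     # crystal ticks in one reference window
CPU_HZ = 1_000_000      # assumed speed of the calibrated oscillator

window_us = COMPARE_TICKS / RTC_HZ * 1e6        # length of the window in microseconds
ideal_count = int(window_us * CPU_HZ / 1e6)     # whole timer counts in that window

print(f"window = {window_us:.2f} us, ideal count = {ideal_count}")
```

At one count per microsecond the window length and the ideal count are the same number, which is why the thread can quote the tolerance in either unit.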




TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Here are the two program "warts" that bothered me.

I capitalized them for my own benefit.
Code:

OSC_Too_Fast:
        LDS    TMP1,OSCCAL  ;DECREASE FREQ
        DEC    TMP1
        STS    OSCCAL,TMP1
         RJMP  OSC_Test

OSC_Too_Slow:
        LDS    TMP1,OSCCAL  ;INCREASE FREQ
        INC    TMP1
        STS    OSCCAL,TMP1
         RJMP  OSC_Test

The first thing that strikes me is that, with the exception of the INC & DEC statements, the two routines are identical. Perhaps we can combine them somehow and turn two warts into just one.

We'll start by combining the two writes into one by using a programming technique I call "Sharing-a-Piece-of-Arse!"


Code:
OSC_FAST: LDS    TMP1,OSCCAL
          DEC    TMP1        ;<=== STS OSCCAL REMOVED
           RJMP  OSC_ADJ     ;<=== RE-ADJUSTED
OSC_SLOW: LDS    TMP1,OSCCAL
          INC    TMP1
OSC_ADJ:  STS    OSCCAL,TMP1 ;<===  SHARED ARSE-END
           RJMP  OSC_TEST


The next thing I notice is that there are two reads of the OSCCAL register. If we move that read back into the main program BEFORE these routines are called, we can eliminate another line of code:


Code:
OSC_FAST: DEC    TMP1        ;<=== LDS REMOVED
           RJMP  OSC_ADJ
OSC_SLOW: INC    TMP1
OSC_ADJ:  STS    OSCCAL,TMP1 ;<=== LDS REMOVED
           RJMP  OSC_TEST

Now we've really shrunk those two routines down into one rather small extension of the main program.





TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Next I'd like to move that write (STS OSCCAL) out of this program "extension" and back to the main routine. There are no savings in terms of memory space for doing this, so you'll just have to trust me for the moment; I have something in mind for later.


Since the write is the last thing we do before jumping back to the start of our calibration loop, the ideal place to move it would be right at the start.


We're not saving anything by doing this, but I'm half-way to doing something that will. So just wait a bit.


Remember the LDS OSCCAL statement that we removed earlier? A good place for it would be just before we enter our main calibration loop, but we'll need to switch to another unused register so we can hold that value undisturbed. Since tmp1 gets used inside the routine, I'll call this new register TMP and make it equal to R0.


Remember we haven't saved anything yet, however I am working up to something that requires that I make this move. So after moving the write out and making the other changes, re-adjusting the RJMPs and cleaning up, this is what we have:

Code:

; MAIN OSCCAL CALIBRATION LOOP
       LDS TMP,OSCCAL    ;<===MOVED HERE
OSC_TEST:
       STS OSCCAL,TMP    ;<===MOVED HERE
     

       RET

; OSCCAL ADJUSTMENT ROUTINES

OSC_FAST: DEC    TMP
           RJMP  OSC_TEST  ;<===RE-ADJUSTED
OSC_SLOW: INC    TMP
           RJMP  OSC_TEST  ;<===RE-ADJUSTED


At this point I think I've answered some of the questions that were quoted at the start of this thread concerning the initial value of TMP prior to entering the calibration loop:


Quote:
What value does TMP have when entering the fragment?

Quote:
My main question also revolves around TMP, indirectly.

Quote:
As some body said, what kind of asumptions are you taking?


Well obviously from the code, the value of TMP prior to entering the main loop is the current value of OSCCAL that we are going to adjust within the routine. So I hope that answers the above questions to everyone's satisfaction.

Code:

       LDS TMP,OSCCAL

;MAIN OSCCAL CALIBRATION LOOP

OSC_TEST:

I should have posted this one extra line. I didn't expect to be cross-examined on the routine; I was posting it to help someone who knew exactly what TMP's value would be.



TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Well, the two "warts" are each down to a single line with a jump back to the main routine. We've taken two warts, combined them into a single wart, then turned it into two small blemishes.


I don't think there is much more we can do with them at this point. We're gonna' have to go back and re-examine the main routine. We'll start by looking at where these two routines are called from:

Code:

      sts   TCCR1B,Zero
      sbic   TIFR1,TOV1
       rjmp   OSC_Too_Fast   ;<== CALL TOO FAST

      lds   XL,TCNT1L
       lds   XH,TCNT1H
      cpi   XL,Byte1(OSC_Lo)
       ldi   tmp1,Byte2(OSC_Lo)
       cpc   XH,tmp1
       brlo   OSC_Too_Slow  ;<== CALL TOO SLOW

      cpi   XL,Byte1(OSC_Hi)
       ldi   tmp1,Byte2(OSC_Hi)
       cpc   XH,tmp1
       brsh   OSC_Too_Fast   ;<== CALL TOO FAST


The sequence is TOO_FAST / too_slow / TOO_FAST.


For reasons that will become apparent shortly I want the sequence to change from 2FAST/2slow/2FAST to 2FAST/2FAST/2slow. We can do this easily by just switching the last two tests around.


If you'd like to see an actual example of the routine we've worked out so far, even though we're only half-way into my next reduction, Giorgos somehow got this incomplete version into his latest Bootloader.


The following is a snippet from his latest source code.

I've highlighted the areas of interest for us (assume TMP=R0). Notice that the sequence is now 2FAST/2FAST/2slow and that it contains exactly the modifications we've made so far.
Code:

       lds   r0,OSCCAL ;<=== FETCHING OSCCAL PRIOR TO ENTERING LOOP
      
      
 OSC_Test:   sts   OSCCAL,r0 ;<=== SETTING OSCCAL AT START OF LOOP
      ser   tmp1
       out   TIFR1,tmp1
       out   TIFR2,tmp1
      sbis   TIFR2,TOV2
       rjmp   PC-1
      lds   XL,TCNT1L
       lds   XH,TCNT1H
       sts   TCNT1H,Zero
       sts   TCNT1L,Zero
      sbic   TIFR1,TOV1
       rjmp OSC_Too_Fast ;<============== TOO_FAST
      cpi   XL,Byte1(Upper_Limmit)
       ldi   tmp1,Byte2(Upper_Limmit)
       cpc   XH,tmp1
       brsh   OSC_Too_Fast ;<============ TOO_FAST

      cpi   XL,Byte1(Lower_Limmit)
       ldi   tmp1,Byte2(Lower_Limmit)
       cpc   XH,tmp1
       brlo   OSC_Too_Slow ;<============ TOO_SLOW
      
 OSC_Done:   sts   TCCR1B,Zero
       sts   TCCR2A,Zero
      ret

 OSC_Too_Fast:   dec   r0 ;<========== LDS & STS REMOVED
       rjmp   OSC_Test

 OSC_Too_Slow:   inc   r0 ;<========== LDS & STS REMOVED
       rjmp   OSC_Test   


TITLE: DANGEROUS DAN's SHRINKAGE OF THE OSCCAL ROUTINE

One programming "Trick" when you have an A-or-B option like we have above with the TOO_FAST-or-TOO_SLOW option, is to ASSUME that one of them is true, and later if it turns out not to be true, you re-adjust.

Code:
DANGEROUS DAN PROGRAM TIP: USE ASSUMPTIONS TO SIMPLIFY YOUR CODE

Obviously it's best to assume the case that will be true most often. Here we don't know which case that is, but we do have TWO calls to the TOO_FAST routine and only one to TOO_SLOW, so let's choose to ASSUME that our oscillator is always running too fast.


I notice that when the Oscillator is too fast we decrement TMP=R0 so I add this as the first line of our routine. Now when the Oscillator actually turns out to be too fast our assumption is correct so we just jump back to the start of the main loop and totally by-pass the old TOO_FAST routine.


Making this change and removing the TOO_FAST routine we end-up saving another program line and streamline our routine.


While we're at it another trick with the AVRs with so many registers is to pre-define them to values that you might find handy. Just about everyone has a ZERO, but I also define ONE, TWO, THREE, FOUR, V128 and FF=255 because I find they come-in-handy.


I notice that in the 2nd line of the following code segment Giorgos is setting the TMP1 to 255 using the SER command and writing it to the TIFRn ports. If we use my pre-defined FF Register we can knock off another word from this routine.

Code:
DANGEROUS DAN PROGRAM TIP: USE PRE-DEFINED REGISTERS

Code:

OSC_TEST: DEC R0   ;<======= NEW: ASSUME TOO FAST
          STS OSCCAL,R0
;------------------------------
;          ser TMP1
;          out TIFR1,TMP1
;          out  TIFR2,TMP1
;------------------------------
          OUT TIFR1,FF  ;<======== CHANGED
          OUT TIFR2,FF  ;<======== CHANGED
      sbis   TIFR2,TOV2
       rjmp   PC-1
      lds   XL,TCNT1L
       lds   XH,TCNT1H
       sts   TCNT1H,Zero
       sts   TCNT1L,Zero

      sbic   TIFR1,TOV1
       RJMP   OSC_TEST ;<=== RE-ADJUSTED

      cpi   XL,Byte1(Upper_Limmit)
       ldi   tmp1,Byte2(Upper_Limmit)
       cpc   XH,tmp1
       BRSH   OSC_TEST ;<=== RE-ADJUSTED

      cpi   XL,Byte1(Lower_Limmit)
       ldi   tmp1,Byte2(Lower_Limmit)
       cpc   XH,tmp1
       brlo   OSC_Too_Slow

 OSC_Done:   sts   TCCR1B,Zero
       sts   TCCR2A,Zero
       RET

;--------------------------------------

; OSC_Too_Fast:   dec   r0              ;<=== UN-NEEDED!
;       rjmp   OSC_Test  ;<=== UN-NEEDED!
;--------------------------------------

 OSC_Too_Slow:   inc   r0
       rjmp   OSC_Test



TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

We're assuming that the oscillator is running fast and DECrementing R0 without permission so if it turns out that we're wrong we need to adjust for this.


The simplest solution is to add one to get us back to where we should be; then we add another one to increment the OSCCAL value. My first reaction is to simply add another INC R0 to the TOO_SLOW routine. But we have a register called TWO=2 so instead of adding any more program steps I simply change the INC R0 to ADD R0,TWO.
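
One way to convince yourself that the ADD R0,TWO replacement preserves behaviour is to compare, for every possible register value, the next value each scheme would write to OSCCAL. The Python model below is only a sketch of that check: `written` stands for the value most recently stored in OSCCAL, and both function names are made up for illustration.

```python
# Compare the next OSCCAL value produced by the two schemes (a sketch).

def two_branch_next(written, too_fast):
    """Original scheme: pick a branch, then DEC or INC by one."""
    if too_fast:
        return (written - 1) & 0xFF
    return (written + 1) & 0xFF

def condensed_next(written, too_fast):
    """Condensed scheme: ADD 2 only when too slow, then the DEC at OSC_TEST."""
    r0 = written
    if not too_fast:
        r0 = (r0 + 2) & 0xFF       # ADD R0,TWO in the TOO_SLOW path
    return (r0 - 1) & 0xFF         # DEC R0 at the top of the loop

mismatches = [(v, f) for v in range(256) for f in (True, False)
              if two_branch_next(v, f) != condensed_next(v, f)]
print("mismatches:", len(mismatches))
```

The `& 0xFF` mirrors the 8-bit register, so the comparison also covers the wrap-around cases at 0 and 255.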


Now let's sweep away the discarded code fragment and see where we're at.


Code:

OSC_TEST: DEC R0
          STS OSCCAL,R0
          OUT TIFR1,FF
          OUT TIFR2,FF
       sbis   TIFR2,TOV2
       rjmp   PC-1
       lds   XL,TCNT1L
       lds   XH,TCNT1H
       sts   TCNT1H,Zero
       sts   TCNT1L,Zero

       SBIC   TIFR1,TOV1
       RJMP   OSC_TEST

       cpi   XL,Byte1(Upper_Limmit)
       ldi   tmp1,Byte2(Upper_Limmit)
       CPC   XH,tmp1
       BRSH   OSC_TEST

       cpi   XL,Byte1(Lower_Limmit)
       ldi   tmp1,Byte2(Lower_Limmit)
       cpc   XH,tmp1
       brlo   OSC_Too_Slow

       sts   TCCR1B,Zero
       sts   TCCR2A,Zero
       RET

 OSC_Too_Slow:   ADD R0,TWO    ;<=== MODIFIED!
       RJMP  OSC_Test


TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Earlier I wanted the structure of the main routine to be switched from 2FAST/2slow/2FAST to 2FAST/2FAST/2slow. This next move will explain why.


If you remember, we moved the DEC R0 out of our old OSC_TOO_FAST routine and stuck it in the main-line to eliminate the entire routine. We can do the same with the ADD R0,TWO in the TOO_SLOW routine and remove that routine as well. This simplifies our code and eliminates another program statement.


The reason I wanted the 2FAST/2FAST/2slow structure was to make this move. If we had left it as it was, we'd have to INC R0, then later ADD R0,TWO, then SUB R0,TWO.


Code:

OSC_TEST: DEC R0
          STS  OSCCAL,R0
          OUT  TIFR1,FF
          OUT  TIFR2,FF
          sbis TIFR2,TOV2
           rjmp PC-1
          lds  XL,TCNT1L
          lds  XH,TCNT1H
          sts  TCNT1H,Zero
          sts  TCNT1L,Zero

          SBIC TIFR1,TOV1
           RJMP OSC_TEST
          cpi  XL,Byte1(Upper_Limmit)
          ldi  tmp1,Byte2(Upper_Limmit)
          CPC  XH,tmp1
           BRSH  OSC_TEST

          ADD  R0,TWO   ;<=== NEW ADDITION!
          cpi  XL,Byte1(Lower_Limmit)
          ldi  tmp1,Byte2(Lower_Limmit)
          cpc  XH,tmp1
           BRLO  OSC_TEST  ;<=== RE-ADJUSTED

          sts  TCCR1B,Zero
          sts  TCCR2A,Zero
           RET


So we've finally removed those two blemishes and have a fairly decent piece of code now.


There are two final things I'd like to do. The first is to remove the two lines that shut off the timers; I deal with that in another part of my code.


The other is to get rid of that ugly rjmp PC-1 on line six. No programmer I know ever uses this nomenclature for relative jumps. It's a sure sign that this was a dis-assembly of someone's HEX file.


TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

What we have so far looks like this:

Code:

OSC_TEST: DEC R0            ;SET-UP TIMERS
          STS  OSCCAL,R0
          OUT  TIFR1,FF
          OUT  TIFR2,FF
W6103:    SBIS TIFR2,TOV2   ;WAIT
           RJMP W6103
          lds  XL,TCNT1L    ;READ TIMER
          lds  XH,TCNT1H
          sts  TCNT1H,Zero
          sts  TCNT1L,Zero

          SBIC TIFR1,TOV1   ;CHECK TOO FAST
           RJMP OSC_TEST
          cpi  XL,Byte1(Upper_Limmit)
          ldi  tmp1,Byte2(Upper_Limmit)
          CPC  XH,tmp1
           BRSH  OSC_TEST

          ADD  R0,TWO       ;CHECK TOO SLOW
          cpi  XL,Byte1(Lower_Limmit)
          ldi  tmp1,Byte2(Lower_Limmit)
          cpc  XH,tmp1
           BRLO  OSC_TEST
            RET             ;RETURN IF WE'RE JUST RIGHT

PRELIMINARY CONCLUSION:


So far we've taken a fairly large piece of code, straightened it out by removing two ugly routines hanging off the end, and reduced it from about 32 lines to 22, about 2/3rds its original size.


Even if you're not hell-bent on reducing a routine's size, reducing its complexity is always a good thing. Bugs are directly proportional to some power of the length and complexity of the code. With Microshaft Windoze being millions of lines of code, is it any wonder it crashes on a regular basis?


With less "going on" in the silicon, there's less chance of anything going awry. Also, trying to debug a "flat, smooth" routine is always far easier and much faster than trying to sort out some lumpy, bent and twisted piece of "spaghetti code" from an unskilled code-smithy.


Code:

BUGS ~= [ SIZE x COMPLEXITY ]**P, where P >1

PROGRAMMING TIP: REDUCE SIZE & COMPLEXITY OF CODE


Now that we've removed all the ugliness from the code and reduced it in size, most people would expect me to stop at this point.


Obviously they don't know me very well, because this is the exact point where I start pulling out my bag of "dirty" tricks and try to squeeze the program down even further. I'm not happy until I've beaten a routine down so far it changes from Coal to Diamond.



TO BE CONTINUED AFTER THE INTERMISSION...






TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

THE SAGA CONTINUES: REDUCTUM AD ABSURDUM


From the time we are children, we are taught integers with the visual aid of a ruler. So we normally think of a single computer integer byte as running in a straight line from the value 0 at one end to 255 at the other, like a ruler.


However, your microprocessor thinks of integers in a different way than us humans. To a digital MCU a byte wraps around on itself. The "distance" between 0 and 255 is not 255 but 1, if you subtract one from zero you get 255 and if you add one to 255 you get zero. So it's best to think of unsigned integers as little circles that wrap around on themselves the way a "digital brain" does.


Code:
DAN's PROGRAM TIP: THINK OF UNSIGNED INTEGERS AS LITTLE CIRCLES
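
The "little circle" picture is easy to try out away from the chip. The Python sketch below masks every result to 8 bits to imitate a single AVR register; the helper names are illustrative and do not come from any library.

```python
# 8-bit wrap-around arithmetic, imitating a single AVR register (a sketch).

def dec8(value):
    """One DEC on an 8-bit register: 0 wraps to 255."""
    return (value - 1) & 0xFF

def inc8(value):
    """One INC on an 8-bit register: 255 wraps to 0."""
    return (value + 1) & 0xFF

def distance8(a, b):
    """Fewest single steps between a and b going either way round the circle."""
    d = (a - b) & 0xFF
    return min(d, 256 - d)

print(dec8(0), inc8(255), distance8(0, 255))
```

On a ruler 0 and 255 are 255 apart; on the circle they are neighbours, which is the property the rest of the thread exploits.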


In our routine we are adjusting the OSCCAL value, which is a single byte that will wrap around on itself. So instead of incrementing it when we are too slow and decrementing it when we are too fast, perhaps we can just take it in ONE direction, knowing it will eventually wrap around to the value we need. Sure, it will take a little longer at the microprocessor level, but will that translate into any real difference on a human scale?


Code:

OSC_TEST: DEC TMP           ;SET-UP TIMERS
          STS  OSCCAL,TMP
          OUT  TIFR1,FF
          OUT  TIFR2,FF
W6103:    SBIS TIFR2,TOV2   ;WAIT
           RJMP W6103
          lds  XL,TCNT1L    ;READ TIMER
          lds  XH,TCNT1H
          sts  TCNT1H,ZERO
          sts  TCNT1L,ZERO

          SBIC TIFR1,TOV1   ;CHECK TOO FAST?
           RJMP OSC_TEST
          cpi  XL,Byte1(Upper_Limmit)
          ldi  tmp1,Byte2(Upper_Limmit)
          CPC  XH,tmp1      ;CHECK TOO FAST?
           BRSH  OSC_TEST

          ADD  TMP,TWO ;<=============== CAN WE REMOVE THIS?
          cpi  XL,Byte1(Lower_Limmit)
          ldi  tmp1,Byte2(Lower_Limmit)
          cpc  XH,tmp1      ;CHECK TOO SLOW?
           BRLO  OSC_TEST
            RET             ;OSCCAL IS FINE


TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

My latest Bootloader, the [CRICKET], will "chirp" when reset and "chirp" again after the oscillator's been calibrated. Then it waits a while for you to start your upload, and if nothing happens it gives a final chirp before going to sleep. I did this because I got tired of having to press on the joystick each time I wanted to upload a new program.


The time between the two initial "chirps" is the time that it takes to calibrate the oscillator. So I loaded a version of the Bootloader with the ADD TMP,TWO included into a Butterfly and another without it into another and compared results. The one with it missing was slightly slower, but without the audible clues, no one would ever notice. So on a human scale there's not much difference.


By safely removing the ADD TMP, TWO line we can save another program step:


Code:

OSC_TEST: DEC TMP           ;SET-UP TIMERS
          STS  OSCCAL,TMP
          OUT  TIFR1,FF
          OUT  TIFR2,FF
W6103:    SBIS TIFR2,TOV2   ;WAIT
           RJMP W6103
          lds  XL,TCNT1L    ;READ TIMER
          lds  XH,TCNT1H
          sts  TCNT1H,ZERO
          sts  TCNT1L,ZERO

          SBIC TIFR1,TOV1   ;CHECK TOO FAST?
           RJMP OSC_TEST
          cpi  XL,Byte1(Upper_Limmit)
          ldi  tmp1,Byte2(Upper_Limmit)
          CPC  XH,tmp1      ;CHECK TOO FAST?
           BRSH  OSC_TEST
                                     ;<===== ADD TMP,TWO REMOVED!
          cpi  XL,Byte1(Lower_Limmit)
          ldi  tmp1,Byte2(Lower_Limmit)
          cpc  XH,tmp1      ;CHECK TOO SLOW?
           BRLO  OSC_TEST
            RET             ;OSCCAL IS FINE


So the logic of our routine has changed: instead of incrementing or decrementing based on whether the oscillator is fast or slow, it now knows only that the oscillator is out, and decrements the OSCCAL register. If we happen to be moving in the "wrong" direction, no problem: once it hits ZERO it will wrap to 255 and start working downward towards the correct setting. In fact the high bit is not used, so it effectively "wraps" at 127 and the entire process is very fast.


The fact that it takes a tiny bit longer is actually a bonus, because when first started, it's best to let the oscillator "stabilize", and the more time that passes, the better. So not only have we removed a program step, we've actually improved on the "standard" routine.
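
To get a feel for how many passes the one-direction search can take, here is a small Python model. It is a sketch under stated assumptions: `TARGET` is a made-up OSCCAL value that happens to put the count in range, `in_range` stands in for the whole timer measurement, and the 7-bit mask follows the remark above that the high bit is not used.

```python
# One-direction (decrement-only) search with wrap-around (a sketch).
TARGET = 70                        # hypothetical OSCCAL value that is in range

def in_range(osccal):
    """Stand-in for the timer test: only TARGET passes."""
    return osccal == TARGET

def search_down(start, bits=7):
    """Decrement until the test passes; return (value, number of writes)."""
    mask = (1 << bits) - 1
    value, writes = start & mask, 0
    while True:
        value = (value - 1) & mask   # DEC TMP, ignoring the unused high bit
        writes += 1                  # STS OSCCAL,TMP
        if in_range(value):
            return value, writes

for start in (80, 60, 70):
    print(start, search_down(start))
```

Starting just above the target takes a handful of passes; starting just below it takes nearly a full lap of 128, which is the "little longer" being traded for the saved instruction.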






TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Since the entire logic has changed, we should re-write the test part of the routine. Now that we're not concerned with whether we're running fast or slow, but only with whether we're outside the acceptable range, maybe there are some savings to be had.


Since the Upper and Lower Limits are both constants, perhaps we can calculate the difference at assembly time. Then we just read the Oscillator count, subtract the ideal count of 6103 and see if the difference is within that range.


At this point I really did not expect to see much savings since a compare is almost the same as a subtraction, so instead of testing the Oscillator reading against an Upper_Limit and a Lower_Limit, we're subtracting the Ideal_Limit and comparing the difference to the difference between the Upper_Limit and the Lower_Limit.


Both approaches would take about the same number of program steps: two 16-bit compares are going to cost the same as a 16-bit subtraction and a 16-bit compare. So I stopped here for a while.



TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

All that leaves us with is that first test for the Oscillator running too fast. I wonder if we could do something with these lines of code that check whether our timer has overflowed.

Code:

          SBIC TIFR1,TOV1
           RJMP OSC_TEST


I've seen OSCCAL routines with 6103 +/- 100uS but most seem to use +/- 50uS and I use a "tighter" +/- 40uS.


This means that our "correct" range is 80uS long. What are the chances that when the timer overflows that it will co-incidentally fall within this range and give us a "False Positive?"


Well based on pure randomness it would be:

Code:

Probability = Correct_Range/Total_Range x 100%
Probability = 80/65,536 x 100%
Probability = 0.12%

One in a thousand...Hmm, not too bad, however, the real probability is much, much lower than this.
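
The same estimate, worked in Python so the figures can be checked. This is only the arithmetic above restated: it assumes, as the text does, that an overflowed reading is equally likely to land anywhere in the 16-bit timer range.

```python
# Chance that a wrapped timer reading lands inside the accepted window
# (a sketch of the estimate above; assumes a uniformly random reading).
WINDOW = 80                  # width of the "correct" range, in counts
TIMER_STATES = 2 ** 16       # possible values of the 16-bit timer

probability = WINDOW / TIMER_STATES
print(f"{probability * 100:.2f}% (about 1 in {round(1 / probability)})")
```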

There's a small chance that the Oscillator is fast enough to be over our range by 10. There's an even smaller chance that it will be over by 100, and only a micro-chance that it will be out by 1000.


The probability that the oscillator could be so far out that the timer runs past 65,536, comes all the way back around to about 6100, AND still falls within my small range to give a false positive is slim-to-none. We can safely eliminate this test from the code and save ourselves two more program lines.

Code:

OSC_TEST: DEC TMP         ;SET-UP TIMERS
          STS  OSCCAL,TMP
          OUT  TIFR1,FF
          OUT  TIFR2,FF
W6103:    SBIS TIFR2,TOV2 ;WAIT
           RJMP W6103
          lds  XL,TCNT1L  ;READ TIMER
          lds  XH,TCNT1H
          sts  TCNT1H,ZERO
          sts  TCNT1L,ZERO
                             ;<=== TIMER OVERFLOW TEST REMOVED!
          cpi  XL,Byte1(Upper_Limmit)
          ldi  tmp1,Byte2(Upper_Limmit)
          CPC  XH,tmp1      ;OSCILLATOR OUT?
           BRSH  OSC_TEST

          cpi  XL,Byte1(Lower_Limmit)
          ldi  tmp1,Byte2(Lower_Limmit)
          cpc  XH,tmp1      ;OSCILLATOR OUT?
           BRLO  OSC_TEST
            RET           ;OSCCAL IS FINE

TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Then it dawned on me...


The reason that re-writing the comparison section wasn't a great idea was that a 16-bit subtraction and a 16-bit compare are essentially the same as two 16-bit compares.


However, from the calculations I just made I realize that my "correct range" is only 80 wide, and that will fit into a byte, so that translates into a 16x8-bit compare, not a 16x16. This may save us another line of code.


So the "new" concept was to subtract the lower limit (6103 - 40) from our reading and check whether the result falls within the width of our range:


Code:

TST_OSC: DEC TMP         ;SET-UP TIMERS
          STS  OSCCAL,TMP
          OUT  TIFR1,FF
          OUT  TIFR2,FF
W6103:    SBIS TIFR2,TOV2 ;WAIT
           RJMP W6103
          lds  XL,TCNT1L  ;READ TIMER
          lds  XH,TCNT1H
          sts  TCNT1H,ZERO
          sts  TCNT1L,ZERO
         
          SUBI  XL,LOW(6103-40)  ;CALC SAFE-RANGE
          SBCI  XH,HIGH(6103-40)
          CPI   XL,LOW(UP_LIMIT-LO_LIMIT)
          CPC   XH,ZERO          ;WITHIN RANGE?
           BRPL  TST_OSC     
            RET


The above code looks great: nice and small. Then I realized that if the clock reading is less than 6,103 - 40, I've got to deal with a "negative" number. I like to avoid "signed" integers whenever I can because they have a habit of coming back to bite-you-in-the-butt!
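
One way to look at the "negative" number concern: if the 16-bit result of the subtraction is read as unsigned, a reading below the lower limit wraps round to a very large value and fails the same "difference no bigger than the width" comparison. The Python model below is an off-target sketch of that interpretation, checked against every 16-bit reading; it is not a change to the routine above, and on the AVR it would correspond to an unsigned branch rather than BRPL.

```python
# Subtract-then-compare range check, read as unsigned 16-bit (a sketch).
LOWER = 6103 - 40
UPPER = 6103 + 40

def in_window(count):
    """True when LOWER <= count <= UPPER, using one unsigned comparison."""
    diff = (count - LOWER) & 0xFFFF     # 16-bit subtraction with wrap-around
    return diff <= UPPER - LOWER        # below LOWER wraps to a huge diff

disagreements = [c for c in range(0x10000)
                 if in_window(c) != (LOWER <= c <= UPPER)]
print("disagreements:", len(disagreements))
```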


TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

A SMALL DEVIATION: TOWARDS A SUPER-FAST CALIBRATION ROUTINE


This approach of checking the size of the Oscillator's "Error", rather than using the traditional method of just seeing whether it falls between an upper and lower boundary and incrementing or decrementing the OSCCAL register, might be useful to some.


If you wanted a super-fast calibration, you could measure the size of the error and adjust the OSCCAL register accordingly, rather than in small increments/decrements of one, which is the method most seem to use.
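
As a rough illustration of that idea, the Python sketch below turns a measured error into a multi-step correction. Everything numeric in it is an assumption made for the example: `counts_per_step` (how many timer counts one OSCCAL step is worth) would have to be measured on a real part, and a real oscillator is not perfectly linear, so a final fine-tuning pass would still be needed.

```python
# Proportional OSCCAL correction (a sketch; counts_per_step is hypothetical).

def proportional_step(count, ideal=6103, counts_per_step=30):
    """Signed number of OSCCAL steps to apply for a measured count."""
    error = count - ideal                    # positive when running fast
    return -round(error / counts_per_step)   # fast -> negative correction

for measured in (6103, 6403, 5803):
    print(measured, proportional_step(measured))
```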


Since this particular routine is done shortly after power-up (or reset) the longer it takes to adjust the OSCCAL register, the better, because it gives the oscillator more time to "settle."


Okay, I think I've been a "deviant" long enough, time to get back to the main topic of this thread...the shrinking of the OSCCAL Routine.


TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

I went back and looked at the actual typical values of the Upper and Lower Limits.


Typical Lower-Limit:

Code:

LOWER = IDEALTIME - 50uS
LOWER = 6103 - 50
LOWER = 6053 = $17:A5

Typical Upper-Limit:
Code:

UPPER = IDEAL + 50uS
UPPER = 6103 + 50
UPPER = 6153 = $18:09


Notice that the high bytes are only out by one, and that if I reduced the upper value by 10, the high byte would drop to $17 and both high bytes would be the same:


My Lower-Limit:

Code:

LOWER = IDEALTIME - 40uS
LOWER = 6103 - 40
LOWER = 6063 = $17:AF

My Upper-Limit:
Code:

UPPER = IDEAL + 40uS
UPPER = 6103 + 40
UPPER = 6143 = $17:FF


This means that if we use +/- 40uS instead of +/- 50uS we can simplify part of our routine by simply checking if the high byte equals $17. You'll learn in the next post why it is important that I use exactly +/- 40uS.
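
The decimal-to-hex splits are easy to get wrong by hand, so here is a short Python check of the limits for both tolerances. It is only a convenience sketch; `split` is an illustrative helper that returns the high and low bytes of a 16-bit count.

```python
# High/low byte split of the calibration limits (a sketch).
IDEAL = 6103

def split(count):
    """Return (high byte, low byte) of a 16-bit count."""
    return count >> 8, count & 0xFF

for tolerance in (50, 40):
    for limit in (IDEAL - tolerance, IDEAL + tolerance):
        high, low = split(limit)
        print(f"+/-{tolerance}: {limit} = ${high:02X}:{low:02X}")
```

With +/- 40 both limits share the high byte $17, and the upper limit's low byte is exactly $FF.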

Code:

TSTOSC: DEC TMP         ;SET-UP TIMERS
        STS  OSCCAL,TMP
        OUT  TIFR1,FF
        OUT  TIFR2,FF
W6103:  SBIS TIFR2,TOV2 ;WAIT
         RJMP W6103
        LDS  XL,TCNT1L  ;READ TIMER
        LDS  XH,TCNT1H
        STS  TCNT1H,ZERO
        STS  TCNT1L,ZERO
        CPI  XH,23  ;<===== CHECK IF HIGH BYTE IS $17
         BRNE TSTOSC
        (INCOMPLETE AS YET)
          RET



TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

Now that we've "taken-care-of" the high byte, all we need to do is check whether the lower byte is above or below our selected range. That amounts to just two single-byte compares.


However, if we can "set-things-up" so that our "range" begins or ends at ZERO or 255, then we need only check in "one direction." That means a single compare statement instead of two.


You might have missed it first-time-around so look again at the Lower-Byte value of the Upper-Limit in hex notation, it's $FF:


My Upper-Limit:

Code:

UPPER = IDEAL + 40uS
UPPER = 6103 + 40
UPPER = 6143 = $17:FF


I chose to use +40uS so that the test for our lower-byte would fall on the byte's upper "boundary" at $FF=255. This means that all we need to do is compare our lower-byte against our lower-boundary value, and if we fall below it then our oscillator is out-of-range.


Let me expand on this "trick" in case some readers don't get it, because it is a little hard to follow...


Remember that earlier I said single unsigned bytes are actually like little circles and not little rulers. Suppose we compare our timer's low byte against our range, and the range just happens to end at $FF=255. If the reading goes over 255, the low byte "wraps around" to ZERO, so now it is actually UNDER the lower boundary, and that counts as a failure. (The carry also bumps the high byte to $18, so the high-byte test rejects such a reading as well.)


Also, if we compare against our range and we actually do fall under, then that's a failure too. So we've magically combined two tests, one for over and another for under, into a single test for being under, because the over will "wrap" to being under.
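
The combined test can be checked exhaustively off-target. The Python sketch below applies the two byte comparisons (high byte equal to $17, low byte not below $AF) to every possible 16-bit reading and lists which readings pass; `passes` is an illustrative name, and the comments note the AVR instruction each line is meant to mirror.

```python
# Exhaustive check of the two-compare acceptance test (a sketch).

def passes(count):
    """Mirror of the byte tests: accept only $17:AF through $17:FF."""
    high, low = count >> 8, count & 0xFF
    if high != 0x17:        # CPI XH,23 / BRNE: wrong high byte, try again
        return False
    if low < 0xAF:          # CPI XL,175 / BRLO: low byte under the range
        return False
    return True

accepted = [c for c in range(0x10000) if passes(c)]
print(accepted[0], accepted[-1], len(accepted))
```

The accepted readings run from 6063 to 6143 inclusive, which is exactly 6103 +/- 40.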


So our final program looks like this:

Code:

TSTOSC: DEC TMP         ;SET-UP TIMERS
        STS  OSCCAL,TMP
        OUT  TIFR1,FF
        OUT  TIFR2,FF
W6103:  SBIS TIFR2,TOV2 ;WAIT
         RJMP W6103
        LDS  XL,TCNT1L  ;READ TIMER
        LDS  XH,TCNT1H
        STS  TCNT1H,ZERO
        STS  TCNT1L,ZERO
        CPI  XH,23  ;<===== CHECK IF HIGH BYTE IS $17
         BRNE TSTOSC
        CPI XL,175 ;<===== CHECK IF UNDER $AF
         BRLO TSTOSC ;<=== ALSO SNEAKY TEST IF OVER $FF
          RET


TITLE: ADVENTURES IN SHRINKING THE OSCCAL ROUTINE

So after we add a few lines of set-up the final routine looks like the below, which should be fairly close to the routine I posted in Giorgos' thread:


Code:

        LDS TMP,OSCCAL
TSTOSC: DEC TMP         ;SET-UP TIMERS
        STS  OSCCAL,TMP
        OUT  TIFR1,FF
        OUT  TIFR2,FF
        STS TCNT1H,ZERO
        STS TCNT1L,ZERO
        STS TCCR1B,ONE
        STS TCNT2,ZERO
W6103:  SBIS TIFR2,1    ;WAIT
         RJMP W6103
        LDS  XL,TCNT1L  ;READ TIMER
        LDS  XH,TCNT1H
        CPI  XH,23  ;<=== HIGH BYTE = $17?
         BRNE TSTOSC
        CPI XL,175  ;<=== LOW BETWEEN $AF-$FF?
         BRLO TSTOSC
          RET

IN CONCLUSION:


Well, I hope I've answered all your questions about my condensed OSCCAL Routine: how it works, why it works and how it got to its present form.


I certainly hope you found the trip entertaining as well as informative.


I've used this routine now for hundreds of uploads to Butterflies and have not experienced a single problem. Hundreds of my Bootloaders that use this OSCCAL Routine have been downloaded, and I have yet to receive any reports of problems.



CONTINUING EDUCATION:


To learn more about the AVR Butterfly in general, you can visit the Butterfly & Beginner's Web Site at:
or you can visit the Butterfly & Beginners Forum at: or the AVR Assembler Site at:


REQUEST FOR FEEDBACK:


If you found this tutorial discussion interesting and/or entertaining and would like to see more like it, please let the moderator(s) know.


Thank you for your time and consideration.

Have a wonderful day!



