How to prevent OT 8 from generating an LJMP

If Optimize 8 (Reuse Common Entry Code) detects void MyFunc() as the last statement within void OuterFunc() (for example), it will produce an LJMP to MyFunc() rather than an LCALL. (see examples)

Nice optimization, but MyFunc() is part of a library written in assembly and by design requires a call rather than a jump. I tried surrounding its prototype with a #pragma OT (7), but that doesn't work. Surrounding OuterFunc() with the OT (7) pragma works, but that's a bit cumbersome and runs the risk of someone forgetting to do it.

void OuterFunc( void )
{
   MyFunc();   // Compiler generates LJMP.  Not good :o(
}


#pragma OT (7)
void OuterFunc( void )
{
   MyFunc();   // Compiler now generates the necessary
               // LCALL, but programmers WILL forget to use
               // this construct and will blow the whistle
               // when their software bombs.
}
#pragma OT (8)

  • Why does the library routine require a call rather than a jump? The RET instruction handles both LCALL and ACALL cases.

    ; optimized
    CALL OuterFunc
    LJMP MyFunc
        RET   ; returns to point of call of OuterFunc
    
    
    ; unoptimized
    CALL OuterFunc
    CALL MyFunc
        RET ; returns to following statement
    RET ; returns to point of call of OuterFunc
    

    The behavior of both routines is the same. The unoptimized version just pushes two addresses on the stack and does a back-to-back return.

    Normally, it would make no difference to the called code whether this optimization is performed. So are you doing some special stack hacking or something that makes the library routine dependent on the call sequence?

  • Why does the library routine require a call rather than a jump?
    ...
    So are you doing some special stack hacking or something that makes the library routine dependent on the call sequence?


    Yes, but I chose not to mention that as I was fairly certain possible solutions would get lost in debate over technique.

    Suffice it to say the function "must" be called.

  • As long as you're sure of what you want, it's fine with me.

    I don't see a way to control particular optimizations for particular symbols (or even globally, without shutting off entire levels above).

    Unfortunately, #pragma is a preprocessor directive, and #define macros can't generate more preprocessor stuff. So you can't help client routines out by wrapping the pragma yourself:

    #define MyCall() \
        #pragma save \
        #pragma ot(7) \
        MyCall(); \
        #pragma restore
    

    because you can't generate the #pragma's.

    A cheesy workaround would be
    MyCall.i:
    #pragma save
    #pragma ot(7)
    MyCall();
    #pragma restore

    OuterFunc.c:
    void OuterFunc(void)
        {
        // instead of MyCall()
    #include "MyCall.i"
        }
    

    Hideous, eh?

    Hopefully someone can top these suggestions.

  • For anyone interested, I found a rather simple solution to this problem. One only needs to make the compiler believe there is more code to execute after the function in question, and this can be done by wrapping it in a macro:

    // API Prototypes
    
    #define  JumpIndirect( bparm, handler )  { JmpInd( bparm, handler ); global_var = 0; }
    
    void JmpInd( byte b, code* handler );
    
    
    
    // When in use
    
    void IntHandler( byte source_vector )
    {
        if( IntNotForMe() )
            JumpIndirect( source_vector, old_handler );  // Chain to the next handler.
    }
    

    Within the macro, "global_var" is just a dummy variable (or any other that can be safely borrowed for the purpose) and the assignment to 0 creates very little overhead.

    Now, obviously the macro isn't absolutely necessary...you could just write both instructions inline or use a function pointer and an indirect call, but function pointers are too ugly and difficult to use. To a user, the macro is more of a WYSIWYG solution.

  • make the compiler believe there is more code

    But that's not what your hack does --- there's no make-believe involved: there is more code after the sub-function call. The macro you wrap that in is just window-dressing.

    But even so, I remain less than convinced that a function like your JmpInd

    a) should fail if it's jumped to instead of called --- the only difference is in stack contents it has no business meddling with, even given the job it wants to do

    b) is any more elegant or easier to use than a function pointer

  • "But that's not what your hack does --- there's no make-believe involved: there is more code after the sub-function call. The macro you wrap that in is just window-dressing."

    The above sort of nit-picking reminds me why I don't visit this board very often. You can ALWAYS count on this sort of reply here...ALWAYS.


    "...the only difference is in stack contents it has no business meddling with, even given the job it wants to do"

    Who says? If the stack contents and the associated SP SFR are so off-limits and taboo, then why have both been accessible since the inception of the 8051 and, for that matter, still accessible to C51 itself?


    b) is any more elegant or easier to use than a function pointer

    Quote from Keil C51 User's Guide 09.2001, pg 379:

    "Function pointers are one of the most difficult aspects of C to understand and to properly utilize. Most problems involving function pointers are caused by improper declaration of the function pointer, improper assignment, and improper dereferencing."

  • Hi Robert,

    You say "MyFunc() is part of a library written in assembly and by design requires a call rather than a jump". I don't think it matters, in the example you quoted, whether the compiler generates an LJMP rather than an ACALL/LCALL. To illustrate this, consider the following code snippet:

    void main(void)
    {
        OuterFunc();   /* your OuterFunc that then uses MyFunc */
        Function();
        AnotherFunction();
        MyFunc();      /* this should generate ACALL/LCALL */
    
        for (;;);      /* loop forever */
    }
    
    void OuterFunc(void)
    {
       MyFunc();   // You say - Compiler generates LJMP.  Not good :o(
    }

    Within main(), if the compiler generates an LCALL to OuterFunc() and, within OuterFunc(), an LJMP to MyFunc(), then when your asm subroutine finishes and executes its RET instruction, it will return to main(), ready to do Function(), rather than to OuterFunc(). It has also saved an instruction.

    What I am trying to illustrate is that even though MyFunc() doesn't immediately look like it is being CALLed, it is. The compiler has seen fit to optimise out the nested CALL so that it doesn't need to put a RET in OuterFunc(), and the RET in MyFunc() will return to the correct address, right after wherever OuterFunc() was CALLed from!

    Hope this makes sense. I would say you only need to worry about how the compiler treats MyFunc() if it tries to inline the code.

    Mark.

  • Thanks for the reply Mark.

    I agree that in any normal case, the compiler determining that an LJMP will do is just fine, as the final return will be resolved correctly.

    In my case, however, MyFunc() never returns...as it is in fact an indirect jump function for chaining interrupt handlers together. As stated earlier, I was hesitant to post this at first for fear the topic would stray off course.

    Anyway, the actual indirect jump is coded as such (with the target jump address passed in R6R7 in this example)...

    _IndirectJump:
        dec    SP          ; Clean up the stack...
        dec    SP          ; ...we won't be returning.
        mov    dph, r6
        mov    dpl, r7
        clr    a
        jmp    @a+dptr     ; jump to the next vector.
    

    As you can see, the function isn't rocket science and when called, it serves its purpose just fine. When the compiler decides an LJMP is adequate, however, it misaligns the stack.

    The context in which the function is used is beyond the scope of the discussion, so I limited the discussion to the code the compiler generated and its effects on the function. As stated a couple of posts back, however, wrapping it into a macro that includes a trailing benign statement forces the compiler to always generate an LCALL. This is the method I'm going to use and it's working just fine.

    Thanks for the input just the same.

  • In my case, however, the MyFunc() never returns...

    Which would mean my earlier statements hit the nail exactly on the head, you just refused to accept it: Such a function simply is not fit to be called from C, period. C functions return to their caller (possibly by a longjmp() skipping several stack frames upward), or terminate the entire program.

    But actually, this is not the case, because you're contradicting yourself: if this function never returns, then the stack cannot possibly be misaligned. It's never going to be referred to again, so whatever's on it doesn't matter.

    For the stack contents to matter at all, the function you ljmp @a+dptr to has to return. But at that point, there's exactly no difference between zero return addresses on the stack, and one return address on the stack that points directly at a "ret" instruction: either you directly return to never-never-land, or you return to a "ret" instruction, which in turn returns to never-never-land.

    Summing it all up: you're trying to solve a problem that doesn't exist.

  • Thank you for your insightful wisdom Hans.

    I wish you had simply told me right off that this problem didn't really exist, then I could have avoided all of this fuss.

    ( sorry troll boy, I'm not taking the bait )

  • Hans-Bernhard,

    For the stack contents to matter at all, the function you ljmp @a+dptr to has to return.

    That's patently untrue, and I think Robert did a pretty good job of showing why. The "function" he calls doesn't return... ever... period, so he is absolutely correct in saying so.

    From a pedantic point of view, it's not even a "function" at all at that point. He's kludged it into some inline assembly. The compiler just thinks it's going to be a function call. The stack contents matter because when the NEXT function in the chain "returns" it will actually leapfrog over this intermediate chaining function to whatever called it.

  • The stack contents matter because when the NEXT function in the chain "returns" it will actually leapfrog over this intermediate chaining function to whatever called it.

    Ding ding ding ding ding. That's right, and while I'm sure he's well aware of this, he'd rather...

    http://redwing.hutman.net/~mreed/warriorshtm/nitpick.htm

  • That's patently untrue, and I think Robert did a pretty good job of showing why. The "function" he calls doesn't return... ever... period, so he is absolutely correct in saying so.

    If you could be bothered to actually read what I wrote before firing replies from the hip, you might have noticed that the function he calls (i.e. "_IndirectJump") is not the one I was talking about in the snippet you replied to:

    For the stack contents to matter at all, the function you ljmp @a+dptr to has to return.

    When a CALLed subroutine LJMPs to some other one, and that one returns, that's effectively the same thing as the original called function returning. This idea can be applied recursively as many times as you like.

    Indeed that's exactly the idea the compiler's CALL+RET --> LJMP optimization itself is exploiting.

    And, at the risk of sounding repetitive: this is not what is causing the actual problem.

  • When a CALLed subroutine LJMPs to some other one, and that one returns, that's effectively the same thing as the original called function returning. This idea can be applied recursively as many times as you like.

    Indeed that's exactly the idea the compiler's CALL+RET --> LJMP optimization itself is exploiting.


    I'm not sure where all this confusion is coming in, but your statement is exactly what Robert is trying to overcome. The library function he's talking about is designed to assume that it has been CALLed, because it modifies the stack. So.. when a situation arises where the compiler decides to make this optimization, the library function destroys the stack.

    What Robert wanted to do (and I think succeeded about 10 messages ago in doing) is force the compiler not to perform this optimization in the case of this library routine.

    How we got off on some ridiculous tangent like he feared, I'm not certain, but I am quite sure that a programmer as competent as you understands what he's trying to do, so I don't know where the whole debate is coming from.

    I understand your main point: If you write the library function not to care whether it's called or ljmp'd to in the first place, then this problem is avoided, but I can think of plenty of reasons why he might not, at this point in time, be able to modify the assembly routine. I'm sure that's the case, and so debates over whether the original author of the assembly routine did a horrible job or not become moot.