2009-10-26 13:17:35

Linux interrupt handling

Linux interrupt processing is split into two parts: the top half and the bottom half.

Top half: also referred to as the interrupt handler. It runs immediately upon receipt of the interrupt and performs only the time-critical work, such as reading/writing registers, acknowledging receipt of the interrupt, or resetting the hardware. At a minimum, it runs with the current interrupt line disabled; it may run with all local interrupts disabled if SA_INTERRUPT is set. Interrupt handlers are given their own stack, referred to as the interrupt stack: one stack per processor, one page in size.

Bottom half: executes work that can be performed later, at a more convenient time. The key difference from the top half is that it runs with all interrupts enabled.

When executing an interrupt handler or a bottom half, the kernel is in interrupt context. Since interrupt context is not associated with a process and has no backing process, it cannot sleep. Therefore, functions that may sleep cannot be used in interrupt context.
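
As a minimal sketch (hypothetical device and helper names, 2.6.26-era request_irq() interface), a top half only acknowledges the hardware and defers everything else:

#include <linux/interrupt.h>

struct mydev;                                   /* hypothetical device */
extern void mydev_ack_irq(struct mydev *dev);   /* assumed hardware helper */

static irqreturn_t mydev_isr(int irq, void *dev_id)
{
    struct mydev *dev = dev_id;

    /* Top half: only the time-critical part - acknowledge the chip. */
    mydev_ack_irq(dev);

    /* The heavy lifting is deferred to a bottom half (softirq, tasklet
     * or work queue - see the sections below). */
    return IRQ_HANDLED;
}

/* Registration, e.g. in the driver's probe/open path:
 *     request_irq(dev_irq, mydev_isr, IRQF_SHARED, "mydev", dev);
 */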


1st part:  Softirqs

1. Statically created at compile-time.

2. Can run simultaneously on any processor, even two of the same type.

3. Rarely used. Reserved for the most timing-critical and important bottom-half processing on the system.

4. A softirq never preempts another softirq; the only event that can preempt a softirq is an interrupt handler. However, another softirq - even the same one - can run on another processor.

5. Raising a softirq: usually, an interrupt handler marks its softirq for execution before returning, and the softirq then runs at a suitable time. That is, the interrupt handler performs the basic hardware-related work, raises the softirq, and exits. When processing interrupts, the kernel invokes do_softirq(); the softirq then runs and picks up where the interrupt handler left off.

6. A softirq handler runs with interrupts enabled and CANNOT sleep. While a handler runs, softirqs on the current processor are disabled; however, another processor can execute other softirqs, even the same one. Thus, any shared data - even global data used only within the softirq handler itself - needs proper locking. This is the main reason why tasklets are usually preferred. Consequently, most softirq handlers resort to per-processor data (data unique to each processor and thus not requiring locking) or some other tricks to avoid explicit locking and provide excellent scalability (see the per-CPU sketch below).
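
A minimal sketch of the per-processor-data trick mentioned in point 6 (hypothetical names; my_softirq_events and my_softirq_action are not in the kernel):

#include <linux/interrupt.h>
#include <linux/percpu.h>

/* One counter per CPU. No lock is needed: a softirq handler cannot race
 * with itself on the local CPU (softirqs are disabled there while it runs),
 * and every other CPU only ever touches its own copy. */
static DEFINE_PER_CPU(unsigned long, my_softirq_events);

static void my_softirq_action(struct softirq_action *h)
{
    __get_cpu_var(my_softirq_events)++;
    /* ... process this CPU's private queue ... */
}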


/*********************** sample code **************************/

softirq_action structure:

struct softirq_action
{
    void    (*action)(struct softirq_action *);
    void    *data;
};

A 32-entry array of softirq_action structures is declared in kernel/softirq.c:

static struct softirq_action softirq_vec[32] __cacheline_aligned_in_smp;

Currently defined softirq indexes (include/linux/interrupt.h):

/* PLEASE, avoid to allocate new softirqs, if you need not _really_ high
   frequency threaded job scheduling. For almost all the purposes
   tasklets are more than enough. F.e. all serial device BHs et
   al. should be converted to tasklets, not to softirqs.
 */

enum
{
    HI_SOFTIRQ=0,
    TIMER_SOFTIRQ,
    NET_TX_SOFTIRQ,
    NET_RX_SOFTIRQ,
    BLOCK_SOFTIRQ,
    TASKLET_SOFTIRQ,
    SCHED_SOFTIRQ,
#ifdef CONFIG_HIGH_RES_TIMERS
    HRTIMER_SOFTIRQ,
#endif
    RCU_SOFTIRQ,     /* Preferable RCU should always be the last softirq */
};

Raising a softirq:

void raise_softirq(unsigned int nr)
{
    unsigned long flags;

    local_irq_save(flags);
    raise_softirq_irqoff(nr);
    local_irq_restore(flags);
}
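
For reference, raise_softirq_irqoff() (also in kernel/softirq.c) sets the pending bit and, when not called from interrupt context, wakes ksoftirqd; in 2.6.26 it is essentially:

/* Must be called with local interrupts disabled. */
inline void raise_softirq_irqoff(unsigned int nr)
{
    __raise_softirq_irqoff(nr);    /* set bit nr in the per-CPU pending mask */

    /*
     * If we are already in an interrupt or a softirq, the pending softirq
     * will run when that context returns; otherwise wake ksoftirqd so it
     * gets scheduled soon.
     */
    if (!in_interrupt())
        wakeup_softirqd();
}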

Registering a softirq handler:

void open_softirq(int nr, void (*action)(struct softirq_action*), void *data)
{
    softirq_vec[nr].data = data;
    softirq_vec[nr].action = action;
}
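
As a usage example, the network subsystem registers its transmit and receive handlers during net_dev_init() (net/core/dev.c, 2.6.26) roughly like this:

open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL);
open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL);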

Softirq execution occurs in kernel/softirq.c:

The following functions also handle softirqs that re-raise themselves: __do_softirq() bounds the number of restart passes and, to balance latency against fairness, falls back to waking the ksoftirqd thread.

asmlinkage void do_softirq(void)
{
    __u32 pending;
    unsigned long flags;

    if (in_interrupt())
        return;

    local_irq_save(flags);

    pending = local_softirq_pending();

    if (pending)
        __do_softirq();

    local_irq_restore(flags);
}

/*
 * We restart softirq processing MAX_SOFTIRQ_RESTART times,
 * and we fall back to softirqd after that.
 *
 * This number has been established via experimentation.
 * The two things to balance is latency against fairness -
 * we want to handle softirqs as soon as possible, but they
 * should not be able to lock up the box.
 */
#define MAX_SOFTIRQ_RESTART 10

asmlinkage void __do_softirq(void)
{
    struct softirq_action *h;
    __u32 pending;
    int max_restart = MAX_SOFTIRQ_RESTART;
    int cpu;

    pending = local_softirq_pending();
    account_system_vtime(current);

    __local_bh_disable((unsigned long)__builtin_return_address(0));
    trace_softirq_enter();

    cpu = smp_processor_id();
restart:
    /* Reset the pending bitmask before enabling irqs */
    set_softirq_pending(0);

    local_irq_enable();

    h = softirq_vec;

    do {
        if (pending & 1) {
            h->action(h);
            rcu_bh_qsctr_inc(cpu);
        }
        h++;
        pending >>= 1;
    } while (pending);

    local_irq_disable();

    pending = local_softirq_pending();
    if (pending && --max_restart)
        goto restart;

    if (pending)
        wakeup_softirqd();

    trace_softirq_exit();

    account_system_vtime(current);
    _local_bh_enable();
}

/*
 * we cannot loop indefinitely here to avoid userspace starvation,
 * but we also don't want to introduce a worst case 1/HZ latency
 * to the pending events, so lets the scheduler to balance
 * the softirq load for us.
 */
static inline void wakeup_softirqd(void)
{
    /* Interrupts are disabled: no need to stop preemption */
    struct task_struct *tsk = __get_cpu_var(ksoftirqd);

    if (tsk && tsk->state != TASK_RUNNING)
        wake_up_process(tsk);
}

Pending softirqs are checked for and executed in the following places:
A. In the return from hardware interrupt code.
arch/arm/kernel/irq.c (linux-2.6.26):
/*
 * do_IRQ handles all hardware IRQ's.  Decoded IRQs should not
 * come via this function.  Instead, they should provide their
 * own 'handler'
 */
asmlinkage void __exception asm_do_IRQ(unsigned int irq, struct pt_regs *regs)
{
    struct pt_regs *old_regs = set_irq_regs(regs);
    struct irq_desc *desc = irq_desc + irq;

    /*
     * Some hardware gives randomly wrong interrupts.  Rather
     * than crashing, do something sensible.
     */
    if (irq >= NR_IRQS)
        desc = &bad_irq_desc;

    irq_enter();

    desc_handle_irq(irq, desc);

    /* AT91 specific workaround */
    irq_finish(irq);

    irq_exit();
    set_irq_regs(old_regs);
}

kernel/softirq.c (linux-2.6.26):
/*
 * Exit an interrupt context. Process softirqs if needed and possible:
 */
void irq_exit(void)
{
    account_system_vtime(current);
    trace_hardirq_exit();
    sub_preempt_count(IRQ_EXIT_OFFSET);
    if (!in_interrupt() && local_softirq_pending())
        invoke_softirq();

#ifdef CONFIG_NO_HZ
    /* Make sure that timer wheel updates are propagated */
    if (!in_interrupt() && idle_cpu(smp_processor_id()) && !need_resched())
        tick_nohz_stop_sched_tick();
    rcu_irq_exit();
#endif
    preempt_enable_no_resched();    //sub preempt count by 1 if CONFIG_PREEMPT set
}
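
invoke_softirq() is a macro whose expansion depends on whether the architecture returns from hardware interrupts with IRQs already disabled; in 2.6.26 (kernel/softirq.c) it is defined roughly as:

#ifdef __ARCH_IRQ_EXIT_IRQS_DISABLED
# define invoke_softirq()    __do_softirq()
#else
# define invoke_softirq()    do_softirq()
#endif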

B. In the ksoftirqd kernel thread.
static int ksoftirqd(void * __bind_cpu)
{
    set_current_state(TASK_INTERRUPTIBLE);

    while (!kthread_should_stop()) {
        preempt_disable();
        if (!local_softirq_pending()) {
            preempt_enable_no_resched();
            schedule();
            preempt_disable();
        }

        __set_current_state(TASK_RUNNING);

        while (local_softirq_pending()) {
            /* Preempt disable stops cpu going offline.
               If already offline, we'll be on wrong CPU:
               don't process */
            if (cpu_is_offline((long)__bind_cpu))
                goto wait_to_die;
            do_softirq();
            preempt_enable_no_resched();
            cond_resched();
            preempt_disable();
        }
        preempt_enable();
        set_current_state(TASK_INTERRUPTIBLE);
    }
    __set_current_state(TASK_RUNNING);
    return 0;

wait_to_die:
    preempt_enable();
    /* Wait for kthread_stop */
    set_current_state(TASK_INTERRUPTIBLE);
    while (!kthread_should_stop()) {
        schedule();
        set_current_state(TASK_INTERRUPTIBLE);
    }
    __set_current_state(TASK_RUNNING);
    return 0;
}

C. In any code that explicitly checks for and executes pending softirqs, such as the networking subsystem:
int netif_rx_ni(struct sk_buff *skb)
{
    int err;

    preempt_disable();
    err = netif_rx(skb);
    if (local_softirq_pending())
        do_softirq();
    preempt_enable();

    return err;
}

D. In the local_bh_enable() function.

kernel/softirq.c (linux-2.6.26):

void local_bh_enable(void)
{
#ifdef CONFIG_TRACE_IRQFLAGS
    unsigned long flags;

    WARN_ON_ONCE(in_irq());
#endif
    WARN_ON_ONCE(irqs_disabled());

#ifdef CONFIG_TRACE_IRQFLAGS
    local_irq_save(flags);
#endif
    /*
     * Are softirqs going to be turned on now:
     */
    if (softirq_count() == SOFTIRQ_OFFSET)
        trace_softirqs_on((unsigned long)__builtin_return_address(0));
    /*
     * Keep preemption disabled until we are done with
     * softirq processing:
      */
     sub_preempt_count(SOFTIRQ_OFFSET - 1);

    if (unlikely(!in_interrupt() && local_softirq_pending()))
        do_softirq();

    dec_preempt_count();
#ifdef CONFIG_TRACE_IRQFLAGS
    local_irq_restore(flags);
#endif
    preempt_check_resched();
}

2nd part: Tasklets

1. Built on top of two softirqs: HI_SOFTIRQ and TASKLET_SOFTIRQ.

2. Can be dynamically created and destroyed.

3. Two different tasklets can run concurrently on different processors, but two tasklets of the same type cannot run simultaneously.

4. As with softirqs, tasklets CANNOT sleep. Thus, semaphores or other blocking functions cannot be used in a tasklet.

5. Tasklets run with all interrupts enabled, so precautions are needed if a tasklet shares data with an interrupt handler. Disabling interrupts and obtaining a lock is one solution.

6. Two different tasklets can run at the same time on two different CPUs, so proper locking is needed if a tasklet shares data with another tasklet or a softirq. A minimal usage sketch follows.
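
A minimal driver-side sketch (hypothetical names such as mydev_do_tasklet; not from the kernel sources): the top half only acknowledges the hardware and schedules the tasklet, which then does the heavier, non-sleeping work:

#include <linux/interrupt.h>

static void mydev_do_tasklet(unsigned long data)
{
    /* Bottom half: runs with all interrupts enabled, must not sleep. */
    /* ... drain the receive ring, hand data up the stack, etc. ... */
}

/* Statically declared, enabled tasklet (count == 0). */
static DECLARE_TASKLET(mydev_tasklet, mydev_do_tasklet, 0);

static irqreturn_t mydev_isr(int irq, void *dev_id)
{
    /* ... acknowledge the hardware (time-critical part only) ... */
    tasklet_schedule(&mydev_tasklet);    /* mark it pending; it runs soon */
    return IRQ_HANDLED;
}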

/*********************** sample code **************************/

tasklet structure definition:

include/linux/interrupt.h (linux-2.6.26):

struct tasklet_struct
{
    struct tasklet_struct *next;
    unsigned long state;
    atomic_t count;
    void (*func)(unsigned long);
    unsigned long data;
};

state definition:

include/linux/interrupt.h (linux-2.6.26):

enum
{
    TASKLET_STATE_SCHED,    /* Tasklet is scheduled for execution */
    TASKLET_STATE_RUN    /* Tasklet is running (SMP only) */
};

#ifdef CONFIG_SMP
static inline int tasklet_trylock(struct tasklet_struct *t)
{
    return !test_and_set_bit(TASKLET_STATE_RUN, &(t)->state);
}

static inline void tasklet_unlock(struct tasklet_struct *t)
{
    smp_mb__before_clear_bit();
    clear_bit(TASKLET_STATE_RUN, &(t)->state);
}

static inline void tasklet_unlock_wait(struct tasklet_struct *t)
{
    while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); }
}
#else
#define tasklet_trylock(t) 1
#define tasklet_unlock_wait(t) do { } while (0)
#define tasklet_unlock(t) do { } while (0)
#endif

count: must be zero for the tasklet to be enabled and allowed to run.

Declaration of the per-CPU tasklet_vec and tasklet_hi_vec lists:
struct tasklet_head
{
    struct tasklet_struct *head;
    struct tasklet_struct **tail;
};

/* Some compilers disobey section attribute on statics when not
   initialized -- RR */
static DEFINE_PER_CPU(struct tasklet_head, tasklet_vec) = { NULL };
static DEFINE_PER_CPU(struct tasklet_head, tasklet_hi_vec) = { NULL };

softirq_init() initializes tasklet_vec and tasklet_hi_vec and registers the related handlers tasklet_action and tasklet_hi_action:

void __init softirq_init(void)
{
    int cpu;

    for_each_possible_cpu(cpu) {
        per_cpu(tasklet_vec, cpu).tail =
            &per_cpu(tasklet_vec, cpu).head;
        per_cpu(tasklet_hi_vec, cpu).tail =
            &per_cpu(tasklet_hi_vec, cpu).head;
    }

    open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
    open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
}


The softirq handler for tasklets. From this code we can see that the tasklet handler t->func(t->data) runs only when t->count is zero and the TASKLET_STATE_RUN bit is not already set (i.e. tasklet_trylock() succeeds):

kernel/softirq.c (linux-2.6.26):

static void tasklet_action(struct softirq_action *a)
{
    struct tasklet_struct *list;

    local_irq_disable();
    list = __get_cpu_var(tasklet_vec).head;
    __get_cpu_var(tasklet_vec).head = NULL;
    __get_cpu_var(tasklet_vec).tail = &__get_cpu_var(tasklet_vec).head;
    local_irq_enable();

    while (list) {
        struct tasklet_struct *t = list;

        list = list->next;

        if (tasklet_trylock(t)) {
            if (!atomic_read(&t->count)) {
                if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
                    BUG();
                t->func(t->data);
                tasklet_unlock(t);
                continue;
            }
            tasklet_unlock(t);
        }

        local_irq_disable();
        t->next = NULL;
        *__get_cpu_var(tasklet_vec).tail = t;
        __get_cpu_var(tasklet_vec).tail = &(t->next);
        __raise_softirq_irqoff(TASKLET_SOFTIRQ);
        local_irq_enable();
    }
}


Declaring a tasklet

Statically:

#define DECLARE_TASKLET(name, func, data) \
struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(0), func, data }

#define DECLARE_TASKLET_DISABLED(name, func, data) \
struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(1), func, data }

Dynamically:

void tasklet_init(struct tasklet_struct *t,
          void (*func)(unsigned long), unsigned long data)
{
    t->next = NULL;
    t->state = 0;
    atomic_set(&t->count, 0);
    t->func = func;
    t->data = data;
}
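
Dynamic initialization is typically used when the tasklet is embedded in a per-device structure; a hedged sketch with hypothetical names:

#include <linux/interrupt.h>

struct mydev {
    struct tasklet_struct rx_tasklet;
    /* ... device state ... */
};

static void mydev_rx_tasklet(unsigned long data)
{
    struct mydev *dev = (struct mydev *)data;
    /* ... drain dev's receive ring; must not sleep ... */
    (void)dev;
}

static void mydev_setup_tasklet(struct mydev *dev)
{
    /* Bind the embedded tasklet to its handler and to dev. */
    tasklet_init(&dev->rx_tasklet, mydev_rx_tasklet, (unsigned long)dev);
}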

Scheduling Tasklets:

include/linux/interrupt.h (linux-2.6.26):

static inline void tasklet_schedule(struct tasklet_struct *t)
{
    if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
        __tasklet_schedule(t);
}


static inline void tasklet_hi_schedule(struct tasklet_struct *t)
{
    if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
        __tasklet_hi_schedule(t);
}

void __tasklet_schedule(struct tasklet_struct *t)
{
    unsigned long flags;

    local_irq_save(flags);
    t->next = NULL;
    *__get_cpu_var(tasklet_vec).tail = t;
    __get_cpu_var(tasklet_vec).tail = &(t->next);
    raise_softirq_irqoff(TASKLET_SOFTIRQ);
    local_irq_restore(flags);
}

Disabling or enabling a tasklet:
static inline void tasklet_disable_nosync(struct tasklet_struct *t)
{
    atomic_inc(&t->count);
    smp_mb__after_atomic_inc();
}

static inline void tasklet_disable(struct tasklet_struct *t)
{
    tasklet_disable_nosync(t);
    tasklet_unlock_wait(t);
    smp_mb();
}

static inline void tasklet_enable(struct tasklet_struct *t)
{
    smp_mb__before_atomic_dec();
    atomic_dec(&t->count);
}

Killing a tasklet (removing it from a pending queue):

tasklet_kill() CANNOT be used in interrupt context because it may sleep.

void tasklet_kill(struct tasklet_struct *t)
{
    if (in_interrupt())
        printk("Attempt to kill tasklet from interrupt\n");

    while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
        do
            yield();
        while (test_bit(TASKLET_STATE_SCHED, &t->state));
    }
    tasklet_unlock_wait(t);
    clear_bit(TASKLET_STATE_SCHED, &t->state);
}


3rd part: Work Queues

1. The only bottom-half mechanism that runs in process context.

2. Schedulable and can sleep; used where the deferred work needs to allocate a lot of memory, obtain a semaphore, or perform block I/O.

3. A work queue is a simple interface for deferring work to a generic kernel thread, called a worker thread.

4. Worker threads of a given type exist one per CPU; each type is represented by a workqueue_struct, which holds one cpu_workqueue_struct per processor. A usage sketch follows.
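
A minimal usage sketch (hypothetical names) that defers work to the default events/N worker threads via schedule_work():

#include <linux/workqueue.h>

static void mydev_work_handler(struct work_struct *work)
{
    /* Runs in process context: may sleep, allocate with GFP_KERNEL,
     * take semaphores, perform block I/O, and so on. */
}

static DECLARE_WORK(mydev_work, mydev_work_handler);

/* Called from atomic context, e.g. an interrupt handler: */
static void mydev_defer(void)
{
    schedule_work(&mydev_work);    /* queue onto the default "events" workqueue */
}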


/*********************** sample code **************************/

Work queue structures:

kernel/workqueue.c (linux-2.6.26):
/*
 * The per-CPU workqueue (if single thread, we always use the first
 * possible cpu).
 */
struct cpu_workqueue_struct {

    spinlock_t lock;

    struct list_head worklist;
    wait_queue_head_t more_work;
    struct work_struct *current_work;

    struct workqueue_struct *wq;
    struct task_struct *thread;

    int run_depth;        /* Detect run_workqueue() recursion depth */
} ____cacheline_aligned;

/*
 * The externally visible workqueue abstraction is an array of
 * per-CPU workqueues:
 */
struct workqueue_struct {
    struct cpu_workqueue_struct *cpu_wq;
    struct list_head list;
    const char *name;
    int singlethread;
    int freezeable;        /* Freeze threads during suspend */
#ifdef CONFIG_LOCKDEP
    struct lockdep_map lockdep_map;
#endif
};


Work item structure (work_struct):

include/linux/workqueue.h (linux-2.6.26):
struct work_struct {
    atomic_long_t data;
#define WORK_STRUCT_PENDING 0        /* T if work item pending execution */
#define WORK_STRUCT_FLAG_MASK (3UL)
#define WORK_STRUCT_WQ_DATA_MASK (~WORK_STRUCT_FLAG_MASK)
    struct list_head entry;
    work_func_t func;
#ifdef CONFIG_LOCKDEP
    struct lockdep_map lockdep_map;
#endif
};


worker_thread() and run_workqueue() process the work list:

kernel/workqueue.c (linux-2.6.26):

static int worker_thread(void *__cwq)
{
    struct cpu_workqueue_struct *cwq = __cwq;
    DEFINE_WAIT(wait);

    if (cwq->wq->freezeable)
        set_freezable();

    set_user_nice(current, -5);

    for (;;) {
        prepare_to_wait(&cwq->more_work, &wait, TASK_INTERRUPTIBLE);
        if (!freezing(current) &&
            !kthread_should_stop() &&
            list_empty(&cwq->worklist))
            schedule();
        finish_wait(&cwq->more_work, &wait);

        try_to_freeze();

        if (kthread_should_stop())
            break;

        run_workqueue(cwq);
    }

    return 0;
}


static void run_workqueue(struct cpu_workqueue_struct *cwq)
{
    spin_lock_irq(&cwq->lock);
    cwq->run_depth++;
    if (cwq->run_depth > 3) {
        /* morton gets to eat his hat */
        printk("%s: recursion depth exceeded: %d\n",
            __func__, cwq->run_depth);
        dump_stack();
    }
    while (!list_empty(&cwq->worklist)) {
        struct work_struct *work = list_entry(cwq->worklist.next,
                        struct work_struct, entry);
        work_func_t f = work->func;
#ifdef CONFIG_LOCKDEP
        /*
         * It is permissible to free the struct work_struct
         * from inside the function that is called from it,
         * this we need to take into account for lockdep too.
         * To avoid bogus "held lock freed" warnings as well
         * as problems when looking into work->lockdep_map,
         * make a copy and use that here.
         */
        struct lockdep_map lockdep_map = work->lockdep_map;
#endif

        cwq->current_work = work;
        list_del_init(cwq->worklist.next);
        spin_unlock_irq(&cwq->lock);

        BUG_ON(get_wq_data(work) != cwq);
        work_clear_pending(work);
        lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
        lock_acquire(&lockdep_map, 0, 0, 0, 2, _THIS_IP_);
        f(work);
        lock_release(&lockdep_map, 1, _THIS_IP_);
        lock_release(&cwq->wq->lockdep_map, 1, _THIS_IP_);

        if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
            printk(KERN_ERR "BUG: workqueue leaked lock or atomic: "
                    "%s/0x%08x/%d\n",
                    current->comm, preempt_count(),
                           task_pid_nr(current));
            printk(KERN_ERR "    last function: ");
            print_symbol("%s\n", (unsigned long)f);
            debug_show_held_locks(current);
            dump_stack();
        }

        spin_lock_irq(&cwq->lock);
        cwq->current_work = NULL;
    }
    cwq->run_depth--;
    spin_unlock_irq(&cwq->lock);
}
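
For work that should not share the default events threads, a driver can create its own workqueue; a hedged sketch with hypothetical names, using the 2.6.26 interface:

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/workqueue.h>

static struct workqueue_struct *mydev_wq;

static void mydev_work_handler(struct work_struct *work)
{
    /* Process context: may sleep. */
}

static DECLARE_WORK(mydev_work, mydev_work_handler);

static int __init mydev_init(void)
{
    mydev_wq = create_workqueue("mydevd");    /* one worker thread per CPU */
    return mydev_wq ? 0 : -ENOMEM;
}

static void mydev_defer(void)
{
    queue_work(mydev_wq, &mydev_work);        /* run the handler in process context */
}

static void __exit mydev_exit(void)
{
    flush_workqueue(mydev_wq);                /* wait for pending work to finish */
    destroy_workqueue(mydev_wq);
}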


4th part: Locking Between the Bottom Halves

1. Two different tasklets sharing the same data requires proper locking.

2. Since softirqs provide no serialization, all shared data needs an appropriate lock.

3. If process context code and a bottom half share data, we need to disable bottom-half processing and obtain a lock before accessing the data.

Use local_bh_disable(void) and local_bh_enable(void).

These two calls do NOT disable the execution of work queues, because work queues run in process context and execute synchronously. However, softirqs and tasklets can occur asynchronously (say, on return from handling an interrupt), so kernel code may need to disable them.

4. If interrupt context code and a bottom half share data, we need to disable interrupts and obtain a lock before accessing the data.

5. Any shared data in a work queue requires locking, the same as in normal kernel code. A sketch of both locking patterns follows.
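
A hedged sketch of the two common patterns (hypothetical names and data): spin_lock_bh() when process context shares data with a bottom half, and spin_lock_irqsave() when an interrupt handler is involved:

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(mydev_lock);
static unsigned long mydev_shared_count;    /* data shared across contexts */

/* Process context vs. bottom half: disable local BHs and take the lock. */
static void mydev_from_process_context(void)
{
    spin_lock_bh(&mydev_lock);
    mydev_shared_count++;
    spin_unlock_bh(&mydev_lock);
}

/* Bottom half (or process context) vs. interrupt handler:
 * disable local interrupts and take the lock. */
static void mydev_from_bottom_half(void)
{
    unsigned long flags;

    spin_lock_irqsave(&mydev_lock, flags);
    mydev_shared_count++;
    spin_unlock_irqrestore(&mydev_lock, flags);
}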


kernel/softirq.c (linux-2.6.26) - local_bh_enable() is shown above in section D; its counterpart is local_bh_disable():

void local_bh_disable(void)
{
    __local_bh_disable((unsigned long)__builtin_return_address(0));
}



Reference:

Linux Kernel Development, 2nd Edition

linux-2.6.26 source

http://blog.pfan.cn/ljqy/31891.html
