Computer Science Notes

Thursday, May 23, 2013

Enums in Java

We all know the benefits of enums from programming languages like C/C++: the most known benefit is that you can assign values to a variable that must have a limited set of values by using an enum. Static type safety is ensured.

In most programming languages enums' types are primitives. In Java an enum is simply a class with the following characteristics:

it has no accessible constructors (constructors are private)

the enums' values is a list of final references with the same type as the enum's type

it can contain behavior

Design benefits

1. Avoid having a long list of constants defined in your code. In a class with lots of constants is hard to figure out what a constant may mean. Instead it's clearer to have the constants grouped together in several "types" of constants.

enums can't be extended
the only enum instances are those defined inside the enum definition

2. Enums extend java.lang.Object thus we can override any method we need (e.g. toString()). If enums were just primitive types we would have to implement this String representation outside of the enum code.

3. One can associate data with enum constants

4. Behavior can also be encapsulated inside an enum. This eliminates the need to implement logic somewhere else (worst is to have lots of constants with tons of conditional code spread inside various methods - with each conditional branch containing the logic for one or more constants)

5. Enums can't be extended, but they can implement interfaces

Small example

Suppose you'd like to implement a simple FSM. One possible way to do it is by using enums. The enums could hold the possible states of the FSM + the transition, which is a 3-tuple of (CurrentState, Event, NextState).

For fun, let's imagine a simple project scenario:

import static java.lang.System.out;

// this can be mapped to the project status (in our simple case)
enum CustomerSatisfaction {
PLEASED(10), SATISFIED(1000), UNPLEASED(50), FURIOUS(-1000);
private int money;
private CustomerSatisfaction(int money) { this. money = money; }
public int money() { return money; }
}

// for us, events are simply our ability to deliver
enum ProjectEvent {
ISSUE_SOLVED,
ELEGANT_SOLUTION,
SOLUTION_INTRODUCED_BUGS,
NASTY_WORKAROUND;
}

// the transition table
enum ProjectTransitions {
T1(CustomerSatisfaction.PLEASED, ProjectEvent.ISSUE_SOLVED,
CustomerSatisfaction.SATISFIED),
T2(CustomerSatisfaction.SATISFIED, ProjectEvent.NASTY_WORKAROUND,
CustomerSatisfaction.UNPLEASED),
T3(CustomerSatisfaction.UNPLEASED, ProjectEvent.NASTY_WORKAROUND,
CustomerSatisfaction.FURIOUS),
T4(CustomerSatisfaction.UNPLEASED, ProjectEvent.ELEGANT_SOLUTION,
CustomerSatisfaction.PLEASED);

private CustomerSatisfaction current, next;
private ProjectEvent event;

private ProjectTransitions (CustomerSatisfaction current, ProjectEvent event, CustomerSatisfaction next) {
this.current= current;
this.event = event;
this.next = next;
}

// in real-world scenario, a static enum map can be used here
public static CustomerSatisfaction next(CustomerSatisfaction current, ProjectEvent event) {
for (ProjectTransitions t : ProjectTransitions.values()) {
if (t.current == current && t.event == event)
return t.next;
}
// in case there is no transition
return current;
}
}

// dummy test client - in practice some observer + event generator would be in place
public class Project {
private volatile CustomerSatisfaction status;
private int earnings;

private final Object lock = new Object();
private final Object startLock = new Object();

private boolean eventReceived;
private boolean started;

public Project(CustomerSatisfaction initialStatus) {
status = initialStatus;
}

public void start() {
synchronized(startLock) {
started = true;
startLock.notifyAll();
}
while (status != CustomerSatisfaction.FURIOUS) {
synchronized(lock) {
while (!eventReceived) {
try {
lock.wait();
} catch (InterruptedException e) {
out.println("What's happening?");
}
}
earnings += status.money();
out.println("Earned " + status.money() + ". Customer is " + status.name());
eventReceived = false;
lock.notifyAll();
}
}
out.print("Project done! Earnings = " + earnings + " dollars. ");
out.println(earnings > 1000 ? "Good job!" : "Not very good");
}

public void onEvent(ProjectEvent event) {
synchronized (startLock) {
while (!started) {
try {
startLock.wait();
} catch (InterruptedException e) {
out.println("What's happening? :)");
}
}
}
synchronized(lock) {
while (eventReceived) {
try {
lock.wait();
} catch (InterruptedException e){}
}
out.println("New event: " + event.name());
status = ProjectTransitions.next(status, event);
eventReceived = true;
lock.notify();
}
}

public static void main (String... args) {
final Project productA = new Project(CustomerSatisfaction.UNPLEASED);
Thread t = new Thread(new Runnable() {
@Override public void run() {
productA.start();
}
}, "starter");
t.start();

// a simple event injector
productA.onEvent(ProjectEvent.ISSUE_SOLVED);
productA.onEvent(ProjectEvent.ELEGANT_SOLUTION);
productA.onEvent(ProjectEvent.ISSUE_SOLVED);
productA.onEvent(ProjectEvent.ELEGANT_SOLUTION);
productA.onEvent(ProjectEvent.NASTY_WORKAROUND);
productA.onEvent(ProjectEvent.NASTY_WORKAROUND);
productA.onEvent(ProjectEvent.NASTY_WORKAROUND);

try { t.join(); } catch (InterruptedException e) {}
}
}

Notice how easy it is to modify the state machine without modifying client code (no external logic in any place outside the FSM itself): states, events and transitions can be easily added / modified without affecting the other part of the system. This also makes it harder to introduce errors in the system.

Thursday, December 20, 2012

Syscall - part 2

Some argue that on x86 a syscall implies a context switch... Actually the kernel code runs in two possible contexts:

in kernel-space mode, in process context (on behalf of a specific process)
in kernel-space, in interrupt context (not bound to any process)

The CPU executes either userland code in user space, or kernel code in one of the two above-mentioned contexts.

By context switch we usually mean changing the current process; it is what happens when the current PID changes (when the scheduler preempts a process); In the following lines I'd like to stress that a syscall happens without this context switch.

_kernel_vsyscall() looks like this:

ffffe400 <__kernel_vsyscall>:
ffffe400: 51 push %ecx
ffffe401: 52 push %edx
ffffe402: 55 push %ebp
ffffe403: 89 e5 mov %esp,%ebp
ffffe405: 0f 34 sysenter
ffffe407: 90 nop
ffffe408: 90 nop
ffffe409: 90 nop
ffffe40a: 90 nop
ffffe40b: 90 nop
ffffe40c: 90 nop
ffffe40d: 90 nop
ffffe40e: eb f3 jmp ffffe403 <__kernel_vsyscall x3="x3">
ffffe410: 5d pop %ebp
ffffe411: 5a pop %edx
ffffe412: 59 pop %ecx
ffffe413: c3 ret

esp = stack pointer
eip = instruction pointer

Explanation:
- after moving to this address, registers %ecx, %edx and %ebp are saved on the user stack and %esp is copied to %ebp before executing sysenter (this %ebp later helps kernel in restoring userland stack)

- jmp __kernel_vsyscall+0x3 is just a trick made in order to be able to work with 6 arguments instead of 3; the standard max number of args for a syscall is 6:
- we make sysenter twice (the second sysenter has no impact: sysenter is just "restarted") - see https://lkml.org/lkml/2002/12/18/218 (Linus is a "disguisting pig") :)

- sysenter is executed; this will bring the CPU in Ring0 (a.k.a. CPL=0)

- sysenter (fast system call facility on x86) does the following:
- CS register set to the value of (SYSENTER_CS_MSR)
- EIP register set to the value of (SYSENTER_EIP_MSR)
- SS register set to the sum of (8 plus the value in SYSENTER_CS_MSR)
- ESP register set to the value of (SYSENTER_ESP_MSR)
- Intel defines these SFRs:
SYSENTER_CS_MSR=0x174
SYSENTER_ESP_MSR=0x175
SYSENTER_EIP_MSR=0x176
- these values are defined in linux in /usr/src/linux/include/asm/msr.h:
#define MSR_IA32_SYSENTER_CS 0x174
#define MSR_IA32_SYSENTER_ESP 0x175
#define MSR_IA32_SYSENTER_EIP 0x176

- at bootup, linux sets this values in a special page (/usr/src/linux/arch/i386/kernel/sysenter.c):
wrmsr(MSR_IA32_SYSENTER_CS, __KERNEL_CS, 0);
wrmsr(MSR_IA32_SYSENTER_ESP, tss->esp1, 0);
wrmsr(MSR_IA32_SYSENTER_EIP, (unsigned long) sysenter_entry, 0);

- wrmsr writes the contents of registers EDX:EAX into the 64-bit model specific register (MSR) specified in the ECX register

- 'tss' refers to the Task State Segment (TSS) and tss->esp1 thus points to the kernel mode stack

- Although Linux doesn't use hardware context switches, it is nonetheless forced to set up a TSS for each distinct CPU in the system. This is done for two main reasons:
- When an x86 CPU switches from User Mode to Kernel Mode, it fetches the address of the Kernel Mode stack from the TSS
- When a User Mode process attempts to access an I/O port by means of an in or out instruction, the CPU may need to access an I/O Permission Bitmap stored in the TSS to verify whether the process is allowed to address the port.

- So during initialization kernel sets up these registers such that after SYSENTER instruction, ESP is set to kernel mode stack and EIP is set to sysenter_entry

- ESP is set to kernel mode stack and EIP is set to sysenter_entry

- now the kernel executes the following code (we are in Ring0 and we are executing kernel code, but the current PID is the old PID!
- the calling user thread is still at the sysenter line;
- context switching is done before returning to user space
- in linux context switching is made in software, not in hardware
- however, linux uses TSS for every process it creates - it creates/stores the TSS entry for the process at process creation):

- When a transition between user mode and kernel mode is required in an operating system, a context switch is not necessary; a mode transition is not by itself a context switch.

However, depending on the operating system, a context switch may also take place at this time.(http://en.wikipedia.org/wiki/Context_switch#User_and_kernel_mode_switching)

179 ENTRY(sysenter_entry)
180 movl TSS_sysenter_esp0(%esp),%esp
181 sysenter_past_esp:
182 sti
183 pushl $(__USER_DS)
184 pushl %ebp [%ebp contains userland %esp]
185 pushfl
186 pushl $(__USER_CS)
187 pushl $SYSENTER_RETURN [%userland return addr]
188
...
201 pushl %eax
202 SAVE_ALL [pushes registers on to stack]
203 GET_THREAD_INFO(%ebp)
204
205 /* Note, _TIF_SECCOMP is bit number 8, and so it needs testw and not testb */
206 testw $(_TIF_SYSCALL_EMU|_TIF_SYSCALL_TRACE|_TIF_SECCOMP|_TIF_SYSCALL_AUDIT), TI_flags(%ebp)
207 jnz syscall_trace_entry
208 cmpl $(nr_syscalls), %eax
209 jae syscall_badsys

210 call *sys_call_table(,%eax,4)

211 movl %eax,EAX(%esp)

#define SAVE_ALL \
cld; \
pushl %es; \
pushl %ds; \
pushl %eax; \
pushl %ebp; \
pushl %edi; \
pushl %esi; \
pushl %edx; \
pushl %ecx; \
pushl %ebx; \
movl $(__USER_DS), %edx; \
movl %edx, %ds; \
movl %edx, %es;

In conclusion, what happens is not a context switch, but a mode transition. The current PID is the same. And, btw, user preemption can only happen in one of the 2 situations:

After syscall finishes and the need_resched flag is set
After interrupt finishes and need_resched flag is set

The need_resched flag is set by the scheduler tick when a thread needs to be preempted or by try_to_wake_up() when a higher priority process can be awakened.

Floating point operations: hardware vs software

This is a (maybe incomplete) list of floating point operations in GCC:
__udivsi3
_vfprintf_r
__eqdf2
__nedf2
__umoddi3
__udivdi3
__ltdf2
__eqdf2
__nedf2
__negdf2
__udivsi3
__eqdf2
__subdf3
__muldf3
__adddf3
__floatsidf
__muldf3
__adddf3
__fixdfsi
__ltdf2
__gtdf2
__divdf3
__muldf3
__divdf3
__ltdf2
__muldf3
__floatsidf
__muldf3
__adddf3
__subdf3
__gtdf2
__negdf2
__gtdf2
__muldf3
__gedf2
__divdf3

Depending on gcc/g++ compilation flag (e.g. -msoft-float), the linker will link these instructions to either calls to hardware FPU (x87 on x86 machines), or use user-defined libraries which implement the above-mentioned functions.

The only difference between hw versus sw FP operations should be speed, because in sw the same algorithms are applied as in hw.

An example of fp div operation (from Apple oss clang implementation): http://www.opensource.apple.com/source/clang/clang-163.7.1/src/projects/compiler-rt/lib/divdf3.c

In reality, some FPUs have certain features like not making the rounding until the operand is written to memory; this causes some rounding improvements over sw implementations.
However, for precise FP operations (e.g. like the ones used in financial applications), libraries exist which reduce even more the rounding errors: e.g. Java's BigDecimal, GMP (http://gmplib.org/), etc.