Surprisingly, when the OS is loaded it maps the whole of physical memory into the virtual address space, and this direct mapping is linear: contiguous physical memory stays contiguous in virtual addresses. The kernel itself works in exactly this virtual address space; for example, kmalloc() in the Linux kernel returns a pointer from it. vmalloc(), however, maps physical pages into a separate virtual address range, the vmalloc area.
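To see the difference, here is a minimal kernel-module sketch (my own illustration, not from the kernel sources): virt_to_phys() is valid for kmalloc() memory precisely because it lives in the direct mapping, while a vmalloc() address has to be translated through the page table, e.g. with vmalloc_to_pfn().

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/io.h>

static int __init addr_demo_init(void)
{
        void *k = kmalloc(PAGE_SIZE, GFP_KERNEL);
        void *v = vmalloc(PAGE_SIZE);

        if (k && v) {
                /* kmalloc() memory lies in the contiguous direct mapping,
                 * so its physical address is a simple offset away. */
                pr_info("kmalloc: va=%p pa=%llx\n",
                        k, (unsigned long long)virt_to_phys(k));
                /* vmalloc() memory lives in the separate vmalloc area and
                 * must be resolved page by page via the page table. */
                pr_info("vmalloc: va=%p pfn=%lx\n", v, vmalloc_to_pfn(v));
        }
        kfree(k);
        vfree(v);
        return 0;
}

static void __exit addr_demo_exit(void) { }

module_init(addr_demo_init);
module_exit(addr_demo_exit);
MODULE_LICENSE("GPL");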
Thus kernel addresses can be resolved by simple offset arithmetic (see linux/arch/x86/include/asm/page_64.h):
static inline unsigned long __phys_addr_nodebug(unsigned long x)
{
        unsigned long y = x - __START_KERNEL_map;

        /* use the carry flag to determine if x was < __START_KERNEL_map */
        x = y + ((x > y) ? phys_base : (__START_KERNEL_map - PAGE_OFFSET));

        return x;
}
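The function handles both kernel ranges without a branch: for addresses in the kernel text mapping (at or above __START_KERNEL_map) the subtraction does not wrap, so the physical load offset phys_base is added; for direct-mapping addresses (below __START_KERNEL_map) the subtraction wraps, and the result reduces to a plain PAGE_OFFSET subtraction. A small user-space sketch of the arithmetic, with layout constants assumed from an older non-KASLR x86-64 kernel:

#include <stdio.h>

/* Assumed layout constants for illustration; real kernels may relocate. */
#define __START_KERNEL_map 0xffffffff80000000UL
#define PAGE_OFFSET        0xffff880000000000UL
#define phys_base          0x1000000UL /* kernel loaded at 16 MiB, say */

static unsigned long __phys_addr_nodebug(unsigned long x)
{
        unsigned long y = x - __START_KERNEL_map;

        return y + ((x > y) ? phys_base
                            : (__START_KERNEL_map - PAGE_OFFSET));
}

int main(void)
{
        /* Kernel text mapping: 0x1234567 + phys_base = 0x2234567. */
        printf("%#lx\n", __phys_addr_nodebug(0xffffffff81234567UL));
        /* Direct mapping: simply strips PAGE_OFFSET -> 0x12345678. */
        printf("%#lx\n", __phys_addr_nodebug(0xffff880012345678UL));
        return 0;
}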
Since virtual-to-physical address translation is trivial here, why on earth do we need to go through the page table and waste invaluable TLB entries on these translations? Unfortunately, x86-64 has nothing like MIPS's directly mapped kernel segments: even the trivial linear mapping costs TLB entries and requires extra memory transfers for page-table walks. The one optimization x86-64 does offer is global pages, used for kernel mappings, which are never invalidated in the TLB on context switches. But still, when you enter the kernel, its mappings evict your user-space mappings from the TLB. So if your application is memory greedy, syscalls can take a long time due to TLB misses.
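One way to observe this (a rough sketch of my own, not a rigorous benchmark) is to touch roughly as many pages as the TLB can hold, then time re-walking them before and after a burst of syscalls; the difference is small and noisy, so it is best confirmed with hardware counters, e.g. perf stat -e dTLB-load-misses.

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#define NPAGES 512 /* on the order of dTLB capacity: a warm walk mostly hits */
#define PAGE   4096UL

/* Walk one byte per page and return the elapsed nanoseconds. */
static long walk(volatile char *buf)
{
        struct timespec a, b;
        size_t i;

        clock_gettime(CLOCK_MONOTONIC, &a);
        for (i = 0; i < NPAGES; i++)
                buf[i * PAGE]++;
        clock_gettime(CLOCK_MONOTONIC, &b);
        return (b.tv_sec - a.tv_sec) * 1000000000L + (b.tv_nsec - a.tv_nsec);
}

int main(void)
{
        volatile char *buf = malloc(NPAGES * PAGE);
        int i;

        walk(buf);                              /* fault the pages in */
        printf("warm walk:           %ld ns\n", walk(buf));

        /* Each kernel entry pulls kernel translations into the TLB,
         * competing with our user-space entries. */
        for (i = 0; i < 1000; i++)
                syscall(SYS_getpid);
        printf("walk after syscalls: %ld ns\n", walk(buf));

        return 0;
}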