Split x86_64_start_kernel() into two pieces:
The first essentially cleans up after head_64.S. It clears the
bss, zaps low identity mappings, sets up some early exception
handlers.
The second part preserves the boot data, reserves the kernel's
text/data/bss, pagetables and ramdisk, and then starts the kernel
proper.
This split is so that Xen can call the second part to do the set up it
needs done. It doesn't need any of the first part setups, because it
doesn't boot via head_64.S, and its redundant or actively damaging.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Set __PAGE_OFFSET to the most negative possible address +
16*PGDIR_SIZE. The gap is to allow a space for a hypervisor to fit.
The gap is more or less arbitrary, but it's what Xen needs.
When booting native, kernel/head_64.S has a set of compile-time
generated pagetables used at boot time. This patch removes their
absolutely hard-coded layout, and makes it parameterised on
__PAGE_OFFSET (and __START_KERNEL_map).
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On 32-bit it's best to use a %cs: prefix to access memory where the
other segments may not bet set up properly yet. On 64-bit it's best
to use a rip-relative addressing mode. Define PARA_INDIRECT() to
abstract this and generate the proper addressing mode in each case.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add hooks which are called at pgd_alloc/free time. The pgd_alloc hook
may return an error code, which if non-zero, causes the pgd allocation
to be failed. The hooks may be used to allocate/free auxillary
per-pgd information.
also fix:
> * Ingo Molnar <mingo@elte.hu> wrote:
>
> include/asm/pgalloc.h: In function ‘paravirt_pgd_free':
> include/asm/pgalloc.h:14: error: parameter name omitted
> arch/x86/kernel/entry_64.S: In file included from
> arch/x86/kernel/traps_64.c:51:include/asm/pgalloc.h: In function ‘paravirt_pgd_free':
> include/asm/pgalloc.h:14: error: parameter name omitted
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is a preparatory patch for the next patch in series.
Moves some code from e820_setup_gap to a new function e820_search_gap.
This patch is a part of a bug fix where we walk the ACPI table to calculate
a gap for PCI optional devices.
v1->v2: Patch on top of tip/master.
Fixes a bug introduced in the last patch about the typeof "last".
Also the new function e820_search_gap now returns if we found a gap in
e820_map.
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Cc: lenb@kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
... so can we use mem below max_low_pfn earlier.
this allows us to move several functions more early instead of waiting
to after paging_init.
That includes moving relocate_initrd() earlier in the bootup, and kva
related early setup done in initmem_init. (in followup patches)
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There is no need to keep NMI_DISABLED definition and use it
for nmi_watchdog by default. Here is the point why:
- IO-APIC and APIC chips are programmed for nmi_watchdog support at very
early stage of kernel booting and not having nmi_watchdog specified as
boot option lead only to nmi_watchdog becomes to NMI_NONE anyway
- enable nmi_watchdog thru /proc/sys/kernel/nmi if it was not specified at
boot is not possible too (even having this sysfs entry)
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: macro@linux-mips.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add support for overlapping early memory reservations.
In general, they still can't overlap, and will panic
with "Overlapping early reservations" if they do overlap.
But if a memory range is reserved with the new call:
reserve_early_overlap_ok()
rather than with the usual call:
reserve_early()
then subsequent early reservations are allowed to overlap.
This new reserve_early_overlap_ok() call is only used in one
place so far, which is the "BIOS reserved" reservation for the
the EBDA region, which out of Paranoia reserves more than what
the BIOS might have specified, and which thus might overlap with
another legitimate early memory reservation (such as, perhaps,
the EFI memmap.)
Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: "Yinghai Lu" <yhlu.kernel@gmail.com>
Cc: "Jack Steiner" <steiner@sgi.com>
Cc: "Mike Travis" <travis@sgi.com>
Cc: "Huang
Cc: Ying" <ying.huang@intel.com>
Cc: "Andi Kleen" <andi@firstfloor.org>
Cc: "Andrew Morton" <akpm@linux-foundation.org>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
those function depend on paging setup pgtable, so they could access
the ram in bootmem region but just get mapped.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
so that max_low_pfn is not changed after it is set.
so we can move that early and out of initmem_init.
could call find_low_pfn_range just after max_pfn is set.
also could move reserve_initrd out of setup_bootmem_allocator
so 32bit is more like 64bit.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
... so could add real hole in e820
agp check is using request_mem_region, and could fail if e820 is reserved...
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
want to remove arch_get_ram_range, and use early_node_map instead.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Take it out of smpboot.c, and move it to process_32.c, closer
to its only user.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We create a version of it for i386, and then take the CONFIG_X86_64
ifdef out of the game. We could create a __setup_vector_irq for i386,
but it would incur in an unnecessary lock taking. Moreover, it is better
practice to only export setup_vector_irq anyway.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
i386 and x86_64 used two different schemes for maintaining the gdt.
With this patch, x86_64 initial gdt table is defined in a .c file,
same way as i386 is now. Also, we call it "gdt_page", and the descriptor,
"early_gdt_descr". This way we achieve common naming, which can allow for
more code integration.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Making a variable page-aligned by using
__attribute__((section(".data.page_aligned"))) is fragile because if
sizeof(variable) is not also a multiple of page size, it leaves
variables in the remainder of the section unaligned.
This patch introduces two new qualifiers, __page_aligned_data and
__page_aligned_bss to set the section *and* the alignment of
variables. This makes page-aligned variables more robust because the
linker will make sure they're aligned properly. Unfortunately it
requires *all* page-aligned data to use these macros...
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch moves the reserve_crashkernel() to setup.c and removes the
architecture-specific version. Both versions were more or less the same.
I tested it on both x86-64 and i386, with CONFIG_KEXEC on and off (so
that it compiles).
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: yhlu.kernel@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The VLAN code contains multiple spots that use tag, id and tci as
identifiers for arguments and variables incorrectly and they actually
contain or are expected to contain something different. Additionally
types are used inconsistently (unsigned short vs u16) and identifiers
are sometimes capitalized.
- consistently use u16 for storing TCI, ID or QoS values
- consistently use vlan_id and vlan_tci for storing the respective values
- remove capitalization
- add kdoc comment to netif_hwaccel_{rx,receive_skb}
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hide struct vlan_dev_info from drivers to prevent them from growing
more creative ways to use it. Provide accessors for the two drivers
that currently use it.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function is huge and included at least once in every VLAN acceleration
capable driver. Uninline it; to avoid having drivers depend on the VLAN
module, the function is always built in statically when VLAN is enabled.
With all VLAN acceleration capable drivers that build on x86_64 enabled,
this results in:
text data bss dec hex filename
6515227 854044 343968 7713239 75b1d7 vmlinux.inlined
6505637 854044 343968 7703649 758c61 vmlinux.uninlined
----------------------------------------------------------
-9590
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Increase the maximum number of apics when running very large
configurations. This patch has no affect on most systems.
The patch has no effect on any 32-bit kernel. It adds ~4k to the size
of 64-bit kernels but only if NR_CPUS > 255.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>