Pull rdma updates from Doug Ledford:
- a large cleanup of how device capabilities are checked for various
features
- additional cleanups in the MAD processing
- update to the srp driver
- creation and use of centralized log message helpers
- add const to a number of args to calls and clean up call chain
- add support for extended cq create verb
- add support for timestamps on cq completion
- add support for processing OPA MAD packets
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (92 commits)
IB/mad: Add final OPA MAD processing
IB/mad: Add partial Intel OPA MAD support
IB/mad: Add partial Intel OPA MAD support
IB/core: Add OPA MAD core capability flag
IB/mad: Add support for additional MAD info to/from drivers
IB/mad: Convert allocations from kmem_cache to kzalloc
IB/core: Add ability for drivers to report an alternate MAD size.
IB/mad: Support alternate Base Versions when creating MADs
IB/mad: Create a generic helper for DR forwarding checks
IB/mad: Create a generic helper for DR SMP Recv processing
IB/mad: Create a generic helper for DR SMP Send processing
IB/mad: Split IB SMI handling from MAD Recv handler
IB/mad cleanup: Generalize processing of MAD data
IB/mad cleanup: Clean up function params -- find_mad_agent
IB/mlx4: Add support for CQ time-stamping
IB/mlx4: Add mmap call to map the hardware clock
IB/core: Pass hardware specific data in query_device
IB/core: Add timestamp_mask and hca_core_clock to query_device
IB/core: Extend ib_uverbs_create_cq
IB/core: Add CQ creation time-stamping flag
...
Pull power management and ACPI updates from Rafael Wysocki:
"The rework of backlight interface selection API from Hans de Goede
stands out from the number of commits and the number of affected
places perspective. The cpufreq core fixes from Viresh Kumar are
quite significant too as far as the number of commits goes and because
they should reduce CPU online/offline overhead quite a bit in the
majority of cases.
From the new featues point of view, the ACPICA update (to upstream
revision 20150515) adding support for new ACPI 6 material to ACPICA is
the one that matters the most as some new significant features will be
based on it going forward. Also included is an update of the ACPI
device power management core to follow ACPI 6 (which in turn reflects
the Windows' device PM implementation), a PM core extension to support
wakeup interrupts in a more generic way and support for the ACPI _CCA
device configuration object.
The rest is mostly fixes and cleanups all over and some documentation
updates, including new DT bindings for Operating Performance Points.
There is one fix for a regression introduced in the 4.1 cycle, but it
adds quite a number of lines of code, it wasn't really ready before
Thursday and you were on vacation, so I refrained from pushing it on
the last minute for 4.1.
Specifics:
- ACPICA update to upstream revision 20150515 including basic support
for ACPI 6 features: new ACPI tables introduced by ACPI 6 (STAO,
XENV, WPBT, NFIT, IORT), changes related to the other tables (DTRM,
FADT, LPIT, MADT), new predefined names (_BTH, _CR3, _DSD, _LPI,
_MTL, _PRR, _RDI, _RST, _TFP, _TSN), fixes and cleanups (Bob Moore,
Lv Zheng).
- ACPI device power management core code update to follow ACPI 6
which reflects the ACPI device power management implementation in
Windows (Rafael J Wysocki).
- rework of the backlight interface selection logic to reduce the
number of kernel command line options and improve the handling of
DMI quirks that may be involved in that and to make the code
generally more straightforward (Hans de Goede).
- fixes for the ACPI Embedded Controller (EC) driver related to the
handling of EC transactions (Lv Zheng).
- fix for a regression related to the ACPI resources management and
resulting from a recent change of ACPI initialization code ordering
(Rafael J Wysocki).
- fix for a system initialization regression related to ACPI
introduced during the 3.14 cycle and caused by running the code
that switches the platform over to the ACPI mode too early in the
initialization sequence (Rafael J Wysocki).
- support for the ACPI _CCA device configuration object related to
DMA cache coherence (Suravee Suthikulpanit).
- ACPI/APEI fixes and cleanups (Jiri Kosina, Borislav Petkov).
- ACPI battery driver cleanups (Luis Henriques, Mathias Krause).
- ACPI processor driver cleanups (Hanjun Guo).
- cleanups and documentation update related to the ACPI device
properties interface based on _DSD (Rafael J Wysocki).
- ACPI device power management fixes (Rafael J Wysocki).
- assorted cleanups related to ACPI (Dominik Brodowski, Fabian
Frederick, Lorenzo Pieralisi, Mathias Krause, Rafael J Wysocki).
- fix for a long-standing issue causing General Protection Faults to
be generated occasionally on return to user space after resume from
ACPI-based suspend-to-RAM on 32-bit x86 (Ingo Molnar).
- fix to make the suspend core code return -EBUSY consistently in all
cases when system suspend is aborted due to wakeup detection (Ruchi
Kandoi).
- support for automated device wakeup IRQ handling allowing drivers
to make their PM support more starightforward (Tony Lindgren).
- new tracepoints for suspend-to-idle tracing and rework of the
prepare/complete callbacks tracing in the PM core (Todd E Brandt,
Rafael J Wysocki).
- wakeup sources framework enhancements (Jin Qian).
- new macro for noirq system PM callbacks (Grygorii Strashko).
- assorted cleanups related to system suspend (Rafael J Wysocki).
- cpuidle core cleanups to make the code more efficient (Rafael J
Wysocki).
- powernv/pseries cpuidle driver update (Shilpasri G Bhat).
- cpufreq core fixes related to CPU online/offline that should reduce
the overhead of these operations quite a bit, unless the CPU in
question is physically going away (Viresh Kumar, Saravana Kannan).
- serialization of cpufreq governor callbacks to avoid race
conditions in some cases (Viresh Kumar).
- intel_pstate driver fixes and cleanups (Doug Smythies, Prarit
Bhargava, Joe Konno).
- cpufreq driver (arm_big_little, cpufreq-dt, qoriq) updates (Sudeep
Holla, Felipe Balbi, Tang Yuantian).
- assorted cleanups in cpufreq drivers and core (Shailendra Verma,
Fabian Frederick, Wang Long).
- new Device Tree bindings for representing Operating Performance
Points (Viresh Kumar).
- updates for the common clock operations support code in the PM core
(Rajendra Nayak, Geert Uytterhoeven).
- PM domains core code update (Geert Uytterhoeven).
- Intel Knights Landing support for the RAPL (Running Average Power
Limit) power capping driver (Dasaratharaman Chandramouli).
- fixes related to the floor frequency setting on Atom SoCs in the
RAPL power capping driver (Ajay Thomas).
- runtime PM framework documentation update (Ben Dooks).
- cpupower tool fix (Herton R Krzesinski)"
* tag 'pm+acpi-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (194 commits)
cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state
x86: Load __USER_DS into DS/ES after resume
PM / OPP: Add binding for 'opp-suspend'
PM / OPP: Allow multiple OPP tables to be passed via DT
PM / OPP: Add new bindings to address shortcomings of existing bindings
ACPI: Constify ACPI device IDs in documentation
ACPI / enumeration: Document the rules regarding the PRP0001 device ID
ACPI / video: Make acpi_video_unregister_backlight() private
acpi-video-detect: Remove old API
toshiba-acpi: Port to new backlight interface selection API
thinkpad-acpi: Port to new backlight interface selection API
sony-laptop: Port to new backlight interface selection API
samsung-laptop: Port to new backlight interface selection API
msi-wmi: Port to new backlight interface selection API
msi-laptop: Port to new backlight interface selection API
intel-oaktrail: Port to new backlight interface selection API
ideapad-laptop: Port to new backlight interface selection API
fujitsu-laptop: Port to new backlight interface selection API
eeepc-laptop: Port to new backlight interface selection API
dell-wmi: Port to new backlight interface selection API
...
Pull PCI updates from Bjorn Helgaas:
"PCI changes for the v4.2 merge window:
Enumeration
- Move pci_ari_enabled() to global header (Alex Williamson)
- Account for ARI in _PRT lookups (Alex Williamson)
- Remove unused pci_scan_bus_parented() (Yijing Wang)
Resource management
- Use host bridge _CRS info on systems with >32 bit addressing (Bjorn Helgaas)
- Use host bridge _CRS info on Foxconn K8M890-8237A (Bjorn Helgaas)
- Fix pci_address_to_pio() conversion of CPU address to I/O port (Zhichang Yuan)
- Add pci_bus_addr_t (Yinghai Lu)
PCI device hotplug
- Wait for pciehp command completion where necessary (Alex Williamson)
- Drop pointless ACPI-based "slot detection" check (Rafael J. Wysocki)
- Check ignore_hotplug for all downstream devices (Rafael J. Wysocki)
- Propagate the "ignore hotplug" setting to parent (Rafael J. Wysocki)
- Inline pciehp "handle event" functions into the ISR (Bjorn Helgaas)
- Clean up pciehp debug logging (Bjorn Helgaas)
Power management
- Remove redundant PCIe port type checking (Yijing Wang)
- Add dev->has_secondary_link to track downstream PCIe links (Yijing Wang)
- Use dev->has_secondary_link to find downstream links for ASPM (Yijing Wang)
- Drop __pci_disable_link_state() useless "force" parameter (Bjorn Helgaas)
- Simplify Clock Power Management setting (Bjorn Helgaas)
Virtualization
- Add ACS quirks for Intel 9-series PCH root ports (Alex Williamson)
- Add function 1 DMA alias quirk for Marvell 9120 (Sakari Ailus)
MSI
- Disable MSI at enumeration even if kernel doesn't support MSI (Michael S. Tsirkin)
- Remove unused pci_msi_off() (Bjorn Helgaas)
- Rename msi_set_enable(), msix_clear_and_set_ctrl() (Michael S. Tsirkin)
- Export pci_msi_set_enable(), pci_msix_clear_and_set_ctrl() (Michael S. Tsirkin)
- Drop pci_msi_off() calls during probe (Michael S. Tsirkin)
APM X-Gene host bridge driver
- Add APM X-Gene v1 PCIe MSI/MSIX termination driver (Duc Dang)
- Add APM X-Gene PCIe MSI DTS nodes (Duc Dang)
- Disable Configuration Request Retry Status for v1 silicon (Duc Dang)
- Allow config access to Root Port even when link is down (Duc Dang)
Broadcom iProc host bridge driver
- Allow override of device tree IRQ mapping function (Hauke Mehrtens)
- Add BCMA PCIe driver (Hauke Mehrtens)
- Directly add PCI resources (Hauke Mehrtens)
- Free resource list after registration (Hauke Mehrtens)
Freescale i.MX6 host bridge driver
- Add speed change timeout message (Troy Kisky)
- Rename imx6_pcie_start_link() to imx6_pcie_establish_link() (Bjorn Helgaas)
Freescale Layerscape host bridge driver
- Use dw_pcie_link_up() consistently (Bjorn Helgaas)
- Factor out ls_pcie_establish_link() (Bjorn Helgaas)
Marvell MVEBU host bridge driver
- Remove mvebu_pcie_scan_bus() (Yijing Wang)
NVIDIA Tegra host bridge driver
- Remove tegra_pcie_scan_bus() (Yijing Wang)
Synopsys DesignWare host bridge driver
- Consolidate outbound iATU programming functions (Jisheng Zhang)
- Use iATU0 for cfg and IO, iATU1 for MEM (Jisheng Zhang)
- Add support for x8 links (Zhou Wang)
- Wait for link to come up with consistent style (Bjorn Helgaas)
- Use pci_scan_root_bus() for simplicity (Yijing Wang)
TI DRA7xx host bridge driver
- Use dw_pcie_link_up() consistently (Bjorn Helgaas)
Miscellaneous
- Include <linux/pci.h>, not <asm/pci.h> (Bjorn Helgaas)
- Remove unnecessary #includes of <asm/pci.h> (Bjorn Helgaas)
- Remove unused pcibios_select_root() (again) (Bjorn Helgaas)
- Remove unused pci_dma_burst_advice() (Bjorn Helgaas)
- xen/pcifront: Don't use deprecated function pci_scan_bus_parented() (Arnd Bergmann)"
* tag 'pci-v4.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (58 commits)
PCI: pciehp: Inline the "handle event" functions into the ISR
PCI: pciehp: Rename queue_interrupt_event() to pciehp_queue_interrupt_event()
PCI: pciehp: Make queue_interrupt_event() void
PCI: xgene: Allow config access to Root Port even when link is down
PCI: xgene: Disable Configuration Request Retry Status for v1 silicon
PCI: pciehp: Clean up debug logging
x86/PCI: Use host bridge _CRS info on systems with >32 bit addressing
PCI: imx6: Add #define PCIE_RC_LCSR
PCI: imx6: Use "u32", not "uint32_t"
PCI: Remove unused pci_scan_bus_parented()
xen/pcifront: Don't use deprecated function pci_scan_bus_parented()
PCI: imx6: Add speed change timeout message
PCI/ASPM: Simplify Clock Power Management setting
PCI: designware: Wait for link to come up with consistent style
PCI: layerscape: Factor out ls_pcie_establish_link()
PCI: layerscape: Use dw_pcie_link_up() consistently
PCI: dra7xx: Use dw_pcie_link_up() consistently
x86/PCI: Use host bridge _CRS info on Foxconn K8M890-8237A
PCI: pciehp: Wait for hotplug command completion where necessary
PCI: Remove unused pci_dma_burst_advice()
...
Pull gpio updates from Linus Walleij:
"This is the big bulk of GPIO changes queued for the v4.2 kernel
series:
- a big set of cleanups to the aged sysfs interface from Johan
Hovold. To get these in, v4.1-rc3 was merged into the tree as the
first patch in that series had to go into stable. This makes the
locking much more fine-grained (get rid of the "big GPIO lock(s)"
and store states in the GPIO descriptors.
- rename gpiod_[g|s]et_array() to gpiod_[g|s]et_array_value() to
avoid confusions.
- New drivers for:
* NXP LPC18xx (currently LPC1850)
* NetLogic XLP
* Broadcom STB SoC's
* Axis ETRAXFS
* Zynq Ultrascale+ (subdriver)
- ACPI:
* make it possible to retrieve GpioInt resources from a GPIO
device using acpi_dev_gpio_irq_get()
* merge some dependent I2C changes exploiting this.
* support the ARM X-Gene GPIO standby driver.
- make it possible for the generic GPIO driver to read back the value
set registers to reflect current status.
- loads of OMAP IRQ handling fixes.
- incremental improvements to Kona, max732x, OMAP, MXC, RCAR,
PCA953x, STP-XWAY, PCF857x, Crystalcove, TB10x.
- janitorial (constification, checkpatch cleanups)"
* tag 'gpio-v4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (71 commits)
gpio: Fix checkpatch.pl issues
gpio: pcf857x: handle only enabled irqs
gpio / ACPI: Return -EPROBE_DEFER if the gpiochip was not found
GPIO / ACPI: export acpi_gpiochip_request(free)_interrupts for module use
gpio: improve error reporting on own descriptors
gpio: promote own request failure to pr_err()
gpio: Added support to Zynq Ultrascale+ MPSoC
gpio: add ETRAXFS GPIO driver
fix documentation after renaming gpiod_set_array to gpiod_set_array_value
gpio: Add GPIO support for Broadcom STB SoCs
gpio: xgene: add ACPI support for APM X-Gene GPIO standby driver
gpio: tb10x: Drop unneeded free_irq() call
gpio: crystalcove: set IRQCHIP_SKIP_SET_WAKE for the irqchip
gpio: stp-xway: Use the of_property_read_u32 helper
gpio: pcf857x: Check for irq_set_irq_wake() failures
gpio-stp-xway: Fix enabling the highest bit of the PHY LEDs
gpio: Prevent an integer overflow in the pca953x driver
gpio: omap: rework omap_gpio_irq_startup to handle current pin state properly
gpio: omap: rework omap_gpio_request to touch only gpio specific registers
gpio: omap: rework omap_x_irq_shutdown to touch only irqs specific registers
...
Pull x86 core updates from Ingo Molnar:
"There were so many changes in the x86/asm, x86/apic and x86/mm topics
in this cycle that the topical separation of -tip broke down somewhat -
so the result is a more traditional architecture pull request,
collected into the 'x86/core' topic.
The topics were still maintained separately as far as possible, so
bisectability and conceptual separation should still be pretty good -
but there were a handful of merge points to avoid excessive
dependencies (and conflicts) that would have been poorly tested in the
end.
The next cycle will hopefully be much more quiet (or at least will
have fewer dependencies).
The main changes in this cycle were:
* x86/apic changes, with related IRQ core changes: (Jiang Liu, Thomas
Gleixner)
- This is the second and most intrusive part of changes to the x86
interrupt handling - full conversion to hierarchical interrupt
domains:
[IOAPIC domain] -----
|
[MSI domain] --------[Remapping domain] ----- [ Vector domain ]
| (optional) |
[HPET MSI domain] ----- |
|
[DMAR domain] -----------------------------
|
[Legacy domain] -----------------------------
This now reflects the actual hardware and allowed us to distangle
the domain specific code from the underlying parent domain, which
can be optional in the case of interrupt remapping. It's a clear
separation of functionality and removes quite some duct tape
constructs which plugged the remap code between ioapic/msi/hpet
and the vector management.
- Intel IOMMU IRQ remapping enhancements, to allow direct interrupt
injection into guests (Feng Wu)
* x86/asm changes:
- Tons of cleanups and small speedups, micro-optimizations. This
is in preparation to move a good chunk of the low level entry
code from assembly to C code (Denys Vlasenko, Andy Lutomirski,
Brian Gerst)
- Moved all system entry related code to a new home under
arch/x86/entry/ (Ingo Molnar)
- Removal of the fragile and ugly CFI dwarf debuginfo annotations.
Conversion to C will reintroduce many of them - but meanwhile
they are only getting in the way, and the upstream kernel does
not rely on them (Ingo Molnar)
- NOP handling refinements. (Borislav Petkov)
* x86/mm changes:
- Big PAT and MTRR rework: making the code more robust and
preparing to phase out exposing direct MTRR interfaces to drivers -
in favor of using PAT driven interfaces (Toshi Kani, Luis R
Rodriguez, Borislav Petkov)
- New ioremap_wt()/set_memory_wt() interfaces to support
Write-Through cached memory mappings. This is especially
important for good performance on NVDIMM hardware (Toshi Kani)
* x86/ras changes:
- Add support for deferred errors on AMD (Aravind Gopalakrishnan)
This is an important RAS feature which adds hardware support for
poisoned data. That means roughly that the hardware marks data
which it has detected as corrupted but wasn't able to correct, as
poisoned data and raises an APIC interrupt to signal that in the
form of a deferred error. It is the OS's responsibility then to
take proper recovery action and thus prolonge system lifetime as
far as possible.
- Add support for Intel "Local MCE"s: upcoming CPUs will support
CPU-local MCE interrupts, as opposed to the traditional system-
wide broadcasted MCE interrupts (Ashok Raj)
- Misc cleanups (Borislav Petkov)
* x86/platform changes:
- Intel Atom SoC updates
... and lots of other cleanups, fixlets and other changes - see the
shortlog and the Git log for details"
* 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (222 commits)
x86/hpet: Use proper hpet device number for MSI allocation
x86/hpet: Check for irq==0 when allocating hpet MSI interrupts
x86/mm/pat, drivers/infiniband/ipath: Use arch_phys_wc_add() and require PAT disabled
x86/mm/pat, drivers/media/ivtv: Use arch_phys_wc_add() and require PAT disabled
x86/platform/intel/baytrail: Add comments about why we disabled HPET on Baytrail
genirq: Prevent crash in irq_move_irq()
genirq: Enhance irq_data_to_desc() to support hierarchy irqdomain
iommu, x86: Properly handle posted interrupts for IOMMU hotplug
iommu, x86: Provide irq_remapping_cap() interface
iommu, x86: Setup Posted-Interrupts capability for Intel iommu
iommu, x86: Add cap_pi_support() to detect VT-d PI capability
iommu, x86: Avoid migrating VT-d posted interrupts
iommu, x86: Save the mode (posted or remapped) of an IRTE
iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip
iommu: dmar: Provide helper to copy shared irte fields
iommu: dmar: Extend struct irte for VT-d Posted-Interrupts
iommu: Add new member capability to struct irq_remap_ops
x86/asm/entry/64: Disentangle error_entry/exit gsbase/ebx/usermode code
x86/asm/entry/32: Shorten __audit_syscall_entry() args preparation
x86/asm/entry/32: Explain reloading of registers after __audit_syscall_entry()
...
Pull scheduler updates from Ingo Molnar:
"The main changes are:
- lockless wakeup support for futexes and IPC message queues
(Davidlohr Bueso, Peter Zijlstra)
- Replace spinlocks with atomics in thread_group_cputimer(), to
improve scalability (Jason Low)
- NUMA balancing improvements (Rik van Riel)
- SCHED_DEADLINE improvements (Wanpeng Li)
- clean up and reorganize preemption helpers (Frederic Weisbecker)
- decouple page fault disabling machinery from the preemption
counter, to improve debuggability and robustness (David
Hildenbrand)
- SCHED_DEADLINE documentation updates (Luca Abeni)
- topology CPU masks cleanups (Bartosz Golaszewski)
- /proc/sched_debug improvements (Srikar Dronamraju)"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (79 commits)
sched/deadline: Remove needless parameter in dl_runtime_exceeded()
sched: Remove superfluous resetting of the p->dl_throttled flag
sched/deadline: Drop duplicate init_sched_dl_class() declaration
sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target
sched/deadline: Make init_sched_dl_class() __init
sched/deadline: Optimize pull_dl_task()
sched/preempt: Add static_key() to preempt_notifiers
sched/preempt: Fix preempt notifiers documentation about hlist_del() within unsafe iteration
sched/stop_machine: Fix deadlock between multiple stop_two_cpus()
sched/debug: Add sum_sleep_runtime to /proc/<pid>/sched
sched/debug: Replace vruntime with wait_sum in /proc/sched_debug
sched/debug: Properly format runnable tasks in /proc/sched_debug
sched/numa: Only consider less busy nodes as numa balancing destinations
Revert 095bebf61a ("sched/numa: Do not move past the balance point if unbalanced")
sched/fair: Prevent throttling in early pick_next_task_fair()
preempt: Reorganize the notrace definitions a bit
preempt: Use preempt_schedule_context() as the official tracing preemption point
sched: Make preempt_schedule_context() function-tracing safe
x86: Remove cpu_sibling_mask() and cpu_core_mask()
x86: Replace cpu_**_mask() with topology_**_cpumask()
...
Currently, amd-xgbe driver has separate logic to determine device
coherency for DT vs. ACPI. This patch simplifies the code with
a call to device_dma_is_coherent().
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In order to read the HCA's cycle counter efficiently in
user space, we need to map the HCA's register.
This is done through mmap call.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When programming the start of a periodic output, the code wrongly places
the seconds value into the "low" register and the nanoseconds into the
"high" register. Even though this is backwards, it slipped through my
testing, because the re-arming code in the interrupt service routine is
correct, and the signal does appear starting with the second edge.
This patch fixes the issue by programming the registers correctly.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enhance cxgb4_t4_bar2_sge_qregs() and cxgb4_bar2_sge_qregs() to support T4
user mode mappings. Update all the current users as well.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
We do not check the return value of enic_dev_stats_dump(). If allocation
fails, we will hit NULL pointer reference.
Return only if memory allocation fails. For other failures, we return the
previously recorded values.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RGMII block is currently only powered on when using RGMII or
RGMII_NO_ID, which is not correct when using the GENET interface in MII
or Reverse MII modes. We always need to power on the RGMII interface for
this block to properly work, regardless of the MII mode in which we
operate.
Fixes: aa09677cba ("net: bcmgenet: add MDIO routines")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pci_dma_burst_advice() was added by e24c2d963a ("[PATCH] PCI: DMA
bursting advice") but apparently never used. Remove it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michal Simek <monstr@monstr.eu> # microblaze
CC: David S. Miller <davem@davemloft.net>
When the driver gets unregistered a call to netif_napi_del() was
missing, this all was also missing in the error paths of
b44_init_one().
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are several places in the driver (all in control paths) where
coherent dma memory is being allocated using either dma_alloc_coherent()
or the deprecated pci_alloc_consistent(). All these calls should be
changed to use dma_zalloc_coherent() to avoid uninitialized fields in
data structures backed by this memory.
Reported-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the Tx timer function runs in softirq context the driver needs
to call disable_irq_nosync instead of a disable_irq.
Reported-by: Josh Stone <jistone@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If SRIOV is enabled we need to be in VEB mode not VEPA mode at probe.
This fixes an NPAR bug when SRIOV is enabled in the BIOS.
Change-ID: Ibf006abafd9a0ca3698ec24848cd771cf345cbbc
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The patch fixes a bug in the default configuration which
prevented a software bridge loaded on the PF interface from
working correctly because broadcast packets are incorrectly
looped back.
Fix the general case, by loading the driver in VEPA mode Until a
VF or VMDq VSI is added. This way loopback on the Main VSI is
turned off until needed and can resolve the issue of unnecessary
reflection for users that do not have VF or VMDq VSIs setup.
The driver must now coordinate the loopback setting for the Flow
Director (FDIR) VSI to make sure it is in sync with the current
VEB or VEPA mode setting.
The user can still switch bridge modes from the bridge commands and
choose to be in VEPA mode with VF VSIs. Because of hardware
requirements, the call to switch to VEB mode when no VF/VMDqs are
present will be rejected.
NOTE: This patch uses BIT_ULL as that is preferred going forward,
a followup patch in the lower priority queue to net-next will fix
up the remaining 1 << usages.
Change-ID: Ib121ddb18fe4b3c4f52e9deda6fcbeb9105683d1
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The process_mad device function declares some parameters as "in". Make those
parameters const and adjust the call tree under process_mad in the various
drivers accordingly.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Pull networking fixes from David Miller:
1) Various VTI tunnel (mark handling, PMTU) bug fixes from Alexander
Duyck and Steffen Klassert.
2) Revert ethtool PHY query change, it wasn't correct. The PHY address
selected by the driver running the PHY to MAC connection decides
what PHY address GET ethtool operations return information from.
3) Fix handling of sequence number bits for encryption IV generation in
ESP driver, from Herbert Xu.
4) UDP can return -EAGAIN when we hit a bad checksum on receive, even
when there are other packets in the receive queue which is wrong.
Just respect the error returned from the generic socket recv
datagram helper. From Eric Dumazet.
5) Fix BNA driver firmware loading on big-endian systems, from Ivan
Vecera.
6) Fix regression in that we were inheriting the congestion control of
the listening socket for new connections, the intended behavior
always was to use the default in this case. From Neal Cardwell.
7) Fix NULL deref in brcmfmac driver, from Arend van Spriel.
8) OTP parsing fix in iwlwifi from Liad Kaufman.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
vti6: Add pmtu handling to vti6_xmit.
Revert "net: core: 'ethtool' issue with querying phy settings"
bnx2x: Move statistics implementation into semaphores
xen: netback: read hotplug script once at start of day.
xen: netback: fix printf format string warning
Revert "netfilter: ensure number of counters is >0 in do_replace()"
net: dsa: Properly propagate errors from dsa_switch_setup_one
tcp: fix child sockets to use system default congestion control if not set
udp: fix behavior of wrong checksums
sfc: free multiple Rx buffers when required
bna: fix soft lock-up during firmware initialization failure
bna: remove unreasonable iocpf timer start
bna: fix firmware loading on big-endian machines
bridge: fix br_multicast_query_expired() bug
via-rhine: Resigning as maintainer
brcmfmac: avoid null pointer access when brcmf_msgbuf_get_pktid() fails
mac80211: Fix mac80211.h docbook comments
iwlwifi: nvm: fix otp parsing in 8000 hw family
iwlwifi: pcie: fix tracking of cmd_in_flight
ip_vti/ip6_vti: Preserve skb->mark after rcv_cb call
...
Kalle Valo says:
====================
iwlwifi:
* fix OTP parsing 8260
* fix powersave handling for 8260
brcmfmac:
* fix null pointer crash
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit dff173de84 ("bnx2x: Fix statistics locking scheme") changed the
bnx2x locking around statistics state into using a mutex - but the lock
is being accessed via a timer which is forbidden.
[If compiled with CONFIG_DEBUG_MUTEXES, logs show a warning about
accessing the mutex in interrupt context]
This moves the implementation into using a semaphore [with size '1']
instead.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we come to tear things down in netback_remove() and generate the
uevent it is possible that the xenstore directory has already been
removed (details below).
In such cases netback_uevent() won't be able to read the hotplug
script and will write a xenstore error node.
A recent change to the hypervisor exposed this race such that we now
sometimes lose it (where apparently we didn't ever before).
Instead read the hotplug script configuration during setup and use it
for the lifetime of the backend device.
The apparently more obvious fix of moving the transition to
state=Closed in netback_remove() to after the uevent does not work
because it is possible that we are already in state=Closed (in
reaction to the guest having disconnected as it shutdown). Being
already in Closed means the toolstack is at liberty to start tearing
down the xenstore directories. In principal it might be possible to
arrange to unregister the device sooner (e.g on transition to Closing)
such that xenstore would still be there but this state machine is
fragile and prone to anger...
A modern Xen system only relies on the hotplug uevent for driver
domains, when the backend is in the same domain as the toolstack it
will run the necessary setup/teardown directly in the correct sequence
wrt xenstore changes.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’:
drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=]
(txreq.offset&~PAGE_MASK) + txreq.size);
^
PAGE_MASK's type can vary by arch, so a cast is needed.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
----
v2: Cast to unsigned long, since PAGE_MASK can vary by arch.
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There have been concerns that the function names gpiod_set_array() and
gpiod_get_array() might be confusing to users. One might expect
gpiod_get_array() to return array values, while it is actually the array
counterpart of gpiod_get(). To be consistent with the single descriptor API
we could rename gpiod_set_array() to gpiod_set_array_value(). This makes
some function names a bit lengthy: gpiod_set_raw_array_value_cansleep().
Signed-off-by: Rojhalat Ibrahim <imr@rtschenk.de>
Acked-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
When Rx packet data must be dropped, all the buffers
associated with that Rx packet must be freed. Extend
and rename efx_free_rx_buffer() to efx_free_rx_buffers()
and loop through all the fragments.
By doing so this patch fixes a possible memory leak.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug in the driver initialization causes soft-lockup if firmware
initialization timeout is reached. Polling function bfa_ioc_poll_fwinit()
incorrectly calls bfa_nw_iocpf_timeout() when the timeout is reached.
The problem is that bfa_nw_iocpf_timeout() calls again
bfa_ioc_poll_fwinit()... etc. The bfa_ioc_poll_fwinit() should directly
send timeout event for iocpf and the same should be done if firmware
download into HW fails.
Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Driver starts iocpf timer prior bnad_ioceth_enable() call and this is
unreasonable. This piece of code probably originates from Brocade/Qlogic
out-of-box driver during initial import into upstream. This driver uses
only one timer and queue to implement multiple timers and this timer is
started at this place. The upstream driver uses multiple timers instead
of this.
Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Firmware required by bna is stored in appropriate files as sequence
of LE32 integers. After loading by request_firmware() they need to be
byte-swapped on big-endian arches. Without this conversion the NIC
is unusable on big-endian machines.
Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull fixes for cpumask and modules from Rusty Russell:
"** NOW WITH TESTING! **
Two fixes which got lost in my recent distraction. One is a weird
cpumask function which needed to be rewritten, the other is a module
bug which is cc:stable"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
cpumask_set_cpu_local_first => cpumask_local_spread, lament
module: Call module notifier on failure after complete_formation()
The radio cfg DWORD was taken from the wrong place in the
8000 HW family, after a line in the code was wrongly changed
by mistake. This broke several 8260 devices.
Fixes: 5dd9c68a85 ("iwlwifi: drop support for early versions of 8000")
Signed-off-by: Liad Kaufman <liad.kaufman@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
The cmd_in_flight tracking was introduced to workaround faulty
power management hardware, by having the driver keep the NIC
awake as long as there are commands in flight. However, some of
the code handling this workaround was unconditionally executed,
which resulted with an inconsistent state where the driver assumed
that the NIC was awake although it wasn't.
Fix this by renaming 'cmd_in_flight' to 'cmd_hold_nic_awake' and
handling the NIC requested awake state only for hardwares for
which the workaround is needed.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
da91309e0a (cpumask: Utility function to set n'th cpu...) created a
genuinely weird function. I never saw it before, it went through DaveM.
(He only does this to make us other maintainers feel better about our own
mistakes.)
cpumask_set_cpu_local_first's purpose is say "I need to spread things
across N online cpus, choose the ones on this numa node first"; you call
it in a loop.
It can fail. One of the two callers ignores this, the other aborts and
fails the device open.
It can fail in two ways: allocating the off-stack cpumask, or through a
convoluted codepath which AFAICT can only occur if cpu_online_mask
changes. Which shouldn't happen, because if cpu_online_mask can change
while you call this, it could return a now-offline cpu anyway.
It contains a nonsensical test "!cpumask_of_node(numa_node)". This was
drawn to my attention by Geert, who said this causes a warning on Sparc.
It sets a single bit in a cpumask instead of returning a cpu number,
because that's what the callers want.
It could be made more efficient by passing the previous cpu rather than
an index, but that would be more invasive to the callers.
Fixes: da91309e0a
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (then rebased)
Tested-by: Amir Vadai <amirv@mellanox.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Acked-by: David S. Miller <davem@davemloft.net>
xennet_remove() freed the queues before freeing the netdevice which
results in a use-after-free when free_netdev() tries to delete the
napi instances that have already been freed.
Fix this by fully destroy the queues (which includes deleting the napi
instances) before freeing the netdevice.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The test in mlx4_load_one() to remove MLX4_FLAG_MSI_X expects mlx4_NOP() to
fail with -EBUSY. It is also necessary to avoid the reset since the device
is not fully reinitialized before calling mlx4_start_hca() a second time.
Note that this will also affect mlx4_test_interrupts(), the only other user
of MLX4_CMD_NOP.
Fixes: f5aef5a ("net/mlx4_core: Activate reset flow upon fatal command cases")
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit e9ce7cb6b1 ("xen-netback: Factor queue-specific data into queue
struct") introduced a regression when moving queue-specific data into
the queue struct by failing to set the credit_bytes field. This
prevented bandwidth limiting from working. Initialize the field as it
was done before multiqueue support was added.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If read() syscall requests unexpected number of bytes from "dimm" binary
attribute file, return EINVAL instead of EPERM.
At the same time pin down sysfs file size to the fixed
sizeof(struct netxen_dimm_cfg), which allows to exploit some missing
sanity checks from kernfs (file boundary checks vs offset etc.)
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the ethtool command is used to set the speed of the device while
the device is down, the check to set the initial mode may fail when
the device is brought up, causing failure to bring the device up.
Update the code to set the initial mode based on the desired speed if
auto-negotiation is disabled.
This patch fixes a bug introduced by:
d9663c8c21 ("amd-xgbe-phy: Use phydev advertising field vs supported")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A pair of nested spin locks was introduced in commit 63502b8d0
"dp83640: Fix receive timestamp race condition".
Unfortunately the 'flags' parameter was reused for the inner lock,
clobbering the originally saved IRQ state. This patch fixes the issue
by changing the inner lock to plain spin_lock without irqsave.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Callers of the ext_write function are supposed to hold a mutex that
protects the state of the dialed page, but one caller was missing the
lock from the very start, and over time the code has been changed
without following the rule. This patch cleans up the call sites in
violation of the rule.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the calibration function that corrects the initial offsets
among multiple devices only works the first time. If the function is
called more than once, the calibration fails and bogus offsets will be
programmed into the devices.
In a well hidden spot, the device documentation tells that trigger indexes
0 and 1 are special in allowing the TRIG_IF_LATE flag to actually work.
This patch fixes the issue by using one of the special triggers during the
recalibration method.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>