muen/linux.git
2 months agomm/ksm.c: don't WARN if page is still mapped in remove_stable_node()
Andrey Ryabinin [Fri, 22 Nov 2019 01:54:01 +0000 (17:54 -0800)]
mm/ksm.c: don't WARN if page is still mapped in remove_stable_node()

It's possible to hit the WARN_ON_ONCE(page_mapped(page)) in
remove_stable_node() when it races with __mmput() and squeezes in
between ksm_exit() and exit_mmap().

  WARNING: CPU: 0 PID: 3295 at mm/ksm.c:888 remove_stable_node+0x10c/0x150

  Call Trace:
   remove_all_stable_nodes+0x12b/0x330
   run_store+0x4ef/0x7b0
   kernfs_fop_write+0x200/0x420
   vfs_write+0x154/0x450
   ksys_write+0xf9/0x1d0
   do_syscall_64+0x99/0x510
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

Remove the warning as there is nothing scary going on.

Link: http://lkml.kernel.org/r/20191119131850.5675-1-aryabinin@virtuozzo.com
Fixes: cbf86cfe04a6 ("ksm: remove old stable nodes more thoroughly")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agomm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()
David Hildenbrand [Fri, 22 Nov 2019 01:53:56 +0000 (17:53 -0800)]
mm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()

Let's limit shrinking to !ZONE_DEVICE so we can fix the current code.
We should never try to touch the memmap of offline sections where we
could have uninitialized memmaps and could trigger BUGs when calling
page_to_nid() on poisoned pages.

There is no reliable way to distinguish an uninitialized memmap from an
initialized memmap that belongs to ZONE_DEVICE, as we don't have
anything like SECTION_IS_ONLINE we can use similar to
pfn_to_online_section() for !ZONE_DEVICE memory.

E.g., set_zone_contiguous() similarly relies on pfn_to_online_section()
and will therefore never set a ZONE_DEVICE zone consecutive.  Stopping
to shrink the ZONE_DEVICE therefore results in no observable changes,
besides /proc/zoneinfo indicating different boundaries - something we
can totally live with.

Before commit d0dc12e86b31 ("mm/memory_hotplug: optimize memory
hotplug"), the memmap was initialized with 0 and the node with the right
value.  So the zone might be wrong but not garbage.  After that commit,
both the zone and the node will be garbage when touching uninitialized
memmaps.

Toshiki reported a BUG (race between delayed initialization of
ZONE_DEVICE memmaps without holding the memory hotplug lock and
concurrent zone shrinking).

  https://lkml.org/lkml/2019/11/14/1040

"Iteration of create and destroy namespace causes the panic as below:

      kernel BUG at mm/page_alloc.c:535!
      CPU: 7 PID: 2766 Comm: ndctl Not tainted 5.4.0-rc4 #6
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:set_pfnblock_flags_mask+0x95/0xf0
      Call Trace:
       memmap_init_zone_device+0x165/0x17c
       memremap_pages+0x4c1/0x540
       devm_memremap_pages+0x1d/0x60
       pmem_attach_disk+0x16b/0x600 [nd_pmem]
       nvdimm_bus_probe+0x69/0x1c0
       really_probe+0x1c2/0x3e0
       driver_probe_device+0xb4/0x100
       device_driver_attach+0x4f/0x60
       bind_store+0xc9/0x110
       kernfs_fop_write+0x116/0x190
       vfs_write+0xa5/0x1a0
       ksys_write+0x59/0xd0
       do_syscall_64+0x5b/0x180
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

  While creating a namespace and initializing memmap, if you destroy the
  namespace and shrink the zone, it will initialize the memmap outside
  the zone and trigger VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page),
  pfn), page) in set_pfnblock_flags_mask()."

This BUG is also mitigated by this commit, where we for now stop to
shrink the ZONE_DEVICE zone until we can do it in a safe and clean way.

Link: http://lkml.kernel.org/r/20191006085646.5768-5-david@redhat.com
Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b319]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: Toshiki Fukasawa <t-fukasawa@vx.jp.nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Damian Tometzki <damian.tometzki@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jun Yao <yaojun8558363@gmail.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Pankaj Gupta <pagupta@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Rich Felker <dalias@libc.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Yu Zhao <yuzhao@google.com>
Cc: <stable@vger.kernel.org> [4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agoRevert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"
Joseph Qi [Fri, 22 Nov 2019 01:53:52 +0000 (17:53 -0800)]
Revert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"

This reverts commit 56e94ea132bb5c2c1d0b60a6aeb34dcb7d71a53d.

Commit 56e94ea132bb ("fs: ocfs2: fix possible null-pointer dereferences
in ocfs2_xa_prepare_entry()") introduces a regression that fail to
create directory with mount option user_xattr and acl.  Actually the
reported NULL pointer dereference case can be correctly handled by
loc->xl_ops->xlo_add_entry(), so revert it.

Link: http://lkml.kernel.org/r/1573624916-83825-1-git-send-email-joseph.qi@linux.alibaba.com
Fixes: 56e94ea132bb ("fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: Thomas Voegtle <tv@lio96.de>
Acked-by: Changwei Ge <gechangwei@live.cn>
Cc: Jia-Ju Bai <baijiaju1990@gmail.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Thu, 21 Nov 2019 20:15:24 +0000 (12:15 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fix from Will Deacon:
 "Ensure PAN is re-enabled following user fault in uaccess routines.

  After I thought we were done for 5.4, we had a report this week of a
  nasty issue that has been shown to leak data between different user
  address spaces thanks to corruption of entries in the TLB. In
  hindsight, we should have spotted this in review when the PAN code was
  merged back in v4.3, but hindsight is 20/20 and I'm trying not to beat
  myself up too much about it despite being fairly miserable.

  Anyway, the fix is "obvious" but the actual failure is more more
  subtle, and is described in the commit message. I've included a fairly
  mechanical follow-up patch here as well, which moves this checking out
  into the C wrappers which is what we do for {get,put}_user() already
  and allows us to remove these bloody assembly macros entirely. The
  patches have passed kernelci [1] [2] [3] and CKI [4] tests over night,
  as well as some targetted testing [5] for this particular issue.

  The first patch is tagged for stable and should be applied to 4.14,
  4.19 and 5.3. I have separate backports for 4.4 and 4.9, which I'll
  send out once this has landed in your tree (although the original
  patch applies cleanly, it won't build for those two trees).

  Thanks to Pavel Tatashin for reporting this and Mark Rutland for
  helping to diagnose the issue and review/test the solution"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: uaccess: Remove uaccess_*_not_uao asm macros
  arm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault

2 months agoMerge tag 'for-linus-20191121' of git://git.kernel.dk/linux-block
Linus Torvalds [Thu, 21 Nov 2019 20:04:50 +0000 (12:04 -0800)]
Merge tag 'for-linus-20191121' of git://git.kernel.dk/linux-block

Pull block fix from Jens Axboe:
 "Just a single fix for an issue in nbd introduced in this cycle"

* tag 'for-linus-20191121' of git://git.kernel.dk/linux-block:
  nbd:fix memory leak in nbd_get_socket()

2 months agoMerge tag 'gpio-v5.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux...
Linus Torvalds [Thu, 21 Nov 2019 20:01:30 +0000 (12:01 -0800)]
Merge tag 'gpio-v5.4-5' of git://git./linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "A last set of small fixes for GPIO, this cycle was quite busy.

   - Fix debounce delays on the MAX77620 GPIO expander

   - Use the correct unit for debounce times on the BD70528 GPIO expander

   - Get proper deps for parallel builds of the GPIO tools

   - Add a specific ACPI quirk for the Terra Pad 1061"

* tag 'gpio-v5.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpiolib: acpi: Add Terra Pad 1061 to the run_edge_events_on_boot_blacklist
  tools: gpio: Correctly add make dependencies for gpio_utils
  gpio: bd70528: Use correct unit for debounce times
  gpio: max77620: Fixup debounce delays

2 months agoMerge tag 'for-linus-2019-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 21 Nov 2019 19:51:49 +0000 (11:51 -0800)]
Merge tag 'for-linus-2019-11-21' of git://git./linux/kernel/git/brauner/linux

Pull pidfd fixlet from Christian Brauner:
 "This contains a simple fix for the pidfd poll method. In the original
  patchset pidfd_poll() was made to return an unsigned int. However, the
  poll method is defined to return a __poll_t. While the unsigned int is
  not a huge deal it's just nicer to return a __poll_t.

  I've decided to send it right before the 5.4 release mainly so that
  stable doesn't need to backport it to both 5.4 and 5.3"

* tag 'for-linus-2019-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  fork: fix pidfd_poll()'s return type

2 months agoarm64: uaccess: Remove uaccess_*_not_uao asm macros
Pavel Tatashin [Wed, 20 Nov 2019 17:07:40 +0000 (12:07 -0500)]
arm64: uaccess: Remove uaccess_*_not_uao asm macros

It is safer and simpler to drop the uaccess assembly macros in favour of
inline C functions. Although this bloats the Image size slightly, it
aligns our user copy routines with '{get,put}_user()' and generally
makes the code a lot easier to reason about.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
[will: tweaked commit message and changed temporary variable names]
Signed-off-by: Will Deacon <will@kernel.org>
2 months agoarm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault
Pavel Tatashin [Tue, 19 Nov 2019 22:10:06 +0000 (17:10 -0500)]
arm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault

A number of our uaccess routines ('__arch_clear_user()' and
'__arch_copy_{in,from,to}_user()') fail to re-enable PAN if they
encounter an unhandled fault whilst accessing userspace.

For CPUs implementing both hardware PAN and UAO, this bug has no effect
when both extensions are in use by the kernel.

For CPUs implementing hardware PAN but not UAO, this means that a kernel
using hardware PAN may execute portions of code with PAN inadvertently
disabled, opening us up to potential security vulnerabilities that rely
on userspace access from within the kernel which would usually be
prevented by this mechanism. In other words, parts of the kernel run the
same way as they would on a CPU without PAN implemented/emulated at all.

For CPUs not implementing hardware PAN and instead relying on software
emulation via 'CONFIG_ARM64_SW_TTBR0_PAN=y', the impact is unfortunately
much worse. Calling 'schedule()' with software PAN disabled means that
the next task will execute in the kernel using the page-table and ASID
of the previous process even after 'switch_mm()', since the actual
hardware switch is deferred until return to userspace. At this point, or
if there is a intermediate call to 'uaccess_enable()', the page-table
and ASID of the new process are installed. Sadly, due to the changes
introduced by KPTI, this is not an atomic operation and there is a very
small window (two instructions) where the CPU is configured with the
page-table of the old task and the ASID of the new task; a speculative
access in this state is disastrous because it would corrupt the TLB
entries for the new task with mappings from the previous address space.

As Pavel explains:

  | I was able to reproduce memory corruption problem on Broadcom's SoC
  | ARMv8-A like this:
  |
  | Enable software perf-events with PERF_SAMPLE_CALLCHAIN so userland's
  | stack is accessed and copied.
  |
  | The test program performed the following on every CPU and forking
  | many processes:
  |
  | unsigned long *map = mmap(NULL, PAGE_SIZE, PROT_READ|PROT_WRITE,
  |   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  | map[0] = getpid();
  | sched_yield();
  | if (map[0] != getpid()) {
  | fprintf(stderr, "Corruption detected!");
  | }
  | munmap(map, PAGE_SIZE);
  |
  | From time to time I was getting map[0] to contain pid for a
  | different process.

Ensure that PAN is re-enabled when returning after an unhandled user
fault from our uaccess routines.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Fixes: 338d4f49d6f7 ("arm64: kernel: Add support for Privileged Access Never")
Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
[will: rewrote commit message]
Signed-off-by: Will Deacon <will@kernel.org>
2 months agofork: fix pidfd_poll()'s return type
Luc Van Oostenryck [Wed, 20 Nov 2019 00:33:20 +0000 (01:33 +0100)]
fork: fix pidfd_poll()'s return type

pidfd_poll() is defined as returning 'unsigned int' but the
.poll method is declared as returning '__poll_t', a bitwise type.

Fix this by using the proper return type and using the EPOLL
constants instead of the POLL ones, as required for __poll_t.

Fixes: b53b0b9d9a61 ("pidfd: add polling support")
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: stable@vger.kernel.org # 5.3
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20191120003320.31138-1-luc.vanoostenryck@gmail.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2 months agoafs: Fix missing timeout reset
David Howells [Tue, 19 Nov 2019 21:00:36 +0000 (21:00 +0000)]
afs: Fix missing timeout reset

In afs_wait_for_call_to_complete(), rather than immediately aborting an
operation if a signal occurs, the code attempts to wait for it to
complete, using a schedule timeout of 2*RTT (or min 2 jiffies) and a
check that we're still receiving relevant packets from the server before
we consider aborting the call.  We may even ping the server to check on
the status of the call.

However, there's a missing timeout reset in the event that we do
actually get a packet to process, such that if we then get a couple of
short stalls, we then time out when progress is actually being made.

Fix this by resetting the timeout any time we get something to process.
If it's the failure of the call then the call state will get changed and
we'll exit the loop shortly thereafter.

A symptom of this is data fetches and stores failing with EINTR when
they really shouldn't.

Fixes: bc5e3a546d55 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agomdio_bus: Fix init if CONFIG_RESET_CONTROLLER=n
Geert Uytterhoeven [Tue, 19 Nov 2019 11:25:24 +0000 (12:25 +0100)]
mdio_bus: Fix init if CONFIG_RESET_CONTROLLER=n

Commit 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization
to constant") accidentally changed a check from -ENOTSUPP to -ENOSYS,
causing failures if reset controller support is not enabled.  E.g. on
r7s72100/rskrza1:

    sh-eth e8203000.ethernet: MDIO init failed: -524
    sh-eth: probe of e8203000.ethernet failed with error -524

Seen on r8a7740/armadillo, r7s72100/rskrza1, and r7s9210/rza2mevb.

Fixes: 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization to constant")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: YueHaibing <yuehaibing@huawei.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agonbd:fix memory leak in nbd_get_socket()
Sun Ke [Tue, 19 Nov 2019 06:09:11 +0000 (14:09 +0800)]
nbd:fix memory leak in nbd_get_socket()

Before returning NULL, put the sock first.

Cc: stable@vger.kernel.org
Fixes: cf1b2326b734 ("nbd: verify socket is supported during setup")
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Sun Ke <sunke32@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 months agoLinux 5.4-rc8 v5.4-rc8
Linus Torvalds [Sun, 17 Nov 2019 22:47:30 +0000 (14:47 -0800)]
Linux 5.4-rc8

2 months agoMerge tag 'iommu-fixes-v5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 17 Nov 2019 19:27:44 +0000 (11:27 -0800)]
Merge tag 'iommu-fixes-v5.4-rc7' of git://git./linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

 - Fix for Intel IOMMU to correct invalidation commands when in SVA
   mode.

 - Update MAINTAINERS entry for Intel IOMMU

* tag 'iommu-fixes-v5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Fix QI_DEV_IOTLB_PFSID and QI_DEV_EIOTLB_PFSID macros
  MAINTAINERS: Update for INTEL IOMMU (VT-d) entry

2 months agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 17 Nov 2019 16:30:38 +0000 (08:30 -0800)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull misc scheduler fixes from Ingo Molnar:

 - Fix potential deadlock under CONFIG_DEBUG_OBJECTS=y

 - PELT metrics update ordering fix

 - uclamp logic fix

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/uclamp: Fix incorrect condition
  sched/pelt: Fix update of blocked PELT ordering
  sched/core: Avoid spurious lock dependencies

2 months agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sun, 17 Nov 2019 16:15:41 +0000 (08:15 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "An I2C core fix to prevent a use-after-free in a rare error path,
  and an I2C ACPI addition to work around broken HW/firmware related
  to touchscreens"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: core: fix use after free in of_i2c_notify
  i2c: acpi: Force bus speed to 400KHz if a Silead touchscreen is present

2 months agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Sun, 17 Nov 2019 02:14:32 +0000 (18:14 -0800)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
 "This reverts a number of changes to the khwrng thread which feeds the
  kernel random number pool from hwrng drivers. They were trying to fix
  issues with suspend-and-resume but ended up causing regressions"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  Revert "hwrng: core - Freeze khwrng thread during suspend"

2 months agoRevert "hwrng: core - Freeze khwrng thread during suspend"
Herbert Xu [Sun, 17 Nov 2019 00:48:17 +0000 (08:48 +0800)]
Revert "hwrng: core - Freeze khwrng thread during suspend"

This reverts commit 03a3bb7ae631 ("hwrng: core - Freeze khwrng
thread during suspend"), ff296293b353 ("random: Support freezable
kthreads in add_hwgenerator_randomness()") and 59b569480dc8 ("random:
Use wait_event_freezable() in add_hwgenerator_randomness()").

These patches introduced regressions and we need more time to
get them ready for mainline.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2 months agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 17 Nov 2019 00:10:59 +0000 (16:10 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "Two fixes: disable unreliable HPET on Intel Coffe Lake platforms, and
  fix a lockdep splat in the resctrl code"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Fix potential lockdep warning
  x86/quirks: Disable HPET on Intel Coffe Lake platforms

2 months agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 17 Nov 2019 00:08:46 +0000 (16:08 -0800)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fix from Ingo Molnar:
 "Fix integer truncation bug in __do_adjtimex()"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ntp/y2038: Remove incorrect time_t truncation

2 months agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 16 Nov 2019 23:56:01 +0000 (15:56 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "Misc fixes: a handful of AUX event handling related fixes, a Sparse
  fix and two ABI fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix missing static inline on perf_cgroup_switch()
  perf/core: Consistently fail fork on allocation failures
  perf/aux: Disallow aux_output for kernel events
  perf/core: Reattach a misplaced comment
  perf/aux: Fix the aux_output group inheritance fix
  perf/core: Disallow uncore-cgroup events

2 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Sat, 16 Nov 2019 23:52:00 +0000 (15:52 -0800)]
Merge git://git./linux/kernel/git/netdev/net

Pull networking fixes from David Miller:

 1) Fix memory leak in xfrm_state code, from Steffen Klassert.

 2) Fix races between devlink reload operations and device
    setup/cleanup, from Jiri Pirko.

 3) Null deref in NFC code, from Stephan Gerhold.

 4) Refcount fixes in SMC, from Ursula Braun.

 5) Memory leak in slcan open error paths, from Jouni Hogander.

 6) Fix ETS bandwidth validation in hns3, from Yonglong Liu.

 7) Info leak on short USB request answers in ax88172a driver, from
    Oliver Neukum.

 8) Release mem region properly in ep93xx_eth, from Chuhong Yuan.

 9) PTP config timestamp flags validation, from Richard Cochran.

10) Dangling pointers after SKB data realloc in seg6, from Andrea Mayer.

11) Missing free_netdev() in gemini driver, from Chuhong Yuan.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (56 commits)
  ipmr: Fix skb headroom in ipmr_get_route().
  net: hns3: cleanup of stray struct hns3_link_mode_mapping
  net/smc: fix fastopen for non-blocking connect()
  rds: ib: update WR sizes when bringing up connection
  net: gemini: add missed free_netdev
  net: dsa: tag_8021q: Fix dsa_8021q_restore_pvid for an absent pvid
  seg6: fix skb transport_header after decap_and_validate()
  seg6: fix srh pointer in get_srh()
  net: stmmac: Use the correct style for SPDX License Identifier
  octeontx2-af: Use the correct style for SPDX License Identifier
  ptp: Extend the test program to check the external time stamp flags.
  mlx5: Reject requests to enable time stamping on both edges.
  igb: Reject requests that fail to enable time stamping on both edges.
  dp83640: Reject requests to enable time stamping on both edges.
  mv88e6xxx: Reject requests to enable time stamping on both edges.
  ptp: Introduce strict checking of external time stamp options.
  renesas: reject unsupported external timestamp flags
  mlx5: reject unsupported external timestamp flags
  igb: reject unsupported external timestamp flags
  dp83640: reject unsupported external timestamp flags
  ...

2 months agoipmr: Fix skb headroom in ipmr_get_route().
Guillaume Nault [Fri, 15 Nov 2019 17:29:52 +0000 (18:29 +0100)]
ipmr: Fix skb headroom in ipmr_get_route().

In route.c, inet_rtm_getroute_build_skb() creates an skb with no
headroom. This skb is then used by inet_rtm_getroute() which may pass
it to rt_fill_info() and, from there, to ipmr_get_route(). The later
might try to reuse this skb by cloning it and prepending an IPv4
header. But since the original skb has no headroom, skb_push() triggers
skb_under_panic():

skbuff: skb_under_panic: text:00000000ca46ad8a len:80 put:20 head:00000000cd28494e data:000000009366fd6b tail:0x3c end:0xec0 dev:veth0
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:108!
invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 6 PID: 587 Comm: ip Not tainted 5.4.0-rc6+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
RIP: 0010:skb_panic+0xbf/0xd0
Code: 41 a2 ff 8b 4b 70 4c 8b 4d d0 48 c7 c7 20 76 f5 8b 44 8b 45 bc 48 8b 55 c0 48 8b 75 c8 41 54 41 57 41 56 41 55 e8 75 dc 7a ff <0f> 0b 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
RSP: 0018:ffff888059ddf0b0 EFLAGS: 00010286
RAX: 0000000000000086 RBX: ffff888060a315c0 RCX: ffffffff8abe4822
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88806c9a79cc
RBP: ffff888059ddf118 R08: ffffed100d9361b1 R09: ffffed100d9361b0
R10: ffff88805c68aee3 R11: ffffed100d9361b1 R12: ffff88805d218000
R13: ffff88805c689fec R14: 000000000000003c R15: 0000000000000ec0
FS:  00007f6af184b700(0000) GS:ffff88806c980000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffc8204a000 CR3: 0000000057b40006 CR4: 0000000000360ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 skb_push+0x7e/0x80
 ipmr_get_route+0x459/0x6fa
 rt_fill_info+0x692/0x9f0
 inet_rtm_getroute+0xd26/0xf20
 rtnetlink_rcv_msg+0x45d/0x630
 netlink_rcv_skb+0x1a5/0x220
 rtnetlink_rcv+0x15/0x20
 netlink_unicast+0x305/0x3a0
 netlink_sendmsg+0x575/0x730
 sock_sendmsg+0xb5/0xc0
 ___sys_sendmsg+0x497/0x4f0
 __sys_sendmsg+0xcb/0x150
 __x64_sys_sendmsg+0x48/0x50
 do_syscall_64+0xd2/0xac0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Actually the original skb used to have enough headroom, but the
reserve_skb() call was lost with the introduction of
inet_rtm_getroute_build_skb() by commit 404eb77ea766 ("ipv4: support
sport, dport and ip_proto in RTM_GETROUTE").

We could reserve some headroom again in inet_rtm_getroute_build_skb(),
but this function shouldn't be responsible for handling the special
case of ipmr_get_route(). Let's handle that directly in
ipmr_get_route() by calling skb_realloc_headroom() instead of
skb_clone().

Fixes: 404eb77ea766 ("ipv4: support sport, dport and ip_proto in RTM_GETROUTE")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agonet: hns3: cleanup of stray struct hns3_link_mode_mapping
Salil Mehta [Fri, 15 Nov 2019 11:52:32 +0000 (11:52 +0000)]
net: hns3: cleanup of stray struct hns3_link_mode_mapping

This patch cleans-up the stray left over code. It has no
functionality impact.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agonet/smc: fix fastopen for non-blocking connect()
Ursula Braun [Fri, 15 Nov 2019 11:39:30 +0000 (12:39 +0100)]
net/smc: fix fastopen for non-blocking connect()

FASTOPEN does not work with SMC-sockets. Since SMC allows fallback to
TCP native during connection start, the FASTOPEN setsockopts trigger
this fallback, if the SMC-socket is still in state SMC_INIT.
But if a FASTOPEN setsockopt is called after a non-blocking connect(),
this is broken, and fallback does not make sense.
This change complements
commit cd2063604ea6 ("net/smc: avoid fallback in case of non-blocking connect")
and fixes the syzbot reported problem "WARNING in smc_unhash_sk".

Reported-by: syzbot+8488cc4cf1c9e09b8b86@syzkaller.appspotmail.com
Fixes: e1bbdd570474 ("net/smc: reduce sock_put() for fallback sockets")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agords: ib: update WR sizes when bringing up connection
Dag Moxnes [Fri, 15 Nov 2019 08:56:01 +0000 (09:56 +0100)]
rds: ib: update WR sizes when bringing up connection

Currently WR sizes are updated from rds_ib_sysctl_max_send_wr and
rds_ib_sysctl_max_recv_wr when a connection is shut down. As a result,
a connection being down while rds_ib_sysctl_max_send_wr or
rds_ib_sysctl_max_recv_wr are updated, will not update the sizes when
it comes back up.

Move resizing of WRs to rds_ib_setup_qp so that connections will be setup
with the most current WR sizes.

Signed-off-by: Dag Moxnes <dag.moxnes@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agonet: gemini: add missed free_netdev
Chuhong Yuan [Fri, 15 Nov 2019 06:24:54 +0000 (14:24 +0800)]
net: gemini: add missed free_netdev

This driver forgets to free allocated netdev in remove like
what is done in probe failure.
Add the free to fix it.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agonet: dsa: tag_8021q: Fix dsa_8021q_restore_pvid for an absent pvid
Vladimir Oltean [Sat, 16 Nov 2019 16:08:25 +0000 (18:08 +0200)]
net: dsa: tag_8021q: Fix dsa_8021q_restore_pvid for an absent pvid

This sequence of operations:
ip link set dev br0 type bridge vlan_filtering 1
bridge vlan del dev swp2 vid 1
ip link set dev br0 type bridge vlan_filtering 1
ip link set dev br0 type bridge vlan_filtering 0

apparently fails with the message:

[   31.305716] sja1105 spi0.1: Reset switch and programmed static config. Reason: VLAN filtering
[   31.322161] sja1105 spi0.1: Couldn't determine PVID attributes (pvid 0)
[   31.328939] sja1105 spi0.1: Failed to setup VLAN tagging for port 1: -2
[   31.335599] ------------[ cut here ]------------
[   31.340215] WARNING: CPU: 1 PID: 194 at net/switchdev/switchdev.c:157 switchdev_port_attr_set_now+0x9c/0xa4
[   31.349981] br0: Commit of attribute (id=6) failed.
[   31.354890] Modules linked in:
[   31.357942] CPU: 1 PID: 194 Comm: ip Not tainted 5.4.0-rc6-01792-gf4f632e07665-dirty #2062
[   31.366167] Hardware name: Freescale LS1021A
[   31.370437] [<c03144dc>] (unwind_backtrace) from [<c030e184>] (show_stack+0x10/0x14)
[   31.378153] [<c030e184>] (show_stack) from [<c11d1c1c>] (dump_stack+0xe0/0x10c)
[   31.385437] [<c11d1c1c>] (dump_stack) from [<c034c730>] (__warn+0xf4/0x10c)
[   31.392373] [<c034c730>] (__warn) from [<c034c7bc>] (warn_slowpath_fmt+0x74/0xb8)
[   31.399827] [<c034c7bc>] (warn_slowpath_fmt) from [<c11ca204>] (switchdev_port_attr_set_now+0x9c/0xa4)
[   31.409097] [<c11ca204>] (switchdev_port_attr_set_now) from [<c117036c>] (__br_vlan_filter_toggle+0x6c/0x118)
[   31.418971] [<c117036c>] (__br_vlan_filter_toggle) from [<c115d010>] (br_changelink+0xf8/0x518)
[   31.427637] [<c115d010>] (br_changelink) from [<c0f8e9ec>] (__rtnl_newlink+0x3f4/0x76c)
[   31.435613] [<c0f8e9ec>] (__rtnl_newlink) from [<c0f8eda8>] (rtnl_newlink+0x44/0x60)
[   31.443329] [<c0f8eda8>] (rtnl_newlink) from [<c0f89f20>] (rtnetlink_rcv_msg+0x2cc/0x51c)
[   31.451477] [<c0f89f20>] (rtnetlink_rcv_msg) from [<c1008df8>] (netlink_rcv_skb+0xb8/0x110)
[   31.459796] [<c1008df8>] (netlink_rcv_skb) from [<c1008648>] (netlink_unicast+0x17c/0x1f8)
[   31.468026] [<c1008648>] (netlink_unicast) from [<c1008980>] (netlink_sendmsg+0x2bc/0x3b4)
[   31.476261] [<c1008980>] (netlink_sendmsg) from [<c0f43858>] (___sys_sendmsg+0x230/0x250)
[   31.484408] [<c0f43858>] (___sys_sendmsg) from [<c0f44c84>] (__sys_sendmsg+0x50/0x8c)
[   31.492209] [<c0f44c84>] (__sys_sendmsg) from [<c0301000>] (ret_fast_syscall+0x0/0x28)
[   31.500090] Exception stack(0xedf47fa8 to 0xedf47ff0)
[   31.505122] 7fa0:                   00000002 b6f2e060 00000003 beabd6a4 00000000 00000000
[   31.513265] 7fc0: 00000002 b6f2e060 5d6e3213 00000128 00000000 00000001 00000006 000619c4
[   31.521405] 7fe0: 00086078 beabd658 0005edbc b6e7ce68

The reason is the implementation of br_get_pvid:

static inline u16 br_get_pvid(const struct net_bridge_vlan_group *vg)
{
if (!vg)
return 0;

smp_rmb();
return vg->pvid;
}

Since VID 0 is an invalid pvid from the bridge's point of view, let's
add this check in dsa_8021q_restore_pvid to avoid restoring a pvid that
doesn't really exist.

Fixes: 5f33183b7fdf ("net: dsa: tag_8021q: Restore bridge VLANs when enabling vlan_filtering")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoMerge branch 'seg6-fixes-to-Segment-Routing-in-IPv6'
David S. Miller [Sat, 16 Nov 2019 20:18:32 +0000 (12:18 -0800)]
Merge branch 'seg6-fixes-to-Segment-Routing-in-IPv6'

Andrea Mayer says:

====================
seg6: fixes to Segment Routing in IPv6

This patchset is divided in 2 patches and it introduces some fixes
to Segment Routing in IPv6, which are:

- in function get_srh() fix the srh pointer after calling
  pskb_may_pull();

- fix the skb->transport_header after calling decap_and_validate()
  function;

Any comments on the patchset are welcome.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoseg6: fix skb transport_header after decap_and_validate()
Andrea Mayer [Sat, 16 Nov 2019 15:05:53 +0000 (16:05 +0100)]
seg6: fix skb transport_header after decap_and_validate()

in the receive path (more precisely in ip6_rcv_core()) the
skb->transport_header is set to skb->network_header + sizeof(*hdr). As a
consequence, after routing operations, destination input expects to find
skb->transport_header correctly set to the next protocol (or extension
header) that follows the network protocol. However, decap behaviors (DX*,
DT*) remove the outer IPv6 and SRH extension and do not set again the
skb->transport_header pointer correctly. For this reason, the patch sets
the skb->transport_header to the skb->network_header + sizeof(hdr) in each
DX* and DT* behavior.

Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoseg6: fix srh pointer in get_srh()
Andrea Mayer [Sat, 16 Nov 2019 15:05:52 +0000 (16:05 +0100)]
seg6: fix srh pointer in get_srh()

pskb_may_pull may change pointers in header. For this reason, it is
mandatory to reload any pointer that points into skb header.

Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agonet: stmmac: Use the correct style for SPDX License Identifier
Nishad Kamdar [Sat, 16 Nov 2019 09:40:59 +0000 (15:10 +0530)]
net: stmmac: Use the correct style for SPDX License Identifier

This patch corrects the SPDX License Identifier style in
header files related to STMicroelectronics based Multi-Gigabit
Ethernet driver. For C header files Documentation/process/license-rules.rst
mandates C-like comments (opposed to C source files where
C++ style should be used).

Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoocteontx2-af: Use the correct style for SPDX License Identifier
Nishad Kamdar [Sat, 16 Nov 2019 09:20:45 +0000 (14:50 +0530)]
octeontx2-af: Use the correct style for SPDX License Identifier

This patch corrects the SPDX License Identifier style in
header files related to Marvell OcteonTX2 network devices.
It uses an expilict block comment for the SPDX License
Identifier.

Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 16 Nov 2019 16:20:43 +0000 (08:20 -0800)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "11 fixes"

MM fixes and one xz decompressor fix.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/debug.c: PageAnon() is true for PageKsm() pages
  mm/debug.c: __dump_page() prints an extra line
  mm/page_io.c: do not free shared swap slots
  mm/memory_hotplug: fix try_offline_node()
  mm,thp: recheck each page before collapsing file THP
  mm: slub: really fix slab walking for init_on_free
  mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup()
  mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm()
  lib/xz: fix XZ_DYNALLOC to avoid useless memory reallocations
  mm: fix trying to reclaim unevictable lru page when calling madvise_pageout
  mm: mempolicy: fix the wrong return value and potential pages leak of mbind

2 months agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Sat, 16 Nov 2019 02:37:20 +0000 (18:37 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull more input fixes from Dmitry Torokhov:
 "A couple of fixes in driver teardown paths and another ID for
  Synaptics RMI mode"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation
  Input: synaptics-rmi4 - destroy F54 poller workqueue when removing
  Input: ff-memless - kill timer in destroy()

2 months agomm/debug.c: PageAnon() is true for PageKsm() pages
Ralph Campbell [Sat, 16 Nov 2019 01:35:07 +0000 (17:35 -0800)]
mm/debug.c: PageAnon() is true for PageKsm() pages

PageAnon() and PageKsm() use the low two bits of the page->mapping
pointer to indicate the page type.  PageAnon() only checks the LSB while
PageKsm() checks the least significant 2 bits are equal to 3.

Therefore, PageAnon() is true for KSM pages.  __dump_page() incorrectly
will never print "ksm" because it checks PageAnon() first.  Fix this by
checking PageKsm() first.

Link: http://lkml.kernel.org/r/20191113000651.20677-1-rcampbell@nvidia.com
Fixes: 1c6fb1d89e73 ("mm: print more information about mapping in __dump_page")
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agomm/debug.c: __dump_page() prints an extra line
Ralph Campbell [Sat, 16 Nov 2019 01:35:04 +0000 (17:35 -0800)]
mm/debug.c: __dump_page() prints an extra line

When dumping struct page information, __dump_page() prints the page type
with a trailing blank followed by the page flags on a separate line:

  anon
  flags: 0x100000000090034(uptodate|lru|active|head|swapbacked)

It looks like the intent was to use pr_cont() for printing "flags:" but
pr_cont() usage is discouraged so fix this by extending the format to
include the flags into a single line:

  anon flags: 0x100000000090034(uptodate|lru|active|head|swapbacked)

If the page is file backed, the name might be long so use two lines:

  shmem_aops name:"dev/zero"
  flags: 0x10000000008000c(uptodate|dirty|swapbacked)

Eliminate pr_conf() usage as well for appending compound_mapcount.

Link: http://lkml.kernel.org/r/20191112012608.16926-1-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agomm/page_io.c: do not free shared swap slots
Vinayak Menon [Sat, 16 Nov 2019 01:35:00 +0000 (17:35 -0800)]
mm/page_io.c: do not free shared swap slots

The following race is observed due to which a processes faulting on a
swap entry, finds the page neither in swapcache nor swap.  This causes
zram to give a zero filled page that gets mapped to the process,
resulting in a user space crash later.

Consider parent and child processes Pa and Pb sharing the same swap slot
with swap_count 2.  Swap is on zram with SWP_SYNCHRONOUS_IO set.
Virtual address 'VA' of Pa and Pb points to the shared swap entry.

Pa                                       Pb

fault on VA                              fault on VA
do_swap_page                             do_swap_page
lookup_swap_cache fails                  lookup_swap_cache fails
                                         Pb scheduled out
swapin_readahead (deletes zram entry)
swap_free (makes swap_count 1)
                                         Pb scheduled in
                                         swap_readpage (swap_count == 1)
                                         Takes SWP_SYNCHRONOUS_IO path
                                         zram enrty absent
                                         zram gives a zero filled page

Fix this by making sure that swap slot is freed only when swap count
drops down to one.

Link: http://lkml.kernel.org/r/1571743294-14285-1-git-send-email-vinmenon@codeaurora.org
Fixes: aa8d22a11da9 ("mm: swap: SWP_SYNCHRONOUS_IO: skip swapcache only if swapped page has no other reference")
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Suggested-by: Minchan Kim <minchan@google.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agomm/memory_hotplug: fix try_offline_node()
David Hildenbrand [Sat, 16 Nov 2019 01:34:57 +0000 (17:34 -0800)]
mm/memory_hotplug: fix try_offline_node()

try_offline_node() is pretty much broken right now:

 - The node span is updated when onlining memory, not when adding it. We
   ignore memory that was mever onlined. Bad.

 - We touch possible garbage memmaps. The pfn_to_nid(pfn) can easily
   trigger a kernel panic. Bad for memory that is offline but also bad
   for subsection hotadd with ZONE_DEVICE, whereby the memmap of the
   first PFN of a section might contain garbage.

 - Sections belonging to mixed nodes are not properly considered.

As memory blocks might belong to multiple nodes, we would have to walk
all pageblocks (or at least subsections) within present sections.
However, we don't have a way to identify whether a memmap that is not
online was initialized (relevant for ZONE_DEVICE).  This makes things
more complicated.

Luckily, we can piggy pack on the node span and the nid stored in memory
blocks.  Currently, the node span is grown when calling
move_pfn_range_to_zone() - e.g., when onlining memory, and shrunk when
removing memory, before calling try_offline_node().  Sysfs links are
created via link_mem_sections(), e.g., during boot or when adding
memory.

If the node still spans memory or if any memory block belongs to the
nid, we don't set the node offline.  As memory blocks that span multiple
nodes cannot get offlined, the nid stored in memory blocks is reliable
enough (for such online memory blocks, the node still spans the memory).

Introduce for_each_memory_block() to efficiently walk all memory blocks.

Note: We will soon stop shrinking the ZONE_DEVICE zone and the node span
when removing ZONE_DEVICE memory to fix similar issues (access of
garbage memmaps) - until we have a reliable way to identify whether
these memmaps were properly initialized.  This implies later, that once
a node had ZONE_DEVICE memory, we won't be able to set a node offline -
which should be acceptable.

Since commit f1dd2cd13c4b ("mm, memory_hotplug: do not associate
hotadded memory to zones until online") memory that is added is not
assoziated with a zone/node (memmap not initialized).  The introducing
commit 60a5a19e7419 ("memory-hotplug: remove sysfs file of node")
already missed that we could have multiple nodes for a section and that
the zone/node span is updated when onlining pages, not when adding them.

I tested this by hotplugging two DIMMs to a memory-less and cpu-less
NUMA node.  The node is properly onlined when adding the DIMMs.  When
removing the DIMMs, the node is properly offlined.

Masayoshi Mizuma reported:

: Without this patch, memory hotplug fails as panic:
:
:  BUG: kernel NULL pointer dereference, address: 0000000000000000
:  ...
:  Call Trace:
:   remove_memory_block_devices+0x81/0xc0
:   try_remove_memory+0xb4/0x130
:   __remove_memory+0xa/0x20
:   acpi_memory_device_remove+0x84/0x100
:   acpi_bus_trim+0x57/0x90
:   acpi_bus_trim+0x2e/0x90
:   acpi_device_hotplug+0x2b2/0x4d0
:   acpi_hotplug_work_fn+0x1a/0x30
:   process_one_work+0x171/0x380
:   worker_thread+0x49/0x3f0
:   kthread+0xf8/0x130
:   ret_from_fork+0x35/0x40

[david@redhat.com: v3]
Link: http://lkml.kernel.org/r/20191102120221.7553-1-david@redhat.com
Link: http://lkml.kernel.org/r/20191028105458.28320-1-david@redhat.com
Fixes: 60a5a19e7419 ("memory-hotplug: remove sysfs file of node")
Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") # visiable after d0dc12e86b319
Signed-off-by: David Hildenbrand <david@redhat.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Nayna Jain <nayna@linux.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agomm,thp: recheck each page before collapsing file THP
Song Liu [Sat, 16 Nov 2019 01:34:53 +0000 (17:34 -0800)]
mm,thp: recheck each page before collapsing file THP

In collapse_file(), for !is_shmem case, current check cannot guarantee
the locked page is up-to-date.  Specifically, xas_unlock_irq() should
not be called before lock_page() and get_page(); and it is necessary to
recheck PageUptodate() after locking the page.

With this bug and CONFIG_READ_ONLY_THP_FOR_FS=y, madvise(HUGE)'ed .text
may contain corrupted data.  This is because khugepaged mistakenly
collapses some not up-to-date sub pages into a huge page, and assumes
the huge page is up-to-date.  This will NOT corrupt data in the disk,
because the page is read-only and never written back.  Fix this by
properly checking PageUptodate() after locking the page.  This check
replaces "VM_BUG_ON_PAGE(!PageUptodate(page), page);".

Also, move PageDirty() check after locking the page.  Current khugepaged
should not try to collapse dirty file THP, because it is limited to
read-only .text.  The only case we hit a dirty page here is when the
page hasn't been written since write.  Bail out and retry when this
happens.

syzbot reported bug on previous version of this patch.

Link: http://lkml.kernel.org/r/20191106060930.2571389-2-songliubraving@fb.com
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Song Liu <songliubraving@fb.com>
Reported-by: syzbot+efb9e48b9fbdc49bb34a@syzkaller.appspotmail.com
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agomm: slub: really fix slab walking for init_on_free
Laura Abbott [Sat, 16 Nov 2019 01:34:50 +0000 (17:34 -0800)]
mm: slub: really fix slab walking for init_on_free

Commit 1b7e816fc80e ("mm: slub: Fix slab walking for init_on_free")
fixed one problem with the slab walking but missed a key detail: When
walking the list, the head and tail pointers need to be updated since we
end up reversing the list as a result.  Without doing this, bulk free is
broken.

One way this is exposed is a NULL pointer with slub_debug=F:

  =============================================================================
  BUG skbuff_head_cache (Tainted: G                T): Object already free
  -----------------------------------------------------------------------------

  INFO: Slab 0x000000000d2d2f8f objects=16 used=3 fp=0x0000000064309071 flags=0x3fff00000000201
  BUG: kernel NULL pointer dereference, address: 0000000000000000
  Oops: 0000 [#1] PREEMPT SMP PTI
  RIP: 0010:print_trailer+0x70/0x1d5
  Call Trace:
   <IRQ>
   free_debug_processing.cold.37+0xc9/0x149
   __slab_free+0x22a/0x3d0
   kmem_cache_free_bulk+0x415/0x420
   __kfree_skb_flush+0x30/0x40
   net_rx_action+0x2dd/0x480
   __do_softirq+0xf0/0x246
   irq_exit+0x93/0xb0
   do_IRQ+0xa0/0x110
   common_interrupt+0xf/0xf
   </IRQ>

Given we're now almost identical to the existing debugging code which
correctly walks the list, combine with that.

Link: https://lkml.kernel.org/r/20191104170303.GA50361@gandi.net
Link: http://lkml.kernel.org/r/20191106222208.26815-1-labbott@redhat.com
Fixes: 1b7e816fc80e ("mm: slub: Fix slab walking for init_on_free")
Signed-off-by: Laura Abbott <labbott@redhat.com>
Reported-by: Thibaut Sautereau <thibaut.sautereau@clip-os.org>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <clipos@ssi.gouv.fr>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agomm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup()
Roman Gushchin [Sat, 16 Nov 2019 01:34:46 +0000 (17:34 -0800)]
mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup()

An exiting task might belong to an offline cgroup.  In this case an
attempt to grab a cgroup reference from the task can end up with an
infinite loop in hugetlb_cgroup_charge_cgroup(), because neither the
cgroup will become online, neither the task will be migrated to a live
cgroup.

Fix this by switching over to css_tryget().  As css_tryget_online()
can't guarantee that the cgroup won't go offline, in most cases the
check doesn't make sense.  In this particular case users of
hugetlb_cgroup_charge_cgroup() are not affected by this change.

A similar problem is described by commit 18fa84a2db0e ("cgroup: Use
css_tryget() instead of css_tryget_online() in task_get_css()").

Link: http://lkml.kernel.org/r/20191106225131.3543616-2-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agomm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm()
Roman Gushchin [Sat, 16 Nov 2019 01:34:43 +0000 (17:34 -0800)]
mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm()

We've encountered a rcu stall in get_mem_cgroup_from_mm():

  rcu: INFO: rcu_sched self-detected stall on CPU
  rcu: 33-....: (21000 ticks this GP) idle=6c6/1/0x4000000000000002 softirq=35441/35441 fqs=5017
  (t=21031 jiffies g=324821 q=95837) NMI backtrace for cpu 33
  <...>
  RIP: 0010:get_mem_cgroup_from_mm+0x2f/0x90
  <...>
   __memcg_kmem_charge+0x55/0x140
   __alloc_pages_nodemask+0x267/0x320
   pipe_write+0x1ad/0x400
   new_sync_write+0x127/0x1c0
   __kernel_write+0x4f/0xf0
   dump_emit+0x91/0xc0
   writenote+0xa0/0xc0
   elf_core_dump+0x11af/0x1430
   do_coredump+0xc65/0xee0
   get_signal+0x132/0x7c0
   do_signal+0x36/0x640
   exit_to_usermode_loop+0x61/0xd0
   do_syscall_64+0xd4/0x100
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

The problem is caused by an exiting task which is associated with an
offline memcg.  We're iterating over and over in the do {} while
(!css_tryget_online()) loop, but obviously the memcg won't become online
and the exiting task won't be migrated to a live memcg.

Let's fix it by switching from css_tryget_online() to css_tryget().

As css_tryget_online() cannot guarantee that the memcg won't go offline,
the check is usually useless, except some rare cases when for example it
determines if something should be presented to a user.

A similar problem is described by commit 18fa84a2db0e ("cgroup: Use
css_tryget() instead of css_tryget_online() in task_get_css()").

Johannes:

: The bug aside, it doesn't matter whether the cgroup is online for the
: callers.  It used to matter when offlining needed to evacuate all charges
: from the memcg, and so needed to prevent new ones from showing up, but we
: don't care now.

Link: http://lkml.kernel.org/r/20191106225131.3543616-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Shakeel Butt <shakeeb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutn <mkoutny@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agolib/xz: fix XZ_DYNALLOC to avoid useless memory reallocations
Lasse Collin [Sat, 16 Nov 2019 01:34:39 +0000 (17:34 -0800)]
lib/xz: fix XZ_DYNALLOC to avoid useless memory reallocations

s->dict.allocated was initialized to 0 but never set after a successful
allocation, thus the code always thought that the dictionary buffer has
to be reallocated.

Link: http://lkml.kernel.org/r/20191104185107.3b6330df@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reported-by: Yu Sun <yusun2@cisco.com>
Acked-by: Daniel Walker <danielwa@cisco.com>
Cc: "Yixia Si (yisi)" <yisi@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agomm: fix trying to reclaim unevictable lru page when calling madvise_pageout
zhong jiang [Sat, 16 Nov 2019 01:34:36 +0000 (17:34 -0800)]
mm: fix trying to reclaim unevictable lru page when calling madvise_pageout

Recently, I hit the following issue when running upstream.

  kernel BUG at mm/vmscan.c:1521!
  invalid opcode: 0000 [#1] SMP KASAN PTI
  CPU: 0 PID: 23385 Comm: syz-executor.6 Not tainted 5.4.0-rc4+ #1
  RIP: 0010:shrink_page_list+0x12b6/0x3530 mm/vmscan.c:1521
  Call Trace:
   reclaim_pages+0x499/0x800 mm/vmscan.c:2188
   madvise_cold_or_pageout_pte_range+0x58a/0x710 mm/madvise.c:453
   walk_pmd_range mm/pagewalk.c:53 [inline]
   walk_pud_range mm/pagewalk.c:112 [inline]
   walk_p4d_range mm/pagewalk.c:139 [inline]
   walk_pgd_range mm/pagewalk.c:166 [inline]
   __walk_page_range+0x45a/0xc20 mm/pagewalk.c:261
   walk_page_range+0x179/0x310 mm/pagewalk.c:349
   madvise_pageout_page_range mm/madvise.c:506 [inline]
   madvise_pageout+0x1f0/0x330 mm/madvise.c:542
   madvise_vma mm/madvise.c:931 [inline]
   __do_sys_madvise+0x7d2/0x1600 mm/madvise.c:1113
   do_syscall_64+0x9f/0x4c0 arch/x86/entry/common.c:290
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

madvise_pageout() accesses the specified range of the vma and isolates
them, then runs shrink_page_list() to reclaim its memory.  But it also
isolates the unevictable pages to reclaim.  Hence, we can catch the
cases in shrink_page_list().

The root cause is that we scan the page tables instead of specific LRU
list.  and so we need to filter out the unevictable lru pages from our
end.

Link: http://lkml.kernel.org/r/1572616245-18946-1-git-send-email-zhongjiang@huawei.com
Fixes: 1a4e58cce84e ("mm: introduce MADV_PAGEOUT")
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agomm: mempolicy: fix the wrong return value and potential pages leak of mbind
Yang Shi [Sat, 16 Nov 2019 01:34:33 +0000 (17:34 -0800)]
mm: mempolicy: fix the wrong return value and potential pages leak of mbind

Commit d883544515aa ("mm: mempolicy: make the behavior consistent when
MPOL_MF_MOVE* and MPOL_MF_STRICT were specified") fixed the return value
of mbind() for a couple of corner cases.  But, it altered the errno for
some other cases, for example, mbind() should return -EFAULT when part
or all of the memory range specified by nodemask and maxnode points
outside your accessible address space, or there was an unmapped hole in
the specified memory range specified by addr and len.

Fix this by preserving the errno returned by queue_pages_range().  And,
the pagelist may be not empty even though queue_pages_range() returns
error, put the pages back to LRU since mbind_range() is not called to
really apply the policy so those pages should not be migrated, this is
also the old behavior before the problematic commit.

Link: http://lkml.kernel.org/r/1572454731-3925-1-git-send-email-yang.shi@linux.alibaba.com
Fixes: d883544515aa ("mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified")
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Reported-by: Li Xinhai <lixinhai.lxh@gmail.com>
Reviewed-by: Li Xinhai <lixinhai.lxh@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org> [4.19 and 5.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agoInput: synaptics - enable RMI mode for X1 Extreme 2nd Generation
Lyude Paul [Fri, 15 Nov 2019 22:57:13 +0000 (14:57 -0800)]
Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation

Just got one of these for debugging some unrelated issues, and noticed
that Lenovo seems to have gone back to using RMI4 over smbus with
Synaptics touchpads on some of their new systems, particularly this one.
So, let's enable RMI mode for the X1 Extreme 2nd Generation.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://lore.kernel.org/r/20191115221814.31903-1-lyude@redhat.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2 months agoMerge tag 'for-linus-20191115' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 15 Nov 2019 21:02:34 +0000 (13:02 -0800)]
Merge tag 'for-linus-20191115' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A few fixes that should make it into this release. This contains:

   - io_uring:
        - The timeout command assumes sequence == 0 means that we want
          one completion, but this kind of overloading is unfortunate as
          it prevents users from doing a pure time based wait. Since
          this operation was introduced in this cycle, let's correct it
          now, while we can. (me)
        - One-liner to fix an issue with dependent links and fixed
          buffer reads. The actual IO completed fine, but the link got
          severed since we stored the wrong expected value. (me)
        - Add TIMEOUT to list of opcodes that don't need a file. (Pavel)

   - rsxx missing workqueue destry calls. Old bug. (Chuhong)

   - Fix blk-iocost active list check (Jiufei)

   - Fix impossible-to-hit overflow merge condition, that still hit some
     folks very rarely (Junichi)

   - Fix bfq hang issue from 5.3. This didn't get marked for stable, but
     will go into stable post this merge (Paolo)"

* tag 'for-linus-20191115' of git://git.kernel.dk/linux-block:
  rsxx: add missed destroy_workqueue calls in remove
  iocost: check active_list of all the ancestors in iocg_activate()
  block, bfq: deschedule empty bfq_queues not referred by any process
  io_uring: ensure registered buffer import returns the IO length
  io_uring: Fix getting file for timeout
  block: check bi_size overflow before merge
  io_uring: make timeout sequence == 0 mean no sequence

2 months agoi2c: core: fix use after free in of_i2c_notify
Wen Yang [Fri, 8 Nov 2019 08:36:48 +0000 (16:36 +0800)]
i2c: core: fix use after free in of_i2c_notify

We can't use "adap->dev" after it has been freed.

Fixes: 5bf4fa7daea6 ("i2c: break out OF support into separate file")
Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2 months agoi2c: acpi: Force bus speed to 400KHz if a Silead touchscreen is present
Hans de Goede [Wed, 13 Nov 2019 18:29:38 +0000 (19:29 +0100)]
i2c: acpi: Force bus speed to 400KHz if a Silead touchscreen is present

Many cheap devices use Silead touchscreen controllers. Testing has shown
repeatedly that these touchscreen controllers work fine at 400KHz, but for
unknown reasons do not work properly at 100KHz. This has been seen on
both ARM and x86 devices using totally different i2c controllers.

On some devices the ACPI tables list another device at the same I2C-bus
as only being capable of 100KHz, testing has shown that these other
devices work fine at 400KHz (as can be expected of any recent I2C hw).

This commit makes i2c_acpi_find_bus_speed() always return 400KHz if a
Silead touchscreen controller is present, fixing the touchscreen not
working on devices which ACPI tables' wrongly list another device on the
same bus as only being capable of 100KHz.

Specifically this fixes the touchscreen on the Jumper EZpad 6 m4 not
working.

Reported-by: youling 257 <youling257@gmail.com>
Tested-by: youling 257 <youling257@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[wsa: rewording warning a little]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
2 months agoMerge branch 'ptp-Validate-the-ancillary-ioctl-flags-more-carefully'
David S. Miller [Fri, 15 Nov 2019 20:48:33 +0000 (12:48 -0800)]
Merge branch 'ptp-Validate-the-ancillary-ioctl-flags-more-carefully'

Richard Cochran says:

====================
ptp: Validate the ancillary ioctl flags more carefully.

The flags passed to the ioctls for periodic output signals and
time stamping of external signals were never checked, and thus formed
a useless ABI inadvertently.  More recently, a version 2 of the ioctls
was introduced in order make the flags meaningful.  This series
tightens up the checks on the new ioctl flags.

- Patch 1 ensures at least one edge flag is set for the new ioctl.
- Patches 2-7 are Jacob's recent checks, picking up the tags.
- Patch 8 introduces a "strict" flag for passing to the drivers when the
  new ioctl is used.
- Patches 9-12 implement the "strict" checking in the drivers.
- Patch 13 extends the test program to exercise combinations of flags.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoptp: Extend the test program to check the external time stamp flags.
Richard Cochran [Thu, 14 Nov 2019 18:45:07 +0000 (10:45 -0800)]
ptp: Extend the test program to check the external time stamp flags.

Because each driver and hardware has different capabilities, the test
cannot provide a simple pass/fail result, but it can at least show what
combinations of flags are supported.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agomlx5: Reject requests to enable time stamping on both edges.
Richard Cochran [Thu, 14 Nov 2019 18:45:06 +0000 (10:45 -0800)]
mlx5: Reject requests to enable time stamping on both edges.

This driver enables rising edge or falling edge, but not both, and so
this patch validates that the request contains only one of the two
edges.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoigb: Reject requests that fail to enable time stamping on both edges.
Richard Cochran [Thu, 14 Nov 2019 18:45:05 +0000 (10:45 -0800)]
igb: Reject requests that fail to enable time stamping on both edges.

This hardware always time stamps rising and falling edges, and so this
patch validates that the request does contains both edges.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agodp83640: Reject requests to enable time stamping on both edges.
Richard Cochran [Thu, 14 Nov 2019 18:45:04 +0000 (10:45 -0800)]
dp83640: Reject requests to enable time stamping on both edges.

This driver enables rising edge or falling edge, but not both, and so
this patch validates that the request contains only one of the two
edges.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agomv88e6xxx: Reject requests to enable time stamping on both edges.
Richard Cochran [Thu, 14 Nov 2019 18:45:03 +0000 (10:45 -0800)]
mv88e6xxx: Reject requests to enable time stamping on both edges.

This driver enables rising edge or falling edge, but not both, and so
this patch validates that the request contains only one of the two
edges.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoptp: Introduce strict checking of external time stamp options.
Richard Cochran [Thu, 14 Nov 2019 18:45:02 +0000 (10:45 -0800)]
ptp: Introduce strict checking of external time stamp options.

User space may request time stamps on rising edges, falling edges, or
both.  However, the particular mode may or may not be supported in the
hardware or in the driver.  This patch adds a "strict" flag that tells
drivers to ensure that the requested mode will be honored.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agorenesas: reject unsupported external timestamp flags
Jacob Keller [Thu, 14 Nov 2019 18:45:01 +0000 (10:45 -0800)]
renesas: reject unsupported external timestamp flags

Fix the renesas PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.

In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.

Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agomlx5: reject unsupported external timestamp flags
Jacob Keller [Thu, 14 Nov 2019 18:45:00 +0000 (10:45 -0800)]
mlx5: reject unsupported external timestamp flags

Fix the mlx5 core PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.

In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.

[ RC: I'm not 100% sure what this driver does, but if I'm not wrong it
      follows the dp83640:

  flags                                                 Meaning
  ----------------------------------------------------  --------------------------
  PTP_ENABLE_FEATURE                                    Time stamp rising edge
  PTP_ENABLE_FEATURE|PTP_RISING_EDGE                    Time stamp rising edge
  PTP_ENABLE_FEATURE|PTP_FALLING_EDGE                   Time stamp falling edge
  PTP_ENABLE_FEATURE|PTP_RISING_EDGE|PTP_FALLING_EDGE   Time stamp falling edge
]

Cc: Feras Daoud <ferasda@mellanox.com>
Cc: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoigb: reject unsupported external timestamp flags
Jacob Keller [Thu, 14 Nov 2019 18:44:59 +0000 (10:44 -0800)]
igb: reject unsupported external timestamp flags

Fix the igb PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.

In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.

This HW always time stamps both edges:

  flags                                                 Meaning
  ----------------------------------------------------  --------------------------
  PTP_ENABLE_FEATURE                                    Time stamp both edges
  PTP_ENABLE_FEATURE|PTP_RISING_EDGE                    Time stamp both edges
  PTP_ENABLE_FEATURE|PTP_FALLING_EDGE                   Time stamp both edges
  PTP_ENABLE_FEATURE|PTP_RISING_EDGE|PTP_FALLING_EDGE   Time stamp both edges

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agodp83640: reject unsupported external timestamp flags
Jacob Keller [Thu, 14 Nov 2019 18:44:58 +0000 (10:44 -0800)]
dp83640: reject unsupported external timestamp flags

Fix the dp83640 PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.

In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.

For the record, the semantics of this driver are:

  flags                                                 Meaning
  ----------------------------------------------------  --------------------------
  PTP_ENABLE_FEATURE                                    Time stamp rising edge
  PTP_ENABLE_FEATURE|PTP_RISING_EDGE                    Time stamp rising edge
  PTP_ENABLE_FEATURE|PTP_FALLING_EDGE                   Time stamp falling edge
  PTP_ENABLE_FEATURE|PTP_RISING_EDGE|PTP_FALLING_EDGE   Time stamp falling edge

Cc: Stefan Sørensen <stefan.sorensen@spectralink.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agomv88e6xxx: reject unsupported external timestamp flags
Jacob Keller [Thu, 14 Nov 2019 18:44:57 +0000 (10:44 -0800)]
mv88e6xxx: reject unsupported external timestamp flags

Fix the mv88e6xxx PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.

In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.

For the record, the semantics of this driver are:

  flags                                                 Meaning
  ----------------------------------------------------  --------------------------
  PTP_ENABLE_FEATURE                                    Time stamp falling edge
  PTP_ENABLE_FEATURE|PTP_RISING_EDGE                    Time stamp rising edge
  PTP_ENABLE_FEATURE|PTP_FALLING_EDGE                   Time stamp falling edge
  PTP_ENABLE_FEATURE|PTP_RISING_EDGE|PTP_FALLING_EDGE   Time stamp rising edge

Cc: Brandon Streiff <brandon.streiff@ni.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agonet: reject PTP periodic output requests with unsupported flags
Jacob Keller [Thu, 14 Nov 2019 18:44:56 +0000 (10:44 -0800)]
net: reject PTP periodic output requests with unsupported flags

Commit 823eb2a3c4c7 ("PTP: add support for one-shot output") introduced
a new flag for the PTP periodic output request ioctl. This flag is not
currently supported by any driver.

Fix all drivers which implement the periodic output request ioctl to
explicitly reject any request with flags they do not understand. This
ensures that the driver does not accidentally misinterpret the
PTP_PEROUT_ONE_SHOT flag, or any new flag introduced in the future.

This is important for forward compatibility: if a new flag is
introduced, the driver should reject requests to enable the flag until
the driver has actually been modified to support the flag in question.

Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Christopher Hall <christopher.s.hall@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoptp: Validate requests to enable time stamping of external signals.
Richard Cochran [Thu, 14 Nov 2019 18:44:55 +0000 (10:44 -0800)]
ptp: Validate requests to enable time stamping of external signals.

Commit 415606588c61 ("PTP: introduce new versions of IOCTLs")
introduced a new external time stamp ioctl that validates the flags.
This patch extends the validation to ensure that at least one rising
or falling edge flag is set when enabling external time stamps.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agonet: ep93xx_eth: fix mismatch of request_mem_region in remove
Chuhong Yuan [Thu, 14 Nov 2019 15:43:24 +0000 (23:43 +0800)]
net: ep93xx_eth: fix mismatch of request_mem_region in remove

The driver calls release_resource in remove to match request_mem_region
in probe, which is incorrect.
Fix it by using the right one, release_mem_region.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoax88172a: fix information leak on short answers
Oliver Neukum [Thu, 14 Nov 2019 10:16:01 +0000 (11:16 +0100)]
ax88172a: fix information leak on short answers

If a malicious device gives a short MAC it can elicit up to
5 bytes of leaked memory out of the driver. We need to check for
ETH_ALEN instead.

Reported-by: syzbot+a8d4acdad35e6bbca308@syzkaller.appspotmail.com
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoselftests: mlxsw: Adjust test to recent changes
Ido Schimmel [Thu, 14 Nov 2019 09:50:57 +0000 (11:50 +0200)]
selftests: mlxsw: Adjust test to recent changes

mlxsw does not support VXLAN devices with a physical device attached and
vetoes such configurations upon enslavement to an offloaded bridge.

Commit 0ce1822c2a08 ("vxlan: add adjacent link to limit depth level")
changed the VXLAN device to be an upper of the physical device which
causes mlxsw to veto the creation of the VXLAN device with "Unknown
upper device type".

This is OK as this configuration is not supported, but it prevents us
from testing bad flows involving the enslavement of VXLAN devices with a
physical device to a bridge, regardless if the physical device is an
mlxsw netdev or not.

Adjust the test to use a dummy device as a physical device instead of a
mlxsw netdev.

Fixes: 0ce1822c2a08 ("vxlan: add adjacent link to limit depth level")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoInput: synaptics-rmi4 - destroy F54 poller workqueue when removing
Chuhong Yuan [Fri, 15 Nov 2019 19:32:36 +0000 (11:32 -0800)]
Input: synaptics-rmi4 - destroy F54 poller workqueue when removing

The driver forgets to destroy workqueue in remove() similarly to what is
done when probe() fails. Add a call to destroy_workqueue() to fix it.

Since unregistration will wait for the work to finish, we do not need to
cancel/flush the work instance in remove().

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191114023405.31477-1-hslester96@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2 months agoInput: ff-memless - kill timer in destroy()
Oliver Neukum [Fri, 15 Nov 2019 19:35:05 +0000 (11:35 -0800)]
Input: ff-memless - kill timer in destroy()

No timer must be left running when the device goes away.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Reported-and-tested-by: syzbot+b6c55daa701fc389e286@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1573726121.17351.3.camel@suse.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2 months agoMerge tag 'ceph-for-5.4-rc8' of git://github.com/ceph/ceph-client
Linus Torvalds [Fri, 15 Nov 2019 18:30:24 +0000 (10:30 -0800)]
Merge tag 'ceph-for-5.4-rc8' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "Two fixes for the buffered reads and O_DIRECT writes serialization
  patch that went into -rc1 and a fixup for a bogus warning on older gcc
  versions"

* tag 'ceph-for-5.4-rc8' of git://github.com/ceph/ceph-client:
  rbd: silence bogus uninitialized warning in rbd_object_map_update_finish()
  ceph: increment/decrement dio counter on async requests
  ceph: take the inode lock before acquiring cap refs

2 months agoafs: Fix race in commit bulk status fetch
David Howells [Thu, 14 Nov 2019 18:41:03 +0000 (18:41 +0000)]
afs: Fix race in commit bulk status fetch

When a lookup is done, the afs filesystem will perform a bulk status-fetch
operation on the requested vnode (file) plus the next 49 other vnodes from
the directory list (in AFS, directory contents are downloaded as blobs and
parsed locally).  When the results are received, it will speculatively
populate the inode cache from the extra data.

However, if the lookup races with another lookup on the same directory, but
for a different file - one that's in the 49 extra fetches, then if the bulk
status-fetch operation finishes first, it will try and update the inode
from the other lookup.

If this other inode is still in the throes of being created, however, this
will cause an assertion failure in afs_apply_status():

BUG_ON(test_bit(AFS_VNODE_UNSET, &vnode->flags));

on or about fs/afs/inode.c:175 because it expects data to be there already
that it can compare to.

Fix this by skipping the update if the inode is being created as the
creator will presumably set up the inode with the same information.

Fixes: 39db9815da48 ("afs: Fix application of the results of a inline bulk status fetch")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 months agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 15 Nov 2019 17:14:23 +0000 (09:14 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fix from Will Deacon:
 "One trivial fix for -rc8/final that ensures that the script used to
  detect RELR relocation support in the toolchain works correctly when
  $CC contains quotes. Although it fails safely (by failing to detect
  the support when it exists), it would be nice to have this fixed in
  5.4 given that it was only introduced in the last merge window.

  Summary:

   - Handle CC variables containing quotes in tools-support-relr.sh
     script"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  scripts/tools-support-relr.sh: un-quote variables

2 months agoMerge tag 'mips_fixes_5.4_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips...
Linus Torvalds [Fri, 15 Nov 2019 17:10:13 +0000 (09:10 -0800)]
Merge tag 'mips_fixes_5.4_4' of git://git./linux/kernel/git/mips/linux

Pull MIPS fixes from Paul Burton:
 "A fix and simplification for SGI IP27 exception handlers, and a small
  MAINTAINERS update for Broadcom MIPS systems"

* tag 'mips_fixes_5.4_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MAINTAINERS: Remove Kevin as maintainer of BMIPS generic platforms
  MIPS: SGI-IP27: fix exception handler replication

2 months agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 15 Nov 2019 17:05:39 +0000 (09:05 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull more KVM fixes from Paolo Bonzini:

 - fixes for CONFIG_KVM_COMPAT=n

 - two updates to the IFU erratum

 - selftests build fix

 - brown paper bag fix

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Add a comment describing the /dev/kvm no_compat handling
  KVM: x86/mmu: Take slots_lock when using kvm_mmu_zap_all_fast()
  KVM: Forbid /dev/kvm being opened by a compat task when CONFIG_KVM_COMPAT=n
  KVM: X86: Reset the three MSR list number variables to 0 in kvm_init_msr_list()
  selftests: kvm: fix build with glibc >= 2.30
  kvm: x86: disable shattered huge page recovery for PREEMPT_RT.

2 months agoMerge tag 'mmc-v5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Fri, 15 Nov 2019 16:58:56 +0000 (08:58 -0800)]
Merge tag 'mmc-v5.4-rc7' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC fix from Ulf Hansson:
 "Don't overwrite quirk flags in sdhci-of-at91 host driver"

* tag 'mmc-v5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-of-at91: fix quirk2 overwrite

2 months agoMerge tag 'sound-5.4-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 15 Nov 2019 16:53:52 +0000 (08:53 -0800)]
Merge tag 'sound-5.4-rc8' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A few small last-minute fixes for USB-audio and HD-audio as well as
  for PCM core:

   - A race fix for PCM core between stopping and closing a stream

   - USB-audio regressions in the recent descriptor validation code and
     relevant changes

   - A read of uninitialized value in USB-audio spotted by fuzzer

   - A fix for USB-audio race at stopping a stream

   - Intel HD-audio platform fixes"

* tag 'sound-5.4-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: usb-audio: Fix incorrect size check for processing/extension units
  ALSA: usb-audio: Fix incorrect NULL check in create_yamaha_midi_quirk()
  ALSA: pcm: Fix stream lock usage in snd_pcm_period_elapsed()
  ALSA: usb-audio: not submit urb for stopped endpoint
  ALSA: hda: hdmi - fix pin setup on Tigerlake
  ALSA: hda: Add Cometlake-S PCI ID
  ALSA: usb-audio: Fix missing error check at mixer resolution test

2 months agoMerge tag 'drm-fixes-2019-11-15' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 15 Nov 2019 16:47:34 +0000 (08:47 -0800)]
Merge tag 'drm-fixes-2019-11-15' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Here is this weeks non-intel hw vuln fixes pull. Three drivers, all
  small fixes.

  i915:
   - MOCS table fixes for EHL and TGL
   - Update Display's rawclock on resume
   - GVT's dmabuf reference drop fix

  amdgpu:
   - Fix a potential crash in firmware parsing

  sun4i:
   - One fix to the dotclock dividers range for sun4i"

* tag 'drm-fixes-2019-11-15' of git://anongit.freedesktop.org/drm/drm:
  drm/amdgpu: fix null pointer deref in firmware header printing
  drm/i915/tgl: MOCS table update
  Revert "drm/i915/ehl: Update MOCS table for EHL"
  drm/sun4i: tcon: Set min division of TCON0_DCLK to 1.
  drm/i915: update rawclk also on resume
  drm/i915/gvt: fix dropping obj reference twice

2 months agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Fri, 15 Nov 2019 16:44:08 +0000 (08:44 -0800)]
Merge branch 'fixes' of git://git./linux/kernel/git/viro/vfs

Pull misc vfs fixes from Al Viro:
 "Assorted fixes all over the place; some of that is -stable fodder,
  some regressions from the last window"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ecryptfs_lookup_interpose(): lower_dentry->d_parent is not stable either
  ecryptfs_lookup_interpose(): lower_dentry->d_inode is not stable
  ecryptfs: fix unlink and rmdir in face of underlying fs modifications
  audit_get_nd(): don't unlock parent too early
  exportfs_decode_fh(): negative pinned may become positive without the parent locked
  cgroup: don't put ERR_PTR() into fc->root
  autofs: fix a leak in autofs_expire_indirect()
  aio: Fix io_pgetevents() struct __compat_aio_sigset layout
  fs/namespace.c: fix use-after-free of mount in mnt_warn_timestamp_expiry()

2 months agosched/uclamp: Fix incorrect condition
Qais Yousef [Thu, 14 Nov 2019 21:10:52 +0000 (21:10 +0000)]
sched/uclamp: Fix incorrect condition

uclamp_update_active() should perform the update when
p->uclamp[clamp_id].active is true. But when the logic was inverted in
[1], the if condition wasn't inverted correctly too.

[1] https://lore.kernel.org/lkml/20190902073836.GO2369@hirez.programming.kicks-ass.net/

Reported-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Patrick Bellasi <patrick.bellasi@matbug.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: babbe170e053 ("sched/uclamp: Update CPU's refcount on TG's clamp changes")
Link: https://lkml.kernel.org/r/20191114211052.15116-1-qais.yousef@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2 months agoKVM: Add a comment describing the /dev/kvm no_compat handling
Marc Zyngier [Thu, 14 Nov 2019 13:17:39 +0000 (13:17 +0000)]
KVM: Add a comment describing the /dev/kvm no_compat handling

Add a comment explaining the rational behind having both
no_compat open and ioctl callbacks to fend off compat tasks.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 months agoMerge branch 'hns3-fixes'
David S. Miller [Fri, 15 Nov 2019 02:06:34 +0000 (18:06 -0800)]
Merge branch 'hns3-fixes'

Huazhong Tan says:

====================
net: hns3: fixes for -net

This series includes misc fixes for the HNS3 ethernet driver.

[patch 1/3] adds a compatible handling for configuration of
MAC VLAN swithch parameter.

[patch 2/3] re-allocates SSU buffer when pfc_en changed.

[patch 3/3] fixes a bug for ETS bandwidth validation.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agonet: hns3: fix ETS bandwidth validation bug
Yonglong Liu [Thu, 14 Nov 2019 02:32:41 +0000 (10:32 +0800)]
net: hns3: fix ETS bandwidth validation bug

Some device only support 4 TCs, but the driver check the total
bandwidth of 8 TCs, so may cause wrong configurations write to
the hw.

This patch uses hdev->tc_max to instead HNAE3_MAX_TC to fix it.

Fixes: e432abfb99e5 ("net: hns3: add common validation in hclge_dcb")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agonet: hns3: reallocate SSU' buffer size when pfc_en changes
Yunsheng Lin [Thu, 14 Nov 2019 02:32:40 +0000 (10:32 +0800)]
net: hns3: reallocate SSU' buffer size when pfc_en changes

When a TC's PFC is disabled or enabled, the RX private buffer for
this TC need to be changed too, otherwise this may cause packet
dropped problem.

This patch fixes it by calling hclge_buffer_alloc to reallocate
buffer when pfc_en changes.

Fixes: cacde272dd00 ("net: hns3: Add hclge_dcb module for the support of DCB feature")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agonet: hns3: add compatible handling for MAC VLAN switch parameter configuration
Guangbin Huang [Thu, 14 Nov 2019 02:32:39 +0000 (10:32 +0800)]
net: hns3: add compatible handling for MAC VLAN switch parameter configuration

Previously, hns3 driver just directly send specific setting bit
and mask bits of MAC VLAN switch parameter to the firmware, it
can not be compatible with the old firmware, because the old one
ignores mask bits and covers all bits with new setting bits.
So when running with old firmware, the communication between PF
and VF will fail after resetting or configuring spoof check, since
they will do the MAC VLAN switch parameter configuration.

This patch fixes this problem by reading switch parameter firstly,
then just modifies the corresponding bit and sends it to firmware.

Fixes: dd2956eab104 ("net: hns3: not allow SSU loopback while execute ethtool -t dev")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoravb: implement MTU change while device is up
Ulrich Hecht [Thu, 14 Nov 2019 01:49:49 +0000 (02:49 +0100)]
ravb: implement MTU change while device is up

Pre-allocates buffers sufficient for the maximum supported MTU (2026) in
order to eliminate the possibility of resource exhaustion when changing the
MTU while the device is up.

Signed-off-by: Ulrich Hecht <uli+renesas@fpond.eu>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agotipc: add back tipc prefix to log messages
Matt Bennett [Wed, 13 Nov 2019 23:20:03 +0000 (12:20 +1300)]
tipc: add back tipc prefix to log messages

The tipc prefix for log messages generated by tipc was
removed in commit 07f6c4bc048a ("tipc: convert tipc reference
table to use generic rhashtable").

This is still a useful prefix so add it back.

Signed-off-by: Matt Bennett <matt.bennett@alliedtelesis.co.nz>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoMerge tag 'drm-fixes-5.4-2019-11-14' of git://people.freedesktop.org/~agd5f/linux...
Dave Airlie [Fri, 15 Nov 2019 00:38:34 +0000 (10:38 +1000)]
Merge tag 'drm-fixes-5.4-2019-11-14' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

drm-fixes-5.4-2019-11-14:

amdgpu:
- Fix a potential crash in firmware parsing

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114221354.3914-1-alexander.deucher@amd.com
2 months agoMerge tag 'drm-misc-fixes-2019-11-13' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 15 Nov 2019 00:38:11 +0000 (10:38 +1000)]
Merge tag 'drm-misc-fixes-2019-11-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

- One fix to the dotclock dividers range for sun4i

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113142645.GA967172@gilmour.lan
2 months agoMerge tag 'drm-intel-fixes-2019-11-13' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Fri, 15 Nov 2019 00:37:02 +0000 (10:37 +1000)]
Merge tag 'drm-intel-fixes-2019-11-13' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- MOCS table fixes for EHL and TGL
- Update Display's rawclock on resume
- GVT's dmabuf reference drop fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114055302.GA3564@intel.com
2 months agodrm/amdgpu: fix null pointer deref in firmware header printing
Xiaojie Yuan [Thu, 5 Sep 2019 08:50:22 +0000 (16:50 +0800)]
drm/amdgpu: fix null pointer deref in firmware header printing

v2: declare as (struct common_firmware_header *) type because
    struct xxx_firmware_header inherits from it

When CE's ucode_id(8) is used to get sdma_hdr, we will be accessing an
unallocated amdgpu_firmware_info instance.

This issue appears on rhel7.7 with gcc 4.8.5. Newer compilers might have
optimized out such 'defined but not referenced' variable.

[ 1120.798564] BUG: unable to handle kernel NULL pointer dereference at 000000000000000a
[ 1120.806703] IP: [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
[ 1120.813693] PGD 80000002603ff067 PUD 271b8d067 PMD 0
[ 1120.818931] Oops: 0000 [#1] SMP
[ 1120.822245] Modules linked in: amdgpu(OE+) amdkcl(OE) amd_iommu_v2 amdttm(OE) amd_sched(OE) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun bridge stp llc devlink ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dm_mirror dm_region_hash dm_log dm_mod intel_pmc_core intel_powerclamp coretemp intel_rapl joydev kvm_intel eeepc_wmi asus_wmi kvm sparse_keymap iTCO_wdt irqbypass rfkill crc32_pclmul snd_hda_codec_realtek mxm_wmi ghash_clmulni_intel intel_wmi_thunderbolt iTCO_vendor_support snd_hda_codec_generic snd_hda_codec_hdmi aesni_intel lrw gf128mul glue_helper ablk_helper sg cryptd pcspkr snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd pinctrl_sunrisepoint pinctrl_intel soundcore acpi_pad mei_me wmi mei i2c_i801 pcc_cpufreq ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic i915 i2c_algo_bit iosf_mbi drm_kms_helper e1000e syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm ptp libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw pps_core drm_panel_orientation_quirks video i2c_hid
[ 1120.954136] CPU: 4 PID: 2426 Comm: modprobe Tainted: G           OE  ------------   3.10.0-1062.el7.x86_64 #1
[ 1120.964390] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 1302 11/09/2015
[ 1120.973321] task: ffff991ef1e3c1c0 ti: ffff991ee625c000 task.ti: ffff991ee625c000
[ 1120.981020] RIP: 0010:[<ffffffffc0e3c9b3>]  [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
[ 1120.990483] RSP: 0018:ffff991ee625f950  EFLAGS: 00010202
[ 1120.995935] RAX: 0000000000000002 RBX: ffff991edf6b2d38 RCX: ffff991edf6a0000
[ 1121.003391] RDX: 0000000000000000 RSI: ffff991f01d13898 RDI: ffffffffc110afb3
[ 1121.010706] RBP: ffff991ee625f9b0 R08: 0000000000000000 R09: 0000000000000000
[ 1121.018029] R10: 00000000000004c4 R11: ffff991ee625f64e R12: ffff991edf6b3220
[ 1121.025353] R13: ffff991edf6a0000 R14: 0000000000000008 R15: ffff991edf6b2d30
[ 1121.032666] FS:  00007f97b0c0b740(0000) GS:ffff991f01d00000(0000) knlGS:0000000000000000
[ 1121.041000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1121.046880] CR2: 000000000000000a CR3: 000000025e604000 CR4: 00000000003607e0
[ 1121.054239] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1121.061631] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1121.068938] Call Trace:
[ 1121.071494]  [<ffffffffc0e3dba8>] psp_hw_init+0x218/0x270 [amdgpu]
[ 1121.077886]  [<ffffffffc0da3188>] amdgpu_device_fw_loading+0xe8/0x160 [amdgpu]
[ 1121.085296]  [<ffffffffc0e3b34c>] ? vega10_ih_irq_init+0x4bc/0x730 [amdgpu]
[ 1121.092534]  [<ffffffffc0da5c75>] amdgpu_device_init+0x1495/0x1c90 [amdgpu]
[ 1121.099675]  [<ffffffffc0da9cab>] amdgpu_driver_load_kms+0x8b/0x2f0 [amdgpu]
[ 1121.106888]  [<ffffffffc01b25cf>] drm_dev_register+0x12f/0x1d0 [drm]
[ 1121.113419]  [<ffffffffa4dcdfd8>] ? pci_enable_device_flags+0xe8/0x140
[ 1121.120183]  [<ffffffffc0da260a>] amdgpu_pci_probe+0xca/0x170 [amdgpu]
[ 1121.126919]  [<ffffffffa4dcf97a>] local_pci_probe+0x4a/0xb0
[ 1121.132622]  [<ffffffffa4dd10c9>] pci_device_probe+0x109/0x160
[ 1121.138607]  [<ffffffffa4eb4205>] driver_probe_device+0xc5/0x3e0
[ 1121.144766]  [<ffffffffa4eb4603>] __driver_attach+0x93/0xa0
[ 1121.150507]  [<ffffffffa4eb4570>] ? __device_attach+0x50/0x50
[ 1121.156422]  [<ffffffffa4eb1da5>] bus_for_each_dev+0x75/0xc0
[ 1121.162213]  [<ffffffffa4eb3b7e>] driver_attach+0x1e/0x20
[ 1121.167771]  [<ffffffffa4eb3620>] bus_add_driver+0x200/0x2d0
[ 1121.173590]  [<ffffffffa4eb4c94>] driver_register+0x64/0xf0
[ 1121.179345]  [<ffffffffa4dd0905>] __pci_register_driver+0xa5/0xc0
[ 1121.185593]  [<ffffffffc099f000>] ? 0xffffffffc099efff
[ 1121.190914]  [<ffffffffc099f0a4>] amdgpu_init+0xa4/0xb0 [amdgpu]
[ 1121.197101]  [<ffffffffa4a0210a>] do_one_initcall+0xba/0x240
[ 1121.202901]  [<ffffffffa4b1c90a>] load_module+0x271a/0x2bb0
[ 1121.208598]  [<ffffffffa4dad740>] ? ddebug_proc_write+0x100/0x100
[ 1121.214894]  [<ffffffffa4b1ce8f>] SyS_init_module+0xef/0x140
[ 1121.220698]  [<ffffffffa518bede>] system_call_fastpath+0x25/0x2a
[ 1121.226870] Code: b4 01 60 a2 00 00 31 c0 e8 83 60 33 e4 41 8b 47 08 48 8b 4d d0 48 c7 c7 b3 af 10 c1 48 69 c0 68 07 00 00 48 8b 84 01 60 a2 00 00 <48> 8b 70 08 31 c0 48 89 75 c8 e8 56 60 33 e4 48 8b 4d d0 48 c7
[ 1121.247422] RIP  [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
[ 1121.254432]  RSP <ffff991ee625f950>
[ 1121.258017] CR2: 000000000000000a
[ 1121.261427] ---[ end trace e98b35387ede75bd ]---

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Fixes: c5fb912653dae3f878 ("drm/amdgpu: add firmware header printing for psp fw loading (v2)")
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 months agorsxx: add missed destroy_workqueue calls in remove
Chuhong Yuan [Wed, 13 Nov 2019 06:38:47 +0000 (14:38 +0800)]
rsxx: add missed destroy_workqueue calls in remove

The driver misses calling destroy_workqueue in remove like what is done
when probe fails.
Add the missed calls to fix it.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 months agoiocost: check active_list of all the ancestors in iocg_activate()
Jiufei Xue [Wed, 13 Nov 2019 07:21:31 +0000 (15:21 +0800)]
iocost: check active_list of all the ancestors in iocg_activate()

There is a bug that checking the same active_list over and over again
in iocg_activate(). The intention of the code was checking whether all
the ancestors and self have already been activated. So fix it.

Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost")
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 months agorbd: silence bogus uninitialized warning in rbd_object_map_update_finish()
Ilya Dryomov [Wed, 13 Nov 2019 11:07:15 +0000 (12:07 +0100)]
rbd: silence bogus uninitialized warning in rbd_object_map_update_finish()

Some versions of gcc (so far 6.3 and 7.4) throw a warning:

  drivers/block/rbd.c: In function 'rbd_object_map_callback':
  drivers/block/rbd.c:2124:21: warning: 'current_state' may be used uninitialized in this function [-Wmaybe-uninitialized]
        (current_state == OBJECT_EXISTS && state == OBJECT_EXISTS_CLEAN))
  drivers/block/rbd.c:2092:23: note: 'current_state' was declared here
    u8 state, new_state, current_state;
                          ^~~~~~~~~~~~~

It's bogus because all current_state accesses are guarded by
has_current_state.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
2 months agoceph: increment/decrement dio counter on async requests
Jeff Layton [Wed, 13 Nov 2019 14:56:06 +0000 (09:56 -0500)]
ceph: increment/decrement dio counter on async requests

Ceph can in some cases issue an async DIO request, in which case we can
end up calling ceph_end_io_direct before the I/O is actually complete.
That may allow buffered operations to proceed while DIO requests are
still in flight.

Fix this by incrementing the i_dio_count when issuing an async DIO
request, and decrement it when tearing down the aio_req.

Fixes: 321fe13c9398 ("ceph: add buffered/direct exclusionary locking for reads and writes")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2 months agoceph: take the inode lock before acquiring cap refs
Jeff Layton [Wed, 13 Nov 2019 14:10:27 +0000 (09:10 -0500)]
ceph: take the inode lock before acquiring cap refs

Most of the time, we (or the vfs layer) takes the inode_lock and then
acquires caps, but ceph_read_iter does the opposite, and that can lead
to a deadlock.

When there are multiple clients treading over the same data, we can end
up in a situation where a reader takes caps and then tries to acquire
the inode_lock. Another task holds the inode_lock and issues a request
to the MDS which needs to revoke the caps, but that can't happen until
the inode_lock is unwedged.

Fix this by having ceph_read_iter take the inode_lock earlier, before
attempting to acquire caps.

Fixes: 321fe13c9398 ("ceph: add buffered/direct exclusionary locking for reads and writes")
Link: https://tracker.ceph.com/issues/36348
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2 months agoALSA: usb-audio: Fix incorrect size check for processing/extension units
Takashi Iwai [Thu, 14 Nov 2019 16:56:12 +0000 (17:56 +0100)]
ALSA: usb-audio: Fix incorrect size check for processing/extension units

The recently introduced unit descriptor validation had some bug for
processing and extension units, it counts a bControlSize byte twice so
it expected a bigger size than it should have been.  This seems
resulting in a probe error on a few devices.

Fix the calculation for proper checks of PU and EU.

Fixes: 57f8770620e9 ("ALSA: usb-audio: More validations of descriptor units")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20191114165613.7422-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2 months agoMerge tag 'kbuild-fixes-v5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 14 Nov 2019 16:48:10 +0000 (08:48 -0800)]
Merge tag 'kbuild-fixes-v5.4-3' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - fix build error when compiling SPARC VDSO with CONFIG_COMPAT=y

 - pass correct --arch option to Sparse

* tag 'kbuild-fixes-v5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: tell sparse about the $ARCH
  sparc: vdso: fix build error of vdso32

2 months agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Thu, 14 Nov 2019 16:37:48 +0000 (08:37 -0800)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma

Pull RDMA fixes from Jason Gunthorpe:
 "Bug fixes for old bugs in the hns and hfi1 drivers:

   - Calculate various values in hns properly to avoid over/underflows
     in some cases

   - Fix an oops, PCI negotiation on Gen4 systems, and bugs related to
     retries"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/hns: Correct the value of srq_desc_size
  RDMA/hns: Correct the value of HNS_ROCE_HEM_CHUNK_LEN
  IB/hfi1: TID RDMA WRITE should not return IB_WC_RNR_RETRY_EXC_ERR
  IB/hfi1: Calculate flow weight based on QP MTU for TID RDMA
  IB/hfi1: Ensure r_tid_ack is valid before building TID RDMA ACK packet
  IB/hfi1: Ensure full Gen3 speed in a Gen4 system

2 months agoKVM: x86/mmu: Take slots_lock when using kvm_mmu_zap_all_fast()
Sean Christopherson [Wed, 13 Nov 2019 19:30:32 +0000 (11:30 -0800)]
KVM: x86/mmu: Take slots_lock when using kvm_mmu_zap_all_fast()

Acquire the per-VM slots_lock when zapping all shadow pages as part of
toggling nx_huge_pages.  The fast zap algorithm relies on exclusivity
(via slots_lock) to identify obsolete vs. valid shadow pages, because it
uses a single bit for its generation number. Holding slots_lock also
obviates the need to acquire a read lock on the VM's srcu.

Failing to take slots_lock when toggling nx_huge_pages allows multiple
instances of kvm_mmu_zap_all_fast() to run concurrently, as the other
user, KVM_SET_USER_MEMORY_REGION, does not take the global kvm_lock.
(kvm_mmu_zap_all_fast() does take kvm->mmu_lock, but it can be
temporarily dropped by kvm_zap_obsolete_pages(), so it is not enough
to enforce exclusivity).

Concurrent fast zap instances causes obsolete shadow pages to be
incorrectly identified as valid due to the single bit generation number
wrapping, which results in stale shadow pages being left in KVM's MMU
and leads to all sorts of undesirable behavior.
The bug is easily confirmed by running with CONFIG_PROVE_LOCKING and
toggling nx_huge_pages via its module param.

Note, until commit 4ae5acbc4936 ("KVM: x86/mmu: Take slots_lock when
using kvm_mmu_zap_all_fast()", 2019-11-13) the fast zap algorithm used
an ulong-sized generation instead of relying on exclusivity for
correctness, but all callers except the recently added set_nx_huge_pages()
needed to hold slots_lock anyways.  Therefore, this patch does not have
to be backported to stable kernels.

Given that toggling nx_huge_pages is by no means a fast path, force it
to conform to the current approach instead of reintroducing the previous
generation count.

Fixes: b8e8c8303ff28 ("kvm: mmu: ITLB_MULTIHIT mitigation", but NOT FOR STABLE)
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>