muen/linux.git
3 years agonet: socionext: Reset tx queue in ndo_stop
Masahisa Kojima [Tue, 23 Oct 2018 11:24:28 +0000 (20:24 +0900)]
net: socionext: Reset tx queue in ndo_stop

We observed that packets and bytes count are not reset
when user performs interface down. Eventually, tx queue is
exhausted and packets will not be sent out.
To avoid this problem, resets tx queue in ndo_stop.

Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver")
Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org>
Signed-off-by: Yoshitoyo Osaki <osaki.yoshitoyo@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: socionext: Add dummy PHY register read in phy_write()
Masahisa Kojima [Tue, 23 Oct 2018 11:24:27 +0000 (20:24 +0900)]
net: socionext: Add dummy PHY register read in phy_write()

There is a compatibility issue between RTL8211E implemented
in Developerbox and netsec ethernet controller IP.

Our MDIO controller stops MDC clock right after the write
access, but RTL8211E expects MDC clock must be kept toggling
for several clock cycle with MDIO high before entering
the IDLE state. Without keeping clock after write access,
write access is not correctly handled and register is not
updated.

To meet this requirement, netsec driver needs to issue dummy
read(e.g. read PHYID1(offset 0x2) register) right after write
access, to keep MDC clock.

We think this compatibility issue is a problem specific to
our MDIO controller and RTL8211E.

Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver")
Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org>
Signed-off-by: Yoshitoyo Osaki <osaki.yoshitoyo@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: socionext: Stop PHY before resetting netsec
Masahisa Kojima [Tue, 23 Oct 2018 11:24:26 +0000 (20:24 +0900)]
net: socionext: Stop PHY before resetting netsec

In ndo_stop, driver resets the netsec ethernet controller IP.
When the netsec IP is reset, HW running mode turns to NRM mode
and driver has to wait until this mode transition completes.

But mode transition to NRM will not complete if the PHY is
in normal operation state. Netsec IP requires PHY is in
power down state when it is reset.

This modification stops the PHY before resetting netsec.

Together with this modification, phy_addr is stored in netsec_priv
structure because ndev->phydev is not yet ready in ndo_init.

Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver")
Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org>
Signed-off-by: Yoshitoyo Osaki <osaki.yoshitoyo@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: Set OWN bit for jumbo frames
Thor Thayer [Mon, 22 Oct 2018 22:22:26 +0000 (17:22 -0500)]
net: stmmac: Set OWN bit for jumbo frames

Ping with Jumbo packet does not reply and get a watchdog timeout

[   46.059616] ------------[ cut here ]------------
[   46.064268] NETDEV WATCHDOG: eth0 (socfpga-dwmac): transmit queue 0 timed out
[   46.071471] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:461 dev_watchdog+0x2cc/0x2d8
[   46.079708] Modules linked in:
[   46.082761] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.18.0-00115-gc262be665854-dirty #264
[   46.091082] Hardware name: SoCFPGA Stratix 10 SoCDK (DT)
[   46.096377] pstate: 20000005 (nzCv daif -PAN -UAO)
[   46.101152] pc : dev_watchdog+0x2cc/0x2d8
[   46.105149] lr : dev_watchdog+0x2cc/0x2d8
[   46.109144] sp : ffff00000800bd80
[   46.112447] x29: ffff00000800bd80 x28: ffff80007a9b4940
[   46.117744] x27: 00000000ffffffff x26: ffff80007aa183b0
[   46.123040] x25: 0000000000000001 x24: 0000000000000140
[   46.128336] x23: ffff80007aa1839c x22: ffff80007aa17fb0
[   46.133632] x21: ffff80007aa18000 x20: ffff0000091a7000
[   46.138927] x19: 0000000000000000 x18: ffffffffffffffff
[   46.144223] x17: 0000000000000000 x16: 0000000000000000
[   46.149519] x15: ffff0000091a96c8 x14: 07740775076f0720
[   46.154814] x13: 07640765076d0769 x12: 0774072007300720
[   46.160110] x11: 0765077507650775 x10: 0771072007740769
[   46.165406] x9 : 076d0773076e0761 x8 : 077207740720073a
[   46.170702] x7 : 072907630761076d x6 : ffff80007ff9a0c0
[   46.175997] x5 : ffff80007ff9a0c0 x4 : 0000000000000002
[   46.181293] x3 : 0000000000000000 x2 : ffff0000091ac180
[   46.186589] x1 : e6a742ebe628e800 x0 : 0000000000000000
[   46.191885] Call trace:
[   46.194326]  dev_watchdog+0x2cc/0x2d8
[   46.197980]  call_timer_fn+0x20/0x78
[   46.201544]  expire_timers+0xa4/0xb0
[   46.205108]  run_timer_softirq+0xe4/0x198
[   46.209107]  __do_softirq+0x114/0x210
[   46.212760]  irq_exit+0xd0/0xd8
[   46.215895]  __handle_domain_irq+0x60/0xb0
[   46.219977]  gic_handle_irq+0x58/0xa8
[   46.223628]  el1_irq+0xb0/0x128
[   46.226761]  arch_cpu_idle+0x10/0x18
[   46.230326]  do_idle+0x1d4/0x288
[   46.233544]  cpu_startup_entry+0x24/0x28
[   46.237457]  secondary_start_kernel+0x17c/0x1c0
[   46.241971] ---[ end trace 57048cd1372cd828 ]---

Inspection of queue showed Jumbo packets were not sent out.
The ring Jumbo packet function needs to set the OWN bit so
the packet is sent.

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoarm64: dts: stratix10: Support Ethernet Jumbo frame
Thor Thayer [Mon, 22 Oct 2018 22:22:25 +0000 (17:22 -0500)]
arm64: dts: stratix10: Support Ethernet Jumbo frame

Properly specify the RX and TX FIFO size which is important
for Jumbo frames.
Update the max-frame-size to support Jumbo frames.

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
David S. Miller [Tue, 23 Oct 2018 03:21:30 +0000 (20:21 -0700)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for your net tree:

1) rbtree lookup from control plane returns the left-hand side element
   of the range when the interval end flag is set on.

2) osf extension is not supported from the input path, reject this from
   the control plane, from Fernando Fernandez Mancera.

3) xt_TEE is leaving output interface unset due to a recent incorrect
   netns rework, from Taehee Yoo.

4) xt_TEE allows to select an interface which does not belong to this
   netnamespace, from Taehee Yoo.

5) Zero private extension area in nft_compat, just like we do in x_tables,
   otherwise we leak kernel memory to userspace.

6) Missing .checkentry and .destroy entries in new DNAT extensions breaks
   it since we never load nf_conntrack dependencies, from Paolo Abeni.

7) Do not remove flowtable hook from netns exit path, the netdevice handler
   already deals with this, also from Taehee Yoo.

8) Only cleanup flowtable entries that reside in this netnamespace, also
   from Taehee Yoo.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotls: Add maintainers
Dave Watson [Mon, 22 Oct 2018 19:36:37 +0000 (19:36 +0000)]
tls: Add maintainers

Add John and Daniel as additional tls co-maintainers to help review
patches and fix syzbot reports.

Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: ti: cpsw: unsync mcast entries while switch promisc mode
Ivan Khoronzhuk [Mon, 22 Oct 2018 18:51:36 +0000 (21:51 +0300)]
net: ethernet: ti: cpsw: unsync mcast entries while switch promisc mode

After flushing all mcast entries from the table, the ones contained in
mc list of ndev are not restored when promisc mode is toggled off,
because they are considered as synched with ALE, thus, in order to
restore them after promisc mode - reset syncing info. This fix
touches only switch mode devices, including single port boards
like Beagle Bone.

Fixes: commit 5da1948969bc
("net: ethernet: ti: cpsw: fix lost of mcast packets while rx_mode update")

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'octeontx2-af-NPC-parser-and-NIX-blocks-initialization'
David S. Miller [Tue, 23 Oct 2018 03:15:39 +0000 (20:15 -0700)]
Merge branch 'octeontx2-af-NPC-parser-and-NIX-blocks-initialization'

Sunil Goutham says:

====================
octeontx2-af: NPC parser and NIX blocks initialization

This patchset is a continuation to earlier submitted two patch
series to add a new driver for Marvell's OcteonTX2 SOC's
Resource virtualization unit (RVU) admin function driver.

1. octeontx2-af: Add RVU Admin Function driver
   https://www.spinics.net/lists/netdev/msg528272.html
2. octeontx2-af: NPA and NIX blocks initialization
   https://www.spinics.net/lists/netdev/msg529163.html

This patch series adds more NIX block configuration logic
and additionally adds NPC block parser profile configuration.
In brief below is what this series adds.
NIX block:
- Support for PF/VF to allocate/free transmit scheduler queues,
  maintenance and their configuration.
- Adds support for packet replication lists, only broadcast
  packets is covered for now.
- Defines few RSS flow algorithms for HW to distribute packets.
  This is not the hash algorithsm (i.e toeplitz or crc32), here SW
  defines what fields in packet should HW take and calculate the hash.
- Support for PF/VF to configure VTAG strip and capture capabilities.
- Reset NIXLF statastics.

NPC block:
This block has multiple parser engines which support packet parsing
at multiple layers and generates a parse result which is further used
to generate a key. Based on packet field offsets in the key, SW can
install packet forwarding rules.
This patch series adds
- Initial parser profile to be programmed into parser engines.
- Default forwarding rules to forward packets to different logical
  interfaces having a NIXLF attached.
- Support for promiscuous and multicast modes.

Changes from v1:
 1 Fixed kernel build failure when compiled with BIG_ENDIAN enabled.
   - Reported by Kbuild test robot
 2 Fixed a warning observed when kernel is built with -Wunused-but-set-variable
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Support for NIXLF's UCAST/PROMISC/ALLMULTI modes
Sunil Goutham [Mon, 22 Oct 2018 17:56:04 +0000 (23:26 +0530)]
octeontx2-af: Support for NIXLF's UCAST/PROMISC/ALLMULTI modes

By default NIXLF is set in UCAST mode. This patch adds a new
mailbox message which when sent by a RVU PF changes this default
mode. When promiscuous mode is needed, the reserved promisc entry
for each of RVU PF is setup to match against ingress channel number
only, so that all pkts on that channel are accepted and forwarded
to the mode change requesting PF_FUNC's NIXLF.

PROMISC and ALLMULTI modes are supported only for PFs, for VFs only
UCAST mode is supported.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Support for setting MAC address
Sunil Goutham [Mon, 22 Oct 2018 17:56:03 +0000 (23:26 +0530)]
octeontx2-af: Support for setting MAC address

Added a new mailbox message for a PF/VF to set/update
it's NIXLF's MAC address. Also updates unicast NPC
MCAM entry with this address as matching DMAC.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Support for changing RSS algorithm
Sunil Goutham [Mon, 22 Oct 2018 17:56:02 +0000 (23:26 +0530)]
octeontx2-af: Support for changing RSS algorithm

This patch adds support for a RVU PF/VF to change
NIX Rx flowkey algorithm index in NPC RX RSS_ACTION.
eg: a ethtool command changing RSS algorithm for a netdev
interface would trigger this change in NPC.

If PF/VF doesn't specify any MCAM entry index then default
UCAST entry of the NIXLF attached to PF/VF will be updated
with RSS_ACTION and flowkey index.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: NIX Rx flowkey configuration for RSS
Sunil Goutham [Mon, 22 Oct 2018 17:56:01 +0000 (23:26 +0530)]
octeontx2-af: NIX Rx flowkey configuration for RSS

Configure NIX RX flowkey algorithm configuration to support
RSS (receive side scaling). Currently support for only L3/L4
2-tuple and 4-tuple hash of IPv4/v6/TCP/UDP/SCTP is added.
HW supports upto 32 different flowkey algorithms which SW
can define, this patch defines 9. NPC RX ACTION has to point
to one of these flowkey indices for RSS to work.

The configuration is dependent on NPC parse result's layer
info. So if NPC KPU profile changes suchthat LID/LTYPE values
of above said protocols change then this configuration will
most likely be effected.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Install ucast and bcast pkt forwarding rules
Sunil Goutham [Mon, 22 Oct 2018 17:56:00 +0000 (23:26 +0530)]
octeontx2-af: Install ucast and bcast pkt forwarding rules

Upon NIXLF_ALLOC install a unicast forwarding rule in NPC MCAM
like below
 - Match pkt DMAC with NIXLF attached PF/VF's MAC address.
 - Ingress channel
 - Action is UCAST
 - Forward to PF_FUNC of this NIXLF
And broadcast pkt forwarding rule as
 - Match L2B bit in MCAM search key
 - Ingress channel
 - Action is UCAST, for now, later it will be changed to MCAST.
Only PFs can install this rule

Upon NIXLF_FREE disable all MCAM entries in use by that NIXLF.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Add LMAC channel info to NIXLF_ALLOC response
Stanislaw Kardach [Mon, 22 Oct 2018 17:55:59 +0000 (23:25 +0530)]
octeontx2-af: Add LMAC channel info to NIXLF_ALLOC response

Add LMAC channel info like Rx/Tx channel base and count to
NIXLF_ALLOC mailbox message response. This info is used by
NIXLF attached RVU PF/VF to configure SQ's default channel,
TL3_TL2_LINKX_CFG and to install MCAM rules in NPC based
on matching ingress channel number.

Signed-off-by: Stanislaw Kardach <skardach@marvell.com>
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: NPC MCAM and LDATA extract minimal configuration
Sunil Goutham [Mon, 22 Oct 2018 17:55:58 +0000 (23:25 +0530)]
octeontx2-af: NPC MCAM and LDATA extract minimal configuration

This patch adds some minimal configuration for NPC MCAM and
LDATA extraction which is sufficient enough to install
ucast/bcast/promiscuous forwarding rules. Below is the
config done
- LDATA extraction config to extract DMAC from pkt
  to offset 64bit in MCAM search key.
- Set MCAM lookup keysize to 224bits
- Set MCAM TX miss action to UCAST_DEFAULT
- Set MCAM RX miss action to DROP

Also inorder to have guaranteed space in MCAM to install
ucast forwarding rule for each of RVU PF/VF, reserved
one MCAM entry for each of NIXLF for ucast rule. And two
entries for each of RVU PF. One for bcast pkt replication
and other for promiscuous mode which allows all pkts
received on a HW CGX/LBK channel.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Enable packet length and csum validation
Sunil Goutham [Mon, 22 Oct 2018 17:55:57 +0000 (23:25 +0530)]
octeontx2-af: Enable packet length and csum validation

Config NPC layer info from KPU profile into protocol
checker to identify outer L2/IPv4/TCP/UDP headers in a
packet. And enable IPv4 checksum validation.

L3/L4 and L4 CSUM validation will be enabled by PF/VF
drivers by configuring NIX_AF_LF(0..127)_RX_CFG via mbox
i.e 'nix_lf_alloc_req->rx_cfg'

Also enable setting of NPC_RESULT_S[L2B] when an outer
L2 broadcast address is detected. This will help in
installing NPC MCAM rules for broadcast packets.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Support for VTAG strip and capture
Vamsi Attunuru [Mon, 22 Oct 2018 17:55:56 +0000 (23:25 +0530)]
octeontx2-af: Support for VTAG strip and capture

Added support for PF/VF drivers to configure NIX to
capture and/or strip VLAN tag from ingress packets.

Signed-off-by: Vamsi Attunuru <vamsi.attunuru@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Update bcast list upon NIXLF alloc/free
Sunil Goutham [Mon, 22 Oct 2018 17:55:55 +0000 (23:25 +0530)]
octeontx2-af: Update bcast list upon NIXLF alloc/free

Upon NIXLF ALLOC/FREE, add or remove corresponding PF_FUNC from
the broadcast packet replication list of the CGX LMAC mapped
RVU PF.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Broadcast packet replication support
Sunil Goutham [Mon, 22 Oct 2018 17:55:54 +0000 (23:25 +0530)]
octeontx2-af: Broadcast packet replication support

Allocate memory for mcast/bcast/mirror replication entry
contexts, replication buffers (used by HW) and config HW
with corresponding memory bases. Added support for installing
MCEs via NIX AQ mbox.

For now support is restricted to broadcast pkt replication,
hence MCE table size and number of replication buffers
allocated are less. Each CGX LMAC mapped RVU PF is assigned
a MCE table of size 'num VFs of that PF + PF'.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Config pkind for CGX mapped PFs
Geetha sowjanya [Mon, 22 Oct 2018 17:55:53 +0000 (23:25 +0530)]
octeontx2-af: Config pkind for CGX mapped PFs

For each CGX LMAC that is mapped to a RVU PF, allocate
a pkind and config the same in CGX. For a received packet
at CGX LMAC interface this pkind is used by NPC block
to start parsing of packet.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Config NPC KPU engines with parser profile
Sunil Goutham [Mon, 22 Oct 2018 17:55:52 +0000 (23:25 +0530)]
octeontx2-af: Config NPC KPU engines with parser profile

This patch configures all 16 KPUs and iKPU (pkinds) with
the KPU parser profile defined in npc_profile.h. Each KPU
engine has a 128 entry CAM, only CAM entries which are listed
in the profile are enabled and rest are left disabled.

Also
- Memory is allocated for pkind's bitmap and PFFUNC, interface
  channel mapping.
- Added all CSR offsets of NPC HW block.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Add NPC KPU profile
Hao Zheng [Mon, 22 Oct 2018 17:55:51 +0000 (23:25 +0530)]
octeontx2-af: Add NPC KPU profile

NPC block is responsible for parsing and forwarding
packets to different NIXLFs. NPC has 16 KPU engines
(Kangaroo parse engine) and one iKPU which represents
pkinds. Each physical port either CGX/LBK is assigned
a pkind and upon receiving a packet HW takes that port's
pkind and starts parsing as per the KPU engines config.

This patch adds header files which contain configuration
profile/array for each of the iKPU and 16 KPU engines.

Signed-off-by: Hao Zheng <hao.zheng@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Reset NIXLF's Rx/Tx stats
Vamsi Attunuru [Mon, 22 Oct 2018 17:55:50 +0000 (23:25 +0530)]
octeontx2-af: Reset NIXLF's Rx/Tx stats

This patch adds a new mailbox message to reset
a NIXLF's receive and transmit HW stats.

Signed-off-by: Vamsi Attunuru <vamsi.attunuru@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: NIX Tx scheduler queue config support
Sunil Goutham [Mon, 22 Oct 2018 17:55:49 +0000 (23:25 +0530)]
octeontx2-af: NIX Tx scheduler queue config support

This patch adds support for a PF/VF driver to configure
NIX transmit scheduler queues via mbox. Since PF/VF doesn't
know the absolute HW index of the NIXLF attached to it, AF
traps the register config and overwrites with the correct
NIXLF index.

HW supports shaping, colouring and policing of packets with
these multilevel traffic scheduler queues. Instead of
introducing different mbox message formats for different
configurations and making both AF & PF/VF driver implementation
cumbersome, access to the scheduler queue's CSRs is provided
via mbox. AF checks whether the sender PF/VF has the
corresponding queue allocated or not and dumps the config
to HW. With a single mbox msg 20 registers can be configured.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: NIX Tx scheduler queues alloc/free
Sunil Goutham [Mon, 22 Oct 2018 17:55:48 +0000 (23:25 +0530)]
octeontx2-af: NIX Tx scheduler queues alloc/free

Added support for a PF/VF to allocate or free NIX transmit
scheduler queues via mbox. For setting up pkt transmission
priorities between queues, the scheduler queues have to be
contiguous w.r.t their HW indices. So both contiguous and
non-contiguous allocations are supported.

Upon receiving NIX_TXSCH_FREE mbox msg all scheduler queues
allocated to sending PFFUNC (PF/VF) will be freed. Selective
free is not supported.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agollc: do not use sk_eat_skb()
Eric Dumazet [Mon, 22 Oct 2018 16:24:27 +0000 (09:24 -0700)]
llc: do not use sk_eat_skb()

syzkaller triggered a use-after-free [1], caused by a combination of
skb_get() in llc_conn_state_process() and usage of sk_eat_skb()

sk_eat_skb() is assuming the skb about to be freed is only used by
the current thread. TCP/DCCP stacks enforce this because current
thread holds the socket lock.

llc_conn_state_process() wants to make sure skb does not disappear,
and holds a reference on the skb it manipulates. But as soon as this
skb is added to socket receive queue, another thread can consume it.

This means that llc must use regular skb_unlink() and kfree_skb()
so that both producer and consumer can safely work on the same skb.

[1]
BUG: KASAN: use-after-free in atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
BUG: KASAN: use-after-free in refcount_read include/linux/refcount.h:43 [inline]
BUG: KASAN: use-after-free in skb_unref include/linux/skbuff.h:967 [inline]
BUG: KASAN: use-after-free in kfree_skb+0xb7/0x580 net/core/skbuff.c:655
Read of size 4 at addr ffff8801d1f6fba4 by task ksoftirqd/1/18

CPU: 1 PID: 18 Comm: ksoftirqd/1 Not tainted 4.19.0-rc8+ #295
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c4/0x2b6 lib/dump_stack.c:113
 print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412
 check_memory_region_inline mm/kasan/kasan.c:260 [inline]
 check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
 kasan_check_read+0x11/0x20 mm/kasan/kasan.c:272
 atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
 refcount_read include/linux/refcount.h:43 [inline]
 skb_unref include/linux/skbuff.h:967 [inline]
 kfree_skb+0xb7/0x580 net/core/skbuff.c:655
 llc_sap_state_process+0x9b/0x550 net/llc/llc_sap.c:224
 llc_sap_rcv+0x156/0x1f0 net/llc/llc_sap.c:297
 llc_sap_handler+0x65e/0xf80 net/llc/llc_sap.c:438
 llc_rcv+0x79e/0xe20 net/llc/llc_input.c:208
 __netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4913
 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5023
 process_backlog+0x218/0x6f0 net/core/dev.c:5829
 napi_poll net/core/dev.c:6249 [inline]
 net_rx_action+0x7c5/0x1950 net/core/dev.c:6315
 __do_softirq+0x30c/0xb03 kernel/softirq.c:292
 run_ksoftirqd+0x94/0x100 kernel/softirq.c:653
 smpboot_thread_fn+0x68b/0xa00 kernel/smpboot.c:164
 kthread+0x35a/0x420 kernel/kthread.c:246
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413

Allocated by task 18:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
 kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
 __alloc_skb+0x119/0x770 net/core/skbuff.c:193
 alloc_skb include/linux/skbuff.h:995 [inline]
 llc_alloc_frame+0xbc/0x370 net/llc/llc_sap.c:54
 llc_station_ac_send_xid_r net/llc/llc_station.c:52 [inline]
 llc_station_rcv+0x1dc/0x1420 net/llc/llc_station.c:111
 llc_rcv+0xc32/0xe20 net/llc/llc_input.c:220
 __netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4913
 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5023
 process_backlog+0x218/0x6f0 net/core/dev.c:5829
 napi_poll net/core/dev.c:6249 [inline]
 net_rx_action+0x7c5/0x1950 net/core/dev.c:6315
 __do_softirq+0x30c/0xb03 kernel/softirq.c:292

Freed by task 16383:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
 __cache_free mm/slab.c:3498 [inline]
 kmem_cache_free+0x83/0x290 mm/slab.c:3756
 kfree_skbmem+0x154/0x230 net/core/skbuff.c:582
 __kfree_skb+0x1d/0x20 net/core/skbuff.c:642
 sk_eat_skb include/net/sock.h:2366 [inline]
 llc_ui_recvmsg+0xec2/0x1610 net/llc/af_llc.c:882
 sock_recvmsg_nosec net/socket.c:794 [inline]
 sock_recvmsg+0xd0/0x110 net/socket.c:801
 ___sys_recvmsg+0x2b6/0x680 net/socket.c:2278
 __sys_recvmmsg+0x303/0xb90 net/socket.c:2390
 do_sys_recvmmsg+0x181/0x1a0 net/socket.c:2466
 __do_sys_recvmmsg net/socket.c:2484 [inline]
 __se_sys_recvmmsg net/socket.c:2480 [inline]
 __x64_sys_recvmmsg+0xbe/0x150 net/socket.c:2480
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8801d1f6fac0
 which belongs to the cache skbuff_head_cache of size 232
The buggy address is located 228 bytes inside of
 232-byte region [ffff8801d1f6fac0ffff8801d1f6fba8)
The buggy address belongs to the page:
page:ffffea000747dbc0 count:1 mapcount:0 mapping:ffff8801d9be7680 index:0xffff8801d1f6fe80
flags: 0x2fffc0000000100(slab)
raw: 02fffc0000000100 ffffea0007346e88 ffffea000705b108 ffff8801d9be7680
raw: ffff8801d1f6fe80 ffff8801d1f6f0c0 000000010000000b 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8801d1f6fa80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
 ffff8801d1f6fb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8801d1f6fb80: fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc
                               ^
 ffff8801d1f6fc00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8801d1f6fc80: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/wan/fsl_ucc_hdlc: error counters
Mathias Thore [Mon, 22 Oct 2018 12:55:50 +0000 (14:55 +0200)]
net/wan/fsl_ucc_hdlc: error counters

Extract error information from rx and tx buffer descriptors,
and update error counters.

Signed-off-by: Mathias Thore <mathias.thore@infinera.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: legacy: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:01:05 +0000 (22:01 +0200)]
net: dsa: legacy: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoptp: ptp_dte: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:39 +0000 (22:00 +0200)]
ptp: ptp_dte: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: fix compilation error in xtensa architecture
Arthur Kiyanovski [Sun, 21 Oct 2018 15:07:14 +0000 (18:07 +0300)]
net: ena: fix compilation error in xtensa architecture

linux/prefetch.h is never explicitly included in ena_com, although
functions from it, such as prefetchw(), are used throughout ena_com.
This is an inclusion bug, and we fix it here by explicitly including
linux/prefetch.h. The bug was exposed when the driver was compiled
for the xtensa architecture.

Fixes: 689b2bdaaa14 ("net: ena: add functions for handling Low Latency Queues in ena_com")
Fixes: 8c590f977638 ("ena: Fix Kconfig dependency on X86")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoaf_unix.h: trivial whitespace cleanup
Vito Caputo [Sun, 21 Oct 2018 11:33:03 +0000 (04:33 -0700)]
af_unix.h: trivial whitespace cleanup

Replace spurious spaces with a tab and remove superfluous tab from
unix_sock struct.

Signed-off-by: Vito Caputo <vcaputo@pengaru.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/mlx5: Allocate enough space for the FDB sub-namespaces
Dan Carpenter [Sun, 21 Oct 2018 11:05:49 +0000 (14:05 +0300)]
net/mlx5: Allocate enough space for the FDB sub-namespaces

FDB_MAX_CHAIN is three.  We wanted to allocate enough memory to hold four
structs but there are missing parentheses so we only allocate enough
memory for three structs and the first byte of the fourth one.

Fixes: 328edb499f99 ("net/mlx5: Split FDB fast path prio to multiple namespaces")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'forbid-goto_chain-fallback'
David S. Miller [Tue, 23 Oct 2018 02:42:58 +0000 (19:42 -0700)]
Merge branch 'forbid-goto_chain-fallback'

Davide Caratti says:

====================
net/sched: forbid 'goto_chain' on fallback actions

the following command:

 # tc actions add action police rate 1mbit burst 1k conform-exceed \
 > pass / goto chain 42

generates a NULL pointer dereference when packets exceed the configured
rate. Similarly, the following command:

 # tc actions add action pass random determ goto chain 42 2

makes the kernel crash with NULL dereference when the first packet does
not match the 'pass' action.

gact and police allow users to specify a fallback control action, that is
stored in the action private data. 'goto chain x' never worked for these
cases, since a->goto_chain handle was never initialized. There is only one
goto_chain handle per TC action, and it is designed to be non-NULL only if
tcf_action contains a 'goto chain' command. So, let's forbid 'goto chain'
on fallback actions.

Patch 1/4 and 2/4 change the .init() functions of police and gact, to let
them return an error when users try to set 'goto chain x' in the fallback
action. Patch 3/4 and 4/4 add TDC selftest coverage to this new behavior.
====================

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotc-tests: test denial of 'goto chain' for exceed traffic in police.json
Davide Caratti [Sat, 20 Oct 2018 21:33:10 +0000 (23:33 +0200)]
tc-tests: test denial of 'goto chain' for exceed traffic in police.json

add test to verify if act_police forbids 'goto chain' control actions for
'exceed' traffic.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotc-tests: test denial of 'goto chain' on 'random' traffic in gact.json
Davide Caratti [Sat, 20 Oct 2018 21:33:09 +0000 (23:33 +0200)]
tc-tests: test denial of 'goto chain' on 'random' traffic in gact.json

add test to verify if act_gact forbids 'goto chain' control actions on
'random' traffic in gact.json.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/sched: act_police: disallow 'goto chain' on fallback control action
Davide Caratti [Sat, 20 Oct 2018 21:33:08 +0000 (23:33 +0200)]
net/sched: act_police: disallow 'goto chain' on fallback control action

in the following command:

 # tc action add action police rate <r> burst <b> conform-exceed <c1>/<c2>

'goto chain x' is allowed only for c1: setting it for c2 makes the kernel
crash with NULL pointer dereference, since TC core doesn't initialize the
chain handle.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/sched: act_gact: disallow 'goto chain' on fallback control action
Davide Caratti [Sat, 20 Oct 2018 21:33:07 +0000 (23:33 +0200)]
net/sched: act_gact: disallow 'goto chain' on fallback control action

in the following command:

 # tc action add action <c1> random <rand_type> <c2> <rand_param>

'goto chain x' is allowed only for c1: setting it for c2 makes the kernel
crash with NULL pointer dereference, since TC core doesn't initialize the
chain handle.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: phy_support_sym_pause: Clear Asym Pause
Andrew Lunn [Sat, 20 Oct 2018 20:41:28 +0000 (22:41 +0200)]
net: phy: phy_support_sym_pause: Clear Asym Pause

When indicating the MAC supports Symmetric Pause, clear the Asymmetric
Pause bit, which could of been already set is the PHY supports it.

Reported-by: Labbe Corentin <clabbe@baylibre.com>
Fixes: c306ad36184f ("net: ethernet: Add helper for MACs which support pause")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bpfilter: Set user mode helper's command line
Olivier Brunel [Sat, 20 Oct 2018 17:39:57 +0000 (19:39 +0200)]
net: bpfilter: Set user mode helper's command line

Signed-off-by: Olivier Brunel <jjk@jjacky.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoumh: Add command line to user mode helpers
Olivier Brunel [Sat, 20 Oct 2018 17:39:56 +0000 (19:39 +0200)]
umh: Add command line to user mode helpers

User mode helpers were spawned without a command line, and because
an empty command line is used by many tools to identify processes as
kernel threads, this could cause some issues.

Notably during killing spree on shutdown, since such helper would then
be skipped (i.e. not killed) which would result in the process remaining
alive, and thus preventing unmouting of the rootfs (as experienced with
the bpfilter umh).

Fixes: 449325b52b7a ("umh: introduce fork_usermode_blob() helper")
Signed-off-by: Olivier Brunel <jjk@jjacky.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqlcnic: fix a return in qlcnic_dcb_get_capability()
Dan Carpenter [Fri, 19 Oct 2018 20:11:11 +0000 (23:11 +0300)]
qlcnic: fix a return in qlcnic_dcb_get_capability()

These functions are supposed to return one on failure and zero on
success.  Returning a zero here could cause uninitialized variable
bugs in several of the callers.  For example:

    drivers/scsi/cxgbi/cxgb4i/cxgb4i.c:1660 get_iscsi_dcb_priority()
    error: uninitialized symbol 'caps'.

Fixes: 48365e485275 ("qlcnic: dcb: Add support for CEE Netlink interface.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'net-Add-support-for-dumping-addresses-for-a-specific-device'
David S. Miller [Tue, 23 Oct 2018 02:33:29 +0000 (19:33 -0700)]
Merge branch 'net-Add-support-for-dumping-addresses-for-a-specific-device'

David Ahern says:

====================
net: Add support for dumping addresses for a specific device

Use the recently added kernel side filter infrastructure to add support
for dumping addresses only for a specific device.

Patch 1 creates an IPv4 version similar to IPv6's in6_dump_addrs function.

Patch 2 simplifies in6_dump_addrs by moving index tracking of IP
addresses from inet6_dump_addr to in6_dump_addrs.

Patches 3 and 4 use the device-based address dump helpers to limit a
dump to just the addresses on a specific device.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/ipv6: Add support for dumping addresses for a specific device
David Ahern [Fri, 19 Oct 2018 19:45:30 +0000 (12:45 -0700)]
net/ipv6: Add support for dumping addresses for a specific device

If an RTM_GETADDR dump request has ifa_index set in the ifaddrmsg
header, then return only the addresses for that device.

Since inet6_dump_addr is reused for multicast and anycast addresses,
this adds support for device specfic dumps of RTM_GETMULTICAST and
RTM_GETANYCAST as well.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/ipv4: Add support for dumping addresses for a specific device
David Ahern [Fri, 19 Oct 2018 19:45:29 +0000 (12:45 -0700)]
net/ipv4: Add support for dumping addresses for a specific device

If an RTM_GETADDR dump request has ifa_index set in the ifaddrmsg
header, then return only the addresses for that device.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/ipv6: Remove ip_idx arg to in6_dump_addrs
David Ahern [Fri, 19 Oct 2018 19:45:28 +0000 (12:45 -0700)]
net/ipv6: Remove ip_idx arg to in6_dump_addrs

ip_idx is always 0 going into in6_dump_addrs; it is passed as a pointer
to save the last good index into cb. Since cb is already argument to
in6_dump_addrs, just save the value there.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/ipv4: Move loop over addresses on a device into in_dev_dump_addr
David Ahern [Fri, 19 Oct 2018 19:45:27 +0000 (12:45 -0700)]
net/ipv4: Move loop over addresses on a device into in_dev_dump_addr

Similar to IPv6 move the logic that walks over the ipv4 address list
for a device into a helper.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'hns3-next'
David S. Miller [Tue, 23 Oct 2018 02:31:14 +0000 (19:31 -0700)]
Merge branch 'hns3-next'

Salil Mehta says:

====================
Adds support of RAS Error Handling in HNS3 Driver

This patch-set adds support related to RAS Error handling to the HNS3
Ethernet PF Driver. Set of errors occurred in the HNS3 hardware are
reported to the driver through the PCIe AER interface. The received
error information is then used to classify the received errors and
then decide the appropriate receovery action depending on the type
of error.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: Add enable and process hw errors of TM scheduler
Shiju Jose [Fri, 19 Oct 2018 19:15:32 +0000 (20:15 +0100)]
net: hns3: Add enable and process hw errors of TM scheduler

This patch enables and process hw errors of TM scheduler and
QCN(Quantized Congestion Control).

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: Add enable and process hw errors from PPP
Shiju Jose [Fri, 19 Oct 2018 19:15:31 +0000 (20:15 +0100)]
net: hns3: Add enable and process hw errors from PPP

This patch enables and process hw errors from the
PPP(Programmable Packet Process) block.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: Add enable and process hw errors from IGU, EGU and NCSI
Shiju Jose [Fri, 19 Oct 2018 19:15:30 +0000 (20:15 +0100)]
net: hns3: Add enable and process hw errors from IGU, EGU and NCSI

This patch adds enable and processing of hw errors from IGU(Ingress Unit),
EGU(Egress Unit) and NCSI(Network Controller Sideband Interface).

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: Add enable and process common ecc errors
Shiju Jose [Fri, 19 Oct 2018 19:15:29 +0000 (20:15 +0100)]
net: hns3: Add enable and process common ecc errors

This patch adds enable and processing of ecc errors from
common HNS blocks, CMDQ(Command Queue),
IMP(Integrated Management Processor) and TQP(Task Queue Pair).

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: Add support to enable and disable hw errors
Shiju Jose [Fri, 19 Oct 2018 19:15:28 +0000 (20:15 +0100)]
net: hns3: Add support to enable and disable hw errors

This patch adds functions to enable and disable hw errors.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: Add PCIe AER error recovery
Shiju Jose [Fri, 19 Oct 2018 19:15:27 +0000 (20:15 +0100)]
net: hns3: Add PCIe AER error recovery

This patch adds the error recovery for the HNS hw errors.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: Add PCIe AER callback error_detected
Shiju Jose [Fri, 19 Oct 2018 19:15:26 +0000 (20:15 +0100)]
net: hns3: Add PCIe AER callback error_detected

Set of hw errors occurred in the HNS3 are reported to the
hns3 driver through PCIe AER and RAS.The error info will be
processed and appropriately recovered.
This patch adds error_detected callback and error processing.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomISDN: Fix type of switch control variable in ctrl_teimanager
Nathan Chancellor [Fri, 19 Oct 2018 18:00:30 +0000 (11:00 -0700)]
mISDN: Fix type of switch control variable in ctrl_teimanager

Clang warns (trimmed for brevity):

drivers/isdn/mISDN/tei.c:1193:7: warning: overflow converting case value
to switch condition type (2147764552 to 18446744071562348872) [-Wswitch]
        case IMHOLD_L1:
             ^
drivers/isdn/mISDN/tei.c:1187:7: warning: overflow converting case value
to switch condition type (2147764550 to 18446744071562348870) [-Wswitch]
        case IMCLEAR_L2:
             ^
2 warnings generated.

The root cause is that the _IOC macro can generate really large numbers,
which don't find into type int. My research into how GCC and Clang are
handling this at a low level didn't prove fruitful and surveying the
kernel tree shows that aside from here and a few places in the scsi
subsystem, everything that uses _IOC is at least of type 'unsigned int'.
Make that change here because as nothing in this function cares about
the signedness of the variable and it removes ambiguity, which is never
good when dealing with compilers.

While we're here, remove the unnecessary local variable ret (just return
-EINVAL and 0 directly).

Link: https://github.com/ClangBuiltLinux/linux/issues/67
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotipc: eliminate message disordering during binding table update
Jon Maloy [Fri, 19 Oct 2018 17:55:40 +0000 (19:55 +0200)]
tipc: eliminate message disordering during binding table update

We have seen the following race scenario:
1) named_distribute() builds a "bulk" message, containing a PUBLISH
   item for a certain publication. This is based on the contents of
   the binding tables's 'cluster_scope' list.
2) tipc_named_withdraw() removes the same publication from the list,
   bulds a WITHDRAW message and distributes it to all cluster nodes.
3) tipc_named_node_up(), which was calling named_distribute(), sends
   out the bulk message built under 1)
4) The WITHDRAW message arrives at the just detected node, finds
   no corresponding publication, and is dropped.
5) The PUBLISH item arrives at the same node, is added to its binding
   table, and remains there forever.

This arrival disordering was earlier taken care of by the backlog queue,
originally added for a different purpose, which was removed in the
commit referred to below, but we now need a different solution.
In this commit, we replace the rcu lock protecting the 'cluster_scope'
list with a regular RW lock which comprises even the sending of the
bulk message. This both guarantees both the list integrity and the
message sending order. We will later add a commit which cleans up
this code further.

Note that this commit needs recently added commit d3092b2efca1 ("tipc:
fix unsafe rcu locking when accessing publication list") to apply
cleanly.

Fixes: 37922ea4a310 ("tipc: permit overlapping service ranges in name table")
Reported-by: Tuong Lien Tong <tuong.t.lien@dektech.com.au>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Remove set but not used variables 'devnum, is_pf'
YueHaibing [Fri, 19 Oct 2018 13:03:06 +0000 (13:03 +0000)]
octeontx2-af: Remove set but not used variables 'devnum, is_pf'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/marvell/octeontx2/af/rvu.c: In function 'rvu_detach_rsrcs':
drivers/net/ethernet/marvell/octeontx2/af/rvu.c:855:6: warning:
 variable 'devnum' set but not used [-Wunused-but-set-variable]

drivers/net/ethernet/marvell/octeontx2/af/rvu.c:853:7: warning:
 variable 'is_pf' set but not used [-Wunused-but-set-variable]

drivers/net/ethernet/marvell/octeontx2/af/rvu.c: In function 'rvu_mbox_handler_ATTACH_RESOURCES':
drivers/net/ethernet/marvell/octeontx2/af/rvu.c:1054:7: warning:
 variable 'is_pf' set but not used [-Wunused-but-set-variable]

drivers/net/ethernet/marvell/octeontx2/af/rvu.c:1053:6: warning:
 variable 'devnum' set but not used [-Wunused-but-set-variable]

It never used since introduction in commit
746ea74241fa ("octeontx2-af: Add RVU block LF provisioning support")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteontx2-af: Remove set but not used variable 'block'
YueHaibing [Fri, 19 Oct 2018 12:51:28 +0000 (12:51 +0000)]
octeontx2-af: Remove set but not used variable 'block'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c: In function 'rvu_npa_init':
drivers/net/ethernet/marvell/octeontx2/af/rvu_npa.c:446:20: warning:
 variable 'block' set but not used [-Wunused-but-set-variable]

It never used since introduction in
commit 7a37245ef23f ("octeontx2-af: NPA block admin queue init")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'phy-ocelot-serdes-fix-out-of-bounds-read'
David S. Miller [Tue, 23 Oct 2018 02:27:15 +0000 (19:27 -0700)]
Merge branch 'phy-ocelot-serdes-fix-out-of-bounds-read'

Gustavo A. R. Silva says:

====================
phy: ocelot-serdes: fix out-of-bounds read

This patchset aims to fix an out-of-bounds bug in
the phy-ocelot-serdes driver.

Currently, there is an out-of-bounds read on array ctrl->phys,
once variable i reaches the maximum array size of SERDES_MAX
in the for loop.

Quentin Schulz pointed out that SERDES_MAX is a valid value to
index ctrl->phys. So, I updated SERDES_MAX to be SERDES6G_MAX + 1
in include/dt-bindings/phy/phy-ocelot-serdes.h.

Then I changed the condition in the for loop from
i <= SERDES_MAX to i < SERDES_MAX in order to
complete the fix.

The reason I'm sending this fix as series is because
checkpatch reported an error when I first tried to
integrate the whole solution into a singe patch. So,
changes to dt-bindings should be sent as a separate
patch.

Changes in v3:
 - Post the series to netdev, so Dave can take it.

Changes in v2:
 - Send the whole series to Kishon Vijay Abraham I, so it
   can be taken into the PHY tree.
 - Add Quentin's Reviewed-by to commit log in both patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agophy: ocelot-serdes: fix out-of-bounds read
Gustavo A. R. Silva [Fri, 19 Oct 2018 09:21:38 +0000 (11:21 +0200)]
phy: ocelot-serdes: fix out-of-bounds read

Currently, there is an out-of-bounds read on array ctrl->phys,
once variable i reaches the maximum array size of SERDES_MAX
in the for loop.

Fix this by changing the condition in the for loop from
i <= SERDES_MAX to i < SERDES_MAX.

Addresses-Coverity-ID: 1473966 ("Out-of-bounds read")
Addresses-Coverity-ID: 1473959 ("Out-of-bounds read")
Fixes: 51f6b410fc22 ("phy: add driver for Microsemi Ocelot SerDes muxing")
Reviewed-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: phy: Update SERDES_MAX to be SERDES_MAX + 1
Gustavo A. R. Silva [Fri, 19 Oct 2018 09:19:13 +0000 (11:19 +0200)]
dt-bindings: phy: Update SERDES_MAX to be SERDES_MAX + 1

SERDES_MAX is a valid value to index ctrl->phys in
drivers/phy/mscc/phy-ocelot-serdes.c. But, currently,
there is an out-of-bounds bug in the mentioned driver
when reading from ctrl->phys, because the size of
array ctrl->phys is SERDES_MAX.

Partially fix this by updating SERDES_MAX to be SERDES6G_MAX + 1.

Notice that this is the first part of the solution to
the out-of-bounds bug mentioned above. Although this
change is not dependent on any other one.

Suggested-by: Quentin Schulz <quentin.schulz@bootlin.com>
Reviewed-by: Quentin Schulz <quentin.schulz@bootlin.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotipc: use destination length for copy string
Guoqing Jiang [Fri, 19 Oct 2018 04:08:22 +0000 (12:08 +0800)]
tipc: use destination length for copy string

Got below warning with gcc 8.2 compiler.

net/tipc/topsrv.c: In function ‘tipc_topsrv_start’:
net/tipc/topsrv.c:660:2: warning: ‘strncpy’ specified bound depends on the length of the source argument [-Wstringop-overflow=]
  strncpy(srv->name, name, strlen(name) + 1);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
net/tipc/topsrv.c:660:27: note: length computed here
  strncpy(srv->name, name, strlen(name) + 1);
                           ^~~~~~~~~~~~
So change it to correct length and use strscpy.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoisdn: hfc_{pci,sx}: Avoid empty body if statements
Nathan Chancellor [Fri, 19 Oct 2018 01:11:04 +0000 (18:11 -0700)]
isdn: hfc_{pci,sx}: Avoid empty body if statements

Clang warns:

drivers/isdn/hisax/hfc_pci.c:131:34: error: if statement has empty body
[-Werror,-Wempty-body]
        if (Read_hfc(cs, HFCPCI_INT_S1));
                                        ^
drivers/isdn/hisax/hfc_pci.c:131:34: note: put the semicolon on a
separate line to silence this warning

In my attempt to hide the warnings because I thought they didn't serve
any purpose[1], Masahiro Yamada pointed out that {Read,Write}_hfc in
hci_pci.c should be using a standard register access method; otherwise,
the compiler will just remove the if statements.

For hfc_pci, use the versions of {Read,Write}_hfc found in
drivers/isdn/hardware/mISDN/hfc_pCI.h while converting pci_io to be
'void __iomem *' (and clean up ioremap) then remove the empty if
statements.

For hfc_sx, {Read,Write}_hfc are already use a proper register accessor
(inb, outb) so just remove the unnecessary if statements.

[1]: https://lore.kernel.org/lkml/20181016021454.11953-1-natechancellor@gmail.com/

Link: https://github.com/ClangBuiltLinux/linux/issues/66
Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
David S. Miller [Mon, 22 Oct 2018 04:11:46 +0000 (21:11 -0700)]
Merge git://git./linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2018-10-21

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Implement two new kind of BPF maps, that is, queue and stack
   map along with new peek, push and pop operations, from Mauricio.

2) Add support for MSG_PEEK flag when redirecting into an ingress
   psock sk_msg queue, and add a new helper bpf_msg_push_data() for
   insert data into the message, from John.

3) Allow for BPF programs of type BPF_PROG_TYPE_CGROUP_SKB to use
   direct packet access for __skb_buff, from Song.

4) Use more lightweight barriers for walking perf ring buffer for
   libbpf and perf tool as well. Also, various fixes and improvements
   from verifier side, from Daniel.

5) Add per-symbol visibility for DSO in libbpf and hide by default
   global symbols such as netlink related functions, from Andrey.

6) Two improvements to nfp's BPF offload to check vNIC capabilities
   in case prog is shared with multiple vNICs and to protect against
   mis-initializing atomic counters, from Jakub.

7) Fix for bpftool to use 4 context mode for the nfp disassembler,
   also from Jakub.

8) Fix a return value comparison in test_libbpf.sh and add several
   bpftool improvements in bash completion, documentation of bpf fs
   restrictions and batch mode summary print, from Quentin.

9) Fix a file resource leak in BPF selftest's load_kallsyms()
   helper, from Peng.

10) Fix an unused variable warning in map_lookup_and_delete_elem(),
    from Alexei.

11) Fix bpf_skb_adjust_room() signature in BPF UAPI helper doc,
    from Nicolas.

12) Add missing executables to .gitignore in BPF selftests, from Anders.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'net-simplify-getting-driver_data'
David S. Miller [Mon, 22 Oct 2018 04:10:12 +0000 (21:10 -0700)]
Merge branch 'net-simplify-getting-driver_data'

Wolfram Sang says:

====================
net: simplify getting .driver_data

I got tired of fixing this in Renesas drivers manually, so I took the big
hammer. Remove this cumbersome code pattern which got copy-pasted too much
already:

- struct platform_device *pdev = to_platform_device(dev);
- struct ep93xx_keypad *keypad = platform_get_drvdata(pdev);
+ struct ep93xx_keypad *keypad = dev_get_drvdata(dev);

A branch, tested by buildbot, can be found here:

git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git coccinelle/get_drvdata

I have been asked if it couldn't be done for dev_set_drvdata as well. I checked
it and did not find one occasion where it could be simplified like this. Not
much of a surprise because driver_data is usually set in probe() functions
which access struct platform_device in many other ways.

I am open for other comments, suggestions, too, of course.

Here is the cocci-script I created:

@@
struct device* d;
identifier pdev;
expression *ptr;
@@
(
- struct platform_device *pdev = to_platform_device(d);
|
- struct platform_device *pdev;
...
- pdev = to_platform_device(d);
)
<... when != pdev
- &pdev->dev
+ d
...>

ptr =
- platform_get_drvdata(pdev)
+ dev_get_drvdata(d)

<... when != pdev
- &pdev->dev
+ d
...>
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: mdio-mux-bcm-iproc: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:20 +0000 (22:00 +0200)]
net: phy: mdio-mux-bcm-iproc: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: wiznet: w5300: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:19 +0000 (22:00 +0200)]
net: ethernet: wiznet: w5300: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: ti: davinci_emac: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:18 +0000 (22:00 +0200)]
net: ethernet: ti: davinci_emac: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: ti: cpsw: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:17 +0000 (22:00 +0200)]
net: ethernet: ti: cpsw: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: smsc: smc91x: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:16 +0000 (22:00 +0200)]
net: ethernet: smsc: smc91x: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: davicom: dm9000: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:15 +0000 (22:00 +0200)]
net: ethernet: davicom: dm9000: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: cadence: macb_main: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:14 +0000 (22:00 +0200)]
net: ethernet: cadence: macb_main: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:13 +0000 (22:00 +0200)]
net: dsa: qca8k: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: bcm_sf2: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:12 +0000 (22:00 +0200)]
net: dsa: bcm_sf2: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Sun, 21 Oct 2018 18:54:28 +0000 (11:54 -0700)]
Merge git://git./linux/kernel/git/davem/net

David Ahern's dump indexing bug fix in 'net' overlapped the
change of the function signature of inet6_fill_ifaddr() in
'net-next'.  Trivially resolved.

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotools: bpftool: fix completion for "bpftool map update"
Quentin Monnet [Sat, 20 Oct 2018 22:01:50 +0000 (23:01 +0100)]
tools: bpftool: fix completion for "bpftool map update"

When trying to complete "bpftool map update" commands, the call to
printf would print an error message that would show on the command line
if no map is found to complete the command line.

Fix it by making sure we have map ids to complete the line with, before
we try to print something.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
3 years agotools: bpftool: print nb of cmds to stdout (not stderr) for batch mode
Quentin Monnet [Sat, 20 Oct 2018 22:01:49 +0000 (23:01 +0100)]
tools: bpftool: print nb of cmds to stdout (not stderr) for batch mode

When batch mode is used and all commands succeeds, bpftool prints the
number of commands processed to stderr. There is no particular reason to
use stderr for this, we could as well use stdout. It would avoid getting
unnecessary output on stderr if the standard ouptut is redirected, for
example.

Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
3 years agotools: bpftool: document restriction on '.' in names to pin in bpffs
Quentin Monnet [Sat, 20 Oct 2018 22:01:48 +0000 (23:01 +0100)]
tools: bpftool: document restriction on '.' in names to pin in bpffs

Names used to pin eBPF programs and maps under the eBPF virtual file
system cannot contain a dot character, which is reserved for future
extensions of this file system.

Document this in bpftool man pages to avoid users getting confused if
pinning fails because of a dot.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Greg Kroah-Hartman [Sun, 21 Oct 2018 08:08:38 +0000 (10:08 +0200)]
Merge git://git./linux/kernel/git/davem/net

David writes:
  "Networking:

   A few straggler bug fixes:

   1) Fix indexing of multi-pass dumps of ipv6 addresses, from David
      Ahern.

   2) Revert RCU locking change for bonding netpoll, causes worse
      problems than it solves.

   3) pskb_trim_rcsum_slow() doesn't handle odd trim offsets, resulting
      in erroneous bad hw checksum triggers with CHECKSUM_COMPLETE
      devices.  From Dimitris Michailidis.

   4) a revert to some neighbour code changes that adjust notifications
      in a way that confuses some apps."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  Revert "neighbour: force neigh_invalidate when NUD_FAILED update is from admin"
  net/ipv6: Fix index counter for unicast addresses in in6_dump_addrs
  net: fix pskb_trim_rcsum_slow() with odd trim offset
  Revert "bond: take rcu lock in netpoll_send_skb_on_dev"

3 years agoselftests/bpf: fix return value comparison for tests in test_libbpf.sh
Quentin Monnet [Sat, 20 Oct 2018 21:58:44 +0000 (22:58 +0100)]
selftests/bpf: fix return value comparison for tests in test_libbpf.sh

The return value for each test in test_libbpf.sh is compared with

    if (( $? == 0 )) ; then ...

This works well with bash, but not with dash, that /bin/sh is aliased to
on some systems (such as Ubuntu).

Let's replace this comparison by something that works on both shells.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
3 years agoMerge branch 'misc-improvements'
Alexei Starovoitov [Sun, 21 Oct 2018 06:13:33 +0000 (23:13 -0700)]
Merge branch 'misc-improvements'

Daniel Borkmann says:

====================
Last batch of misc patches I had in queue: first one removes some left-over
bits from ULP, second is a fix in the verifier where we wrongly use register
number as type to fetch the string for the dump, third disables xadd on flow
keys and subsequent one removes the flow key type from check_helper_mem_access()
as they cannot be passed into any helper as of today. Next one lets map push,
pop, peek avoid having to go through retpoline, and last one has a couple of
minor fixes and cleanups for the ring buffer walk.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf, libbpf: simplify and cleanup perf ring buffer walk
Daniel Borkmann [Sun, 21 Oct 2018 00:09:28 +0000 (02:09 +0200)]
bpf, libbpf: simplify and cleanup perf ring buffer walk

Simplify bpf_perf_event_read_simple() a bit and fix up some minor
things along the way: the return code in the header is not of type
int but enum bpf_perf_event_ret instead. Once callback indicated
to break the loop walking event data, it also needs to be consumed
in data_tail since it has been processed already.

Moreover, bpf_perf_event_print_t callback should avoid void * as
we actually get a pointer to struct perf_event_header and thus
applications can make use of container_of() to have type checks.
The walk also doesn't have to use modulo op since the ring size is
required to be power of two.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf, verifier: avoid retpoline for map push/pop/peek operation
Daniel Borkmann [Sun, 21 Oct 2018 00:09:27 +0000 (02:09 +0200)]
bpf, verifier: avoid retpoline for map push/pop/peek operation

Extend prior work from 09772d92cd5a ("bpf: avoid retpoline for
lookup/update/delete calls on maps") to also apply to the recently
added map helpers that perform push/pop/peek operations so that
the indirect call can be avoided.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf, verifier: remove unneeded flow key in check_helper_mem_access
Daniel Borkmann [Sun, 21 Oct 2018 00:09:26 +0000 (02:09 +0200)]
bpf, verifier: remove unneeded flow key in check_helper_mem_access

They PTR_TO_FLOW_KEYS is not used today to be passed into a helper
as memory, so it can be removed from check_helper_mem_access().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf, verifier: reject xadd on flow key memory
Daniel Borkmann [Sun, 21 Oct 2018 00:09:25 +0000 (02:09 +0200)]
bpf, verifier: reject xadd on flow key memory

We should not enable xadd operation for flow key memory if not
needed there anyway. There is no such issue as described in the
commit f37a8cb84cce ("bpf: reject stores into ctx via st and xadd")
since there's no context rewriter for flow keys today, but it
also shouldn't become part of the user facing behavior to allow
for it. After patch:

  0: (79) r7 = *(u64 *)(r1 +144)
  1: (b7) r3 = 4096
  2: (db) lock *(u64 *)(r7 +0) += r3
  BPF_XADD stores into R7 flow_keys is not allowed

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
3 years agobpf, verifier: fix register type dump in xadd and st
Daniel Borkmann [Sun, 21 Oct 2018 00:09:24 +0000 (02:09 +0200)]
bpf, verifier: fix register type dump in xadd and st

Using reg_type_str[insn->dst_reg] is incorrect since insn->dst_reg
contains the register number but not the actual register type. Add
a small reg_state() helper and use it to get to the type. Also fix
up the test_verifier test cases that have an incorrect errstr.

Fixes: 9d2be44a7f33 ("bpf: Reuse canonical string formatter for ctx errs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
3 years agoulp: remove uid and user_visible members
Daniel Borkmann [Sun, 21 Oct 2018 00:09:23 +0000 (02:09 +0200)]
ulp: remove uid and user_visible members

They are not used anymore and therefore should be removed.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
3 years agoRevert "neighbour: force neigh_invalidate when NUD_FAILED update is from admin"
Roopa Prabhu [Sun, 21 Oct 2018 01:09:31 +0000 (18:09 -0700)]
Revert "neighbour: force neigh_invalidate when NUD_FAILED update is from admin"

This reverts commit 8e326289e3069dfc9fa9c209924668dd031ab8ef.

This patch results in unnecessary netlink notification when one
tries to delete a neigh entry already in NUD_FAILED state. Found
this with a buggy app that tries to delete a NUD_FAILED entry
repeatedly. While the notification issue can be fixed with more
checks, adding more complexity here seems unnecessary. Also,
recent tests with other changes in the neighbour code have
shown that the INCOMPLETE and PROBE checks are good enough for
the original issue.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/ipv6: Fix index counter for unicast addresses in in6_dump_addrs
David Ahern [Fri, 19 Oct 2018 17:00:19 +0000 (10:00 -0700)]
net/ipv6: Fix index counter for unicast addresses in in6_dump_addrs

The loop wants to skip previously dumped addresses, so loops until
current index >= saved index. If the message fills it wants to save
the index for the next address to dump - ie., the one that did not
fit in the current message.

Currently, it is incrementing the index counter before comparing to the
saved index, and then the saved index is off by 1 - it assumes the
current address is going to fit in the message.

Change the index handling to increment only after a succesful dump.

Fixes: 502a2ffd7376a ("ipv6: convert idev_list to list macros")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'bpf-msg-push-data'
Daniel Borkmann [Sat, 20 Oct 2018 19:37:12 +0000 (21:37 +0200)]
Merge branch 'bpf-msg-push-data'

John Fastabend says:

====================
This series adds a new helper bpf_msg_push_data to be used by
sk_msg programs. The helper can be used to insert extra bytes into
the message that can then be used by the program as metadata tags
among other things.

The first patch adds the helper, second patch the libbpf support,
and last patch updates test_sockmap to run msg_push_data tests.

v2: rebase after queue map and in filter.c convert int -> u32
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
3 years agobpf: test_sockmap add options to use msg_push_data
John Fastabend [Sat, 20 Oct 2018 02:56:51 +0000 (19:56 -0700)]
bpf: test_sockmap add options to use msg_push_data

Add options to run msg_push_data, this patch creates two more flags
in test_sockmap that can be used to specify the offset and length
of bytes to be added. The new options are --txmsg_start_push to
specify where bytes should be inserted and --txmsg_end_push to
specify how many bytes. This is analagous to the options that are
used to pull data, --txmsg_start and --txmsg_end.

In addition to adding the options tests are added to the test
suit to run the tests similar to what was done for msg_pull_data.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
3 years agobpf: libbpf support for msg_push_data
John Fastabend [Sat, 20 Oct 2018 02:56:50 +0000 (19:56 -0700)]
bpf: libbpf support for msg_push_data

Add support for new bpf_msg_push_data in libbpf.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
3 years agobpf: sk_msg program helper bpf_msg_push_data
John Fastabend [Sat, 20 Oct 2018 02:56:49 +0000 (19:56 -0700)]
bpf: sk_msg program helper bpf_msg_push_data

This allows user to push data into a msg using sk_msg program types.
The format is as follows,

bpf_msg_push_data(msg, offset, len, flags)

this will insert 'len' bytes at offset 'offset'. For example to
prepend 10 bytes at the front of the message the user can,

bpf_msg_push_data(msg, 0, 10, 0);

This will invalidate data bounds so BPF user will have to then recheck
data bounds after calling this. After this the msg size will have been
updated and the user is free to write into the added bytes. We allow
any offset/len as long as it is within the (data, data_end) range.
However, a copy will be required if the ring is full and its possible
for the helper to fail with ENOMEM or EINVAL errors which need to be
handled by the BPF program.

This can be used similar to XDP metadata to pass data between sk_msg
layer and lower layers.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
3 years agor8169: add support for Byte Queue Limits
Florian Westphal [Sat, 20 Oct 2018 10:25:27 +0000 (12:25 +0200)]
r8169: add support for Byte Queue Limits

This patch is basically a resubmit of 1e918876853a ("r8169: add support
for Byte Queue Limits") which was reverted later. The problems causing
the revert seem to have been fixed in the meantime.
Only change to the original patch is that the call to
netdev_reset_queue was moved to rtl8169_tx_clear.

The Tested-by refers to a system using the RTL8168evl chip version.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agor8169: handle all interrupt events in the hard irq handler
Heiner Kallweit [Thu, 18 Oct 2018 20:19:28 +0000 (22:19 +0200)]
r8169: handle all interrupt events in the hard irq handler

Having a separate "slow event" handler isn't needed because all
interrupt events trigger asynchronous activity. And in case of SYSErr
we have bigger problems than performance anyway.
This patch also allows to get rid of acking interrupt events in the
NAPI poll callback.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetoot...
David S. Miller [Sat, 20 Oct 2018 19:33:48 +0000 (12:33 -0700)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2018-10-20

Here's one more bluetooth-next pull request for the 4.20 kernel.

 - Added new USB ID for QCA_ROME controller
 - Added debug trace support from QCA wcn3990 controllers
 - Updated L2CAP to conform to latest Errata Service Release
 - Fix binding to non-removable BCM43430 devices

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
David S. Miller [Sat, 20 Oct 2018 19:32:44 +0000 (12:32 -0700)]
Merge git://git./linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next tree:

1) Use lockdep_is_held() in ipset_dereference_protected(), from Lance Roy.

2) Remove unused variable in cttimeout, from YueHaibing.

3) Add ttl option for nft_osf, from Fernando Fernandez Mancera.

4) Use xfrm family to deal with IPv6-in-IPv4 packets from nft_xfrm,
   from Florian Westphal.

5) Simplify xt_osf_match_packet().

6) Missing ct helper alias definition in snmp_trap helper, from Taehee Yoo.

7) Remove unnecessary parameter in nf_flow_table_cleanup(), from Taehee Yoo.

8) Remove unused variable definitions in nft_{dup,fwd}, from Weongyo Jeong.

9) Remove empty net/netfilter/nfnetlink_log.h file, from Taehee Yoo.

10) Revert xt_quota updates remain option due to problems in the listing
    path for 32-bit arches, from Maze.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Greg Kroah-Hartman [Sat, 20 Oct 2018 13:04:23 +0000 (15:04 +0200)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Ingo writes:
  "x86 fixes:

   It's 4 misc fixes, 3 build warning fixes and 3 comment fixes.

   In hindsight I'd have left out the 3 comment fixes to make the pull
   request look less scary at such a late point in the cycle. :-/"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/swiotlb: Enable swiotlb for > 4GiG RAM on 32-bit kernels
  x86/fpu: Fix i486 + no387 boot crash by only saving FPU registers on context switch if there is an FPU
  x86/fpu: Remove second definition of fpu in __fpu__restore_sig()
  x86/entry/64: Further improve paranoid_entry comments
  x86/entry/32: Clear the CS high bits
  x86/boot: Add -Wno-pointer-sign to KBUILD_CFLAGS
  x86/time: Correct the attribute on jiffies' definition
  x86/entry: Add some paranoid entry/exit CR3 handling comments
  x86/percpu: Fix this_cpu_read()
  x86/tsc: Force inlining of cyc2ns bits

3 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Greg Kroah-Hartman [Sat, 20 Oct 2018 13:03:45 +0000 (15:03 +0200)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Ingo writes:
  "scheduler fixes:

   Two fixes: a CFS-throttling bug fix, and an interactivity fix."

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix the min_vruntime update logic in dequeue_entity()
  sched/fair: Fix throttle_list starvation with low CFS quota