linux/net
Vasily Averin 425b9c7f51 memcg: accounting for objects allocated for new netdevice
Creating a new netdevice allocates at least ~50Kb of memory for various
kernel objects, but only ~5Kb of them are accounted to memcg. As a result,
creating an unlimited number of netdevice inside a memcg-limited container
does not fall within memcg restrictions, consumes a significant part
of the host's memory, can cause global OOM and lead to random kills of
host processes.

The main consumers of non-accounted memory are:
 ~10Kb   80+ kernfs nodes
 ~6Kb    ipv6_add_dev() allocations
  6Kb    __register_sysctl_table() allocations
  4Kb    neigh_sysctl_register() allocations
  4Kb    __devinet_sysctl_register() allocations
  4Kb    __addrconf_sysctl_register() allocations

Accounting of these objects allows to increase the share of memcg-related
memory up to 60-70% (~38Kb accounted vs ~54Kb total for dummy netdevice
on typical VM with default Fedora 35 kernel) and this should be enough
to somehow protect the host from misuse inside container.

Other related objects are quite small and may not be taken into account
to minimize the expected performance degradation.

It should be separately mentonied ~300 bytes of percpu allocation
of struct ipstats_mib in snmp6_alloc_dev(), on huge multi-cpu nodes
it can become the main consumer of memory.

This patch does not enables kernfs accounting as it affects
other parts of the kernel and should be discussed separately.
However, even without kernfs, this patch significantly improves the
current situation and allows to take into account more than half
of all netdevice allocations.

Signed-off-by: Vasily Averin <vvs@openvz.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/354a0a5f-9ec3-a25c-3215-304eab2157bc@openvz.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04 19:16:46 -07:00
..
6lowpan net: don't include ndisc.h from ipv6.h 2022-02-04 14:15:11 -08:00
9p xen/grant-table: remove readonly parameter from functions 2022-03-15 20:34:40 -05:00
802 net: 802: Use memset_startat() to clear struct fields 2021-11-19 11:23:23 +00:00
8021q vlan: use correct format characters 2022-03-17 16:34:49 -07:00
appletalk net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
atm net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
ax25 net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
batman-adv batman-adv: Use netif_rx(). 2022-03-07 11:40:41 +00:00
bluetooth net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
bpf Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-28 13:02:01 -07:00
bpfilter uaccess: remove CONFIG_SET_FS 2022-02-25 09:36:06 +01:00
bridge Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-28 13:02:01 -07:00
caif net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
can net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
ceph libceph: drop else branches in prepare_read_data{,_cont} 2022-03-01 18:26:36 +01:00
core memcg: accounting for objects allocated for new netdevice 2022-05-04 19:16:46 -07:00
dcb net: dcb: disable softirqs in dcbnl_flush_dev() 2022-03-03 08:01:55 -08:00
dccp ipv4: Avoid using RTO_ONLINK with ip_route_connect(). 2022-04-22 13:06:03 +01:00
decnet net: decnet: use time_is_before_jiffies() instead of open coding it 2022-02-28 13:21:32 +00:00
dns_resolver
dsa Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-28 13:02:01 -07:00
ethernet net: ethernet: set default assignment identifier to NET_NAME_ENUM 2022-04-07 21:04:03 -07:00
ethtool ethtool: Add 10base-T1L link mode entry 2022-05-01 17:45:35 +01:00
hsr net: add per-cpu storage and net->core_stats 2022-03-11 23:17:24 -08:00
ieee802154 net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
ife
ipv4 memcg: accounting for objects allocated for new netdevice 2022-05-04 19:16:46 -07:00
ipv6 memcg: accounting for objects allocated for new netdevice 2022-05-04 19:16:46 -07:00
iucv net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
kcm net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
key net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
l2tp net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
l3mdev l3mdev: l3mdev_master_upper_ifindex_by_index_rcu should be using netdev_master_upper_dev_get_rcu 2022-04-15 14:27:24 -07:00
lapb
llc llc: only change llc->dev when bind() succeeds 2022-03-25 16:55:41 -07:00
mac80211 wireless-next patches for v5.19 2022-05-03 17:27:51 -07:00
mac802154 net: mac802154: Fix symbol durations 2022-04-30 20:29:47 +02:00
mctp net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
mpls net: mpls: fix memdup.cocci warning 2022-04-07 21:06:41 -07:00
mptcp mptcp: netlink: allow userspace-driven subflow establishment 2022-05-04 10:49:32 +01:00
ncsi all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate 2022-01-15 08:47:31 -08:00
netfilter net: sysctl: introduce sysctl SYSCTL_THREE 2022-05-03 10:15:06 +02:00
netlabel netlabel: fix out-of-bounds memory accesses 2022-03-21 10:59:11 +00:00
netlink Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-22 09:56:00 +02:00
netrom net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
nfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-15 09:26:00 +02:00
nsh
openvswitch openvswitch: fix OOB access in reserve_sfa_size() 2022-04-15 11:50:02 +01:00
packet net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
phonet net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
psample
qrtr net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
rds Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-12-16 16:13:19 -08:00
rfkill rfkill: make new event layout opt-in 2022-03-18 13:09:17 +02:00
rose net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
rxrpc rxrpc: Restore removed timer deletion 2022-04-15 10:54:49 +01:00
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-22 09:56:00 +02:00
sctp net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
smc net/smc: Fix slab-out-of-bounds issue in fallback 2022-04-25 11:03:48 -07:00
strparser bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding 2021-11-09 01:05:28 +01:00
sunrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-15 09:26:00 +02:00
switchdev net: switchdev: remove lag_mod_cb from switchdev_handle_fdb_event_to_device 2022-02-24 21:31:43 -08:00
tipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-23 10:53:49 -07:00
tls Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-28 13:02:01 -07:00
unix net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
vmw_vsock vsock/virtio: add support for device suspend/resume 2022-05-02 16:04:34 -07:00
wireless wireless-next patches for v5.19 2022-05-03 17:27:51 -07:00
x25 net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
xdp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-28 13:02:01 -07:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-22 09:56:00 +02:00
Kconfig page_pool: Add allocation stats 2022-03-03 09:55:28 +00:00
Kconfig.debug net: add networking namespace refcount tracker 2021-12-10 06:38:26 -08:00
Makefile
compat.c
devres.c
socket.c net: SO_RCVMARK socket option for SO_MARK with recvmsg() 2022-04-28 13:08:15 -07:00
sysctl_net.c sections: move and rename core_kernel_data() to is_kernel_core_data() 2021-11-09 10:02:50 -08:00