linux

mirror of https://github.com/hardkernel/linux.git synced 2026-06-08 03:40:35 +09:00

Author	SHA1	Message	Date
Martijn Coenen	2a4a4e0587	android: binder: move global binder state into context struct. This change moves all global binder state into the context struct, thereby completely separating the state and the locks between two different contexts. The debugfs entries remain global, printing entries from all contexts. Change-Id: If8e3e2bece7bc6f974b66fbcf1d91d529ffa62f0 Signed-off-by: Martijn Coenen <maco@google.com>	2017-04-10 13:12:16 +05:30
Martijn Coenen	687c683aa6	android: binder: add padding to binder_fd_array_object. binder_fd_array_object starts with a 4-byte header, followed by a few fields that are 8 bytes when ANDROID_BINDER_IPC_32BIT=N. This can cause alignment issues in a 64-bit kernel with a 32-bit userspace, as on x86_32 an 8-byte primitive may be aligned to a 4-byte address. Pad with a __u32 to fix this. Change-Id: I4374ed2cc3ccd3c6a1474cb7209b53ebfd91077b Signed-off-by: Martijn Coenen <maco@android.com>	2017-04-10 13:12:16 +05:30
Martijn Coenen	0ffb1bdf34	binder: use group leader instead of open thread The binder allocator assumes that the thread that called binder_open will never die for the lifetime of that proc. That thread is normally the group_leader, however it may not be. Use the group_leader instead of current. Bug: 35707103 Test: Created test case to open with temporary thread Change-Id: Id693f74b3591f3524a8c6e9508e70f3e5a80c588 Signed-off-by: Todd Kjos <tkjos@google.com> Signed-off-by: Martijn Coenen <maco@android.com>	2017-04-10 13:12:16 +05:30
Subash Abhinov Kasiviswanathan	216682ad26	nf: IDLETIMER: Use fullsock when querying uid sock_i_uid() acquires the sk_callback_lock which does not exist for sockets in TCP_NEW_SYN_RECV state. This results in errors showing up as spinlock bad magic. Fix this by looking for the full sock as suggested by Eric. Callstack for reference - -003\|rwlock_bug -004\|arch_read_lock -004\|do_raw_read_lock -005\|raw_read_lock_bh -006\|sock_i_uid -007\|from_kuid_munged(inline) -007\|reset_timer -008\|idletimer_tg_target -009\|ipt_do_table -010\|iptable_mangle_hook -011\|nf_iterate -012\|nf_hook_slow -013\|NF_HOOK_COND(inline) -013\|ip_output -014\|ip_local_out -015\|ip_build_and_send_pkt -016\|tcp_v4_send_synack -017\|atomic_sub_return(inline) -017\|reqsk_put(inline) -017\|tcp_conn_request -018\|tcp_v4_conn_request -019\|tcp_rcv_state_process -020\|tcp_v4_do_rcv -021\|tcp_v4_rcv -022\|ip_local_deliver_finish -023\|NF_HOOK_THRESH(inline) -023\|NF_HOOK(inline) -023\|ip_local_deliver -024\|ip_rcv_finish -025\|NF_HOOK_THRESH(inline) -025\|NF_HOOK(inline) -025\|ip_rcv -026\|deliver_skb(inline) -026\|deliver_ptype_list_skb(inline) -026\|__netif_receive_skb_core -027\|__netif_receive_skb -028\|netif_receive_skb_internal -029\|netif_receive_skb Change-Id: Ic8f3a3d2d7af31434d1163b03971994e2125d552 Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Eric Dumazet <edumazet@google.com>	2017-04-10 13:12:16 +05:30
Subash Abhinov Kasiviswanathan	f2688d5b1c	nf: IDLETIMER: Fix use after free condition during work schedule_work(&timer->work) appears to be called after cancel_work_sync(&info->timer->work) is completed. Work can be scheduled from the PM_POST_SUSPEND notification event even after cancel_work_sync is called. Call stack -004\|notify_netlink_uevent( \| [X19] timer = 0xFFFFFFC0A5DFC780 -> ( \| ... \| [NSD:0xFFFFFFC0A5DFC800] kobj = 0x6B6B6B6B6B6B6B6B, \| [NSD:0xFFFFFFC0A5DFC868] timeout = 0x6B6B6B6B, \| [NSD:0xFFFFFFC0A5DFC86C] refcnt = 0x6B6B6B6B, \| [NSD:0xFFFFFFC0A5DFC870] work_pending = 0x6B, \| [NSD:0xFFFFFFC0A5DFC871] send_nl_msg = 0x6B, \| [NSD:0xFFFFFFC0A5DFC872] active = 0x6B, \| [NSD:0xFFFFFFC0A5DFC874] uid = 0x6B6B6B6B, \| [NSD:0xFFFFFFC0A5DFC878] suspend_time_valid = 0x6B)) -005\|idletimer_tg_work( -006\|__read_once_size(inline) -006\|static_key_count(inline) -006\|static_key_false(inline) -006\|trace_workqueue_execute_end(inline) -006\|process_one_work( -007\|worker_thread( -008\|kthread( -009\|ret_from_fork(asm) ---\|end of frame Force any pending idletimer_tg_work() to complete before freeing the associated work struct and after unregistering to the pm_notifier callback. Change-Id: I4c5f0a1c142f7d698c092cf7bcafdb0f9fbaa9c1 Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>	2017-04-10 13:12:16 +05:30
Greg Hackmann	6d2d31fe1b	ANDROID: dm: android-verity: fix table_make_digest() error handling If table_make_digest() fails, verify_verity_signature() would try to pass the returned ERR_PTR() to kfree(). This fixes the smatch error: drivers/md/dm-android-verity.c:601 verify_verity_signature() error: 'pks' dereferencing possible ERR_PTR() Change-Id: I9b9b7764b538cb4a5f94337660e9b0f149b139be Signed-off-by: Greg Hackmann <ghackmann@google.com>	2017-04-10 13:12:16 +05:30
Anson Jacob	e5a34995d0	ANDROID: usb: gadget: function: Fix commenting style Fix checkpatch.pl warning: Block comments use * on subsequent lines Change-Id: I9c92f128fdb3aeeb6ab9c7039e11f857bebb9539 Signed-off-by: Anson Jacob <ansonjacob.aj@gmail.com>	2017-04-10 13:12:16 +05:30
Chris Redpath	3894650a05	cpufreq: interactive governor drops bits in time calculation Keep time calculation in 64-bit throughout. If we have long times between idle calculations this can result in deltas > 32 bits which causes incorrect load percentage calculations and selecting the wrong frequencies if we truncate here. Signed-off-by: Chris Redpath <chris.redpath@arm.com>	2017-04-10 13:12:16 +05:30
Daniel Rosenberg	ba832b0760	ANDROID: sdcardfs: support direct-IO (DIO) operations This comes from the wrapfs patch 2e346c83b26e Wrapfs: support direct-IO (DIO) operations Signed-off-by: Li Mengyang <li.mengyang@stonybrook.edu> Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Signed-off-by: Daniel Rosenberg <drosen@google.com> Bug: 34133558 Change-Id: I3fd779c510ab70d56b1d918f99c20421b524cdc4	2017-04-10 13:12:16 +05:30
Daniel Rosenberg	b20531a034	ANDROID: sdcardfs: implement vm_ops->page_mkwrite This comes from the wrapfs patch 3dfec0ffe5e2 Wrapfs: implement vm_ops->page_mkwrite Some file systems (e.g., ext4) require it. Reported by Ted Ts'o. Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Signed-off-by: Daniel Rosenberg <drosen@google.com> Bug: 34133558 Change-Id: I1a389b2422c654a6d3046bb8ec3e20511aebfa8e	2017-04-10 13:12:16 +05:30
Daniel Rosenberg	bab06182f0	ANDROID: sdcardfs: Don't bother deleting freelist There is no point deleting entries from dlist, as that is a temporary list on the stack that contains only entries that are being deleted. Not all code paths set up dlist, so those that don't were performing invalid accesses in hash_del_rcu. As an additional means to prevent any other issue, we null out the list entries when we allocate from the cache. Signed-off-by: Daniel Rosenberg <drosen@google.com> Bug: 35666680 Change-Id: Ibb1e28c08c3a600c29418d39ba1c0f3db3bf31e5	2017-04-10 13:12:16 +05:30
Daniel Rosenberg	ddce85b238	ANDROID: sdcardfs: Add missing path_put "ANDROID: sdcardfs: Add GID Derivation to sdcardfs" introduced an unbalanced path_get, leading to storage space not being freed after deleting a file until rebooting. This adds the missing path_put. Signed-off-by: Daniel Rosenberg <drosen@google.com> Bug: 34691169 Change-Id: Ia7ef97ec2eca2c555cc06b235715635afc87940e	2017-04-10 13:12:16 +05:30
Daniel Rosenberg	9068b415fe	ANDROID: sdcardfs: Fix incorrect hash This adds back the hash calculation removed as part of the previous patch, as it is in fact necessary. Signed-off-by: Daniel Rosenberg <drosen@google.com> Bug: 35307857 Change-Id: Ie607332bcf2c5d2efdf924e4060ef3f576bf25dc	2017-04-10 13:12:16 +05:30
Eric Biggers	e178cb830d	ANDROID: ext4: add a non-reversible key derivation method Add a new per-file key derivation method to ext4 encryption defined as: derived_key[0:127] = AES-256-ENCRYPT(master_key[0:255], nonce) derived_key[128:255] = AES-256-ENCRYPT(master_key[0:255], nonce ^ 0x01) derived_key[256:383] = AES-256-ENCRYPT(master_key[256:511], nonce) derived_key[384:511] = AES-256-ENCRYPT(master_key[256:511], nonce ^ 0x01) ... where the derived key and master key are both 512 bits, the nonce is 128 bits, AES-256-ENCRYPT takes the arguments (key, plaintext), and 'nonce ^ 0x01' denotes flipping the low order bit of the last byte. The existing key derivation method is 'derived_key = AES-128-ECB-ENCRYPT(key=nonce, plaintext=master_key)'. We want to make this change because currently, given a derived key you can easily compute the master key by computing 'AES-128-ECB-DECRYPT(key=nonce, ciphertext=derived_key)'. This was formerly OK because the previous threat model assumed that the master key and derived keys are equally hard to obtain by an attacker. However, we are looking to move the master key into secure hardware in some cases, so we want to make sure that an attacker with access to a derived key cannot compute the master key. We are doing this instead of increasing the nonce to 512 bits because it's important that the per-file xattr fit in the inode itself. By default, inodes are 256 bytes, and on Android we're already pretty close to that limit. If we increase the nonce size, we end up allocating a new filesystem block for each and every encrypted file, which has a substantial performance and disk utilization impact. Another option considered was to use the HMAC-SHA512 of the nonce, keyed by the master key. However this would be a little less performant, would be less extensible to other key sizes and MAC algorithms, and would pull in a dependency (security-wise and code-wise) on SHA-512. 
Due to the use of "aes" rather than "ecb(aes)" in the implementation, the new key derivation method is actually about twice as fast as the old one, though the old one could be optimized similarly as well. This patch makes the new key derivation method be used whenever HEH is used to encrypt filenames. Although these two features are logically independent, it was decided to bundle them together for now. Note that neither feature is upstream yet, and it cannot be guaranteed that the on-disk format won't change if/when these features are upstreamed. For this reason, and as noted in the previous patch, the features are both behind a special mode number for now. Signed-off-by: Eric Biggers <ebiggers@google.com> Change-Id: Iee4113f57e59dc8c0b7dc5238d7003c83defb986	2017-04-10 13:12:16 +05:30
Eric Biggers	0516ccdb14	ANDROID: ext4: allow encrypting filenames using HEH algorithm Update ext4 encryption to allow filenames to be encrypted using the Hash-Encrypt-Hash (HEH) block cipher mode of operation, which is believed to be more secure than CBC, particularly within the constant initialization vector (IV) constraint of filename encryption. Notably, HEH avoids the "common prefix" problem of CBC. Both algorithms use AES-256 as the underlying block cipher and take a 256-bit key. We assign mode number 126 to HEH, just below 127 (EXT4_ENCRYPTION_MODE_PRIVATE) which in some kernels is reserved for inline encryption on MSM chipsets. Note that these modes are not yet upstream, which is why these numbers are being used; it's preferable to avoid collisions with modes that may be added upstream. Also, although HEH is not hardware-specific, we aren't currently reserving mode number 5 for HEH upstream, since for now we are tying HEH to the new key derivation method which might become an independent flag upstream, and there's also a chance that details of HEH will change after it gets wider review. Bug: 32975945 Signed-off-by: Eric Biggers <ebiggers@google.com> Change-Id: I81418709d47da0e0ac607ae3f91088063c2d5dd4	2017-04-10 13:12:16 +05:30
Eric Biggers	830a526837	ANDROID: arm64/crypto: add ARMv8-CE optimized poly_hash algorithm poly_hash is part of the HEH (Hash-Encrypt-Hash) encryption mode, proposed in Internet Draft https://tools.ietf.org/html/draft-cope-heh-01. poly_hash is very similar to GHASH; besides the swapping of the last two coefficients which we opted to handle in the HEH template, poly_hash just uses a different finite field representation. As with GHASH, poly_hash becomes much faster and more secure against timing attacks when implemented using carryless multiplication instructions instead of tables. This patch adds an ARMv8-CE optimized version of poly_hash, based roughly on the existing ARMv8-CE optimized version of GHASH. Benchmark results are shown below, but note that the resistance to timing attacks may be even more important than the performance gain. poly_hash only: poly_hash-generic: 1,000,000 setkey() takes 1185 ms hashing is 328 MB/s poly_hash-ce: 1,000,000 setkey() takes 8 ms hashing is 1756 MB/s heh(aes) with 4096-byte inputs (this is the ideal case, as the improvement is less significant with smaller inputs): encryption with "heh_base(cmac(aes-ce),poly_hash-generic,ecb-aes-ce)": 118 MB/s decryption with "heh_base(cmac(aes-ce),poly_hash-generic,ecb-aes-ce)": 120 MB/s encryption with "heh_base(cmac(aes-ce),poly_hash-ce,ecb-aes-ce)": 291 MB/s decryption with "heh_base(cmac(aes-ce),poly_hash-ce,ecb-aes-ce)": 293 MB/s Bug: 32508661 Signed-off-by: Eric Biggers <ebiggers@google.com> Change-Id: I621ec0e1115df7e6f5cbd7e864a4a9d8d2e94cf2	2017-04-10 13:12:16 +05:30
Eric Biggers	eba753c13d	ANDROID: crypto: heh - factor out poly_hash algorithm Factor most of poly_hash() out into its own keyed hash algorithm so that optimized architecture-specific implementations of it will be possible. For now we call poly_hash through the shash API, since HEH already had an example of using shash for another algorithm (CMAC), and we will not be adding any poly_hash implementations that require ahash yet. We can however switch to ahash later if it becomes useful. Bug: 32508661 Signed-off-by: Eric Biggers <ebiggers@google.com> Change-Id: I8de54ddcecd1d7fa6e9842a09506a08129bae0b6	2017-04-10 13:12:16 +05:30
Alex Cope	a2f9a7b9fe	ANDROID: crypto: heh - Add Hash-Encrypt-Hash (HEH) algorithm Hash-Encrypt-Hash (HEH) is a proposed block cipher mode of operation which extends the strong pseudo-random permutation property of block ciphers (e.g. AES) to arbitrary length input strings. This provides a stronger notion of security than existing block cipher modes of operation (e.g. CBC, CTR, XTS), though it is usually less performant. It uses two keyed invertible hash functions with a layer of ECB encryption applied in-between. The algorithm is currently specified by the following Internet Draft: https://tools.ietf.org/html/draft-cope-heh-01 This patch adds HEH as a symmetric cipher only. Support for HEH as an AEAD is not yet implemented. HEH will use an existing accelerated ecb(block_cipher) implementation for the encrypt step if available. Accelerated versions of the hash step are planned but will be left for later patches. This patch backports HEH to the 4.4 Android kernel, initially for use by ext4 filenames encryption. Note that HEH is not yet upstream; however, patches have been made available on linux-crypto, and as noted there is also a draft specification available. This backport required updating the code to conform to the legacy ablkcipher API rather than the skcipher API, which wasn't complete in 4.4. Signed-off-by: Alex Cope <alexcope@google.com> Bug: 32975945 Signed-off-by: Eric Biggers <ebiggers@google.com> Change-Id: I945bcc9c0115916824d701bae91b86e3f059a1a9	2017-04-10 13:12:16 +05:30
Alex Cope	2c3ce60584	ANDROID: crypto: gf128mul - Add ble multiplication functions Adding ble multiplication to GF128mul, and fixing up comments. The ble multiplication functions multiply GF(2^128) elements in the ble format. This format is preferable because the bits within each byte map to polynomial coefficients in the natural order (lowest order bit = coefficient of lowest degree polynomial term), and the bytes are stored in little endian order which matches the endianness of most modern CPUs. These new functions will be used by the HEH algorithm. Signed-off-by: Alex Cope <alexcope@google.com> Bug: 32975945 Signed-off-by: Eric Biggers <ebiggers@google.com> Change-Id: I39a58e8ee83e6f9b2e6bd51738f816dbfa2f3a47	2017-04-10 13:12:16 +05:30
Eric Biggers	65513b9e83	ANDROID: crypto: gf128mul - Refactor gf128 overflow macros and tables Rename and clean up the GF(2^128) overflow macros and tables. Their usage is more general than the name suggested, e.g. what was previously known as the "bbe" table can actually be used for both "bbe" and "ble" multiplication. Bug: 32975945 Signed-off-by: Eric Biggers <ebiggers@google.com> Change-Id: Ie6c47b4075ca40031eb1767e9b468cfd7bf1b2e4	2017-04-10 13:12:16 +05:30
Alex Cope	a2d2edaf30	UPSTREAM: crypto: gf128mul - Zero memory when freeing multiplication table GF(2^128) multiplication tables are typically used for secret information, so it's a good idea to zero them on free. Signed-off-by: Alex Cope <alexcope@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> (cherry-picked from `75aa0a7caf`) Bug: 32975945 Signed-off-by: Eric Biggers <ebiggers@google.com> Change-Id: I37b1ae9544158007f9ee2caf070120f4a42153ab	2017-04-10 13:12:16 +05:30
Eric Biggers	f9e82f17e1	ANDROID: crypto: shash - Add crypto_grab_shash() and crypto_spawn_shash_alg() Analogous to crypto_grab_skcipher() and crypto_spawn_skcipher_alg(), these are useful for algorithms that need to use a shash sub-algorithm, possibly in addition to other sub-algorithms. Bug: 32975945 Signed-off-by: Eric Biggers <ebiggers@google.com> Change-Id: I44e5a519d73f5f839e3b6ecbf8c66e36ec569557	2017-04-10 13:12:16 +05:30
Eric Biggers	c1bba8c93f	ANDROID: crypto: allow blkcipher walks over ablkcipher data Add a function blkcipher_ablkcipher_walk_virt() which allows ablkcipher algorithms to use the blkcipher_walk API to walk over their data. This will be used by the HEH algorithm, which to support asynchronous ECB algorithms will be an ablkcipher, but it also needs to make other passes over the data. Bug: 32975945 Signed-off-by: Eric Biggers <ebiggers@google.com> Change-Id: I05f9a0e5473ba6115fcc72d5122d6b0b18b2078b	2017-04-10 13:12:16 +05:30
Jeremy Linton	ea6520c1c8	UPSTREAM: arm/arm64: crypto: assure that ECB modes don't require an IV ECB modes don't use an initialization vector. The kernel /proc/crypto interface doesn't reflect this properly. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> (cherry picked from `bee038a4bd`) Signed-off-by: Eric Biggers <ebiggers@google.com> Change-Id: Ief9558d2b41be58a2d845d2033a141b5ef7b585f	2017-04-10 13:12:16 +05:30
Mohan Srinivasan	db94337285	ANDROID: Refactor fs readpage/write tracepoints. Refactor the fs readpage/write tracepoints to move the inode->path lookup outside the tracepoint code, and pass a pointer to the path into the tracepoint code instead. This is necessary because the tracepoint code runs non-preemptible. Thanks to Trilok Soni for catching this in 4.4. Change-Id: I7486c5947918d155a30c61d6b9cd5027cf8fbe15 Signed-off-by: Mohan Srinivasan <srmohan@google.com>	2017-04-10 13:12:16 +05:30
Adrien Schildknecht	0c2f831ee7	Squashfs: optimize reading uncompressed data When dealing with uncompressed data, there is no need to read a whole block (default 128K) to get the desired page: the pages are independent of each other. This patch changes the readpages logic so that reading uncompressed data only reads the number of pages advised by the readahead algorithm. Moreover, if the page actor contains holes (i.e. pages that are already up-to-date), squashfs skips the buffer_head associated with those pages. This patch greatly improves the performance of random reads for uncompressed files because squashfs only reads what is needed. It also reduces the number of unnecessary reads. Signed-off-by: Adrien Schildknecht <adriens@google.com> Change-Id: I1850150fbf4b45c9dd138d88409fea1ab44054c0	2017-04-10 13:12:16 +05:30
Adrien Schildknecht	9c6d9abc8f	Squashfs: implement .readpages() Squashfs does not implement .readpages(), so the kernel just repeatedly calls .readpage(). The readpages function tries to pack as many pages as possible in the same page actor so that only 1 read request is issued. Now that the read requests are asynchronous, the kernel can truly prefetch pages using its readahead algorithm. Signed-off-by: Adrien Schildknecht <adriens@google.com> Change-Id: Ice70e029dc24526f61e4e5a1a902588be2212498	2017-04-10 13:12:16 +05:30
Adrien Schildknecht	832771453f	Squashfs: replace buffer_head with BIO The 'll_rw_block' has been deprecated and BIO is now the basic container for block I/O within the kernel. Switching to BIO offers 2 advantages: 1/ It removes synchronous wait for the up-to-date buffers: SquashFS now deals with decompressions/copies asynchronously. Implementing an asynchronous mechanism to read data is needed to efficiently implement .readpages(). 2/ Prior to this patch, merging the read requests entirely depends on the IO scheduler. SquashFS has more information than the IO scheduler about what could be merged. Moreover, merging the reads at the FS level means that we rely less on the IO scheduler. Signed-off-by: Adrien Schildknecht <adriens@google.com> Change-Id: I775d2e11f017476e1899518ab52d9d0a8a0bce28	2017-04-10 13:12:16 +05:30
Adrien Schildknecht	4bc7d97903	Squashfs: refactor page_actor This patch essentially does 3 things: 1/ Always use an array of pages to store the data instead of a mix of buffers and pages. 2/ It is now possible to have 'holes' in a page actor, i.e. NULL pages in the array. When reading a block (default 128K), squashfs tries to grab all the pages covering this block. If a single page is up-to-date or locked, it falls back to using an intermediate buffer to do the read and then copy the pages in the actor. Allowing holes in the page actor removes the need for this intermediate buffer. 3/ Refactor the wrappers to share code that deals with page actors. Signed-off-by: Adrien Schildknecht <adriens@google.com> Change-Id: I98128bed5d518cf31b67e788a85b275e9a323bec	2017-04-10 13:12:16 +05:30
Adrien Schildknecht	3de0af4df5	Squashfs: remove the FILE_CACHE option FILE_DIRECT is working fine and offers faster results and lower memory footprint. Removing FILE_CACHE makes our life easier because we don't have to maintain 2 different functions that do the same thing. Signed-off-by: Adrien Schildknecht <adriens@google.com> Change-Id: I6689ba74d0042c222a806f9edc539995e8e04c6b	2017-04-10 13:12:16 +05:30
Sami Tolvanen	2445eaaabe	ANDROID: android-recommended.cfg: CONFIG_CPU_SW_DOMAIN_PAN=y Bug: 31374660 Change-Id: Id2710a5fa2694da66d3f34cbcc0c2a58a006cec5 Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2017-04-10 13:12:16 +05:30
Cong Wang	89613b1da8	FROMLIST: 9p: fix a potential acl leak (https://lkml.org/lkml/2016/12/13/579) posix_acl_update_mode() could possibly clear 'acl', if so we leak the memory pointed by 'acl'. Save this pointer before calling posix_acl_update_mode() and release the memory if 'acl' really gets cleared. Reported-by: Mark Salyzyn <salyzyn@android.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Greg Kurz <groug@kaod.org> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@sandia.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Bug: 32458736 Change-Id: Ia78da401e6fd1bfd569653bd2cd0ebd3f9c737a0	2017-04-10 13:12:16 +05:30
Pratyush Anand	2f090863ff	UPSTREAM: arm64: Allow hw watchpoint of length 3,5,6 and 7 (cherry picked from commit `0ddb8e0b78`) Since arm64 can support all offsets within a double-word limit, now support other lengths within that range as well. Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Pavel Labath <labath@google.com> Change-Id: Ibcb263a3903572336ccbf96e0180d3990326545a Bug: 30919905	2017-04-10 13:12:16 +05:30
Pavel Labath	98912f6825	BACKPORT: arm64: hw_breakpoint: Handle inexact watchpoint addresses (cherry picked from commit `fdfeff0f9e`) Arm64 hardware does not always report a watchpoint hit address that matches one of the watchpoints set. It can also report an address "near" the watchpoint if a single instruction accesses both watched and unwatched addresses. There is no straightforward way, short of disassembling the offending instruction, to map that address back to the watchpoint. Previously, when the hardware reported a watchpoint hit on an address that did not match our watchpoint (this happens in case of instructions which access large chunks of memory such as "stp") the process would enter a loop where we would be continually resuming it (because we did not recognise that watchpoint hit) and it would keep hitting the watchpoint again and again. The tracing process would never get notified of the watchpoint hit. This commit fixes the problem by looking at the watchpoints near the address reported by the hardware. If the address does not exactly match one of the watchpoints we have set, it attributes the hit to the nearest watchpoint we have. This heuristic is a bit dodgy, but I don't think we can do much more, given the hardware limitations. Signed-off-by: Pavel Labath <labath@google.com> [panand: reworked to rebase on his patches] Signed-off-by: Pratyush Anand <panand@redhat.com> [will: use __ffs instead of ffs - 1] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Pavel Labath <labath@google.com> [pavel: trivial fixup in hw_breakpoint.c:watchpoint_handler] Change-Id: I714dfaa3947d89d89a9e9a1ea84914d44ba0faa3 Bug: 30919905	2017-04-10 13:12:16 +05:30
Pratyush Anand	29dbff68c9	UPSTREAM: arm64: Allow hw watchpoint at varied offset from base address ARM64 hardware supports watchpoint at any double word aligned address. However, it can select any consecutive bytes from offset 0 to 7 from that base address. For example, if base address is programmed as 0x420030 and byte select is 0x1C, then access of 0x420032,0x420033 and 0x420034 will generate a watchpoint exception. Currently, we do not have such modularity. We can only program byte, halfword, word and double word access exception from any base address. This patch adds support to overcome above limitations. Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Pavel Labath <labath@google.com> Change-Id: I28b1ca63f63182c10c3d6b6b3bacf6c56887ddbe Bug: 30919905	2017-04-10 13:12:16 +05:30
Pratyush Anand	48e38a939e	BACKPORT: hw_breakpoint: Allow watchpoint of length 3,5,6 and 7 (cherry picked from commit `651be3cb08`) We only support breakpoints/watchpoints of length 1, 2, 4 and 8. If we can support other lengths as well, then the user may watch more data with fewer watchpoints (provided the hardware supports it). For example: if we have to watch only the 4th, 5th and 6th byte from a 64 bit aligned address, we will have to use two slots to implement it currently. One slot will watch a half word at offset 4 and the other a byte at offset 6. If we can have a watchpoint of length 3 then we can watch it with a single slot as well. ARM64 hardware does support such functionality, therefore add these new definitions in the generic layer. Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Pavel Labath <labath@google.com> [pavel: tools/include/uapi/linux/hw_breakpoint.h is not present in this branch] Change-Id: Ie17ed89ca526e4fddf591bb4e556fdfb55fc2eac Bug: 30919905	2017-04-10 13:12:16 +05:30
Alex Shi	f75c8ea7d1	Merge branch 'linux-linaro-lsk-v4.4' into linux-linaro-lsk-v4.4-android	2017-04-09 12:01:26 +08:00
Alex Shi	af15ae4785	Merge tag 'v4.4.60' into linux-linaro-lsk-v4.4 This is the 4.4.60 stable release	2017-04-09 12:01:24 +08:00
Greg Kroah-Hartman	8f8ee9706b	Linux 4.4.60	2017-04-08 09:53:53 +02:00
Jason A. Donenfeld	84bd21a708	padata: avoid race in reordering commit `de5540d088` upstream. Under extremely heavy uses of padata, crashes occur, and with list debugging turned on, this happens instead: [87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33 __list_add+0xae/0x130 [87487.301868] list_add corruption. prev->next should be next (ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00). [87487.339011] [<ffffffff9a53d075>] dump_stack+0x68/0xa3 [87487.342198] [<ffffffff99e119a1>] ? console_unlock+0x281/0x6d0 [87487.345364] [<ffffffff99d6b91f>] __warn+0xff/0x140 [87487.348513] [<ffffffff99d6b9aa>] warn_slowpath_fmt+0x4a/0x50 [87487.351659] [<ffffffff9a58b5de>] __list_add+0xae/0x130 [87487.354772] [<ffffffff9add5094>] ? _raw_spin_lock+0x64/0x70 [87487.357915] [<ffffffff99eefd66>] padata_reorder+0x1e6/0x420 [87487.361084] [<ffffffff99ef0055>] padata_do_serial+0xa5/0x120 padata_reorder calls list_add_tail with the list to which it's adding locked, which seems correct: spin_lock(&squeue->serial.lock); list_add_tail(&padata->list, &squeue->serial.list); spin_unlock(&squeue->serial.lock); This therefore leaves only one place where such inconsistency could occur: if padata->list is added at the same time on two different threads. This padata pointer comes from the function call to padata_get_next(pd), which has in it the following block: next_queue = per_cpu_ptr(pd->pqueue, cpu); padata = NULL; reorder = &next_queue->reorder; if (!list_empty(&reorder->list)) { padata = list_entry(reorder->list.next, struct padata_priv, list); spin_lock(&reorder->lock); list_del_init(&padata->list); atomic_dec(&pd->reorder_objects); spin_unlock(&reorder->lock); pd->processed++; goto out; } out: return padata; I strongly suspect that the problem here is that two threads can race on the reorder list. 
Even though the deletion is locked, the call to list_entry is not locked, which means it's feasible that two threads pick up the same padata object and subsequently call list_add_tail on them at the same time. The fix is thus to hoist that lock outside of that block. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 09:53:32 +02:00
NeilBrown	5cca175b6c	blk: Ensure users for current->bio_list can see the full list. commit `f5fe1b5190` upstream. Commit `79bd99596b` ("blk: improve order of bio handling in generic_make_request()") changed current->bio_list so that it did not contain all of the queued bios, but only those submitted by the currently running make_request_fn. There are two places which walk the list and requeue selected bios, and others that check if the list is empty. These are no longer correct. So redefine current->bio_list to point to an array of two lists, which contain all queued bios, and adjust various code to test or walk both lists. Signed-off-by: NeilBrown <neilb@suse.com> Fixes: `79bd99596b` ("blk: improve order of bio handling in generic_make_request()") Signed-off-by: Jens Axboe <axboe@fb.com> [jwang: backport to 4.4] Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Restore changes in device-mapper from upstream version] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>	2017-04-08 09:53:32 +02:00
NeilBrown	2cbd78f423	blk: improve order of bio handling in generic_make_request() commit `79bd99596b` upstream. To avoid recursion on the kernel stack when stacked block devices are in use, generic_make_request() will, when called recursively, queue new requests for later handling. They will be handled when the make_request_fn for the current bio completes. If any bios are submitted by a make_request_fn, these will ultimately be handled sequentially. If the handling of one of those generates further requests, they will be added to the end of the queue. This strict first-in-first-out behaviour can lead to deadlocks in various ways, normally because a request might need to wait for a previous request to the same device to complete. This can happen when they share a mempool, and can happen due to interdependencies particular to the device. Both md and dm have examples where this happens. These deadlocks can be eradicated by more selective ordering of bios. Specifically by handling them in depth-first order. That is: when the handling of one bio generates one or more further bios, they are handled immediately after the parent, before any siblings of the parent. That way, when generic_make_request() calls make_request_fn for some particular device, we can be certain that all previously submitted requests for that device have been completely handled and are not waiting for anything in the queue of requests maintained in generic_make_request(). An easy way to achieve this would be to use a last-in-first-out stack instead of a queue. However this will change the order of consecutive bios submitted by a make_request_fn, which could have unexpected consequences. Instead we take a slightly more complex approach. A fresh queue is created for each call to a make_request_fn. 
After it completes, any bios for a different device are placed on the front of the main queue, followed by any bios for the same device, followed by all bios that were already on the queue before the make_request_fn was called. This provides the depth-first approach without reordering bios on the same level. This, by itself, is not enough to remove all deadlocks. It just makes it possible for drivers to take the extra step required themselves. To avoid deadlocks, drivers must never risk waiting for a request after submitting one to generic_make_request. This includes never allocating from a mempool twice in the one call to a make_request_fn. A common pattern in drivers is to call bio_split() in a loop, handling the first part and then looping around to possibly split the next part. Instead, a driver that finds it needs to split a bio should queue (with generic_make_request) the second part, handle the first part, and then return. The new code in generic_make_request will ensure the requests to underlying bios are processed first, then the second bio that was split off. If it splits again, the same process happens. In each case one bio will be completely handled before the next one is attempted. With this in place, it should be possible to disable the punt_bios_to_recover() recovery thread for many block devices, and eventually it may be possible to remove it completely. Ref: http://www.spinics.net/lists/raid/msg54680.html Tested-by: Jinpu Wang <jinpu.wang@profitbricks.com> Inspired-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com> [jwang: backport to 4.4] Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 09:53:32 +02:00
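The ordering rule described in the commit message above can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not the kernel code: bios are modelled as `(device, label)` tuples, and the names `handle_fifo`, `handle_depth_first`, and `split_driver` are hypothetical helpers invented for this example. The toy driver follows the pattern the message recommends: it queues the second part of a split bio, then submits the first part to the underlying device, then returns.

```python
from collections import deque

# A bio is modelled as a (device, label) tuple. This is an illustrative
# assumption, not the kernel's struct bio.

def split_driver(bio):
    """Hypothetical make_request_fn: returns the bios it submits."""
    device, label = bio
    if device == "top" and "+" in label:
        first, rest = label.split("+", 1)
        # Queue the remainder on the same device, then pass the first
        # part down to the underlying device, then return.
        return [("top", rest), ("disk", first)]
    if device == "top":
        return [("disk", label)]
    return []  # the lowest-level device submits nothing further

def handle_fifo(first_bio, make_request_fn):
    """Old behaviour: newly submitted bios go to the end of the queue."""
    handled, pending = [], deque([first_bio])
    while pending:
        bio = pending.popleft()
        handled.append(bio)
        pending.extend(make_request_fn(bio))
    return handled

def handle_depth_first(first_bio, make_request_fn):
    """New behaviour: a fresh list per call, then merged so that bios for
    a different device come first, same-device bios next, and everything
    that was already queued comes last."""
    handled, pending = [], deque([first_bio])
    while pending:
        bio = pending.popleft()
        handled.append(bio)
        fresh = make_request_fn(bio)
        other = [b for b in fresh if b[0] != bio[0]]
        same = [b for b in fresh if b[0] == bio[0]]
        pending = deque(other + same + list(pending))
    return handled

if __name__ == "__main__":
    start = ("top", "A+B")
    print("fifo:       ", handle_fifo(start, split_driver))
    print("depth-first:", handle_depth_first(start, split_driver))
```

In the depth-first variant the bio for the underlying device is handled before the split-off remainder on the same device, whereas strict FIFO handles the remainder first. Neither function reorders bios submitted to the same device within one call.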
Alexandre Belloni	063d30f187	power: reset: at91-poweroff: timely shutdown LPDDR memories commit `0b0408745e` upstream. LPDDR memories can only handle up to 400 uncontrolled power-offs. Ensure the proper power off sequence is used before shutting down the platform. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Sebastian Reichel <sre@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 09:53:32 +02:00
David Hildenbrand	42462d23e6	KVM: kvm_io_bus_unregister_dev() should never fail commit `90db10434b` upstream. No caller currently checks the return value of kvm_io_bus_unregister_dev(). This is evil, as all callers silently go on freeing their device. A stale reference will remain in the io_bus, getting used again at the latest when the iobus gets torn down on kvm_destroy_vm() - leading to use after free errors. There is nothing the callers could do, except retrying over and over again. So let's simply remove the bus altogether, print an error and make sure no one can access this broken bus again (returning -ENOMEM on any attempt to access it). Fixes: `e93f8a0f82` ("KVM: convert io_bus to SRCU") Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 09:53:32 +02:00
Uwe Kleine-König	3a1246b46d	rtc: s35390a: improve irq handling commit `3bd32722c8` upstream. On some QNAP NAS devices the rtc can wake the machine. Several people noticed that once the machine was woken this way it fails to shut down. That's because the driver fails to acknowledge the interrupt and so it stays active and restarts the machine immediately after shutdown. See https://bugs.debian.org/794266 for a bug report. Doing this correctly requires interpreting the INT2 flag of the first read of the STATUS1 register because this bit is cleared by read. Note this is not maximally robust though because a pending irq isn't detected when the STATUS1 register was already read (and so INT2 is not set) but the irq was not disabled. But that is a hardware imposed problem that cannot easily be fixed by software. Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 09:53:32 +02:00
Uwe Kleine-König	a55ae9d193	rtc: s35390a: implement reset routine as suggested by the reference commit `8e6583f1b5` upstream. There were two deviations from the reference manual: you have to wait half a second when POC is active and you might have to repeat initialization when POC or BLD are still set after the sequence. Note however that as POC and BLD are cleared by read the driver might not be able to detect that a reset is necessary. I don't have a good idea how to fix this. Additionally report the value read from STATUS1 to the caller. This prepares the next patch. Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 09:53:32 +02:00
Uwe Kleine-König	fdd4bc9313	rtc: s35390a: make sure all members in the output are set The rtc core calls the .read_alarm with all fields initialized to 0. As the s35390a driver doesn't touch some fields the returned date is interpreted as a date in January 1900. So make sure all fields are set to -1; some of them are then overwritten with the right data depending on the hardware state. In mainline this is done by commit `d68778b80d` ("rtc: initialize output parameter for read alarm to "uninitialized"") in the core. This is considered too dangerous for stable as it might have side effects for other rtc drivers that might for example rely on alarm->time.tm_sec being initialized to 0. Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 09:53:32 +02:00
Uwe Kleine-König	b3ed386491	rtc: s35390a: fix reading out alarm commit `f87e904ddd` upstream. There are several issues fixed in this patch: - When alarm isn't enabled, set .enabled to zero instead of returning -EINVAL. - Ignore how IRQ1 is configured when determining if IRQ2 is on. - The three alarm registers have an enable flag which must be evaluated. - The chip always triggers when the seconds register gets 0. Note that the rtc framework however doesn't handle the result correctly because it doesn't check wday being initialized and so interprets an alarm being set for 10:00 AM in three days as 10:00 AM tomorrow (or today if that's not over yet). Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 09:53:32 +02:00
Felix Fietkau	6280ac931a	MIPS: Lantiq: Fix cascaded IRQ setup commit `6c356eda22` upstream. With the IRQ stack changes integrated, the XRX200 devices started emitting a constant stream of kernel messages like this: [ 565.415310] Spurious IRQ: CAUSE=0x1100c300 This is caused by IP0 getting handled by plat_irq_dispatch() rather than its vectored interrupt handler, which is fixed by commit de856416e714 ("MIPS: IRQ Stack: Fix erroneous jal to plat_irq_dispatch"). Fix plat_irq_dispatch() to handle non-vectored IPI interrupts correctly by setting up IP2-6 as proper chained IRQ handlers and calling do_IRQ for all MIPS CPU interrupts. Signed-off-by: Felix Fietkau <nbd@nbd.name> Acked-by: John Crispin <john@phrozen.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15077/ [james.hogan@imgtec.com: tweaked commit message] Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 09:53:32 +02:00
Naoya Horiguchi	47e2fe17d1	mm, hugetlb: use pte_present() instead of pmd_present() in follow_huge_pmd() commit `c9d398fa23` upstream. I found the race condition which triggers the following bug when move_pages() and soft offline are called on a single hugetlb page concurrently. Soft offlining page 0x119400 at 0x700000000000 BUG: unable to handle kernel paging request at ffffea0011943820 IP: follow_huge_pmd+0x143/0x190 PGD 7ffd2067 PUD 7ffd1067 PMD 0 [61163.582052] Oops: 0000 [#1] SMP Modules linked in: binfmt_misc ppdev virtio_balloon parport_pc pcspkr i2c_piix4 parport i2c_core acpi_cpufreq ip_tables xfs libcrc32c ata_generic pata_acpi virtio_blk 8139too crc32c_intel ata_piix serio_raw libata virtio_pci 8139cp virtio_ring virtio mii floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: cap_check] CPU: 0 PID: 22573 Comm: iterate_numa_mo Tainted: P OE 4.11.0-rc2-mm1+ #2 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:follow_huge_pmd+0x143/0x190 RSP: 0018:ffffc90004bdbcd0 EFLAGS: 00010202 RAX: 0000000465003e80 RBX: ffffea0004e34d30 RCX: 00003ffffffff000 RDX: 0000000011943800 RSI: 0000000000080001 RDI: 0000000465003e80 RBP: ffffc90004bdbd18 R08: 0000000000000000 R09: ffff880138d34000 R10: ffffea0004650000 R11: 0000000000c363b0 R12: ffffea0011943800 R13: ffff8801b8d34000 R14: ffffea0000000000 R15: 000077ff80000000 FS: 00007fc977710740(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffea0011943820 CR3: 000000007a746000 CR4: 00000000001406f0 Call Trace: follow_page_mask+0x270/0x550 SYSC_move_pages+0x4ea/0x8f0 SyS_move_pages+0xe/0x10 do_syscall_64+0x67/0x180 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x7fc976e03949 RSP: 002b:00007ffe72221d88 EFLAGS: 00000246 ORIG_RAX: 0000000000000117 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc976e03949 RDX: 0000000000c22390 RSI: 0000000000001400 RDI: 0000000000005827 RBP: 00007ffe72221e00 R08: 0000000000c2c3a0 R09: 0000000000000004 R10: 
0000000000c363b0 R11: 0000000000000246 R12: 0000000000400650 R13: 00007ffe72221ee0 R14: 0000000000000000 R15: 0000000000000000 Code: 81 e4 ff ff 1f 00 48 21 c2 49 c1 ec 0c 48 c1 ea 0c 4c 01 e2 49 bc 00 00 00 00 00 ea ff ff 48 c1 e2 06 49 01 d4 f6 45 bc 04 74 90 <49> 8b 7c 24 20 40 f6 c7 01 75 2b 4c 89 e7 8b 47 1c 85 c0 7e 2a RIP: follow_huge_pmd+0x143/0x190 RSP: ffffc90004bdbcd0 CR2: ffffea0011943820 ---[ end trace e4f81353a2d23232 ]--- Kernel panic - not syncing: Fatal exception Kernel Offset: disabled This bug is triggered when pmd_present() returns true for non-present hugetlb, so fixing the present check in follow_huge_pmd() prevents it. Using pmd_present() to determine present/non-present for hugetlb is not correct, because pmd_present() checks multiple bits (not only _PAGE_PRESENT) for historical reason and it can misjudge hugetlb state. Fixes: `e66f17ff71` ("mm/hugetlb: take page table lock in follow_huge_pmd()") Link: http://lkml.kernel.org/r/1490149898-20231-1-git-send-email-n-horiguchi@ah.jp.nec.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 09:53:32 +02:00


568088 Commits