Commit Graph

268387 Commits

Author SHA1 Message Date
David Herrmann
715ee7c3e5 HID: uhid: forward raw output reports to user-space
Some drivers that use non-standard HID features require raw output reports
sent to the device. We now forward these requests directly to user-space
so the transport-level driver can correctly send it to the device or
handle it correspondingly.

There is no way to signal back whether the transmission was successful,
moreover, there might be lots of messages coming out from the driver
flushing the output-queue. However, there is currently no driver that
causes this so we are safe. If some drivers need to transmit lots of data
this way, we need a method to synchronize this and can implement another
UHID_OUTPUT_SYNC event.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-08-16 12:44:29 -07:00
David Herrmann
bd46634b0b HID: uhid: forward output request to user-space
If the hid-driver wants to send standardized data to the device it uses a
linux input_event. We forward this to the user-space transport-level
driver so they can perform the requested action on the device.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-08-16 12:44:29 -07:00
David Herrmann
e458139e9e HID: uhid: forward open/close events to user-space
HID core notifies us with *_open/*_close callbacks when there is an actual
user of our device. We forward these to user-space so they can react on
this. This allows user-space to skip I/O unless they receive an OPEN
event. When they receive a CLOSE event they can stop I/O again to save
energy.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-08-16 12:44:28 -07:00
David Herrmann
245fd8c6b7 HID: uhid: add UHID_START and UHID_STOP events
We send UHID_START and UHID_STOP events to user-space when the HID core
starts/stops the device. This notifies user-space about driver readiness
and data-I/O can start now.

This directly forwards the callbacks from hid-core to user-space.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-08-16 12:44:28 -07:00
David Herrmann
66454600e6 HID: uhid: forward hid report-descriptor to hid core
When the uhid_hid_parse callback is called we simply forward it to
hid_parse_report() with the data that we got in the UHID_CREATE event.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-08-16 12:44:28 -07:00
David Herrmann
e0ad83067d HID: uhid: allow feeding input data into uhid devices
This adds a new event type UHID_INPUT which allows user-space to feed raw
HID reports into the HID subsystem. We copy the data into kernel memory
and directly feed it into the HID core.

There is no error handling of the events couldn't be parsed so user-space
should consider all events successfull unless read() returns an error.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-08-16 12:44:28 -07:00
David Herrmann
6b1a4c7c9a HID: uhid: add UHID_CREATE and UHID_DESTROY events
UHID_CREATE and UHID_DESTROY are used to create and destroy a device on an
open uhid char-device. Internally, we allocate and register an HID device
with the HID core and immediately start the device. From now on events may
be received or sent to the device.

The UHID_CREATE event has a payload similar to the data used by
Bluetooth-HIDP when creating a new connection.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-08-16 12:44:28 -07:00
David Herrmann
d3444a842a HID: uhid: implement write() on uhid devices
Similar to read() you can only write() a single event with one call to an
uhid device. To write multiple events use writev() which is supported by
uhid.

We currently always return -EOPNOTSUPP but other events will be added in
later patches.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-08-16 12:44:28 -07:00
David Herrmann
b8dd3d95ca HID: uhid: implement read() on uhid devices
User-space can use read() to get a single event from uhid devices. read()
does never return multiple events. This allows us to extend the event
structure and still keep backwards compatibility.

If user-space wants to get multiple events in one syscall, they should use
the readv()/writev() syscalls which are supported by uhid.

This introduces a new lock which helps us synchronizing simultaneous reads
from user-space. We also correctly return -EINVAL/-EFAULT only on errors
and retry the read() when some other thread captured the event faster than
we did.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-08-16 12:44:27 -07:00
David Herrmann
191b03260f HID: uhid: allow poll()'ing on uhid devices
As long as the internal buffer is not empty, we return POLLIN to
user-space.

uhid->head and uhid->tail are no atomics so the comparison may return
inexact results. However, this doesn't matter here as user-space would
need to poll() in two threads simultaneously to trigger this. And in this
case it doesn't matter if a cached result is returned or the exact new
result as user-space does not know which thread returns first from poll()
and the following read(). So it is safe to compare the values without
locking.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-08-16 12:44:27 -07:00
David Herrmann
ef5006bca1 HID: uhid: add internal message buffer
When receiving messages from the HID subsystem, we need to process them
and store them in an internal buffer so user-space can read() on the char
device to retrieve the messages.

This adds a static buffer for 32 messages to each uhid device. Each
message is dynamically allocated so the uhid_device structure does not get
too big.

uhid_queue() adds a message to the buffer. If the buffer is full, the
message is discarded. uhid_queue_event() is an helper for messages without
payload.

This also adds a public header: uhid.h. It contains the declarations for
the user-space API. It is built around "struct uhid_event" which contains
a type field which specifies the event type and each event can then add a
variable-length payload. For now, there is only a dummy event but later
patches will add new event types and payloads.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-08-16 12:44:27 -07:00
David Herrmann
4409981837 HID: uhid: introduce user-space I/O driver support for HID
This adds a dummy driver that will support user-space I/O drivers for the
HID subsystem. This allows to write transport-level drivers like USB-HID
and Bluetooth-HID in user-space.

Low-Energy Bluetooth needs this to feed HID data that is parsed in
user-space back into the kernel.

Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-08-16 12:44:27 -07:00
Dmitry Shmidt
efadaa2555 mmc: Make sure host is disabled on suspend
Change-Id: Ie0bf2004e173cef8dad66722a152658d7727ab65
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
2012-08-16 11:27:59 -07:00
xbw
b4996adcf8 sdmmc: modify the card is easy to run overtime due to data-busy. 2012-08-16 20:56:48 +08:00
CMY
101aadaefe 3G: Support for more 3g devices 2012-08-16 19:20:03 +08:00
ywj
2c6254864f Merge branch 'develop-3.0' of ssh://10.10.10.29/rk/kernel into develop-3.0 2012-08-16 18:23:33 +08:00
ywj
333cf5284c merge new gsensor driver 2012-08-16 17:59:08 +08:00
宋秀杰
4cf1f94c06 phonepad: codec set slave when BT incall, and set pll. 2012-08-16 17:58:48 +08:00
Zheng Yang
ab49eaa2e0 rk30 hdmi:
1. Need not take tmds_clk pull up to 3.3V as a hdmi connection condition.
	2. When parse unkown edid extensions, return false and set it as a hdmi sink.
2012-08-16 17:45:11 +08:00
黄涛
9e0aa7b6bb rga: refactor rga_drv.c for support rk30/rk31/rk2928 2012-08-16 15:55:57 +08:00
黄涛
28ef1d8c93 Input: rk29_keys: close debug print, open by 9ea1329 2012-08-16 15:46:15 +08:00
zsq
b7e56ead76 Merge branch 'develop-3.0-rk2928' of ssh://10.10.10.29/rk/kernel into develop-3.0-rk2928 2012-08-16 14:43:23 +08:00
zsq
475f235898 fix no async bug 2012-08-16 14:43:06 +08:00
ywj
170a300423 new add bmp logo for factorytool 2012-08-16 14:33:38 +08:00
ywj
1f3bc5a919 new add LCD driver for factorytool 2012-08-16 14:31:09 +08:00
ywj
c353c8fed9 new add rk30_factory_adc_battery driver for factorytool 2012-08-16 14:29:57 +08:00
ywj
03d6b5fe19 add node of rk29_backlight and rk29_keys for factorytool 2012-08-16 14:27:56 +08:00
ywj
0f08408447 add gsensor orientation node for factorytool 2012-08-16 14:25:47 +08:00
黄涛
278d2a1ab2 gpio: rk2928: enable support rk30_gpiolib_pull_updown 2012-08-16 11:51:17 +08:00
黄涛
c5066ca248 rk2928: add initial pm support 2012-08-16 09:43:53 +08:00
黄涛
7db28b5ce2 rk2928: sram: fix loop use Thumb instruction set 2012-08-16 09:22:17 +08:00
黄涛
bbc68b65d9 i2c: rk: only rk29/rk30 need idle lock 2012-08-16 09:22:16 +08:00
Greg Kroah-Hartman
a422ca75bd Linux 3.0.41 v3.0.41 2012-08-15 12:05:01 -07:00
Stanislaw Gruszka
931d5990ed rt61pci: fix NULL pointer dereference in config_lna_gain
commit deee0214de upstream.

We can not pass NULL libconf->conf->channel to rt61pci_config() as it
is dereferenced unconditionally in rt61pci_config_lna_gain() subroutine.

Resolves:
https://bugzilla.kernel.org/show_bug.cgi?id=44361

Reported-and-tested-by: <dolohow@gmail.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:30 -07:00
Chris Bagwell
ab7029e676 Input: wacom - Bamboo One 1024 pressure fix
commit 6dc463511d upstream.

Bamboo One's with ID of 0x6a and 0x6b were added with correct
indication of 1024 pressure levels but the Graphire packet routine
was only looking at 9 bits.  Increased to 10 bits.

This bug caused these devices to roll over to zero pressure at half
way mark.

The other devices using this routine only support 256 or 512 range
and look to fix unused bits at zero.

Signed-off-by: Chris Bagwell <chris@cnpbagwell.com>
Reported-by: Tushant Mirchandani <tushantin@gmail.com>
Reviewed-by: Ping Cheng <pingc@wacom.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:30 -07:00
Tushar Dave
5a4cebe94b e1000e: NIC goes up and immediately goes down
commit b7ec70be01 upstream.

Found that commit d478eb44 was a bad commit.
If the link partner is transmitting codeword (even if NULL codeword),
then the RXCW.C bit will be set so check for RXCW.CW is unnecessary.
Ref: RH BZ 840642

Reported-by: Fabio Futigami <ffutigam@redhat.com>
Signed-off-by: Tushar Dave <tushar.n.dave@intel.com>
CC: Marcelo Ricardo Leitner <mleitner@redhat.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:30 -07:00
Liang Li
4f5a5866aa cfg80211: fix interface combinations check for ADHOC(IBSS)
partial of commit 8e8b41f9d8 upstream.

As part of commit 463454b5db ("cfg80211: fix interface
combinations check"), this extra check was introduced:

       if ((all_iftypes & used_iftypes) != used_iftypes)
               goto cont;

However, most wireless NIC drivers did not advertise ADHOC in
wiphy.iface_combinations[i].limits[] and hence we'll get -EBUSY
when we bring up a ADHOC wlan with commands similar to:

 # iwconfig wlan0 mode ad-hoc && ifconfig wlan0 up

In commit 8e8b41f9d8 ("cfg80211: enforce lack of interface
combinations"), the change below fixes the issue:

       if (total == 1)
               return 0;

But it also introduces other dependencies for stable. For example,
a full cherry pick of 8e8b41f9d8 would introduce additional
regressions unless we also start cherry picking driver specific
fixes like the following:

  9b4760e  ath5k: add possible wiphy interface combinations
  1ae2fc2  mac80211_hwsim: advertise interface combinations
  20c8e8d  ath9k: add possible wiphy interface combinations

And the purpose of the 'if (total == 1)' is to cover the specific
use case (IBSS, adhoc) that was mentioned above. So we just pick
the specific part out from 8e8b41f9d8 here.

Doing so gives stable kernels a way to fix the change introduced
by 463454b5db, without having to make cherry picks specific to
various NIC drivers.

Signed-off-by: Liang Li <liang.li@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:30 -07:00
Daniel Drake
b27c59d2c2 cfg80211: process pending events when unregistering net device
commit 1f6fc43e62 upstream.

libertas currently calls cfg80211_disconnected() when it is being
brought down. This causes an event to be allocated, but since the
wdev is already removed from the rdev by the time that the event
processing work executes, the event is never processed or freed.
http://article.gmane.org/gmane.linux.kernel.wireless.general/95666

Fix this leak, and other possible situations, by processing the event
queue when a device is being unregistered. Thanks to Johannes Berg for
the suggestion.

Signed-off-by: Daniel Drake <dsd@laptop.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:30 -07:00
Arnd Bergmann
9f75ebd871 ARM: pxa: remove irq_to_gpio from ezx-pcap driver
commit 59ee93a528 upstream.

The irq_to_gpio function was removed from the pxa platform
in linux-3.2, and this driver has been broken since.

There is actually no in-tree user of this driver that adds
this platform device, but the driver can and does get enabled
on some platforms.

Without this patch, building ezx_defconfig results in:

drivers/mfd/ezx-pcap.c: In function 'pcap_isr_work':
drivers/mfd/ezx-pcap.c:205:2: error: implicit declaration of function 'irq_to_gpio' [-Werror=implicit-function-declaration]

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Daniel Ribeiro <drwyrm@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:30 -07:00
Marek Vasut
c75f1f090e ARM: mxs: Remove MMAP_MIN_ADDR setting from mxs_defconfig
commit 3bed491c8d upstream.

The CONFIG_DEFAULT_MMAP_MIN_ADDR was set to 65536 in mxs_defconfig,
this caused severe breakage of userland applications since the upper
limit for ARM is 32768. By default CONFIG_DEFAULT_MMAP_MIN_ADDR is
set to 4096 and can also be changed via /proc/sys/vm/mmap_min_addr
if needed.

Quoting Russell King [1]:

"4096 is also fine for ARM too. There's not much point in having
defconfigs change it - that would just be pure noise in the config
files."

the CONFIG_DEFAULT_MMAP_MIN_ADDR can be removed from the defconfig
altogether.

This problem was introduced by commit cde7c41 (ARM: configs: add
defconfig for mach-mxs).

[1] http://marc.info/?l=linux-arm-kernel&m=134401593807820&w=2

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Wolfgang Denk <wd@denx.de>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:29 -07:00
Mel Gorman
4c9682c526 mm: hugetlbfs: close race during teardown of hugetlbfs shared page tables
commit d833352a43 upstream.

If a process creates a large hugetlbfs mapping that is eligible for page
table sharing and forks heavily with children some of whom fault and
others which destroy the mapping then it is possible for page tables to
get corrupted.  Some teardowns of the mapping encounter a "bad pmd" and
output a message to the kernel log.  The final teardown will trigger a
BUG_ON in mm/filemap.c.

This was reproduced in 3.4 but is known to have existed for a long time
and goes back at least as far as 2.6.37.  It was probably was introduced
in 2.6.20 by [39dde65c: shared page table for hugetlb page].  The messages
look like this;

[  ..........] Lots of bad pmd messages followed by this
[  127.164256] mm/memory.c:391: bad pmd ffff880412e04fe8(80000003de4000e7).
[  127.164257] mm/memory.c:391: bad pmd ffff880412e04ff0(80000003de6000e7).
[  127.164258] mm/memory.c:391: bad pmd ffff880412e04ff8(80000003de0000e7).
[  127.186778] ------------[ cut here ]------------
[  127.186781] kernel BUG at mm/filemap.c:134!
[  127.186782] invalid opcode: 0000 [#1] SMP
[  127.186783] CPU 7
[  127.186784] Modules linked in: af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf ext3 jbd dm_mod coretemp crc32c_intel usb_storage ghash_clmulni_intel aesni_intel i2c_i801 r8169 mii uas sr_mod cdrom sg iTCO_wdt iTCO_vendor_support shpchp serio_raw cryptd aes_x86_64 e1000e pci_hotplug dcdbas aes_generic container microcode ext4 mbcache jbd2 crc16 sd_mod crc_t10dif i915 drm_kms_helper drm i2c_algo_bit ehci_hcd ahci libahci usbcore rtc_cmos usb_common button i2c_core intel_agp video intel_gtt fan processor thermal thermal_sys hwmon ata_generic pata_atiixp libata scsi_mod
[  127.186801]
[  127.186802] Pid: 9017, comm: hugetlbfs-test Not tainted 3.4.0-autobuild #53 Dell Inc. OptiPlex 990/06D7TR
[  127.186804] RIP: 0010:[<ffffffff810ed6ce>]  [<ffffffff810ed6ce>] __delete_from_page_cache+0x15e/0x160
[  127.186809] RSP: 0000:ffff8804144b5c08  EFLAGS: 00010002
[  127.186810] RAX: 0000000000000001 RBX: ffffea000a5c9000 RCX: 00000000ffffffc0
[  127.186811] RDX: 0000000000000000 RSI: 0000000000000009 RDI: ffff88042dfdad00
[  127.186812] RBP: ffff8804144b5c18 R08: 0000000000000009 R09: 0000000000000003
[  127.186813] R10: 0000000000000000 R11: 000000000000002d R12: ffff880412ff83d8
[  127.186814] R13: ffff880412ff83d8 R14: 0000000000000000 R15: ffff880412ff83d8
[  127.186815] FS:  00007fe18ed2c700(0000) GS:ffff88042dce0000(0000) knlGS:0000000000000000
[  127.186816] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  127.186817] CR2: 00007fe340000503 CR3: 0000000417a14000 CR4: 00000000000407e0
[  127.186818] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  127.186819] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  127.186820] Process hugetlbfs-test (pid: 9017, threadinfo ffff8804144b4000, task ffff880417f803c0)
[  127.186821] Stack:
[  127.186822]  ffffea000a5c9000 0000000000000000 ffff8804144b5c48 ffffffff810ed83b
[  127.186824]  ffff8804144b5c48 000000000000138a 0000000000001387 ffff8804144b5c98
[  127.186825]  ffff8804144b5d48 ffffffff811bc925 ffff8804144b5cb8 0000000000000000
[  127.186827] Call Trace:
[  127.186829]  [<ffffffff810ed83b>] delete_from_page_cache+0x3b/0x80
[  127.186832]  [<ffffffff811bc925>] truncate_hugepages+0x115/0x220
[  127.186834]  [<ffffffff811bca43>] hugetlbfs_evict_inode+0x13/0x30
[  127.186837]  [<ffffffff811655c7>] evict+0xa7/0x1b0
[  127.186839]  [<ffffffff811657a3>] iput_final+0xd3/0x1f0
[  127.186840]  [<ffffffff811658f9>] iput+0x39/0x50
[  127.186842]  [<ffffffff81162708>] d_kill+0xf8/0x130
[  127.186843]  [<ffffffff81162812>] dput+0xd2/0x1a0
[  127.186845]  [<ffffffff8114e2d0>] __fput+0x170/0x230
[  127.186848]  [<ffffffff81236e0e>] ? rb_erase+0xce/0x150
[  127.186849]  [<ffffffff8114e3ad>] fput+0x1d/0x30
[  127.186851]  [<ffffffff81117db7>] remove_vma+0x37/0x80
[  127.186853]  [<ffffffff81119182>] do_munmap+0x2d2/0x360
[  127.186855]  [<ffffffff811cc639>] sys_shmdt+0xc9/0x170
[  127.186857]  [<ffffffff81410a39>] system_call_fastpath+0x16/0x1b
[  127.186858] Code: 0f 1f 44 00 00 48 8b 43 08 48 8b 00 48 8b 40 28 8b b0 40 03 00 00 85 f6 0f 88 df fe ff ff 48 89 df e8 e7 cb 05 00 e9 d2 fe ff ff <0f> 0b 55 83 e2 fd 48 89 e5 48 83 ec 30 48 89 5d d8 4c 89 65 e0
[  127.186868] RIP  [<ffffffff810ed6ce>] __delete_from_page_cache+0x15e/0x160
[  127.186870]  RSP <ffff8804144b5c08>
[  127.186871] ---[ end trace 7cbac5d1db69f426 ]---

The bug is a race and not always easy to reproduce.  To reproduce it I was
doing the following on a single socket I7-based machine with 16G of RAM.

$ hugeadm --pool-pages-max DEFAULT:13G
$ echo $((18*1048576*1024)) > /proc/sys/kernel/shmmax
$ echo $((18*1048576*1024)) > /proc/sys/kernel/shmall
$ for i in `seq 1 9000`; do ./hugetlbfs-test; done

On my particular machine, it usually triggers within 10 minutes but
enabling debug options can change the timing such that it never hits.
Once the bug is triggered, the machine is in trouble and needs to be
rebooted.  The machine will respond but processes accessing proc like "ps
aux" will hang due to the BUG_ON.  shutdown will also hang and needs a
hard reset or a sysrq-b.

The basic problem is a race between page table sharing and teardown.  For
the most part page table sharing depends on i_mmap_mutex.  In some cases,
it is also taking the mm->page_table_lock for the PTE updates but with
shared page tables, it is the i_mmap_mutex that is more important.

Unfortunately it appears to be also insufficient. Consider the following
situation

Process A					Process B
---------					---------
hugetlb_fault					shmdt
  						LockWrite(mmap_sem)
    						  do_munmap
						    unmap_region
						      unmap_vmas
						        unmap_single_vma
						          unmap_hugepage_range
      						            Lock(i_mmap_mutex)
							    Lock(mm->page_table_lock)
							    huge_pmd_unshare/unmap tables <--- (1)
							    Unlock(mm->page_table_lock)
      						            Unlock(i_mmap_mutex)
  huge_pte_alloc				      ...
    Lock(i_mmap_mutex)				      ...
    vma_prio_walk, find svma, spte		      ...
    Lock(mm->page_table_lock)			      ...
    share spte					      ...
    Unlock(mm->page_table_lock)			      ...
    Unlock(i_mmap_mutex)			      ...
  hugetlb_no_page									  <--- (2)
						      free_pgtables
						        unlink_file_vma
							hugetlb_free_pgd_range
						    remove_vma_list

In this scenario, it is possible for Process A to share page tables with
Process B that is trying to tear them down.  The i_mmap_mutex on its own
does not prevent Process A walking Process B's page tables.  At (1) above,
the page tables are not shared yet so it unmaps the PMDs.  Process A sets
up page table sharing and at (2) faults a new entry.  Process B then trips
up on it in free_pgtables.

This patch fixes the problem by adding a new function
__unmap_hugepage_range_final that is only called when the VMA is about to
be destroyed.  This function clears VM_MAYSHARE during
unmap_hugepage_range() under the i_mmap_mutex.  This makes the VMA
ineligible for sharing and avoids the race.  Superficially this looks like
it would then be vunerable to truncate and madvise issues but hugetlbfs
has its own truncate handlers so does not use unmap_mapping_range() and
does not support madvise(DONTNEED).

This should be treated as a -stable candidate if it is merged.

Test program is as follows. The test case was mostly written by Michal
Hocko with a few minor changes to reproduce this bug.

==== CUT HERE ====

static size_t huge_page_size = (2UL << 20);
static size_t nr_huge_page_A = 512;
static size_t nr_huge_page_B = 5632;

unsigned int get_random(unsigned int max)
{
	struct timeval tv;

	gettimeofday(&tv, NULL);
	srandom(tv.tv_usec);
	return random() % max;
}

static void play(void *addr, size_t size)
{
	unsigned char *start = addr,
		      *end = start + size,
		      *a;
	start += get_random(size/2);

	/* we could itterate on huge pages but let's give it more time. */
	for (a = start; a < end; a += 4096)
		*a = 0;
}

int main(int argc, char **argv)
{
	key_t key = IPC_PRIVATE;
	size_t sizeA = nr_huge_page_A * huge_page_size;
	size_t sizeB = nr_huge_page_B * huge_page_size;
	int shmidA, shmidB;
	void *addrA = NULL, *addrB = NULL;
	int nr_children = 300, n = 0;

	if ((shmidA = shmget(key, sizeA, IPC_CREAT|SHM_HUGETLB|0660)) == -1) {
		perror("shmget:");
		return 1;
	}

	if ((addrA = shmat(shmidA, addrA, SHM_R|SHM_W)) == (void *)-1UL) {
		perror("shmat");
		return 1;
	}
	if ((shmidB = shmget(key, sizeB, IPC_CREAT|SHM_HUGETLB|0660)) == -1) {
		perror("shmget:");
		return 1;
	}

	if ((addrB = shmat(shmidB, addrB, SHM_R|SHM_W)) == (void *)-1UL) {
		perror("shmat");
		return 1;
	}

fork_child:
	switch(fork()) {
		case 0:
			switch (n%3) {
			case 0:
				play(addrA, sizeA);
				break;
			case 1:
				play(addrB, sizeB);
				break;
			case 2:
				break;
			}
			break;
		case -1:
			perror("fork:");
			break;
		default:
			if (++n < nr_children)
				goto fork_child;
			play(addrA, sizeA);
			break;
	}
	shmdt(addrA);
	shmdt(addrB);
	do {
		wait(NULL);
	} while (--n > 0);
	shmctl(shmidA, IPC_RMID, NULL);
	shmctl(shmidB, IPC_RMID, NULL);
	return 0;
}

[akpm@linux-foundation.org: name the declaration's args, fix CONFIG_HUGETLBFS=n build]
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:29 -07:00
Borislav Petkov
bb014c405d x86, microcode: Sanitize per-cpu microcode reloading interface
commit c9fc3f778a upstream.

Microcode reloading in a per-core manner is a very bad idea for both
major x86 vendors. And the thing is, we have such interface with which
we can end up with different microcode versions applied on different
cores of an otherwise homogeneous wrt (family,model,stepping) system.

So turn off the possibility of doing that per core and allow it only
system-wide.

This is a minimal fix which we'd like to see in stable too thus the
more-or-less arbitrary decision to allow system-wide reloading only on
the BSP:

$ echo 1 > /sys/devices/system/cpu/cpu0/microcode/reload
...

and disable the interface on the other cores:

$ echo 1 > /sys/devices/system/cpu/cpu23/microcode/reload
-bash: echo: write error: Invalid argument

Also, allowing the reload only from one CPU (the BSP in
that case) doesn't allow the reload procedure to degenerate
into an O(n^2) deal when triggering reloads from all
/sys/devices/system/cpu/cpuX/microcode/reload sysfs nodes
simultaneously.

A more generic fix will follow.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1340280437-7718-2-git-send-email-bp@amd64.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:29 -07:00
Shuah Khan
d426c78930 x86, microcode: microcode_core.c simple_strtoul cleanup
commit e826abd523 upstream.

Change reload_for_cpu() in kernel/microcode_core.c to call kstrtoul()
instead of calling obsoleted simple_strtoul().

Signed-off-by: Shuah Khan <shuahkhan@gmail.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1336324264.2897.9.camel@lorien2
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:29 -07:00
H. Peter Anvin
9f6082404e random: mix in architectural randomness in extract_buf()
commit d2e7c96af1 upstream.

Mix in any architectural randomness in extract_buf() instead of
xfer_secondary_buf().  This allows us to mix in more architectural
randomness, and it also makes xfer_secondary_buf() faster, moving a
tiny bit of additional CPU overhead to process which is extracting the
randomness.

[ Commit description modified by tytso to remove an extended
  advertisement for the RDRAND instruction. ]

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: DJ Johnston <dj.johnston@intel.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:29 -07:00
Tony Luck
4f4cb6f72c dmi: Feed DMI table to /dev/random driver
commit d114a33387 upstream.

Send the entire DMI (SMBIOS) table to the /dev/random driver to
help seed its pools.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:29 -07:00
Tony Luck
dbbdd2bb89 random: Add comment to random_initialize()
commit cbc96b7594 upstream.

Many platforms have per-machine instance data (serial numbers,
asset tags, etc.) squirreled away in areas that are accessed
during early system bringup. Mixing this data into the random
pools has a very high value in providing better random data,
so we should allow (and even encourage) architecture code to
call add_device_randomness() from the setup_arch() paths.

However, this limits our options for internal structure of
the random driver since random_initialize() is not called
until long after setup_arch().

Add a big fat comment to rand_initialize() spelling out
this requirement.

Suggested-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:29 -07:00
Theodore Ts'o
b6b847a93b random: remove rand_initialize_irq()
commit c5857ccf29 upstream.

With the new interrupt sampling system, we are no longer using the
timer_rand_state structure in the irq descriptor, so we can stop
initializing it now.

[ Merged in fixes from Sedat to find some last missing references to
  rand_initialize_irq() ]

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:28 -07:00
Mark Brown
f99ef862a7 mfd: wm831x: Feed the device UUID into device_add_randomness()
commit 27130f0cc3 upstream.

wm831x devices contain a unique ID value. Feed this into the newly added
device_add_randomness() to add some per device seed data to the pool.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:28 -07:00
Mark Brown
2fcadd9362 rtc: wm831x: Feed the write counter into device_add_randomness()
commit 9dccf55f4c upstream.

The tamper evident features of the RTC include the "write counter" which
is a pseudo-random number regenerated whenever we set the RTC. Since this
value is unpredictable it should provide some useful seeding to the random
number generator.

Only do this on boot since the goal is to seed the pool rather than add
useful entropy.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:28 -07:00
Theodore Ts'o
0789520922 MAINTAINERS: Theodore Ts'o is taking over the random driver
commit 330e0a01d5 upstream.

Matt Mackall stepped down as the /dev/random driver maintainer last
year, so Theodore Ts'o is taking back the /dev/random driver.

Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15 12:04:16 -07:00